Problem fetching threads #226

Closed
codinghahaha opened this issue Aug 29, 2024 · 13 comments
Labels
discussion

Comments

@codinghahaha

FastStoneEditor1
2024-08-29_215319
With page_size=105, at most 100 threads are returned. How do I fetch the remaining 5?

@n0099

n0099 commented Aug 29, 2024

current_page=3
https://github.com/n0099/tbclient.protobuf/blob/12.51.7.1/proto/Page.proto#L9

while page.has_more == 1
    current_page++
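
The loop above can be sketched in Python roughly as follows. This is only an illustration: `Page`, `fetch_all`, and `fake_fetch_page` are hypothetical names, and `fake_fetch_page` is an in-memory stand-in for the real request that assumes the server hands out at most 100 items per page.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    items: List[int]
    has_more: bool


def fetch_all(fetch_page: Callable[[int], Page], start_page: int = 1) -> List[int]:
    """Collect items page by page until the server reports has_more as false."""
    collected: List[int] = []
    current_page = start_page
    while True:
        page = fetch_page(current_page)
        # Count what actually came back, not what was requested.
        collected.extend(page.items)
        if not page.has_more:
            break
        current_page += 1
    return collected


def fake_fetch_page(pn: int, total: int = 250, cap: int = 100) -> Page:
    """Hypothetical stand-in for the real request: `total` items, `cap` per page."""
    start = (pn - 1) * cap
    items = list(range(start, min(start + cap, total)))
    return Page(items=items, has_more=start + cap < total)


all_items = fetch_all(fake_fetch_page)
print(len(all_items))  # 250, gathered over three pages
```

The loop never relies on the requested page size; it stops only when the response itself says there is nothing more.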

@codinghahaha
Author

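@n0099 I tested this. Turning to the next page starts fetching from that next page, so the last 5 threads of the previous page still cannot be fetched.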

@n0099

n0099 commented Aug 30, 2024

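rn is how many threads you ask Tieba to give you per pn.
However, its server is obviously free to return any number of threads, for example by inserting some LLM content #210 (comment) or ads #216 (comment).
You should go by how many threads it actually returned, not by the rn you supplied.
The simplest example: if the pn you request does not exist at all (or that pn does not have that many threads), you should obviously expect it to return 0 threads (or all the threads under that pn) rather than the rn threads you asked for.
Also, a better alternative to offset pagination by pn is cursor/keyset pagination, as in #158 (comment) from the start of this year.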
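
To illustrate the difference between offset and cursor/keyset pagination mentioned above, here is a generic sketch over an in-memory list. It does not use or describe any Tieba API; `page_by_offset` and `page_by_keyset` are hypothetical helpers, and the ids are made up.

```python
from typing import List, Optional


def page_by_offset(rows: List[int], pn: int, rn: int) -> List[int]:
    """Offset pagination: skip (pn - 1) * rn rows, then take rn rows."""
    start = (pn - 1) * rn
    return rows[start:start + rn]


def page_by_keyset(rows: List[int], after_id: Optional[int], rn: int) -> List[int]:
    """Keyset pagination: take the first rn rows that sort after the cursor.

    `rows` holds ids in descending order (newest first), so "after the
    cursor" means ids smaller than `after_id`.
    """
    remaining = rows if after_id is None else [r for r in rows if r < after_id]
    return remaining[:rn]


rows = [50, 40, 30, 20, 10]  # made-up ids, newest first
first_offset = page_by_offset(rows, pn=1, rn=2)     # [50, 40]
first_keyset = page_by_keyset(rows, None, rn=2)     # [50, 40]

# A new item appears at the top between the two requests.
rows_after_insert = [60] + rows

second_offset = page_by_offset(rows_after_insert, pn=2, rn=2)
second_keyset = page_by_keyset(rows_after_insert, first_keyset[-1], rn=2)

print(second_offset)  # [40, 30]: 40 is returned a second time
print(second_keyset)  # [30, 20]: continues right after the last seen id
```

With offsets, any item inserted or removed between requests shifts the page boundaries; with a cursor, the next page is defined by the last item actually seen.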

@lumina37
Owner

Is there a test sample? For example, which thread is it whose last five replies cannot be fetched?

@codinghahaha
Author

FastStoneEditor122
FastStoneEditor1
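@Starry-ovo I am calling the get_threads API. It is the last five threads of a page that cannot be fetched, not replies inside a thread. I tested this on the 土木工程 (civil engineering) forum. The two screenshots above feel like a 105 ml bottle of water from which you can drink at most 100 ml.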

@n0099

n0099 commented Aug 31, 2024

The two screenshots above feel like a 105 ml bottle of water from which you can drink at most 100 ml

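It is that Tieba only gives you at most 100 ml of water. To drink 105 ml you have to move on to the next bottle and then pour away the remaining 95 ml.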

#226 (comment)

while page.has_more == 1
    current_page++
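
In code, "open the next bottle and pour away the rest" could look roughly like this. It is a sketch under stated assumptions: `take_first_n` and `fake_fetch_page` are hypothetical, and the stand-in server is assumed to return consecutive pages of at most 100 items with no gaps between them.

```python
from typing import Callable, List


def take_first_n(fetch_page: Callable[[int], List[int]], wanted: int) -> List[int]:
    """Fetch successive pages until `wanted` items are collected, then drop the surplus."""
    collected: List[int] = []
    pn = 1
    while len(collected) < wanted:
        batch = fetch_page(pn)
        if not batch:
            # The server has nothing more to give; return what we have.
            break
        collected.extend(batch)
        pn += 1
    return collected[:wanted]


def fake_fetch_page(pn: int, total: int = 1000, cap: int = 100) -> List[int]:
    """Hypothetical stand-in for the real request: at most `cap` items per page."""
    start = (pn - 1) * cap
    return list(range(start, min(start + cap, total)))


threads = take_first_n(fake_fetch_page, 105)
print(len(threads))  # 105: two pages were fetched and 95 items were discarded
```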

@codinghahaha
Author

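@n0099 So the bottle cannot be finished and the last 5 ml just has to be poured away? What about changing req_proto.data.rn = 105 to req_proto.data.rn = 100, so that it becomes a 100 ml bottle from which you drink at most 100 ml? Would that change work, or would it cause other problems?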

@n0099

n0099 commented Aug 31, 2024

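It is that Tieba only gives you at most 100 ml of water

rather than a question of how much water you can drink.
So why use an odd real-world analogy that only adds another layer of abstraction?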

@codinghahaha
Author

Changing it to req_proto.data.rn = 100 works.
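
One way to picture why rn = 100 helps is the toy model below. It is only a guess that matches the symptoms reported in this thread, not documented Tieba behaviour: the hypothetical `toy_server` is assumed to compute each page's starting offset from the requested rn while returning at most 100 items.

```python
from typing import List

SERVER_CAP = 100  # assumed cap, taken from the reports in this thread


def toy_server(pn: int, rn: int, total: int) -> List[int]:
    """Toy model: the offset follows the requested rn, the reply is capped."""
    start = (pn - 1) * rn
    end = min(start + min(rn, SERVER_CAP), total)
    return list(range(start, end))


def crawl(rn: int, total: int) -> List[int]:
    """Walk the pages until an empty one comes back."""
    seen: List[int] = []
    pn = 1
    while True:
        batch = toy_server(pn, rn, total)
        if not batch:
            break
        seen.extend(batch)
        pn += 1
    return seen


everything = set(range(300))
missing_105 = sorted(everything - set(crawl(rn=105, total=300)))
missing_100 = sorted(everything - set(crawl(rn=100, total=300)))

print(missing_105)  # [100, 101, 102, 103, 104, 205, 206, 207, 208, 209]
print(missing_100)  # []
```

Under this model every page requested with rn=105 leaves a 5-item gap before the next page begins, and the gap disappears once rn equals the cap.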

@lumina37
Owner

"I don't know why rn=105 was set before, but there must have been a reason for it."

From "The Birth of a Shit Mountain"

@n0099

n0099 commented Aug 31, 2024

6a70a5b#diff-1b7e1df8383da27d0517a40fabe0e965c2d7a435983688a5d6ee0ce6f023dd4bR16
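Because here only rn_need is the real rn.
And because of #83 (comment), $$rn>rn\_need$$ is required, so any number larger than the rn upper limit of 100 will do. Ever since I switched to the protobuf API in n0099/open-tbm@a83b672#diff-c631ae64eec29d54d4c85f3831440863196a63d90205f7f36c78129588152c6dR55 I have used rn=90 rn_need=30, but I have forgotten why, back in 2022, I made rn 3 times larger, which is what avoided #83 (comment).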

@codinghahaha
Author

rn>rn_need causes data to be missed when crawling.

@n0099

n0099 commented Aug 31, 2024

Because here only rn_need is the real rn

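If you set rn_need=100 rn=114514, Tieba will still only give you 100 threads.
But if $$rn \le rn\_need$$, then for some (fid, pn) combinations you may not even get 100 (assuming that pn does have that many threads).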

@lumina37 added the question and discussion labels and removed the question label on Sep 1, 2024
3 participants