Remove rest of page limit #5429
Conversation
CouldntCreatePasswordResetRequest,
CouldntCreateLoginToken,
CouldntUpdateLocalSiteUrlBlocklist,
CouldntCreateEmailVerification,
Is it really necessary to have separate error messages for all of these? At least the Create and Update errors could be merged, e.g. as with `ChangeTaglineError`.
I think it's worth it, mainly because a failed create denotes a constraint conflict, while a failed update means it's missing, or something else. They're already done anyway.
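The distinction can be sketched in plain Rust (names simplified; this is not the real Lemmy error enum): separate variants let a caller tell a conflicting insert apart from an update on a missing row, which a merged variant would collapse.

```rust
// Simplified sketch, not the actual Lemmy code.
#[derive(Debug, PartialEq)]
enum LemmyErrorType {
    CouldntCreateEmailVerification, // insert failed: usually a constraint conflict
    CouldntUpdateEmailVerification, // update failed: usually the row is missing
}

// Hypothetical stand-in for an insert that may hit a unique constraint.
fn create(already_exists: bool) -> Result<(), LemmyErrorType> {
    if already_exists {
        Err(LemmyErrorType::CouldntCreateEmailVerification)
    } else {
        Ok(())
    }
}

// Hypothetical stand-in for an update on a possibly missing row.
fn update(exists: bool) -> Result<(), LemmyErrorType> {
    if exists {
        Ok(())
    } else {
        Err(LemmyErrorType::CouldntUpdateEmailVerification)
    }
}
```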
)]
#[cfg_attr(feature = "full", diesel(belongs_to(crate::source::post::Post)))]
#[cfg_attr(feature = "full", diesel(table_name = post_actions))]
#[cfg_attr(feature = "full", diesel(primary_key(person_id, post_id)))]
#[cfg_attr(feature = "full", diesel(check_for_backend(diesel::pg::Pg)))]
#[cfg_attr(feature = "full", ts(export))]
#[cfg_attr(feature = "full", cursor_keys_module(name = post_actions_keys))]
What is this for? The library doesn't have any docs, unfortunately.
It's for sorting. I use this in 2 upcoming PRs that are fairly simple and will make it clear.
The `CursorKeysModule` derive here creates the `db_schema::source::post::post_action_keys` module. Until I add missing docs, you might understand it better by looking at references to the module's items. I couldn't quickly find a reference to `post_actions_keys`, but post view imports a similar module (`post_keys as key`).
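Since the derive's actual output is undocumented, here is a guess at the underlying idea in plain Rust (all names illustrative, not the real macro output): a "key" is essentially a column extractor that pagination code can sort by, with the primary-key columns as tiebreakers.

```rust
// Illustrative only: not what `cursor_keys_module` actually generates.
struct PostAction {
    person_id: i32,
    post_id: i32,
    liked: i64,
}

// Hypothetical analogue of one item in a `post_actions_keys`-style module:
// a composite sort key of (column, primary-key tiebreakers).
fn liked_key(row: &PostAction) -> (i64, i32, i32) {
    (row.liked, row.person_id, row.post_id)
}

// Sorting by the key gives a total order that a cursor can resume from.
fn sort_by_liked(rows: &mut [PostAction]) {
    rows.sort_by_key(liked_key);
}
```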
let (_, id) = pids
  .as_slice()
  .first()
  .ok_or(LemmyErrorType::CouldntParsePaginationToken)?;
This logic is used in various places; it should go into a helper method on `PaginationCursor`.
Good idea, I'll do that now.
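A minimal sketch of what such a helper could look like (the token format and method name here are assumptions, not Lemmy's actual cursor encoding): parsing and the `CouldntParsePaginationToken` fallback move into one place instead of being repeated at every call site.

```rust
// Hypothetical: assumes a "P<id>"-style token; the real cursor differs.
struct PaginationCursor(String);

#[derive(Debug, PartialEq)]
enum LemmyErrorType {
    CouldntParsePaginationToken,
}

impl PaginationCursor {
    // Replaces the repeated `.as_slice().first().ok_or(...)` pattern.
    fn first_id(&self) -> Result<i64, LemmyErrorType> {
        self.0
            .strip_prefix('P')
            .and_then(|s| s.parse::<i64>().ok())
            .ok_or(LemmyErrorType::CouldntParsePaginationToken)
    }
}
```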
let sort = o.sort.unwrap_or_default();
let sort_direction = asc_if(sort == Old || sort == NameAsc);

let mut pq = paginate(query, sort_direction, o.cursor_data, None, o.page_back);
Why not keep using the `query` variable here?
It changes type after it passes through `paginate`, so I always change its name to `paginated_query` (or `pq`) to be more clear.
Ok(res)
  .await
  .with_lemmy_type(LemmyErrorType::NotFound)
Why add an explicit NotFound error in so many places? Diesel's NotFound error is automatically converted to LemmyError NotFound, so this shouldn't be necessary at all.
So that all of these can explicitly return lemmy errors instead of diesel ones.
Again these get converted automatically.
https://github.com/LemmyNet/lemmy/blob/main/crates/utils/src/error.rs#L221
Part of this rework was to try to remove all diesel errors from schema and db_views, and return `LemmyResult` for all the functions. It seems like a bad idea to use diesel results for only these ones; I'd rather have the return types uniformly be lemmy results.
I tried, and `.map_err(diesel::result::Error::into)` works, but that's not less verbose than `.with_lemmy_type(LemmyErrorType::NotFound)`, and it's less explicit.
`?` does the conversion.
Unfortunately that'd require wrapping every result in `Ok(...)`.
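The mechanism under discussion, reduced to a self-contained sketch (stand-in types, not diesel or Lemmy's real ones): a `From` impl, like the one in the linked `error.rs`, lets `?` convert the error automatically, at the cost of wrapping the final value in `Ok(...)` when the result isn't otherwise touched.

```rust
// Stand-in for diesel::result::Error.
#[derive(Debug, PartialEq)]
struct DieselError;

#[derive(Debug, PartialEq)]
enum LemmyErrorType {
    NotFound,
}

// Analogous to the automatic conversion in crates/utils/src/error.rs.
impl From<DieselError> for LemmyErrorType {
    fn from(_: DieselError) -> Self {
        LemmyErrorType::NotFound
    }
}

fn find_row(found: bool) -> Result<i64, LemmyErrorType> {
    let raw: Result<i64, DieselError> = if found { Ok(1) } else { Err(DieselError) };
    // `?` applies the From impl; the Ok(...) wrapper is the downside noted above.
    Ok(raw?)
}
```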
crates/apub/src/lib.rs (outdated)
local_site,
allowed_instances,
blocked_instances,
}))
})
.await?,
.await.map_err(|_e| LemmyErrorType::NotFound)?,
It's generally not a good idea to ignore errors, so use `.with_lemmy_type()` or simply remove the `map_err`.
I couldn't find a way around this one.
Right, this one is strange; in other places it is handled like this:
.await
.map_err(|e| anyhow::anyhow!("err getting activity: {e:?}"))
Done
Is there currently any test coverage for pagination? Something like: create a few dozen posts, get the first page, then get the next few pages and the previous pages. If not, it should be added, whether as a Rust unit test, an API test, or the db perf check.
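The suggested coverage could take roughly this shape as a plain Rust unit test (the `page` function is a toy model of cursor pagination, not Lemmy's API): page forward through sorted ids, then page backward, and check the pages line up.

```rust
// Toy cursor pagination over sorted ids, for test-shape illustration only.
fn page(ids: &[i64], cursor: Option<i64>, limit: usize, back: bool) -> Vec<i64> {
    let mut out: Vec<i64> = ids
        .iter()
        .copied()
        .filter(|id| match (cursor, back) {
            (Some(c), false) => *id > c, // forward: strictly after the cursor
            (Some(c), true) => *id < c,  // backward: strictly before the cursor
            (None, _) => true,
        })
        .collect();
    out.sort();
    if back {
        // Keep the `limit` items closest to the cursor, in ascending order.
        out.reverse();
        out.truncate(limit);
        out.reverse();
    } else {
        out.truncate(limit);
    }
    out
}
```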
.optional()?;

if let Some(largest_subscribed) = largest_subscribed {
// TODO Not sure this is correct
Not correct, because it has to use the same sort as the normal query for correct results, and it has to use the same filters for full optimization.
Probably best to:
- move everything currently in `list` to a new helper function like `list_inner` which takes additional parameters `cursor_before_data: Option<Post>, community_id_for_prefetch: Option<CommunityId>`
- remove `cursor_before_data` and `community_id_just_for_prefetch` from `PostQuery`
- make `list` just be a wrapper function that does mostly the same thing as the `prefetch_cursor_before_data` function here but uses 2 calls to `list_inner` (the first one is for getting the upper bound)
- also in the `list` wrapper function, add back the condition `self.listing_type == Some(ListingType::Subscribed)` and use `if let` to replace the `self.local_user.is_some()` condition, so getting an upper bound is only done for the subscribed view
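The steps above can be sketched as a thin wrapper, heavily simplified (names follow the suggestion, but the bound logic and types here are illustrative stand-ins, not the real query code): `list` prefetches an upper bound only for the subscribed view, then delegates both calls to `list_inner`.

```rust
// Illustrative stand-ins; real code operates on diesel queries, not slices.
#[derive(Clone, Copy, PartialEq)]
enum ListingType {
    Subscribed,
    All,
}

// The single function containing all sort/filter logic, so both calls
// use exactly the same query shape.
fn list_inner(posts: &[i64], upper_bound: Option<i64>, limit: usize) -> Vec<i64> {
    posts
        .iter()
        .copied()
        .filter(|p| upper_bound.map_or(true, |b| *p <= b))
        .take(limit)
        .collect()
}

// Wrapper: first call fetches the upper bound (subscribed view only),
// second call runs the real page query.
fn list(posts: &[i64], listing_type: ListingType, limit: usize) -> Vec<i64> {
    let upper_bound = if listing_type == ListingType::Subscribed {
        list_inner(posts, None, 1).first().copied()
    } else {
        None
    };
    list_inner(posts, upper_bound, limit)
}
```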
I'm going to make this its own issue and reference your comment, because this needs to be handled separately, and should be done in a clean and clear way that doesn't involve calling the same query twice.
We also need to clarify what specifically this is used for (I think only backwards pagination for the subscribed query?), because there are potentially much faster and clearer ways that we could be doing subscribed queries.
If this will be fixed separately, then temporarily disable the optimization completely in this PR by changing `o.cursor_before_data` to `None` in the call to `paginate`.
I don't think it's reasonable to remove a significant optimization in the name of "prettying the code" without actually having a plan to add it back. What's the point? Why do you dislike calling a function so much? It was 3 lines of code, and you made it into 10 lines of code that do something else, worse (incorrect).

> clean and clear way that doesn't involve calling the same query twice

It literally has to be the same query, except for the single change. That's the whole point of it; otherwise the result will not be valid.

> We also need to clarify what specifically this is used for

It is used whenever multiple communities are queried, which is almost always. I think it's pretty clear from the comment in the code.
Update indexes to match the changes made here, including the removed uses of `reverse_timestamp_key`.
This one I'm not sure about, because I didn't remove any published sort. Anyway, fixing the indexes should probably wait on #5555.
`reverse_timestamp_key(published)` must be replaced with `published` in the indexes.
As an example, for one index:

"idx_post_featured_local_active" btree (featured_local DESC, hot_rank_active DESC, published DESC, id DESC)

You mean that should be changed to `published ASC`? There are like 20 of these indexes, so even if they should be changed, it should probably wait on #5555.
No, that index stays the same. Only some indexes, including the one for the "old" sort type, used `reverse_timestamp_key`. Also, I just realized that index changes might also be needed to match the removal of the featured column in the new/old sort. Both new and old sort will use the same index on `(published, id)`, with both columns having the same direction.
These changes may be done later as long as you don't forget. (edit: I mentioned this in the issue you linked)
Oh gotcha. I'll remove/edit those ones now to all use `published DESC` then.
Done now.
`(community_id, featured_community DESC, published DESC, id DESC)` must be changed to `(community_id, published DESC, id DESC)`, because sorting by featured is now under `if sort != New && sort != Old`.
Also, there might be a similar index, without the community id, that needs to be changed too.
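The sorting rule that motivates this index change can be sketched in plain Rust (field and variant names are illustrative, not the real query code): the featured column only participates in the ordering when the sort is neither New nor Old, so those two sorts can share a plain `(published, id)` index.

```rust
// Illustrative only; the real ordering is built as SQL in the post view.
#[derive(PartialEq, Clone, Copy)]
enum Sort {
    New,
    Old,
    Hot,
}

struct Post {
    featured_community: bool,
    published: i64,
    id: i64,
}

// Featured-first ordering applies only outside New/Old, mirroring
// the `if sort != New && sort != Old` condition in the PR.
fn sort_posts(posts: &mut [Post], sort: Sort) {
    posts.sort_by_key(|p| {
        let featured_rank = if sort != Sort::New && sort != Sort::Old {
            !p.featured_community // featured (true) sorts first
        } else {
            true // featured ignored for New/Old
        };
        (featured_rank, std::cmp::Reverse((p.published, p.id)))
    });
}
```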
I'll add one for that specifically.
That's done, but there are a ton of complicated post indexes, and they all need to actually be tested in practice as part of #5555.
Seems like this PR removes a significant optimization for no particular reason (see the above comment).
The context is linked in the code comment, but I will post it here since people may have forgotten. The TL;DR is that this prefetch is the difference between every page load fetching hundreds of thousands of posts vs fetching ~100, a difference between multiple seconds (remember when we had 60+ second page loads?) and <10ms.
PR: #3872
Important Comment 1: #2877 (comment)
Visualization: (image)
My main issue isn't with that logic; it was just with the post_view recursively calling itself, rather than prefetching the cursor (which was part of this refactor). I've added #5621 to the 1.0 milestone to make sure this gets added back in some form, so we'll make sure this gets re-added / fixed before any release. I'm going to merge this (once CI passes) as it's blocking a lot of other PRs that need review.
This PR adds a lot, but it's mostly boilerplate.

- Adds `page_cursor`, `page_back`, and `limit` to every API list fetch.
- `cursor_data` is always fetched before the DB view `list` function. Ideally the list function should only ever run 1 query.
- Reworked `post_view` to fit this model.
- Adds `next_page` and `prev_page` for the result of every API list fetch.

Other cleanups:

- `db_schema` and `db_views` return `LemmyResult`s, so no more error wrappers are necessary in the API code.
- `PostTag`.
- Simplified the `keyword_block` code a bit, and made sure it pre-fetches the keywords.

#4517