The reasoning behind this is that the server needs to know the exact column to build the index over and search.
Now that we have an API to get the schema, we can do much better.
The logic should now be:
* Get the schema for the dataset.
* Scan the schema to find a SINGLE match for the component selector.
* If there are no matches, or multiple matches, raise an error about an ambiguous component selector.
Note: tagged components are going to create situations that are inherently ambiguous. (Follow up with @Wumpf to make sure we are considering this in the tagged component workstream: component selectors will need to be expanded to be tagged-component aware.)
This logic needs to be replicated in all of:
* `create_fts_index`
* `create_vector_index`
* `fts_search`
* `vector_search`
This probably means writing some helper on the dataset like `resolve_component_selector`. There is some similar logic for view contents now in `dataframe_query.rs`.
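The resolution steps above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `ComponentColumn`, `ComponentSelector`, and `ResolveError` are hypothetical stand-ins rather than the real `re_sorbet` types, and the matching rule (exact match on entity path and component name, ignoring any archetype tag) is an assumption chosen to show how tagged components can make a selector ambiguous.

```rust
use std::fmt;

/// Hypothetical stand-in for a component column in the dataset schema.
#[derive(Debug, Clone, PartialEq)]
struct ComponentColumn {
    entity_path: String,
    /// Assumed optional archetype tag, to illustrate tagged components.
    archetype: Option<String>,
    component: String,
}

/// Hypothetical stand-in for the selector supplied by the caller.
struct ComponentSelector {
    entity_path: String,
    component: String,
}

/// Hypothetical error type for a selector that does not resolve to one column.
#[derive(Debug, PartialEq)]
enum ResolveError {
    NoMatch,
    /// Carries the number of columns that matched.
    Ambiguous(usize),
}

impl fmt::Display for ResolveError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NoMatch => write!(f, "component selector matched no column"),
            Self::Ambiguous(n) => {
                write!(f, "ambiguous component selector: {n} columns matched")
            }
        }
    }
}

/// Scans the schema and returns the column only if exactly one matches.
fn resolve_component_selector<'a>(
    schema: &'a [ComponentColumn],
    selector: &ComponentSelector,
) -> Result<&'a ComponentColumn, ResolveError> {
    // Assumed matching rule: entity path and component name must be equal.
    let mut matches = schema.iter().filter(|col| {
        col.entity_path == selector.entity_path && col.component == selector.component
    });

    match (matches.next(), matches.next()) {
        (Some(col), None) => Ok(col),
        (None, _) => Err(ResolveError::NoMatch),
        // Two matches were already consumed; count whatever remains.
        (Some(_), Some(_)) => Err(ResolveError::Ambiguous(2 + matches.count())),
    }
}
```

Under these assumptions, each of the four index and search entry points would call the helper once and pass the resolved column to the server, so the error handling lives in one place.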
### Related
* Part of #9837
### What
Move `{Column|ComponentColumn|TimelineColumn}Selector` to `re_sorbet`
where they belong (alongside the `*Descriptor` crowd).
…arch APIs (#9854)
### Related
* Fixes #9837
* Further issues to address:
  * #9853
  * #9855
### What
Initial attempt to formalise component column selectors, how they are
matched against a schema, and how they are expressed in our Python API.
Applied to the dataset index creation/search APIs.
TODO:
- [x] use `AnyComponentColumn` in APIs
- [x] cleanup and fix type stubs
---------
Co-authored-by: Jeremy Leibs <[email protected]>
The Vector/FTS Search APIs currently have inconsistent hacks that only sometimes do the right thing.
See, for example, `rerun/rerun_py/src/catalog/dataset.rs`, lines 227 to 232 in bc0e318.