Slack Federated Search v0 #4962
PR Summary
Introduces federated search capabilities by integrating Slack messages alongside document index search, running keyword-based Slack retrieval in parallel with main document search.
- Added new federated search module in /onyx/context/search/federated/ implementing Slack message search with parallel execution and source filtering
- Added Pydantic models in models.py for structured handling of Slack messages and elements
- Modified search_runner.py to coordinate parallel execution of document and federated searches with source-based filtering
- Slack search uses basic keyword matching which may limit effectiveness on complex queries
- Zero scoring of Slack results means they'll always appear after regular search results unless explicitly marked relevant
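A minimal sketch of the parallel execution and source filtering summarized above, not the code from this PR. The `Section` dataclass, the `run_search` function, the stub bodies of `doc_index_retrieval` and `slack_retrieval`, and the source names `"web"` and `"slack"` are all assumptions made for illustration.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Section:
    source: str  # hypothetical field, e.g. "slack" or "web"
    text: str
    score: float = 0.0


def doc_index_retrieval(query: str) -> list[Section]:
    # Stand-in for the main document index search (assumed body).
    return [Section("web", f"doc hit for {query!r}", score=0.9)]


def slack_retrieval(query: str) -> list[Section]:
    # Stand-in for keyword-based Slack search; results keep a score of 0.
    return [Section("slack", f"slack hit for {query!r}")]


def run_search(query: str, source_filter: Optional[Set[str]] = None) -> list[Section]:
    # Only issue the federated call when the filter allows Slack results.
    run_slack = source_filter is None or "slack" in source_filter
    with ThreadPoolExecutor(max_workers=2) as pool:
        doc_future = pool.submit(doc_index_retrieval, query)
        slack_future = pool.submit(slack_retrieval, query) if run_slack else None
        # Both calls are already in flight; result() just waits for each.
        sections = doc_future.result()
        if slack_future is not None:
            sections = sections + slack_future.result()
    if source_filter is not None:
        sections = [s for s in sections if s.source in source_filter]
    return sections
```

Calling `run_search("deploy")` would return the document sections followed by the Slack sections, while passing `{"web"}` as the filter skips the Slack call entirely.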
6 files reviewed, 3 comments
@@ -115,34 +114,6 @@ def combine_retrieval_results(
    return sorted_chunks


def get_query_embedding(query: str, db_session: Session) -> Embedding:
Moved it into utils, as I might experiment with embedding the query and chunks in slack_search.py to score the chunks (if it stays here, it'll lead to circular imports).
Merged as part of #4969 (comment)
Description
Current approach:
- Runs slack_retrieval in parallel with doc_index_retrieval to get Slack documents.
- NUM_FEDERATED_SECTIONS sections are returned from federated search. The sections are sorted by score, so the Slack documents will always appear at the end, as they have a score of 0 (unless section_relevance_list is provided, in which case sections marked not relevant will appear even lower).

Notes:
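The ordering described under the current approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the `Section` dataclass and `order_sections` function are hypothetical, the value of `NUM_FEDERATED_SECTIONS` is made up, and treating it as a cap on the number of Slack sections is a guess at what the description means.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

NUM_FEDERATED_SECTIONS = 5  # assumed value; the real one is not given in the PR


@dataclass
class Section:
    text: str
    score: float


def order_sections(
    doc_sections: list[Section],
    slack_sections: list[Section],
    section_relevance_list: Optional[list[bool]] = None,
) -> list[Section]:
    # Assumption: the constant caps how many Slack sections are kept.
    combined = doc_sections + slack_sections[:NUM_FEDERATED_SECTIONS]
    if section_relevance_list is None:
        # Plain score sort: zero-scored Slack sections fall to the end.
        return sorted(combined, key=lambda s: s.score, reverse=True)
    # With relevance flags (one per combined section), relevant sections
    # come first and sections marked not relevant sink below them.
    paired = zip(combined, section_relevance_list)
    ranked = sorted(paired, key=lambda p: (p[1], p[0].score), reverse=True)
    return [section for section, _ in ranked]
```

In this sketch a Slack section marked relevant ends up above a higher-scored document section marked not relevant, which matches the behavior the description mentions.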
How Has This Been Tested?
Tested locally; test cases could be added.
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.