feat: add VTT file support to Document Extractor #11148

fujita-h
Contributor

Summary

Web conferencing systems such as Zoom and Teams output their transcription results as VTT files. These transcripts are sometimes passed to an LLM to be summarized. This PR therefore lets Document Extractor extract VTT files as plain text.
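
As a rough illustration of what "extract as plain text" can mean here, the sketch below treats `.vtt` like any other text extension and simply decodes the file bytes. The names `PLAIN_TEXT_EXTENSIONS` and `extract_plain_text` are hypothetical and are not taken from Dify's code base; the extension set and the UTF-8 assumption are also illustrative only.

```python
from pathlib import PurePath

# Hypothetical set of extensions handled as plain text (not Dify's actual list).
PLAIN_TEXT_EXTENSIONS = {".txt", ".md", ".vtt"}


def extract_plain_text(filename: str, content: bytes) -> str:
    """Return the file content as text if its extension is a plain-text one."""
    ext = PurePath(filename).suffix.lower()
    if ext not in PLAIN_TEXT_EXTENSIONS:
        raise ValueError(f"unsupported extension: {ext!r}")
    # Assumption: the transcript is UTF-8 encoded.
    return content.decode("utf-8")


# A tiny WebVTT transcript, similar in shape to what a meeting tool might export.
sample = (
    "WEBVTT\n\n"
    "00:00:01.000 --> 00:00:04.000\n"
    "<v Alice>Welcome to the meeting.\n"
).encode("utf-8")

text = extract_plain_text("meeting.vtt", sample)
```

With this approach the cue timings and speaker tags stay in the extracted text, so a downstream LLM node sees the transcript exactly as it was written to the file.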

Resolves #11147

Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

@dosubot bot added the size:XS and 💪 enhancement labels Nov 26, 2024
@dosubot bot added the lgtm label Nov 27, 2024
@crazywoola crazywoola merged commit a918cea into langgenius:main Nov 27, 2024
5 checks passed
@fujita-h fujita-h deleted the feat-add-vtt-support-to-document-extractor branch November 28, 2024 08:15
Labels
  • 💪 enhancement: New feature or request
  • lgtm: This PR has been approved by a maintainer
  • size:XS: This PR changes 0-9 lines, ignoring generated files.
2 participants