[Question]: About the maximum amount of data that can be stored and processed #2544

Open
emocat17 opened this issue Mar 7, 2025 · 2 comments
Labels
question Further information is requested


emocat17 commented Mar 7, 2025

I'm considering using a database for RAG and AI search. The amount of data to be stored could be extremely large (at the PB level), and the requirements for retrieval accuracy are high. So I have a few questions:

  • What is the maximum amount of data that this database can currently handle?
  • If deployed locally, can this database be expanded or migrated when storage space runs out?
  • Can this database store multiple file formats? For example, TXT, XLS/XLSX/CSV, PDF, JPEG/JPG/PNG, BMP, DOC/DOCX, JSON, HTML?

I'm already aware of some performance comparisons between this database and Elasticsearch, but I would still like to know whether, at the data scale described above, it can handle these issues better, more conveniently, and with higher accuracy than Elasticsearch.

THANKS!!!

@emocat17 emocat17 added the question Further information is requested label Mar 7, 2025
JinHai-CN (Contributor) commented

  1. That depends on the disk and memory of your machine. Infinity itself doesn't limit the capacity.
  2. We are developing the backup and restore function. Until then, you can export the data as CSV, Parquet, or JSONL files, but the indexes are not included in the export.
  3. This database stores vector, full-text, and tensor data, but not the files themselves.
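To illustrate point 3, here is a minimal sketch of how a plain-text file might be turned into rows before loading, with the file itself staying on disk and each row keeping a reference to it. It uses only the Python standard library. The field names (`source_path`, `chunk_id`, `body`) and the chunk size are hypothetical and are not Infinity's schema. Computing embeddings and inserting the rows are separate steps that are not shown.

```python
import json
from pathlib import Path


def chunk_text(text, max_chars=200):
    """Split text into consecutive chunks of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def file_to_records(path, max_chars=200):
    """Turn one UTF-8 text file into rows that reference the file by path.

    The field names are hypothetical; adapt them to your own table schema.
    """
    text = Path(path).read_text(encoding="utf-8")
    return [
        {"source_path": str(path), "chunk_id": i, "body": chunk}
        for i, chunk in enumerate(chunk_text(text, max_chars))
    ]


def records_to_jsonl(records):
    """Serialize rows as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Formats such as PDF, DOCX, or images would need their own text or feature extraction step before a function like `file_to_records` applies.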

The benchmark comparison of Infinity and ES that we provide was run on the same hardware configuration for both. As for your question, we think the answer is yes: Infinity will be better.
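Since capacity depends on disk and memory (point 1), a rough sizing estimate can help when planning for very large datasets. The sketch below computes only the raw size of dense vectors, assuming 4-byte float32 values. It ignores indexes, full-text data, metadata, and any replication, so real usage will be higher. The function names are illustrative and are not part of Infinity.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for dense vectors, assuming fixed-width values (float32 by default)."""
    return num_vectors * dim * bytes_per_value


def human(n_bytes):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Example: one billion 1024-dimensional float32 vectors
print(human(raw_vector_bytes(1_000_000_000, 1024)))  # 3.73 TiB
```

Comparing such an estimate against available disk and memory gives a first indication of whether a single machine is enough for the raw vectors alone.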

emocat17 (Author) commented

THANK YOU VERY MUCH

@emocat17 emocat17 reopened this Mar 10, 2025