How is dataview so fast? #2116
-
My understanding is that modern database engines like PostgreSQL, even though they optimize aggressively, still rely on runtime indices and the like to make their queries fast. Obsidian's underlying data is stored on the filesystem as files, which I would imagine makes fetching data quite slow compared to PostgreSQL. So my question is: what is the secret sauce that makes Dataview's queries so fast?
Replies: 2 comments
-
Hey @starptr! This is an interesting question I've wondered about too. I'm not affiliated with the Dataview team at all, so I'm speaking from an external perspective with some guesses. I think you're right that a traditional DBMS like Postgres, with its robust indexing, aggressive optimizations, and concurrency control mechanisms, would outshine Dataview in speed and efficiency for some use cases with very large datasets or very complex querying needs. That said, in an Obsidian vault environment specifically, I think some of the major factors in DV's speed include:
Hope that answers your question! Would love to hear your and others' thoughts too.
-
SQLite and Postgres are absolutely faster than Dataview; it's just that the amount of data in a user vault is small enough that it doesn't really matter in most cases. Dataview essentially keeps an in-memory cache of all of the useful metadata in the vault (every file, its frontmatter, paths, links, etc.) and then uses that to do fast searches and execute queries. You'll notice that Dataview does not really support searching over the content of files - that would require actually going to disk and scanning thousands of files, and certain kinds of queries would suddenly become much slower.
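
To make the "in-memory cache of metadata" idea concrete, here is a rough sketch in Python. It is not Dataview's actual code or API (Dataview is an Obsidian plugin with its own internals); `PageMeta`, `MetadataIndex`, `with_tag`, and `where` are hypothetical names invented for illustration. The point is only that once the metadata is in memory, a query is a dictionary lookup or a scan over small records rather than a read of every file on disk.

```python
from dataclasses import dataclass, field


@dataclass
class PageMeta:
    """Hypothetical per-file metadata record (not Dataview's real type)."""
    path: str
    frontmatter: dict = field(default_factory=dict)
    tags: set = field(default_factory=set)
    links: set = field(default_factory=set)


class MetadataIndex:
    """Toy in-memory index: filled once, then queried without touching disk."""

    def __init__(self):
        self.pages = {}   # path -> PageMeta
        self.by_tag = {}  # tag -> set of paths

    def add(self, page):
        self.pages[page.path] = page
        for tag in page.tags:
            self.by_tag.setdefault(tag, set()).add(page.path)

    def with_tag(self, tag):
        # Dictionary lookup instead of scanning every file.
        return sorted(self.by_tag.get(tag, set()))

    def where(self, predicate):
        # Linear scan, but over small in-memory records, not file contents.
        return sorted(p.path for p in self.pages.values() if predicate(p))


# Made-up sample vault metadata, purely for illustration.
index = MetadataIndex()
index.add(PageMeta("books/dune.md", {"rating": 5}, {"#book"}, {"authors/herbert.md"}))
index.add(PageMeta("books/emma.md", {"rating": 3}, {"#book"}))
index.add(PageMeta("daily/2023-01-01.md", {}, {"#daily"}))

book_paths = index.with_tag("#book")
top_rated = index.where(lambda p: p.frontmatter.get("rating", 0) >= 4)
```

With a few thousand such records, even the linear `where` scan is cheap. A full-text search, by contrast, would have to open and read each file, which is the slow path described above.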