How is dataview so fast? #2116

starptr · 2023-10-18T21:04:36Z

My understanding is that modern database engines like PostgreSQL, despite aggressive optimization, still rely on indexes and similar runtime structures to make queries fast. Obsidian's underlying data is stored as files on the filesystem, which I would expect to make fetching data quite slow compared to PostgreSQL. So my question is: what is the secret sauce that makes Dataview's queries so fast?

Answered by blacksmithgu

ryanmcd118 · 2023-10-29T23:09:58Z

Hey @starptr! This is an interesting Q I've wondered about too. I'm not affiliated with the Dataview team at all, so I'm speaking from an external perspective with some guesses. I think you're right that a traditional DBMS like Postgres would outshine Dataview in speed and efficiency for use cases with very large datasets or very complex queries, thanks to the things you allude to: robust indexing, aggressive optimizations, and concurrency control mechanisms.

That said, in an Obsidian vault environment specifically, I think some of the major factors in DV's speed include:

- You mention localized data access / filesystem storage. For this use case, I actually think local data access is much faster than any networked db system, where every fetch goes over a network and picks up network latency. That is potentially a significant speed advantage right off the bat.
- There's no need for ACID (Atomicity, Consistency, Isolation, Durability) compliance like there is in a SQL db. That means weaker data-safety guarantees for Obsidian & DV, yes, but also less overhead for managing transactions, logging, and other safety measures. One positive result is faster query execution.
- Datasets are generally smaller. Since PKS like Obsidian typically manage much less data than a typical SQL db, I'd imagine they inherently require less query time.
- An Obsidian vault is configured for single-user access, so DV has no need for complex concurrency control, multi-user transaction management, etc. (unlike a multi-user DBMS).
- Just guessing here, but there are likely other optimizations built into DV as well: maybe caching of frequently accessed data / metadata, indexing on some data attributes to speed up querying, in-memory data representation (vs. continually reading from disk), etc.
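
To make the "indexing on some data attributes" guess concrete, here is a minimal TypeScript sketch of an inverted index from tags to file paths. It is an illustration only: `TagIndex`, its methods, and the sample paths are hypothetical names, not taken from Dataview's source, and nothing here claims Dataview is built this way.

```typescript
// Hypothetical sketch: these names are illustrative and not Dataview's API.
type Path = string;

class TagIndex {
  // Maps each tag to the set of file paths that carry it.
  private byTag = new Map<string, Set<Path>>();

  add(path: Path, tags: string[]): void {
    for (const tag of tags) {
      let paths = this.byTag.get(tag);
      if (!paths) {
        paths = new Set<Path>();
        this.byTag.set(tag, paths);
      }
      paths.add(path);
    }
  }

  // Finds matches with one map lookup instead of scanning every file.
  lookup(tag: string): Path[] {
    const paths = this.byTag.get(tag);
    return paths ? Array.from(paths).sort() : [];
  }
}

const tagIndex = new TagIndex();
tagIndex.add("notes/a.md", ["#project", "#todo"]);
tagIndex.add("notes/b.md", ["#project"]);
console.log(tagIndex.lookup("#project")); // both sample paths, sorted
```

The trade-off is the usual one for an index: a little extra memory and bookkeeping on every file change, in exchange for lookups that do not grow with the size of the vault.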

Hope that answers your question! Would love to hear your & others' thoughts too.

blacksmithgu (Maintainer) · 2023-10-30T04:27:59Z

SQLite and Postgres are absolutely faster than Dataview; it's just that the amount of data in a user vault is small enough that it doesn't really matter in most cases. Dataview essentially keeps an in-memory cache of all the useful metadata in the vault (every file, file frontmatter, paths, links, etc.) and uses it to run searches and execute queries quickly. You'll notice that Dataview does not really support searching over the content of files: that would require going to disk and scanning thousands of files, and certain kinds of queries would suddenly become much slower.
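
The in-memory cache idea described above can be sketched roughly as follows. This is a simplified illustration under assumptions, not Dataview's actual implementation: `PageMeta`, `InMemoryVaultCache`, and the sample data are hypothetical names made up for the example.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Dataview's API.
interface PageMeta {
  path: string;
  frontmatter: Record<string, unknown>;
  links: string[];
}

class InMemoryVaultCache {
  private pages = new Map<string, PageMeta>();

  // Assumed to run once per file at startup and again when a file changes.
  upsert(page: PageMeta): void {
    this.pages.set(page.path, page);
  }

  remove(path: string): void {
    this.pages.delete(path);
  }

  // A query scans objects already in memory; no file is opened.
  where(predicate: (page: PageMeta) => boolean): PageMeta[] {
    return Array.from(this.pages.values()).filter(predicate);
  }
}

const cache = new InMemoryVaultCache();
cache.upsert({ path: "books/dune.md", frontmatter: { rating: 5 }, links: ["authors/herbert.md"] });
cache.upsert({ path: "books/emma.md", frontmatter: { rating: 3 }, links: [] });

const favorites = cache.where((p) => Number(p.frontmatter["rating"]) >= 4);
console.log(favorites.map((p) => p.path));
```

Note that the cache holds only metadata. A query over note bodies would have no in-memory data to scan and would have to read each file from disk, which is the slow case the answer describes.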
