These changes introduce a new abstraction called a `cref`, which expands on the idea of a `ref` but is geared toward organizing business logic rather than materializing a database object.
This feature was built with Claude Code (Claude 4.0).
Problem
Consider these two mini dbt models in an imaginary DAG - which is a better implementation?
(1)
(2)
Someone unfamiliar with dbt might think (1) is better because it's more DRY. But an experienced Analytics Engineer would know (2) is better because it avoids creating a dependency on `fct_orders`, which is probably a very deep node (= a heavy dependency) meant for external use, not as an input to further internal constructions. (2) will parallelize better.
The problem is that building like (2) is harder than building like (1). To build like (2), the developer needs to look through the existing DAG, find the shallowest available references to the features they need (`o.created_at, o.status, customer_review`), and then reconstruct a join that probably already exists in `fct_orders`.
It would be great if developers could simply name an entity grain and its features, and have dbt generate the minimal joins (and cycle detection) automatically.
Solution
This PR introduces the Conceptual Ref, `cref()`, and the `Concept` definition (the word `Entity` was already taken!).
A 'concept' is defined in YAML, similar to a LookML Explore or a semantic model, except that the joins must be either M:1 or 1:1 relative to the base table, so a join never fans out the base grain. A concept represents the potential feature joins to a given grain.
It allows models to look more like query (1), while maintaining the advantages of (2).
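To make the idea of "minimal joins" concrete, here is a rough Python sketch of how requested features could be mapped to the smallest set of joins. It is not the code in this PR: the `Join` and `Concept` classes, the `resolve_joins` and `compile_cref` helpers, and the model names `stg_orders`, `stg_reviews` and `stg_customers` are all hypothetical, and a real implementation would emit `ref()` calls rather than bare table names.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Join:
    model: str            # hypothetical upstream model name
    on: str               # join condition, written against the "base" alias
    features: list[str]   # columns this M:1 / 1:1 join contributes


@dataclass
class Concept:
    name: str
    base_model: str
    base_features: list[str]
    joins: list[Join] = field(default_factory=list)


def resolve_joins(concept: Concept, requested: list[str]) -> list[Join]:
    """Return only the joins needed to supply the requested features."""
    remaining = set(requested) - set(concept.base_features)
    needed = []
    for join in concept.joins:
        hit = remaining & set(join.features)
        if hit:
            needed.append(join)
            remaining -= hit
    if remaining:
        raise KeyError(f"unknown features for {concept.name}: {sorted(remaining)}")
    return needed


def compile_cref(concept: Concept, requested: list[str]) -> str:
    """Render a subquery that touches only the base model and needed joins."""
    needed = resolve_joins(concept, requested)
    source = {f: "base" for f in concept.base_features}
    for join in needed:
        for f in join.features:
            source.setdefault(f, join.model)
    cols = ", ".join(f"{source[f]}.{f}" for f in requested)
    sql = [f"select {cols}", f"from {concept.base_model} as base"]
    sql += [f"left join {j.model} on {j.on}" for j in needed]
    return "(\n  " + "\n  ".join(sql) + "\n)"


# Hypothetical concept at the order grain.
orders = Concept(
    name="orders",
    base_model="stg_orders",
    base_features=["order_id", "created_at", "status"],
    joins=[
        Join("stg_reviews", "stg_reviews.order_id = base.order_id", ["customer_review"]),
        Join("stg_customers", "stg_customers.customer_id = base.customer_id", ["customer_name"]),
    ],
)

subquery = compile_cref(orders, ["created_at", "status", "customer_review"])
```

In this sketch, asking for `created_at`, `status` and `customer_review` pulls in only `stg_reviews`; `stg_customers` never becomes a dependency because none of its features were requested.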
It parses/compiles to the minimal reference, as a subquery, similar to the new `microbatch` date filters.
The underlying YAML would look like this:
Checklist