Smart Data Enrichment Tool

Some ideas for a potential "smart data augmentation" tool that could be built on top of Open Datasets.

The idea is to pass your data through a set of "checks" or "matches". You get back a set of extra columns that might be relevant, derived from the open datasets.

The matching is done by an LLM. It receives every column name and a sample of its values, and tries to match each column with known relevant columns¹.
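A minimal sketch of that matching step, assuming the LLM is available as a plain `prompt -> text` callable. The function names, the prompt wording, and the `NONE` convention are all hypothetical, not part of any existing Open Datasets API.

```python
from typing import Callable, Optional, Sequence


def build_match_prompt(
    column: str, samples: Sequence[str], candidates: Sequence[str]
) -> str:
    """Describe one user column and list the candidate open-dataset columns."""
    sample_text = ", ".join(repr(s) for s in samples)
    options = "\n".join(f"- {c}" for c in candidates)
    return (
        f"Column name: {column}\n"
        f"Sample values: {sample_text}\n"
        "Which of these known columns holds the same kind of data?\n"
        f"{options}\n"
        "Answer with exactly one name from the list, or NONE."
    )


def match_column(
    column: str,
    samples: Sequence[str],
    candidates: Sequence[str],
    llm: Callable[[str], str],  # assumption: any function that maps prompt -> reply
) -> Optional[str]:
    """Return the matched candidate, or None if the reply is not a known column."""
    answer = llm(build_match_prompt(column, samples, candidates)).strip()
    # Reject anything that is not literally one of the candidates, so a
    # hallucinated column name never reaches the join step.
    return answer if answer in candidates else None
```

Validating the reply against the candidate list keeps the model's output constrained to columns that actually exist.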

Additionally, the tool could suggest some LLM-derived columns based on existing ones (e.g., a Country column passed through a "Capital" prompt) or let the user set a custom prompt to "augment" one of the columns. This won't use any real data but could be useful (e.g., to classify text sentiment).
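As a rough illustration of an LLM-derived column, the sketch below runs every value of a column through a prompt template. `derive_column`, the `{value}` placeholder, and the `llm` callable are assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence


def derive_column(
    values: Sequence[str],
    prompt_template: str,       # hypothetical format: must contain "{value}"
    llm: Callable[[str], str],  # assumption: any function that maps prompt -> reply
) -> List[str]:
    """Build a new column by prompting the LLM once per distinct value."""
    cache: Dict[str, str] = {}
    for value in values:
        if value not in cache:
            # Repeated values (common in categorical columns) reuse the cached reply.
            cache[value] = llm(prompt_template.format(value=value)).strip()
    return [cache[value] for value in values]
```

For the "Capital" case the template could be something like `"What is the capital of {value}?"`, and a user-supplied custom prompt would go through the same function.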

¹ To make it fast, each column could be processed in parallel. The sample values could be embedded and used to retrieve similar columns in the Open Datasets space. The same could be done at the column level.
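One way the footnote could look in practice, as a sketch only: character-trigram counts stand in for a real embedding model, a dict stands in for the Open Datasets index, and a thread pool handles the columns in parallel. All names here are hypothetical.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence, Tuple


def embed(samples: Sequence[str]) -> Counter:
    """Toy stand-in for an embedding model: character-trigram counts."""
    grams: Counter = Counter()
    for sample in samples:
        text = f"  {str(sample).lower()} "
        grams.update(text[i:i + 3] for i in range(len(text) - 2))
    return grams


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_matches(
    samples: Sequence[str], index: Dict[str, Counter], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k indexed columns most similar to the sample values."""
    query = embed(samples)
    scored = [(name, cosine(query, vector)) for name, vector in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def match_all(
    table: Dict[str, Sequence[str]], index: Dict[str, Counter], k: int = 3
) -> Dict[str, List[Tuple[str, float]]]:
    """Retrieve candidates for every column of the user's table in parallel."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda s: top_matches(s, index, k), table.values())
        return dict(zip(table.keys(), results))
```

Threads only pay off here if `embed` is replaced by a network call to an embedding service, where the work is I/O-bound. The retrieved candidates could then be handed to the LLM as the short list to choose from.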


Smart Data Enrichment Tool #48
