
Commit 8dc9283

checkpoint
1 parent 919cd5b commit 8dc9283

6 files changed

+102
-43
lines changed

docs/source/overview/ai-queries.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ This page details how you can use AI models in different ways to construct AI qu
99

1010
.. note::
1111

12-
EvaDB ships with a wide range of built-in functions listed in the :ref:`models` page. If your desired AI model is not available, you can also bring your own AI function by referrring to the :ref:`custom_ai_function` page.
12+
EvaDB ships with a wide range of built-in functions listed in the :ref:`models` page. If your desired AI model is not available, you can also bring your own AI function by referring to the :ref:`custom_ai_function` page.
1313

1414
SELECT Clause
1515
-------------

docs/source/overview/roadmap.rst

Lines changed: 32 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -1,60 +1,58 @@
11
Roadmap
22
=======
33

4-
The goal of this doc is to align core and community efforts for the project and to share what's the focus for the next 6 months.
4+
The goal of this roadmap is to align the efforts of the core EvaDB team and community contributors by describing the biggest focus areas for the next 6 months:
55

6-
What is the core EvaDB team working on right now?
7-
--------------------------------------------------
8-
9-
Our biggest priorities right now are improving the user experience of LLM data wrangling and classical AI tasks (e.g., regression, classification, and forecasting).
6+
.. note::
7+
Please ping us on our `Slack <https://evadb.ai/slack>`_ if you have any questions or feedback on these focus areas.
108

11-
LLM data wrangling
12-
~~~~~~~~~~~~~~~~~~
9+
LLM-based Data Wrangling
10+
~~~~~~~~~~~~~~~~~~~~~~~~
1311

14-
* Prompt Engineering: more flexibility of constructing prompt and better experience/feedback to tune the prompt.
15-
* LLM Cache: reuse the LLM calls based on the model, prompt, and input columns.
16-
* LLM Batch: intelligently group multiple LLM calls into one to reduce the cost and latency.
17-
* Cost Calculation and Estimation: show the cost (i.e., time, token usage, and dollars) of the query at the plan time and after execution.
12+
* Prompt Engineering: More flexibility in constructing prompts and better developer experience/feedback for tuning them.
13+
* LLM Cache: Reuse the results of LLM calls based on the model, prompt, and input columns.
14+
* LLM Batching: Intelligently group multiple LLM calls into a single call to reduce cost and latency.
15+
* LLM Cost Calculation and Estimation: Show the estimated cost metrics (i.e., time, token usage, and dollars) of the query at optimization time and the actual cost metrics after query execution.
1816

19-
Classical AI tasks
17+
Classical AI Tasks
2018
~~~~~~~~~~~~~~~~~~
2119

22-
* Accuracy: show the accuracy of the training.
23-
* Configuration guidance: provide guidance and suggestion on how to configure the AutoML framework (e.g., which frequency to use for forcasting).
24-
* Cost calculation and estimation: show the cost (i.e., time) of the query the plan time and after exectuion.
25-
* Path to Scale: improve the processing pipeline for large datasets.
26-
27-
What areas are great for community contributions?
28-
--------------------------------------------------
20+
* Accuracy: Show the accuracy of the training loop.
21+
* Configuration Guidance: Provide guidance on how to configure the AutoML framework (e.g., which frequency to use for forecasting).
22+
* Task Cost Calculation and Estimation: Show the estimated cost metrics (i.e., time) of the query at optimization time and the actual cost metrics after execution.
23+
* Path to Scalability: Improve the efficiency of the query processing pipeline for large datasets.
2924

30-
.. note::
31-
If you are unsure about your idea, feel free to chat with us in the **#community** channel in our `Slack <https://evadb.ai/slack>`_.
3225

3326
We look forward to expanding our integrations, including data sources and AI functions, so that they can be used with the rest of the EvaDB ecosystem.
3427

35-
Example Data Sources
36-
~~~~~~~~~~~~~~~~~~~~
28+
More Application Data Sources
29+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30+
31+
`GitHub <https://github.com/georgia-tech-db/evadb/tree/staging/evadb/third_party/databases/github>`_ is an **application data source** already available in EvaDB. Such data sources allow the developer to quickly build AI applications without focusing on extracting, loading, and transforming data from the application.
3732

38-
`GitHub <https://github.com/georgia-tech-db/evadb/tree/staging/evadb/third_party/databases/github>`_ is one application data sources we have added in EvaDB. These application data sources help the user to develop AI applications without the needs of extracting, loading, and transforming data. Example application data sources that are not in EvaDB yet, but we think can boost the AI applications, include (but not limited to) the following:
33+
Data sources that are not available in EvaDB yet, but would be highly relevant for emerging AI applications, include (but are not limited to) the following applications:
3934

4035
* YouTube
4136
* Google Search
4237
* Reddit
4338
* arXiv
39+
* Hacker News
40+
41+
When adding a data source to EvaDB, please add a documentation page in your PR explaining the usage. Here is an `illustrative documentation page <https://evadb.readthedocs.io/en/stable/source/reference/databases/github.html>`_ for the GitHub data source in EvaDB.
42+
43+
More AI functions
44+
~~~~~~~~~~~~~~~~~
4445

45-
When adding a data source to EvaDB, we do expect a documentation page to explain the usage. This is an `example documentation page <https://evadb.readthedocs.io/en/stable/source/reference/databases/github.html>`_ for the GitHub integration.
46+
Adding more AI functions in EvaDB will enable more choices for app developers while building AI applications.
4647

47-
Example AI functions
48-
~~~~~~~~~~~~~~~~~~~~
48+
`Stable Diffusion <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/stable_diffusion.py>`_ is an illustrative AI function in EvaDB that generates an image given a text prompt.
4949

50-
Adding more AI functions in EvaDB can give users more choices and possibilities for developing AI applications.
51-
`Stable Diffusion <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/stable_diffusion.py>`_ is an example AI function in EvaDB that generates an image given a prompt.
52-
Example AI functions that are not in EvaDB yet, but we think can boost the AI applications, include (but not limited to) the following:
50+
AI functions that are not available in EvaDB yet, but would be highly relevant for emerging AI applications, include (but are not limited to) the following:
5351

54-
* Sklearn (besides the linear regression)
52+
* Sklearn (beyond linear regression)
5553
* OCR (PyTesseract)
56-
* AWS Rekognition service
54+
* AWS Rekognition service
5755

58-
When adding a AI function to EvaDB, we do expect a documentation page to explain the usage. This is an `example documetation page <https://evadb.readthedocs.io/en/latest/source/reference/ai/stablediffusion.html>`_ for Stable Diffusion. Optionally, but highly recommended is also to have a notebook to showcase the use cases.
59-
Example `notebook <https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/18-stable-diffusion.ipynb>`_ for Stable Diffusion.
56+
When adding an AI function to EvaDB, please add a documentation page in your PR explaining the usage. Here is an `illustrative documentation page <https://evadb.readthedocs.io/en/latest/source/reference/ai/stablediffusion.html>`_ for Stable Diffusion.
6057

58+
Notebooks are also super helpful to showcase use-cases! Here is an illustrative `notebook <https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/18-stable-diffusion.ipynb>`_ on using Stable Diffusion in EvaDB queries.

docs/source/reference/ai/model-train-xgboost.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ To use the `Flaml XGBoost AutoML framework <https://microsoft.github.io/FLAML/do
2323
PREDICT 'rental_price';
2424
2525
In the above query, you are creating a new customized function by training a model from the ``HomeRentals`` table using the ``Flaml XGBoost`` framework.
26-
The ``rental_price`` column will be the target column for predication, while the rest columns from the ``SELET`` query are the inputs.
26+
The ``rental_price`` column will be the target column for prediction, while the remaining columns from the ``SELECT`` query are the inputs.
2727

2828
3. Model Training Parameters
2929
----------------------------

evadb/binder/function_expression_binder.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -112,10 +112,10 @@ def bind_func_expr(binder: StatementBinder, node: FunctionExpression):
112112
if string_comparison_case_insensitive(node.name, "CHATGPT"):
113113
# if the user didn't provide any API_KEY, check if we have one in the catalog
114114
if "OPENAI_API_KEY" not in properties.keys():
115-
openapi_key = binder._catalog().get_configuration_catalog_value(
115+
openai_key = binder._catalog().get_configuration_catalog_value(
116116
"OPENAI_API_KEY"
117117
)
118-
properties["openai_api_key"] = openapi_key
118+
properties["openai_api_key"] = openai_key
119119

120120
node.function = lambda: function_class(**properties)
121121
except Exception as e:
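
Review note: the hunk above falls back to a key stored in the catalog when the caller did not pass ``OPENAI_API_KEY``. The fallback only works if the variable that is assigned and the variable that is read share one name. A minimal standalone sketch of that pattern follows. ``resolve_openai_key`` and ``catalog_lookup`` are hypothetical names used for illustration and are not part of EvaDB; the lookup callable stands in for ``get_configuration_catalog_value``.

```python
from typing import Callable, Dict, Optional


def resolve_openai_key(
    properties: Dict[str, str],
    catalog_lookup: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Return a copy of `properties`, adding a catalog-provided key if needed.

    Hypothetical helper: mirrors the check in the hunk, where the caller's
    properties are searched for "OPENAI_API_KEY" and the fallback value is
    stored under "openai_api_key".
    """
    resolved = dict(properties)  # do not mutate the caller's dict
    if "OPENAI_API_KEY" not in resolved:
        # Assign and read the same name; a mismatch here would raise NameError.
        openai_key = catalog_lookup("OPENAI_API_KEY")
        resolved["openai_api_key"] = openai_key
    return resolved


# Usage with an in-memory stand-in for the configuration catalog.
fake_catalog = {"OPENAI_API_KEY": "sk-from-catalog"}
from_catalog = resolve_openai_key({}, fake_catalog.get)
from_user = resolve_openai_key({"OPENAI_API_KEY": "sk-from-user"}, fake_catalog.get)
```

Returning a copy rather than mutating the input is a choice made for this sketch only; the code in the hunk updates ``properties`` in place.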

evadb/binder/statement_binder_context.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -152,10 +152,10 @@ def raise_error():
152152
res = process.extractOne(col_name, all_columns)
153153
if res is not None:
154154
guess_column, _ = res
155-
err_msg = f"Cannnot find column {col_name}. Did you mean {guess_column}? The feasible columns are {all_columns}."
155+
err_msg = f"Cannot find column {col_name}. Did you mean {guess_column}? The feasible columns are {all_columns}."
156156
else:
157157
err_msg = (
158-
f"Cannnot find column {col_name}. There are no feasible columns."
158+
f"Cannot find column {col_name}. There are no feasible columns."
159159
)
160160
logger.error(err_msg)
161161
raise BinderError(err_msg)
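
Review note: the hunk above builds a "Did you mean ...?" message from the closest column name. The sketch below reproduces the two message branches in isolation. It is an approximation under stated assumptions: it swaps in the standard library's ``difflib.get_close_matches`` for the ``process.extractOne`` call in the hunk so that it runs without third-party packages, and ``column_not_found_message`` is a hypothetical helper name, not an EvaDB function. The two matchers score strings differently, so the suggested column may differ in edge cases.

```python
import difflib
from typing import List


def column_not_found_message(col_name: str, all_columns: List[str]) -> str:
    """Build the error text for an unknown column, suggesting the closest name."""
    # cutoff=0.0 returns the best candidate whenever any column exists,
    # so the "no feasible columns" branch is taken only for an empty list.
    matches = difflib.get_close_matches(col_name, all_columns, n=1, cutoff=0.0)
    if matches:
        guess_column = matches[0]
        return (
            f"Cannot find column {col_name}. Did you mean {guess_column}? "
            f"The feasible columns are {all_columns}."
        )
    return f"Cannot find column {col_name}. There are no feasible columns."


# Usage: a misspelled column against a small schema, then an empty schema.
with_guess = column_not_found_message("nam", ["id", "name"])
without_guess = column_not_found_message("nam", [])
```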
