
Commit ecbabbd

fix: Add getting started tutorial to git (#870)
## Description

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: hajdul88 <[email protected]>
1 parent ce3a37f commit ecbabbd

File tree

12 files changed: +770 -6 lines changed

.dlt/config.toml

Lines changed: 0 additions & 6 deletions
This file was deleted.

README.md

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ More on [use-cases](https://docs.cognee.ai/use-cases) and [evals](https://github
Get started quickly with a Google Colab <a href="https://colab.research.google.com/drive/1jHbWVypDgCLwjE71GSXhRL3YxYhCZzG1?usp=sharing">notebook</a> , <a href="https://deepnote.com/workspace/cognee-382213d0-0444-4c89-8265-13770e333c02/project/cognee-demo-78ffacb9-5832-4611-bb1a-560386068b30/notebook/Notebook-1-75b24cda566d4c24ab348f7150792601?utm_source=share-modal&utm_medium=product-shared-content&utm_campaign=notebook&utm_content=78ffacb9-5832-4611-bb1a-560386068b30">Deepnote notebook</a> or <a href="https://github.com/topoteretes/cognee-starter">starter repo</a>


+
## Contributing
Your contributions are at the core of making this a true open source project. Any contributions you make are **greatly appreciated**. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for more information.

cognee-starter-kit/.env.template

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
# In case you choose to use OpenAI provider, just adjust the model and api_key.
LLM_API_KEY=""
LLM_MODEL="openai/gpt-4o-mini"
LLM_PROVIDER="openai"
# Not needed if you use OpenAI
LLM_ENDPOINT=""
LLM_API_VERSION=""

# In case you choose to use OpenAI provider, just adjust the model and api_key.
EMBEDDING_API_KEY=""
EMBEDDING_MODEL="openai/text-embedding-3-large"
EMBEDDING_PROVIDER="openai"
# Not needed if you use OpenAI
EMBEDDING_ENDPOINT=""
EMBEDDING_API_VERSION=""


GRAPHISTRY_USERNAME=""
GRAPHISTRY_PASSWORD=""
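Before running a pipeline it can help to confirm that the template has actually been filled in. The following is a minimal sketch using only the standard library; `parse_env`, `missing_keys`, and the `REQUIRED_FOR_OPENAI` tuple are hypothetical helpers, and the choice of required keys is an assumption read off the template comments (model and API key), not something cognee itself enforces this way.

```python
# Hypothetical helper, not part of cognee: checks a .env-style text for empty values.

def parse_env(text: str) -> dict[str, str]:
    """Parse KEY="value" lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so `""` counts as empty.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required) -> list[str]:
    """Return the required keys that are absent or empty, in the given order."""
    return [key for key in required if not values.get(key)]


# Assumption: with the OpenAI provider only the model and API key need values.
REQUIRED_FOR_OPENAI = ("LLM_API_KEY", "LLM_MODEL", "EMBEDDING_API_KEY", "EMBEDDING_MODEL")

# Excerpt of the template above, still unfilled.
TEMPLATE = '''
# In case you choose to use OpenAI provider, just adjust the model and api_key.
LLM_API_KEY=""
LLM_MODEL="openai/gpt-4o-mini"
LLM_PROVIDER="openai"

EMBEDDING_API_KEY=""
EMBEDDING_MODEL="openai/text-embedding-3-large"
EMBEDDING_PROVIDER="openai"
'''

values = parse_env(TEMPLATE)
print(missing_keys(values, REQUIRED_FOR_OPENAI))
```

For a real check, read the text from your own `.env` file instead of the inline `TEMPLATE` string.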

cognee-starter-kit/.gitignore

Lines changed: 196 additions & 0 deletions
@@ -0,0 +1,196 @@
.data
.env
.local.env
.prod.env
cognee/.data/

code_pipeline_output*/

*.lance/
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

full_run.ipynb

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Cognee logs directory - keep directory, ignore contents
logs/*
!logs/.gitkeep
!logs/README.md

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.env.local
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

.vscode/
cognee/data/
cognee/cache/

# Default cognee system directory, used in development
.cognee_system/
.data_storage/
.artifacts/
.anon_id

node_modules/

# Evals
SWE-bench_testsample/

# ChromaDB Data
.chromadb_data/

cognee-starter-kit/README.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
# Cognee Starter Kit
Welcome to the <a href="https://github.com/topoteretes/cognee">cognee</a> Starter Repo! This repository helps you get started quickly by providing a structured dataset and pre-built data pipelines that use cognee to build knowledge graphs.

You can use this repo to ingest, process, and visualize data in minutes.

By following this guide, you will:

- Load structured company and employee data
- Use pre-built pipelines for data processing
- Perform graph-based search and query operations
- Visualize entity relationships on a graph

# How to Use This Repo 🛠

## Install uv if you don't have it on your system
```
pip install uv
```
## Install dependencies
```
uv sync
```

## Setup LLM
Add the following environment variables to a `.env` file.
If you use the OpenAI provider, you only need to set the model and the API key.
```
LLM_PROVIDER=""
LLM_MODEL=""
LLM_ENDPOINT=""
LLM_API_KEY=""
LLM_API_VERSION=""

EMBEDDING_PROVIDER=""
EMBEDDING_MODEL=""
EMBEDDING_ENDPOINT=""
EMBEDDING_API_KEY=""
EMBEDDING_API_VERSION=""
```

Activate the Python environment:
```
source .venv/bin/activate
```

## Run the Default Pipeline

This script runs the cognify pipeline with default settings. It ingests text data, builds a knowledge graph, and allows you to run search queries.

```
python src/pipelines/default.py
```

## Run the Low-Level Pipeline

This script implements its own pipeline with a custom ingestion task. It processes the given JSON data about companies and employees, making it searchable via a graph.

```
python src/pipelines/low_level.py
```

## Run the Custom Model Pipeline

This script uses a custom Pydantic model for graph extraction. As an example, it categorizes programming languages and visualizes their relationships.

```
python src/pipelines/custom-model.py
```

## Graph preview

cognee provides a `visualize_graph` function that renders the graph for you.

```
import os
import pathlib

# visualize_graph is provided by cognee; call it from inside an async function.
graph_file_path = str(
    pathlib.Path(
        os.path.join(pathlib.Path(__file__).parent, ".artifacts/graph_visualization.html")
    ).resolve()
)
await visualize_graph(graph_file_path)
```
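The output path above can also be built with `pathlib` alone. This is a minimal sketch, not the starter kit's own code: `artifact_path` is a hypothetical helper, and it creates the `.artifacts` folder up front because this README does not say whether `visualize_graph` creates missing directories.

```python
import pathlib


def artifact_path(script_file: str, name: str = "graph_visualization.html") -> str:
    """Return the absolute path of `.artifacts/<name>` next to the given script."""
    artifacts_dir = pathlib.Path(script_file).resolve().parent / ".artifacts"
    # Assumption: creating the folder ourselves is harmless if the renderer also does it.
    artifacts_dir.mkdir(parents=True, exist_ok=True)
    return str(artifacts_dir / name)
```

Inside a pipeline script you would call it as `artifact_path(__file__)` and pass the result to `visualize_graph`.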
If you want to use a tool like Graphistry for graph visualization:
- create an account and an API key at https://www.graphistry.com
- add the following environment variables to your `.env` file:
```
GRAPHISTRY_USERNAME=""
GRAPHISTRY_PASSWORD=""
```
Note: `GRAPHISTRY_PASSWORD` is the API key.


# What will you build with cognee?

- Expand the dataset by adding more structured/unstructured data
- Customize the data model to fit your use case
- Use the search API to build an intelligent assistant
- Visualize knowledge graphs for better insights

cognee-starter-kit/pyproject.toml

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
[project]
name = "cognee-starter"
version = "0.1.1"
description = "Starter project which can be harvested for parts"
readme = "README.md"

requires-python = ">=3.10, <=3.13"

dependencies = [
    "cognee>=0.1.38",
]
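One detail of `requires-python = ">=3.10, <=3.13"` is easy to miss: under PEP 440 ordering, `3.13` compares as `3.13.0`, so a patch release such as 3.13.1 falls outside `<=3.13`. The sketch below approximates that comparison for final releases with plain tuples; `satisfies_requires_python` is a hypothetical helper, it ignores pre-releases, and individual installers may apply the upper bound more leniently.

```python
import sys

# Bounds copied from the pyproject.toml above, padded to three components.
LOWER = (3, 10, 0)
UPPER = (3, 13, 0)


def satisfies_requires_python(version: tuple) -> bool:
    """Approximate `>=3.10, <=3.13` for final releases (pre-releases ignored)."""
    parts = tuple(version[:3])
    # Pad so that (3, 13) is compared as (3, 13, 0).
    padded = parts + (0,) * (3 - len(parts))
    return LOWER <= padded <= UPPER


print(satisfies_requires_python(sys.version_info[:3]))
```

If patch releases of 3.13 are meant to be allowed, a bound such as `<3.14` expresses that more directly.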
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
[
    {
        "name": "TechNova Inc.",
        "departments": [
            "Engineering",
            "Marketing"
        ]
    },
    {
        "name": "GreenFuture Solutions",
        "departments": [
            "Research & Development",
            "Sales",
            "Customer Support"
        ]
    },
    {
        "name": "Skyline Financials",
        "departments": [
            "Accounting"
        ]
    },
    {
        "name": "MediCare Plus",
        "departments": [
            "Healthcare",
            "Administration"
        ]
    },
    {
        "name": "NextGen Robotics",
        "departments": [
            "AI Development",
            "Manufacturing",
            "HR"
        ]
    }
]
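To see how records of this shape map onto a graph, the following sketch flattens them into (company, relation, department) triples with the standard library only. It is not the ingestion task used by the starter kit's pipelines; `company_department_edges` and the `HAS_DEPARTMENT` label are hypothetical names chosen for illustration, and the inline JSON is a two-record excerpt of the data above.

```python
import json

# Two records excerpted from the JSON above.
COMPANIES_JSON = """
[
    {"name": "TechNova Inc.", "departments": ["Engineering", "Marketing"]},
    {"name": "Skyline Financials", "departments": ["Accounting"]}
]
"""


def company_department_edges(companies: list) -> list:
    """Flatten company records into (company, relation, department) triples."""
    edges = []
    for company in companies:
        # A record without a "departments" key simply contributes no edges.
        for department in company.get("departments", []):
            edges.append((company["name"], "HAS_DEPARTMENT", department))
    return edges


edges = company_department_edges(json.loads(COMPANIES_JSON))
for edge in edges:
    print(edge)
```

Each triple corresponds to one edge between a company node and a department node, which is the kind of structure a graph search can then traverse.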
