fix: Add getting started tutorial to git #870
Merged
6 commits
b82bc41 add gettin started to git (Vasilije1990)
39b8a6a Sort ref (Vasilije1990)
646debc Merge branch 'dev' into add_test_to_git (Vasilije1990)
5eb9954 Merge branch 'dev' into add_test_to_git (hajdul88)
ef90a99 Merge branch 'dev' into add_test_to_git (hajdul88)
ad6cd8f chore: formats cognee starter kit with ruff (hajdul88)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# In case you choose to use OpenAI provider, just adjust the model and api_key. | ||
LLM_API_KEY="" | ||
LLM_MODEL="openai/gpt-4o-mini" | ||
LLM_PROVIDER="openai" | ||
# Not needed if you use OpenAI | ||
LLM_ENDPOINT="" | ||
LLM_API_VERSION="" | ||
|
||
# In case you choose to use OpenAI provider, just adjust the model and api_key. | ||
EMBEDDING_API_KEY="" | ||
EMBEDDING_MODEL="openai/text-embedding-3-large" | ||
EMBEDDING_PROVIDER="openai" | ||
# Not needed if you use OpenAI | ||
EMBEDDING_ENDPOINT="" | ||
EMBEDDING_API_VERSION="" | ||
|
||
|
||
GRAPHISTRY_USERNAME="" | ||
GRAPHISTRY_PASSWORD="" |
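As a rough illustration of how the template above might be consumed, the sketch below parses `KEY="value"` lines and reports which settings are still empty. The helper names `parse_env` and `missing_settings` are hypothetical, and treating only the model and API key as required for the OpenAI provider (plus an endpoint for anything else) is an assumption drawn from the comments in the template, not cognee's actual validation logic.

```python
# Hypothetical helpers for illustration; not part of cognee.
TEMPLATE = '''
# In case you choose to use OpenAI provider, just adjust the model and api_key.
LLM_API_KEY=""
LLM_MODEL="openai/gpt-4o-mini"
LLM_PROVIDER="openai"
# Not needed if you use OpenAI
LLM_ENDPOINT=""
LLM_API_VERSION=""
'''


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def missing_settings(settings: dict[str, str], prefix: str) -> list[str]:
    """Return the assumed-required keys for `prefix` that are unset or empty."""
    required = [f"{prefix}_MODEL", f"{prefix}_API_KEY"]
    if settings.get(f"{prefix}_PROVIDER") != "openai":
        # Assumption: non-OpenAI providers also need an explicit endpoint.
        required.append(f"{prefix}_ENDPOINT")
    return [key for key in required if not settings.get(key)]


settings = parse_env(TEMPLATE)
print(missing_settings(settings, "LLM"))
```

Running a check like this before starting a pipeline would surface an empty `LLM_API_KEY` early instead of failing on the first model call.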
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,196 @@ | ||
.data | ||
.env | ||
.local.env | ||
.prod.env | ||
cognee/.data/ | ||
|
||
code_pipeline_output*/ | ||
|
||
*.lance/ | ||
.DS_Store | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
full_run.ipynb | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
*.py,cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
cover/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
db.sqlite3-journal | ||
|
||
# Cognee logs directory - keep directory, ignore contents | ||
logs/* | ||
!logs/.gitkeep | ||
!logs/README.md | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
.pybuilder/ | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
# For a library or package, you might want to ignore these files since the code is | ||
# intended to run in multiple environments; otherwise, check them in: | ||
# .python-version | ||
|
||
# pipenv | ||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
# However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
# having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
# install all needed dependencies. | ||
#Pipfile.lock | ||
|
||
# poetry | ||
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. | ||
# This is especially recommended for binary packages to ensure reproducibility, and is more | ||
# commonly ignored for libraries. | ||
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control | ||
#poetry.lock | ||
|
||
# pdm | ||
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. | ||
#pdm.lock | ||
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it | ||
# in version control. | ||
# https://pdm.fming.dev/#use-with-ide | ||
.pdm.toml | ||
|
||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm | ||
__pypackages__/ | ||
|
||
# Celery stuff | ||
celerybeat-schedule | ||
celerybeat.pid | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.env.local | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
# Pyre type checker | ||
.pyre/ | ||
|
||
# pytype static type analyzer | ||
.pytype/ | ||
|
||
# Cython debug symbols | ||
cython_debug/ | ||
|
||
# PyCharm | ||
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can | ||
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore | ||
# and can be added to the global gitignore or merged into this file. For a more nuclear | ||
# option (not recommended) you can uncomment the following to ignore the entire idea folder. | ||
.idea/ | ||
|
||
.vscode/ | ||
cognee/data/ | ||
cognee/cache/ | ||
|
||
# Default cognee system directory, used in development | ||
.cognee_system/ | ||
.data_storage/ | ||
.artifacts/ | ||
.anon_id | ||
|
||
node_modules/ | ||
|
||
# Evals | ||
SWE-bench_testsample/ | ||
|
||
# ChromaDB Data | ||
.chromadb_data/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
|
||
# Cognee Starter Kit | ||
Welcome to the <a href="https://github.com/topoteretes/cognee">cognee</a> Starter Repo! This repository is designed to help you get started quickly by providing a structured dataset and pre-built data pipelines using cognee to build powerful knowledge graphs. | ||
|
||
You can use this repo to ingest, process, and visualize data in minutes. | ||
|
||
By following this guide, you will: | ||
|
||
- Load structured company and employee data | ||
- Utilize pre-built pipelines for data processing | ||
- Perform graph-based search and query operations | ||
- Visualize entity relationships effortlessly on a graph | ||
|
||
# How to Use This Repo 🛠 | ||
|
||
## Install uv if you don't have it on your system | ||
``` | ||
pip install uv | ||
``` | ||
## Install dependencies | ||
``` | ||
uv sync | ||
``` | ||
|
||
## Setup LLM | ||
Add the environment variables below to the `.env` file. | ||
If you use the OpenAI provider, set only the model and API key. | ||
``` | ||
LLM_PROVIDER="" | ||
LLM_MODEL="" | ||
LLM_ENDPOINT="" | ||
LLM_API_KEY="" | ||
LLM_API_VERSION="" | ||
|
||
EMBEDDING_PROVIDER="" | ||
EMBEDDING_MODEL="" | ||
EMBEDDING_ENDPOINT="" | ||
EMBEDDING_API_KEY="" | ||
EMBEDDING_API_VERSION="" | ||
``` | ||
|
||
Activate the Python environment: | ||
``` | ||
source .venv/bin/activate | ||
``` | ||
|
||
## Run the Default Pipeline | ||
|
||
This script runs the cognify pipeline with default settings. It ingests text data, builds a knowledge graph, and allows you to run search queries. | ||
|
||
``` | ||
python src/pipelines/default.py | ||
``` | ||
|
||
## Run the Low-Level Pipeline | ||
|
||
This script implements its own pipeline with custom ingestion task. It processes the given JSON data about companies and employees, making it searchable via a graph. | ||
|
||
``` | ||
python src/pipelines/low_level.py | ||
``` | ||
|
||
## Run the Custom Model Pipeline | ||
|
||
This pipeline uses a custom Pydantic model for graph extraction. As an example, the script categorizes programming languages and visualizes their relationships. | ||
|
||
``` | ||
python src/pipelines/custom-model.py | ||
``` | ||
|
||
## Graph preview | ||
|
||
cognee provides a visualize_graph function that will render the graph for you. | ||
|
||
``` | ||
graph_file_path = str( | ||
    pathlib.Path( | ||
        os.path.join(pathlib.Path(__file__).parent, ".artifacts/graph_visualization.html") | ||
    ).resolve() | ||
) | ||
await visualize_graph(graph_file_path) | ||
``` | ||
If you want to use tools like Graphistry for graph visualization: | ||
- create an account and API key from https://www.graphistry.com | ||
- add the following environment variables to `.env` file: | ||
``` | ||
GRAPHISTRY_USERNAME="" | ||
GRAPHISTRY_PASSWORD="" | ||
``` | ||
Note: `GRAPHISTRY_PASSWORD` is API key. | ||
|
||
|
||
# What will you build with cognee? | ||
|
||
- Expand the dataset by adding more structured/unstructured data | ||
- Customize the data model to fit your use case | ||
- Use the search API to build an intelligent assistant | ||
- Visualize knowledge graphs for better insights |
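The graph preview snippet in the README relies on `os` and `pathlib` without showing their imports. A minimal sketch of just the path computation follows; the helper name `graph_output_path` and the example script path are hypothetical, and the call to `visualize_graph` is left as a comment because its import is not shown in the README.

```python
import os
import pathlib


def graph_output_path(script_file: str) -> str:
    """Resolve .artifacts/graph_visualization.html next to the given script."""
    script_dir = pathlib.Path(script_file).parent
    return str(
        pathlib.Path(
            os.path.join(script_dir, ".artifacts/graph_visualization.html")
        ).resolve()
    )


# Inside a pipeline script, __file__ would be passed instead of this example path.
graph_file_path = graph_output_path("src/pipelines/default.py")
print(graph_file_path)
# The README then hands this path to cognee:
# await visualize_graph(graph_file_path)
```

Resolving the path relative to the script, rather than the current working directory, keeps the output location stable no matter where the script is launched from.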
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
[project] | ||
name = "cognee-starter" | ||
version = "0.1.1" | ||
description = "Starter project which can be harvested for parts" | ||
readme = "README.md" | ||
|
||
requires-python = ">=3.10, <=3.13" | ||
|
||
dependencies = [ | ||
"cognee>=0.1.38", | ||
] |
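The `requires-python` range above can be sanity-checked by hand. The following is a rough sketch that approximates the PEP 440 comparison with plain tuples rather than a real specifier parser; the function name is hypothetical. Read literally under PEP 440 ordering, `<=3.13` admits 3.13.0 but not later patch releases such as 3.13.1, and the tuple comparison reproduces that.

```python
import sys

LOWER = (3, 10, 0)  # from ">=3.10"
UPPER = (3, 13, 0)  # from "<=3.13", which PEP 440 ordering treats as 3.13.0


def satisfies_requires_python(version: tuple[int, int, int]) -> bool:
    """Approximate '>=3.10, <=3.13' for a (major, minor, micro) tuple."""
    return LOWER <= version <= UPPER


# Check the interpreter running this script.
print(satisfies_requires_python(tuple(sys.version_info[:3])))
```

If the intent is to allow every 3.13.x release, an exclusive upper bound such as `<3.14` would express that more directly.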
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
[ | ||
{ | ||
"name": "TechNova Inc.", | ||
"departments": [ | ||
"Engineering", | ||
"Marketing" | ||
] | ||
}, | ||
{ | ||
"name": "GreenFuture Solutions", | ||
"departments": [ | ||
"Research & Development", | ||
"Sales", | ||
"Customer Support" | ||
] | ||
}, | ||
{ | ||
"name": "Skyline Financials", | ||
"departments": [ | ||
"Accounting" | ||
] | ||
}, | ||
{ | ||
"name": "MediCare Plus", | ||
"departments": [ | ||
"Healthcare", | ||
"Administration" | ||
] | ||
}, | ||
{ | ||
"name": "NextGen Robotics", | ||
"departments": [ | ||
"AI Development", | ||
"Manufacturing", | ||
"HR" | ||
] | ||
} | ||
] |
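To illustrate how the company data above could be turned into graph edges, here is a small sketch that flattens each company into (company, relation, department) triples. The relation label `HAS_DEPARTMENT` and the function name are hypothetical and are not taken from the starter kit's pipelines; only two of the five records are inlined to keep the example short.

```python
import json

# Two records copied from the dataset above.
COMPANIES_JSON = """
[
  {"name": "TechNova Inc.", "departments": ["Engineering", "Marketing"]},
  {"name": "Skyline Financials", "departments": ["Accounting"]}
]
"""


def company_department_edges(companies: list[dict]) -> list[tuple[str, str, str]]:
    """Emit one (company, relation, department) triple per listed department."""
    return [
        (company["name"], "HAS_DEPARTMENT", department)
        for company in companies
        for department in company["departments"]
    ]


edges = company_department_edges(json.loads(COMPANIES_JSON))
for source, relation, target in edges:
    print(f"{source} -[{relation}]-> {target}")
```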
🛠️ Refactor suggestion
Fix markdown formatting and language issues.
The static analysis tools have identified several formatting and language issues that should be addressed for better documentation quality.
 ## Install dependencies
-```
+```bash
 uv sync

 Activate the Python environment:
-```
+```bash
 source .venv/bin/activate
@@ -56,7 +56,7 @@ python src/pipelines/default.py
 This script implements its own pipeline with custom ingestion task. It processes the given JSON data about companies and employees, making it searchable via a graph.
-```
+```bash
 python src/pipelines/low_level.py
@@ -72,7 +72,7 @@ python src/pipelines/custom-model.py
 cognee provides a visualize_graph function that will render the graph for you.
-```
+```python
 graph_file_path = str(
     pathlib.Path(
         os.path.join(pathlib.Path(__file__).parent, ".artifacts/graph_visualization.html")
@@ -81,10 +81,10 @@ cognee provides a visualize_graph function that will render the graph for you.
 await visualize_graph(graph_file_path)
-Note: `GRAPHISTRY_PASSWORD` is API key.
+Note: `GRAPHISTRY_PASSWORD` is an API key.

In cognee-starter-kit/README.md around lines 17 to 19 and other specified ranges, fix markdown formatting by replacing plain code blocks with language-specific fenced code blocks (e.g., use `bash` or `python`) for better syntax highlighting. Correct language issues such as changing "add the following environment variables to `.env` file" to "add the following environment variables to the `.env` file" and clarify notes like changing "GRAPHISTRY_PASSWORD is API key" to "GRAPHISTRY_PASSWORD is an API key." Apply these formatting and language corrections consistently throughout the mentioned sections.