
Commit f993da5

Creating contribution docs
1 parent 3a8b9a5 commit f993da5

6 files changed

+221
-32
lines changed

CODE_OF_CONDUCT.md

+70
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,70 @@
1+
# Contributor Covenant Code of Conduct
2+
3+
## Our Pledge
4+
5+
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
6+
7+
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
8+
9+
## Our Standards
10+
11+
Examples of behavior that contributes to a positive environment for our community include:
12+
13+
- Demonstrating empathy and kindness toward other people
14+
- Being respectful of differing opinions, viewpoints, and experiences
15+
- Giving and gracefully accepting constructive feedback
16+
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
17+
- Focusing on what is best not just for us as individuals, but for the overall community
18+
19+
Examples of unacceptable behavior include:
20+
21+
- The use of sexualized language or imagery, and sexual attention or advances of any kind
22+
- Trolling, insulting or derogatory comments, and personal or political attacks
23+
- Public or private harassment
24+
- Publishing others’ private information, such as a physical or email address, without their explicit permission
25+
- Other conduct which could reasonably be considered inappropriate in a professional setting
26+
27+
## Enforcement Responsibilities
28+
29+
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
30+
31+
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
32+
33+
## Scope
34+
35+
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official email address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
36+
37+
## Enforcement
38+
39+
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement (@deven96, @davidonuh, @haksoat). All complaints will be reviewed and investigated promptly and fairly.
40+
41+
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
42+
43+
## Enforcement Guidelines
44+
45+
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
46+
47+
### 1. Correction
48+
**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
49+
**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
50+
51+
### 2. Warning
52+
**Community Impact**: A violation through a single incident or series of actions.
53+
**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction, will be allowed for a specified period. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
54+
55+
### 3. Temporary Ban
56+
**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.
57+
**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period. During this time, a resolution to address concerns must be reached with community leaders.
58+
59+
### 4. Permanent Ban
60+
**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
61+
**Consequence**: A permanent ban from any sort of interaction or public communication within the community.
62+
63+
## Attribution
64+
65+
This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1, available at [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html](https://www.contributor-covenant.org/version/2/1/code_of_conduct.html).
66+
67+
Community Impact Guidelines were inspired by [Mozilla’s code of conduct enforcement ladder](https://github.com/mozilla/diversity).
68+
69+
For answers to common questions about this code of conduct, see the FAQ at [https://www.contributor-covenant.org/faq](https://www.contributor-covenant.org/faq). Translations are available at [https://www.contributor-covenant.org/translations](https://www.contributor-covenant.org/translations).
70+

CONTRIBUTING.md

+98
@@ -0,0 +1,98 @@
1+
# Contributing to Ahnlich
2+
3+
Thank you for your interest in contributing to **Ahnlich**! We welcome contributions of all kinds, including bug fixes, feature enhancements, documentation updates, and examples. Follow the steps below to get started.
4+
5+
---
6+
7+
## Table of Contents
8+
9+
1. [Code of Conduct](#code-of-conduct)
10+
2. [How to Contribute](#how-to-contribute)
11+
3. [Setting Up the Project](#setting-up-the-project)
12+
4. [Submitting Changes](#submitting-changes)
13+
5. [Reporting Issues](#reporting-issues)
14+
6. [Pull Request Guidelines](#pull-request-guidelines)
15+
16+
---
17+
18+
## Code of Conduct
19+
20+
By participating in this project, you agree to abide by our [Code of Conduct](CODE_OF_CONDUCT.md). Please treat everyone with respect and professionalism.
21+
22+
---
23+
24+
## How to Contribute
25+
26+
You can contribute in the following ways:
27+
- Reporting bugs or suggesting features via the [Issues](https://github.com/deven96/ahnlich/issues) tab.
28+
- Improving documentation, including adding or updating examples.
29+
- Fixing bugs or implementing new features through pull requests.
30+
31+
---
32+
33+
## Setting Up the Project
34+
35+
Follow these steps to set up the project locally:
36+
37+
1. **Fork the Repository**:
38+
Click the "Fork" button on the GitHub repository to create a copy under your GitHub account.
39+
40+
2. **Clone the Forked Repository**:
41+
42+
```bash
43+
# Replace YOUR_USERNAME with the GitHub account that owns your fork
git clone https://github.com/YOUR_USERNAME/ahnlich.git
44+
cd ahnlich
45+
```
46+
3. **Install Rust**
47+
Ensure you have Rust installed. If not, install it with [rustup](https://rustup.rs).
48+
49+
4. **Build the project**
50+
```bash
51+
cargo build
52+
```
53+
5. **Run tests**
54+
```bash
55+
make test
56+
```
57+
6. **Client Libraries**
58+
Client libraries are currently generated through a rough process that still needs improvement.
59+
See the [client library generation guide](docs/libgen.md).
60+
61+
## Submitting Changes
62+
63+
1. **Create a Branch**
64+
Use descriptive names for your branches:
65+
```bash
66+
git checkout -b feature/improve-docs
67+
```
68+
69+
2. **Make changes**
70+
Make your changes and then commit them with clear and descriptive messages.
71+
```bash
72+
git add .
73+
git commit -m "Improve documentation for image-search example"
74+
```
75+
3. **Push to your fork**
76+
```bash
77+
git push origin feature/improve-docs
78+
```
79+
4. **Submit a pull request**
80+
Go to the main repository on GitHub, navigate to the "Pull Requests" section, and click "New Pull Request".
81+
82+
## Reporting Issues
83+
84+
If you encounter a bug or have a feature request, please create an issue:
85+
86+
1. Search for existing issues to avoid duplicates.
87+
2. If no existing issue matches, create a new [issue](https://github.com/deven96/ahnlich/issues/new) and include:
88+
* A descriptive title.
89+
* Steps to reproduce the bug or details about the feature request.
90+
* Logs, screenshots, or any other supporting information.
91+
92+
## Pull Request Guidelines
93+
94+
* Ensure your code passes all tests (`cargo test`).
95+
* Format your code with `cargo fmt` and check for common issues with `cargo clippy`.
96+
* Write clear commit messages.
97+
* Reference any related issue in the pull request description (e.g., "Fixes #42").
98+
* Include tests for new features or bug fixes, if applicable.

README.md

+4-24
@@ -3,6 +3,8 @@
33

44
[![All Test](https://github.com/deven96/ahnlich/actions/workflows/test.yml/badge.svg)](https://github.com/deven96/ahnlich/actions/workflows/test.yml)
55

6+
⚠️ **Note:** Ahnlich is not production-ready yet; it is still in **testing** and may introduce breaking changes.
7+
68
ähnlich means similar in German. It comprises multiple tools for usage and development, such as:
79

810
- [`ahnlich-db`](ahnlich/db): In-memory vector key-value store for storing embeddings/vectors with corresponding metadata (key-value maps). It's a powerful system that enables AI/ML engineers to store and search similar vectors using linear (cosine, euclidean) or non-linear (kdtree) similarity algorithms. It also supports searching within metadata values, so entries can be filtered by their metadata. A simple example can look like
@@ -56,31 +58,9 @@ The DB can be used without the AI proxy for more fine grained control of the gen
5658

5759
2. The CLI comes packaged into the docker images.
5860

61+
### Contributing
5962

60-
61-
## Development
62-
63-
### Using Spec documents to interact with Ahnlich DB
64-
65-
To generate the spec documents, run
66-
```bash
67-
cargo run --bin typegen generate
68-
```
69-
It is worth noting that any changes to the types crate, requires you to run the above command. This helps keep our spec document and types crate in sync.
70-
71-
To Convert spec documents to a programming language, run:
72-
73-
```bash
74-
cargo run --bin typegen create-client <Programming Language>
75-
```
76-
Available languages are:
77-
- python
78-
- golang
79-
- typescript.
80-
81-
In order to communicate effectively with the ahnlich db, you would have to extend the bincode serialization protocol automatically provided by `serde_generate`.
82-
Your message(in bytes) should be serialized and deserialized in the following format => `AHNLICH_HEADERS` + `VERSION` + `QUERY/SERVER_RESPONSE`. Bytes are `Little Endian`.
83-
63+
See the [contribution guide](CONTRIBUTING.md).
8464

8565
### How Client Releases Work
8666

docs/libgen.md

+24
@@ -0,0 +1,24 @@
1+
# LibGen
2+
3+
## Using Spec documents to interact with Ahnlich DB
4+
5+
To generate the spec documents, run:
6+
```bash
7+
cd ahnlich
8+
cargo run --bin typegen generate
9+
```
10+
Note that any change to the types crate requires you to rerun the above command. This keeps the spec documents and the types crate in sync.
11+
12+
To convert the spec documents into a client for a programming language, run:
13+
14+
```bash
15+
cargo run --bin typegen create-client <Programming Language>
16+
```
17+
Available languages are:
18+
- python
19+
- golang
20+
- typescript
21+
22+
To communicate with the Ahnlich DB, you have to extend the bincode serialization protocol automatically provided by `serde_generate`.
23+
Your message (in bytes) should be serialized and deserialized in the following format: `AHNLICH_HEADERS` + `VERSION` + `QUERY/SERVER_RESPONSE`. Byte order is little-endian.
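
As a rough illustration of the framing described above, the following Python sketch concatenates a header, a version, and an already-serialized payload in little-endian byte order. The header bytes, the three-field version layout, and the helper names are assumptions made for this example and are not taken from the Ahnlich source; the payload stands in for a bincode-serialized query or server response.

```python
from __future__ import annotations

import struct

# Hypothetical placeholder values: the real header bytes and version layout
# are defined by the Ahnlich types crate, not by this sketch.
AHNLICH_HEADERS = b"AHNLICH;"
VERSION = (0, 1, 0)  # assumed (major, minor, patch)

# "<" selects little-endian with no padding; B is one unsigned byte,
# H is an unsigned 16-bit integer.
VERSION_FORMAT = "<BHH"


def frame_message(payload: bytes) -> bytes:
    """Build AHNLICH_HEADERS + VERSION + payload."""
    version_bytes = struct.pack(VERSION_FORMAT, *VERSION)
    return AHNLICH_HEADERS + version_bytes + payload


def unframe_message(message: bytes) -> tuple[tuple[int, ...], bytes]:
    """Split a framed message back into (version, payload)."""
    header_len = len(AHNLICH_HEADERS)
    if message[:header_len] != AHNLICH_HEADERS:
        raise ValueError("missing AHNLICH header")
    version_len = struct.calcsize(VERSION_FORMAT)
    version_end = header_len + version_len
    version = struct.unpack(VERSION_FORMAT, message[header_len:version_end])
    return version, message[version_end:]
```

A generated client would pass the bincode-encoded query as `payload` and apply the reverse split to whatever the server sends back; the exact field sizes should be checked against the generated spec documents.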
24+

examples/python/book-search/README.md

+13-5
@@ -1,19 +1,27 @@
1-
## Book search example for Python SDK
1+
## Book search
22

33
An example of how to use the Python SDK that shows the process of splitting and
44
inserting an EPUB ebook into the DB and querying it via a search phrase, either directly or contextually
55

6+
### What This Example Does
7+
68
To install dependencies (ensure you have poetry installed)
79
```poetry install```
810

911
To insert run
1012
```poetry run insertbook```
1113

12-
![insertion gif](insertbook.gif)
14+
- The book _(Animal Farm by George Orwell)_ is processed and indexed
15+
* The `epub` file is split into paragraphs and lightly cleaned
16+
* Embeddings are generated by `ahnlich-ai` using the `BGEBaseEnV15` model
17+
* Generated embeddings are stored in the `ahnlich-db` vector datastore
18+
* ![insertion gif](insertbook.gif)
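
The splitting and cleaning step above can be pictured with a minimal Python sketch. It is not the example's actual code: the function name, the blank-line splitting rule, and the `min_chars` threshold are all assumptions chosen for illustration, and the input is assumed to be plain text already extracted from the `epub` file.

```python
from __future__ import annotations

import re


def split_into_paragraphs(text: str, min_chars: int = 20) -> list[str]:
    """Split plain text into cleaned paragraphs (hypothetical helper)."""
    paragraphs = []
    # Treat one or more blank lines as a paragraph boundary.
    for chunk in re.split(r"\n\s*\n", text):
        # Collapse runs of whitespace, including line wraps, into single spaces.
        cleaned = re.sub(r"\s+", " ", chunk).strip()
        # Skip fragments too short to be useful, such as chapter headings.
        if len(cleaned) >= min_chars:
            paragraphs.append(cleaned)
    return paragraphs
```

Each returned paragraph would then be sent to `ahnlich-ai` for embedding.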
1319

1420
To search run
1521
```poetry run searchbook```
1622

17-
![insertion gif](searchphrase.gif)
18-
19-
Note that the epub file being split _(Animal Farm by George Orwell)_ is available locally in the example file and the example can be editted to customize processes and play around with input and output.
23+
- Query text is provided
24+
* `BGEBaseEnV15` is used to generate embeddings for the query text
25+
* The query embedding is compared against every embedding in the vector datastore using `CosineAlgorithm`
26+
* The 5 closest embeddings are identified and the corresponding paragraphs are printed out in order of similarity
27+
* ![search gif](searchphrase.gif)
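
The search step above amounts to a brute-force cosine-similarity ranking. The sketch below shows that idea in plain Python under simplifying assumptions: `store` is a hypothetical in-memory list of `(embedding, paragraph)` pairs standing in for the vector datastore, and the function names are made up for this example rather than taken from the SDK.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def closest_paragraphs(
    query: list[float],
    store: list[tuple[list[float], str]],
    k: int = 5,
) -> list[tuple[float, str]]:
    """Return the k (similarity, paragraph) pairs most similar to the query."""
    scored = [(cosine_similarity(query, emb), text) for emb, text in store]
    # Highest similarity first, matching the printed order in the example.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In the real example the comparison happens inside `ahnlich-db`; this sketch only shows what "closest by cosine similarity" means.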

examples/rust/image-search/README.md

+12-3
@@ -1,18 +1,27 @@
1-
## Image search example for Rust SDK
1+
## Image search
22

33
An example of how to use the Rust SDK that shows the process of indexing a couple of images
44
into the DB and querying them via text
55

6+
### What This Example Does
7+
68
To install dependencies (ensure you have cargo installed)
79
```cargo build```
810

911
Place the images into the images folder and run
1012
```cargo run index```
1113

12-
![insertion gif](index-image.gif)
14+
- Each image within the `images` folder is indexed
15+
* `ClipVitB32Image`, one of the models supported by `ahnlich-ai`, is used to generate embeddings for the images
16+
* Embeddings are then stored in the `ahnlich-db` vector datastore
17+
* ![insertion gif](index-image.gif)
1318

1419
To search run
1520
```cargo run query```
1621

17-
![insertion gif](query-image.gif)
22+
- Query text is provided
23+
* `ClipVitB32Text` is used to generate embeddings for the query text
24+
* The query embedding is compared against every embedding in the vector datastore using `CosineAlgorithm`
25+
* The closest embedding is identified and the corresponding image is rendered to the screen
26+
* ![query gif](query-image.gif)
1827
