
Commit 80f50e6

Rename project to LocalAI (#35)
Signed-off-by: mudler <[email protected]>
1 parent 7fec26f commit 80f50e6

File tree

12 files changed: +93 −310 lines


.github/workflows/image.yml (+1 −1)

@@ -19,7 +19,7 @@ jobs:
     - name: Prepare
       id: prep
       run: |
-        DOCKER_IMAGE=quay.io/go-skynet/llama-cli
+        DOCKER_IMAGE=quay.io/go-skynet/local-ai
         VERSION=master
         SHORTREF=${GITHUB_SHA::8}
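
Since this workflow now publishes the image under the new name, existing pulls need to switch names as well. A minimal sketch, assuming the `latest` tag is published on quay.io as the README's tag listing suggests:

```bash
# pull the renamed image built by this workflow (tag assumed, not pinned by the diff)
docker pull quay.io/go-skynet/local-ai:latest
```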

.gitignore (+3 −2)

@@ -2,8 +2,9 @@
 go-llama
 go-gpt4all-j

-# llama-cli build binary
-llama-cli
+# LocalAI build binary
+LocalAI
+local-ai

 # Ignore models
 models/*.bin

.goreleaser.yaml (+1 −1)

@@ -1,5 +1,5 @@
 # Make sure to check the documentation at http://goreleaser.com
-project_name: llama-cli
+project_name: local-ai
 builds:
   - ldflags:
       - -w -s

Dockerfile (+2 −2)

@@ -8,5 +8,5 @@ ARG BUILD_TYPE=
 RUN make build${BUILD_TYPE}

 FROM debian:$DEBIAN_VERSION
-COPY --from=builder /build/llama-cli /usr/bin/llama-cli
-ENTRYPOINT [ "/usr/bin/llama-cli" ]
+COPY --from=builder /build/local-ai /usr/bin/local-ai
+ENTRYPOINT [ "/usr/bin/local-ai" ]
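
With the binary path and entrypoint renamed, a local build-and-run of this Dockerfile changes accordingly. A minimal sketch reusing the server flags shown in the README diff; the port mapping and volume layout are assumptions carried over from the old instructions:

```bash
# build the image from this Dockerfile, then run the API with the flags the README documents
docker build -t local-ai .
docker run -p 8080:8080 -v "$PWD/models:/models" -ti --rm local-ai \
  --models-path /models --context-size 700 --threads 4
```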

Earthfile (+1 −1)

@@ -2,4 +2,4 @@ VERSION 0.7

 build:
     FROM DOCKERFILE -f Dockerfile .
-    SAVE ARTIFACT /usr/bin/llama-cli AS LOCAL llama-cli
+    SAVE ARTIFACT /usr/bin/local-ai AS LOCAL local-ai
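
The Earthly target now exports the artifact as `local-ai`. Invoking it through containerized Earthly, as the removed README section did, would look roughly like this (the Earthly image and version are taken from that removed text, so treat them as assumptions going forward):

```bash
# run the +build target via containerized Earthly; the artifact lands in ./local-ai
docker run --privileged -v /var/run/docker.sock:/var/run/docker.sock --rm -t \
  -v "$(pwd)":/workspace -v earthly-tmp:/tmp/earthly:rw earthly/earthly:v0.7.2 +build
```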

Makefile (+1 −1)

@@ -1,7 +1,7 @@
 GOCMD=go
 GOTEST=$(GOCMD) test
 GOVET=$(GOCMD) vet
-BINARY_NAME=llama-cli
+BINARY_NAME=local-ai
 GOLLAMA_VERSION?=llama.cpp-5ecff35

 GREEN := $(shell tput -Txterm setaf 2)
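
Because `BINARY_NAME` changed, `make build` now emits a `local-ai` binary instead of `llama-cli`. A minimal sketch of the local flow, using the `--models-path` and `--threads` flags documented in the README's API section:

```bash
# build the server binary and start it against a local models directory
make build
./local-ai --models-path ./models --threads 4
```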

README.md (+33 −82)

@@ -1,7 +1,8 @@
-## :camel: llama-cli
+## :camel: LocalAI

+> :warning: This project has been renamed from `llama-cli` to `LocalAI` to reflect the fact that we are focusing on a fast drop-in OpenAI API rather than on the CLI interface. We think there are already many projects that can be used as a CLI interface, for instance [llama.cpp](https://github.com/ggerganov/llama.cpp) and [gpt4all](https://github.com/nomic-ai/gpt4all). If you were using `llama-cli` for CLI interactions and want to keep using it, use older versions or please open up an issue - contributions are welcome!

-llama-cli is a straightforward, drop-in replacement API compatible with OpenAI for local CPU inferencing, based on [llama.cpp](https://github.com/ggerganov/llama.cpp), [gpt4all](https://github.com/nomic-ai/gpt4all) and [ggml](https://github.com/ggerganov/ggml), including support GPT4ALL-J which is Apache 2.0 Licensed and can be used for commercial purposes.
+LocalAI is a straightforward, drop-in replacement API compatible with OpenAI for local CPU inferencing, based on [llama.cpp](https://github.com/ggerganov/llama.cpp), [gpt4all](https://github.com/nomic-ai/gpt4all) and [ggml](https://github.com/ggerganov/ggml), including support for GPT4ALL-J, which is Apache 2.0 licensed and can be used for commercial purposes.

 - OpenAI compatible API
 - Supports multiple-models
@@ -18,12 +19,15 @@ Note: You might need to convert older models to the new format, see [here](https

 ## Usage

-The easiest way to run llama-cli is by using `docker-compose`:
+> `LocalAI` comes by default as a container image. You can check out all the available images with corresponding tags [here](https://quay.io/repository/go-skynet/local-ai?tab=tags&tag=latest).
+
+The easiest way to run LocalAI is by using `docker-compose`:

 ```bash

-git clone https://github.com/go-skynet/llama-cli
-cd llama-cli
+git clone https://github.com/go-skynet/LocalAI
+
+cd LocalAI

 # copy your models to models/
 cp your-model.bin models/
@@ -45,8 +49,11 @@ curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d
 }'
 ```

-Note: The API doesn't inject a default prompt for talking to the model, while the CLI does. You have to use a prompt similar to what's described in the standford-alpaca docs: https://github.com/tatsu-lab/stanford_alpaca#data-release.
+## Prompt templates
+
+The API doesn't inject a default prompt for talking to the model. You have to use a prompt similar to what's described in the stanford-alpaca docs: https://github.com/tatsu-lab/stanford_alpaca#data-release.

+<details>
 You can use a default template for every model present in your model path, by creating a corresponding file with the `.tmpl` suffix next to your model. For instance, if the model is called `foo.bin`, you can create a sibiling file, `foo.bin.tmpl` which will be used as a default prompt, for instance this can be used with alpaca:

 ```
@@ -58,70 +65,19 @@ Below is an instruction that describes a task. Write a response that appropriate
 ### Response:
 ```

-See the [prompt-templates](https://github.com/go-skynet/llama-cli/tree/master/prompt-templates) directory in this repository for templates for most popular models.
-
-## Container images
-
-`llama-cli` comes by default as a container image. You can check out all the available images with corresponding tags [here](https://quay.io/repository/go-skynet/llama-cli?tab=tags&tag=latest)
-
-To begin, run:
-
-```
-docker run -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --model ...
-```
-
-Where `--model` is the path of the model you want to use.
-
-Note: you need to mount a volume to the docker container in order to load a model, for instance:
-
-```
-# assuming your model is in /path/to/your/models/foo.bin
-docker run -v /path/to/your/models:/models -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --model /models/foo.bin
-```
-
-You will receive a response like the following:
-
-```
-An alpaca is a member of the South American Camelid family, which includes the llama, guanaco and vicuña. It is a domesticated species that originates from the Andes mountain range in South America. Alpacas are used in the textile industry for their fleece, which is much softer than wool. Alpacas are also used for meat, milk, and fiber.
-```
+See the [prompt-templates](https://github.com/go-skynet/LocalAI/tree/master/prompt-templates) directory in this repository for templates for most popular models.

-## Basic usage
-
-To use llama-cli, specify a pre-trained GPT-based model, an input text, and an instruction for text generation. llama-cli takes the following arguments when running from the CLI:
-
-```
-llama-cli --model <model_path> --instruction <instruction> [--input <input>] [--template <template_path>] [--tokens <num_tokens>] [--threads <num_threads>] [--temperature <temperature>] [--topp <top_p>] [--topk <top_k>]
-```
-
-| Parameter | Environment Variable | Default Value | Description |
-| ------------ | -------------------- | ------------- | -------------------------------------- |
-| template | TEMPLATE | | A file containing a template for output formatting (optional). |
-| instruction | INSTRUCTION | | Input prompt text or instruction. "-" for STDIN. |
-| input | INPUT | - | Path to text or "-" for STDIN. |
-| model | MODEL | | The path to the pre-trained GPT-based model. |
-| tokens | TOKENS | 128 | The maximum number of tokens to generate. |
-| threads | THREADS | NumCPU() | The number of threads to use for text generation. |
-| temperature | TEMPERATURE | 0.95 | Sampling temperature for model output. ( values between `0.1` and `1.0` ) |
-| top_p | TOP_P | 0.85 | The cumulative probability for top-p sampling. |
-| top_k | TOP_K | 20 | The number of top-k tokens to consider for text generation. |
-| context-size | CONTEXT_SIZE | 512 | Default token context size. |
-
-Here's an example of using `llama-cli`:
-
-```
-llama-cli --model ~/ggml-alpaca-7b-q4.bin --instruction "What's an alpaca?"
-```
-
-This will generate text based on the given model and instruction.
+</details>

 ## API

-`llama-cli` also provides an API for running text generation as a service. The models once loaded the first time will be kept in memory.
+`LocalAI` provides an API for running text generation as a service that follows the OpenAI reference and can be used as a drop-in replacement. The models, once loaded the first time, will be kept in memory.

+<details>
 Example of starting the API with `docker`:

 ```bash
-docker run -p 8080:8080 -ti --rm quay.io/go-skynet/llama-cli:latest api --models-path /path/to/models --context-size 700 --threads 4
+docker run -p 8080:8080 -ti --rm quay.io/go-skynet/local-ai:latest --models-path /path/to/models --context-size 700 --threads 4
 ```

 And you'll see:
@@ -136,15 +92,15 @@ And you'll see:
 └───────────────────────────────────────────────────┘
 ```

-Note: Models have to end up with `.bin`.
+Note: Models have to end up with `.bin` so they can be listed by the `/models` endpoint.

 You can control the API server options with command line arguments:

 ```
-llama-cli api --models-path <model_path> [--address <address>] [--threads <num_threads>]
+local-ai --models-path <model_path> [--address <address>] [--threads <num_threads>]
 ```

-The API takes takes the following:
+The API takes the following parameters:

 | Parameter | Environment Variable | Default Value | Description |
 | ------------ | -------------------- | ------------- | -------------------------------------- |
@@ -155,6 +111,8 @@ The API takes takes the following:

 Once the server is running, you can start making requests to it using HTTP, using the OpenAI API.

+</details>
+
 ### Supported OpenAI API endpoints

 You can check out the [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create).
@@ -212,41 +170,34 @@ python 828bddec6162a023114ce19146cb2b82/gistfile1.txt models tokenizer.model

 ### Windows compatibility

-It should work, however you need to make sure you give enough resources to the container. See https://github.com/go-skynet/llama-cli/issues/2
+It should work, however you need to make sure you give enough resources to the container. See https://github.com/go-skynet/LocalAI/issues/2

 ### Kubernetes

-You can run the API directly in Kubernetes:
-
-```bash
-kubectl apply -f https://raw.githubusercontent.com/go-skynet/llama-cli/master/kubernetes/deployment.yaml
-```
+You can run the API in Kubernetes; see an example deployment in [kubernetes](https://github.com/go-skynet/LocalAI/tree/master/kubernetes)

 ### Build locally

 Pre-built images might fit well for most of the modern hardware, however you can and might need to build the images manually.

-In order to build the `llama-cli` container image locally you can use `docker`:
+In order to build the `LocalAI` container image locally you can use `docker`:

 ```
-# build the image as "alpaca-image"
-docker build -t llama-cli .
-docker run llama-cli --instruction "What's an alpaca?"
+# build the image
+docker build -t local-ai .
+docker run local-ai
 ```

-Or build the binary with:
+Or build the binary with `make`:

 ```
-# build the image as "alpaca-image"
-docker run --privileged -v /var/run/docker.sock:/var/run/docker.sock --rm -t -v "$(pwd)":/workspace -v earthly-tmp:/tmp/earthly:rw earthly/earthly:v0.7.2 +build
-# run the binary
-./llama-cli --instruction "What's an alpaca?"
+make build
 ```

 ## Short-term roadmap

-- [x] Mimic OpenAI API (https://github.com/go-skynet/llama-cli/issues/10)
-- Binary releases (https://github.com/go-skynet/llama-cli/issues/6)
+- [x] Mimic OpenAI API (https://github.com/go-skynet/LocalAI/issues/10)
+- Binary releases (https://github.com/go-skynet/LocalAI/issues/6)
 - Upstream our golang bindings to llama.cpp (https://github.com/ggerganov/llama.cpp/issues/351)
 - [x] Multi-model support
 - Have a webUI!
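
The hunk headers above reference the README's OpenAI-style `/v1/completions` example. For orientation, here is a minimal request sketch against a locally running instance; the JSON fields are standard OpenAI completion parameters and `your-model.bin` is just the placeholder file name the README uses, neither is fixed by this diff:

```bash
curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "your-model.bin",
        "prompt": "What is an alpaca?",
        "temperature": 0.7
      }'
```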

api/api.go (+1 −2)

@@ -5,8 +5,7 @@ import (
 	"strings"
 	"sync"

-	model "github.com/go-skynet/llama-cli/pkg/model"
-
+	model "github.com/go-skynet/LocalAI/pkg/model"
 	gptj "github.com/go-skynet/go-gpt4all-j.cpp"
 	llama "github.com/go-skynet/go-llama.cpp"
 	"github.com/gofiber/fiber/v2"

docker-compose.yaml (+2 −14)

@@ -1,18 +1,8 @@
 version: '3.6'

 services:
-
-  # chatgpt:
-  #   image: ghcr.io/mckaywrigley/chatbot-ui:main
-  #   # platform: linux/amd64
-  #   ports:
-  #     - 3000:3000
-  #   environment:
-  #     - 'OPENAI_API_KEY=sk-000000000000000'
-  #     - 'OPENAI_API_HOST=http://api:8080'
-
   api:
-    image: quay.io/go-skynet/llama-cli:latest
+    image: quay.io/go-skynet/local-ai:latest
     build:
       context: .
       dockerfile: Dockerfile
@@ -25,6 +15,4 @@ services:
       - CONTEXT_SIZE=$CONTEXT_SIZE
       - THREADS=$THREADS
     volumes:
-      - ./models:/models:cached
-    command: api
-
+      - ./models:/models:cached
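
With `command: api` dropped, the compose service relies on the image entrypoint to start the server. A minimal sketch of the docker-compose flow the updated README describes; the `docker compose up` invocation itself is assumed rather than shown in this diff:

```bash
git clone https://github.com/go-skynet/LocalAI
cd LocalAI

# copy your models to models/
cp your-model.bin models/

# build and start the API defined in docker-compose.yaml
docker compose up -d --build
```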

go.mod (+1 −1)

@@ -1,4 +1,4 @@
-module github.com/go-skynet/llama-cli
+module github.com/go-skynet/LocalAI

 go 1.19
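
Any code importing the old module path has to follow this rename, as `api/api.go` does above. A hypothetical one-shot rewrite over a Go checkout, assuming GNU sed:

```bash
# rewrite old import paths to the new module name, then tidy the module graph
grep -rl 'github.com/go-skynet/llama-cli' --include='*.go' . \
  | xargs sed -i 's|github.com/go-skynet/llama-cli|github.com/go-skynet/LocalAI|g'
go mod tidy
```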

kubernetes/deployment.yaml (+1 −3)

@@ -23,9 +23,7 @@ spec:
     spec:
       containers:
         - name: llama
-          args:
-            - api
-          image: quay.io/go-skynet/llama-cli:latest
+          image: quay.io/go-skynet/local-ai:latest
 ---
 apiVersion: v1
 kind: Service
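
The deployment no longer passes `api` as an argument because the renamed image's entrypoint starts the server directly. Applying the example manifest from the renamed repository might look like this; the raw URL is inferred from the new repository name and this file's path, not spelled out in the diff:

```bash
kubectl apply -f https://raw.githubusercontent.com/go-skynet/LocalAI/master/kubernetes/deployment.yaml
```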
