README.md: +33 -82
@@ -1,7 +1,8 @@
-## :camel:llama-cli
+## :camel:LocalAI

+> :warning: This project has been renamed from `llama-cli` to `LocalAI` to reflect the fact that we are focusing on a fast drop-in OpenAI API rather than on the CLI interface. We think that there are already many projects that can be used as a CLI interface, for instance [llama.cpp](https://github.com/ggerganov/llama.cpp) and [gpt4all](https://github.com/nomic-ai/gpt4all). If you were using `llama-cli` for CLI interactions and want to keep using it, use older versions or please open up an issue - contributions are welcome!

-llama-cli is a straightforward, drop-in replacement API compatible with OpenAI for local CPU inferencing, based on [llama.cpp](https://github.com/ggerganov/llama.cpp), [gpt4all](https://github.com/nomic-ai/gpt4all) and [ggml](https://github.com/ggerganov/ggml), including support for GPT4ALL-J, which is Apache 2.0 licensed and can be used for commercial purposes.
+LocalAI is a straightforward, drop-in replacement API compatible with OpenAI for local CPU inferencing, based on [llama.cpp](https://github.com/ggerganov/llama.cpp), [gpt4all](https://github.com/nomic-ai/gpt4all) and [ggml](https://github.com/ggerganov/ggml), including support for GPT4ALL-J, which is Apache 2.0 licensed and can be used for commercial purposes.

 - OpenAI compatible API
 - Supports multiple-models
@@ -18,12 +19,15 @@ Note: You might need to convert older models to the new format, see [here](https

 ## Usage

-The easiest way to run llama-cli is by using `docker-compose`:
+> `LocalAI` comes by default as a container image. You can check out all the available images with corresponding tags [here](https://quay.io/repository/go-skynet/local-ai?tab=tags&tag=latest).
+
+The easiest way to run LocalAI is by using `docker-compose`:
-Note: The API doesn't inject a default prompt for talking to the model, while the CLI does. You have to use a prompt similar to what's described in the stanford-alpaca docs: https://github.com/tatsu-lab/stanford_alpaca#data-release.
+## Prompt templates
+
+The API doesn't inject a default prompt for talking to the model. You have to use a prompt similar to what's described in the stanford-alpaca docs: https://github.com/tatsu-lab/stanford_alpaca#data-release.
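
Because the API passes the prompt through unchanged, the client has to wrap each instruction itself. The following is a minimal Python sketch, assuming the "no input" prompt format from the stanford_alpaca data release; the names `ALPACA_TEMPLATE` and `build_alpaca_prompt` are illustrative and not part of LocalAI.

```python
# Illustrative helper (not part of LocalAI): wraps an instruction in the
# Alpaca "no input" prompt format described in the stanford_alpaca repository.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Return the full prompt text for a single instruction."""
    # Surrounding whitespace is stripped so the section headers stay aligned.
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("What's an alpaca?"))
```

The returned string is what a client would send as the prompt text; other models may expect a different wrapper.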

+<details>
 You can use a default template for every model present in your model path by creating a corresponding file with the `.tmpl` suffix next to your model. For instance, if the model is called `foo.bin`, you can create a sibling file, `foo.bin.tmpl`, which will be used as the default prompt. For example, this can be used with alpaca:

 ```
@@ -58,70 +65,19 @@ Below is an instruction that describes a task. Write a response that appropriate
 ### Response:
 ```

-See the [prompt-templates](https://github.com/go-skynet/llama-cli/tree/master/prompt-templates) directory in this repository for templates for most popular models.
-
-## Container images
-
-`llama-cli` comes by default as a container image. You can check out all the available images with corresponding tags [here](https://quay.io/repository/go-skynet/llama-cli?tab=tags&tag=latest)
-
-To begin, run:
-
-```
-docker run -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --model ...
-```
-
-Where `--model` is the path of the model you want to use.
-
-Note: you need to mount a volume to the docker container in order to load a model, for instance:
-
-```
-# assuming your model is in /path/to/your/models/foo.bin
-docker run -v /path/to/your/models:/models -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --model /models/foo.bin
-```
-
-You will receive a response like the following:
-
-```
-An alpaca is a member of the South American Camelid family, which includes the llama, guanaco and vicuña. It is a domesticated species that originates from the Andes mountain range in South America. Alpacas are used in the textile industry for their fleece, which is much softer than wool. Alpacas are also used for meat, milk, and fiber.
-```
+See the [prompt-templates](https://github.com/go-skynet/LocalAI/tree/master/prompt-templates) directory in this repository for templates for most popular models.
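
The sibling-file convention described above can be sketched in Python. Only the `.tmpl` suffix rule comes from the README; the helper names `template_path_for` and `write_default_template` are hypothetical, and the template text is supplied by the caller (see the prompt-templates directory for real template contents).

```python
import tempfile
from pathlib import Path


def template_path_for(model_path: str) -> Path:
    """Return the sibling template path, e.g. foo.bin -> foo.bin.tmpl."""
    p = Path(model_path)
    return p.with_name(p.name + ".tmpl")


def write_default_template(model_path: str, template_text: str) -> Path:
    """Write template_text next to the model file and return the new path."""
    target = template_path_for(model_path)
    target.write_text(template_text, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demo in a throwaway directory; the placeholder text is not a real template.
    with tempfile.TemporaryDirectory() as models_dir:
        model = str(Path(models_dir) / "foo.bin")
        print(write_default_template(model, "placeholder template text").name)
```

Keeping the path rule in its own function makes it easy to check which file the server is expected to pick up for a given model.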

-## Basic usage
-
-To use llama-cli, specify a pre-trained GPT-based model, an input text, and an instruction for text generation. llama-cli takes the following arguments when running from the CLI:
-llama-cli --model ~/ggml-alpaca-7b-q4.bin --instruction "What's an alpaca?"
-```
-
-This will generate text based on the given model and instruction.
+</details>

 ## API

-`llama-cli` also provides an API for running text generation as a service. Once loaded for the first time, models are kept in memory.
+`LocalAI` provides an API for running text generation as a service that follows the OpenAI reference and can be used as a drop-in replacement. Once loaded for the first time, models are kept in memory.

+<details>
 Example of starting the API with `docker`:

 ```bash
-docker run -p 8080:8080 -ti --rm quay.io/go-skynet/llama-cli:latest api --models-path /path/to/models --context-size 700 --threads 4
 ```
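
With a server listening on port 8080 as in the `docker` example, a client request can be sketched as follows. This assumes the server exposes the OpenAI-style `/v1/completions` route at `http://localhost:8080` and that a model file named `foo.bin` sits in the models path; both are assumptions for illustration, and `build_completion_request` is a hypothetical helper. The sketch only builds the request, and the sending step is left commented out.

```python
import json
import urllib.request


def build_completion_request(base_url: str, model: str, prompt: str,
                             temperature: float = 0.7) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style completion request."""
    payload = {"model": model, "prompt": prompt, "temperature": temperature}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",  # assumed route
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_completion_request(
        "http://localhost:8080", "foo.bin", "What's an alpaca?"
    )
    print(req.get_method(), req.full_url)
    # To actually send it (requires a running server):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["choices"][0]["text"])
```

Since the API does not inject a default prompt, the `prompt` value would normally be wrapped in the model's expected format before being sent.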