
Commit 991437d

update readme with more example/details
1 parent 7df7b7f

File tree

1 file changed: +65 −2

README.md

Lines changed: 65 additions & 2 deletions
@@ -26,10 +26,73 @@ wget https://huggingface.co/sanjay920/Llama-3-8b-function-calling-alpha-v1.gguf/

 4. start openai compatible server:
 ```
-./llama-server -ngl 35 -m Llama-3-8b-function-calling-alpha-v1.gguf --port 1234 --host 0.0.0.0 -c 16000 --chat-template llama3
+./llama-server -ngl 37 -m Llama-3-8b-function-calling-alpha-v1.gguf --port 1234 --host 0.0.0.0 -c 8000 --chat-template llama3
 ```

-5. That's it! MAKE SURE you turn `stream` OFF when making api calls to the server, as the streaming feature is not supported yet. And we will support streaming too soon.
+5. Test to make sure the server is available:
+```bash
+curl localhost:1234/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer tokenabc-123" \
+  -d '{
+    "model": "rubra-model",
+    "messages": [
+      {
+        "role": "system",
+        "content": "You are a helpful assistant."
+      },
+      {
+        "role": "user",
+        "content": "hello"
+      }
+    ]
+  }'
+```
+
+6. Try a Python function calling example:
+```python
+from openai import OpenAI
+client = OpenAI(api_key="123", base_url="http://localhost:1234/v1/")
+
+tools = [
+    {
+        "type": "function",
+        "function": {
+            "name": "get_current_weather",
+            "description": "Get the current weather in a given location",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "location": {
+                        "type": "string",
+                        "description": "The city and state, e.g. San Francisco, CA",
+                    },
+                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
+                },
+                "required": ["location"],
+            },
+        }
+    }
+]
+messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]
+completion = client.chat.completions.create(
+    model="rubra-model",
+    messages=messages,
+    tools=tools,
+    tool_choice="auto"
+)
+
+print(completion)
+```
+
+The output should look like this:
+```
+ChatCompletion(id='chatcmpl-EmHd8kai4DVwBUOyim054GmfcyUbjiLf', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='e885974b', function=Function(arguments='{"location":"Boston"}', name='get_current_weather'), type='function')]))], created=1719528056, model='rubra-model', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=29, prompt_tokens=241, total_tokens=270))
+```
+
+That's it! MAKE SURE you turn `stream` off when making API calls to the server: streaming is not supported yet, but support is coming soon.
+
+For more function calling examples, check out the `test_llamacpp.ipynb` notebook.

 ### Recent API changes
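The `tool_calls` field in the step 6 output is meant to be acted on by the caller. The sketch below shows one way to complete that round trip against the same local server, assuming a hypothetical `get_current_weather` stub that is not part of this commit: execute the requested call, append the result as a `tool` message, and ask the model for a final answer.

```python
import json
from openai import OpenAI

client = OpenAI(api_key="123", base_url="http://localhost:1234/v1/")

# Same tool schema as step 6, trimmed to the required field.
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

def get_current_weather(location, unit="fahrenheit"):
    # Hypothetical stub standing in for a real weather lookup.
    return json.dumps({"location": location, "temperature": "72", "unit": unit})

messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]
first = client.chat.completions.create(model="rubra-model", messages=messages, tools=tools)

# Run every tool call the model requested and report each result back.
assistant_msg = first.choices[0].message
messages.append(assistant_msg)
for call in assistant_msg.tool_calls or []:
    args = json.loads(call.function.arguments)
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": get_current_weather(**args),
    })

# A second, non-streaming request turns the tool result into prose.
final = client.chat.completions.create(model="rubra-model", messages=messages, tools=tools)
print(final.choices[0].message.content)
```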

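To make the `stream` requirement from the note in the diff concrete, here is a minimal sketch that keeps streaming explicitly off. `stream=False` is already the openai-python default, so this only spells the constraint out in code:

```python
from openai import OpenAI

client = OpenAI(api_key="123", base_url="http://localhost:1234/v1/")

# The server does not support streaming yet, so keep stream=False
# (the client default) instead of iterating over response chunks.
completion = client.chat.completions.create(
    model="rubra-model",
    messages=[{"role": "user", "content": "hello"}],
    stream=False,
)
print(completion.choices[0].message.content)
```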