examples/offline_inference/openai/openai_batch.md (+32 −1)
@@ -13,7 +13,7 @@ The OpenAI batch file format consists of a series of json objects on new lines.
 Each line represents a separate request. See the [OpenAI package reference](https://platform.openai.com/docs/api-reference/batch/requestInput) for more details.
 
 ```{note}
-We currently only support `/v1/chat/completions` and `/v1/embeddings` endpoints (completions coming soon).
+We currently support `/v1/chat/completions`, `/v1/embeddings`, and `/v1/score` endpoints (completions coming soon).
+Add score requests to your batch file. The following is an example:
+
+```
+{"custom_id": "request-1", "method": "POST", "url": "/v1/score", "body": {"model": "BAAI/bge-reranker-v2-m3", "text_1": "What is the capital of France?", "text_2": ["The capital of Brazil is Brasilia.", "The capital of France is Paris."]}}
+{"custom_id": "request-2", "method": "POST", "url": "/v1/score", "body": {"model": "BAAI/bge-reranker-v2-m3", "text_1": "What is the capital of France?", "text_2": ["The capital of Brazil is Brasilia.", "The capital of France is Paris."]}}
+```
+
+You can mix chat completion, embedding, and score requests in the batch file, as long as the model you are using supports them all (note that all requests must use the same model).
+
+### Step 2: Run the batch
+
+You can run the batch using the same command as in earlier examples.
+
+### Step 3: Check your results
+
+You can check your results by running `cat results.jsonl`.
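The batch file shown above can also be built programmatically rather than hand-written. A minimal sketch using only the standard library; `make_score_request` is a hypothetical helper (not part of vLLM), and the request shape simply mirrors the JSONL examples in this diff:

```python
import json


def make_score_request(custom_id, model, text_1, text_2):
    """Hypothetical helper: build one /v1/score batch request object."""
    return {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/score",
        "body": {"model": model, "text_1": text_1, "text_2": text_2},
    }


requests = [
    make_score_request(
        "request-1",
        "BAAI/bge-reranker-v2-m3",
        "What is the capital of France?",
        ["The capital of Brazil is Brasilia.",
         "The capital of France is Paris."],
    ),
    make_score_request(
        "request-2",
        "BAAI/bge-reranker-v2-m3",
        "What is the capital of France?",
        ["The capital of Brazil is Brasilia.",
         "The capital of France is Paris."],
    ),
]

# One JSON object per line, exactly like the batch input file shown above.
batch_jsonl = "\n".join(json.dumps(r) for r in requests)
print(len(batch_jsonl.splitlines()))  # → 2
```

Writing `batch_jsonl` to a file gives an input suitable for `vllm.entrypoints.openai.run_batch` with the `-i`, `-o`, and `--model` flags used elsewhere in this diff.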
+INPUT_SCORE_BATCH = """{"custom_id": "request-1", "method": "POST", "url": "/v1/score", "body": {"model": "BAAI/bge-reranker-v2-m3", "text_1": "What is the capital of France?", "text_2": ["The capital of Brazil is Brasilia.", "The capital of France is Paris."]}}
+{"custom_id": "request-2", "method": "POST", "url": "/v1/score", "body": {"model": "BAAI/bge-reranker-v2-m3", "text_1": "What is the capital of France?", "text_2": ["The capital of Brazil is Brasilia.", "The capital of France is Paris."]}}"""
+
 
 def test_empty_file():
     with tempfile.NamedTemporaryFile(
@@ -102,3 +106,36 @@ def test_embeddings():
             # Ensure that the output format conforms to the openai api.
             # Validation should throw if the schema is wrong.
             BatchRequestOutput.model_validate_json(line)
+
+
+def test_score():
+    with tempfile.NamedTemporaryFile(
+            "w") as input_file, tempfile.NamedTemporaryFile(
+                "r") as output_file:
+        input_file.write(INPUT_SCORE_BATCH)
+        input_file.flush()
+        proc = subprocess.Popen([
+            sys.executable,
+            "-m",
+            "vllm.entrypoints.openai.run_batch",
+            "-i",
+            input_file.name,
+            "-o",
+            output_file.name,
+            "--model",
+            "BAAI/bge-reranker-v2-m3",
+        ], )
+        proc.communicate()
+        proc.wait()
+        assert proc.returncode == 0, f"{proc=}"
+
+        contents = output_file.read()
+        for line in contents.strip().split("\n"):
+            # Ensure that the output format conforms to the openai api.
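For readers without the vLLM test harness, the result-checking step can be sketched with plain `json`. The sample line below is fabricated for illustration, and its field names (`custom_id`, `response`, `error`) are an assumption based on the OpenAI batch output format that `BatchRequestOutput` models:

```python
import json

# Fabricated sample of one results.jsonl line (shape assumed from the
# OpenAI batch output format; real ids and response bodies will differ).
sample_line = json.dumps({
    "id": "vllm-batch-0",
    "custom_id": "request-1",
    "response": {"status_code": 200, "body": {"object": "list", "data": []}},
    "error": None,
})

record = json.loads(sample_line)
# Each result echoes its request's custom_id and carries exactly one of
# a response or an error.
assert record["custom_id"].startswith("request-")
assert (record["response"] is None) != (record["error"] is None)
print(record["response"]["status_code"])  # → 200
```

Matching results back to requests by `custom_id` is what makes order-independent batch processing safe.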