docs/user_guide/request_cancellation.md (+4 -4)
@@ -28,7 +28,7 @@
 
 # Request Cancellation
 
-Starting from 23.10, Triton supports handling request cancellation received
+Starting from r23.10, Triton supports handling request cancellation received
 from the gRPC client or a C API user. Long running inference requests such
 as for auto generative large language models may run for an indeterminate
 amount of time or indeterminate number of steps. Additionally clients may
@@ -39,7 +39,7 @@ resources.
 
 ## Issuing Request Cancellation
 
-### Triton C API
+### In-Process C API
 
 [In-Process Triton Server C API](../customization_guide/inference_protocols.md#in-process-triton-server-api) has been enhanced with `TRITONSERVER_InferenceRequestCancel`
 and `TRITONSERVER_InferenceRequestIsCancelled` to issue cancellation and query
@@ -77,9 +77,9 @@ detection and handling within Triton core is work in progress.
 
 ## Handling in Backend
 
-Upon receiving request cancellation, triton does its best to terminate request
+Upon receiving request cancellation, Triton does its best to terminate request
 at various points. However, once a request has been given to the backend
-for execution, it is upto the individual backends to detect and handle
+for execution, it is up to the individual backends to detect and handle
 request termination.
 Currently, the following backends support early termination: