Commit 4a8a85a

Update documentation
1 parent 6bf071e commit 4a8a85a

1 file changed, +10 -8 lines changed

docs/user_guide/model_management.md

@@ -212,9 +212,8 @@ repository, copy in the new shared libraries, and then reload the
 model.

 * If only the model instance configuration on the 'config.pbtxt' is modified
-(i.e. increasing/decreasing the instance count) for non-sequence models,
-then Triton will update the model rather then reloading it, when either a load
-request is received under
+(i.e. increasing/decreasing the instance count), then Triton will update the
+model rather then reloading it, when either a load request is received under
 [Model Control Mode EXPLICIT](#model-control-mode-explicit) or change to the
 'config.pbtxt' is detected under
 [Model Control Mode POLL](#model-control-mode-poll).
@@ -225,11 +224,14 @@ request is received under
 configuration, so its presence in the model directory may be detected as a new file
 and cause the model to fully reload when only an update is expected.

-* If a sequence model is updated with in-flight sequence(s), Triton does not
-guarentee any remaining request(s) from the in-flight sequence(s) will be routed
-to the same model instance for processing. It is currently the responsibility of
-the user to ensure any in-flight sequence(s) is complete before updating a
-sequence model.
+* If a sequence model is *reloaded* with in-flight sequence(s) (i.e. changes to
+the model file), Triton does not guarentee any remaining request(s) from the
+in-flight sequence(s) will be routed to the same model instance for processing.
+It is currently the responsibility of the user to ensure any in-flight
+sequence(s) are completed before reloading a sequence model.
+* If a sequence model is *updated* (i.e. increasing/decreasing the instance
+count), Triton will wait until the in-flight sequence is completed (or
+timed-out) before the instance behind the sequence is removed.

 ## Concurrently Loading Models

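The update-versus-reload rule described in this change can be illustrated with a small sketch. This is a simplified model for readers, not Triton's implementation: the helper `classify_config_change` and the dict representation of 'config.pbtxt' are assumptions made for the example, and the sketch ignores other triggers mentioned in the documentation (such as new files detected in the model directory).

```python
from typing import Any, Dict


def classify_config_change(old: Dict[str, Any], new: Dict[str, Any]) -> str:
    """Return 'none', 'update', or 'reload' for a change to a model config.

    Hypothetical helper: configs are plain dicts here, and the rule is a
    simplified reading of the documentation, not Triton's actual logic.
    """
    if old == new:
        return "none"

    def without_instances(cfg: Dict[str, Any]) -> Dict[str, Any]:
        # Drop the instance configuration so everything else can be compared.
        return {k: v for k, v in cfg.items() if k != "instance_group"}

    if without_instances(old) == without_instances(new):
        # Only the instance configuration (e.g. the count) changed.
        return "update"
    return "reload"


# Illustrative configs; the model name and values are made up.
old_cfg = {
    "name": "my_model",
    "max_batch_size": 8,
    "instance_group": [{"count": 1, "kind": "KIND_GPU"}],
}
more_instances = {**old_cfg, "instance_group": [{"count": 2, "kind": "KIND_GPU"}]}
bigger_batch = {**old_cfg, "max_batch_size": 16}

print(classify_config_change(old_cfg, more_instances))  # update
print(classify_config_change(old_cfg, bigger_batch))    # reload
```

Per the revised text, the "update" path now applies to sequence models as well, with instance removal deferred until any in-flight sequence completes or times out; the "reload" path still leaves it to the user to drain in-flight sequences first.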