@@ -212,9 +212,8 @@ repository, copy in the new shared libraries, and then reload the
model.
* If only the model instance configuration on the 'config.pbtxt' is modified
- (i.e. increasing/decreasing the instance count) for non-sequence models,
- then Triton will update the model rather then reloading it, when either a load
- request is received under
+ (i.e. increasing/decreasing the instance count), then Triton will update the
+ model rather then reloading it, when either a load request is received under
[Model Control Mode EXPLICIT](#model-control-mode-explicit) or change to the
'config.pbtxt' is detected under
[Model Control Mode POLL](#model-control-mode-poll).
@@ -225,11 +224,14 @@ request is received under
configuration, so its presence in the model directory may be detected as a new file
and cause the model to fully reload when only an update is expected.
- * If a sequence model is updated with in-flight sequence(s), Triton does not
- guarentee any remaining request(s) from the in-flight sequence(s) will be routed
- to the same model instance for processing. It is currently the responsibility of
- the user to ensure any in-flight sequence(s) is complete before updating a
- sequence model.
+ * If a sequence model is *reloaded* with in-flight sequence(s) (i.e. changes to
+ the model file), Triton does not guarentee any remaining request(s) from the
+ in-flight sequence(s) will be routed to the same model instance for processing.
+ It is currently the responsibility of the user to ensure any in-flight
+ sequence(s) are completed before reloading a sequence model.
+ * If a sequence model is *updated* (i.e. increasing/decreasing the instance
+ count), Triton will wait until the in-flight sequence is completed (or
+ timed-out) before the instance behind the sequence is removed.
## Concurrently Loading Models