[Doc] Document new fields in databricks_model_serving resource (#4615)
## Changes
This documents the new `budget_policy_id` argument, the fallback
configuration in the AI Gateway block, and the custom provider
configuration.
## Tests
- [x] relevant change in `docs/` folder
docs/resources/model_serving.md (+12)
@@ -55,6 +55,7 @@ The following arguments are supported:
 * `rate_limits` - A list of rate limit blocks to be applied to the serving endpoint. *Note: only external and foundation model endpoints are supported as of now.*
 * `ai_gateway` - (Optional) A block with AI Gateway configuration for the serving endpoint. *Note: only external model endpoints are supported as of now.*
 * `route_optimized` - (Optional) A boolean enabling route optimization for the endpoint. *Note: only available for custom models.*
+* `budget_policy_id` - (Optional) The Budget Policy ID set for this serving endpoint.
 
 ### served_entities Configuration Block
 
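Reviewer note on the hunk above: a minimal sketch of how the newly documented `budget_policy_id` argument might be set, assuming it sits at the top level of the resource alongside `rate_limits` and `ai_gateway`, as the list placement suggests. The endpoint name, entity name and version, and the policy ID value are hypothetical placeholders, not values taken from this change.

```hcl
# Sketch only: all names and the policy ID below are placeholders.
resource "databricks_model_serving" "this" {
  name = "example-endpoint"

  # Top-level argument documented in this change (value format assumed).
  budget_policy_id = "00000000-0000-0000-0000-000000000000"

  config {
    served_entities {
      name                  = "example-model"
      entity_name           = "example-model"
      entity_version        = "1"
      workload_size         = "Small"
      scale_to_zero_enabled = true
    }
  }
}
```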
@@ -81,6 +82,15 @@ The following arguments are supported:
 * `cohere_config` - Cohere Config
   * `cohere_api_key` - The Databricks secret key reference for a Cohere API key.
   * `cohere_api_key_plaintext` - The Cohere API key provided as a plaintext string.
+* `custom_provider_config` - Custom Provider Config. Only required if the provider is 'custom'.
+  * `custom_provider_url` - (Required) URL of the custom provider API.
+  * `api_key_auth` - (Optional) API key authentication for the custom provider API. Conflicts with `bearer_token_auth`.
+    * `key` - (Required) The name of the API key parameter used for authentication.
+    * `value` - (Optional) The Databricks secret key reference for an API key.
+    * `value_plaintext` - (Optional) The API key provided as a plaintext string.
+  * `bearer_token_auth` - (Optional) Bearer token authentication for the custom provider API. Conflicts with `api_key_auth`.
+    * `token` - (Optional) The Databricks secret key reference for a token.
+    * `token_plaintext` - (Optional) The token provided as a plaintext string.
 * `databricks_model_serving_config` - Databricks Model Serving Config
   * `databricks_api_token` - The Databricks secret key reference for a Databricks API token that corresponds to a user or service principal with Can Query access to the model serving endpoint pointed to by this external model.
   * `databricks_api_token_plaintext` - The Databricks API token that corresponds to a user or service principal with Can Query access to the model serving endpoint pointed to by this external model, provided as a plaintext string.
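Reviewer note on the hunk above: a hedged sketch of an external model that uses the newly documented `custom_provider_config` block with bearer token authentication. It assumes the block nests inside `external_model` next to the other provider configs, and that secret references use the `{{secrets/<scope>/<key>}}` form. The URL, model name, task, secret scope, and key names are hypothetical.

```hcl
# Sketch only: URL, names, and secret references are placeholders.
resource "databricks_model_serving" "custom_llm" {
  name = "custom-provider-endpoint"

  config {
    served_entities {
      name = "custom-llm"

      external_model {
        name     = "example-model"
        provider = "custom" # custom_provider_config is only required for this provider
        task     = "llm/v1/chat"

        custom_provider_config {
          custom_provider_url = "https://llm.example.com/v1/chat/completions"

          # Conflicts with api_key_auth, so only one of the two is set.
          bearer_token_auth {
            token = "{{secrets/example_scope/example_token}}"
          }
        }
      }
    }
  }
}
```

To authenticate with an API key instead, the `bearer_token_auth` block would be swapped for an `api_key_auth` block that sets `key` (the parameter name) and either `value` or `value_plaintext`.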
@@ -154,6 +164,8 @@ The following arguments are supported:
 
 ### ai_gateway Configuration Block
 
+* `fallback_config` - (Optional) Block with configuration for traffic fallback, which automatically falls back to other served entities if a request to a served entity fails with certain error codes, to increase availability.
+  * `enabled` - Whether to enable traffic fallback. When a served entity in the serving endpoint returns specific error codes (e.g. 500), the request is automatically retried round-robin against the other served entities in the same endpoint, following the order of the served entity list, until a successful response is returned.
 * `guardrails` - (Optional) Block with configuration for AI Guardrails to prevent unwanted data and unsafe data in requests and responses. Consists of the following attributes:
   * `input` - A block with configuration for input guardrail filters:
     * `invalid_keywords` - List of invalid keywords. AI guardrail uses keyword or string matching to decide if the keyword exists in the request or response content.
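Reviewer note on the hunk above: a minimal sketch of enabling the newly documented `fallback_config`, assuming an endpoint with two external model served entities so that a failed request has somewhere to fall back to. Model names, the secret scope, and key names are hypothetical, and traffic routing is omitted here; a real endpoint may also need a `traffic_config` block.

```hcl
# Sketch only: model names and secret references are placeholders.
resource "databricks_model_serving" "with_fallback" {
  name = "fallback-endpoint"

  config {
    # Listed first, so it is tried first under the documented ordering.
    served_entities {
      name = "primary"
      external_model {
        name     = "example-openai-model"
        provider = "openai"
        task     = "llm/v1/chat"
        openai_config {
          openai_api_key = "{{secrets/example_scope/openai_api_key}}"
        }
      }
    }

    # Fallback target if the primary returns one of the qualifying error codes.
    served_entities {
      name = "secondary"
      external_model {
        name     = "example-cohere-model"
        provider = "cohere"
        task     = "llm/v1/chat"
        cohere_config {
          cohere_api_key = "{{secrets/example_scope/cohere_api_key}}"
        }
      }
    }
  }

  ai_gateway {
    fallback_config {
      enabled = true
    }
  }
}
```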