---
title: Deployment options for Azure AI Foundry Models
titleSuffix: Azure AI Foundry
description: Learn about deployment options for Azure AI Foundry Models.
manager: scottpolly
ms.service: azure-ai-foundry
ms.topic: concept-article
ms.date: 06/30/2025
ms.reviewer: fasantia
ms.author: mopeakande
author: msakande
---

# Deployment overview for Azure AI Foundry Models

The model catalog in Azure AI Foundry is the hub to discover and use a wide range of Foundry Models for building generative AI applications. Models must be deployed before they can receive inference requests. Azure AI Foundry offers a comprehensive suite of deployment options for Foundry Models, depending on your needs and model requirements.

## Deployment options

Azure AI Foundry provides several deployment options depending on the type of models and resources you need to provision. The following deployment options are available:

- Standard deployment in Azure AI Foundry resources
- Deployment to serverless API endpoints
- Deployment to managed computes

### Standard deployment in Azure AI Foundry resources

Azure AI Foundry resources (formerly known as Azure AI Services resources) are **the preferred deployment option** in Azure AI Foundry. This option offers the widest range of capabilities, including regional, data zone, or global processing, and it offers standard and [provisioned throughput (PTU)](../../ai-services/openai/concepts/provisioned-throughput.md) options. Flagship models in Azure AI Foundry Models support this deployment option.

This deployment option is available in:

* Azure AI Foundry resources
* Azure OpenAI resources<sup>1</sup>
* Azure AI hub, when connected to an Azure AI Foundry resource (requires the [Deploy models to Azure AI Foundry resources](#configure-azure-ai-foundry-portal-for-deployment-options) feature to be turned on)

<sup>1</sup> If you're using Azure OpenAI resources, the model catalog shows only Azure OpenAI in Foundry Models for deployment. You can get the full list of Foundry Models by upgrading to an Azure AI Foundry resource.

To get started with standard deployment in Azure AI Foundry resources, see [How-to: Deploy models to Azure AI Foundry Models](../foundry-models/how-to/create-model-deployments.md).
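
The following Python sketch illustrates what calling a model hosted in a standard deployment can look like, using the OpenAI client with keyless (Microsoft Entra ID) authentication. The endpoint URL, API version, and deployment name are placeholders rather than values from this article; see the linked how-to for the exact steps.

```python
# Minimal sketch, assuming a chat model is already deployed as a standard deployment.
# The endpoint, API version, and deployment name below are placeholders.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Standard deployments support keyless (Microsoft Entra ID) authentication,
# so no API key is needed here.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint="https://<your-foundry-resource>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,
    api_version="2024-10-21",  # example API version
)

response = client.chat.completions.create(
    model="my-gpt-4o-deployment",  # your deployment name, not the model ID
    messages=[{"role": "user", "content": "Summarize the deployment options in Azure AI Foundry."}],
)
print(response.choices[0].message.content)
```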
| 39 | + |
| 40 | +### Serverless API endpoint |
| 41 | + |
| 42 | +This deployment option is available **only in** [Azure AI hub resources](ai-resources.md) and it allows the creation of dedicated endpoints to host the model, accessible via API. Azure AI Foundry Models support serverless API endpoints with pay-as-you-go billing. |
| 43 | + |
| 44 | +Only regional deployments can be created for serverless API endpoints, and to use it, you _must_ **turn off** the "Deploy models to Azure AI Foundry resources" option. |
| 45 | + |
| 46 | +To get started with deployment to a serverless API endpoint, see [Deploy models as serverless API deployments](../how-to/deploy-models-serverless.md). |
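
As an illustration, the following sketch calls a chat model that's already deployed to a serverless API endpoint by using the Azure AI Inference client. The endpoint URL and key environment variables are assumptions; take the actual values from the endpoint's details page.

```python
# Minimal sketch, assuming a chat model is already deployed to a serverless API
# endpoint and its URL and key are exported as environment variables.
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Serverless API endpoints use key-based authentication.
client = ChatCompletionsClient(
    endpoint=os.environ["SERVERLESS_ENDPOINT_URL"],
    credential=AzureKeyCredential(os.environ["SERVERLESS_ENDPOINT_KEY"]),
)

response = client.complete(
    messages=[UserMessage(content="Write a one-sentence summary of serverless API endpoints.")]
)
print(response.choices[0].message.content)
```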

### Managed compute

This deployment option is available **only in** [Azure AI hub resources](ai-resources.md). It allows you to create a dedicated endpoint that hosts the model on **dedicated compute**. You need compute quota in your subscription to host the model, and you're billed per compute uptime.

Managed compute deployment is required for model collections that include:

* Hugging Face
* NVIDIA inference microservices (NIMs)
* Industry models (Saifr, Rockwell, Bayer, Cerence, Sight Machine, Page AI, SDAIA)
* Databricks
* Custom models

To get started, see [How to deploy and inference a managed compute deployment](../how-to/deploy-models-managed.md) and [Deploy Azure AI Foundry Models to managed compute with pay-as-you-go billing](../how-to/deploy-models-managed-pay-go.md).
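
As a rough outline, the following sketch uses the Azure Machine Learning SDK to create a managed compute deployment. The model ID, VM size, and resource names are placeholders, and the VM size requires matching compute quota in your subscription; the linked articles cover the full portal and CLI flows.

```python
# Minimal sketch, assuming the azure-ai-ml package is installed and the model ID,
# VM size, and resource names below are placeholders.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<ai-project-name>",  # the AI project in your AI hub
)

# Create the endpoint that receives inference requests.
endpoint = ManagedOnlineEndpoint(name="my-managed-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Host a catalog model on dedicated compute behind that endpoint.
deployment = ManagedOnlineDeployment(
    name="default",
    endpoint_name="my-managed-endpoint",
    model="azureml://registries/<registry>/models/<model-name>/versions/<version>",  # placeholder
    instance_type="Standard_NC24ads_A100_v4",  # example GPU SKU; pick one you have quota for
    instance_count=1,  # you're billed per instance while the deployment exists
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```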

## Capabilities for the deployment options

We recommend using [standard deployments in Azure AI Foundry resources](#standard-deployment-in-azure-ai-foundry-resources) whenever possible, because this option offers the largest set of capabilities among the available deployment options. The following table lists details about the specific capabilities available for each deployment option:

| Capability | Standard deployment in Azure AI Foundry resources | Serverless API endpoint | Managed compute |
|-------------------------------|--------------------------------------------------|------------------------|-----------------|
| Which models can be deployed? | [Foundry Models](../../ai-foundry/foundry-models/concepts/models.md) | [Foundry Models with pay-as-you-go billing](../how-to/model-catalog-overview.md) | [Open and custom models](../how-to/model-catalog-overview.md#availability-of-models-for-deployment-as-managed-compute) |
| Deployment resource | Azure AI Foundry resource | AI project (in AI hub resource) | AI project (in AI hub resource) |
| Requires AI hubs | No | Yes | Yes |
| Data processing options | Regional <br /> Data-zone <br /> Global | Regional | Regional |
| Private networking | Yes | Yes | Yes |
| Content filtering | Yes | Yes | No |
| Custom content filtering | Yes | No | No |
| Key-less authentication | Yes | No | No |
| Billing bases | Token usage & [provisioned throughput units](../../ai-services/openai/concepts/provisioned-throughput.md) | Token usage<sup>1</sup> | Compute core hours<sup>2</sup> |

<sup>1</sup> A minimal endpoint infrastructure is billed per minute. You aren't billed for the infrastructure that hosts the model in a serverless API deployment. After you delete the endpoint, no further charges accrue.

<sup>2</sup> Billing is on a per-minute basis and depends on the product tier and the number of instances used in the deployment since the moment of creation. After you delete the endpoint, no further charges accrue.

## Configure Azure AI Foundry portal for deployment options

Azure AI Foundry portal might automatically pick a deployment option based on your environment and configuration. We recommend using Azure AI Foundry resources for deployment whenever possible. To do so, ensure that the **Deploy models to Azure AI Foundry resources** feature is **turned on**.

:::image type="content" source="../media/concepts/deployments-overview/docs-flag-enable-foundry.png" alt-text="A screenshot showing the steps to enable deployment to Azure AI Foundry resources in the Azure AI Foundry portal." lightbox="../media/concepts/deployments-overview/docs-flag-enable-foundry.png":::

Once the **Deploy models to Azure AI Foundry resources** feature is enabled, models that support multiple deployment options default to Azure AI Foundry resources. To access other deployment options, either disable the feature or use the Azure CLI or Azure Machine Learning SDK for deployment. You can disable and enable the feature as many times as needed without affecting existing deployments.
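
For example, a sketch along the following lines uses the Azure Machine Learning SDK to create a serverless API endpoint directly, regardless of the portal feature setting. The `ServerlessEndpoint` entity and the placeholder model ID are assumptions based on the azure-ai-ml package; check the SDK reference for the version you use.

```python
# Minimal sketch, assuming the azure-ai-ml package is installed and the model ID
# and resource names below are placeholders.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ServerlessEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<ai-project-name>",
)

# Create a serverless API endpoint for a catalog model (placeholder model ID).
endpoint = ServerlessEndpoint(
    name="my-serverless-endpoint",
    model_id="azureml://registries/<registry>/models/<model-name>",
)
created = ml_client.serverless_endpoints.begin_create_or_update(endpoint).result()
print(created.scoring_uri)  # URL to send inference requests to
```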

## Related content

* [Configure your AI project to use Foundry Models](../../ai-foundry/foundry-models/how-to/quickstart-ai-project.md)
* [Add and configure models to Foundry Models](../foundry-models/how-to/create-model-deployments.md)
* [Deploy Azure OpenAI models with Azure AI Foundry](../how-to/deploy-models-openai.md)
* [Deploy open models with Azure AI Foundry](../how-to/deploy-models-managed.md)
* [Explore Azure AI Foundry Models](../how-to/model-catalog-overview.md)