Bring Your Own Model (BYOM) with Voice Live API

The Voice Live API provides Bring Your Own Model (BYOM) capabilities, allowing you to integrate your custom models into the voice interaction workflow. BYOM is useful for the following scenarios:

  • Fine-tuned models: Use your custom Azure OpenAI or Azure Foundry models
  • Any Foundry model not pre-deployed by Voice Live: Use models from the Foundry model catalog such as Anthropic Claude, Grok, Fireworks custom weights, or a model router deployment
  • Provisioned throughput: Use your PTU (Provisioned Throughput Units) deployments for consistent performance
  • Content safety: Apply customized content safety configurations with your LLM

Important

You can integrate any model deployed in your Azure Foundry resource with the Voice Live API. To use model deployments from a different Foundry resource, see Resource overrides.

Tip

When you use your own model deployment with Voice Live, we recommend you set its content filtering configuration to Asynchronous filtering to reduce latency. Content filtering settings can be configured in the Microsoft Foundry portal.

Authentication setup

When you use Microsoft Entra ID authentication with the Voice Live API in byom-azure-openai-chat-completion or byom-foundry-anthropic-messages mode, you need to configure permissions for your Foundry resource. Because tokens can expire during long sessions, the Foundry resource's system-assigned managed identity needs access to the model deployments in these BYOM modes.

Run the following Azure CLI commands to configure the necessary permissions:

export subscription_id=<your-subscription-id>
export resource_group=<your-resource-group>
export foundry_resource=<your-foundry-resource>

# Enable system-assigned managed identity for the foundry resource
az cognitiveservices account identity assign \
    --name ${foundry_resource} \
    --resource-group ${resource_group} \
    --subscription ${subscription_id}

# Get the system-assigned managed identity object ID
identity_principal_id=$(az cognitiveservices account show \
    --name ${foundry_resource} \
    --resource-group ${resource_group} \
    --subscription ${subscription_id} \
    --query "identity.principalId" -o tsv)

# Assign the Azure AI User role to the system identity of the foundry resource
az role assignment create \
    --assignee-object-id ${identity_principal_id} \
    --role "Azure AI User" \
    --scope /subscriptions/${subscription_id}/resourceGroups/${resource_group}/providers/Microsoft.CognitiveServices/accounts/${foundry_resource}

Cross-resource authentication

When you use resource overrides, authentication setup is mandatory regardless of your authentication method (API key or Microsoft Entra ID). You must configure permissions for both the Voice Live Foundry resource and the model Foundry resource. Run the following commands to configure the necessary permissions:

export subscription_id_for_model=<your-subscription-id-for-model-resource>
export resource_group_for_model=<your-resource-group-for-model-resource>
export foundry_resource_for_model=<your-foundry-resource-for-model>

export subscription_id_for_voice_live=<your-subscription-id-for-voice-live-resource>
export resource_group_for_voice_live=<your-resource-group-for-voice-live-resource>
export foundry_resource_for_voice_live=<your-foundry-resource-for-voice-live>

# Enable system-assigned managed identity for the Voice Live Foundry resource
az cognitiveservices account identity assign \
    --name ${foundry_resource_for_voice_live} \
    --resource-group ${resource_group_for_voice_live} \
    --subscription ${subscription_id_for_voice_live}

# Get the system-assigned managed identity object ID
# for the Voice Live resource
identity_principal_id=$(az cognitiveservices account show \
    --name ${foundry_resource_for_voice_live} \
    --resource-group ${resource_group_for_voice_live} \
    --subscription ${subscription_id_for_voice_live} \
    --query "identity.principalId" -o tsv)

# Assign the Azure AI User role to the Voice Live resource's
# system identity on the model Foundry resource
az role assignment create \
    --assignee-object-id ${identity_principal_id} \
    --role "Azure AI User" \
    --scope /subscriptions/${subscription_id_for_model}/resourceGroups/${resource_group_for_model}/providers/Microsoft.CognitiveServices/accounts/${foundry_resource_for_model}

Choose BYOM integration mode

The Voice Live API supports three BYOM integration modes:

| Mode | Description | Example models |
| --- | --- | --- |
| byom-azure-openai-realtime | Azure OpenAI realtime models for streaming voice interactions | gpt-realtime, gpt-realtime-mini |
| byom-azure-openai-chat-completion | Azure OpenAI chat completion models for text-based interactions; also applies to other Foundry models | gpt-5.4, gpt-5.3-chat, grok-4 |
| byom-foundry-anthropic-messages | Anthropic Claude models deployed in Azure Foundry, using the Messages API (preview) | claude-sonnet-4.6, claude-haiku-4.5 |

Note

The byom-foundry-anthropic-messages mode is currently in preview. Preview features are subject to change and might have limited availability.
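The mode names are exact string values passed in the connection URL, so a small shell guard can catch typos before you open a session. This is only a sketch; the profile value below is an example:

```shell
# The Voice Live API currently supports exactly three BYOM profile values.
profile="byom-azure-openai-chat-completion"   # example value

case "${profile}" in
  byom-azure-openai-realtime|byom-azure-openai-chat-completion|byom-foundry-anthropic-messages)
    supported=true ;;
  *)
    supported=false ;;
esac
echo "profile ${profile}: supported=${supported}"
```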

Integrate BYOM

Update the endpoint URL in your API call to include your BYOM configuration:

wss://<your-foundry-resource>.cognitiveservices.azure.com/voice-live/realtime?api-version=2025-10-01&profile=<your-byom-mode>&model=<your-model-deployment>

Get the <your-model-deployment> value from the Foundry portal. It corresponds to the name you gave the model at deployment time.

For example, to use an Anthropic Claude model deployed in Azure Foundry:

wss://<your-foundry-resource>.cognitiveservices.azure.com/voice-live/realtime?api-version=2025-10-01&profile=byom-foundry-anthropic-messages&model=<your-claude-deployment-name>

To use a model deployment from a different Foundry resource, add the foundry-resource-override parameter:

wss://<your-foundry-resource>.cognitiveservices.azure.com/voice-live/realtime?api-version=2025-10-01&profile=<your-byom-mode>&model=<your-model-deployment>&foundry-resource-override=<foundry-resource>

The <foundry-resource> value is the resource name without the domain suffix. For example, if the Foundry resource endpoint is https://my-foundry-resource.services.ai.azure.com, then use my-foundry-resource.
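Putting the pieces together, the endpoint URL can be assembled from shell variables. This is a minimal sketch; the resource, profile, and deployment names below are placeholders for your own values:

```shell
# Placeholder values; replace with your own resource and deployment names
foundry_resource="my-voice-live-resource"
byom_profile="byom-foundry-anthropic-messages"
model_deployment="my-claude-deployment"
api_version="2025-10-01"

# Voice Live WebSocket endpoint with the BYOM query parameters
byom_url="wss://${foundry_resource}.cognitiveservices.azure.com/voice-live/realtime?api-version=${api_version}&profile=${byom_profile}&model=${model_deployment}"

# If the model deployment lives in a different Foundry resource, append
# the override parameter (resource name without the domain suffix)
model_resource="my-model-resource"
byom_url="${byom_url}&foundry-resource-override=${model_resource}"

echo "${byom_url}"
```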

Note

When you use the byom-foundry-anthropic-messages mode, the usage field in response.done events only contains audio token usage (for speech recognition and text-to-speech). LLM token usage from the Anthropic model is reported separately in the response metadata.

Resource overrides

By default, the Voice Live API uses LLM deployments in the same Foundry resource as the Voice Live service. If your model deployments are in a different Foundry resource, specify the foundry-resource-override query parameter to redirect the API to the correct resource. This supports cross-region scenarios where the Voice Live service and the model deployments are in different regions.

The foundry-resource-override value is the resource name without the domain suffix. For example, if the Foundry resource endpoint is https://my-foundry-resource.services.ai.azure.com, use my-foundry-resource.
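For example, the override value can be derived from the resource endpoint with shell parameter expansion (the endpoint below is a placeholder):

```shell
# Derive the override value from a Foundry resource endpoint URL
endpoint="https://my-foundry-resource.services.ai.azure.com"

# Strip the scheme, then everything from the first dot onward
override="${endpoint#https://}"
override="${override%%.*}"
echo "${override}"   # my-foundry-resource
```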

See the Integrate BYOM section for implementation details.

Important

When you use resource overrides, you must configure cross-resource authentication regardless of your authentication method (API key or Microsoft Entra ID).