Important
Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Import and deploy your own model weights on Foundry using the Fireworks inference runtime.
In this article, you learn how to import, register, and deploy your own custom model weights in Microsoft Foundry. Custom model import (also known as bring your own weights) lets you run your proprietary or fine-tuned open-weight models within the Foundry ecosystem.
Note
This custom model import guide uses the Fireworks on Foundry integration. For an overview of available catalog models, supported architectures, data privacy, and limitations, see Use Fireworks models on Foundry.
The import workflow has four steps:
- Prepare your model files in a supported architecture.
- Register the model in the Foundry portal.
- Upload model weights using the Azure Developer CLI.
- Deploy the model to Fireworks inference infrastructure.
Prerequisites
Before you begin, make sure your Azure environment is set up and that you have the required tools installed. To complete the steps in this article, you need the following resources and permissions:
- An Azure subscription. If you don't have one, create a free account.
- A Foundry resource with a Foundry project.
- The Fireworks on Foundry preview feature enabled in your subscription. For setup steps, see Use Fireworks models on Foundry.
- The Cognitive Services Contributor role or equivalent permissions on the Foundry resource to create and manage deployments. For more information, see Azure role-based access control.
- Azure Developer CLI (`azd`) installed locally. The import workflow uses `azd` to upload model weights.
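Before you generate the upload command, you can confirm that `azd` is available locally. This is a quick sanity check, assuming a POSIX shell:

```shell
# Verify the Azure Developer CLI is on PATH before starting the import.
if command -v azd >/dev/null 2>&1; then
  azd version
else
  echo "azd not found - install it from https://aka.ms/azd before continuing"
fi
```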
Region availability
Support for deploying custom models is available in all global Azure regions except for Azure Government cloud environments.
Model requirements
Custom models must match a supported architecture and include specific files for Foundry to register and deploy them. Review both requirements before starting the import process.
Supported architectures
Custom models must be based on one of the following model architectures:
| Model Architecture | Versions |
|---|---|
| DeepSeek | V3.1, V3.2 |
| Kimi | K2, K2.5 |
| GLM | 4.7, 4.8 |
| OpenAI | gpt-oss-120b |
| Qwen | qwen3-14b |
Required model files
Your model directory must include the following files:
| File | Description |
|---|---|
| `config.json` | Model configuration (architecture, hyperparameters). |
| `*.safetensors` or `*.bin` | One or more model weight files. |
| `*.index.json` | At least one weights index file that maps weight shards. |
| `tokenizer.model`, `tokenizer.json`, or `tokenizer_config.json` | Tokenizer files required for the model. |
Important
Only full-weight models with original quantization are supported. LoRA adapters or custom quantized models aren't currently supported in this preview.
Import a custom model
The import process starts in the Foundry portal, where you register your model, and then uses the Azure Developer CLI to upload the model weights from your local machine.
Sign in to the Foundry portal.
From the Foundry portal homepage, select Build in the upper-right navigation, then select Models in the left pane.
Select the Custom Models tab.
Select Add a custom model.
Configure the following settings:
Model name: Enter a descriptive name for your custom model.
Base model architecture: Select the model architecture that matches your model (for example, `DeepSeek V3.2` or `GLM 4.7`).
The portal generates an `azd` command. Copy the command and paste it into a local terminal. Update the `--source` parameter to point to the directory that contains your model weight files.
Tip
Make sure the directory you specify contains all the required model files. Missing files cause the import to fail.
Wait for the upload to complete. Upload time depends on the model size and your network bandwidth. Large models (tens of gigabytes) can take a significant amount of time over standard connections.
Verify model registration
After the upload finishes, confirm that Foundry successfully registered the model before proceeding to deployment.
Return to the Foundry portal and refresh the Custom Models page.
Confirm that your imported custom model appears in the list with a Registered status.
Select your model to review its details, including the architecture and file manifest.
Deploy the imported model
With the model registered, you can deploy it to Fireworks inference infrastructure for serving.
From the Custom Models list, select your custom model.
Select Deploy.
Configure the deployment:
- Deployment name: provide a deployment name. During inference, this name is used in the `model` parameter to route requests to this deployment.
- Provisioned throughput units: allocate the number of provisioned throughput units (PTUs) for the deployment. For more information, see Provisioned throughput concepts.
Review and acknowledge the pricing terms.
Select Deploy.
When the deployment completes, the status shows Succeeded in your deployment list.
Note
You can only have one active model deployment of a custom model at a time in a given project.
Deployment examples
Use the following examples to automate parts of the deployment workflow after the custom model is registered. Each example deploys the custom model with 80 units of Global Provisioned throughput. Be sure to replace any placeholders with your details.
```http
PUT https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.CognitiveServices/accounts/{foundry-account}/deployments/{deployment-name}?api-version=2025-06-01
Authorization: Bearer <access-token>
Content-Type: application/json

{
  "sku": {
    "name": "GlobalProvisionedManaged",
    "capacity": 80
  },
  "properties": {
    "model": {
      "name": "<registered-model-name>",
      "format": "FireworksCustom",
      "version": "1",
      "source": "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.CognitiveServices/accounts/{foundry-account}/projects/{foundry-project}"
    }
  }
}
```
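To script the same call, you can assemble the URL and body programmatically and send them with any HTTP client. The sketch below only builds the request; the function name and parameter names are illustrative, not part of any SDK:

```python
def build_deployment_request(subscription_id, resource_group, foundry_account,
                             foundry_project, deployment_name, model_name,
                             capacity=80, api_version="2025-06-01"):
    """Assemble the management-plane URL and JSON body for the deployment PUT call."""
    base = (f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
            f"/providers/Microsoft.CognitiveServices/accounts/{foundry_account}")
    url = (f"https://management.azure.com{base}"
           f"/deployments/{deployment_name}?api-version={api_version}")
    body = {
        "sku": {"name": "GlobalProvisionedManaged", "capacity": capacity},
        "properties": {
            "model": {
                "name": model_name,
                "format": "FireworksCustom",
                "version": "1",
                "source": f"{base}/projects/{foundry_project}",
            }
        },
    }
    return url, body
```

Send the result with your preferred client (for example, `requests.put(url, json=body, headers=...)`), passing a bearer token in the `Authorization` header as shown in the raw example above.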
Test your deployment
After the deployment succeeds, verify it works by sending a test request:
- Open the Foundry Playground.
- Select your custom model deployment from the model list.
- Send a test prompt and confirm the model returns a valid response.
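You can also send the test request from code. The snippet below is a minimal sketch using only the standard library; the environment variable names and the chat completions path are assumptions, so substitute your resource's actual endpoint, key, and route:

```python
import json
import os
import urllib.request

# Placeholder settings: these env var names are assumptions, not official names.
ENDPOINT = os.environ.get("FOUNDRY_ENDPOINT", "https://<your-resource>.services.ai.azure.com")
API_KEY = os.environ.get("FOUNDRY_API_KEY", "<api-key>")

def build_chat_request(deployment_name: str, prompt: str) -> bytes:
    """JSON payload for a chat completions call; the deployment name goes in
    the `model` field to route the request to your deployment."""
    return json.dumps({
        "model": deployment_name,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }).encode("utf-8")

def send_test_prompt(deployment_name: str, prompt: str = "Reply with one short sentence."):
    """Send one test prompt and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{ENDPOINT}/openai/v1/chat/completions",  # assumed route; check your resource
        data=build_chat_request(deployment_name, prompt),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```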
Troubleshooting
If you encounter issues during import or deployment, use the following table to identify common problems and resolutions.
| Issue | Resolution |
|---|---|
| Import fails with missing files | Verify your model directory contains all required model files, including config.json, weight files, an index file, and tokenizer files. |
| Architecture mismatch | Confirm the architecture you selected matches your model. See supported architectures. |
| Upload times out or stalls | Check your network connection and retry. For large models, use a stable high-bandwidth connection. |
| Deployment fails | Confirm you have sufficient quota and that the Fireworks preview feature is enabled and registered in your subscription. |
| Quota exceeded | Request more quota or reallocate provisioned throughput units from existing deployments. |
For more troubleshooting guidance, see Troubleshoot Fireworks on Foundry.
Related content
Explore the following resources to learn more about Fireworks models, deployment options, and authentication on Foundry.