Important
Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Import and deploy your own model weights on Foundry using the Fireworks inference runtime.
In this article, you learn how to import, register, and deploy your own custom model weights in Microsoft Foundry. Custom model import (also known as bring your own weights) lets you run your proprietary or fine-tuned open-weight models within the Foundry ecosystem.
Note
This custom model import guide uses the Fireworks on Foundry integration. For an overview of available catalog models, supported architectures, data privacy, and limitations, see Use Fireworks models on Foundry.
The import workflow has four steps:
- Prepare your model files in a supported architecture.
- Register the model in the Foundry portal.
- Upload model weights using the Azure Developer CLI.
- Deploy the model to Fireworks inference infrastructure.
Prerequisites
Before you begin, make sure your Azure environment is set up and that you have the required tools installed. To complete the steps in this article, you need the following resources and permissions:
- An Azure subscription. If you don't have one, create a free account.
- A Foundry resource with a Foundry project.
- The Fireworks on Foundry preview feature enabled in your subscription. For setup steps, see Use Fireworks models on Foundry.
- The Cognitive Services Contributor role or equivalent permissions on the Foundry resource to create and manage deployments. For more information, see Azure role-based access control.
- Azure Developer CLI (`azd`) installed locally. The import workflow uses `azd` to upload model weights.
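Before you generate the upload command, you can confirm that `azd` is available locally. This is a quick sanity check, assuming a POSIX shell:

```shell
# Verify the Azure Developer CLI is on PATH before starting the import.
if command -v azd >/dev/null 2>&1; then
  azd version
else
  echo "azd not found - install it from https://aka.ms/azd before continuing"
fi
```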
Region availability
Support for deploying custom models is available in all global Azure regions except for Azure Government cloud environments.
Model requirements
Custom models must match a supported architecture and include specific files for Foundry to register and deploy them. Review both requirements before starting the import process.
Supported architectures
Custom models must be based on one of the following model architectures:
| Model Architecture | Versions |
|---|---|
| DeepSeek | V3.1, V3.2 |
| Kimi | K2, K2.5 |
| GLM | 4.7, 4.8 |
| OpenAI | gpt-oss-120b |
| Qwen | qwen3-14b |
Required model files
Your model directory must include the following files:
| File | Description |
|---|---|
| `config.json` | Model configuration (architecture, hyperparameters). |
| `*.safetensors` or `*.bin` | One or more model weight files. |
| `*.index.json` | At least one weights index file that maps weight shards. |
| `tokenizer.model`, `tokenizer.json`, or `tokenizer_config.json` | Tokenizer files required for the model. |
Important
Only full-weight models with original quantization are supported. LoRA adapters or custom quantized models aren't currently supported in this preview.
Import a custom model
The import process starts in the Foundry portal, where you register your model, and then uses the Azure Developer CLI to upload the model weights from your local machine.
Sign in to the Foundry portal.
From the Foundry portal homepage, select Build in the upper-right navigation, then select Models in the left pane.
Select the Custom Models tab.
Select Add a custom model.
Configure the following settings:
Model name: Enter a descriptive name for your custom model.
Base model architecture: Select the model architecture that matches your model (for example, `DeepSeek V3.2` or `GLM 4.7`).
The portal generates an `azd` command. Copy the command and paste it into a local terminal. Update the `--source` parameter to point to the directory that contains your model weight files.
Tip
Make sure the directory you specify contains all the required model files. Missing files cause the import to fail.
Wait for the upload to complete. Upload time depends on the model size and your network bandwidth. Large models (tens of gigabytes) can take a significant amount of time over standard connections.
Verify model registration
After the upload finishes, confirm that Foundry successfully registered the model before proceeding to deployment.
Return to the Foundry portal and refresh the Custom Models page.
Confirm that your imported custom model appears in the list with a Registered status.
Select your model to review its details, including the architecture and file manifest.
Deploy the imported model
With the model registered, you can deploy it to Fireworks inference infrastructure for serving.
From the Custom Models list, select your custom model.
Select Deploy.
Configure the deployment:
- Deployment name: provide a deployment name. During inference, this name is used in the `model` parameter to route requests to this deployment.
- Provisioned throughput units: allocate the number of provisioned throughput units (PTUs) for the deployment. For more information, see Provisioned throughput concepts.
Review and acknowledge the pricing terms.
Select Deploy.
When the deployment completes, the status shows Succeeded in your deployment list.
Note
You can only have one active model deployment of a custom model at a time in a given project.
Deployment examples
Use the following examples to automate parts of the deployment workflow after the custom model is registered. Each example deploys the custom model with 80 units of Global Provisioned throughput. Be sure to replace any placeholders with your details.
```http
PUT https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.CognitiveServices/accounts/{foundry-account}/deployments/{deployment-name}?api-version=2025-06-01
Authorization: Bearer <access-token>
Content-Type: application/json

{
  "sku": {
    "name": "GlobalProvisionedManaged",
    "capacity": 80
  },
  "properties": {
    "model": {
      "name": "<registered-model-name>",
      "format": "FireworksCustom",
      "version": "1",
      "source": "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.CognitiveServices/accounts/{foundry-account}/projects/{foundry-project}"
    }
  }
}
```
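To script the same call, you can assemble the URL and body programmatically and send them with any HTTP client. The sketch below only builds the request; the function name and parameter names are illustrative, not part of any SDK:

```python
def build_deployment_request(subscription_id, resource_group, foundry_account,
                             foundry_project, deployment_name, model_name,
                             capacity=80, api_version="2025-06-01"):
    """Assemble the management-plane URL and JSON body for the deployment PUT call."""
    base = (f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
            f"/providers/Microsoft.CognitiveServices/accounts/{foundry_account}")
    url = (f"https://management.azure.com{base}"
           f"/deployments/{deployment_name}?api-version={api_version}")
    body = {
        "sku": {"name": "GlobalProvisionedManaged", "capacity": capacity},
        "properties": {
            "model": {
                "name": model_name,
                "format": "FireworksCustom",
                "version": "1",
                "source": f"{base}/projects/{foundry_project}",
            }
        },
    }
    return url, body
```

Send the result with your preferred client (for example, `requests.put(url, json=body, headers=...)`), passing a bearer token in the `Authorization` header as shown in the raw example above.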
Test your deployment
After the deployment succeeds, verify it works by sending a test request:
- Open the Foundry Playground.
- Select your custom model deployment from the model list.
- Send a test prompt and confirm the model returns a valid response.
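You can also send the test request from code. The snippet below is a minimal sketch using only the standard library; the environment variable names and the chat completions path are assumptions, so substitute your resource's actual endpoint, key, and route:

```python
import json
import os
import urllib.request

# Placeholder settings: these env var names are assumptions, not official names.
ENDPOINT = os.environ.get("FOUNDRY_ENDPOINT", "https://<your-resource>.services.ai.azure.com")
API_KEY = os.environ.get("FOUNDRY_API_KEY", "<api-key>")

def build_chat_request(deployment_name: str, prompt: str) -> bytes:
    """JSON payload for a chat completions call; the deployment name goes in
    the `model` field to route the request to your deployment."""
    return json.dumps({
        "model": deployment_name,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }).encode("utf-8")

def send_test_prompt(deployment_name: str, prompt: str = "Reply with one short sentence."):
    """Send one test prompt and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{ENDPOINT}/openai/v1/chat/completions",  # assumed route; check your resource
        data=build_chat_request(deployment_name, prompt),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```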
Troubleshooting
If you encounter issues during import or deployment, use the following table to identify common problems and resolutions.
| Issue | Resolution |
|---|---|
| Import fails with missing files | Verify your model directory contains all required model files, including config.json, weight files, an index file, and tokenizer files. |
| Architecture mismatch | Confirm the architecture you selected matches your model. See supported architectures. |
| Upload times out or stalls | Check your network connection and retry. For large models, use a stable high-bandwidth connection. |
| Deployment fails | Confirm you have sufficient quota and that the Fireworks preview feature is enabled and registered in your subscription. |
| Quota exceeded | Request more quota or reallocate provisioned throughput units from existing deployments. |
For more troubleshooting guidance, see Troubleshoot Fireworks on Foundry.
Related content
Explore the following resources to learn more about Fireworks models, deployment options, and authentication on Foundry.