Azure VM Provisioning Failure Due to Real-Time Capacity Constraints (SkuNotAvailable Issue) - Azure Server Prepare Automation

Ashish Sharma 0 Reputation points
2026-04-07T07:38:30.7466667+00:00

Problem Statement

I am building an automated system to provision Azure Virtual Machines (VMs) using APIs. The system is fully automated and does not require manual intervention.

The issue occurs at the final step of VM creation, which frequently fails with a SkuNotAvailable error due to temporary capacity shortages in the selected region, even though earlier validation indicates the VM size is available.


Current Workflow

The system follows these steps:

Validate VM Size Availability

  • Call Resource SKUs API:
      GET /subscriptions/{subId}/providers/Microsoft.Compute/skus
    
  • Filter by region (e.g., Central India)
  • If the VM size (e.g., Standard_B1s) has no restrictions, it is considered available
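The static check in this step can be sketched in Python as follows. This is a minimal sketch, assuming the JSON body of the Resource SKUs list call has already been fetched and parsed; `is_sku_listed` and `sample_response` are hypothetical names used for illustration only. A `True` result means the size is listed for the region without restrictions. It does not mean capacity exists at this moment.

```python
from typing import Any


def is_sku_listed(skus_response: dict[str, Any], size: str, region: str) -> bool:
    """Return True if `size` is listed for `region` with no restrictions.

    This is only the static check; it says nothing about real-time capacity.
    """
    # Normalize a display name such as "Central India" to "centralindia".
    wanted = region.replace(" ", "").lower()
    for sku in skus_response.get("value", []):
        if sku.get("resourceType") != "virtualMachines" or sku.get("name") != size:
            continue
        locations = [loc.lower() for loc in sku.get("locations", [])]
        if wanted in locations:
            # An empty or missing restrictions list means nothing blocks this SKU.
            return not sku.get("restrictions")
    return False


# Hypothetical, trimmed-down payload for illustration only.
sample_response = {
    "value": [
        {
            "resourceType": "virtualMachines",
            "name": "Standard_B1s",
            "locations": ["centralindia"],
            "restrictions": [],
        },
        {
            "resourceType": "virtualMachines",
            "name": "Standard_D2s_v3",
            "locations": ["centralindia"],
            "restrictions": [
                {"type": "Location", "reasonCode": "NotAvailableForSubscription"}
            ],
        },
    ]
}

b1s_listed = is_sku_listed(sample_response, "Standard_B1s", "Central India")
d2s_listed = is_sku_listed(sample_response, "Standard_D2s_v3", "centralindia")
```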

Create Supporting Resources

  • Resource Group
  • Virtual Network
  • Public IP
  • Network Interface (wait until provisioning succeeds)

Create Virtual Machine

  • Call VM creation API:
      PUT /subscriptions/{subId}/resourceGroups/{rg}/providers/Microsoft.Compute/virtualMachines/{vmName}
    
  • This step intermittently fails with SkuNotAvailable, even after validation passes

Failure Details

Even after successful validation, VM creation fails with the following error:

SkuNotAvailable: The requested VM size is currently not available in the selected region due to capacity restrictions.

Key observations:

  • The issue is caused by temporary capacity shortages
  • The same VM size may work later or in another region
  • Resource SKUs API does not provide real-time capacity information

User Experience

User Journey

  1. User logs in with Microsoft account
  2. Selects Azure subscription
  3. Chooses region (e.g., Central India)
  4. Enters expected user load (e.g., 100 users)
  5. System recommends a VM size
  6. User clicks Deploy
  7. After 2–3 minutes, deployment fails due to capacity issue

Impact

  • User waits unnecessarily
  • Deployment fails at the final step
  • User must restart the process
  • Leads to poor user experience

Backend Process

After user clicks Deploy:

  1. Validate subscription, region, and VM size
  2. Create required resources:
    • Resource Group
    • Virtual Network
    • Public IP
    • Network Interface
  3. Attempt VM creation
  4. Failure occurs due to real-time capacity issue
  5. Partial resources remain and require cleanup

Identified Gap

  • No Azure API provides real-time capacity availability for VM sizes
  • Resource SKUs API only shows static or subscription-level restrictions
  • Temporary capacity constraints are not exposed before deployment

Requirements

Primary Requirement

A reliable way to determine, at deployment time, whether a VM size is actually available considering real-time capacity.

Alternative Requirement

A recommended backend design pattern to handle this scenario efficiently.


Questions / Solution Areas

  1. Is there any Azure API that provides real-time VM capacity availability before deployment?
  2. If not, what is the best practice to handle this in an automated system?
  3. Should the system implement:
    • Fallback VM sizes?
    • Retry logic with different regions or zones?
    • Lightweight VM creation validation before full resource setup?
    • Azure services like Compute Fleet for capacity handling?
  4. Are there any new or preview APIs that address this limitation?

The system currently relies on static availability checks, but failures occur due to real-time capacity constraints at the final deployment step. A more reliable validation or fallback strategy is required to improve success rate and user experience.

I look forward to your response. Thank you.

Azure Virtual Machines

An Azure service that is used to provision Windows and Linux virtual machines.

2 answers

  1. Jilakara Hemalatha 11,515 Reputation points Microsoft External Staff Moderator
    2026-04-07T08:08:48.93+00:00

    Hello Ashish,

    Thank you for the detailed explanation of your workflow and the issue observed during VM provisioning.

    Based on your description, the behavior you are encountering is expected and aligns with how Azure handles compute capacity allocation.

    The validation step using the Resource SKUs API confirms whether a VM size is supported in a region and whether there are any subscription-level restrictions. However, it does not provide real-time capacity availability. Azure capacity is dynamically allocated across regions and customers, and availability can change between the validation step and the actual VM creation request.

    As a result, even if a VM size appears available during validation, the deployment may still fail with a SkuNotAvailable error at the time of allocation due to temporary capacity constraints in the selected region or availability zone.

    At present, there is no Azure API that exposes real-time VM capacity availability prior to deployment. This is a known platform behavior and not an issue with your implementation.

    Given this, the recommended approach is to design the automation workflow to handle such allocation failures gracefully.

    You may consider attempting On-Demand Capacity Reservation at the beginning of the workflow so that capacity is validated upfront. If the reservation is unsuccessful, you can immediately try alternate options without creating dependent resources: https://learn.microsoft.com/en-us/azure/virtual-machines/capacity-reservation-overview

    It is also recommended to maintain a fallback strategy with alternate VM sizes, availability zones, and regions. In case of a SkuNotAvailable or AllocationFailed error, the system can automatically retry using the next available option: https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/allocation-failure

    Additionally, implementing retry logic with exponential backoff is important, as these failures are often temporary and may succeed on subsequent attempts: https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/allocation-failure
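The retry advice above could look roughly like the following Python sketch. It assumes the VM create call is wrapped in a function that raises an exception carrying the Azure error code; `ProvisioningError`, `retry_with_backoff`, and the delay values are hypothetical and chosen only for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error codes treated as transient capacity problems (taken from the answer above).
RETRYABLE_CODES = {"SkuNotAvailable", "AllocationFailed"}


class ProvisioningError(Exception):
    """Hypothetical wrapper for a failed VM create call."""

    def __init__(self, code: str, message: str = "") -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def retry_with_backoff(
    create_vm: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `create_vm`, retrying capacity errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return create_vm()
        except ProvisioningError as err:
            is_last = attempt == max_attempts - 1
            if err.code not in RETRYABLE_CODES or is_last:
                raise
            # Delay doubles on each attempt, capped at max_delay, plus jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


# Usage with a simulated create call that succeeds on the third attempt.
attempts: list[int] = []


def flaky_create() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ProvisioningError("SkuNotAvailable", "capacity restriction")
    return "vm-created"


waits: list[float] = []
result = retry_with_backoff(flaky_create, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in production the default `time.sleep` (or an async equivalent) would apply.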

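    To limit partial resource creation, you may consider using ARM or Bicep templates so that all resources are submitted as a single deployment. Note that a failed deployment does not automatically delete resources that were already created, so deploying into a dedicated resource group that can be deleted on failure keeps cleanup simple.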

    For scenarios where capacity must be guaranteed, Azure Capacity Reservations can be used to reserve compute resources in advance, ensuring availability at deployment time: https://learn.microsoft.com/en-us/azure/virtual-machines/capacity-reservation-overview

    Hope this helps! Please let me know if you have any queries.

  2. Q&A Assist
    2026-04-07T07:39:21.54+00:00
    AI-generated content. This question contains an answer created with AI from Q&A Assist.
    1. There is no API that exposes real‑time capacity

    Azure doesn’t expose real‑time, per‑SKU capacity availability via any public API. The behavior described matches documented behavior:

    • SkuNotAvailable is raised when a VM size isn’t available in a region/zone at allocation time, even if the SKU is listed as available for that region and subscription.
    • The Resource SKUs list only tells whether a SKU can be used in a region/subscription, not whether there is current capacity.

    This is explicitly called out in the guidance for SkuNotAvailable and allocation failures: the recommendation is to choose another size or region, or retry later when capacity frees up.

    2. Recommended patterns and best practices

    Because real‑time capacity isn’t exposed, the pattern has to be resilient to allocation failure rather than trying to avoid it completely.

    2.1. Use On‑demand Capacity Reservations for critical SKUs

    If the workload is important enough and uses supported VM series, use On‑demand Capacity Reservations:

    • Reserve capacity in advance for a specific VM size/region (and optionally zone/fault domains).
    • Once reserved, deployments against that reservation won’t hit transient capacity shortages (within the reserved quantity and constraints).
    • Capacity reservations require quota and are limited to specific VM series and sizes.

    This is the only way to get a strong guarantee that capacity will be available when the VM is created.

    Relevant points from the documentation:

    • Capacity reservations require quota just like VMs.
    • Only certain VM series/sizes are supported; supported SKUs are advertised via the compute Resource SKUs list.
    • There are limitations (no Spot, no Availability Sets, some constraints like PPG/UltraSSD not supported, max 3 fault domains, etc.).

    For user‑facing flows where failure is very costly to UX, consider:

    • Pre‑creating capacity reservations per region and VM family that the system offers.
    • Associating the VM/VMSS explicitly with the reservation when deploying.
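As an illustration of the two bullets above, the following Python sketch builds the request path and body for creating a capacity reservation, plus the VM property fragment that associates a VM with the reservation group. It is a sketch under stated assumptions: the shapes reflect my understanding of the Microsoft.Compute REST API and should be checked against the current reference, the reservation group is assumed to exist already, the api-version is deliberately left to the caller, and all function names and sample identifiers are hypothetical. Whether a given size supports reservations must also be verified.

```python
from typing import Any, Optional


def reservation_group_id(sub_id: str, rg: str, group: str) -> str:
    """Resource ID of a capacity reservation group (assumed to exist)."""
    return (
        f"/subscriptions/{sub_id}/resourceGroups/{rg}"
        f"/providers/Microsoft.Compute/capacityReservationGroups/{group}"
    )


def reservation_request(
    sub_id: str,
    rg: str,
    group: str,
    name: str,
    region: str,
    size: str,
    quantity: int,
    zone: Optional[str] = None,
) -> tuple[str, dict[str, Any]]:
    """Return (path, body) for a PUT that reserves `quantity` VMs of `size`."""
    path = f"{reservation_group_id(sub_id, rg, group)}/capacityReservations/{name}"
    body: dict[str, Any] = {
        "location": region,
        "sku": {"name": size, "capacity": quantity},
    }
    if zone is not None:
        body["zones"] = [zone]
    return path, body


def vm_reservation_properties(sub_id: str, rg: str, group: str) -> dict[str, Any]:
    """Fragment to merge into the VM's `properties` so it uses the reservation."""
    return {
        "capacityReservation": {
            "capacityReservationGroup": {"id": reservation_group_id(sub_id, rg, group)}
        }
    }


# Hypothetical identifiers for illustration only.
path, body = reservation_request(
    "sub-123", "rg-demo", "crg-demo", "cr-d2sv3",
    region="centralindia", size="Standard_D2s_v3", quantity=1,
)
vm_props = vm_reservation_properties("sub-123", "rg-demo", "crg-demo")
```

If the reservation PUT itself fails for capacity reasons, the workflow can move to the next size or region before any network resources have been created.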

    2.2. Implement robust fallback and retry logic

    For scenarios where capacity reservations are not used or not supported, design the backend to expect allocation failures and react quickly:

    1. Fast retry with same SKU/region
      • If SkuNotAvailable or allocation failure occurs, perform a limited number of quick retries (e.g., 1–2) with short backoff. Sometimes capacity frees up quickly.
    2. Fallback SKUs in the same region
      • Maintain a mapping of “equivalent” or “acceptable alternative” SKUs per region (e.g., B1s → B1ms → D2s_v3, etc., depending on your sizing logic).
      • On SkuNotAvailable, automatically attempt the next SKU in the list in the same region.
      • Surface to the user that an alternative size was used.
    3. Fallback regions or zones
      • For user scenarios where region flexibility is acceptable, maintain a prioritized list of regions (e.g., primary + backup region(s)).
      • On repeated failures in the primary region, automatically attempt deployment in a backup region.
      • For zonal deployments, consider:
        • Removing the zone constraint (regional VM) if acceptable.
        • Or trying another zone in the same region.
    4. Handle overconstrained requests
      • Allocation failures can also be caused by combinations of constraints (size + zone + PPG + Ultra disk + accelerated networking, etc.).
      • Implement logic to relax non‑essential constraints when OverconstrainedAllocationRequest/OverconstrainedZonalAllocationRequest‑type failures occur:
        • Try without PPG.
        • Try without UltraSSD/PremiumSSDv2.
        • Try without accelerated networking.
        • Try as regional instead of zonal.
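The fallback order described in the list above can be sketched in Python as below. This is a minimal sketch, not a definitive implementation: `Candidate`, `DeployError`, `build_candidates`, and `deploy_with_fallback` are hypothetical names, the actual VM create call is injected as `try_create`, and the backup region is an example value.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Sequence

# Codes that justify moving on to the next candidate (named in the answer above).
CAPACITY_CODES = {
    "SkuNotAvailable",
    "AllocationFailed",
    "OverconstrainedAllocationRequest",
    "OverconstrainedZonalAllocationRequest",
}


@dataclass(frozen=True)
class Candidate:
    size: str
    region: str
    zone: Optional[str] = None  # None means a regional (non-zonal) VM


class DeployError(Exception):
    """Hypothetical wrapper carrying the Azure error code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def build_candidates(
    sizes: Sequence[str],
    regions: Sequence[str],
    zones: Sequence[Optional[str]] = (None,),
) -> list[Candidate]:
    """Try every size (and zone) in the primary region before changing region."""
    return [Candidate(s, r, z) for r in regions for s in sizes for z in zones]


def deploy_with_fallback(
    candidates: Iterable[Candidate],
    try_create: Callable[[Candidate], None],
) -> tuple[Candidate, list[tuple[Candidate, str]]]:
    """Return the first candidate that deploys, plus the failures seen so far."""
    failures: list[tuple[Candidate, str]] = []
    for cand in candidates:
        try:
            try_create(cand)
            return cand, failures
        except DeployError as err:
            if err.code not in CAPACITY_CODES:
                raise  # not a capacity problem, so fallback will not help
            failures.append((cand, err.code))
    raise RuntimeError(f"all {len(failures)} candidates failed on capacity")


# Simulated create call: B1s has no capacity anywhere, other sizes succeed.
def fake_create(cand: Candidate) -> None:
    if cand.size == "Standard_B1s":
        raise DeployError("SkuNotAvailable")


options = build_candidates(
    sizes=["Standard_B1s", "Standard_B1ms", "Standard_D2s_v3"],
    regions=["centralindia", "southindia"],
)
chosen, failures = deploy_with_fallback(options, fake_create)
```

Returning the failure list alongside the chosen candidate makes it easy to tell the user that an alternative size or region was used.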

    2.3. Make VM creation the first heavy operation

    To avoid creating many supporting resources before discovering capacity issues:

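    • In ARM/Bicep or template‑driven flows, define all resources in a single deployment, with the VM as part of the template. If the VM fails allocation, the deployment fails as a unit. Resources that were already created are not deleted automatically, so target a dedicated resource group that can be deleted on failure.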
    • If using imperative API calls:
      • Create the VM first with minimal dependencies (e.g., a basic VNet/subnet that can be reused or is pre‑created).
      • Only after VM creation succeeds, create additional, more expensive or specific resources.
      • Or use a “staging” resource group for the VM; if VM creation fails, delete the whole group to clean up.

    This reduces the amount of orphaned resources and cleanup work when capacity issues occur.
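A "VM first, staging group" flow for the imperative case might be sketched like this in Python. The four operations are injected as callables so the sketch stays independent of any SDK; `provision_vm_first` and the callable names are hypothetical, and `create_vm` is assumed to include whatever minimal networking the VM needs.

```python
from typing import Callable

Step = Callable[[str], None]


def provision_vm_first(
    group: str,
    create_group: Step,
    create_vm: Step,
    create_extras: Step,
    delete_group: Step,
) -> None:
    """Create the VM before other resources; drop the staging group on failure."""
    create_group(group)
    try:
        create_vm(group)
    except Exception:
        # VM creation failed: delete the whole staging group so nothing is
        # orphaned, then let the caller see the original error.
        delete_group(group)
        raise
    # Only reached when the VM exists, so the remaining work is not wasted.
    create_extras(group)


# Simulated run in which VM creation fails on capacity.
log: list[str] = []


def failing_vm(group: str) -> None:
    log.append(f"vm:{group}")
    raise RuntimeError("SkuNotAvailable")


try:
    provision_vm_first(
        "staging-rg",
        create_group=lambda g: log.append(f"group:{g}"),
        create_vm=failing_vm,
        create_extras=lambda g: log.append(f"extras:{g}"),
        delete_group=lambda g: log.append(f"delete:{g}"),
    )
except RuntimeError:
    pass  # the caller would move on to a fallback size or region here
```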

    2.4. User‑experience‑oriented design

    To improve UX in the described journey:

    • Pre‑validation and messaging
      • Explain that capacity is dynamic and that, in rare cases, the system may automatically choose a nearby region or alternative size.
    • Non‑blocking deployment
      • Instead of making the user wait synchronously for 2–3 minutes, accept the request, return a “deployment in progress” state, and update status asynchronously.
      • If the first attempt fails, the backend can transparently retry with fallbacks before informing the user.
    • Configurable policies
      • Allow tenants to configure:
        • Whether automatic region fallback is allowed.
        • Which alternative SKUs are acceptable.
        • Whether to require exact size/region or allow substitutions.
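The configurable policies above could be modeled as a small per-tenant settings object that decides which size and region combinations the backend may try. The following Python sketch uses hypothetical names (`FallbackPolicy`, `allowed_options`) and example region values; it is one possible shape, not a prescribed design.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FallbackPolicy:
    """Hypothetical per-tenant settings mirroring the three options above."""

    allow_region_fallback: bool = False
    alternative_sizes: tuple[str, ...] = ()
    require_exact: bool = False  # True disables every substitution


def allowed_options(
    policy: FallbackPolicy,
    size: str,
    region: str,
    backup_regions: Sequence[str] = (),
) -> list[tuple[str, str]]:
    """Return (size, region) pairs in the order the backend may try them."""
    if policy.require_exact:
        return [(size, region)]
    sizes = [size] + [s for s in policy.alternative_sizes if s != size]
    regions = [region]
    if policy.allow_region_fallback:
        regions += [r for r in backup_regions if r != region]
    # Exhaust acceptable sizes in one region before moving to the next region.
    return [(s, r) for r in regions for s in sizes]


strict = FallbackPolicy(require_exact=True, alternative_sizes=("Standard_B1ms",))
flexible = FallbackPolicy(
    allow_region_fallback=True, alternative_sizes=("Standard_B1ms",)
)

strict_plan = allowed_options(strict, "Standard_B1s", "centralindia", ["southindia"])
flexible_plan = allowed_options(flexible, "Standard_B1s", "centralindia", ["southindia"])
```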
    3. Specific answers to the questions
    1. Is there any Azure API that provides real‑time VM capacity availability before deployment?
      No public API exposes real‑time capacity per SKU/region/zone. Availability is only definitively checked at allocation time, which is when SkuNotAvailable and allocation failures are raised.
    2. Best practice to handle this in an automated system?
      • Use On‑demand Capacity Reservations for critical, predictable workloads on supported SKUs.
      • Otherwise, design for failure: implement retry, fallback SKUs, fallback regions/zones, and constraint‑relaxation logic, and minimize pre‑VM resource creation.
    3. Should the system implement fallback sizes/regions/retries/lightweight validation/Compute Fleet?
      • Yes to:
        • Fallback VM sizes.
        • Retry logic with different regions or zones where acceptable.
        • Template‑based or “VM‑first” deployments to avoid heavy pre‑work.
      • There is no mention in the provided material of a “Compute Fleet” service for capacity handling; capacity reservations and standard allocation‑failure workarounds are the recommended mechanisms.
    4. Any new or preview APIs that address this limitation?
      The provided material does not describe any new or preview API that exposes real‑time capacity. The documented guidance is to:
      • Use On‑demand Capacity Reservations to guarantee capacity.
      • Handle allocation failures with retries, alternative sizes, and/or regions.
    4. Additional concrete techniques from the documentation
    • When SkuNotAvailable occurs, the official guidance is to “try another size or deploy to a different location or zones.”
    • For allocation failures in general, recommended workarounds include:
      • Retry the allocation.
      • Resize the VM to a different size.
      • Change the region or zone.
      • Adjust constraints (remove PPG, UltraSSD/PremiumSSDv2, accelerated networking, or zonal constraints) to reduce over‑constraint.
    • To ensure capacity is always available for critical workloads, use On‑demand Capacity Reservations.

    These patterns can be encoded directly into the automation logic to improve success rates and user experience.

