Yes, Azure Data Factory adds per-activity overhead, and the duration you are seeing is expected behavior for small, fast queries.
Key points based on the documented behavior:
- Activity startup and queue time
- Pipelines and activities can sit in a queued state due to concurrency limits or IR availability. This is explicitly called out as “queue time” and can add several seconds before the activity actually starts.
- When using a Managed Virtual Network / virtual network–joined Integration Runtime, each copy or compute activity has a warm‑up phase. The service does not reserve a dedicated compute node per instance, so each activity may incur extra startup time compared to Azure IR.
- Integration Runtime warm‑up overhead
- Documentation notes that when running on a Managed virtual network IR, activities “take more time on average than the run when based on Azure IR” because of this warm‑up and queue behavior.
- This overhead is largely independent of how fast the underlying MySQL query runs. Even if the query itself completes in milliseconds, the IR startup, connection setup, and orchestration overhead can easily add many seconds.
- Why a simple MySQL query appears “slow” in ADF
- The total activity duration is roughly: queue time + IR warm‑up + connection + query execution + teardown.
- For small queries, the fixed overhead dominates, so the activity duration (e.g., ~20 seconds) is mostly orchestration and IR behavior, not database execution time.
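To make the breakdown concrete, here is a minimal sketch of the duration formula above. The `ActivityTiming` class and every number in it are hypothetical and chosen only for illustration; they are not measurements from ADF, and real values depend on the IR type and the current load.

```python
from dataclasses import dataclass


@dataclass
class ActivityTiming:
    """Hypothetical breakdown of a single activity run, in seconds."""
    queue: float
    ir_warmup: float
    connection: float
    query: float
    teardown: float

    def total(self) -> float:
        # queue time + IR warm-up + connection + query execution + teardown
        return self.queue + self.ir_warmup + self.connection + self.query + self.teardown

    def overhead_fraction(self) -> float:
        # Share of the total that is not spent executing the query itself.
        return (self.total() - self.query) / self.total()


# Illustrative values only (assumed, not measured).
run = ActivityTiming(queue=6.0, ir_warmup=10.0, connection=2.5, query=0.5, teardown=1.0)
print(f"total: {run.total():.1f}s, overhead share: {run.overhead_fraction():.1%}")
```

With these assumed values, the query accounts for only half a second of a 20-second run, which is why tuning the SQL alone would not noticeably shorten the activity.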
- Optimization options
From the available guidance, the following levers exist:
- Use Azure IR instead of Managed VNet IR when possible. The documentation explicitly states that Managed VNet IR has longer queue/warm‑up time than Azure IR. If network and security requirements allow, switching to Azure IR typically reduces startup latency.
- Reduce contention / concurrency issues:
- Check pipeline concurrency policies and ensure there are no old runs stuck “In Progress” that block new runs.
- In the Monitoring view, cancel any long‑running or stuck runs so new activities do not wait behind them.
- Avoid unnecessary fan‑out that increases orchestration overhead:
- If many small activities are used (e.g., many tiny queries in separate activities or inside a ForEach), the per‑activity overhead accumulates. Where possible, batch logic into fewer activities or combine queries.
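As a rough sketch of why batching helps, the comparison below estimates total time for many small activities run one after another versus a single activity that does the same work. The function names and numbers are hypothetical, and the model assumes sequential execution and a fixed overhead that does not grow with batch size; a parallel ForEach would shorten the first figure but still pay the overhead on every iteration.

```python
def separate_activities_seconds(count: int, overhead_s: float, work_s: float) -> float:
    """Every activity pays the fixed overhead plus its own query time."""
    return count * (overhead_s + work_s)


def batched_activity_seconds(count: int, overhead_s: float, work_s: float) -> float:
    """One activity pays the fixed overhead once, then runs all queries."""
    return overhead_s + count * work_s


# Illustrative values only (assumed, not measured).
COUNT = 50          # number of tiny queries
OVERHEAD_S = 19.5   # assumed fixed overhead per activity
WORK_S = 0.5        # assumed execution time per query

separate = separate_activities_seconds(COUNT, OVERHEAD_S, WORK_S)
batched = batched_activity_seconds(COUNT, OVERHEAD_S, WORK_S)
print(f"separate: {separate:.1f}s, batched: {batched:.1f}s")
```

Under these assumptions, fifty separate activities take about 1000 seconds in sequence, while one batched activity takes about 44.5 seconds.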
- When behavior is abnormal
- If queue/startup time is much longer than usual (for example, runs stuck in queued state for more than an hour), documentation recommends treating it as a potential transient or platform issue and opening a support case after basic checks.
In summary, a ~20‑second activity duration for a sub‑second MySQL query is consistent with ADF’s orchestration and IR warm‑up behavior, especially on Managed VNet IR. The main optimizations are to prefer Azure IR where feasible, minimize pipeline concurrency bottlenecks, and reduce the number of tiny, separate activities.