@Kien Ngo Hey Kien, it sounds like you’re hitting the “cold start” delay of the Azure Data Factory Integration Runtime (IR). By default, when you start a pipeline or activity and no warm compute is available, Data Factory provisions a fresh IR compute environment, which can easily take 2–3 minutes (and longer if you’re using a managed VNet). That delay recurs on each activity or pipeline run that finds no warm compute.
Here are a few ways to shrink that startup time:
- Enable TTL (Time-to-Live)
  - For Data Flow IR: set a TTL of up to 4 hours so the Spark cluster stays alive between runs.
  - For managed-VNet Copy activities: use the TTL preview feature for pipeline and external activities to avoid repeated cold starts.
- Run Data Flow sinks in parallel
  - In the Data Flow activity settings, turn on Run in parallel so that multiple sinks write concurrently on the same warm cluster instead of one after another.
- Pre-warm or scale your compute for Custom Activities
  - If you’re using a .NET custom activity on Azure Batch, keep enough idle nodes in your Batch pool (or configure auto-scale) so tasks don’t wait for nodes to spin up.
- Monitor and tune your IR
  - Use Azure Monitor dashboards and alerts to track IR startup times, cluster utilization, and queue times.
  - Adjust DIUs or compute size on Copy activities if you see resource constraints.
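To make the TTL and parallel-sink settings concrete, here is a rough Python sketch that builds the two JSON definitions involved. Treat it as an illustration rather than the exact payload for your factory: the property names (`dataFlowProperties`, `timeToLive`, `runConcurrently`) are as I recall them from the Data Factory JSON docs, so please check them against the JSON view of your own IR and activity. The names `WarmDataFlowIR`, `TransformOrders`, and `OrdersDataFlow` are made up, and the 240-minute cap simply mirrors the 4-hour limit mentioned above.

```python
import json

# Assumption: the 4-hour Data Flow TTL limit from the answer, in minutes.
MAX_TTL_MINUTES = 240


def build_ir_definition(name, ttl_minutes, core_count=8, compute_type="General"):
    """Build an Azure IR definition whose Data Flow cluster stays warm for ttl_minutes."""
    if not 0 <= ttl_minutes <= MAX_TTL_MINUTES:
        raise ValueError(f"TTL must be between 0 and {MAX_TTL_MINUTES} minutes")
    return {
        "name": name,
        "properties": {
            "type": "Managed",
            "typeProperties": {
                "computeProperties": {
                    "location": "AutoResolve",
                    "dataFlowProperties": {
                        "computeType": compute_type,
                        "coreCount": core_count,
                        "timeToLive": ttl_minutes,  # minutes the cluster is kept alive
                    },
                }
            },
        },
    }


def build_data_flow_activity(activity_name, data_flow_name, ir_name):
    """Build an Execute Data Flow activity that targets the warm IR and runs sinks in parallel."""
    return {
        "name": activity_name,
        "type": "ExecuteDataFlow",
        "typeProperties": {
            "dataflow": {"referenceName": data_flow_name, "type": "DataFlowReference"},
            "integrationRuntime": {
                "referenceName": ir_name,
                "type": "IntegrationRuntimeReference",
            },
            "runConcurrently": True,  # the "Run in parallel" checkbox
        },
    }


# Hypothetical names, for illustration only.
ir = build_ir_definition("WarmDataFlowIR", ttl_minutes=60)
activity = build_data_flow_activity("TransformOrders", "OrdersDataFlow", ir["name"])
print(json.dumps(ir, indent=2))
print(json.dumps(activity, indent=2))
```

The key point is that every Data Flow activity has to reference the same IR for the TTL to help. An activity that runs on a different IR will still start its own cluster.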
Quick check. Can you tell me:
- Which type of IR you’re using (Azure IR vs. self-hosted vs. managed VNet)?
- What activity types you’re running (Copy, Data Flow, Custom, etc.)?
- Whether you’ve already configured TTL or parallelism?
- The typical startup/queue time you’re seeing today?
With those details we can tailor the recommendations further.
Reference Docs:
- Azure Data Factory Integration Runtime Concepts https://learn.microsoft.com/azure/data-factory/concepts-integration-runtime
- Monitor Data Factory with Azure Monitor https://learn.microsoft.com/azure/data-factory/monitor-data-factory
- Optimize Performance with Data Flows https://learn.microsoft.com/azure/data-factory/concepts-data-flow-performance
Note: This content was drafted with the help of an AI system. Please verify the information before relying on it for decision-making.