Important
This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.
Azure Databricks supports storing MLflow traces in Unity Catalog tables using an OpenTelemetry-compatible format (OTEL). By default, MLflow stores traces organized by experiments in the MLflow control plane service. However, storing traces in Unity Catalog using OTEL format provides the following benefits:
- Access control is managed through Unity Catalog schema and table permissions rather than experiment-level ACLs. Users with access to the Unity Catalog tables can view all traces stored in those tables, regardless of which experiment the traces belong to.
- Trace IDs use URI format instead of the tr-<UUID> format, improving compatibility with external systems.
- Store unlimited traces in Delta tables, enabling long-term retention and analysis of trace data. See Performance considerations.
- Query trace data directly using SQL through a Databricks SQL warehouse, enabling advanced analytics and custom reporting.
- OTEL format ensures compatibility with other OpenTelemetry clients and tools.
Prerequisites
A Unity Catalog-enabled workspace.
Ensure the "OpenTelemetry on Databricks" preview is enabled. See Manage Azure Databricks previews.
Permissions to create catalogs and schemas in Unity Catalog.
A Databricks SQL warehouse with CAN USE permissions. Save the warehouse ID for later reference.
While this feature is in Beta, your workspace must be in one of the following regions: westus, westus2, westus3, eastus, eastus2, centralus, northcentralus, southcentralus, canadacentral, brazilsouth, westeurope, northeurope, germanywestcentral, swedencentral, switzerlandnorth, uksouth, australiaeast, centralindia, southeastasia
MLflow Python library version 3.11.0 or later installed in your environment:
pip install "mlflow[databricks]>=3.11.0" --upgrade --force-reinstall
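Before proceeding, you can confirm the installed version meets the requirement. The comparison below is an illustrative, stdlib-only sketch (it handles plain dotted versions like "3.11.0" but not pre-release suffixes):

```python
def meets_min_version(installed: str, required: str = "3.11.0") -> bool:
    """Compare dotted version strings numerically (illustrative check only)."""
    # "3.11.0" compares as the tuple (3, 11, 0)
    as_tuple = lambda v: tuple(int(p) for p in v.split(".")[:3])
    return as_tuple(installed) >= as_tuple(required)

# In practice, pass mlflow.__version__ as the installed version.
```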
Setup: Create an experiment with a Unity Catalog trace location
Run the following code to create and bind an experiment to a Unity Catalog trace location:
# Example values for the placeholders below:
# MLFLOW_TRACING_SQL_WAREHOUSE_ID: "abc123def456" (found in SQL warehouse URL)
# experiment_name: "/Users/user@company.com/traces"
# catalog_name: "main" or "my_catalog"
# schema_name: "mlflow_traces" or "production_traces"
# table_prefix: "my_otel"
import os
import mlflow
from mlflow.entities.trace_location import UnityCatalog
mlflow.set_tracking_uri("databricks")
# Specify the ID of a SQL warehouse you have access to.
os.environ["MLFLOW_TRACING_SQL_WAREHOUSE_ID"] = "<SQL_WAREHOUSE_ID>"
# Specify the name of the MLflow Experiment to use for viewing traces in the UI.
experiment_name = "<MLFLOW_EXPERIMENT_NAME>"
# Specify the name of the Catalog to use for storing traces.
catalog_name = "<UC_CATALOG_NAME>"
# Specify the name of the Schema to use for storing traces.
schema_name = "<UC_SCHEMA_NAME>"
# Specify the name of the prefix appended to every table storing trace data.
table_prefix = "<UC_TABLE_PREFIX>"
# mlflow.set_experiment is an upsert operation
experiment = mlflow.set_experiment(
    experiment_name=experiment_name,
    trace_location=UnityCatalog(
        catalog_name=catalog_name,
        schema_name=schema_name,
        table_prefix=table_prefix,  # defaults to experiment id if not provided
    ),
)
print(f"Experiment ID: {experiment.experiment_id}")
print(experiment.trace_location.full_otel_spans_table_name)
You can also use mlflow.create_experiment with the same trace_location parameter. Unlike set_experiment, create_experiment does not set the active experiment, so you must call set_experiment afterward to ensure traces are routed to the correct location:
experiment_id = mlflow.create_experiment(
    name=experiment_name,
    trace_location=UnityCatalog(
        catalog_name=catalog_name,
        schema_name=schema_name,
        table_prefix=table_prefix,
    ),
)
# trace_location is optional here since
# the experiment is already bound to the UC trace location above.
mlflow.set_experiment(experiment_id=experiment_id)
Once you bind an experiment to a UC trace location, you cannot reassign the experiment to a different UC trace location. However, multiple experiments can share the same UC trace location.
Verify tables
After running the setup code, four new Unity Catalog tables will be visible in the schema in the Catalog Explorer UI:
- <table_prefix>_otel_annotations
- <table_prefix>_otel_logs
- <table_prefix>_otel_metrics
- <table_prefix>_otel_spans
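To reference these tables in scripts, the fully qualified names can be derived from the catalog, schema, and prefix. The helper below is illustrative (not part of the MLflow API) and simply follows the naming convention listed above:

```python
def otel_table_names(catalog: str, schema: str, table_prefix: str) -> dict:
    """Build the four fully qualified OTEL table names for a trace location."""
    suffixes = ("annotations", "logs", "metrics", "spans")
    return {s: f"{catalog}.{schema}.{table_prefix}_otel_{s}" for s in suffixes}

tables = otel_table_names("main", "mlflow_traces", "my_otel")
# e.g. tables["spans"] == "main.mlflow_traces.my_otel_otel_spans"
```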
Grant permissions
The following permissions are required for a Databricks user or service principal to write or read MLflow Traces from the Unity Catalog tables:
- USE_CATALOG permissions on the catalog.
- USE_SCHEMA permissions on the schema.
- MODIFY and SELECT permissions on each of the <table_prefix>_<type> tables.
Note
ALL_PRIVILEGES is not sufficient for accessing Unity Catalog trace tables. You must explicitly grant MODIFY and SELECT permissions.
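As a sketch, the grants above can be issued from a notebook by generating one SQL statement per securable. The helper below is illustrative (not an MLflow or Databricks API); run each returned statement in a SQL editor or via spark.sql():

```python
def grant_statements(catalog: str, schema: str, table_prefix: str, principal: str) -> list:
    """Build the GRANT statements described above for one user or service principal."""
    stmts = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`",
    ]
    # Explicit MODIFY and SELECT on each trace table (ALL_PRIVILEGES is not sufficient)
    for suffix in ("annotations", "logs", "metrics", "spans"):
        table = f"{catalog}.{schema}.{table_prefix}_otel_{suffix}"
        stmts.append(f"GRANT SELECT, MODIFY ON TABLE {table} TO `{principal}`")
    return stmts
```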
Log traces to the Unity Catalog tables
After creating the tables, you can write traces to them from various sources by specifying the trace location. How you do this depends on the source of the traces.
MLflow SDK
The Unity Catalog trace location can be specified using the mlflow.set_experiment Python API, or by setting the MLFLOW_TRACING_DESTINATION environment variable.
import mlflow
from mlflow.entities.trace_location import UnityCatalog
mlflow.set_tracking_uri("databricks")
# Specify the catalog, schema, and table prefix to use for storing Traces
catalog_name = "<UC_CATALOG_NAME>"
schema_name = "<UC_SCHEMA_NAME>"
table_prefix = "<UC_TABLE_PREFIX>"
# Option 1: Use the `set_experiment` API
# For existing experiments, it is not necessary to specify `trace_location`. MLflow
# retrieves the UC trace location bound to the experiment and routes traces to
# that location.
mlflow.set_experiment(
    experiment_name="...",
    trace_location=UnityCatalog(
        catalog_name=catalog_name,
        schema_name=schema_name,
        table_prefix=table_prefix,
    ),  # optional for existing experiments
)
# Option 2: Use the `MLFLOW_TRACING_DESTINATION` environment variable
import os
os.environ["MLFLOW_TRACING_DESTINATION"] = f"{catalog_name}.{schema_name}.{table_prefix}"
# Create and ingest an example trace using the `@mlflow.trace` decorator
@mlflow.trace
def test(x):
    return x + 1

test(100)
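The MLFLOW_TRACING_DESTINATION value is a plain three-part name. A small illustrative check (not an MLflow API) can catch malformed values before they silently misroute traces:

```python
import os

def set_tracing_destination(catalog: str, schema: str, table_prefix: str) -> str:
    """Compose and export the three-part destination string used above.

    Illustrative helper, not part of MLflow; it only validates that each
    part is non-empty and contains no dots.
    """
    parts = (catalog, schema, table_prefix)
    if any((not p) or ("." in p) for p in parts):
        raise ValueError(f"each part must be non-empty and dot-free: {parts!r}")
    destination = ".".join(parts)
    os.environ["MLFLOW_TRACING_DESTINATION"] = destination
    return destination
```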
Model Serving endpoint
To write traces from a Databricks model serving endpoint to Unity Catalog tables, you must configure a Personal Access Token (PAT).
1. Grant a user or service principal MODIFY and SELECT access to the logs and spans tables.
2. Generate a PAT for the user or service principal.
3. Add the PAT to the Databricks model serving endpoint's environment variables configuration, specifying DATABRICKS_TOKEN as the environment variable name.
4. Add the trace location to write to as a "{catalog}.{schema}.{table_prefix}" string to the Databricks model serving endpoint's environment variables configuration, with MLFLOW_TRACING_DESTINATION as the environment variable name.
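Putting the steps together, the endpoint carries this configuration as environment variables rather than code. The mapping below is an illustrative sketch; the secret scope and key names are placeholders, and the PAT should be referenced through Databricks secrets rather than embedded in plain text:

```python
# Illustrative sketch of a model serving endpoint's environment variables.
# "<scope>" and "<pat-secret-key>" are placeholders for your own secret
# scope and key; the {{secrets/...}} syntax resolves the PAT at serving time.
environment_variables = {
    "DATABRICKS_TOKEN": "{{secrets/<scope>/<pat-secret-key>}}",
    "MLFLOW_TRACING_DESTINATION": "<UC_CATALOG_NAME>.<UC_SCHEMA_NAME>.<UC_TABLE_PREFIX>",
}
```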
Third-party OTEL client
One benefit of storing traces in the OTEL format is that you can write to the Unity Catalog tables using third-party clients that support OTEL. Traces written this way appear in an MLflow experiment linked to the table as long as they have a root span. The following example configures an OpenTelemetry OTLP span exporter.
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Span exporter configuration
otlp_trace_exporter = OTLPSpanExporter(
    # Databricks-hosted OTLP traces collector endpoint
    endpoint="https://myworkspace.databricks.com/api/2.0/otel/v1/traces",
    headers={
        "content-type": "application/x-protobuf",
        "X-Databricks-UC-Table-Name": "<catalog>.<schema>.<table_prefix>_otel_spans",
        "Authorization": "Bearer MY_API_TOKEN",
    },
)

# Register the exporter so spans are batched and sent to the collector
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(otlp_trace_exporter))
trace.set_tracer_provider(provider)
See Export Langfuse traces to Azure Databricks MLflow.
View traces in the UI
View traces stored in OTEL format the same way you view other traces:
1. In your workspace, go to Experiments.
2. Find the experiment where your traces are logged. For example, the experiment set by mlflow.set_experiment("/Shared/my-genai-app-traces").
3. Click the Traces tab to see a list of all traces logged to that experiment.
4. If you stored your traces in a Unity Catalog table, Azure Databricks retrieves traces using a SQL warehouse. Select a SQL warehouse from the drop-down menu.
For more information on using the UI to search for traces, see View traces in the Databricks MLflow UI.
Enable production monitoring
To use production monitoring with traces stored in Unity Catalog, you must configure a SQL warehouse ID for the experiment. The monitoring job requires this configuration to execute scorer queries against Unity Catalog tables.
Set the SQL warehouse ID using set_databricks_monitoring_sql_warehouse_id():
from mlflow.tracing import set_databricks_monitoring_sql_warehouse_id
# Set the SQL warehouse ID for monitoring
set_databricks_monitoring_sql_warehouse_id(
    sql_warehouse_id="<SQL_WAREHOUSE_ID>",
    experiment_id="<EXPERIMENT_ID>",  # optional; uses the active experiment if not specified
)
Alternatively, you can set the MLFLOW_TRACING_SQL_WAREHOUSE_ID environment variable before starting monitoring.
If you skip this step, monitoring jobs will fail with an error indicating the mlflow.monitoring.sqlWarehouseId experiment tag is missing.
To configure monitoring for Unity Catalog traces, you need the following workspace-level permissions:
- CAN USE permission on the SQL warehouse
- CAN EDIT permission on the MLflow experiment
- Permission on the monitoring job (automatically granted when you register the first scorer)
The monitoring job runs under the identity of the user who first registered a scorer on the experiment. This user's permissions determine what the monitoring job can access.
Limitations
Trace ingestion is limited to 200 traces per second per workspace and 100 MB per second per table.
An experiment can only be bound to a Unity Catalog trace location at experiment creation time.
Traces stored in Unity Catalog are not supported with Knowledge Assistant or Supervisor Agent.
UI performance may degrade when over 2 TB of trace data is stored in the table. See Performance considerations.
Deleting individual traces is not supported for traces stored in Unity Catalog. To remove traces, you must delete rows directly from the underlying Unity Catalog tables using SQL. This differs from experiment traces, which can be deleted using the MLflow UI or API.
MLflow MCP server does not support interacting with traces stored in Unity Catalog.
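Since trace removal happens via SQL against the underlying tables, bulk cleanup can be expressed as a deletion predicate per table. The helper below is purely illustrative: the timestamp column name (start_time here) is an assumption and should be checked against your spans table schema before running the generated statement.

```python
def retention_delete_sql(table_fqn: str, days: int = 30,
                         timestamp_col: str = "start_time") -> str:
    """Build a DELETE statement removing trace rows older than `days`.

    Illustrative only: `timestamp_col` defaults to an assumed column name;
    verify it against your table's actual schema before executing.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        f"DELETE FROM {table_fqn} "
        f"WHERE {timestamp_col} < current_timestamp() - INTERVAL {days} DAYS"
    )
```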
Next steps
- Query MLflow traces using MLflow Databricks SQL
- Search for traces by OTel span attributes: Search for third-party OTel traces stored in Unity Catalog by span attributes.