Based on the provided information, the behavior aligns with known gRPC/Spark Connect issues on serverless compute with Databricks Connect 17.x.
Actionable steps:
- Validate Databricks Connect setup and compatibility
  - Run the built‑in validation to ensure the client, environment, and serverless runtime are compatible:

    ```shell
    databricks-connect test
    ```

  - For Databricks Connect 14.3 and above, also validate via the session builder in the client code:

    ```python
    from databricks.connect import DatabricksSession

    DatabricksSession.builder.validateSession(True).getOrCreate()
    ```

  - If there is any incompatibility between the Databricks Connect client version and the serverless compute version, this command fails with a non‑zero exit code and an error message. Resolve any reported version mismatch first.
- Check for serverless + Databricks Connect version mismatch
  - The Databricks Connect 17.1, 17.2, and 17.3 releases explicitly address client–server compatibility and behavior when serverless is upgraded:
    - For the 17.1/17.2/17.3 Scala clients, when connecting to serverless compute with a client version lower than the serverless version, the client now prints a warning instead of throwing an error, so Databricks Connect continues to work when serverless is upgraded.
    - If serverless compute is running a newer Databricks Runtime than the Databricks Connect client was built for, upgrade the Databricks Connect client to the matching 17.x release for that runtime (for example, Databricks Connect for Databricks Runtime 17.3 LTS when using 17.3 LTS serverless).
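The mismatch check above can be sketched locally. The snippet below is an illustrative helper, not part of Databricks Connect: it reads the installed `databricks-connect` version via `importlib.metadata` and compares its major.minor to a runtime version you supply; `SERVERLESS_RUNTIME` is a placeholder for whatever version your serverless environment actually reports.

```python
# Sketch: compare the installed Databricks Connect client version against the
# serverless Databricks Runtime version. SERVERLESS_RUNTIME is an assumed
# placeholder value, not something read from the workspace.
from importlib import metadata


def major_minor(version: str) -> tuple[int, int]:
    """Parse the (major, minor) pair from a version string like '17.3.2'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


SERVERLESS_RUNTIME = "17.3"  # assumption: the runtime your serverless compute runs

try:
    client_version = metadata.version("databricks-connect")
except metadata.PackageNotFoundError:
    client_version = None

if client_version is None:
    print("databricks-connect is not installed in this environment")
elif major_minor(client_version) < major_minor(SERVERLESS_RUNTIME):
    print(f"Client {client_version} is older than runtime {SERVERLESS_RUNTIME}; upgrade the client")
else:
    print(f"Client {client_version} matches or exceeds runtime {SERVERLESS_RUNTIME}")
```

A comparison on the (major, minor) pair is enough here, since the patch version only matters for picking up fixes within an already-matching runtime line.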
- Consider known serverless + UDF / result‑sync issues
  - The Databricks Connect 17.1.x, 17.2.x, and 17.3.x release notes mention fixes for client‑server syncing issues that could cause failures when executing UDFs on serverless compute.
  - If the affected queries use UDFs (Python or Scala) or complex expressions that rely on client‑server synchronization, upgrade to the latest patch version of Databricks Connect for the specific runtime (for example, 17.3.2 for 17.3 LTS) to pick up these fixes.
- Verify that Databricks Connect/Spark Connect is enabled on the cluster
  - Databricks Connect and Spark Connect can be disabled per cluster using the Spark configuration:

    ```
    spark.databricks.service.server.enabled false
    ```

  - Ensure this setting is not applied (or is set to `true`, or is absent) on the serverless compute environment. If it is disabled, gRPC behavior will be impacted.
- Test with a minimal query and compare behavior
  - Run a very simple query via Databricks Connect (no UDFs, small result set) and confirm whether results are streamed back:

    ```python
    from databricks.connect import DatabricksSession

    spark = DatabricksSession.builder.getOrCreate()
    df = spark.range(10)
    df.show()
    ```

  - If even simple queries exhibit `result_fetch_duration_ms = 0` and never start result delivery, focus on:
    - Version compatibility (step 2).
    - Network/proxy behavior on HTTP/2/gRPC between client and serverless.
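To compare the latency the client actually observes against the `result_fetch_duration_ms` value in the query history, a small timing wrapper can help. This is a generic sketch, not a Databricks API; with Databricks Connect you would pass something like `lambda: df.collect()` instead of the local stand-in used below.

```python
# Generic client-side timing helper (an assumption, not part of Databricks
# Connect): run a fetch callable and report its wall-clock duration, for
# comparison with result_fetch_duration_ms in the query history.
import time


def timed_fetch(fetch):
    """Run fetch() and return (result, elapsed_ms)."""
    start = time.perf_counter()
    result = fetch()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


# Local stand-in; with Databricks Connect, pass e.g. lambda: df.collect().
rows, ms = timed_fetch(lambda: list(range(10)))
print(f"fetched {len(rows)} rows in {ms:.1f} ms")
```

A large client-side elapsed time paired with `result_fetch_duration_ms = 0` in the query history is consistent with the query completing server-side while result streaming never starts.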
- If using a proxy, ensure HTTP/2/gRPC support and headers
  - Databricks Connect communicates with Databricks clusters via gRPC over HTTP/2. If a proxy is in the path, ensure it fully supports HTTP/2 and streaming.
  - If the proxy requires custom headers, configure them using the DatabricksSession builder:

    ```python
    from databricks.connect import DatabricksSession

    spark = (DatabricksSession
        .builder
        .header('x-custom-header', 'value')
        .getOrCreate())
    ```

  - Misconfigured proxies can allow the query submission to succeed but interfere with streaming results back, which matches the described pattern.
- Isolate Spark Connect vs Databricks Connect
  - To confirm whether the issue is specific to Databricks Connect or to Spark Connect/gRPC itself, optionally test against an open‑source Spark Connect server:
    - Start a local Spark Connect server (see “How to use Spark Connect” in the Apache Spark docs).
    - Point Databricks Connect to it:

      ```shell
      export SPARK_REMOTE="sc://localhost"
      ```

    - Initialize the session:

      ```python
      from databricks.connect import DatabricksSession

      spark = DatabricksSession.builder.getOrCreate()
      ```

    - If local Spark Connect works but serverless does not, the problem is likely in the serverless environment or its networking/proxy layer.
- When to contact support
  - If:
    - `databricks-connect test` passes,
    - the Databricks Connect client and serverless runtime versions are aligned and on the latest patch,
    - Spark Connect is enabled on the cluster,
    - the network/proxy supports HTTP/2 and streaming,
    - and simple queries still show `result_fetch_duration_ms = 0` and never stream results,
  - then collect query history details and gRPC error logs and open a support case, as this may be an internal server‑side gRPC streaming issue.
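When opening the case, it can help to attach basic client-side environment details alongside the query history and gRPC logs. The helper below is a hypothetical sketch (the field names are illustrative, not a Databricks format); it also surfaces proxy-related environment variables, since proxies are a common cause of broken gRPC streaming. Redact anything sensitive before sharing.

```python
# Hypothetical support-case helper: gather client-side environment details.
# The dictionary layout is illustrative, not a format Databricks requires.
import os
import platform
import sys
from importlib import metadata


def collect_diagnostics() -> dict:
    try:
        client_version = metadata.version("databricks-connect")
    except metadata.PackageNotFoundError:
        client_version = "not installed"
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "databricks_connect": client_version,
        # Proxy env vars matter because proxies can break gRPC/HTTP2 streaming.
        "proxies": {k: v for k, v in os.environ.items()
                    if k.upper() in {"HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"}},
    }


print(collect_diagnostics())
```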