Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Prerequisites
- An Azure account with an active subscription; create an account for free.
- A Microsoft Planetary Computer Pro GeoCatalog resource
- A Blob Storage account create a Blob Storage account.
- A Blob Storage container with data cube assets (NetCDF, HDF5, GRIB2), STAC Items, and static STAC Catalog. Learn how to create STAC Items.
Set up ingestion source
Before you can begin to ingest data cube data, you'll need to set up an Ingestion Source, which will serve as your credentials to access the Blob Storage account where your assets and STAC Items are stored. You can set up an Ingestion Source using Managed Identity or SAS Token.
Create a data cube collection
Once your Ingestion Source is set up, you can create a Collection for your data cube assets. Steps to create a collection can be followed in Create a STAC Collection with Microsoft Planetary Computer Pro using Python.
Ingest data cube assets
The initiation of the ingestion process for data cube data, and other data types, can be followed in Ingestion Overview. As described in Data Cube Overview, however, ingestion is the step in Planetary Computer Pro's data handling that differs for these file types. While GRIB2 data and associated STAC Items are ingested just like any other two-dimensional raster file, NetCDF and HDF5 assets undergo further data enrichment. The generation of Kerchunk Manifests is documented in Data Cube Overview, but what is important to note is that Kerchunk assets will be added to your Blob Storage container alongside the original assets, and an additional cube:variables field are added to the STAC Item JSON. This is important when rendering these data types in the Planetary Computer Pro Explorer.
Configure a data cube collection
Configuration of your data cube collection is another step that will look slightly different from that of other data types. You can follow the steps described in Configure a collection with the Microsoft Planetary Computer Pro web interface to configure your data cube collection, but you'll need to be aware of the following differences when building your Render Configuration:
Render configuration for NetCDF and HDF5 assets
Remembering that a standard Render Configuration argument in JSON format looks like this:
[
{
"id": "prK1950-06-30",
"name": "prK1950-06-30",
"type": "raster-tile",
"options": "assets=pr-kerchunk&subdataset_name=pr&rescale=0,0.01&colormap_name=viridis&datetime=1950-06-30",
"minZoom": 1
}
]
The options field is where you'll want to utilize the cloud optimized, Kerchunk asset, as opposed to the original asset listed in the STAC Item. You'll also need to include the subdataset_name argument, which is the name of the variable you want to render.
Render configuration for GRIB2 assets
The options field for the Render Configuration of GRIB2 assets look similar to the previous example, but you won't need to include the subdataset_name argument. This is because GRIB2 data is already optimally structured and referenced via their Index files. The assets argument, in this case, represents the band, or 2D raster layer, you want to render. Below is an example of a GRIB2 Render Configuration:
[
{
"id": "render-config-1",
"name": "Mean Zero-Crossing Wave Period",
"description": "A sample render configuration. Update `options` below.",
"type": "raster-tile",
"options": "assets=data&subdataset_bands=1&colormap_name=winter&rescale=0,10",
"minZoom": 1
}
]
Render configuration for Zarr assets
The options field for Zarr assets is similar to NetCDF and HDF5, but there are two important requirements:
- Select a single time slice for rendering by using
sel=time=.... - Reduce data to a 2D output before rendering.
If additional dimensions are not collapsed, rendering can fail with errors such as Source data must be 1 band.
You can read more about the sel parameter in Xarray DataArray.sel. The following is a minimal working Zarr render configuration for a single timestep and 2D output:
[
{
"id": "era5-zarr-single-time",
"name": "ERA5 single timestep",
"type": "raster-tile",
"options": "assets=data&subdataset_name=precipitation_amount_1hour_Accumulation&sel=time=2024-01-01&sel_method=nearest&colormap_name=viridis&rescale=0,0.01",
"minZoom": 12
}
]
Visualize data cube assets in the Explorer
Once your data cube assets are ingested and configured, you can visualize them in the Planetary Computer Pro Explorer. A step-by-step guide for using the Explorer can be followed in Quickstart: Use the Explorer in Microsoft Planetary Computer Pro.
While Microsoft Planetary Computer Pro includes a tiler that can be used to visualize some data cube assets, there are some caveats to note when it comes to each supported data type.
NetCDF and HDF5 visualization
Not all NetCDF datasets that can be ingested into Microsoft Planetary Computer are compatible by the Planetary Computer Pro's visualization tiler. A dataset must have X and Y axes, latitude and longitude coordinates, and spatial dimensions and bounds to be visualized. For example, a dataset in which latitude and longitude are variables, but not coordinates, isn't compatible with Planetary Computer Pro's tiler.
Before attempting to visualize your NetCDF or HDF5 dataset, you can use the following to check whether it meets the requirements.
Install the required dependencies
pip install xarray[io] rioxarray cf_xarrayRun the following function:
import xarray as xr import cf_xarray import rioxarray def is_dataset_visualizable(ds: xr.Dataset): """ Test if the dataset is compatible with the Planetary Computer tiler API. Raises an informative error if the dataset is not compatible. """ if not ds.cf.axes: raise ValueError("Dataset does not have CF axes") if not ds.cf.coordinates: raise ValueError("Dataset does not have CF coordinates") if not {"X", "Y"} <= ds.cf.axes.keys(): raise ValueError(f"Dataset must have CF X and Y axes, found: {ds.cf.axes.keys()}") if not {"latitude", "longitude"} <= ds.cf.coordinates.keys(): raise ValueError("Dataset must have CF latitude and longitude coordinates, " f"actual: {ds.cf.coordinates.keys()}") if ds.rio.x_dim is None or ds.rio.y_dim is None: raise ValueError("Dataset does not have rioxarray spatial dimensions") if ds.rio.bounds() is None: raise ValueError("Dataset does not have rioxarray bounds") left, bottom, right, top = ds.rio.bounds() if left < -180 or right > 180 or bottom < -90 or top > 90: raise ValueError("Dataset bounds are not valid; they must be within [-180, 180] and [-90, 90]") if ds.rio.resolution() is None: raise ValueError("Dataset does not have rioxarray resolution") if ds.rio.transform() is None: raise ValueError("Dataset does not have rioxarray transform") print("✅ Dataset is compatible with the Planetary Computer tiler API.")
GRIB2 visualization
GRIB2 assets that have been ingested into Microsoft Planetary Computer Pro can be visualized in the Explorer as long as they have an associated Index file (.idx) stored in the same Blob Storage container. The Index file is generated during ingestion and is required for optimal access and rendering of GRIB2 data.
Zarr visualization
Zarr assets ingested into Microsoft Planetary Computer Pro can be visualized in the Explorer as long as the Render Configuration specifies which variable and time slice to render using the sel parameter in the options field. Failure to do so will result in the Explorer attempting to render all variables and time slices of the Zarr store at once, which will cause the Explorer to crash.
The size of the Zarr store and spatial chunks will also impact performance. You should aim to keep the total size of a Zarr store under 2 GB, and each chunk less than 100 MB for optimal performance of the tiler.
Zarr visualization limitations and known issues
Time slider behavior (known limitation)
Note
Time slider behavior for Zarr is currently limited. The Explorer time slider only appears when a temporal dimension is correctly detected during ingestion.
Even when Zarr assets contain time values, ingestion may fail to detect temporal metadata for some datasets. In those cases, the time slider will not render, and you must visualize one timestep at a time in render configuration (for example, sel=time=2024-01-01).
Required and recommended temporal metadata
To enable time-aware behavior, your STAC metadata should include a temporal dimension in cube:dimensions with:
type: temporalextentstep
For Zarr source data, follow CF time conventions where possible, for example:
standard_name="time"axis="T"
These conventions are necessary for consistent metadata interpretation, but due to current limitations they are not always sufficient to guarantee time slider support for every Zarr dataset.
Kerchunk notes
Kerchunk can improve performance for multidimensional access patterns, but it does not resolve time slider issues when temporal dimensions were not detected during ingestion.
Some Zarr datasets may also fail during index processing with errors such as Index must be monotonic increasing or decreasing.
Roadmap and future support
Current and planned support is:
- Zarr v2: supported today
- Zarr v3: not supported yet, planned for future support
- Multi-time Zarr visualization and temporal handling: partial today, with continued improvements planned
Time slider for data cube visualization
If your data cube assets have a temporal component, you can use the time slider in the Explorer to visualize changes over time. The time slider will appear automatically if your STAC Items contains assets with a time dimension with an extent and step field.
Note
For Zarr assets, see Zarr visualization limitations and known issues for current time slider constraints and required render configuration patterns.