Add Azure Cosmos DB CDC as source in Real-Time hub

This article describes how to add Azure Cosmos DB for NoSQL Change Data Capture (CDC) as an event source in Fabric Real-Time hub.

The Azure Cosmos DB Change Data Capture (CDC) source connector for Microsoft Fabric eventstreams lets you capture a snapshot of the current data in an Azure Cosmos DB database. The connector then monitors and records any future row-level changes to this data. After the changes are captured in the eventstream, you can process the CDC data in real time and send it to different destinations within Fabric for further processing or analysis.

Prerequisites

  • Access to a workspace with the Fabric capacity or Fabric Trial workspace type with Contributor or higher permissions.
  • Access to an Azure Cosmos DB for NoSQL account and database.
  • Your Azure Cosmos DB for NoSQL database must be publicly accessible, not behind a firewall or secured in a virtual network. If it resides in a protected network, connect to it by using Eventstream connector virtual network injection.
  • If you don't have an eventstream, create an eventstream.

Get connection details from the Azure portal

The following steps show the Azure portal labels for the values you need to collect. You always need the endpoint URI, in a format like https://<account>.<api>.azure.com:<port>/, the Primary key, the Database name, and the item identifiers (IDs) you want to collect data for.

Note

Azure Cosmos DB for NoSQL CDC uses the Latest Version Mode of the Azure Cosmos DB change feed. It captures changes to records as their latest version. Deletions aren't captured in this mode.
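
To make the consequence of this mode concrete, the following Python sketch is a toy model, not the connector's implementation. It assumes a hypothetical list of operations and a hypothetical downstream replica built only from captured events. Because deletions produce no event, the replica keeps an item that was deleted at the source.

```python
# Toy model of a change feed that records creates and updates but not deletes.
# The operation tuples and item IDs are hypothetical, for illustration only.

def capture_changes(operations):
    """Yield one event per create or update; deletes emit nothing."""
    for op, item_id, body in operations:
        if op in ("create", "update"):
            yield {"id": item_id, **body}
        # op == "delete": no event is produced in this model


def apply_to_replica(events):
    """Build a downstream copy keyed by item ID from the captured events."""
    replica = {}
    for event in events:
        replica[event["id"]] = event  # a later event overwrites an earlier one
    return replica


operations = [
    ("create", "order-1", {"status": "new"}),
    ("update", "order-1", {"status": "paid"}),
    ("create", "order-2", {"status": "new"}),
    ("delete", "order-2", None),
]

replica = apply_to_replica(capture_changes(operations))
print(replica)
```

In this model, order-2 remains in the replica even though it was deleted at the source, so a downstream consumer can't rely on the stream alone to learn about removals.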

  1. On the Azure portal page for your Azure Cosmos DB account, select Keys under Settings in the left navigation.

  2. On the Keys page, copy the URI and Primary key values to use for setting up the eventstream connection.

    A screenshot of the URI and Primary key on the Azure Cosmos DB Keys page in the Azure portal.

  3. On the Azure portal Overview page for your Azure Cosmos DB account, note the Database and item ID you want to collect data for.

    A screenshot of the Containers listing for an Azure Cosmos DB NoSQL API account.
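
Before pasting the copied URI into the connection settings, it can help to confirm that it matches the expected https://<account>.<api>.azure.com:<port>/ shape. The following Python sketch uses only the standard library. The function name check_endpoint and the account name contoso are hypothetical, and the checks reflect only the format shown in this article, not any validation that Fabric itself performs.

```python
from urllib.parse import urlparse


def check_endpoint(uri: str) -> dict:
    """Split an endpoint URI into account, API label, and port.

    Raises ValueError if the URI doesn't look like
    https://<account>.<api>.azure.com:<port>/.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "https":
        raise ValueError("endpoint must use https")
    host = parsed.hostname or ""
    labels = host.split(".")
    if len(labels) < 4 or not host.endswith(".azure.com"):
        raise ValueError("expected a host like <account>.<api>.azure.com")
    return {"account": labels[0], "api": labels[1], "port": parsed.port}


# Hypothetical account name, for illustration only.
info = check_endpoint("https://contoso.documents.azure.com:443/")
print(info)
```

The sketch only checks the shape of the string. It doesn't contact the account or verify the key.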

Get events from an Azure Cosmos DB CDC

You can get events from Azure Cosmos DB CDC into Real-Time hub in one of the following ways:

Data sources page

  1. Sign in to Microsoft Fabric.

  2. If you see Power BI at the bottom-left of the page, switch to the Fabric workload by selecting Power BI and then by selecting Fabric.

    Screenshot that shows how to switch to the Fabric workload.

  3. Select Real-Time on the left navigation bar.

    Screenshot that shows how to launch Connect to data source experience.

  4. The Streaming data page opens by default. Select the Add data button to get to the Data sources page.

    Screenshot that shows the Data sources page in the Real-Time hub.

    You can also get to the Data sources page directly by selecting the Add data option in the left navigation bar.

    Screenshot that shows the Connect data source button.

Then follow the instructions in the Add Azure Cosmos DB CDC as a source section.

Microsoft sources page

  1. In Real-Time hub, select Microsoft sources.

  2. In the Source drop-down list, select Azure Cosmos DB (CDC).

  3. For Subscription, select an Azure subscription that has the resource group with your Cosmos DB account.

  4. For Resource group, select a resource group that has your Cosmos DB account.

  5. For Region, select the location of your Cosmos DB account.

  6. In the list of databases, move the mouse over the name of the Cosmos DB CDC source that you want to connect to Real-Time hub, and select the Connect button, or select ... (ellipsis), and then select the Connect button.

    Screenshot that shows the Microsoft sources page with filters to show Cosmos DB CDC and the connect button.

    To configure connection information, use steps from the Add Azure Cosmos DB CDC as a source section. Skip the first step of selecting Azure Cosmos DB CDC as a source type in the Add source wizard.

Add Azure Cosmos DB CDC as a source

  1. On the Connect screen, under Connection, select New connection to create a cloud connection linking to your Azure Cosmos DB database.

    Screenshot that shows the Connect page with the New connection link selected.

  2. On the Connection settings screen, enter the following information:

    • Cosmos DB Endpoint: Enter the URI or Endpoint for your Cosmos DB account that you copied from the Azure portal.
    • Connection name: Automatically generated, or you can enter a new name for this connection.
    • Account key: Enter the Primary Key for your Azure Cosmos DB account that you copied from the Azure portal.

    A screenshot of the Connection settings for the Azure Cosmos DB CDC source.

  3. Select Connect.

  4. Provide the following information for your Azure Cosmos DB resources.

    • Container ID: Enter the name of the Azure Cosmos DB container or table you want to connect to.
    • Database: Enter the name of your Azure Cosmos DB database.
    • Offset policy: Select whether to start reading from the Earliest or the Latest offset when there's no committed offset.
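
The Offset policy setting matters only when no offset has been committed yet. The following Python sketch illustrates that decision under stated assumptions: the function name resolve_start_offset and the integer offsets are hypothetical, and the article doesn't describe how the connector tracks positions internally, so treat this as a model of the usual meaning of Earliest and Latest rather than the connector's actual logic.

```python
from typing import Optional


def resolve_start_offset(
    committed: Optional[int], policy: str, earliest: int, latest: int
) -> int:
    """Pick the position to start reading from.

    A committed offset always wins. The policy is consulted only
    when nothing has been committed yet.
    """
    if committed is not None:
        return committed
    if policy == "Earliest":
        return earliest  # read everything still available
    if policy == "Latest":
        return latest  # read only changes that arrive from now on
    raise ValueError(f"unknown offset policy: {policy!r}")


# Hypothetical offsets, for illustration only.
print(resolve_start_offset(None, "Earliest", earliest=0, latest=42))
print(resolve_start_offset(None, "Latest", earliest=0, latest=42))
print(resolve_start_offset(17, "Latest", earliest=0, latest=42))
```

In this model, Earliest replays existing data on first start, while Latest skips it and picks up only new changes.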

Stream or source details

  1. On the Connect page, follow one of these steps based on whether you're using Eventstream or Real-Time hub.

    • Eventstream:

      In the Source details pane to the right, follow these steps:

      1. For Source name, select the Pencil button to change the name.

      2. Notice that Eventstream name and Stream name are read-only.

    • Real-Time hub:

      In the Stream details section to the right, follow these steps:

      1. Select the Fabric workspace where you want to create the eventstream.

      2. For Eventstream name, select the Pencil button, and enter a name for the eventstream.

      3. The Stream name value is automatically generated for you by appending -stream to the name of the eventstream. This stream appears on the real-time hub's All data streams page when the wizard finishes.

  2. Select Next at the bottom of the Configure page.
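
The naming rule for the Real-Time hub path, where the stream name is the eventstream name with -stream appended, can be expressed in one line. This Python sketch uses a hypothetical helper name and a hypothetical eventstream name. It's useful mainly if you script lookups of the stream by name after the wizard finishes.

```python
def default_stream_name(eventstream_name: str) -> str:
    """Return the stream name derived by appending -stream."""
    return f"{eventstream_name}-stream"


# Hypothetical eventstream name, for illustration only.
stream_name = default_stream_name("cosmos-cdc-es")
print(stream_name)
```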

Review and connect

On the Review + connect screen, review the summary, and select Add (Eventstream) or Connect (Real-Time hub).

View data stream details

  1. On the Review + connect page, if you select Open eventstream, the wizard opens the eventstream that it created for you with the selected Azure Cosmos DB CDC as a source. To close the wizard, select Close or X in the top-right corner of the page.

    Screenshot that shows the Review + connect page after successful creation of the source.

  2. In Real-Time hub, select All data streams. To see the new data stream, refresh the All data streams page.

    Screenshot that shows the Real-Time hub All data streams page with the stream you just created.

    For detailed steps, see View details of data streams in Fabric Real-Time hub.

To learn about consuming data streams, see the following articles: