Content Understanding builds upon the functionality of Document Intelligence, Speech to Text, Image Analysis, Face, Video, and Azure OpenAI, each designed with compliance, privacy, and security at its core. This combined service processes various types of customer-provided data, such as documents, audio, images, biometric data (face), text, and video to deliver powerful analysis and intelligence capabilities. Importantly, users are responsible for ensuring that their use of this service complies with all applicable laws and regulations in their jurisdiction, including data protection, privacy, and communications laws, as well as any specific requirements around biometric data when using facial recognition features. It's essential to acquire all necessary permissions, licenses, or third-party rights for the content and data submitted for processing.
Since the data processed in this integrated service may involve personal or sensitive information, including biometric identifiers and human speech content, users must follow all jurisdictional requirements related to data protection. For instance, when using biometric technologies, it's crucial to provide clear, conspicuous disclosure to individuals, particularly in regions with strict biometric data governance. Data provided to the Azure OpenAI service is stored and processed to monitor compliance with product terms, and Microsoft’s Products and Services Data Protection Addendum applies to all data handling within the Azure OpenAI framework. By combining these technologies, our service offers robust insights while ensuring users maintain responsibility for adhering to legal and regulatory standards.
What data does Content Understanding process?
Content Understanding can process audio input or voice audio, image files, document files, and video files. Each input type has different file limits, such as file type, size, length, and resolution. The limits are outlined in the service quotas and limits documentation.
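As a minimal client-side sketch, you could route files to the right modality before submission. The extension-to-modality map below is illustrative only; the authoritative list of accepted file types, sizes, lengths, and resolutions is in the service quotas and limits documentation.

```python
from pathlib import Path

# Illustrative mapping only; consult the service quotas and limits
# documentation for the real accepted file types and limits.
MODALITY_BY_EXTENSION = {
    ".pdf": "document", ".docx": "document",
    ".wav": "audio", ".mp3": "audio",
    ".jpg": "image", ".png": "image",
    ".mp4": "video", ".mov": "video",
}

def classify_input(path: str) -> str:
    """Return the Content Understanding modality for a file,
    or raise ValueError if the type isn't in the map."""
    ext = Path(path).suffix.lower()
    try:
        return MODALITY_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"Unsupported input type: {ext}")
```

A check like this lets you fail fast locally instead of waiting for the service to reject an unsupported file.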
How does Content Understanding process data?
Authenticate
Content Understanding first requires users to authenticate access to the Content Understanding API by using a Foundry Tools API key. Each request to the service URL must include an authentication header. This header passes along an API key (or a token, if applicable), which is used to validate your subscription for the service. In addition to API key authentication, Content Understanding also supports Microsoft Entra ID (formerly Azure Active Directory) authentication. For more information, see Authenticate requests to Foundry Tools, which has additional information on Entra ID authentication and authorizing access with managed identities.
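The authentication header can be sketched like this. Azure AI services conventionally carry the API key in an `Ocp-Apim-Subscription-Key` header and an Entra ID token in a standard `Authorization: Bearer` header; treat the exact header names as a convention to verify against the Authenticate requests to Foundry Tools documentation.

```python
def build_auth_headers(api_key: str = "", aad_token: str = "") -> dict:
    """Build the authentication header for a request.

    Passes an API key via Ocp-Apim-Subscription-Key (the usual
    Azure AI services convention), or a Microsoft Entra ID access
    token as a bearer token. Exactly one should be supplied.
    """
    if api_key:
        return {"Ocp-Apim-Subscription-Key": api_key}
    if aad_token:
        return {"Authorization": f"Bearer {aad_token}"}
    raise ValueError("Provide an API key or an Entra ID token")
```

Every request to the service URL would then include these headers alongside the request body.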
Secure data in transit
All Foundry Tools endpoints use HTTPS URLs for encrypting data in transit. The client operating system needs to support Transport Layer Security (TLS) 1.3 to call the endpoints. For more information, see Transport Layer Security.
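On the client side, you can pin the minimum protocol version so a connection below the version the endpoints require is refused. A sketch using Python's standard `ssl` module:

```python
import ssl

# Client-side TLS context that refuses anything below TLS 1.3.
# Certificate and hostname verification stay enabled by default.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_3
```

A context configured this way can be passed to `http.client.HTTPSConnection` or `urllib.request` so every call to the service endpoints negotiates TLS 1.3 or fails.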
Encrypt input data for processing
When you submit your files to a Content Understanding operation, it starts the process of analyzing the input. Your data and results are then temporarily encrypted and stored in Azure Storage in the same region as your Content Understanding resource before being sent to Azure OpenAI for further processing. While compute resources aren't dedicated per customer, requests are processed in logically isolated, sandboxed containers to ensure workload separation and prevent cross-tenant data exposure.
Data at rest and processing locations
Content Understanding stores customer data at rest in the same region as the Content Understanding resource.
Processing locations depend on the type of operation:
- Analyzers (prebuilt-read and prebuilt-layout only): You can control where data is processed on a per-request basis by using the processingLocation parameter. You can select a geography (for example, Japan or the United States), a data zone (for example, Europe or the United States), or a global setting (any geography).
- Content extraction (document, audio, and video): You bring your own LLM instance and capacity, which Content Understanding uses to process the data. Customer data might be processed outside the resource region based on the LLM deployment type you choose: geography (for example, Japan or the United States), data zone (for example, Europe or the United States), or global (any geography).
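The per-request location control above can be sketched as URL construction. The path shape and the api-version value here are hypothetical placeholders; only the processingLocation query parameter comes from this page, so verify both against the API reference before relying on them.

```python
from urllib.parse import urlencode

def analyze_url(endpoint: str, analyzer: str, processing_location: str = "") -> str:
    """Compose an Analyze request URL.

    processingLocation is the per-request location control described
    for prebuilt-read and prebuilt-layout: a geography, a data zone,
    or "global". The path and api-version are illustrative only.
    """
    query = {"api-version": "2024-12-01-preview"}  # placeholder version
    if processing_location:
        query["processingLocation"] = processing_location
    return f"{endpoint}/contentunderstanding/analyzers/{analyzer}:analyze?{urlencode(query)}"
```

For example, passing a data zone keeps processing within that zone, while omitting the parameter leaves the service default in effect.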
Retrieve the results
The "Get Result" operation is authenticated with the same API key that was used to call the "Analyze" operation, ensuring that no other customer can access your data. It returns the analysis job's completion status. When the status shows as succeeded, the response also includes the extracted results in JSON format.
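Because the Analyze operation is asynchronous, retrieval is typically a polling loop over the Get Result status. In this sketch, `get_status` stands in for an authenticated GET against the result URL (sent with the same API key as the Analyze call) that returns the parsed JSON body; the terminal status names are assumptions modeled on the "succeeded" value this page mentions.

```python
import time

def poll_result(get_status, interval_s: float = 0.0, max_attempts: int = 10) -> dict:
    """Poll a Get Result operation until the job reaches a terminal state.

    get_status: callable returning the parsed JSON status body of an
    authenticated GET against the result URL (hypothetical stand-in).
    """
    for _ in range(max_attempts):
        body = get_status()
        if body.get("status") in ("succeeded", "failed"):
            return body
        time.sleep(interval_s)
    raise TimeoutError("Analysis did not complete in time")
```

Once the returned status is succeeded, the same body carries the extracted results in JSON format.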
Data retention
Input documents and intermediate representations are written to secure Microsoft-managed storage only for the duration of processing and are deleted once the operation completes. Output results are retained for up to 24 hours to support asynchronous retrieval, after which they're automatically deleted. The analyzer name is logged for reporting and debugging.
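The 24-hour retrieval window can be made concrete with a small helper for scheduling result pickup; the constant mirrors the retention period stated above, not a value read from the API.

```python
from datetime import datetime, timedelta, timezone

# Output results are retained for up to 24 hours after completion.
RESULT_RETENTION = timedelta(hours=24)

def result_expiry(completed_at: datetime) -> datetime:
    """Latest time at which an output result is still retrievable."""
    return completed_at + RESULT_RETENTION
```

A client should fetch and persist any results it needs before this deadline, since the service deletes them automatically afterward.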
Face
Face is a gated feature because it processes biometric data. The service detects faces in the input files and groups them by similarity. No intermediate data persists beyond the processing of the request. The face groupings associated with analysis results are persisted for 48 hours unless you explicitly delete the face data. For more information, see the Data and privacy for Face documentation.
Azure OpenAI
Content Understanding also uses Azure OpenAI models after each modality input is processed through the underlying Foundry Tools. For more information, see the Azure OpenAI Data, privacy, and security documentation.