Inquiry: Security Policies & Stateless Processing for Medical Document Extraction (Azure Content Understanding)

Loc Tong 0 Reputation points
2026-04-02T01:12:50.69+00:00

Dear Azure Support Team,

I am writing to seek technical clarification regarding the implementation of Azure Content Understanding for a medical project based in Australia. Our clients, who are senior medical specialists, have extremely high requirements for data privacy and legal compliance.

In addition to the requirements below, our use case involves extracting structured patient information from predefined document templates (e.g., referral letters, medical forms). The system is expected to accurately identify and extract specific fields such as patient name, ID, Medicare number, and other clinical metadata.

  1. Patient-Identifiable Data Redaction (PII/PHI)

We must ensure that all Personally Identifiable Information (PII) and Protected Health Information (PHI)—such as patient names, Medicare numbers, and IDs—are automatically identified and redacted during the scanning process.

Does Azure Content Understanding support both:

• Automated field-level extraction from predefined templates, and

• Automated redaction of PII/PHI fields for Australian medical documents?

  2. Zero Data Leakage

Australian regulations prohibit senior specialists from allowing sensitive patient data to be stored or exposed on external cloud systems.

Can you confirm whether Azure Content Understanding ensures that no customer data is stored or retained (i.e., operates in a stateless manner) and that there is no data leakage?

  3. Regional Compliance

To comply with local regulations, all data processing must occur within the Australia East region.

Can you confirm that Azure Content Understanding can be fully deployed and operated within this region?

We would appreciate it if you could provide official documentation or a Data Protection Addendum (DPA) that covers stateless processing, no-data-retention configurations, and compliance for handling medical data in Australia.

Thank you for your professional support.

Best regards,

Loc


1 answer

  1. Siva shunmugam Nadessin 7,735 Reputation points Microsoft External Staff Moderator
    2026-04-03T19:38:44.7+00:00

    Hello Loc Tong,

    Thank you for reaching out to the Microsoft Q&A forum. 

    Automated field-level extraction from templates

    • Azure’s “Content Understanding” (often implemented today with Azure Form Recognizer, since renamed Azure AI Document Intelligence) lets you build or use custom template models. You upload a handful of sample referral letters or medical forms, label the fields (e.g., patient name, ID, Medicare number, clinical metadata), and train a custom model, which then extracts those same fields from new documents that follow the same layout.

    • If you need a turnkey experience, there are also prebuilt models (e.g., invoices, receipts), but for medical-specific forms you’ll want custom training.

    PII/PHI redaction

    • There isn’t a fully managed “one-click” redaction step built into Form Recognizer today, but there are two common patterns:

    – Run the document through the extraction model first, take the bounding boxes or text spans returned for the PII fields, then overlay a redaction mask in your own code or via Azure Functions.

    – After OCR, pass the text through Azure Content Moderator’s text-screening APIs (or Azure AI Language’s PII detection, or a similar text redaction library) to scrub names, IDs, numbers, etc.

    • You can chain both patterns in a pipeline so that extracted PII is redacted automatically before you store or surface the document.
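As a rough illustration of how the two patterns above could be chained, here is a minimal Python sketch. It assumes the extraction step has already returned character offsets for each PII field in the OCR text. `FieldSpan`, `redact_spans`, `redact_patterns`, `redact`, and the Medicare-number regex are hypothetical names and choices made for this example; none of them comes from an Azure SDK, and the regex is only a loose approximation of the card-number layout.

```python
import re
from dataclasses import dataclass


@dataclass
class FieldSpan:
    """Hypothetical stand-in for one extracted field's position in the OCR text."""
    name: str
    offset: int  # index of the first character of the field value
    length: int  # number of characters in the field value


# Assumed pattern: 10 digits starting with 2-6, optionally grouped 4-5-1,
# with an optional trailing reference digit. Tune this for real documents.
MEDICARE_RE = re.compile(r"\b[2-6]\d{3}[ -]?\d{5}[ -]?\d(?:[ -]?\d)?\b")


def redact_spans(text: str, spans: list[FieldSpan], mask: str = "X") -> str:
    """Pattern 1: mask every character covered by an extracted field span."""
    chars = list(text)
    for span in spans:
        end = min(span.offset + span.length, len(chars))
        for i in range(span.offset, end):
            chars[i] = mask
    return "".join(chars)


def redact_patterns(text: str, mask: str = "X") -> str:
    """Pattern 2: scrub anything that looks like a Medicare number."""
    return MEDICARE_RE.sub(lambda m: mask * len(m.group()), text)


def redact(text: str, spans: list[FieldSpan]) -> str:
    """Chain both passes. Each pass preserves length, so span offsets stay valid."""
    return redact_spans(redact_patterns(text), spans)


sample = "Patient: Jane Citizen\nMedicare: 2123 45670 1\nReason: knee pain"
print(redact(sample, [FieldSpan("patient_name", offset=9, length=12)]))
```

Both passes replace characters one-for-one, so the offsets reported by the extraction step remain usable after the regex pass. For scanned images you would apply the same idea to bounding boxes instead of text offsets.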

    Zero data leakage / stateless processing

    • Per our Responsible AI guidance for Azure Health Insights (the same residency/retention principles apply to Content Understanding services):

    – Data never leaves the region you deploy in (you’ll choose Australia East for your resource).

    – Input and output documents are encrypted at rest.

    – Documents and results are stored for up to 24 hours and then purged automatically, so processing is not strictly stateless during that window.

    – There’s no long-term customer data retention on Microsoft’s side; Azure doesn’t build a customer document store for these services.

    – During that short retention period, data is accessible only to on-call support engineers, under strict audit, in the event of a catastrophic failure.

    Regional compliance & DPA

    • You can deploy your Content Understanding/Form Recognizer resource in Australia East. All processing, encryption, and temporary storage happen there.

    • For your legal/compliance teams, Microsoft’s Data Protection Addendum (DPA) and Azure Online Services Terms cover the retention, encryption, and regional-processing commitments. You can find those here:

    – Azure Trust Center Compliance: https://www.microsoft.com/TrustCenter/Compliance

    – Online Services Terms (Data Protection Addendum section): https://www.microsoftvolumelicensing.com/DocumentSearch.aspx?Mode=3&DocumentTypeId=31

    Next steps / references

    1. Get started with custom templates in Form Recognizer / Document Intelligence: https://docs.microsoft.com/azure/applied-ai-services/form-recognizer/
    2. Redaction guidance (pattern): https://docs.microsoft.com/azure/applied-ai-services/content-moderator/
    3. Responsible AI & data privacy for healthcare: https://docs.microsoft.com/azure/azure-health-insights/responsible-ai/data-privacy-security

    Let me know if you need more on the redaction pipeline or a direct link to the DPA excerpt!
