Inquiry: Security Policies & Stateless Processing for Medical Document Extraction (Azure Content Understanding)


Loc Tong 0

Dear Azure Support Team,

I am writing to seek technical clarification regarding the implementation of Azure Content Understanding for a medical project based in Australia. Our clients, who are senior medical specialists, have extremely high requirements for data privacy and legal compliance.

In addition to the requirements below, our use case involves extracting structured patient information from predefined document templates (e.g., referral letters, medical forms). The system is expected to accurately identify and extract specific fields such as patient name, ID, Medicare number, and other clinical metadata.

Patient Identifiable Data Redaction (PII/PHI)

We must ensure that all Personally Identifiable Information (PII) and Protected Health Information (PHI)—such as patient names, Medicare numbers, and IDs—are automatically identified and redacted during the scanning process.

Does Azure Content Understanding support both:

Automated field-level extraction from predefined templates, and

Automated redaction of PII/PHI fields for Australian medical documents?

Zero Data Leakage

Australian regulations prohibit senior specialists from allowing sensitive patient data to be stored or exposed on external cloud systems.

Can you confirm whether Azure Content Understanding ensures that no customer data is stored or retained (i.e., operates in a stateless manner) and that there is no data leakage?

Regional Compliance

To comply with local regulations, all data processing must occur within the Australia East region.

Can you confirm that Azure Content Understanding can be fully deployed and operated within this region?

We would appreciate it if you could provide official documentation or a Data Protection Addendum (DPA) that covers stateless processing, no-data-retention configurations, and compliance for handling medical data in Australia.

Thank you for your professional support.

Best regards,

Loc


Answer 1

Hello Loc Tong,

Thank you for reaching out to the Microsoft Q&A forum.

Automated field-level extraction from templates

• Azure’s “Content Understanding” (often implemented today with Azure AI Document Intelligence, formerly Form Recognizer) lets you build or use custom template models. You upload a handful of sample referral letters or medical forms, label the fields (e.g., patient name, ID, Medicare number, clinical metadata), and train a custom model, which then extracts those same fields from new documents that follow the template.

• If you need a turnkey experience, there are also some prebuilt models (e.g., invoices, receipts)—but for medical-specific forms you’ll want custom training.
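As a rough sketch of what consuming a custom model's output could look like, the snippet below maps an analysis result onto form fields and flags missing or low-confidence values for manual review. The result shape (`documents`, `fields`, `content`, `confidence`), the field names, and the 0.8 threshold are assumptions for illustration only; check them against the actual response of the service and model you deploy.

```python
from typing import Any

# Hypothetical field names; use whatever labels you defined when training.
REQUIRED_FIELDS = ["PatientName", "PatientId", "MedicareNumber"]
MIN_CONFIDENCE = 0.8  # assumed review threshold, tune for your documents


def to_form(result: dict[str, Any]) -> tuple[dict[str, str], list[str]]:
    """Return (form values, names of fields that need manual review)."""
    documents = result.get("documents") or [{}]
    fields = documents[0].get("fields", {})
    form: dict[str, str] = {}
    needs_review: list[str] = []
    for name in REQUIRED_FIELDS:
        field = fields.get(name)
        if not field or not field.get("content"):
            needs_review.append(name)  # field missing from the document
            continue
        form[name] = field["content"]
        if field.get("confidence", 0.0) < MIN_CONFIDENCE:
            needs_review.append(name)  # extracted, but uncertain
    return form, needs_review


# Made-up sample data, not a real patient or a real service response.
sample_result = {
    "documents": [{
        "fields": {
            "PatientName": {"content": "Jane Citizen", "confidence": 0.97},
            "MedicareNumber": {"content": "0000 00000 0", "confidence": 0.55},
        }
    }]
}

form, needs_review = to_form(sample_result)
print(form)
print(needs_review)
```

Routing uncertain fields to a human reviewer rather than writing them straight into the patient record is a design choice that tends to matter for identifiers such as Medicare numbers, where a single misread digit is costly.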

PII/PHI redaction

• There isn’t a fully managed “one-click” redaction step built into Form Recognizer today, but you have two common patterns:

1. Run your doc through the form/model extraction first, identify the bounding boxes or text spans for the PII fields, then overlay a redaction mask in your own code or via Azure Functions.

2. Combine with Azure Content Moderator’s text-screening APIs (or a similar text redaction library) after OCR to scrub out names, IDs, numbers, etc.

• You can chain both services in a pipeline so that extracted PII is automatically redacted before you store or surface the document.
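The first pattern can be sketched in plain Python as follows. This is a minimal illustration, assuming the extraction step hands back character offsets and lengths for each PII field; the `Span` type, the field names, and the `find_span` demo helper are hypothetical and stand in for whatever your extraction service actually returns. Values are read from the unredacted text first, and masking is applied only to the copy that will be stored or displayed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    field: str   # label of the extracted field
    offset: int  # start index in the OCR text
    length: int  # number of characters


def find_span(field: str, value: str, text: str) -> Span:
    """Demo helper: locate a value; a real service would supply offsets."""
    return Span(field, text.index(value), len(value))


def extract_values(text: str, spans: list[Span]) -> dict[str, str]:
    """Step 1: read field values out of the unredacted text."""
    return {s.field: text[s.offset:s.offset + s.length] for s in spans}


def redact(text: str, spans: list[Span], mask: str = "X") -> str:
    """Step 2: mask every span, leaving the rest of the text untouched."""
    chars = list(text)
    for s in spans:
        end = min(s.offset + s.length, len(chars))
        for i in range(s.offset, end):
            if not chars[i].isspace():  # keep word boundaries readable
                chars[i] = mask
    return "".join(chars)


# Made-up sample text, not real patient data.
ocr_text = "Patient: Jane Citizen\nMedicare No: 0000 00000 0\nReason: knee pain"
spans = [
    find_span("PatientName", "Jane Citizen", ocr_text),
    find_span("MedicareNumber", "0000 00000 0", ocr_text),
]

values = extract_values(ocr_text, spans)  # goes to the digital form
redacted = redact(ocr_text, spans)        # goes to storage or display
print(values)
print(redacted)
```

For scanned images rather than text, the same ordering applies, with the mask drawn over each field's bounding box instead of over character ranges.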

Zero data leakage / stateless processing

• As per our Responsible AI guidance for Azure Health Insights (and the same residency/retention principles apply to Content Understanding services):

– Data never leaves the region you deploy in (you’ll choose Australia East for your resource).

– Input and output documents are encrypted at rest.

– Documents AND results are only stored up to 24 hours and then purged automatically.

– There’s no long-term customer data retention on Microsoft’s side; Azure doesn’t build a customer document store for these services.

– During that short retention period, data is only accessible to on-call support engineers under strict audit for catastrophic failures.

Regional compliance & DPA

• You can absolutely deploy your Content Understanding/Form Recognizer resource in Australia East. All processing, encryption, and temporary storage happens there.

• For your legal/compliance teams, Microsoft’s Data Protection Addendum (DPA) and Azure Online Services Terms cover the no-retention, encryption, and regional‐processing commitments. You can find those here:

– Azure Trust Center Compliance: https://www.microsoft.com/TrustCenter/Compliance

– Online Services Terms (Data Protection Addendum section): https://www.microsoftvolumelicensing.com/DocumentSearch.aspx?Mode=3&DocumentTypeId=31

Next steps / references

Get started with custom templates in Form Recognizer / Document Intelligence: https://docs.microsoft.com/azure/applied-ai-services/form-recognizer/
Redaction guidance (pattern): https://docs.microsoft.com/azure/applied-ai-services/content-moderator/
Responsible AI & data privacy for healthcare: https://docs.microsoft.com/azure/azure-health-insights/responsible-ai/data-privacy-security

Let me know if you need more on the redaction pipeline or a direct link to the DPA excerpt!

Siva shunmugam Nadessin 7,735 Reputation points Microsoft External Staff Moderator

2026-04-06T10:44:24.5366667+00:00

Loc Tong, just checking in to see whether the solution shared above helped you resolve your issue. Please reach out to us if you have any further questions.

Loc Tong 0 Reputation points

2026-04-06T14:35:10.5266667+00:00

Thank you for your previous advice regarding Azure’s Content Understanding and PII/PHI protection. Our primary goal is data extraction: we need the AI to identify and extract sensitive fields (such as Patient Name, Medicare Number, and Clinical Metadata) to automatically populate a digital form.

My concern is as follows: If we perform PII Redaction before the extraction step, the AI will be unable to "see" or "read" the masked information, which would defeat the purpose of the automated extraction.

Loc Tong 0 Reputation points

2026-04-07T04:34:36.76+00:00

I got it. Thanks for your extremely detailed explanation.

Siva shunmugam Nadessin 7,735 Reputation points Microsoft External Staff Moderator

2026-04-07T16:13:49.2933333+00:00

Loc Tong, if the answer was helpful, kindly up-vote it, as this can be beneficial to other community members.
