The Gap Between "Deployed" and "Secure"
Every enterprise I work with has deployed or is deploying Azure OpenAI. The pace is extraordinary — I've never seen a technology adopted this fast in 26 years of enterprise IT. And that speed is exactly the problem.
Most Azure OpenAI deployments I audit were built by an innovation team in a sandbox subscription, proved the use case, and then got promoted to production before anyone asked the security questions. The model works. The integration works. The security posture is an afterthought — if it's a thought at all.
This checklist is the document I hand every client before their Azure OpenAI deployment touches production data. It's not theoretical — every item on this list exists because I've seen the failure mode it prevents.
Reality check: Azure OpenAI is a Microsoft-managed service running on Azure infrastructure. That means Microsoft handles model security, training data integrity, and service availability. What they don't handle is your deployment configuration, your network architecture, your identity scoping, and your data classification. You own the security posture of how you use the service. This checklist covers your responsibilities.
1. Network Isolation
The default Azure OpenAI deployment accepts traffic from the public internet. For an enterprise deployment processing internal data, this is unacceptable. Here's what the network architecture needs to look like:
- Private endpoint on the Azure OpenAI resource — disable public network access entirely. The OpenAI resource should only be reachable from your virtual network via private link. This is the most important single configuration change you'll make.
- DNS configuration — configure a private DNS zone for
privatelink.openai.azure.comand link it to the VNet where your application runs. Without this, your application will resolve the public IP address and fail to connect through the private endpoint. - Network Security Group rules — restrict outbound traffic from the application subnet to only the private endpoint subnet. The application should not need outbound internet access to call Azure OpenAI — the traffic stays within your VNet.
- API Management gateway (optional but recommended) — place Azure API Management in front of the OpenAI endpoint. This gives you rate limiting, request/response logging, usage analytics, and the ability to swap backend models without changing application code. Deploy APIM in internal VNet mode so it's also not internet-facing.
Why this matters: Without network isolation, any Azure service or user with the API key can call your OpenAI endpoint from anywhere. A leaked API key in a code repository becomes an unrestricted access point to a model that's been fine-tuned on or prompted with your enterprise data. Network isolation ensures that even with a valid key, the caller must be on your network.
2. Identity and Access Scoping
Azure OpenAI supports two authentication methods: API keys and Entra ID (Azure AD) RBAC. Use RBAC. Turn off API key authentication entirely.
- Disable API key access — in the Azure OpenAI resource properties, set
disableLocalAuth: true. This forces all callers to authenticate via Entra ID managed identities or service principals. No shared secrets. No key rotation headaches. No keys to leak. - Use managed identities for application access — your application should use a system-assigned or user-assigned managed identity to call Azure OpenAI. Assign the
Cognitive Services OpenAI Userrole to the managed identity, scoped to the specific OpenAI resource. Don't useCognitive Services Contributor— that grants management plane access the application doesn't need. - Separate identities per application — if multiple applications call the same OpenAI resource, each should have its own managed identity with its own role assignment. This gives you per-application audit trails and the ability to revoke one application's access without affecting others.
- No user-level access to the model endpoint — developers should not be calling the production OpenAI deployment directly from their machines. Provide a dev/test deployment with a separate model instance for experimentation. Production access is application-to-service only, mediated by managed identity.
3. Prompt Injection Defense
Prompt injection is the SQL injection of the AI era, and the industry is about as prepared for it as we were for SQL injection in 2004. If your application takes user input and includes it in a prompt to Azure OpenAI, you are exposed to prompt injection attacks.
The defense is layered:
Input Validation
Validate and sanitize user input before it reaches the prompt. Strip control characters, limit input length, and reject inputs that contain known injection patterns (e.g., "ignore previous instructions," "system prompt override"). This won't catch everything, but it eliminates the obvious attacks.
System Prompt Hardening
Design your system prompt to be resistant to override attempts. Include explicit instructions that the model should not reveal the system prompt, should not execute instructions from user input that contradict system-level directives, and should refuse requests outside its defined scope. Test this with adversarial prompts.
Output Filtering
Never trust model output. Validate, sanitize, and scope every response before presenting it to the user or acting on it programmatically. If the model's output will be rendered as HTML, sanitize it. If it will be used in a database query, parameterize it. If it will trigger an action, verify authorization before executing.
Content Filtering Configuration
Azure OpenAI includes built-in content filtering. Configure it. Set severity thresholds for hate, sexual, self-harm, and violence categories. Enable jailbreak detection. Review filtered requests weekly to tune thresholds — too aggressive filters block legitimate use cases; too permissive filters miss real attacks.
Least Privilege for RAG Sources
If your application uses Retrieval-Augmented Generation (RAG) to ground model responses in your data, enforce access control on the retrieval layer. The user's query to the model should only retrieve documents the user is authorized to see. Azure AI Search supports security trimming — use it. A model that retrieves and surfaces data the user shouldn't see is a data leak, regardless of how well the network is locked down.
The attack I see most often: A RAG application grounded on SharePoint data where the search index was built with a service principal that has access to all sites. Every user query can now surface content from executive compensation documents, HR investigations, and M&A planning folders. The model becomes a search engine that bypasses your entire permission model. Always index with user-delegated permissions or apply security trimming at query time.
4. Data Residency and Processing Boundaries
This is the compliance question that gets asked after deployment, when it should be answered before deployment.
- Deploy in the correct Azure region — Azure OpenAI is not available in all regions. Choose a region that satisfies your data residency requirements. If you're subject to GDPR, deploy in a European region. If you're subject to Canadian data sovereignty requirements, deploy in Canada Central or Canada East. Document the region choice and the regulatory justification.
- Understand the data processing commitment — Azure OpenAI does not use customer data to train or improve Microsoft's models. This is stated in the Azure OpenAI data, privacy, and security documentation. Verify this commitment is documented in your enterprise's Microsoft agreement — don't rely on public documentation alone.
- Disable abuse monitoring if eligible — by default, Azure OpenAI stores prompts and completions for 30 days for abuse monitoring. If your compliance requirements prohibit this, apply for the Limited Access abuse monitoring exception. Once approved, prompts and completions are not stored by Microsoft.
- Classify the data your prompts contain — if your application sends customer PII, financial data, or health data in prompts, those prompts are subject to the same data classification and handling requirements as the source data. Treat prompts as data in transit. Apply encryption, access controls, and retention policies accordingly.
5. Audit Trail Requirements
If you can't prove what the model was asked, what it answered, and who asked it, you don't have an auditable AI deployment. Regulators are starting to require AI audit trails, and your internal risk team should be demanding them now.
- Log every interaction — capture the prompt, the completion, the model version, the token count, the calling identity, the timestamp, and the response latency. Store this in an append-only log (Azure Event Hubs into Azure Data Explorer, or a dedicated Log Analytics table).
- Correlate with business context — every model call should carry a correlation ID that ties it back to the user session, the business process, and the application feature. When an auditor asks "why did the system make this recommendation?", you need to trace from the output back to the exact prompt and the exact data that was retrieved.
- Retain for your compliance window — most industries require 7-year retention for AI-generated outputs that influence business decisions (financial services, healthcare). Even if you're not in a regulated industry, retain for at least 12 months. AI audit requirements are evolving fast — having the data and not needing it is better than the alternative.
- Monitor for anomalies — track prompt volume per user, token consumption trends, content filter trigger rates, and error rates. A sudden spike in token consumption from a single identity could indicate a prompt injection attack extracting data. A spike in content filter triggers could indicate adversarial usage. Build dashboards and set alerts.
The Complete Checklist
Here's the full list — print it, hand it to your team, and don't deploy to production until every item is checked:
Network
Private endpoint enabled — public network access disabled
Private DNS zone configured — privatelink.openai.azure.com
NSG rules scoped — application subnet to private endpoint only
API Management gateway deployed — rate limiting, logging, backend abstraction
Identity
API key authentication disabled — disableLocalAuth: true
Managed identities configured — per-application identity
RBAC role scoped — Cognitive Services OpenAI User (not Contributor)
No direct user access to production endpoint — dev/test deployment provided separately
Prompt Security
Input validation layer deployed — length limits, pattern filtering
System prompt hardened — override resistance tested with adversarial inputs
Output sanitization active — HTML encoding, parameterized queries
Content filtering configured — severity thresholds set, jailbreak detection enabled
RAG security trimming enabled — user-scoped document retrieval
Data & Compliance
Region selected for residency requirements — documented justification
Data processing commitment verified — in enterprise agreement
Abuse monitoring exception applied — if compliance requires no prompt storage
Prompt data classified — same handling as source data classification
Audit
Interaction logging deployed — prompt, completion, identity, timestamp
Correlation IDs implemented — traces from output to prompt to data
Retention policy configured — minimum 12 months, 7 years for regulated
Anomaly monitoring active — volume, token consumption, filter triggers
Final Thoughts
Azure OpenAI is a powerful platform, and Microsoft has done the hard work of making the models available with enterprise-grade infrastructure. But infrastructure security and deployment security are different responsibilities. Microsoft secures the platform. You secure the deployment. This checklist is the minimum viable security posture for any enterprise Azure OpenAI deployment that processes internal data.
Don't skip items because they seem like overhead. Every checklist item exists because I've seen the breach, the compliance finding, or the data leak that happens without it. Secure it before you scale it.
— Jamel A. Housen, Melhousen Solutions