AWS Bedrock vs Azure OpenAI vs Vertex AI: Choosing an Enterprise LLM Platform in 2026
Quick answer: Your hyperscaler choice does more work than your model choice. Pick AWS Bedrock if you want the broadest model catalog under one contract and the deepest federal compliance footprint. Pick Azure AI Foundry if you are already on Microsoft 365 and want first-day access to OpenAI frontier releases. Pick Google Vertex AI if you run on GCP, need native multimodal at the lowest price per token, or want the strongest agent-building stack. Multi-cloud teams almost always default to Bedrock as the model-neutral hub.
TL;DR comparison
| Factor | AWS Bedrock | Azure AI Foundry | Google Vertex AI |
|---|---|---|---|
| Primary models | Claude (Anthropic), Llama (Meta), Mistral, Cohere, Stability, Amazon Nova/Titan, DeepSeek | OpenAI (GPT-5.5, GPT-image, Sora), Anthropic Claude, Llama, Mistral, xAI Grok | Gemini (Google), Anthropic Claude, Llama, Mistral, plus Model Garden (200+ open models) |
| Frontier model exclusivity | Anthropic preferred home, $4B+ investment | OpenAI first-party, frontier ships here first | Gemini first-party, frontier ships here first |
| Native cloud | AWS only | Azure only | GCP only |
| Cross-cloud (model only) | Claude reachable via Bedrock from any environment with AWS IAM | Limited — Foundry endpoints | Limited — Vertex endpoints |
| Pricing posture | Pass-through to model provider plus AWS infra | Same as OpenAI list, EA discounts | Pass-through, GCP commit discounts |
| Long-context economics | Best for Claude (flat to 1M), variable for others | OpenAI's tiered pricing (2x above 272K) | Gemini doubles above 200K, Claude flat |
| FedRAMP High / DoD IL4-5 | Yes (Bedrock GovCloud, IL4/5) | Yes (Azure Government, OpenAI in IL5) | Partial (FedRAMP High on GCC; IL4/5 limited) |
| Default Zero Data Retention | Contractual, no per-request flagging | Requires EA + per-resource approval | Contractual on Vertex |
| Agent SDK | Strands, AgentCore, Bedrock Flows | Foundry Agent Service, Semantic Kernel | Agent Builder, ADK, Agent Engine |
| Best for | Multi-cloud, regulated, model-agnostic | Microsoft-stack, OpenAI-exclusive workloads | GCP shops, multimodal, agent-heavy |
The reframe: the platform is a contract, the model is a config
Most enterprise LLM evaluations start with the model — "do we want Claude or GPT-5.5 or Gemini?" — and only later think about where it runs. That sequence is backwards for production deployments.
The model decision is now soft. Frontier capability has converged (we covered this in the OpenAI vs Anthropic vs Google comparison), abstraction layers like LiteLLM and Bedrock cross-model APIs are mature, and most teams swap models multiple times in a year. The platform decision is hard. You sign a single enterprise contract with a hyperscaler. You inherit its compliance posture, its regional footprint, its IAM model, its networking primitives, and its procurement terms. Switching costs are measured in quarters, not weeks.
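The "model is a config, platform is a contract" idea reduces to very little code. A minimal sketch in Python — the model identifiers and the routing stub are illustrative stand-ins, not real SDK calls:

```python
# Model-as-config router sketch. Swapping models is a one-line config
# change; the platform (auth, networking, compliance) is not even
# representable here -- that asymmetry is the point.
MODEL_CONFIG = {
    "default": "bedrock/claude-sonnet",     # hypothetical model ID
    "cheap":   "bedrock/amazon-nova-lite",  # hypothetical model ID
}

def complete(prompt: str, tier: str = "default") -> str:
    model_id = MODEL_CONFIG[tier]
    platform, model = model_id.split("/", 1)
    # In production this dispatches to the platform SDK (boto3 for
    # Bedrock, etc.). Here a stub echoes the routing decision.
    return f"[{platform}:{model}] {prompt}"

print(complete("Summarize Q3 revenue", tier="cheap"))
```

Changing "cheap" from Nova to Haiku is a config edit and a regression run; changing `bedrock/` to another platform prefix is a procurement and networking project.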
Two questions decide most of this:
- Which cloud is your data already in? That answer almost always narrows the field to one or two.
- What is your model breadth requirement? If you want the broadest catalog under one contract: Bedrock. If you want OpenAI frontier on day one: Azure. If you want Gemini-first plus an open-model garden: Vertex.
Everything else — agent SDK quality, batch pricing, fine-tuning surface — is a tiebreaker after those two.
What each platform actually is
AWS Bedrock in 2026
Bedrock is AWS's managed LLM service. It is built as a model marketplace: you sign one Bedrock contract and get serverless API access to Anthropic Claude, Meta Llama, Mistral, Cohere, Stability, Amazon Nova and Titan, DeepSeek, and others. The platform's bet is that no single model wins, so AWS will host all the strong ones under a single inference, billing, and compliance surface.
Anthropic is the headline tenant — AWS has invested over $4 billion and Claude on Bedrock is the de facto default deployment for enterprise Claude. Amazon's own Nova family (Pro, Lite, Micro, plus multimodal Nova Canvas and Reel) rounds out the cheap end. The agent stack — Strands SDK, AgentCore for runtime, Bedrock Agents and Flows for orchestration — has been the most aggressively expanded surface in 2025 and 2026.
Strengths: the only platform where you can mix Claude, Llama, Mistral, and Amazon-native models under one IAM, one VPC posture, and one billing contract. Deepest federal compliance (FedRAMP High, DoD IL4 and IL5 on Bedrock GovCloud). Strongest model-agnostic story. Native PrivateLink and VPC endpoints for data residency. Friction: Bedrock is AWS-only. There is no Bedrock-on-Azure path. Anthropic on Bedrock is operationally excellent but you are still routing Claude through AWS infrastructure — if your security team objects to that on principle, you are back to Anthropic direct or Vertex.
Azure AI Foundry in 2026
Azure AI Foundry (formerly Azure AI Studio, with Azure OpenAI Service folded in) is Microsoft's enterprise AI platform. The defining feature is OpenAI exclusivity for first-party frontier: GPT-5.5, GPT-image, Sora, and the Realtime voice API ship in Foundry on day one, often before the public OpenAI API. Microsoft has expanded the model catalog to include Anthropic Claude (currently Anthropic-hosted, with Azure regional support promised in 2026), Llama, Mistral, and xAI Grok, but OpenAI remains the gravitational center.
The platform's bet is that most enterprises run on Microsoft 365 and want their AI inside the same compliance, identity, and procurement boundary as Word, Teams, and Sentinel. Foundry Agent Service, Semantic Kernel, and the AutoGen framework are the agent surface. Azure Government provides the federal path.
Strengths: first-day access to OpenAI frontier, deepest Microsoft 365 and Sentinel integration, mature Enterprise Agreement procurement, full FedRAMP High and IL5 via Azure Government. Friction: the model catalog is OpenAI-weighted by design — Anthropic and others are guests, not co-equals. Zero Data Retention requires an Enterprise Agreement and per-resource Microsoft approval, and OpenAI explicitly reserves opt-out rights for severe-risk investigations on GPT-5.5. Azure-only runtime — no path off the platform without re-architecture.
Google Vertex AI in 2026
Vertex AI is Google Cloud's unified ML and AI platform. The defining feature is Gemini-first with everything else as backup: Gemini 3 Pro, Flash, and Nano are first-party, the Model Garden hosts 200+ open and partner models (Anthropic Claude, Meta Llama, Mistral, Mixtral, Falcon, and others), and the platform is the strongest at multimodal because Gemini was designed multimodal-native.
The agent surface — Agent Builder, Agent Development Kit (ADK), Agent Engine — has been Google's most aggressive 2026 push. Vertex AI Search and the BigQuery integration give it the cleanest path for "AI plus structured enterprise data" workloads. Compliance covers FedRAMP High on GCC and HIPAA BAAs across the stack.
Strengths: lowest per-token pricing at every tier (Gemini Flash at $0.30 / $2.50 per million is the floor), best native multimodal (text, image, video, audio all at frontier), tightest integration with BigQuery and Vertex AI Search for RAG, deepest agent-builder stack. Native ZDR on Vertex with no per-request flagging. Friction: GCP-only runtime. The Model Garden is broad but most enterprises only seriously evaluate Gemini and Claude on Vertex — the other open models are mostly research-tier. Gemini's 200K context cliff (2x input cost above 200K) punishes aggressive RAG retrieval.
Deep dive: model catalog breadth
The honest comparison of what enterprises actually have available:
| Model Family | Bedrock | Foundry | Vertex AI |
|---|---|---|---|
| Anthropic Claude (Opus, Sonnet, Haiku) | Yes — primary | Yes — guest | Yes — first-class |
| OpenAI GPT-5.5, GPT-image, Sora | No | Yes — primary | No |
| Google Gemini 3 (Pro, Flash, Nano) | No | No | Yes — primary |
| Meta Llama 4 | Yes | Yes | Yes |
| Mistral Large 3, Codestral | Yes | Yes | Yes |
| Cohere Command R+ | Yes | No | Yes |
| Amazon Nova (Pro, Lite, Micro) | Yes — primary | No | No |
| xAI Grok | No | Yes | No |
| DeepSeek V3, R1 | Yes | Limited | Yes |
| Stability (image) | Yes | No | Yes |
The structural takeaway: Bedrock is the only platform that gives you Claude, Llama, Mistral, Cohere, and a first-party model line under one contract. Foundry is the only platform with OpenAI frontier. Vertex is the only platform with Gemini frontier. If your strategy requires GPT-5.5 plus Claude plus Llama all under one billing surface, no platform does it — you sign two contracts.
Deep dive: pricing posture
All three platforms pass model-provider pricing through with minimal markup, but the surrounding economics differ:
Bedrock. Pay the model provider's per-token rate plus standard AWS infrastructure (compute, data transfer, storage). Batch processing offers up to a 50% discount. Provisioned throughput is available for predictable high-volume workloads. Cross-region inference (CRIS) automatically routes to less-loaded regions at slightly higher per-token cost in exchange for higher availability. Enterprise Discount Program (EDP) commitments roll Bedrock spend into your AWS enterprise discount.
Foundry. OpenAI list pricing pass-through. Provisioned Throughput Units (PTUs) for guaranteed capacity at fixed cost — useful for workloads that need predictable latency. EA discounts apply to all Azure consumption, including Foundry. Microsoft can package Foundry into broader Microsoft 365 Copilot deals, which is the single biggest commercial lever and the reason many enterprises end up on Azure regardless of model preference.
Vertex AI. Gemini-native pricing is the lowest at every tier (Flash at $0.30 / $2.50 per million, Gemini 3 Pro at $1.25 / $10 under 200K context). Batch processing offers 50% discount. Committed Use Discounts (CUDs) apply to Vertex spend like any other GCP service. The cheapest absolute floor for high-volume mid-tier workloads.
Our heuristic: model the bill on three workloads — a routine task at mid-tier, a long-context agent run, and a cached high-frequency call — under each platform's actual contract terms (EA, EDP, CUD) before signing. The "cheapest" platform on the website is rarely the cheapest after enterprise discounts.
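That heuristic is simple enough to script. A sketch of the per-request arithmetic, with an optional long-context price tier and batch discount — all rates and tier boundaries below are illustrative placeholders, not quotes from any platform:

```python
# Rough per-request cost model for comparing platforms on a workload.
# All rates are illustrative placeholders in $ per million tokens.

def request_cost(in_tok, out_tok, in_rate, out_rate,
                 tier_boundary=None, tier_multiplier=2.0,
                 batch_discount=0.0):
    """Cost of one request, with an optional long-context tier
    (input rate multiplied above the boundary) and batch discount."""
    if tier_boundary is not None and in_tok > tier_boundary:
        in_rate *= tier_multiplier
    cost = (in_tok * in_rate + out_tok * out_rate) / 1_000_000
    return cost * (1 - batch_discount)

# Long-context agent run: 300K tokens in, 5K out.
# Flat pricing vs a 200K boundary that doubles the input rate.
flat = request_cost(300_000, 5_000, in_rate=3.0, out_rate=15.0)
tiered = request_cost(300_000, 5_000, in_rate=1.25, out_rate=10.0,
                      tier_boundary=200_000)
print(f"flat ${flat:.4f}  tiered ${tiered:.4f}")
```

Run the same three workloads (routine mid-tier, long-context agent, cached high-frequency) through each platform's actual post-discount rates before signing; the tier boundary often matters more than the headline rate.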
Deep dive: agent tooling
Where the platforms have invested most aggressively in 2025 and 2026:
- Bedrock ships Strands (SDK for agent loops), AgentCore (managed runtime with built-in memory, tool use, and observability), and Bedrock Agents and Flows (low-code orchestration). Strong native integration with AWS Lambda, Step Functions, and EventBridge for production agent workflows. The cleanest answer for "I want to run a long-lived agent with state inside AWS."
- Foundry ships Foundry Agent Service (managed agent runtime), Semantic Kernel (Microsoft's agent framework), and AutoGen (multi-agent research framework). Tightest integration with Microsoft 365 — agents can read Outlook, Teams, SharePoint, and Sentinel data natively under the same identity. Strongest answer for "I want an agent that works on top of our Microsoft data."
- Vertex AI ships Agent Builder (low-code), Agent Development Kit (Python and TypeScript SDK), and Agent Engine (managed runtime). Native integration with Vertex AI Search for retrieval and BigQuery for structured data. Strongest answer for "I want an agent that queries enterprise structured data" — the BigQuery hook is unique to Vertex.
In our deployments, the agent SDK rarely decides the platform — it tracks the cloud the enterprise is already on. But if you are starting from neutral, Vertex's BigQuery-native agents and Foundry's Microsoft 365 agents are the two most differentiated stacks. Bedrock is the most model-flexible runtime but has no single hook as distinctive as either.
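For orientation, all three managed runtimes wrap some version of the same loop: model call, tool dispatch, state carry-over, repeat. A platform-neutral sketch of that loop — the tool, the stubbed model, and every identifier here are illustrative, not any vendor's API:

```python
# Generic tool-calling agent loop: the pattern the managed runtimes
# (Strands/AgentCore, Foundry Agent Service, Agent Engine) wrap with
# auth, memory, and observability. The model is a deterministic stub.

def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"          # stub tool

TOOLS = {"lookup_order": lookup_order}

def model(messages):
    """Stub model: requests the tool once, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_order", "args": {"order_id": "A-17"}}
    return {"answer": "Your order A-17 has shipped."}

def run_agent(user_msg: str, max_turns: int = 5) -> str:
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_turns):
        step = model(messages)
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])  # dispatch tool
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not converge")

print(run_agent("Where is my order?"))
```

What differentiates the platforms is everything around this loop — which identity the tool call runs under, where `messages` persists, and what the observability surface sees — not the loop itself.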
Deep dive: compliance and contract terms
| Compliance Posture | Bedrock | Foundry | Vertex AI |
|---|---|---|---|
| FedRAMP High | Yes (GovCloud) | Yes (Azure Government) | Yes (GCC) |
| DoD IL4 / IL5 | Yes (Bedrock IL4/5) | Yes (Azure Gov, OpenAI in IL5) | Limited (IL2 standard) |
| HIPAA BAA | Yes (all in-scope models) | Yes (Azure OpenAI in-scope) | Yes (Vertex in-scope) |
| SOC 2 Type II / ISO 27001 | Yes | Yes | Yes |
| Default Zero Data Retention | Contractual, no per-request flag | EA-gated, per-resource, opt-out reserved | Contractual on Vertex |
| Regional residency | 30+ regions, PrivateLink | 60+ regions, Private Link | 40+ regions, Private Service Connect |
| EU sovereign deployment | Yes (Bedrock EU) | Yes (EU Data Boundary) | Yes (EU sovereign regions) |
For "no prompt or completion data ever persists outside our boundary" requirements, Bedrock and Vertex have the cleaner contracts in 2026 — both offer ZDR as a contractual default with no per-request flagging. Azure's ZDR posture requires an Enterprise Agreement and per-resource Microsoft approval, and OpenAI reserves the right to revoke ZDR for GPT-5.5 in severe-risk investigations. For purely federal workloads, all three clear FedRAMP High; for DoD IL5 specifically, Bedrock and Azure Government are the deeper paths.
Deep dive: cross-cloud and lock-in
The honest answer on "how locked in am I":
- Bedrock is the most model-portable. You can call Claude on Bedrock from any environment with AWS IAM (cross-account, federated). You can switch models within Bedrock without re-architecting (same SDK, same auth, same observability). Cross-cloud — pulling Bedrock into Azure or GCP workloads — requires you to route traffic through AWS, which is a non-trivial networking project but does happen in multi-cloud enterprises.
- Foundry is the most Azure-locked. OpenAI endpoints are Azure-native; pulling them into AWS or GCP workloads is uncommon and operationally awkward (egress costs, latency, identity reconciliation). If you want OpenAI frontier outside Azure, the practical option is the OpenAI direct API, which loses Foundry's compliance posture.
- Vertex AI is GCP-native. Gemini outside GCP is available via the Google AI Studio API (consumer-grade compliance, not enterprise) or the Gemini API on Vertex (GCP-only). Claude on Vertex is GCP-only by definition. Multi-cloud teams using Vertex generally accept that as a GCP-side commitment.
The pattern we see in enterprise deployments: teams that explicitly want hyperscaler optionality default to Bedrock for the model surface. Teams that have already standardized on Microsoft 365 or GCP end up on Foundry or Vertex by gravity, not by evaluation. If your CTO has not picked a primary cloud yet, the LLM platform is a poor first reason to pick one.
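The "Claude via Bedrock from any environment with AWS IAM" pattern is, mechanically, just a signed API call: a non-AWS environment needs credentials and network reachability, nothing else. A sketch using the shape of Bedrock's Converse API — the model ID, region, and response path are assumptions, and the request builder is separated out so it can be inspected without credentials:

```python
# Calling a Bedrock-hosted model from any environment holding AWS
# credentials. Model ID and region below are illustrative.

def build_converse_request(model_id: str, prompt: str) -> dict:
    """Request dict in the shape Bedrock's Converse API expects."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }

req = build_converse_request(
    "anthropic.claude-sonnet-v1",        # hypothetical model ID
    "Summarize our data-residency obligations.",
)

if __name__ == "__main__" and False:     # flip to True with real creds
    import boto3
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    resp = client.converse(**req)
    print(resp["output"]["message"]["content"][0]["text"])
```

The hard part is not this call — it is the cross-cloud networking, egress cost, and identity federation that let a workload in another cloud reach the Bedrock endpoint at all.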
When to choose which
Choose AWS Bedrock if you:
- Run on AWS or are explicitly multi-cloud
- Want Claude, Llama, Mistral, Cohere, and Amazon-native models under one contract
- Need FedRAMP High plus DoD IL4 or IL5
- Want the cleanest Zero Data Retention contract terms with no per-request flagging
- Are building long-lived agents and want managed runtime under AWS IAM
Choose Azure AI Foundry if you:
- Are already on Microsoft 365 and want AI inside the same compliance boundary
- Need OpenAI frontier (GPT-5.5, Sora, Realtime voice) on day one
- Want agents that operate on Outlook, Teams, SharePoint, and Sentinel data natively
- Have an Enterprise Agreement procurement preference
- Run federal workloads through Azure Government
Choose Google Vertex AI if you:
- Run on GCP or want to add LLM workloads to BigQuery and Vertex pipelines
- Need native multimodal (text, image, video, audio) at the lowest price tier
- Are building agents that query structured enterprise data
- Are price-sensitive at scale and your contexts stay under 200K tokens
- Want the deepest first-party multimodal model line (Gemini 3)
Alternatives to consider
If none of the three primary platforms fit:
- Anthropic direct API: For pure Claude workloads where you want no hyperscaler intermediary. Loses the cross-cloud reachability and FedRAMP posture you get on Bedrock, but is operationally simpler for small teams. We discuss the trade-off in Open-Source vs Commercial LLMs.
- OpenAI direct API: For OpenAI workloads where Azure procurement is not viable. Loses Foundry's compliance and Microsoft 365 integration; gains the latest features fastest.
- Self-hosted open models on your own infrastructure: Llama 4, DeepSeek V3, or Mistral Large on your own GPUs or managed inference (Together AI, Anyscale, Fireworks). Best for full control and known-finite cost; worst for operational burden. See Self-Hosted vs Cloud AI for the deep trade-off.
- Inference aggregators (LiteLLM, OpenRouter): Model-neutral routing layers that sit above all three platforms. Operationally useful for multi-model abstractions; not a replacement for the underlying compliance contract.
Our recommendation
After deploying eight production AI systems across finance, manufacturing, retail, and B2B SaaS, here is how we actually advise clients on this choice:
Pick the platform from your cloud strategy, not from your model preference. If your data, identity, and observability are already in AWS, default to Bedrock. If they are in Azure, default to Foundry. If they are in GCP, default to Vertex. The cost of running an AI platform outside your primary cloud — egress, identity reconciliation, separate compliance posture — almost always outweighs the model preference that motivated the move.
Bedrock is the cleanest answer for teams without a strong cloud preference. If you are starting from neutral, the model breadth plus federal posture plus ZDR terms make Bedrock the lowest-regret pick. We have seen this play out in three of our last five enterprise selections.
Architect for model swappability even if you do not architect for cloud swappability. Use an abstraction layer (LiteLLM, Bedrock cross-model API, your own router) so the model is a config, not a commitment. The platform is a commitment; the model should not be.
Bottom line:
- Pick Bedrock if you are multi-cloud, model-agnostic, or on AWS
- Pick Foundry if you are on Microsoft and want OpenAI frontier first
- Pick Vertex AI if you are on GCP, multimodal-heavy, or agent-focused
FAQ
Is AWS Bedrock cheaper than Azure OpenAI?
Headline price: Bedrock and Azure both pass through the model provider's list price, so the per-token rate is identical for the same model on the same tier. Real cost: Bedrock often comes out cheaper for enterprises with existing AWS EDP commitments (Bedrock spend rolls into the discount), and Bedrock's batch and cross-region inference options give more pricing levers. Azure's main lever is bundling Foundry into broader Microsoft 365 Copilot deals — for Microsoft-heavy enterprises that bundle can be the single largest discount. Model the workload under your actual contract terms before deciding.
Can I run OpenAI's GPT-5.5 on Bedrock or Vertex AI?
No. As of 2026, OpenAI's frontier models are available only through Azure AI Foundry (under the OpenAI partnership) and the OpenAI direct API. AWS Bedrock and Google Vertex AI do not host first-party OpenAI models. If you need GPT-5.5 and you are not on Azure, the practical option is the OpenAI direct API — but you lose Foundry's compliance posture and the Microsoft procurement lever.
Which platform has the best agent-building tools?
It depends on the workload. Bedrock's Strands SDK and AgentCore runtime give you the most model-flexible agent stack inside AWS. Foundry's Agent Service plus Semantic Kernel is the strongest for agents that operate on Microsoft 365 data (Outlook, Teams, SharePoint, Sentinel). Vertex AI's Agent Builder, ADK, and Agent Engine are the strongest for agents that query structured enterprise data through BigQuery and Vertex AI Search. The platform's agent stack rarely decides the choice on its own — it tracks the cloud the enterprise is already on.
How does Zero Data Retention compare across the three platforms?
Bedrock and Vertex AI both offer ZDR as a contractual default with no per-request flagging — sign the contract and prompts and completions are not retained. Azure AI Foundry requires an Enterprise Agreement, per-resource Microsoft approval, and OpenAI explicitly reserves the right to revoke ZDR for GPT-5.5 in severe-risk investigations. For "no data persists under any circumstance" requirements, Bedrock and Vertex have the cleaner contracts in 2026.
Should I sign one platform contract or multiple?
For most enterprises, one platform under one cloud. Running two LLM platforms in parallel doubles your compliance, identity, networking, and procurement overhead — and the model swappability you gain is almost always achievable inside a single platform (Bedrock alone hosts Claude, Llama, Mistral, and Cohere). The exception is teams that genuinely need both OpenAI frontier and the broader Anthropic and Llama catalog under enterprise terms — in that case Foundry plus Bedrock is the cleanest two-contract setup, and you accept the operational doubling as the cost of model breadth.
Need help with AI implementation?
We build production AI systems that actually ship. Not demos, not POCs — real systems that run your business.
Get in Touch