Stacks we can operate
The stacks we use to ship AI to production, and the decisions that make us choose one over another.
How we choose
State of the art is an operating decision, not a logo list.
The question is not which cloud or framework is fashionable. The question is where the system can be evaluated, deployed, observed, audited, and transferred to the team that will operate it.
Agent orchestration
LangGraph when the workflow needs state, replay, retries, human checkpoints, and explicit escalation paths.
Enterprise documents
Azure when Microsoft identity, Office, tenant boundaries, procurement, and compliance are on the critical path.
Governed runtime
Google Cloud when GKE, sandboxes, NetworkPolicy, PR environments, or agent runtimes define the product.
Operational control
AWS when the client already runs IAM, VPCs, queues, logs, security review, and audited accounts in production.
The standard that does not change by stack
- The stack is not chosen by vendor preference. It is chosen on eval results, security, operations, cost, and the capacity of the team receiving the handover.
- Every stack needs the same operating kit: permissions, traces, model routing, cost budget, runbook, and rollback.
- We do not abstract across clouds for aesthetics. We abstract only when it reduces real risk or protects portability the business needs.
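One way to make the operating kit concrete is a per-stack checklist that is the same on every cloud. The sketch below is illustrative only: `OperatingKit`, `missing_items`, and `ready_for_handover` are hypothetical names, not part of any vendor SDK, and a real review would attach evidence to each item rather than a boolean.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OperatingKit:
    """One flag per operating-kit item. Field names are illustrative assumptions."""
    permissions: bool = False
    traces: bool = False
    model_routing: bool = False
    cost_budget: bool = False
    runbook: bool = False
    rollback: bool = False


def missing_items(kit: OperatingKit) -> list[str]:
    """Return the kit items that are not yet in place, in declaration order."""
    return [f.name for f in fields(kit) if not getattr(kit, f.name)]


def ready_for_handover(kit: OperatingKit) -> bool:
    """A stack is ready only when no kit item is missing."""
    return not missing_items(kit)


# Hypothetical stack with three of the six items in place.
example = OperatingKit(permissions=True, traces=True, runbook=True)
print(missing_items(example))
print(ready_for_handover(example))
```

The same check runs unchanged whether the stack is AWS, Azure, Google Cloud, or LangGraph on any of them, which is the point: the kit is the constant, the stack is the variable.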
Stack notes
Where each stack fits
AWS
Google Cloud
LangGraph
Microsoft Azure
Already have the stack decided?
Good. On the first call we check whether the chosen stack can support evals, tool use, observability, cost controls, security, and handover without inventing a parallel platform.