Stacks we can operate

The stacks we use to ship AI to production, and the decisions that make us choose one over another.

How we choose

State of the art is an operating decision, not a logo list.

The question is not which cloud or framework is fashionable. The question is where the system can be evaluated, deployed, observed, audited, and transferred to the team that will operate it.

Agent orchestration

LangGraph when the workflow needs state, replay, retries, human checkpoints, and explicit escalation paths.
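What "state, replay, retries, human checkpoints, and explicit escalation paths" means in practice can be sketched in plain Python. This is a minimal illustration, not LangGraph's API: `RunState`, `run_workflow`, the step names, and the retry limit are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RunState:
    """Explicit workflow state. The history list is what makes a run resumable."""
    payload: dict
    history: list = field(default_factory=list)
    status: str = "running"  # running | awaiting_human | escalated | done


def run_workflow(state, steps, max_retries=2,
                 needs_approval=frozenset(), approved=frozenset()):
    """Run named steps in order, with retries, a human checkpoint, and escalation."""
    for name, step in steps:
        # Replay: skip any step that already succeeded in an earlier run.
        if (name, "ok") in state.history:
            continue
        # Human checkpoint: pause until someone approves this step.
        if name in needs_approval and name not in approved:
            state.history.append((name, "awaiting_human"))
            state.status = "awaiting_human"
            return state
        for _attempt in range(max_retries + 1):
            try:
                state.payload = step(state.payload)
                state.history.append((name, "ok"))
                break
            except Exception as exc:
                state.history.append((name, f"error: {exc}"))
        else:
            # Retries exhausted: stop and hand the run to a person.
            state.status = "escalated"
            return state
    state.status = "done"
    return state


# Hypothetical steps: the first call to extract() fails to exercise the retry path.
calls = {"n": 0}


def extract(payload):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("timeout")
    return {**payload, "extracted": True}


def send(payload):
    return {**payload, "sent": True}


steps = [("extract", extract), ("send", send)]
state = run_workflow(RunState({"doc": "invoice.pdf"}), steps, needs_approval={"send"})
paused_status = state.status  # the run pauses before "send"
state = run_workflow(state, steps, needs_approval={"send"}, approved={"send"})
```

The first call retries `extract` once and then pauses before `send`. The second call resumes from the recorded history instead of re-running the completed step.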

Enterprise documents

Azure when Microsoft identity, Office, tenant boundaries, procurement, and compliance are on the critical path.

Governed runtime

Google Cloud when GKE, sandboxes, NetworkPolicy, PR environments, or agent runtimes define the product.

Operational control

AWS when the client already runs IAM, VPCs, queues, logs, security review, and audited accounts as production muscle.

The standard that does not change by stack

The stack is not chosen by vendor preference. It is chosen by evals, security, operations, cost, and the team receiving the handover.
Every stack needs the same operating kit: permissions, traces, model routing, cost budget, runbook, and rollback.
We do not abstract across clouds for aesthetics. We abstract only when it reduces real risk or protects portability the business needs.
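The operating kit can be treated as a checklist that is independent of the stack. The sketch below is one possible way to express it: the six item names come from the list above, while `missing_kit` and the example configuration values are hypothetical.

```python
# The six operating-kit items named above, as checklist keys.
OPERATING_KIT = (
    "permissions",
    "traces",
    "model_routing",
    "cost_budget",
    "runbook",
    "rollback",
)


def missing_kit(stack_config: dict) -> list:
    """Return the kit items that are absent or empty in a stack description."""
    return [item for item in OPERATING_KIT if not stack_config.get(item)]


# Hypothetical description of a candidate stack; the values are placeholders.
candidate = {
    "permissions": "roles documented",
    "traces": "enabled",
    "cost_budget": 500,
    "runbook": None,  # planned but not written yet
}

gaps = missing_kit(candidate)
print(gaps)  # ['model_routing', 'runbook', 'rollback']
```

The same check applies whichever cloud or framework is behind the description, which is the point of keeping the standard separate from the stack.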

Stack notes

Where each stack fits

Already have the stack decided?

Good. On the first call we check whether the chosen stack can support evals, tool use, observability, cost controls, security, and handover without inventing a parallel platform.

