The Release Should Explain Itself

Production Kubernetes delivery is no longer only human operators running Helm from CI. Teams are beginning to let AI agents inspect incidents, prepare migrations, and request changes. That changes the acceptance bar. RBAC can say who may write. GitOps can say what the desired state should be. Neither explains why this particular stateful mutation is safe right now. Torque fills that gap by turning the change into evidence before execution: a proof graph, a gate report, a policy decision, an authorization record, and a replayable stack ledger. The result is a production change surface where a human, a CI job, or an agent can propose work, but the write proceeds only when the proof matches the policy, and the final run can be inspected later.

Argo CD and Crossplane were designed before agentic operations became the center of gravity. That is not a criticism of either project. Argo CD is a strong GitOps control plane: it compares Git state with cluster state and keeps applications synced. Crossplane is a strong infrastructure control plane: it turns cloud and platform resources into Kubernetes APIs. Both are useful.

But an AI agent asking to mutate production creates a different problem:

What proof makes this write safe enough to allow?

Torque's operating loop is intentionally narrow and proof-gated:

The agent asks to mutate.
Torque checks the proof.
Torque checks the policy.
Torque records authorization.
The mutation is allowed only with a passing gate.

I tested that loop end to end against a production Kubernetes cluster. The workload was not a synthetic terminal transcript and not a hello world chart. It was an Oracle/APEX-to-PostgreSQL modernization stack running against a live PostgreSQL target inside Kubernetes.

The durable evidence for the run stayed with the release artifacts under /root/torque-agentic-ops-e2e-20260523-095146.

The Workload

The stack was deliberately stateful. It modeled the kind of production change that usually falls between infrastructure-as-code, ticket comments, and a shell script someone hopes they can reconstruct later:

prove the Kubernetes target side is ready;
record change-window approval;
freeze the source Oracle/APEX system;
export source data;
create a PostgreSQL restore point;
expand the target schema;
backfill rows into shadow tables;
verify row counts and route state;
commit the cutover;
promote the application route;
contract the schema;
run a post-cutover check;
audit and export the run ledger.
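
The list above is an ordered program, not a set of independent resources: a failed verification must keep the cutover from ever running. As a rough illustration only, the sketch below runs named steps in declared order and stops at the first failure. The names run_in_order, StepResult, and RunLedger are hypothetical and are not Torque's API or stack format.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool


@dataclass
class RunLedger:
    """Hypothetical in-memory record of which steps ran and how they ended."""
    results: List[StepResult] = field(default_factory=list)

    @property
    def succeeded(self) -> bool:
        # An empty run is not a success; every recorded step must have passed.
        return bool(self.results) and all(r.ok for r in self.results)


def run_in_order(steps: List[Tuple[str, Callable[[], bool]]]) -> RunLedger:
    """Run steps in declared order and stop at the first failure."""
    ledger = RunLedger()
    for name, action in steps:
        ok = bool(action())
        ledger.results.append(StepResult(name, ok))
        if not ok:
            break  # later steps (for example the cutover) never run
    return ledger
```

With a failing row-count check placed before the cutover step, the ledger ends at the failed check and the cutover is never attempted.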

The Kubernetes side created an isolated namespace, deployed PostgreSQL in the production cluster, and port-forwarded it to the Torque process. After the authorized stack apply completed, PostgreSQL reported:

stage rows: 3
shadow rows: 3
route flags: true,true,true
migration audit phase: contracted

That matters because this was more than "an agent ran a command." The agent was asking to run a stateful production change program with sequencing, evidence, and irreversible-looking steps.

The Proof Gate

Before the mutation, Torque built a signed proof graph around the proposed change. The graph linked the stack plan, verifier report, digest-pinned image reference, BuildKit capture placeholder, SBOM, provenance, server dry-run evidence, runtime drift proof, rollout event proof, logs capture, SLO outcome, and repair channel.

proof gate: passed
gate checks: 23
release score: 100
flight events: 20
proof graph artifacts: 20
checked files: 12

The graph was signed with an ed25519 key generated for the run, and torque proof verify --require-signature verified the signature and file hashes.
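
To make the file-hash half of that check concrete, here is a minimal sketch of comparing files on disk with recorded SHA-256 digests. It assumes a simple name-to-digest mapping; the real layout of proof.graph.json is not shown in this post, so the manifest shape and the function names are hypothetical, and signature verification is left out entirely.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(manifest: Dict[str, str], root: Path) -> List[str]:
    """Return names that are missing or whose digest differs from the manifest.

    `manifest` maps a relative file name to its recorded hex digest
    (an assumed shape, not Torque's actual graph format).
    """
    changed = []
    for name, expected in sorted(manifest.items()):
        path = root / name
        if not path.is_file() or sha256_file(path) != expected:
            changed.append(name)
    return changed
```

An empty result means every listed file still matches its recorded digest. Any edit made after the digests were recorded shows up by name.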

Graph: signed file and artifact hashes.
Gate: required checks passed before mutation.
Policy: operation had to be explicitly allowed.
Ledger: stack run captured and exported.

The useful output was not only that the database ended in the right state. The run left behind a signed graph, a gate report, a score report, a flight file, replay output, an agent policy record, a release attestation, stack capture, stack audit, stack export, and event-chain integrity checks.

The Agent Was Denied Twice

The first request looked like an agent asking to perform a mutating stack apply:

{
  "actor": "codex-agent",
  "operation": "stack-apply",
  "command": ["torque", "stack", "apply", "--config", "./stack.yaml", "--yes"],
  "release": "oracle-postgres-k8s",
  "namespace": "data-platform",
  "reason": "migrate Oracle/APEX account data to the PostgreSQL target stack"
}

Torque denied it because the check carried no explicit allow-list entry (the command has no --allow flag):

torque agent policy check agent-request.json \
  --proof proof.graph.json \
  --require-gate \
  --out agent-policy-denied-no-allow.json

That failure is important. A passing proof graph is not enough by itself. The requested operation still has to be explicitly allowed.

Then I tampered with the verifier evidence and tried again with the operation allowed:

torque agent policy check agent-request.json \
  --proof proof.graph.json \
  --allow stack-apply \
  --require-gate \
  --out agent-policy-denied-tampered.json

Torque denied the request again because the proof gate no longer passed. The agent had permission, but the evidence had been changed after the graph was signed.

This is the operating model demonstrated by the run: no permission-only writes, and no proof-only writes. Mutating automation needs both.
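
The two denials and the rule behind them can be sketched as a small decision function. This is an illustration of the both-conditions model described here, not Torque's implementation: the Request, Proof, and decide names are hypothetical, and the release/namespace match is included because the request is expected to match the proof's release and namespace.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Request:
    operation: str
    release: str
    namespace: str


@dataclass(frozen=True)
class Proof:
    gate_passed: bool  # assumed to be False once evidence no longer verifies
    release: str
    namespace: str


def decide(request: Request, proof: Proof,
           allowed: FrozenSet[str]) -> Tuple[bool, List[str]]:
    """Allow only when permission AND proof both hold; collect every denial reason."""
    reasons = []
    if request.operation not in allowed:
        reasons.append("operation not explicitly allowed")
    if not proof.gate_passed:
        reasons.append("proof gate did not pass")
    if (request.release, request.namespace) != (proof.release, proof.namespace):
        reasons.append("request does not match proof release/namespace")
    return (not reasons, reasons)
```

An empty allow-list reproduces the first denial, and a proof with gate_passed set to False reproduces the second even when stack-apply is allowed.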

The Agent Was Authorized, But Did Not Execute

After restoring the evidence, the policy check passed:

torque agent policy check agent-request.json \
  --proof proof.graph.json \
  --allow stack-apply \
  --require-gate \
  --out agent-policy-allowed.json

Then Torque wrote the authorization record:

torque agent run agent-request.json \
  --proof proof.graph.json \
  --allow stack-apply \
  --require-gate \
  --out agent-run.json

The run record said authorized: true and executed: false. That is the boundary: torque agent run does not mutate the cluster or the database. It only records that the caller is allowed to perform the write.
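
One way to picture that boundary is a record type that cannot carry out the write itself. The sketch below is hypothetical: only the authorized and executed field names come from the run record described above, and the remaining fields and functions are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorizationRecord:
    actor: str
    operation: str
    authorized: bool
    executed: bool = False  # writing the record never performs the mutation


def authorize(actor: str, operation: str, policy_allowed: bool) -> AuthorizationRecord:
    """Record the policy outcome; nothing here touches a cluster or database."""
    return AuthorizationRecord(actor, operation, authorized=policy_allowed)


def may_execute(record: AuthorizationRecord, operation: str) -> bool:
    """A separate executor would consult the record before doing the write."""
    return record.authorized and record.operation == operation
```

Keeping the record immutable and side-effect free means the write has to arrive as its own explicit step, which is the separation this section describes.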

Only after that did the actual write happen as a separate, explicit operation:

torque stack apply \
  --config ./stack.yaml \
  --yes \
  --capture=./runtime/stack-apply.sqlite

That separation is the right shape for agents in production. They should not be trusted because a prompt says the change is safe. They should carry a signed graph, pass a gate, match an explicit policy, and leave behind a record that a human can read later.

The Mutation

Once proof and policy passed, Torque ran the stack apply against the live PostgreSQL target in Kubernetes. The run succeeded and produced:

stack run status: succeeded
stack run id: 2026-05-23T10-02-37.032394298Z
stack audit artifacts: 30
stack audit events: 108
event-chain integrity: true
run-digest integrity: true
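
How Torque computes event-chain integrity is not described in this post. A common way to get that property is a hash chain, where each event's hash covers the previous event's hash, so editing or dropping an earlier event breaks every later link. The sketch below shows that general technique under assumed names (link_hash, build_chain, chain_intact) and an assumed JSON encoding.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed starting value for the first link


def link_hash(prev_hash: str, payload: Dict) -> str:
    """Hash the previous link together with a canonical encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: List[Dict]) -> List[Dict]:
    """Turn a list of event payloads into linked entries."""
    chain, prev = [], GENESIS
    for payload in payloads:
        current = link_hash(prev, payload)
        chain.append({"payload": payload, "prev": prev, "hash": current})
        prev = current
    return chain


def chain_intact(chain: List[Dict]) -> bool:
    """Recompute every link; any edited, reordered, or removed event fails."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != link_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Under this scheme, an integrity flag like the one reported above would correspond to chain_intact returning True over all recorded events.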

Torque then exported the stack run:

oracle-postgres-run.tgz
sha256: 22399861e1073d24219512d5a323c23c0c92295ceea88d905a06495d69af72f9

The stack capture was also preserved:

stack-apply.sqlite
sha256: 8893d62ed389ed3f8d23a19de9109a2fb1aa2bba5616cfac8bb0fc0ca250f0ba

So the final state is reviewable in three ways:

the signed proof graph explains why the mutation was allowed;
the agent policy and run records explain who was authorized and why;
the stack audit/export artifacts explain what happened during execution.

Why This Beats A Generic GitOps Story

Argo CD can keep ordinary applications synced. Crossplane can provision infrastructure. Torque is not a slightly different version of either one.

Torque owns high-assurance production change control:

this proposed write is tied to these exact files;
those files still hash to the signed graph;
the graph includes the required evidence;
the release gate passed;
the requested operation is explicitly allowed;
the request matches the proof release and namespace;
the agent authorization is recorded;
the actual mutation produced a replayable run ledger.

That is not just sync. It is production change control.

This is where Torque is strongest: database migration, backfill, cutover, contract, cleanup, incident recurrence, rollback evidence, and risky repair. These changes do not fit cleanly into "desired state equals live state." They are programs with risk, sequence, checkpoints, approval, and audit requirements.

Most tools help you deploy. Torque makes the whole release explainable after the fact.

Agents can ask to change production, but Torque only lets them proceed when proof, policy, and authorization all line up.

That gives Torque a sharper category than "another GitOps tool": proof-gated change control, replayable production evidence, agent authorization before mutation, stateful workload graphs, and audit artifacts that survive the terminal session.

The completed flow is concrete: Torque built a signed proof graph, verified the gate, denied a request that lacked an explicit allow decision, rejected tampered evidence, recorded a non-mutating authorization, executed the stack change separately, and preserved the audit/export ledger. Argo CD and Crossplane remain the right tools for syncing ordinary apps and provisioning infrastructure. The path shown here covers a different case: an agent requests a risky mutation, and the decision needs evidence, not just RBAC.