Where ZK proofs meet on-chain AI agents.
If an AI agent is making trading or research decisions on-chain, how do you prove the inference was correct without revealing the model weights? That is a ZK problem. This doc answers it — with circuit-level detail, real error messages, and real mitigations.
What is ZKML?
Zero-Knowledge Machine Learning (ZKML) is the application of ZK proof systems to machine learning inference. It lets you prove that a model ran correctly on a given input and produced a given output — without revealing the model weights, the input data, or any intermediate computation.
Think of it this way: a ZK proof is a receipt. It doesn't tell you what was cooked — only that the chef followed the recipe correctly. ZKML applies this property to neural network inference.
Privacy-preserving
Model weights and input data never leave the prover's machine. Only the proof and output go on-chain.
Verifiable inference
Any verifier can check the proof in milliseconds. No need to re-run the model or trust the agent.
On-chain composable
The proof output becomes a first-class on-chain primitive. Smart contracts can act on verified AI decisions.
Why it matters for on-chain AI agents
In 2026, AI agents are executing DeFi trades, generating governance votes, and routing cross-chain liquidity. The trust problem is acute: how does a smart contract know the agent's decision was legitimate?
Without ZKML, you have three bad options. The table compares them with the ZKML approach:
| Approach | How it works | Problem | Status |
|---|---|---|---|
| Trusted oracle | Agent posts result; oracle signs it | Requires trusting a third party | Centralised |
| On-chain inference | Run the model fully in the EVM | Gas cost is catastrophically high for any real model | Infeasible |
| Optimistic execution | Post result; dispute window for challenges | Long finality; no weight privacy | Partial |
| ZKML proof | Prove inference off-chain; verify on-chain | Proving time; quantisation constraints | Best available |
ZKML is not a perfect solution: proving time for large models is still a bottleneck, and most production implementations require quantised (integer-only) models. Of the four approaches above, though, it is the only one that delivers trustless, privacy-preserving, on-chain-verifiable AI decisions.
How a proof of inference works
At the highest level, a ZKML proof answers one question: "Given input X, did this model produce output Y?" The answer is a cryptographic proof that any verifier can check without re-running the model.
Model compilation → arithmetic circuit
The neural network (weights, activations, layers) is compiled into an arithmetic circuit — a directed acyclic graph of addition and multiplication gates over a finite field. Every ReLU, every matrix multiply, every softmax becomes a set of field arithmetic constraints.
This is where quantisation matters: ZK circuits operate over integers modulo a prime. Floating-point operations must be approximated using fixed-point integer arithmetic (typically 16-bit or 8-bit). A model that was trained in float32 must be re-quantised before it can be expressed as a circuit.
Witness generation
The witness is the private input to the proof: model weights + activation values at every layer for a specific input. The prover (your AI agent) runs the model locally and records every intermediate value. This is the "secret" that the proof will reference without revealing.
In Midnight's Kachina model, this maps to private state: data that lives only on the prover's machine and never touches the public ledger.
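The witness-generation step described above can be sketched in plain Rust. This is a minimal illustration, not any backend's actual API: the `Witness` struct, `dense_relu`, and `generate_witness` are hypothetical names, and the model is assumed to be a stack of dense layers with ReLU activations in Q16.16 fixed-point.

```rust
/// Q16.16 fixed-point scale (assumed for this sketch).
const SCALE_BITS: u32 = 16;

/// Everything the prover records for one inference (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct Witness {
    /// Input to the first layer (private).
    pub input: Vec<i64>,
    /// Post-activation output of every layer, in order (private).
    pub layer_outputs: Vec<Vec<i64>>,
}

/// One dense layer: y = relu((W · x) >> SCALE_BITS).
fn dense_relu(weights: &[Vec<i64>], x: &[i64]) -> Vec<i64> {
    weights
        .iter()
        .map(|row| {
            let dot: i64 = row.iter().zip(x).map(|(w, v)| w * v).sum();
            (dot >> SCALE_BITS).max(0)
        })
        .collect()
}

/// Runs the model locally and records every intermediate activation.
pub fn generate_witness(layers: &[Vec<Vec<i64>>], input: &[i64]) -> Witness {
    let mut layer_outputs = Vec::with_capacity(layers.len());
    let mut current = input.to_vec();
    for weights in layers {
        current = dense_relu(weights, &current);
        layer_outputs.push(current.clone());
    }
    Witness { input: input.to_vec(), layer_outputs }
}
```

Recording the output of every layer, rather than only the final one, is what lets the circuit constrain each layer independently.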
Constraint satisfaction check
The prover checks that the witness satisfies every constraint in the circuit. For a layer with weights W and input x, the constraint is simply: W·x = y. If any constraint fails, the proof cannot be generated — the computation was incorrect.
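As a rough illustration of the W·x = y check, the sketch below evaluates the constraint over a small prime field. The modulus 2^61 - 1 is chosen only so the arithmetic fits in `u128`; real backends use primes of roughly 254 bits. The names `to_field` and `layer_satisfied` are hypothetical.

```rust
/// Illustrative field modulus: the Mersenne prime 2^61 - 1.
/// Chosen for this sketch only; production fields are far larger.
const P: u128 = (1u128 << 61) - 1;

/// Maps a signed integer into the field: a negative v becomes p - |v|.
fn to_field(v: i64) -> u128 {
    let m = P as i128;
    (((v as i128) % m + m) % m) as u128
}

/// Checks the linear constraint W · x = y (mod p) for one layer.
/// Returns false on any shape mismatch or unsatisfied row.
pub fn layer_satisfied(weights: &[Vec<i64>], x: &[i64], y: &[i64]) -> bool {
    if weights.len() != y.len() {
        return false;
    }
    weights.iter().zip(y).all(|(row, &yi)| {
        if row.len() != x.len() {
            return false;
        }
        // Accumulate the dot product with a reduction after every term.
        let dot = row.iter().zip(x).fold(0u128, |acc, (&w, &v)| {
            (acc + to_field(w) * to_field(v)) % P
        });
        dot == to_field(yi)
    })
}
```

If `layer_satisfied` returns false for any layer, the witness is inconsistent and no valid proof exists for it.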
ZK-SNARK proof generation
Using the circuit's proving key (generated during a setup phase) and the witness, the prover generates a succinct cryptographic proof. This proof is typically 256–512 bytes regardless of model size. Generation time scales with circuit size: tiny quantised models (under ~100K parameters) can prove in seconds, models in the low millions of parameters take tens of seconds to minutes, and large models may take hours on current hardware.
On-chain verification
The proof, the model's public commitment (a hash of the weights), and the output are submitted on-chain. A smart contract (or Compact circuit on Midnight) verifies the proof in constant time, typically a few elliptic curve pairing operations, so the verification cost does not depend on model complexity.
Circuit-level detail
Understanding what actually happens inside the circuit separates developers who can debug ZKML integrations from those who cannot.
Arithmetic circuits and R1CS
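Groth16 consumes constraints in Rank-1 Constraint System (R1CS) form. PLONK-family backends (PLONK, FFlonk, Halo2) use a gate-based arithmetisation instead, but R1CS remains a convenient way to reason about circuit cost. In R1CS, every operation in the circuit becomes a constraint of the form: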
-- R1CS constraint form
-- (a · b) = c where a, b, c are linear combinations of witness values
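-- A single ReLU with input x, output x_pos, and sign flag is_negative:
-- Constraint 1: is_negative * (1 - is_negative) = 0 (flag is boolean)
-- Constraint 2: x * (1 - is_negative) = x_pos (select x or 0)
-- A range check must also tie is_negative to the sign of x; it is counted separately.
ReLU constraints per activation: 2 (excluding the range check)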
Matrix multiply (m×n weight matrix): m×n constraints
Total constraints ≈ layers × (weights + activations × 2)
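Using the per-operation counts listed above (m×n per matrix multiply, 2 per ReLU), a quick estimator for a stack of dense layers might look like the following. `DenseLayer` and `estimate_constraints` are hypothetical names, and the result is a back-of-the-envelope figure that ignores range checks and rescaling.

```rust
/// Shape of one dense layer: `rows` outputs, `cols` inputs.
pub struct DenseLayer {
    pub rows: usize,
    pub cols: usize,
}

/// Rough R1CS size: one constraint per weight multiplication
/// plus two constraints per ReLU activation.
pub fn estimate_constraints(layers: &[DenseLayer]) -> usize {
    layers
        .iter()
        .map(|l| l.rows * l.cols + 2 * l.rows)
        .sum()
}
```

For a 64 → 32 → 10 network this gives (32×64 + 64) + (10×32 + 20) = 2,452 constraints.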
Fixed-point quantisation
All values in the circuit must be field elements — integers mod a large prime p. Float32 weights are converted to fixed-point integers by multiplying by a scale factor and rounding:
const SCALE: i64 = 1 << 16; // Q16.16 fixed-point
fn quantise(w: f32) -> i64 {
    (w as f64 * SCALE as f64).round() as i64
}
fn quantised_matmul(
    weights: &[Vec<i64>],
    input: &[i64],
) -> Vec<i64> {
    weights.iter().map(|row| {
        let dot: i64 = row.iter().zip(input)
            .map(|(w, x)| w * x)
            .sum();
        dot / SCALE // rescale after multiply
    }).collect()
}
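A short worked example may help show what the rescale step does and how much precision quantisation costs. The sketch repeats `SCALE` and `quantise` so it stands alone; `dequantise` and `fixed_mul` are hypothetical helpers added for illustration.

```rust
const SCALE: i64 = 1 << 16; // Q16.16 fixed-point

/// Float to fixed-point, rounding to the nearest representable value.
fn quantise(w: f32) -> i64 {
    (w as f64 * SCALE as f64).round() as i64
}

/// Fixed-point back to float, for inspecting results off-circuit.
fn dequantise(q: i64) -> f64 {
    q as f64 / SCALE as f64
}

/// Product of two Q16.16 values. The raw product carries SCALE twice,
/// so one factor of SCALE is divided out (truncating toward zero).
fn fixed_mul(a: i64, b: i64) -> i64 {
    (a * b) / SCALE
}
```

Here `quantise(0.5)` is 32768 and `quantise(-1.25)` is -81920; their fixed-point product is -40960, which dequantises to -0.625. Rounding bounds the error of a single quantised weight by half a unit, i.e. 1/(2·SCALE).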
Lookup tables for non-linear functions
Functions like ReLU, sigmoid, and softmax are expensive to express in raw R1CS because they require range checks and comparisons. Modern backends (PLONK, Halo2) support lookup tables — precomputed tables of valid (input, output) pairs that drastically reduce constraint count for non-linear activations:
// ReLU lookup: for every signed 16-bit value v in [-2^15, 2^15 - 1], store max(v, 0).
// Table keys are the unsigned 16-bit encodings (0..65536) of those values.
fn relu_table() -> Vec<(u64, u64)> {
    (0u64..65536).map(|v| {
        let signed = v as i16;
        (v, i16::max(signed, 0) as u64)
    }).collect()
}
// Each activation now costs 1 lookup instead of 2 R1CS constraints
// For a 1000-neuron hidden layer: saves ~1000 constraints per ReLU layer
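To show how a prover would consult such a table, here is a compact variant that stores the table as a flat array indexed by the unsigned encoding of the input. It is a sketch of the idea rather than any backend's lookup-argument API; `relu_lookup` is a hypothetical name.

```rust
/// ReLU table indexed by the unsigned 16-bit encoding of a signed input.
/// Entry i holds max(i reinterpreted as i16, 0).
pub fn relu_table() -> Vec<u16> {
    (0u32..=u16::MAX as u32)
        .map(|v| (v as u16 as i16).max(0) as u16)
        .collect()
}

/// Evaluates ReLU with a single table read instead of a comparison.
pub fn relu_lookup(table: &[u16], x: i16) -> i16 {
    table[x as u16 as usize] as i16
}
```

In a real circuit the prover would supply the (input, output) pair and the backend's lookup argument would prove the pair is a row of the table; the array index here plays that role.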
The Midnight approach — Compact & Kachina
Midnight is a privacy-first blockchain built by IOG that ships its own ZK-native smart contract language, Compact, and a proving system called Kachina. For ZKML use cases, Midnight's architecture is uniquely well-suited because it was designed from the ground up around the public/private state split that ZKML requires.
The public/private state split in practice
In Kachina's model, every Compact contract operates across two ledgers simultaneously — a public ledger visible to all, and a private ledger that exists only on the user's local machine. For a ZKML agent, this maps cleanly:
| ZKML concept | Midnight / Kachina equivalent | Lives where |
|---|---|---|
| Model weights | Private state (witnesses) | Agent's machine only |
| Input data | Private witness inputs | Agent's machine only |
| Inference output | Public state update | Public ledger |
| Proof of correctness | ZK-SNARK via Kachina | On-chain, verifiable |
| Model commitment | Merkle root of weight tree | Public ledger |
Writing a ZKML verifier in Compact
The following Compact contract accepts a ZK proof of inference and verifies that the model output exceeds a confidence threshold — without ever seeing the weights or input data.
// zkml_verifier.compact
// Verifies that an AI agent's inference was performed correctly
// before allowing an on-chain action (e.g. a trade execution).
pragma language_version ">=0.14.0";
import CompactStandardLibrary;
// ── Public state: what the chain knows ──────────────────
ledger {
model_commitment: Bytes<32>; // SHA-256 of quantised weight tensor
last_output: Field; // most recent verified inference output
inference_count: Uint<64>; // total verified calls (replay protection)
confidence_floor: Field; // minimum acceptable softmax confidence
}
// ── Circuit: the ZK constraint system ───────────────────
// Witnesses (private — never leave the prover)
// - model_weights: the quantised weight tensors
// - input_vector: the agent's input features
// - layer_outputs: intermediate activation values per layer
circuit verify_inference(
model_weights: Vector<Field, 1024>, // private
input_vector: Vector<Field, 64>, // private
layer_outputs: Vector<Field, 256>, // private
claimed_output: Field, // public output
weight_commitment: Bytes<32>, // public commitment
): Field {
// 1. Verify the weight commitment matches on-chain record
assert sha256(model_weights) == ledger.model_commitment
"Weight commitment mismatch: model has changed or is incorrect";
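// 2. Check the ReLU range constraint on every recorded activation.
// Each layer should satisfy output[i] = relu(weights[layer][i] · input);
// this sketch checks only non-negativity and does not re-derive the matrix multiply.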
for i in 0..layer_outputs.length {
assert layer_outputs[i] >= 0 // ReLU: non-negativity
"Layer output violates ReLU constraint at index i";
}
// 3. Verify claimed output matches final layer computation
assert claimed_output == layer_outputs[layer_outputs.length - 1]
"Claimed output does not match final layer activation";
// 4. Enforce confidence floor (agent must be sufficiently certain)
assert claimed_output >= ledger.confidence_floor
"Inference confidence below threshold — action blocked";
return claimed_output;
}
// ── Transaction: called by the AI agent ─────────────────
export transaction submit_inference(
claimed_output: Field,
weight_commitment: Bytes<32>,
) {
const verified = verify_inference(
/* witnesses supplied by prover, not visible here */
claimed_output,
weight_commitment,
);
// Update public ledger state only after proof passes
ledger.last_output = verified;
ledger.inference_count = ledger.inference_count + 1;
}
Witnesses are declared in the circuit parameter list, and the Kachina proving system handles the rest. This is the key ergonomic advantage over writing raw Risc0 guest programs.
Kachina & private state management
Kachina is Midnight's proving system. It was designed to bridge private computation (agent-local) and public verification (on-chain) using ZK-SNARKs. For ZKML, Kachina solves a specific problem: the model weights must influence the proof without being disclosed to the verifier.
Kachina achieves this through transcripts — ordered records of all queries the computation makes against the private state. The agent proves, in zero-knowledge, that it possesses a private state whose transcript is consistent with the public state transition. The chain verifies the proof against the circuit; it never sees the underlying private values.
Concurrency and transcript reordering
One Kachina property that matters for high-frequency AI agents: Kachina supports concurrent proof submission. Multiple agents can submit inference proofs simultaneously. The protocol optimises conflicting transcripts and allows reorderings without breaking consistency. For a DeFi research agent that needs to post dozens of on-chain decisions per block, this is not academic — it is the feature that makes the architecture scale.
SP1 (Succinct) integration
SP1 is a zkVM by Succinct Labs. It proves execution of arbitrary Rust programs, which makes it practical for ZKML: you write the inference in Rust, compile it to a guest program, and SP1 generates the proof.
[package]
name = "zkml-inference-guest"
version = "0.1.0"
edition = "2021"
[dependencies]
sp1-zkvm = "3.0.0" # SP1 guest SDK
ndarray = "0.15" # matrix ops (no_std compatible)
[profile.release]
opt-level = 3
lto = "fat" # critical for proof size
// src/main.rs (guest program: runs inside the SP1 zkVM)
#![no_main]
sp1_zkvm::entrypoint!(main);
use sp1_zkvm::io;
use ndarray::{Array1, Array2};
fn relu(x: i64) -> i64 { i64::max(x, 0) }
pub fn main() {
    // Read private inputs from the prover
    let weights: Vec<i64> = io::read::<Vec<i64>>();
    let input: Vec<i64> = io::read::<Vec<i64>>();
    let rows = 64usize;
    let cols = input.len();
    // Single hidden layer: output = relu(W · x)
    let w = Array2::from_shape_vec((rows, cols), weights).unwrap();
    let x = Array1::from_vec(input);
    let hidden: Vec<i64> = w.dot(&x).iter().map(|&v| relu(v)).collect();
    // Commit the output publicly; the verifier will check this
    let output = hidden.iter().copied().max().unwrap_or(0);
    io::commit(&output);
}
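// Host-side sketch. The official sp1-sdk is a Rust crate; this listing assumes a
// TypeScript binding (@succinct/sp1-sdk) that exposes equivalent calls.
import { readFileSync } from "node:fs";
import { CpuProver, SP1Stdin } from "@succinct/sp1-sdk";
const ELF = readFileSync("./target/guest.elf");
async function proveInference(weights: bigint[], input: bigint[]) {
    const client = new CpuProver();
    const stdin = new SP1Stdin();
    stdin.writeVec(weights); // private: only the prover sees this
    stdin.writeVec(input);
    const { proof, publicValues } = await client.prove(ELF, stdin);
    // publicValues contains only committed outputs, not weights or inputs.
    // The guest commits a non-negative i64 (max over ReLU outputs), so reading it as u64 is safe.
    const output = publicValues.readU64();
    return { proof, output };
}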
Risc0 integration
Risc0 takes the same zkVM approach as SP1: the guest program is compiled to RISC-V, runs on a virtual RISC-V CPU, and Risc0 proves correct execution of the resulting machine code. The main differences are in the SDK, the proof pipeline, and the on-chain verifier tooling.
#![no_main]
risc0_zkvm::guest::entry!(main);
use risc0_zkvm::guest::env;
fn main() {
    // Read private witness data
    let weights: Vec<i32> = env::read();
    let input: Vec<i32> = env::read();
    let n = input.len();
    // This sketch assumes a square n×n weight matrix stored row-major
    assert_eq!(weights.len(), n * n, "weights must hold n*n entries");
    let mut hidden = Vec::with_capacity(n);
    for i in 0..n {
        let dot: i64 = (0..n)
            .map(|j| weights[i * n + j] as i64 * input[j] as i64)
            .sum();
        hidden.push((dot >> 16).max(0) as i32); // Q16 rescale + ReLU
    }
    // Journal = public output only (weights/input stay private)
    env::commit(&hidden);
}
Modulus Labs
Modulus Labs focuses specifically on ZKML — they build circuits for specific model architectures (primarily CNNs and small transformers) rather than a general-purpose zkVM. The tradeoff: significantly smaller proof sizes and faster proving for supported architectures, at the cost of flexibility.
Modulus's RockyML framework exports ONNX models directly to Halo2 circuits. If your model fits a supported architecture and is under ~10M parameters, Modulus's toolchain can generate the circuit automatically without you writing circuit code manually.
Troubleshooting — SP1
| Error | Cause | Fix |
|---|---|---|
| STARK_VERIFY_FAILED | Guest program panicked or produced inconsistent output between runs (non-determinism) | Audit for HashMap iteration order, SystemTime, or any non-deterministic source in the guest. SP1 guests must be fully deterministic. Replace HashMap with BTreeMap. |
| MEMORY_ACCESS_FAULT at 0x... | Guest accessed memory outside the allocated segment — typically an out-of-bounds slice access in matrix multiply | Add explicit bounds checks before indexing: assert!(i * n + j < weights.len()). Use safe indexing (.get()) during development. |
| CYCLE_LIMIT_EXCEEDED | Model is too large — guest program exceeds SP1's default cycle budget (100M cycles) | Reduce model size, quantise to 8-bit (halves cycle count), or use SP1's --cycle-limit flag to increase the budget. For models >5M parameters, expect to need 500M+ cycles. |
| io::read type mismatch | Host wrote Vec<i64> but guest reads Vec<i32>, or vice versa | Host and guest must use identical types for every stdin.write / io::read pair. Define a shared types.rs crate used by both host and guest to enforce this. |
| ELF loading failed: invalid magic | Guest binary was not compiled with cargo prove build (wrong target triple) | Always use cargo prove build (not cargo build) for the guest. The target must be riscv32im-succinct-zkvm-elf. |
| Groth16 proof too large for calldata | Submitting the raw STARK proof: the Groth16 wrapper is not enabled | Use client.prove_groth16() instead of client.prove() for on-chain submission. Groth16-wrapped proofs are ~300 bytes vs ~400KB for raw STARK. |
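The out-of-bounds fault in the table above is easiest to avoid by validating the weight buffer's shape once, before the inner loop. A minimal sketch, assuming a row-major flat weight vector as in the guest programs; `MatmulError` and `checked_matmul` are hypothetical names.

```rust
/// Error returned when the flat weight buffer does not match rows × cols.
#[derive(Debug, PartialEq)]
pub enum MatmulError {
    ShapeMismatch { expected: usize, actual: usize },
}

/// Row-major matrix-vector product that validates the shape up front,
/// so every later slice access is known to be in bounds.
pub fn checked_matmul(
    weights: &[i64],
    input: &[i64],
    rows: usize,
) -> Result<Vec<i64>, MatmulError> {
    let cols = input.len();
    let expected = rows * cols;
    if weights.len() != expected {
        return Err(MatmulError::ShapeMismatch { expected, actual: weights.len() });
    }
    Ok((0..rows)
        .map(|i| {
            let row = &weights[i * cols..(i + 1) * cols];
            row.iter().zip(input).map(|(w, x)| w * x).sum()
        })
        .collect())
}
```

Returning a `Result` lets the guest fail with a readable message instead of a raw memory fault.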
Troubleshooting — Risc0
| Error | Cause | Fix |
|---|---|---|
| ProverError: journal mismatch | Guest called env::commit() with different data across two proof attempts (non-determinism) | Same root cause as SP1's STARK_VERIFY_FAILED. Audit all HashMap usages, random number generation, and any syscall that depends on wall time. |
| ImageID mismatch | On-chain verifier has a different Image ID than the proof — guest code changed after verification key was deployed | Re-deploy the verifier contract whenever you update the guest binary. Risc0's Image ID is a hash of the compiled guest ELF — any code change produces a new ID. |
| Stack overflow in guest | Deep recursion or large stack-allocated arrays in the inference loop | Heap-allocate large arrays with Vec::with_capacity() instead of stack arrays. The Risc0 RISC-V guest stack is 256KB by default. |
| Segment size exceeded | Single execution segment is too large — occurs with models that have long sequential computation paths | Enable continuation mode: ExecutorEnv::builder().segment_limit_po2(22). This splits the proof across multiple segments and merges them. |
| Division by zero in circuit | Softmax or normalization layer divides by a sum that becomes zero after quantisation | Clamp the denominator to at least one quantised unit before dividing: let denom = sum.max(1). This is numerically safe for classification and avoids the circuit constraint violation. |
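The denominator clamp from the last row can be sketched as a small integer normalisation routine. This assumes non-negative Q16.16 scores (for example, post-ReLU logits); `normalise` is a hypothetical name, and a real softmax would also need an exponential approximation that is out of scope here.

```rust
const SCALE: i64 = 1 << 16; // Q16.16 fixed-point

/// Normalises non-negative scores to Q16.16 fractions of their total.
/// The denominator is clamped to 1 so an all-zero input cannot divide by zero.
pub fn normalise(scores: &[i64]) -> Vec<i64> {
    let sum: i64 = scores.iter().sum();
    let denom = sum.max(1);
    scores.iter().map(|&s| s * SCALE / denom).collect()
}
```

With scores `[1, 3]` the result is `[16384, 49152]`, i.e. 0.25 and 0.75 in Q16.16; with all-zero scores every output is simply 0.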
Troubleshooting — Midnight Compact
| Error | Cause | Fix |
|---|---|---|
| assert_failed: Weight commitment mismatch | The SHA-256 of the submitted weight tensor does not match ledger.model_commitment: weights changed or the wrong tensor was used | Re-register the model by calling the commitment registration transaction with the current weight tensor. Commitment must be resubmitted after any retraining or re-quantisation. |
| TypeError: Vector length mismatch | Circuit declared Vector<Field, 1024> but prover supplied a tensor with a different dimension | Compact's Vector types are fixed-length. Pad or trim input vectors to match the declared circuit dimensions. Use a wrapper that pads to the nearest power-of-two length. |
| PrivacyViolation: ledger field read in witness context | A circuit block tried to read a ledger field directly instead of passing it as a public parameter | Pass public ledger values into the circuit as explicit parameters. Circuits in Compact are stateless: they cannot directly access ledger state. Thread values through the transaction call instead. |
| CompilationError: Field overflow | An intermediate computation exceeded the field prime — common when multiplying two large fixed-point values | Rescale intermediate values more aggressively. After each matmul, right-shift by the scale factor before the next multiply: mid = (a * b) >> SCALE_BITS. Use 16-bit rather than 32-bit fixed-point if overflow persists. |
| npm ERR: 403 Forbidden (npm.pkg.midnight.network) | Missing authentication for Midnight's private npm registry | Run npm login --scope=@midnight-ntwrk --registry=https://npm.pkg.midnight.network and authenticate with your GitHub credentials. Add //npm.pkg.midnight.network/:_authToken=${NPM_TOKEN} to your .npmrc. |
| version mismatch: compact-compiler vs midnight-sdk | Compact compiler version is incompatible with the SDK version — check release notes for exact compatibility matrix | Run npx @midnight-ntwrk/midnight-version-check or check the version compatibility matrix. Pin both packages to matching versions in package.json. |
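For the fixed-length vector mismatch in the table above, the prover-side wrapper can be as small as the following sketch. It is plain Rust run before the values are handed to the circuit; `pad_to` is a hypothetical helper, not part of the Midnight SDK.

```rust
/// Zero-pads `values` to exactly N entries.
/// Returns None if the input is longer than N, rather than silently trimming.
pub fn pad_to<const N: usize>(values: &[i64]) -> Option<[i64; N]> {
    if values.len() > N {
        return None;
    }
    let mut out = [0i64; N];
    out[..values.len()].copy_from_slice(values);
    Some(out)
}
```

Rejecting over-long inputs is a deliberate choice in this sketch: trimming would drop features silently and change the inference result, whereas zero-padding leaves a dot product unchanged.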
Common patterns & mitigations
Non-determinism: the silent killer
The most common cause of proof failure across all three backends is non-determinism. ZK proofs require that the same inputs always produce the same intermediate values. Any source of randomness or ordering non-determinism in the guest will cause the prover to generate a witness that does not satisfy the circuit constraints.
Audit the guest for these common sources and remove them: HashMap / HashSet (use BTreeMap / BTreeSet) · SystemTime::now() · rand::thread_rng() · std::thread · any FFI call to non-deterministic host functions · floating-point arithmetic (results differ by platform and compiler flags).
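The HashMap replacement is the most mechanical of these fixes. The sketch below builds a feature vector from named inputs using `BTreeMap`, whose iteration order is defined by the keys and not by insertion order or a random hash seed. The feature names and `feature_vector` are hypothetical.

```rust
use std::collections::BTreeMap;

/// Flattens named features into a vector in sorted-key order.
/// BTreeMap guarantees the same order on every run and every machine.
pub fn feature_vector(features: &BTreeMap<&str, i64>) -> Vec<i64> {
    features.values().copied().collect()
}
```

Two maps built with different insertion orders produce the same vector, so the guest sees identical input on every proof attempt.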
Model size vs. proving time
Current benchmarks on commodity hardware (2026 pricing, ~$0.01/proof using cloud provers):
| Model size | Parameters | SP1 proving time | Risc0 proving time | Practical? |
|---|---|---|---|---|
| Tiny MLP | <100K | ~2s | ~3s | Yes |
| Small MLP | 1M | ~30s | ~45s | Yes |
| Medium CNN | 5M | ~4 min | ~6 min | Conditional |
| Small transformer | 20M | ~25 min | ~35 min | High latency |
| LLM (any) | >1B | Hours–days | Hours–days | Not yet |
For on-chain DeFi agents where latency matters, target models under 1M parameters with aggressive quantisation. Research agents with longer decision cycles (daily/weekly) can tolerate medium model sizes.
Glossary
Arithmetic circuit — a computation expressed as a directed acyclic graph of addition and multiplication gates over a finite field. The fundamental representation used by ZK-SNARK backends.
Compact — Midnight's smart contract language. Compiles to ZK-SNARK circuits via the Kachina proving system. Privacy-native: distinguishes public and private state at the language level.
Field element — an integer modulo a large prime p. All values in a ZK circuit must be field elements. Float32 values must be converted to fixed-point integers before circuit use.
Kachina — Midnight's proving system. Bridges private state (agent-local) and public state (on-chain) using ZK-SNARKs and a transcript-based concurrency model.
Proof of inference — a ZK-SNARK that certifies a specific neural network, applied to a specific input, produced a specific output — without revealing the network weights or input.
Quantisation — the conversion of floating-point model weights to fixed-point integer representations. Required for ZKML because ZK circuits operate over finite fields, not floats.
R1CS (Rank-1 Constraint System) — the constraint format consumed by Groth16 and many other ZK backends. Every circuit gate becomes a constraint of the form (a · b) = c.
Witness — the private input to a ZK proof. In ZKML: the model weights, input data, and all intermediate activation values. The proof certifies the witness satisfies the circuit constraints without revealing the witness itself.
zkVM (Zero-Knowledge Virtual Machine) — a system that proves correct execution of a program (typically written in Rust) without revealing the program's private inputs. SP1 and Risc0 are both zkVMs.
ZK-SNARK — Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. A proof system where: the proof is small (succinct), no back-and-forth is needed (non-interactive), and the prover demonstrates knowledge of a secret without revealing it (zero-knowledge).
Further reading
Midnight Developer Documentation — official Compact language reference, Kachina architecture, and getting-started guides.
Kachina: Foundations of Private Smart Contracts (Kerber, Kiayias, and Kohlweiss, 2020). The academic paper underlying Midnight's proving system.
SP1 by Succinct Labs — the zkVM powering the SP1 integration in this guide.
Risc0 — RISC-V based zkVM with mature on-chain verifier contracts.
Modulus Labs — ONNX-to-Halo2 circuit compiler for supported model architectures.
ZKML: Verifiable Inference for Machine Learning — foundational survey on the ZKML problem space.