Developer Documentation · ZKML

Where ZK proofs meet on-chain AI agents.

If an AI agent is making trading or research decisions on-chain, how do you prove the inference was correct without revealing the model weights? That is a ZK problem. This doc answers it — with circuit-level detail, real error messages, and real mitigations.

Midnight / Compact · SP1 (Succinct) · Risc0 · Modulus Labs · ZK-SNARKs

What is ZKML?

Zero-Knowledge Machine Learning (ZKML) is the application of ZK proof systems to machine learning inference. It lets you prove that a model ran correctly on a given input and produced a given output — without revealing the model weights, the input data, or any intermediate computation.

Think of it this way: a ZK proof is a receipt. It doesn't tell you what was cooked — only that the chef followed the recipe correctly. ZKML applies this property to neural network inference.

Privacy-preserving

Model weights and input data never leave the prover's machine. Only the proof and output go on-chain.

Verifiable inference

Any verifier can check the proof in milliseconds. No need to re-run the model or trust the agent.

On-chain composable

The proof output becomes a first-class on-chain primitive. Smart contracts can act on verified AI decisions.

Key distinction
ZKML is not about training models on-chain. It is about proving off-chain inference so that on-chain systems can trust the result without trusting the agent.

Why it matters for on-chain AI agents

In 2026, AI agents are executing DeFi trades, generating governance votes, and routing cross-chain liquidity. The trust problem is acute: how does a smart contract know the agent's decision was legitimate?

Without ZKML, you have three bad options:

Approach | How it works | Problem | Status
Trusted oracle | Agent posts result; oracle signs it | Requires trusting a third party | Centralised
On-chain inference | Run the model fully in the EVM | Gas cost is catastrophically high for any real model | Infeasible
Optimistic execution | Post result; dispute window for challenges | Long finality; no weight privacy | Partial
ZKML proof | Prove inference off-chain; verify on-chain | Proving time; quantisation constraints | Best available

ZKML is not a perfect solution — proving time for large models is still a bottleneck, and most production implementations require quantised (integer-only) models. But it is the only path to trustless, privacy-preserving, on-chain-verifiable AI decisions.

How a proof of inference works

At the highest level, a ZKML proof answers one question: "Given input X, did this model produce output Y?" The answer is a cryptographic proof that any verifier can check without re-running the model.

1. Model compilation → arithmetic circuit

The neural network (weights, activations, layers) is compiled into an arithmetic circuit — a directed acyclic graph of addition and multiplication gates over a finite field. Every ReLU, every matrix multiply, every softmax becomes a set of field arithmetic constraints.

This is where quantisation matters: ZK circuits operate over integers modulo a prime. Floating-point operations must be approximated using fixed-point integer arithmetic (typically 16-bit or 8-bit). A model that was trained in float32 must be re-quantised before it can be expressed as a circuit.

2. Witness generation

The witness is the private input to the proof: model weights + activation values at every layer for a specific input. The prover (your AI agent) runs the model locally and records every intermediate value. This is the "secret" that the proof will reference without revealing.

In Midnight's Kachina model, this maps to private state: data that lives only on the prover's machine and never touches the public ledger.

3. Constraint satisfaction check

The prover checks that the witness satisfies every constraint in the circuit. For a layer with weights W and input x, the constraint is simply: W·x = y. If any constraint fails, the proof cannot be generated — the computation was incorrect.

4. ZK-SNARK proof generation

Using the circuit's proving key (generated during a setup phase) and the witness, the prover generates a succinct cryptographic proof. This proof is typically 256–512 bytes regardless of model size. Generation time scales with circuit size: small quantised models (up to roughly 1M parameters) can prove in seconds to tens of seconds, while larger models take minutes to hours on current hardware.

5. On-chain verification

The proof, the model's public commitment (a hash of the weights), and the output are submitted on-chain. A smart contract (or Compact circuit on Midnight) verifies the proof in constant time, typically with a few elliptic curve pairing operations, so the verification cost does not grow with model complexity.
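
Steps 2 and 3 can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not how any particular prover is implemented: `Layer`, `layer_forward`, `generate_witness`, and `first_violation` are hypothetical names, every layer is dense with ReLU, and values are Q16.16 integers. A real circuit enforces the same relations over field elements instead of recomputing them.

```rust
/// Fractional bits of the assumed Q16.16 fixed-point format.
const SCALE_BITS: u32 = 16;

/// One dense layer with row-major Q16.16 weights (hypothetical type).
struct Layer {
    rows: usize,
    cols: usize,
    weights: Vec<i64>,
}

/// Compute relu((W · x) >> SCALE_BITS) for one layer.
fn layer_forward(layer: &Layer, input: &[i64]) -> Vec<i64> {
    assert_eq!(layer.cols, input.len(), "layer input size mismatch");
    assert_eq!(layer.weights.len(), layer.rows * layer.cols, "bad weight shape");
    (0..layer.rows)
        .map(|r| {
            let row = &layer.weights[r * layer.cols..(r + 1) * layer.cols];
            // Accumulate in i128 so Q16.16 products cannot overflow.
            let dot: i128 = row.iter().zip(input).map(|(&w, &x)| w as i128 * x as i128).sum();
            ((dot >> SCALE_BITS) as i64).max(0) // rescale, then ReLU
        })
        .collect()
}

/// Step 2: run the model and record the output of every layer.
/// Together with the weights and the input, this trace is the witness.
fn generate_witness(layers: &[Layer], input: &[i64]) -> Vec<Vec<i64>> {
    let mut trace: Vec<Vec<i64>> = Vec::with_capacity(layers.len());
    for layer in layers {
        let prev: &[i64] = trace.last().map_or(input, |v| v.as_slice());
        let next = layer_forward(layer, prev);
        trace.push(next);
    }
    trace
}

/// Step 3: check each layer's relation against the recorded trace.
/// Returns the index of the first layer whose constraint fails, if any.
fn first_violation(layers: &[Layer], input: &[i64], trace: &[Vec<i64>]) -> Option<usize> {
    assert_eq!(layers.len(), trace.len(), "one trace entry per layer");
    let mut prev: &[i64] = input;
    for (i, layer) in layers.iter().enumerate() {
        if layer_forward(layer, prev) != trace[i] {
            return Some(i);
        }
        prev = &trace[i];
    }
    None
}
```

An honest trace yields `None`; changing any recorded activation makes `first_violation` report the layer where the relation breaks, which is the point at which proof generation would fail.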

Circuit-level detail

Understanding what actually happens inside the circuit separates developers who can debug ZKML integrations from those who cannot.

Arithmetic circuits and R1CS

Groth16 consumes constraints in Rank-1 Constraint System (R1CS) form; PLONK-family backends (PLONK, FFLONK, Halo2) use a related gate-based arithmetisation to which the same ideas apply. In R1CS, every operation in the circuit becomes a constraint of the form:

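-- R1CS constraint form
-- (a · b) = c  where a, b, c are linear combinations of witness values

-- A single ReLU with input x, output x_pos, and sign bit is_negative:
-- Constraint 1: is_negative * (1 - is_negative) = 0   (is_negative is boolean)
-- Constraint 2: x * (1 - is_negative) = x_pos         (x_pos is x or 0)
-- A range check must also tie is_negative to the sign of x; in raw R1CS
-- this is a bit decomposition costing roughly one constraint per bit.
ReLU constraints per activation: 2, plus the sign range check
Matrix multiply (m×n weight matrix): m×n constraints
Total constraints ≈ layers × (weights + activations × 2)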
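
A minimal sketch of checking R1CS constraints against a witness, assuming a toy prime (2^61 - 1) and one possible two-constraint ReLU encoding: booleanity of the sign bit, plus x · (1 - is_negative) = x_pos. The names `Constraint`, `eval`, `satisfied`, and `relu_constraints` are hypothetical, and real backends use the much larger field of their curve.

```rust
/// Toy prime modulus, small enough that products fit in i128.
const P: i128 = (1i128 << 61) - 1;

/// Sparse linear combination: (witness index, coefficient) pairs.
type Lc = Vec<(usize, i128)>;

/// One rank-1 constraint: <a, z> * <b, z> = <c, z> (mod P).
struct Constraint {
    a: Lc,
    b: Lc,
    c: Lc,
}

/// Evaluate a linear combination against the witness vector z, reduced mod P.
fn eval(lc: &[(usize, i128)], z: &[i128]) -> i128 {
    lc.iter().fold(0, |acc, &(i, k)| {
        (acc + k.rem_euclid(P) * z[i].rem_euclid(P)).rem_euclid(P)
    })
}

/// True when the witness satisfies every constraint.
fn satisfied(cs: &[Constraint], z: &[i128]) -> bool {
    cs.iter()
        .all(|c| (eval(&c.a, z) * eval(&c.b, z)).rem_euclid(P) == eval(&c.c, z))
}

/// ReLU over the witness layout z = [1, x, x_pos, is_negative].
fn relu_constraints() -> Vec<Constraint> {
    vec![
        // is_negative * (1 - is_negative) = 0
        Constraint { a: vec![(3, 1)], b: vec![(0, 1), (3, -1)], c: vec![] },
        // x * (1 - is_negative) = x_pos
        Constraint { a: vec![(1, 1)], b: vec![(0, 1), (3, -1)], c: vec![(2, 1)] },
    ]
}
```

Note that these two constraints alone accept the witness `[1, -3, -3, 0]` (a negative input passed through with the sign bit cleared). That is why the sign bit must also be range-checked against x, and why the true per-activation cost in raw R1CS is higher than two constraints.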

Fixed-point quantisation

All values in the circuit must be field elements — integers mod a large prime p. Float32 weights are converted to fixed-point integers by multiplying by a scale factor and rounding:

Rust — weight quantisation (Risc0 / SP1 context)
const SCALE: i64 = 1 << 16;  // Q16.16 fixed-point

fn quantise(w: f32) -> i64 {
    (w as f64 * SCALE as f64).round() as i64
}

fn quantised_matmul(
    weights: &[Vec<i64>],
    input:   &[i64],
) -> Vec<i64> {
    weights.iter().map(|row| {
        let dot: i64 = row.iter().zip(input)
            .map(|(w, x)| w * x)
            .sum();
        dot / SCALE  // rescale after multiply
    }).collect()
}
Quantisation accuracy loss
Moving from float32 to fixed-point Q16.16 introduces rounding error. For classification models this is usually <0.5% accuracy loss. For regression models used in price prediction, validate carefully — errors compound across layers. Always benchmark quantised vs. float outputs before building the circuit.
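
The benchmark recommended above can be sketched as follows. This is an illustrative comparison for a single dense layer, assuming the same Q16.16 scale; `dequantise`, `float_dot`, `fixed_dot`, and `max_abs_error` are hypothetical helper names.

```rust
const SCALE: i64 = 1 << 16; // Q16.16 fixed-point

fn quantise(w: f32) -> i64 {
    (w as f64 * SCALE as f64).round() as i64
}

fn dequantise(q: i64) -> f64 {
    q as f64 / SCALE as f64
}

/// Reference dot product in floating point.
fn float_dot(w: &[f32], x: &[f32]) -> f64 {
    w.iter().zip(x).map(|(&a, &b)| a as f64 * b as f64).sum()
}

/// The same dot product in Q16.16, accumulated in i128 and rescaled once.
fn fixed_dot(w: &[f32], x: &[f32]) -> i64 {
    let acc: i128 = w
        .iter()
        .zip(x)
        .map(|(&a, &b)| quantise(a) as i128 * quantise(b) as i128)
        .sum();
    (acc >> 16) as i64
}

/// Largest absolute difference between float and fixed-point outputs
/// across all rows of a weight matrix.
fn max_abs_error(rows: &[Vec<f32>], x: &[f32]) -> f64 {
    rows.iter()
        .map(|r| (float_dot(r, x) - dequantise(fixed_dot(r, x))).abs())
        .fold(0.0, f64::max)
}
```

Running `max_abs_error` per layer on representative inputs shows where error accumulates before any circuit work begins. Values that are exact in binary (0.5, -0.25) give zero error; values such as 0.1 do not.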

Lookup tables for non-linear functions

Functions like ReLU, sigmoid, and softmax are expensive to express in raw R1CS because they require range checks and comparisons. Modern backends (PLONK, Halo2) support lookup tables — precomputed tables of valid (input, output) pairs that drastically reduce constraint count for non-linear activations:

Rust — lookup table for ReLU (Halo2 style)
// ReLU lookup: for every 16-bit value v in [-2^15, 2^15), store max(v, 0).
// Keys are the two's-complement encoding of v as an unsigned 16-bit integer.
fn relu_table() -> Vec<(u64, u64)> {
    (0u64..65536).map(|v| {
        let signed = v as i16;  // reinterpret the low 16 bits as a signed value
        (v, i16::max(signed, 0) as u64)
    }).collect()
}

// Each activation now costs 1 lookup instead of 2 R1CS constraints plus a sign range check
// For a 1000-neuron hidden layer: saves at least ~1000 constraints per ReLU layer
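
What a lookup argument asserts can be sketched outside any proof system: every (input, output) pair in the witness must be a row of the table. The sketch below is plain Rust rather than Halo2 API; `encode` and `all_in_table` are hypothetical names, and a `BTreeSet` stands in for the committed table.

```rust
use std::collections::BTreeSet;

/// All valid (input, output) rows for 16-bit ReLU, keyed by the
/// two's-complement encoding of the input.
fn relu_table() -> BTreeSet<(u64, u64)> {
    (0u64..65536)
        .map(|v| {
            let signed = v as u16 as i16;
            (v, signed.max(0) as u64)
        })
        .collect()
}

/// Encode a signed 16-bit value the same way the table keys are encoded.
fn encode(v: i16) -> u64 {
    v as u16 as u64
}

/// The property a lookup argument proves: every witness pair is a table row.
fn all_in_table(table: &BTreeSet<(u64, u64)>, pairs: &[(i16, i16)]) -> bool {
    pairs
        .iter()
        .all(|&(input, output)| table.contains(&(encode(input), encode(output))))
}
```

A pair such as `(-5, -5)` is rejected because the table has no such row, so the sign check comes for free with the lookup.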

The Midnight approach — Compact & Kachina

Midnight is a privacy-first blockchain built by IOG that ships its own ZK-native smart contract language, Compact, and a proving system called Kachina. For ZKML use cases, Midnight's architecture is uniquely well-suited because it was designed from the ground up around the public/private state split that ZKML requires.

Why Midnight for ZKML
Midnight's Compact compiler outputs ZK-SNARK circuits directly. There is no EVM translation layer. Private state (model weights, activations) stays local. Public state (proof, output commitment) goes on-chain. This is exactly the architecture ZKML needs — and Midnight supports it natively.

The public/private state split in practice

In Kachina's model, every Compact contract operates across two ledgers simultaneously — a public ledger visible to all, and a private ledger that exists only on the user's local machine. For a ZKML agent, this maps cleanly:

ZKML concept | Midnight / Kachina equivalent | Lives where
Model weights | Private state (witnesses) | Agent's machine only
Input data | Private witness inputs | Agent's machine only
Inference output | Public state update | Public ledger
Proof of correctness | ZK-SNARK via Kachina | On-chain, verifiable
Model commitment | Merkle root of weight tree | Public ledger
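
The last row can be made concrete with a small sketch of a Merkle root over a weight vector. This is illustrative only: `toy_hash` is a stand-in 64-bit FNV-1a hash that is not cryptographically secure, and the leaf layout (one leaf per weight, odd nodes paired with themselves) is an assumption, not Midnight's actual commitment scheme.

```rust
/// Stand-in 64-bit FNV-1a hash. NOT cryptographically secure: a real
/// commitment would use SHA-256 or a circuit-friendly hash instead.
fn toy_hash(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(0xcbf29ce484222325u64, |h, &b| (h ^ b as u64).wrapping_mul(0x100000001b3))
}

/// Hash two child nodes into their parent.
fn hash_pair(left: u64, right: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_le_bytes());
    buf[8..].copy_from_slice(&right.to_le_bytes());
    toy_hash(&buf)
}

/// Merkle root over quantised weights, one leaf per weight.
/// An unpaired node at the end of a level is hashed with itself.
fn merkle_root(weights: &[i64]) -> u64 {
    assert!(!weights.is_empty(), "cannot commit to an empty model");
    let mut level: Vec<u64> = weights.iter().map(|w| toy_hash(&w.to_le_bytes())).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| hash_pair(pair[0], *pair.last().unwrap()))
            .collect();
    }
    level[0]
}
```

Only the root goes on the public ledger; the weights stay in private state. Any change to a weight changes the root, so re-quantising or retraining requires registering a new commitment.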

Writing a ZKML verifier in Compact

The following Compact contract accepts a ZK proof of inference and verifies that the model output exceeds a confidence threshold — without ever seeing the weights or input data.

Compact — ZKML inference verifier contract
// zkml_verifier.compact
// Verifies that an AI agent's inference was performed correctly
// before allowing an on-chain action (e.g. a trade execution).

pragma language_version ">=0.14.0";

import CompactStandardLibrary;

// ── Public state: what the chain knows ──────────────────
ledger {
  model_commitment:  Bytes<32>;   // SHA-256 of quantised weight tensor
  last_output:       Field;        // most recent verified inference output
  inference_count:   Uint<64>;    // total verified calls (replay protection)
  confidence_floor:  Field;        // minimum acceptable softmax confidence
}

// ── Circuit: the ZK constraint system ───────────────────
// Witnesses (private — never leave the prover)
// - model_weights: the quantised weight tensors
// - input_vector:  the agent's input features
// - layer_outputs: intermediate activation values per layer

circuit verify_inference(
  model_weights:     Vector<Field, 1024>,  // private
  input_vector:      Vector<Field, 64>,    // private
  layer_outputs:     Vector<Field, 256>,   // private
  claimed_output:    Field,                  // public output
  weight_commitment: Bytes<32>,             // public commitment
): Field {

  // 1. Verify the weight commitment matches on-chain record
  assert sha256(model_weights) == ledger.model_commitment
    "Weight commitment mismatch: model has changed or is incorrect";

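  // 2. Verify layer computations (matrix multiply + ReLU)
  //    Each layer: output[i] = relu(weights[layer][i] · input)
  //    NOTE: only the ReLU range condition is asserted below. A complete
  //    circuit must also constrain each output to its weight-input dot product.
  for i in 0..layer_outputs.length {
    assert layer_outputs[i] >= 0               // ReLU: non-negativity
      "Layer output violates ReLU constraint";
  }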

  // 3. Verify claimed output matches final layer computation
  assert claimed_output == layer_outputs[layer_outputs.length - 1]
    "Claimed output does not match final layer activation";

  // 4. Enforce confidence floor (agent must be sufficiently certain)
  assert claimed_output >= ledger.confidence_floor
    "Inference confidence below threshold — action blocked";

  return claimed_output;
}

// ── Transaction: called by the AI agent ─────────────────
export transaction submit_inference(
  claimed_output:    Field,
  weight_commitment: Bytes<32>,
) {
  const verified = verify_inference(
    /* witnesses supplied by prover, not visible here */
    claimed_output,
    weight_commitment,
  );

  // Update public ledger state only after proof passes
  ledger.last_output     = verified;
  ledger.inference_count = ledger.inference_count + 1;
}
Midnight-specific tip
The Compact compiler handles witness binding automatically — you do not manually wire private inputs to circuit gates. Declare your witness types in the circuit parameter list and the Kachina proving system handles the rest. This is the key ergonomic advantage over writing raw Risc0 guest programs.

Kachina & private state management

Kachina is Midnight's proving system. It was designed to bridge private computation (agent-local) and public verification (on-chain) using ZK-SNARKs. For ZKML, Kachina solves a specific problem: the model weights must influence the proof without being disclosed to the verifier.

Kachina achieves this through transcripts — ordered records of all queries the computation makes against the private state. The agent proves, in zero-knowledge, that it possesses a private state whose transcript is consistent with the public state transition. The chain verifies the proof against the circuit; it never sees the underlying private values.

Kachina in one sentence
Kachina lets you update public state on-chain by proving, in zero-knowledge, that your private state justifies the update — without showing anyone what your private state contains.

Concurrency and transcript reordering

One Kachina property that matters for high-frequency AI agents: Kachina supports concurrent proof submission. Multiple agents can submit inference proofs simultaneously. The protocol optimises conflicting transcripts and allows reorderings without breaking consistency. For a DeFi research agent that needs to post dozens of on-chain decisions per block, this is not academic — it is the feature that makes the architecture scale.

SP1 (Succinct) integration

SP1 is a zkVM by Succinct Labs. It proves execution of arbitrary Rust programs, which makes it practical for ZKML: you write the inference in Rust, compile it to a guest program, and SP1 generates the proof.

Cargo.toml — SP1 ZKML guest setup
[package]
name    = "zkml-inference-guest"
version = "0.1.0"
edition = "2021"

[dependencies]
sp1-zkvm = "3.0.0"             # SP1 guest SDK
ndarray  = "0.15"              # matrix ops (no_std compatible)

[profile.release]
opt-level = 3
lto       = "fat"               # fewer guest cycles, so faster proving
Rust — SP1 guest inference program
// src/main.rs (guest program — runs inside SP1 zkVM)
#![no_main]
sp1_zkvm::entrypoint!(main);

use sp1_zkvm::io;
use ndarray::{Array1, Array2};

fn relu(x: i64) -> i64 { i64::max(x, 0) }

pub fn main() {
    // Read private inputs from the prover
    let weights: Vec<i64> = io::read::<Vec<i64>>();
    let input:   Vec<i64> = io::read::<Vec<i64>>();
    let rows = 64usize;
    let cols = input.len();

    // Single hidden layer: output = relu(W · x)
    let w = Array2::from_shape_vec((rows, cols), weights).unwrap();
    let x = Array1::from_vec(input);
    let hidden: Vec<i64> = w.dot(&x).iter().map(|&v| relu(v)).collect();

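    // Commit the output publicly; the verifier will check this.
    // NOTE: to bind the proof to a registered model, a production guest
    // should also commit a hash of `weights` alongside the output.
    let output = hidden.iter().copied().max().unwrap_or(0);
    io::commit(&output);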
}
TypeScript — SP1 proof generation & submission
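// NOTE: SP1's reference SDK is the Rust crate `sp1-sdk`. The package and
// method names below assume a TypeScript binding with this shape; verify
// them against the binding you install.
import { readFileSync } from "node:fs";
import { CpuProver, SP1Stdin } from "@succinct/sp1-sdk";

const ELF = readFileSync("./target/guest.elf");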

async function proveInference(weights: bigint[], input: bigint[]) {
  const client = new CpuProver();
  const stdin = new SP1Stdin();
  stdin.writeVec(weights);   // private — only prover sees this
  stdin.writeVec(input);

  const { proof, publicValues } = await client.prove(ELF, stdin);

  // publicValues contains only committed outputs — not weights/inputs
  const output = publicValues.readU64();
  return { proof, output };
}

Risc0 integration

Like SP1, Risc0 is a zkVM that targets the RISC-V ISA. The guest program runs on a virtual RISC-V CPU, and Risc0 proves correct execution of the compiled RISC-V code.

Rust — Risc0 guest inference (methods/guest/src/main.rs)
#![no_main]
risc0_zkvm::guest::entry!(main);

use risc0_zkvm::guest::env;

fn main() {
    // Read private witness data
    let weights: Vec<i32> = env::read();
    let input:   Vec<i32> = env::read();

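    let n = input.len();
    // Square n×n layer: fail early instead of indexing out of bounds below.
    assert_eq!(weights.len(), n * n, "weights must be an n x n matrix");
    let mut hidden = Vec::with_capacity(n);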

    for i in 0..n {
        let dot: i64 = (0..n)
            .map(|j| weights[i * n + j] as i64 * input[j] as i64)
            .sum();
        hidden.push((dot >> 16).max(0) as i32);  // Q16 rescale + ReLU
    }

    // Journal = public output only (weights/input stay private)
    env::commit(&hidden);
}
SP1 vs Risc0 — choosing between them
Both are zkVMs that prove Rust programs. SP1 currently offers faster proof generation for large guest programs and better developer tooling (TypeScript SDK). Risc0 has been in production longer and has a more mature on-chain verifier contract. For new ZKML projects in 2026, SP1 is the default recommendation; use Risc0 if you need battle-tested on-chain verifier contracts on Ethereum mainnet.

Modulus Labs

Modulus Labs focuses specifically on ZKML — they build circuits for specific model architectures (primarily CNNs and small transformers) rather than a general-purpose zkVM. The tradeoff: significantly smaller proof sizes and faster proving for supported architectures, at the cost of flexibility.

Modulus's RockyML framework exports ONNX models directly to Halo2 circuits. If your model fits a supported architecture and is under ~10M parameters, Modulus's toolchain can generate the circuit automatically without you writing circuit code manually.

Modulus architecture support (as of Q1 2026)
Supported: ReLU MLPs, LeNet-style CNNs, small BERT-variant transformers (≤6 layers). Not yet supported: attention mechanisms with softmax at full float32 precision, recurrent networks (LSTM/GRU), or models with custom CUDA kernels. Always check the Modulus compatibility matrix before choosing this path.

Troubleshooting — SP1

Error | Cause | Fix
STARK_VERIFY_FAILED | Guest program panicked or produced inconsistent output between runs (non-determinism) | Audit for HashMap iteration order, SystemTime, or any non-deterministic source in the guest. SP1 guests must be fully deterministic. Replace HashMap with BTreeMap.
MEMORY_ACCESS_FAULT at 0x... | Guest accessed memory outside the allocated segment, typically an out-of-bounds slice access in matrix multiply | Add explicit bounds checks before indexing: assert!(i * n + j < weights.len()). Use safe indexing (.get()) during development.
CYCLE_LIMIT_EXCEEDED | Model is too large: guest program exceeds SP1's default cycle budget (100M cycles) | Reduce model size, quantise to 8-bit (halves cycle count), or use SP1's --cycle-limit flag to increase the budget. For models >5M parameters, expect to need 500M+ cycles.
io::read type mismatch | Host wrote Vec<i64> but guest reads Vec<i32>, or vice versa | Host and guest must use identical types for every stdin.write / io::read pair. Define a shared types.rs crate used by both host and guest to enforce this.
ELF loading failed: invalid magic | Guest binary was not compiled with cargo prove build (wrong target triple) | Always use cargo prove build (not cargo build) for the guest. The target must be riscv32im-succinct-zkvm-elf.
Groth16 proof too large for calldata | Submitting the raw STARK proof; Groth16 wrapper not enabled | Request a Groth16-wrapped proof for on-chain submission (in the Rust SDK, client.prove(...).groth16().run() instead of a plain client.prove(...).run()). Groth16-wrapped proofs are ~300 bytes vs ~400KB for raw STARK.

Troubleshooting — Risc0

Error | Cause | Fix
ProverError: journal mismatch | Guest called env::commit() with different data across two proof attempts (non-determinism) | Same root cause as SP1's STARK_VERIFY_FAILED. Audit all HashMap usages, random number generation, and any syscall that depends on wall time.
ImageID mismatch | On-chain verifier has a different Image ID than the proof: guest code changed after the verification key was deployed | Re-deploy the verifier contract whenever you update the guest binary. Risc0's Image ID is a hash of the compiled guest ELF, so any code change produces a new ID.
Stack overflow in guest | Deep recursion or large stack-allocated arrays in the inference loop | Heap-allocate large arrays with Vec::with_capacity() instead of stack arrays. The Risc0 RISC-V guest stack is 256KB by default.
Segment size exceeded | Single execution segment is too large; occurs with models that have long sequential computation paths | Enable continuation mode: ExecutorEnv::builder().segment_limit_po2(22). This splits the proof across multiple segments and merges them.
Division by zero in circuit | Softmax or normalization layer divides by a sum that becomes zero after quantisation | Add a small epsilon before division: let denom = sum.max(1). This is numerically safe for classification and avoids the circuit constraint violation.
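
The division-by-zero fix in the last row can be sketched as a guarded fixed-point normalisation. This is an illustrative stand-in for a softmax-style layer, assuming Q16.16 outputs and scores clipped at zero; `normalise` is a hypothetical helper, not part of the Risc0 API.

```rust
const SCALE: i128 = 1 << 16; // Q16.16 fixed-point

/// Normalise non-negative scores so they sum to roughly SCALE (1.0 in Q16.16).
/// The denominator is clamped to 1 so an all-zero input cannot divide by zero.
fn normalise(scores: &[i64]) -> Vec<i64> {
    let clipped: Vec<i128> = scores.iter().map(|&s| s.max(0) as i128).collect();
    let sum: i128 = clipped.iter().sum();
    let denom = sum.max(1); // guard: quantisation can round every score to 0
    clipped.iter().map(|&s| (s * SCALE / denom) as i64).collect()
}
```

With an all-zero input the result is all zeros instead of a guest panic, which keeps the execution provable.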

Troubleshooting — Midnight Compact

Error | Cause | Fix
assert_failed: Weight commitment mismatch | The SHA-256 of the submitted weight tensor does not match ledger.model_commitment: weights changed or the wrong tensor was used | Re-register the model by calling the commitment registration transaction with the current weight tensor. Commitment must be resubmitted after any retraining or re-quantisation.
TypeError: Vector length mismatch | Circuit declared Vector<Field, 1024> but prover supplied a tensor with a different dimension | Compact's Vector types are fixed-length. Pad or trim input vectors to match the declared circuit dimensions. Use a wrapper that pads to the nearest power-of-two length.
PrivacyViolation: ledger field read in witness context | A circuit block tried to read a ledger field directly instead of passing it as a public parameter | Pass public ledger values into the circuit as explicit parameters. Circuits in Compact are stateless: they cannot directly access ledger state. Thread values through the transaction call instead.
CompilationError: Field overflow | An intermediate computation exceeded the field prime; common when multiplying two large fixed-point values | Rescale intermediate values more aggressively. After each matmul, right-shift by the scale factor before the next multiply: mid = (a * b) >> SCALE_BITS. Use 16-bit rather than 32-bit fixed-point if overflow persists.
npm ERR: 403 Forbidden (npm.pkg.midnight.network) | Missing authentication for Midnight's private npm registry | Run npm login --scope=@midnight-ntwrk --registry=https://npm.pkg.midnight.network and authenticate with your GitHub credentials. Add //npm.pkg.midnight.network/:_authToken=${NPM_TOKEN} to your .npmrc.
version mismatch: compact-compiler vs midnight-sdk | Compact compiler version is incompatible with the SDK version | Run npx @midnight-ntwrk/midnight-version-check or check the version compatibility matrix in the release notes. Pin both packages to matching versions in package.json.

Common patterns & mitigations

Non-determinism: the silent killer

The most common cause of proof failure across all three backends is non-determinism. ZK proofs require that the same inputs always produce the same intermediate values. Any source of randomness or ordering non-determinism in the guest will cause the prover to generate a witness that does not satisfy the circuit constraints.

Never use these in guest programs
HashMap / HashSet (use BTreeMap / BTreeSet) · SystemTime::now() · rand::thread_rng() · std::thread · any FFI call to non-deterministic host functions · floating-point arithmetic (results differ by platform and compiler flags).
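
A short sketch of why `BTreeMap` is the safe choice: its iteration order depends only on the keys, so any order-sensitive computation gives the same result however the map was built. `feature_digest` is a hypothetical example of such a computation.

```rust
use std::collections::BTreeMap;

/// Order-sensitive fold over feature values. With BTreeMap the iteration
/// order is always ascending key order, so the result is reproducible.
fn feature_digest(features: &BTreeMap<String, i64>) -> i64 {
    features
        .values()
        .fold(0i64, |acc, &v| acc.wrapping_mul(31).wrapping_add(v))
}
```

Two maps holding the same entries produce the same digest regardless of insertion order. The same fold over a `HashMap` offers no such guarantee, because its iteration order is unspecified.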

Model size vs. proving time

Indicative benchmarks on commodity hardware (cloud provers cost roughly $0.01 per proof at 2026 pricing):

Model size | Parameters | SP1 proving time | Risc0 proving time | Practical?
Tiny MLP | <100K | ~2s | ~3s | Yes
Small MLP | 1M | ~30s | ~45s | Yes
Medium CNN | 5M | ~4 min | ~6 min | Conditional
Small transformer | 20M | ~25 min | ~35 min | High latency
LLM (any) | >1B | Hours–days | Hours–days | Not yet

For on-chain DeFi agents where latency matters, target models under 1M parameters with aggressive quantisation. Research agents with longer decision cycles (daily/weekly) can tolerate medium model sizes.
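
To place a model in the table above before building anything, a rough size estimate helps. The sketch below applies the per-operation counts from the circuit-level section (m×n constraints per matrix multiply, 2 per ReLU activation) to a list of dense layers. `parameter_count` and `estimate_constraints` are hypothetical helpers, and the figures ignore range checks, lookups, and backend overhead, so treat them as a lower bound.

```rust
/// Number of weights in a stack of dense layers given as (rows, cols) pairs.
fn parameter_count(layer_dims: &[(u64, u64)]) -> u64 {
    layer_dims.iter().map(|&(m, n)| m * n).sum()
}

/// Rough constraint count: m*n for each matrix multiply plus
/// 2 per ReLU activation (one activation per output row).
fn estimate_constraints(layer_dims: &[(u64, u64)]) -> u64 {
    layer_dims.iter().map(|&(m, n)| m * n + 2 * m).sum()
}
```

For example, a 784 → 64 → 10 MLP is described as `[(64, 784), (10, 64)]`, which gives 50,816 parameters and about 50,964 constraints, well inside the tiny-MLP row.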

Glossary

Arithmetic circuit — a computation expressed as a directed acyclic graph of addition and multiplication gates over a finite field. The fundamental representation used by ZK-SNARK backends.

Compact — Midnight's smart contract language. Compiles to ZK-SNARK circuits via the Kachina proving system. Privacy-native: distinguishes public and private state at the language level.

Field element — an integer modulo a large prime p. All values in a ZK circuit must be field elements. Float32 values must be converted to fixed-point integers before circuit use.

Kachina — Midnight's proving system. Bridges private state (agent-local) and public state (on-chain) using ZK-SNARKs and a transcript-based concurrency model.

Proof of inference — a ZK-SNARK that certifies a specific neural network, applied to a specific input, produced a specific output — without revealing the network weights or input.

Quantisation — the conversion of floating-point model weights to fixed-point integer representations. Required for ZKML because ZK circuits operate over finite fields, not floats.

R1CS (Rank-1 Constraint System) — the constraint format consumed by Groth16 and many other ZK backends. Every circuit gate becomes a constraint of the form (a · b) = c.

Witness — the private input to a ZK proof. In ZKML: the model weights, input data, and all intermediate activation values. The proof certifies the witness satisfies the circuit constraints without revealing the witness itself.

zkVM (Zero-Knowledge Virtual Machine) — a system that proves correct execution of a program (typically written in Rust) without revealing the program's private inputs. SP1 and Risc0 are both zkVMs.

ZK-SNARK — Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. A proof system where: the proof is small (succinct), no back-and-forth is needed (non-interactive), and the prover demonstrates knowledge of a secret without revealing it (zero-knowledge).

Further reading

Midnight Developer Documentation — official Compact language reference, Kachina architecture, and getting-started guides.

Kachina — Foundations of Private Smart Contracts (Kerber, Kiayias and Kohlweiss, 2020) — the academic paper underlying Midnight's proving system.

SP1 by Succinct Labs — the zkVM powering the SP1 integration in this guide.

Risc0 — RISC-V based zkVM with mature on-chain verifier contracts.

Modulus Labs — ONNX-to-Halo2 circuit compiler for supported model architectures.

ZKML: Verifiable Inference for Machine Learning — foundational survey on the ZKML problem space.