Date: June 30, 2023.


 1.  Introduction
   1.1.  Configurability
   1.2.  The primer
   1.3.  Applications of ZKPs
   1.4.  The structure of this paper
   1.5.  Further information
 2.  Low degree proof example - Fibonacci Virtual Machine
   2.1.  Setup
   2.2.  Interpolating values
   2.3.  Constraints and relaxing them
   2.4.  Boundary conditions
   2.5.  Mixing
   2.6.  FRI Low Degree Testing
   2.7.  Deep variant
   2.8.  Fiat-Shamir heuristic
   2.9.  Summary
 3.  General Compulation
   3.1.  Goal
   3.2.  Program termination
   3.3.  Public input
   3.4.  Output
   3.5.  Recursion
 4.  Primitives
   4.1.  Polynomial constraints
   4.2.  Degree
   4.3.  Control, derived and external columns
   4.4.  Instructions
   4.5.  Permutation
   4.6.  Multiple permutation
   4.7.  Lookup
   4.8.  Injective lookup
   4.9.  Permutation and lookups with multiple columns
 5.  Instruction set
   5.1.  Operands and flag
   5.2.  Unsigned multiplication
   5.3.  Setting flag for zero result
   5.4.  Unsigned arithmetic, comparisons and move
   5.5.  Logic
   5.6.  Word range
   5.7.  Signed arithmetic
   5.8.  Memory
   5.9.  Program lookup
 6.  Prover/Verifier protocol
   6.1.  VM configuration
   6.2.  Prover code for configured VM
   6.3.  Verifier code for configured VM
Appendix A. VM Configuration

1. Introduction

A zero-knowledge proof is a cryptographic concept that allows one party (Prover), to demonstrate knowledge of a certain fact or statement to another party (Verifier), while keeping some sensitive information confidential. In practice, the goal of a zero-knowledge proof is to establish trust and confidence in the validity of a claim without divulging sensitive or confidential data.

In this paper, we describe how to implement of a Zero Knowledge Virtual Machine (ZKVM) for generic computation. The setting is that Prover runs a public program on private inputs and wants to convince Verifier that the program has executed correctly and produces an asserted output, without revealing anything about the computation’s input or intermediate state.


As an example, Prover may want to convince Verifier that they have a password that hashes to a given value, without revealing anything about the password:


Generally, Prover should be able to run some code that performs the hashing operation, and produce data called ‘the proof’, which would allow anyone to verify that the proof is correct, that is, that the Prover has, in fact, a valid pre-image of the given hash value.

ZK problems span a variety of use cases, from more specific to more general. The specific proof-of-hash-pre-image problem above has very practical applications in credential verification. Another narrow case for ZK-proofs are range-proofs used by the ZCash cryptocurrency network to prove availability of funds in a given account without revealing the exact amount of funds or the sequence of transactions that led to the current state.

If we generalize this further, we will discover a wide set of use cases for such primitives. For example, given a hash as a public key Kp = H(Ks), and its pre-image Ks as a private (secret) key, it is not hard to construct an asymmetrical signature scheme using zero-knowledge proofs: signature function S(m,Ks) on message m with key Ks is defined as a zero knowledge proof that H(Ks) = Kp and H(Ks + m) = X that reveals m and X, but not Ks.

So what if we could run any program, and prove, in zero knowledge, that it ran correctly and produced the output we claimed it had, without revealing any inputs? This is what we call zero-knowledge proofs for general computation. The first scheme that achieved this goal is due to Ben-Sasson at al. [BS+13]. It supports any program written in a specially-designed assembly language called TinyRAM that comes close to being viable for practical applications, perhaps with the exception of not supporting non-constant jumps and memory accesses. A precise description of how to implement this scheme is due to Bootle, Cerulli et al. [B+18Ce19], and they also introduce a primitive called lookup, that is very useful in ZK context. Since that time, there have been quite a few advances in the area, and we feel it is useful to revisit a ZKVM implementation with new techniques.

This paper’s primary purpose is to provide a source of introductory information into building a ZK proof system for general computation. Anyone with practical knowledge of numerical programming, starting from no knowledge about how ZK proofs work, should be able to

  1. understand the general approach to generalized computation ZK proofs

  2. understand all the “ingredients” of a full-functioning general-purpose ZKVM

  3. be able to implement their own ZKVM from scratch

While we aim to inform interested contributors just starting to learn about this technology, and to smooth the otherwise steep learning curve they face, we also make some contributions of our own. We, specifically, introduce:

  1. A new ZK-friendly memory scheme (see 5.8)

  2. A concept of a fully configurable ZKVM (a “ZKVM compiler”), introduced below and demonstrated further in 6.1

1.1. Configurability. In familiarizing ourselves with the current state of the art of general computation ZK-proof schemes, we have observed that, at the moment, the space of existing ZKVMs consists of two distinct categories. On one hand, we have TinyRAM already mentioned, and a few others, that use a “ZK-friendly” command set. On the other hand, there are implementations, such as the one provided by Risc0 and zkWASM, that implement ZK-proofs for the RiskV and WASM command sets respectively, both of which are standard LLVM-compatible targets. These two categories are different in terms of the tradeoffs they introduce. Some operations supported by modern CPUs (and WASM) introduce significant additional complexity into the Prover and, consequently, cause a significant degradation of performance. On the other hand, ZK-friendly ZKVMs typically lack the possibility of support by either LLVM or GCC toolchains, which means they can only be used for small custom-written software components. In order for a ZKVM to be both ZK-efficient and viable as an LLVM target, we need to find a viable balance between these tradeoffs.

This introduces the concept of VM configurability. Essentially, we aim to build LLVM for ZK, where a high-level description of a command set can be used to auto-generate both the Prover/Verifier components and an LLVM backend targeting the command set. Doing this would enable contributors to create new ZKVM command sets with minimal effort, and so run multiple experiments to determine this optimal balance. In practice, such “ZKVM compiler” is more than feasible, as we will see below, due to the modular nature of the constraints associated with proving micro-operations that comprise individual instructions.

But there is an even more important reason to build such ZKVM compiler. We are referring to the issue of fragility that challenges the ongoing security of ZK schemes. The issue can be expressed briefly as follows: there must be an efficient process to update Prover/Verifier pairs in response to security incidents and the fast pace of innovation in ZK. Consider the fact that a missing constraint (for example, to prove that the value of an overflow flag can not be anything other than zero or one) can lead to the possibility of constructing invalid proofs. It is not realistic to expect that a set of constraints used to describe software is complete from the get go. Consequently, we must build our ZK frameworks expecting to have to promptly respond to discoveries of such omissions and any other type of exploit.

Furthermore, the fast pace of research in ZK means that the foundational components of any ZK scheme may fast become out of date, and, what is even more concerning, may be discovered to be exploitable. All of this points towards the need to significantly streamline the reconfigure/recompile/redeploy process in any network using ZKPs. The ZKVM compiler being discussed aims to facilitate this.

1.2. The primer. For the Prover/Verifier scheme there are various choices, based on polynomial commitments, notably Bulletproofs [B+23], KZG [KZG] and FRI [BS+18]. For our primer, we chose FRI (with a DEEP modification [BS+19]) as it needs no trusted setup, it is good for constructing recursive proofs, and it is post-quantum in that its security only relies on a (flexible choice of a) collision-resistant hash function. There are also choices of a prime field for it, such as p = 264-232+1 that make for very fast arithmetic and hence efficient Prover/Verifier code. For lookups, we use our scheme [BD23] based on the extensions Plookup [GW20] and Plonkup [P+22] of the original lookup protocol in [B+18].

We note, however, that these choices are mostly for clarity of demonstration and are not consequential in the context of a configurable ZK-compiler. After all, we should be able to easily regenerate our Prover/Verifier with different schemes, as needed.

1.3. Applications of ZKPs. We are most interested in applications of ZKPs to use cases in the financial and legal industries. In both of these contexts, the need for reliable verification of accuracy of facts pertaining to information is critical. Yet, it stands in stark opposition to the need for privacy, which is no less critical. To demonstrate this opposition on a specific example, take two imaginary financiers, Rocky and Morgan. Rocky has invented a cutting edge financial strategy which brings him annual returns of 25%, and has invested all his personal funds into it. He now needs to borrow cash to finance a further increase in trading volume. He believes he can borrow at about 7%, and make a whopping 17% margin. There’s only one problem. Morgan, whose business it is to lend capital, requires Rocky to put up collateral to guarantee the loan. But Rocky doesn’t want to disclose to Morgan the contents of his portfolio, as that would leak information about his wonderful strategy.

After some negotiation it becomes apparent that Rocky would either need to give up his dreams of getting a loan, or would have to disclose his portfolio to Morgan and take the risk of the latter replicating his strategy and eroding his edge. Rocky comes back to his office in a gloomy mood, downs a glass of Scotch, and spills the whole story to his secretary, who then goes ”Wait!”, and tells him that her husband is working on this new technology called zero-knowledge proofs that could potentially help. Rocky agrees to have a meeting with the husband, at which he learns that there is a way to prove the worth of his portfolio to Morgan without disclosing it.

Here’s how it works. Morgan would provide to Rocky some computer code that can determine the value of Rocky’s portfolio. Rocky would run the code and provide its output to Morgan along with the zero-knowledge proof that the code has been run correctly. This output would go to Morgan daily, and if the value is ever too low, Rocky would face a margin call that may require him to sell his entire portfolio in order to repay Morgan.1

This example makes clear the following general position: zero-knowledge proof technology is an essential tool that enables safe and meaningful balance between privacy and verification in the financial and legal context. Here are some more examples that illustrate this position:

  1. Revenue-based lending: verify an organization’s revenue stream, without knowing the specific transaction details or channels.

  2. Regulatory surveillance: verify that an organization is adhering to regulation without requiring access to the details of its activities.

  3. KYC/AML verification without privacy violation: verify age and citizenship of individuals without requiring them to supply their name or any other personally identifiable information.

  4. Contract verification: verify that an organization has entered into a valuable contract with a counterparty without disclosing who the counterparty is.

  5. Vice clause verification: verify that an organization has not been trading with unethical organizations, without having access to the list of organization’s trading counterparties.

1.4. The structure of this paper.

1.5. Further information. This paper was written as a contribution to the global ZK community by the research team at ADAPT Framework Solutions. To learn more about ADAPT, a high-level developer framework for building decentralized systems, please see the ADAPT whitepaper.

Acknowledgements. We would like to thank Delendum Ventures and especially Daniel Lubarov for an opportunity for us to use a Delendum Fellowship and fruitful discussions.

2. Low degree proof example - Fibonacci Virtual Machine

2.1. Setup. Before going into the technical details of the actual protocol of verifying general program execution, we illustrate the construction with a simple one-column one-constraint ‘Fibonacci Virtual Machine’. Its ‘execution’ is a sequence of integers

a ,a , a ,a ,a ,a ,a  , a
 1  2   3  4   5  6  7   8

that obey the Fibonacci rule an = an-1 + an-2 for n = 3,..., 8. The input is a1,a2 and a8 is the output. Suppose Prover wants to convince Verifier that they know such a sequence with output a8 = 47. They have in mind

(1,3,  4,7,11,18, 29,) 47

but do not want to reveal the input a1 = 1,a2 = 3 or any intermediate values a3,...,a7.

2.2. Interpolating values. We represent the ai as integers modulo a prime p, that is as elements of the finite field F = ∕p= Fp. In this example, we take p = 97. Additively,

F =  ℤ∕97ℤ  = {0,1,2,3...,96 }.

Multiplicatively, s = 5 is a primitive root in F. In other words,

                  ×        2  3     96
F \ {0} = (ℤ ∕97ℤ)  =  {5,5 ,5 ,...,5  }.

The element ζ = 53 is then of order 32, so it is a 2-power root of unity, and we let

x = (ζ4,ζ8,ζ12,...,ζ32),    y = (sζ,sζ2,sζ3, ...,sζ32).

The xi (i = 1,..., 8) are the 8th roots of unity in F, and the yi (i = 1,..., 32) are the 32nd roots of unity shifted by s.

Let a = (1, 3, 4, 7, 11, 18, 29, 47) be the column of values that Prover wishes to encode. With Langrange interpolation, we can find the unique polynomial f(x) F[x] of degree < 8 with f(xi) = ai. It turns out to be

f(x) = 67x7 + 7x6 + 2x5 + 28x4 +  6x3 + 74x2 + 42x + 15.

There is a very efficient way to do this, ‘the number-theoretic transform NTT’ (essentially a version of the finite Fourier transform), that is specifically tailored to domains consisting of 2-power roots of unity in a field. Knowing f is equivalent to knowing the ai. Prover evaluates f on the 32 points y1,...,y32 using NTT again, and these values are plotted in Figure 1. On the horizontal axis we plot F \{0} multiplicatively (to make the xi and yi equally spaced), and on the vertical axis we plot F additively.


Figure 1: Values of f(x) = 67x7 + 7x6 + 2x5 + 28x4 + 6x3 + 74x2 + 42x + 15 on x i = ζ4i and yi = i in F 97.

As one (and only) black box, we will use the existence of the following (cf. also 2.6, 2.8)

FRI low degree testing algorithm. It is possible for a Verifier efficiently, and in zero knowledge, to check that a given list of 32 values lie close to the values of some polynomial of degree < 8 evaluated at the yi.

In practice, 8 and 32 will be, say, 222 and 226. Essentially, Prover commits all the values of the disguised trace in a Merkle tree, sends its root to Verifier, and Verifier, by quering very few of the values (~log of their number) is able to convince themselves that they lie close to some polynomial. This number of queries is so small that it is does not reveal enough information about the polynomial to be able to reconstruct its values at the xi (the original trace), which is where ‘zero knowledge’ comes from.

2.3. Constraints and relaxing them. The Fibonacci rule

an = an- 1 + an- 2   for    n ⁄=  1,2

is equivalent to the ‘constraint polynomial’

C(x) = (x - x1)(x - x2)(f (x ) - f (x∕ζ4) - f(x∕ζ8))

to be 0 at x = x1,...,x8. The condition f(x) - f(x∕ζ4) - f(x∕ζ8) = 0 guarantees that an - an-1 - an-2 = 0, and the ‘relaxing’ terms x - x1 and x - x2 allow this to be violated for n = 1, 2.

Equivalently, C(x) is divisible by the ‘zeroes polynomial’

Z(x ) =   (x - xi) = x8 - 1.

As2 deg C(x) = 2 + 2 deg f 16, this is equivalent to the ‘verification polynomial’

V(x ) = C (x)∕Z (x)

to be inded a polynomial of degree 8 in x. To summarise, the conditions

  1. {f(yi)}i=132 are values of a polynomial of degree 7

  2. {V (yi)}i=132 are values of a polynomial of degree 8

  3. V (yi) = (yi- x1)(yi-x2)
----y8i-1----(f(yi) - f(yi-4) - f(yi-8)) for i = 1,..., 32

are what Verifier needs to know that the execution trace is valid.

2.4. Boundary conditions. We used multipliers x - x1, x - x2 to relax the constraint at n = 1, 2. We can also do the opposite, and divide by a linear factor to enforce a value for n = 8. For any function F(x),

F(x) is a polynomial and F(c) = b F-(x)---b
  x - c is a polynomial.

In our example, the output value a8 = 47 guarantees that f(x)--47
 x-x8 is still a polynomial, so we replace the condition (1) by

2.5. Mixing. There is a simple observation (‘mixing’) that

F(x),G(x) are polynomialsF(x) + δG(x) is a polynomial
of degree d of degree d for any δ F.

Moreover, when F is large, one random choice of δ F is enough, as long as (a) δ comes from Verifier rather than Prover, and (b) it is given to Prover after F and G are committed.

As the low degree testing is quite expensive, we use mixing to get (1) and (2) with one test, replacing them by

Same idea can also be used to deal with multiple constraints, mixing them into one in the same way, say, with successive powers of δ.

2.6. FRI Low Degree Testing. This is also basically how FRI low degree testing works. Take a polynomial F(x) = cixi, write every term xi as xjx4k with j ∈{0, 1, 2, 3} and group those with the same j together. We get

            4         4     2    4    3    4
F (x ) = f0(x  ) + xf1 (x ) + x f2(x ) + x f3(x ),

and F is a polynomial if only if all the fj are. Heuristically, if δ F is random, it is enough to check (2.1) for a number of x-coordinates, and that

f (x) + δf (x) + δ2f (x) + δ3f (x )
 0        1         2         3

is a polynomial for a random δ F. This reduces verifying that a function is a polynomial of degree d to verifying that some other function is a polynomial of degree d∕4, and FRI is a recursive algorithm based on this. We refer the reader to [BS+18] for details and security analysis.

2.7. Deep variant. We are now down to two verifications:

The original FRI scheme would check (3) for few indices i, chosen at random by Verifier. The DEEP modification adds extra security in verifying this relation so that just one query suffices. We test the equation at a random point outside the range of committed values yi. In our example, let

     2 j
ξ = s ζ     (deep  x-coordinate)

for a random j ∈{1,..., 32} provided by Verifier. This is shifted away from the xi and the yi, so it is a different point in our base field. (In fact, we could even go a bit further, and take it from an extension Fpd over Fp for some small d, improving security even further though at a cost of Verifier running time.) Because (3) comes from a relation between polynomials, it holds with yi replaced by ξ as well. In other words,

The relation on the left is the one we test. We have to ensure that the values on the right are correct, and this is again of the type ‘prove that a polynomial passes through a given point’, so it can be incorporated into (2). Using Lagrange interpolation, construct the unique polynomial q of degree 3 so that

q(x8) = 48,  q(ξ) = v2,  q(ξ∕ ζ) = v3,  q(ξ∕ζ ) = v4

and replace (2) with

Thus, we reduced everything to one low degree test (2′′) and one additional check (3).

2.8. Fiat-Shamir heuristic. The introduction of the mixing constant δ and the DEEP parameter ξ that come from Verifier has an undesired effect to making the protocol interactive. We would like a non-interactive protocol in which Prover produces a proof, and Verifier (or anybody else) could verify its correctness at any time afterwards, with no interaction. There is a standard trick how to achieve this, the Fiat-Shamir heuristic. It asserts that random input from an honest Verifier could be replaced with a hash coming from the preceding part of the protocol. In our case, Prover commits {f(yi)}y=132 to a Merkle tree, and its Merkle root could be used as a source of the constants ξ and δ used in the rest of the protocol.

2.9. Summary. Let us collect all the steps for the validity of the trace for the Fibonacci Virtual Machine. Set F = 97, s = 5 F, ζ = 53 F (of order 32), x i = ζ4i, y i = i.


  1. ‘Execute the code’ by applying the Fibonacci rule starting from a1 = 1, a2 = 3 and up to a8 = 47. This is the ‘original execution trace’

    a =  (1, 3,4,7,11,18,29, 47).
  2. Interpolate the trace with a polynomial f, so that f(xi) = ai for i = 1,..., 8:

    f(x) = 67x7 + 7x6 + 2x5 + 28x4 +  6x3 + 74x2 + 42x + 15.
  3. Compute f(yi) for i = 1,..., 32, the ‘disguised execution trace’.

  4. Compute V (yi) = (yi-x1y)8-(yi1-x2)
     i(f(yi) - f(yi-4) - f(yi-8)) for i = 1,..., 32.

  5. Store triples (i,f(yi),V (yi)) as leaves of a Merkle tree and compute Merkle root δ. This δ will be used as a mixing parameter.

  6. Let ξ = s2ζδ, or some other pre-arranged function of δ (‘deep x-coordinate’).

  7. Set (v1,...,v4) = (V (ξ),f(ξ),f(ξ∕ζ),f(ξ∕ζ2)) (‘deep taps’).

  8. Using Lagrange interpolation construct the unique polynomial q of degree 3 so that q(x8) = 48,q(ξ) = v2,q(ξ∕ζ) = v3,q(ξ∕ζ2) = v 4.

  9. Generate a FRI low degree proof P (=some branches of the Merkle tree) that

    y ↦→  y4------------f-(yi)---q(yi)----------- + δ V(yi) --v1-
 i    i(yi - x8)(yi - ξ)(yi - ξ∕ζ)(yi - ξ∕ζ2)     yi - ξ

    is a polynomial of degree 7.

  10. The proof of validity of the execution trace is P plus the deep taps v1,...,v4.

Verifier. Given P and v1,...,v4,

3. General Compulation

Now we turn to the general setup. We work with a virtual machine running some assembly language. Prover executes a program, and generates an execution trace. For every time t = 1, 2,...,N, it stores the values of the program line counter pc, flag, registers (if any) and various auxiliary variables after the command at time t has been executed. All values lie in a finite field F = Fp with p elements, and are stored in columns of size N, say C1,C2,.


pc instructionimm⋅⋅⋅

1 ax:=ax+1717 ⋅⋅⋅
2 jmpnc 1 1 ⋅⋅⋅
⋅⋅⋅⋅⋅⋅ ⋅⋅⋅ ⋅⋅⋅

Execution trace

C 1 C2 C3 C4 C5 C6 ⋅⋅⋅
t pc tflag tax tbx timm t⋅⋅⋅

1 2 0 17 0 17 ⋅⋅⋅
2 1 0 17 0 1 ⋅⋅⋅
3 2 0 34 0 17 ⋅⋅⋅
⋅⋅⋅⋅⋅⋅⋅⋅⋅ ⋅⋅⋅⋅⋅⋅ ⋅⋅⋅ ⋅⋅⋅

Table 1: Source program and an execution trace

We will fix the following primary constants for the virtual machine:

constdescription example

W VM word size, even power of 2 232
N number of memory words & bound on the number of time steps226
p prime modulo which execution trace entries are computed 264-232+1

3.1. Goal. Our goal is to describe a protocol for Prover to convince Verifier that

Validity of the execution trace will be ensured with polynomial constraints (see 4), which are polynomial relations between columns, and boundary conditions, which are statements of the form ‘column C at time t has value a’. Let us first deal with (b) and (c) — program termination, input and output.

3.2. Program termination. To verify where the program has terminated, and that it has terminated correctly, we can have a special instruction, say halt (or answer), for a correct termination, that simply works as a jump back to the same instruction. There could be several of those in the program:

pc instruction⋅⋅⋅

⋅⋅⋅⋅⋅⋅ ⋅⋅⋅
121 halt ⋅⋅⋅
⋅⋅⋅⋅⋅⋅ ⋅⋅⋅
394 halt ⋅⋅⋅
⋅⋅⋅⋅⋅⋅ ⋅⋅⋅

If, say, the first one is reached, then the program enters an infinite loop until the time t = N. So we just need to check that the value of pc at time N is 121, say. This is a boundary constraint on the pc column.

3.3. Public input. We precede the program with a code that hashes the public part of the memory with some ZK-friendly hash (say Poseidon) and asserts that the hash agrees with a given constant h. In other words, the program cannot continue (with a valid execution trace) unless the value is h. Verifier checks that the expected public input hashes to that value.

3.4. Output. Similarly, at the end of the program, right before the termination block, we introduce a code that hashes the part of the memory responsible for the program output. Again, the code checks that the hash agrees with some value that Verifier possesses.

3.5. Recursion. These steps also allow to break the trace into ‘pieces’, which is necessary if the program takes more than N steps to execute. Each piece starts with a code that verifies that the previous one has executed correctly (which we will get to), and that its output hashes to the same value as the input of the current one.

4. Primitives

We are left to deal with (a) of 3.1 — how Prover can show the trace satisfies set of constraints with boundary conditions (6), and how to formulate those constraints that ensure validity of the program execution for our chosen VM architecture (5). A collection of columns, constraints and instruction definitions for a given VM forms a ‘configuration script’, and we begin by describing the primitives that make such a script.

4.1. Polynomial constraints. Validity of the trace will be mostly ensured with polynomial constraints. These are polynomial relations between the values of several columns at time t and t - 1. We use # to indicate ‘t - 1’, so that, for example,

constraint C1C2 + C3#C 4# = 0

stands for

C1 (t)C2 (t) + C3 (t - 1)C4 (t - 1)  for all    t = 1,...,N.

We use the convention that the time ‘loops’, so Ci(0) := Ci(N). We also allow exceptions at some specified times t:

Example 4.1 (Time column). The following constraints with a boundary condition at t = 1 force C to be the ‘time’ column C = (1, 2, 3,...,N).

constraintC = C# + 1for t1
boundary C(1) = 1

4.2. Degree. The degree of a constraint is its degree as a polynomial in the variables C1,C1#,C 2,.... For example,

degree(C = C# + 1) = 1, degree(C 12 = C 2C3) = 2, degree(C1C2C3 = 0) = 3.

Higher degree constraints are normally expensive, and in our sample VM based on FRI we have all constraints of degree 2 (quadratic). Constraints of degree 1, or, more generally, variables that occur only once and linearly in some constraint are not particularly useful, as those variables can be eliminated. Higher degree constraints, when we need them, are broken into quadratic ones, at the cost of introducing new columns. For example, for

C1C2C3  =  0

we can introduce a new column Q = C1C2, and replace the constraint by an equivalent quadratic pair

Q = C1C2,      C3Q  = 0.

If we opt for a higher degree limit, say 3 or 4 instead of 2, we can reverse this and remove some columns. The optimal choice depends on how the running time grows with the number of columns and the degree bound.

4.3. Control, derived and external columns. To formulate a set of constraints for the execution trace, it will be useful to distinguish between 3 types of columns:

Column typedescription Table 1

control Columns shared with the program (except pc) imm
that depend on the instruction and its operands
derived Columns C that are polynomial expressions in t, pc
(or variable) other columns, and possibly C#
external Columns whose values are computed ‘externally’,flag
outside the configuration script

Control columns come from the program (e.g. all immediate values of the instructions), derived ones are polynomials in other columns, and the rest are external. Write

var C5 = C1C2 + C3C4,

to specify control and derived columns. Every var creates a column and also generates the corresponding constraint. If we wish to store, say, the highest bit of the values of one column C in another column H, then H is not a polynomial expression in C, so H is external. Prover can have a function highestbit to compute the highest bit of a machine word w [0,...,W-1], and the configuration command

external H=highestbit(C)

states that H is computed externally, and is a function of C (needed to order the computation of the derived columns and such external calls). This generates no constraints, so they have to be specified manually (say H(H-1) = 0 etc., in our example).

Example 4.2 (Jumps). This is how conditional (on flag) and unconditional jumps can be implemented; cf jmp, cjmp, cnjmp in Table 2:

external flag
control j1, j2, addr
var j = j1+j2*flag
var pc = (1-j)*(pc#+1)+j*addrfor t0

Here pc is a derived column, controlled by the values of j1, j2 and addr. When j1=j2=0, this forces pct=pct-1+1, so the program counter is incremented by one (default behaviour). If j1=1, j2=0, it becomes an unconditional jump to addr. For j1=0, j2=1, this is cjmp, and for j1=1, j2=-1, this is cnjmp.

4.4. Instructions. Generally, everything about an instruction is encoded in the control variables, either via immediates (e.g. addr in 4.2) or control flags (e.g. j1, j2 in 4.2). We will write instructions as follows:

inst instructionname⟩⟨operand1⟩⟨operand2... : flag1=value, flag2=value, ...

with operands and flags from control columns; for example (see 4.2 again),

instjmp addr : j1=1, j2=0
inst jmpc addr : j1=0, j2=1
inst jmpnc addr: j1=1, j2=-1

4.5. Permutation. The last primitives that we need are permutation and lookup, two very useful tools derived from polynomial constraints with boundary.

Let us start with permutation. Suppose we have two columns C1, C2 and we want to show that they are permutations of one another. That is, they have the same N elements,

{C1,1,...,C1,N} = {C2,1,...,C2,N }

as unordered multisets, but possibly in different order. Let us denote this as

C1  ~ C2.

Note that we do not want to reveal any of the elements or anything about the permutation. The trick is to note that having such a permutation is equivalent to the equality of polynomials t=1N(x-C 1,i) = t=1N(x-C 2,i), which can in turn be checked by substituting a random value of β for x. Thus,

                  N∏  β - C1,i
C1 ~ C2    ⇐ ⇒       -------- = 1.
                  t=1 β - C2,i

Like in the Fibonacci example, we get β through the Fiat-Shamir heuristic, as a Merkle root of previously committed columns. We view the probability that β happens to equal to one of the C2,i (or C1,i) as negligible.

Build an auxiliary column H whose values accummulate the partial products Ht = i=1tβ--C1,i
β- C2,i. Then H satisfies a recursion (which is how we compute it) Ht = Ht-1β-C
β-C12,t,t-, so we have a quadratic constraint with boundary

H#  (β - C1) = H (β - C2 ),    H (0) = H (N ) = 1.

If C1,C2 and H satisfy this constraint, then C1 and C2 are permutations of one another. To summarise, we write

permutation[β] C1 ~ C2


set H = ( i=1tβ-C1,i-
boundary H(0) = H(N) = 1
constraintH#(β - C 1) = H(β - C2)

4.6. Multiple permutation. As a slight extension, say we have two lists of m columns C1(1),...,C 1(m) and C 2(1),...,C 2(m). We can show that the unions j=1mC 1(j) and j=1mC 2(j) have the same mN values as multisets, with an obvious generalisation

set H = ( i=1t j=1mβ-C (j)
β-C 2,i)t=1N
boundary H(0) = H(N) = 1
constraintH# j=1m(β - C 1(j)) = H j=1m(β - C 2(j))

except that the resulting constraint is of degree m + 1 and has to be split into m quadratics. We write

permutation[β] C1(1),...,C 1(m) ~ C 2(1),...,C 2(m)

for the resulting scheme.

4.7. Lookup. Say we have two columns F = (Fi)i=1n (‘query’) and T = (T i)i=1d (‘table’). We want to prove that F T as sets, ignoring repetitions. Let S be a vector of length n + d. Gabizon et al. [GW20] prove that the three conditions

  1. F T as a set (unordered, ignoring repetitions)

  2. S = F T as a multiset (unordered, with correct multiplicities)

  3. S is sorted by T (if Si = Ti,Sj = Tj, then i<j i<j)

are satisfied if and only if the following polynomials in F[β,γ] are equal:

         ∏n         d∏-1                          n+∏d-1
(1 + β)n    (γ + Fi )   (γ(1 + β) + Ti + βTi+1 ) =     (γ(1 + β) + Si + βSi+1).
         i=1         i=1                           i=1

We will take d = n + 1 = N, the only case we need. Then this takes the form

 n      n      2n
∏      ∏       ∏
    ⋅⋅⋅   ⋅⋅⋅∕    ⋅⋅⋅ = 1.
 i=1    i=1      i=1

Splitting the last product into pairs, we get a condition of the form i=1nQ i = 1, which can be verified by building a column with partial products Zi = j=1i-1Q j, and imposing the conditions that Z1 = Zn+1 = 1 and a recursive formula Zi+1 = ZiQi; this is very similar to the permutation scheme in 4.5-4.6.

In i=12n⋅⋅⋅ , Bootle et al. group the ith and (i + n)th term together, and Pearson et al. [GW20] observe that it is more efficient to group ith and (i + 1)st, saving one boundary condition. The ‘cost’ of the lookup in terms of quadratic constraints is

  • 4 additional columns: Z, Q, and two halves H1, H2 of S,

  • 2 quadratic constraints Zi+1 = ZiQi and the expression for Qi,

  • boundary conditions Z(1) = Z(n + 1) = 1.

Similarly to 4.6, we can generalise this scheme to multiple queries with the same table:

Algorithm 4.3. Let (Fi(1)) i=1n,...,(F i(k)) i=1n be k queries for the same table T = (T i)i=1n+1.


  1. Set S = T F(1) F(k) as a multiset, sorted by T.

  2. Set H(j) = (S j,Sj+k+1,Sj+2(k+1),..., ) for j = 1,...,k + 1.
    Thus, |S| = kn + n + 1, |H(1)| = n + 1 and |H(2)| = ... = |H(k+1)| = n.

  3. Commit H(1),...,H(k+1).

  4. Get random constants β,γ k from Verifier3 .

  5. Compute columns Z, Q(1),...,Q(k) recursively, as follows.

    10 111:   Z1 1
    112:   for i 1 to n do
    113:    v γ(1+β)+Hi(k+1)+βH i+1(1)
    114:    for j k to 1 by - do
    115:    v *:= (γ(1 + β)+Hi(j)+βH i(j+1))(γ+F i(j))
    116:    Qi(j) := v
    117:    end for
    118:    Zi+1 := Zi(γ(1+β)+Ti+βTi+1)(1+β)k∕v
    119:   end for

  6. Commit Z, Q(1),...,Q(k).

Verifier: Check that Z1 = Zn+1 = 1, and the following k + 1 quadratic constraints. For all i = 1,...,n,

Zi+1Q (1i) = Zi(1 + β)k(γ(1+ β )+Ti +βTi+1 ),
            (j)     (j+1)  (j+1)         (j)  (j)
(γ(1+ β)+H  i(k+) βH i(k+1))Qi    =  (γ+F(ik+1))Qi   (fo1)r   j = 1..(.kk) -(1k,)
(γ(1+ β)+H  i + βH  i   )(γ(1 + β)+H  i    +βH  i+1) = (γ +F i )Q i .

We write

lookup[β,γT  F(1),...,F(k)

for the resulting scheme, and group lookups with the same table. We will use this primitive

  • to check that numbers are in the word range in 5.6, and

  • to check that we are executing the correct program in 5.9.

4.8. Injective lookup. There is a version of this, called ‘bounded lookup’ [BD23], that puts a cap m on how many times an entry in T can occur in the F(j) in total. In practice, the F(j) and T can have different sizes (n), and we pad them with a ‘dummy’ value μ T, which will be yet another Fiat-Shamir constant. We only need the m = 1 case, ‘injective lookup’. This is the case used for memory accesses in [B+18Ce19], and we will use it in a very similar way. The algorithm then shows that every value Tiμ occurs at most once in jF(j). This case is simpler, and has a cost of 2k auxiliary columns.

View F(j) as having length n + 1, padding them with μ if necessary. What we want to show is that there are pairwise disjoint sets of indices X(j) ⊂{1,...,n + 1} such that

F(j) = permutation of {T i if i X(j) and μ if i∕∈X(j)} i=1n+1.

We already described how permutation can be implemented, and we get:

Algorithm 4.4 (Multiple Queries Injective Lookup). Let (Fi(1)) i=1n+1,...,(F i(k)) i=1n+1 and T = (Ti)i=1n+1 be vectors in a field F, and μ T. The algorithm provides a proof that Fi(j) T, and every T iμ occurs at most once as an entry.


  1. Let X(j)⊂{1,...,n + 1} be pairwise disjoint sets of indices for j =1,...,k so that

    F(j) = permutation of T(j), T(j) = {T i if i X(j) and μ if i∕∈X(j)} i=1n+1.

  2. Commit F(j), T and T(j) for j = 1,,k.

  3. Get random constant β F from Verifier.

  4. Define vectors z(j) = ( l=1iβ-F(lj)
   l)i=1n+1 for j =1,...,k, and commit them.

Verifier: Verify that

  1. zi(j)(β-T i(j)) = z i-1(j)(β-F i(j)) for i = 1,..,n+1 and z n+1(j) = 1, for j = 1,...,k.

  2. (Ti(j) - T i)(Ti(j) - μ) = 0 for i = 1,..,n + 1, and j = 1,...,k.

  3. (Ti + (k - 1)μ - j=1kT i(j))(- j=1kT i(j)) = 0 for i = 1,..,n + 1.

The first condition shows that F(j) and T(j) are permutations of one another. The second shows that T(j) either agrees with T or equals μ at every place. The last one guarantees that at every place T(j) can agree with T for at most one j, which is equivalent to the X(j) being disjoint.

We refer the reader to [A+22] for the discussion of lookup in the context when the F(j) and T have very different size, and optimizations for Prover in that case.

We write

ilookup[β,μT  F(1),...,F(k)

for the resulting injective lookup scheme, and, again, group ilookups with the same table. This will be used in memory access checks in 5.8.

4.9. Permutation and lookups with multiple columns. Both permutation and lookup schemes can be extended to cover multiple columns. For example, if we we want to show that the sets of rows in a collection of columns C(1),...,C(k) is the same as a the set of rows in a different collection D(1),...,D(k), we can mix the columns together with a Fiat-Shamir constant α and check for permutation:

var L = C(1) + αC(2) + α2C(3) + + αk-1C(k)
var R = D(1) + αD(2) + α2D(3) + + αk-1D(k)
permutation[β]L ~ R

The same idea works for lookups and injective lookups as well.

5. Instruction set

We now describe how to implement a full instruction set in terms of the above primitives (apart from jumps that have already been covered in Example 4.2 and 4.4). See Table 2 for the descriptions of what they do, and a summary of control column settings. We follow [B+18] closely here, and the instruction set is very close that of TinyRam [BS+13]. The only substantial change is that we propose a memory model that is quite different: we have no registers, as every memory cell can work as a register. As memory is quite ‘cheap’ in terms of ZK costs and every register adds a few additional columns, we found this scheme more efficient; see 5.8.

InstructionOperandsEffect flag set if
o1 o2 o3

and o1 o2 o3 o1:=o2 BitwiseAND o3 result=0
or o1 o2 o3 o1:=o2 BitwiseOR o3 result=0
xor o1 o2 o3 o1:=o2 BitwiseXOR o3 result=0
not o1 o3 o1:=BitwiseNOT o2 result=0

add o1 o2 o3 o1:=o2+o3 mod 2W overflow: o2+o32W
sub o1 o2 o3 o1:=o2-o3 mod 2W underflow: o2-o3< 0
mull o1 o2 o3 o1:=o2×o3 mod 2W underflow: o2×o3< 2W
umulh o1 o2 o3 o1:=o2×o3 div 2W underflow: o2×o3< 2W
smulh o1 o2 o3 o1:=o2×signedo3 div 2W underflow: result=0
udiv o1 o2 o3 o1:=o2 div o3 division by 0: o3=0
umod o1 o2 o3 o1:=o2 mod o3 division by 0: o3=0

cmpe o1 o2 compare if = o1=o2
cmpa o1 o2 compare if > o1>o2
cmpae o1 o2 compare if o1o2
cmpg o1 o2 signed compare if > o1> signedo2
cmpge o1 o2 signed compare if o1signedo2

mov o1 o2 o1:=o2 unchanged
cmov o1 o2 o1:=o2 if flag=1 unchanged

jmp o1 pc:=o1 unchanged
cjmp o1 pc:=o1 if flag=1 unchanged
cnjmp o1 pc:=o1 if flag=0 unchanged

answer o1 halt with exitcode o1 unchanged

Inst Operands cFLAG δ ϵj1j2 λ eAND eMOD eOR ePROD eSPROD eSSUM eSUM eXOR

and [i1] [i2] [i3]|i4 1 1
or [i1] [i2] [i3]|i4 1 1
xor [i1] [i2] [i3]|i4 1 1
add [i1] [i2] {[i3]}{i4} 11 1 1 1
sub [i1] [i2] {[i3]}{i4} 11-1 -1 1
mull [i1] [i2] [i3]|i4 1 1 1
umulh[i1] [i2] [i3]|i4 1 1
smulh[i1] [i2] [i3]|i4 1 1
udiv [i1] [i2] [i3]|i4 1 1 1
umod [i1] [i2] [i3]|i4 1 1
cmpe [i2] [i3]|i4 1 1
cmpa [i3] {[i2]}{i4} 1 -1 -1 1
cmpg [i3] [i2] i4 1 1
mov [i1{+s1[i2]}] [i3{+s3[i2]}] 1 1
jmp i4{+[i3{+s3[i2]}]} 1
jmpc i4{+[i3{+s3[i2]}]} 1
jmpnci4{+[i3{+s3[i2]}]} 1-1

Table 2: Instruction set and control variable settings

5.1. Operands and flag. All our instructions have 3 operands v1 (output), v2 (input 1), v3 (input 2), and some need an advice variable v4, and/or use or affect flag:

v2 (input 1), v3 (input 2), flagt-1 =⇒ v1 (output), flagt.

We will add range checks that v1,...,v4 are all in the word range [0,...,W-1] for every t (see below), and the following constraints for the flag:

flag = 0 or 1
(flag-flag#)*(1-c flag) = 0
flag allowed to change if cflag=1

5.2. Unsigned multiplication. Here is an example of a pair of instructions mull (product - lower word) and umulh (product - higher word, unsigned); they affect the flag, so cflag is set:

eprod, ϵ
hi = ϵ*v4+(1-ϵ)*v1
lo = ϵ*v1+(1-ϵ)*v4
prod = v2*v3 - (lo+W*hi)
inst mull : eprod=1, cflag=1, ϵ=1
inst umulh: eprod=1, cflag=1, ϵ=0

Control variable ϵ determines whether v1 (output) is the higher word hi or the lower word lo; v4 (advice) is the other one. Condition prod=0 and range checks ensure that lo and hi are computed correctly. Control variable eprod is set to 1 for mull/umulh and to 0 for all other instructions, turning the prod=0 check on/off.

5.3. Setting flag for zero result. Multiplication and several other instructions (see Table 2) set the flag when either v1 or hi (zero result) or input v3 (division by 0) is zero. To check that this is done correctly, we introduce a column zer that keeps control of the expression whose vanishing sets the flag. For example, in the case of unsigned multiplication, eprod = 1 and zer=hi. In the case of an instruction which does not need this check all the control flags in the expression for zer are zero, and zer=1-flag (which passes the check).

zer = (eand+eor+exor+esprod+essum)*v1 + eprod*hi + emod*v3
+ (1-(eand+eor+exor+esprod+essum)-eprod-emod)*(1-flag)
Inv = invert(flag+zer)

The last 3 lines ensure that zer=0 flag =1. Here invert is an external function that sends an element a F to its inverse 1∕a if a0 (and has unspecified effect when a = 0).

5.4. Unsigned arithmetic, comparisons and move. Addition and unsigned comparison are done in a similar way to unsigned multiplication. Control variables δ, ϵ, λ determine whether we are doing addition, subtraction or comparison or simply move v1:=v3, and the overall check is otherwise the same. The check also ensures that the flag is set correctly.

esum, δ, ϵ, λ
SUM = v1-(δ*v2+ϵ*v3+λ*(i4-W*flag))
inst add : esum=1, cflag=1, δ=1, λ=1
inst sub : esum=1, cflag=1, δ=1, λ=-1
inst cmpa: esum=1, cflag=1, ϵ=-1, λ=-1
inst mov : esum=1, ϵ=1

Next we turn to unsigned division (udiv) and remainder (umod),

quo = v2 div v3, rem = v2 mod v3.

As for the lower/higher word dichotomy in case of multiplication, we use the same verification block for the two instructions, with a flag ϵ that control which of the variables quo and rem is the output v1, and which one is an advice word v4.

emod, ϵ
rem = ϵ*v4+(1-ϵ)*v1
quo = ϵ*v1+(1-ϵ)*v4
MOD = v2-(quo*v3+rem)
MODrange = (1-flag)*(v3-rem-1) + flag*(-quo)
inst udiv : emod=1, ϵ=1, cflag=1
inst umod: emod=1, ϵ=0, cflag=1

As above, we first check that flag is set if and only v3=0 (division by 0). We will check below that the column MODrange is in the word range [0,...,W -1]. When flag=1 and v3=0,

MOD = v2-rem, MODrange=-quo,

so MOD=0, 0MODrange W -1 test that the quotient is 0 and the remainder is v2 in the division-by-0 case. In the normal case flag=0 and v30, MOD=0, 0MODrange W-1 test that v2,v3 satisfy the quotient/remainder relation and that the remainder, which is either v1,v4 and so 0, is moreover <v3.

5.5. Logic. There is a neat way to convert Fp-unfriendly logic operations to Fp-friendly arithmetic ones [B+18]. We can replace NOT by XOR with -1 at the stage of reading the code, so we just need AND, XOR and OR. The key observation is that for a,b ∈{0, 1},

a + b = 2(aAND   b) + (a XOR b).

The same is true if a,b have only even bits in their binary expansion, so splitting machine words into even and odd bits helps convert addition to AND and XOR.

Notation 5.1. For a machine word a decomposed into bits, we write

    ∑      i
a =     ai2 =  a0+2a1 +4a2 +8a3 +...=  ae + 2ao,

with ae = i evenai2i the ‘even bits’ and a o = i oddai2i-1 the shifted ‘odd bits’.

In this notation, for machine words a,b,c with a + b = c, we have

aAND   b =   2(ao + bo)o + (ae + be)o,
aXOR   b =   2(a  + b ) + (a  + b ),
                o    o e    e    e e
aOR  b   =   a AND  b + aXOR   b.

This converts to the following code and instructions:

eand, exor, eor
r1 = v2o+v3o
r2 = v2e+v3e
AND = v1-(2*r1o+r2o)
XOR = v1-(2*r1e+r2e)
OR = v1-(2*r1o+r2o+2*r1e+r2e)
eand*AND=0, exor*XOR=0, eor*OR=0
inst and: eand=1, cflag=1
inst or : eor=1, cflag=1
inst xor : exor=1, cflag=1

5.6. Word range. To finish off the logic checks, it remains to verify that all x ∈{v1,v2,v4,v4,r1,r2} decompose as

x = xe + 2xo,     xe,xo ∈ EvenBits,

where EvenBits is the table of all numbers in the word range that have only even bits in them. The equality on the left is just another constraint, and we implement the range check on the right with a table lookup (see 4.7),

EvenBits xe,xo

5.7. Signed arithmetic. Recall that our operand variables vi are unsigned, lying in the word range [0,...,W-1]. To implement signed arithmetic, for v ∈{v1,v2,v3} define a columm vσ for the highest significant bit of v, and vsigned=v-vσW [-W∕2,...,W∕2-1] for the signed version of v:

vσ = highestbit(v)
external call to compute vσ
must be 0 or 1
in [-W/2,...,W/2-1]
provided this is non-negative

We check that vsgncheck is in word range, in the same way as for the vi. Then we can define signed multiplication and signed comparison:

ssum = v1 - (v2signed+i4-v3signed+W*flag)
sprod = v2signed*v3signed- (v4+W*v1signed)
inst smulh : esprod=1, cflag=1
inst cmpg : essum=1, cflag=1, i4=0
inst cmpge: essum=1, cflag=1, i4=1

5.8. Memory. Recall that most of the instructions work as

v2 (input 1), v3 (input 2) =⇒ v1 (output).

We have no registers, and every memory cell can be any of the operands. At every time t we always read v2 and v3 and write v1. We reserve Mem[0](=Mem[N]) for a ‘dummy’ memory cell that we read or write when needed, but which is not used by the program. For example, when executing

mov 10,20 (Mem[10]:=Mem[20])

we need to ensure that v3 is read from Mem[20], v2 from Mem[0] and v1 written to Mem[10], while the instruction (mov) checks that v1 = v3 (see 5.4).

This memory scheme is implemented as follows. For every operand index i = 1, 2, 3 we define columns

vi,addrmemory address accessed
vi value in Mem[vi,addr] after access
vi,vinit initial value in Mem[vi,addr] when program starts
vi,usd 1 if memory has been accessed before, 0 else
vi,tlink previous time memory was accessed if vi,usd=1, and
last time when memory will be accessed if vi,usd=0
vi,vlinkvalue in Mem[vi,addr] after access at time vi,tlink

Column tm and the word ‘time’ in the table refer to the ‘memory access clock’ that runs at the triple speed to the time t column. When executing an instruction at time t, we

  • read v2 at tm = 3t-2,

  • read v3 at tm = 3t-1, and

  • write v1 at tm = 3t.

For example, suppose that Mem[1] is initialised to 0, Mem[2] to 10, and Mem[1], Mem[2] are accessed twice during program execution at times t1 < t2, once to store Mem[1]:=Mem[2] (:=10) and once to add Mem[1]:=Mem[1]+Mem[2] (:=20). Here are the values of the memory columns in this case (*’s in the first row stand for some values that are not important here):


t instruction(tmi) (i)v i,addrvi vi,vinitvi,usdvi,tlinkvi,vlink

t1 mov 1, 2 3t1-22 0 * * * * *
3t1-13 2 0 10 0 3t2-110
3t1 1 1 100 0 3t2 20
t 2 add 1,1,2 3t2-22 1 100 1 3t1 10
3t 2-13 2 1010 1 3t1-110
3t 2 1 1 200 1 3t2-210
⋅⋅⋅ ⋅⋅⋅

Here are the constraints on the valid memory access:

(M1) Memory access form cycles. In our example, we have cycles 3t1 3t2-2 3t2 3t1 of length 3 for Mem[1] and 3t1-1 3t2-1 3t1-1 of length 2 for Mem[2], such that tlink at every point agrees with time tmi of the previous one, and vlink agrees with v i of the previous one. In other words, if we look at the overall set of 3N triples

(                      )
 vi,addr,t,vi,tlink,t,vi,vlink,t              ,

it is a permutation of the 3N triples

(                     )
 vi,addr,t,3(t- 1)+i, vi,t              .

Mixing the three columns into one with some Fiat-Shamir constant α as in 4.9, this becomes a permutation of the type

union of 3 columns ~ union of 3 other columns

of the type discussed in 4.6 (with m = 3). Here is the resulting code:

v1πL = v1 addr + α*(v1tlink) + α2*v1 vlink
v1πR = v1 addr + α*(3*t) + α2*v1
v2πL = v2 addr + α*(v2tlink) + α2*v2 vlink
v2πR = v2 addr + α*(3*t-2) + α2*v2
v3πL = v3 addr + α*(v3tlink) + α2*v3 vlink
v3πR = v3 addr + α*(3*t-1) + α2*v3
v1πL,v2 πL,v3 πL ~ v1 πR,v2 πR,v3 πR

(M2) Memory locations cannot be in more than one cycle. In our example, we need to check that there are no times tt1,t2 for which vi,addr,t = 1 or 2 for some i. To do this, let MeInit be the initial state of the memory, and consider tables

(0, a, MeInit[a])a=1N and (1, a, MeInit[a]) a=1N

We want to show that for every i and t, the triple

(vi,usd,t, vi,addr,t, vi,vinit,t)

is contained in the first table exactly once when vi,usd,t = 0 and contained in the second table when vi,usd,t = 1. This ensures that all memory accesses belong to valid cycles, with each cell belonging to one cycle. This can be done with an injective lookup and a lookup, respectively. Again we mix the three columns using α:

Initial Memory, private
MeTable = 0 + α*t + α2*MeInit

Here MeTable is the mixed column corresponding to the first table, and for the second table it is simply MeTable+1 (replace 0+ by 1+ in its definition).

Let μ be another Fiat-Shamir constant. This is the code that we use for v ∈{v1,v2,v3}:

vMe = vusd+α*vaddr+α2*v vinit
mix (usd,addr,vinit) for v
vMe0 = (1-vusd)*vMe+μ*vusd
Replace usd=1 entries by μ
vMe1 = vMe+μ-vusd-vMe0
Replace usd=0 entries by μ
(vMe0 - μ)*(vMe1 - μ)=0
check one of them holds
MeTable vMe0
injective lookup when usd=0
MeTable vMe1
lookup when usd=1

(M3) Access times are in increasing order. We need to verify that times in the cycle are ordered correctly. That is, tmi < v i,tlink when vi,usd = 1, and tmi v i,tlink when vi,usd = 0. This guarantees that there is only one usd=0 entry in every cycle, at the smallest time. We can do both checks in one with a word range verification (see 5.6)

(1 - v   ) * (v    -  ti ) + v    * (ti - v    - 1 ) ∈ [0,...,W - 1].
      i,usd     i,tlink   m      i,usd    m    i,tlink

(M5) Load instructions are correctly executed. For every memory load operation (in other words, for v2 and v3), we have to check that vi = vi,vinit when vi,usd = 0 and vi = vi,vlink when vi,usd = 1:

for v2
for v3, unless v3 replaced by i4
in XXXi instructions (econst=1)

(M6) Memory addresses are correct. Finally, we need to check that we are accessing the right memory addresses.4 For most instructions the addresses for v1, v2, v3 are simply the immediates i1,i2,i3, but we wish to allow additional flexibility for mov’s and jmp’s; this is important for an assembly language that wants to support pointers of any kind. Specifically, in addition to

mov Mem[i1]:=Mem[i3]

we allow a more general version

mov Mem[i1 + s1 Mem[i2]]:=Mem[i3 + s3 Mem[i2]]
jmp/jmpc/jmpncMem[i3 + s3 Mem[i2]]

This is easy to implement as follows:

v1addr = i1+s1*v2
v2addr = i2
v3addr = i3+s3*v3

5.9. Program lookup. Finally, we mix the program counter column pc and all the control columns into one column InstExe, with some Fiat-Shamir constant α. We also mix the corresponding columns in the original source program into a column InstProg, and we note that this column (like, for example, EvenBits as well) is accessible to Verifier. Verifier knows what the program is, and needs to be certain that this is the one that Prover has executed. In other words, Verifier needs to be certain that all entries in InstProg are contained in InstExe. This is a lookup

InstProg InstExe

Not that is is certainly not an injective lookup, as the same program entry is generally executed multiple times, at different times t. The lookup does not reveal those times, only that the correct program was executed.

6. Prover/Verifier protocol

In Appendix A we summarise all the column declarations, constraints and instructions described in 5 in a configuration file. We are now ready to explain what the Prover/Verifier code looks like for verifying the correctness of an execution trace for our VM. This is done in two steps:

  1. The configuration file is analysed to generate the number and the names of columns and the set of all primitives, correctly ordered. These are the quadratic constraints on the columns (possibly relaxed or with boundary), permutations, lookups and instruction settings. This is done once per configuration, and the output of this process is pieces of code (say, in C or assembly), that are configuration specific, inserted into the Prover/Verifier code.

  2. The Prover/Verifier code is generally independent of the configuration apart from the pieces in (1) and a few external functions (such as highestbit, oddbits, CompleteMemoryCycles,... in our configuration). It handles the FRI machinery, but is agnostic to the exact columns and constraints.

The reason for this split is that the Prover/Verifier code that we get in the end can be highly optimised, and also adapted to a specific configuration. If, for example, a VM with registers or a VM with a tailored set of instructions is needed for a certain application, its configuration can be compiled quickly to an optimised Prover/Verifier code.

6.1. VM configuration. This is how the configuration file (Appendix A) is processed:

  • The configuration file is read, macros are processed and comments are removed. What remains is a sequence of commands var, constraint, inst, etc., one line per command. Underline signs are merged into identifiers

  • Lookups and injective lookups with the same table are grouped together

  • Permutations and lookups are replaced by column and constraint generating primitives (see 4.54.9)

  • Derived columns for which there is a linear expression in other columns can be eliminated; more generally, at this stage columns and constraints can be rewritten to any specified degree bound (not just quadratic, like in our example)

  • Derived and external columns are ordered in order that they can be generated

  • Column-dependent pieces of Prover/Verifier code are generated

6.2. Prover code for configured VM.

6.2.1. Global settings.

  • Set global constants p (prime), N (memory size and bound the number of time steps) and W (VM word size)

  • Let ζ Fp be an element of order 4N, and s Fp of order not dividing 4N

  • Let xi = ζ4i for i = 1,...,N, coordinates for the original trace

  • Let yi = i for i = 1,..., 4N, coordinates for the disguised trace

  • Choose a hash function to be used by the Merkle trees

  • Choose a pseudo-random generator (PRNG) for elements of Fp

6.2.2. Initialisation and program loading.

  • Create column array and fill it with zeroes

  • Read source program, initialising all control columns

  • Initialise memory and other initial tables (e.g. MeInit, EvenBits)

6.2.3. Program execution.

  • Execute program step by step

  • Along the way, populate all derived and external columns, excepts those that are global or rely on other global columns or Fiat-Shamir constants in their derivation

  • After the program has been executed, populate global columns except those that rely on Fiat-Shamir constants (e.g. complete memory cycles)

6.2.4. Stage 1 column commitment.

  • Interpolate all columns Cj computed so far from the execution trace with polynomials fj of degree N that have those values at the xi (using NTT, the number-theoretic transform)

  • Evaluate each fj at all the yi, using NTT again

  • Create a Merkle tree with 4N leaves Hash(i,f1(yi),f2(yi),...) for i = 1,..., 4N

  • Let r1 be the Merkle tree root, viewed as an element of Fp

6.2.5. Fiat-Shamir constants.

  • Compute all Fiat-Shamir constants used for mixing, permutations, lookups etc. (in our implementation α, β, γ, μ) with PRNG seeded to the Merkle root r1

6.2.6. All other columns, verification polynomial and commitment #2.

  • Compute all remaining columns (e.g. InstProg) in the execution trace (*)

  • Compute the values of verification polynomial using the constraints

  • Interpolate the verification polynomial at xi and evaluate it at the yi (**)

  • Commit all columns (*) and (**) as in 6.2.4 to a new Merkle tree with root r2.

6.2.7. FRI parameters.

  • Seed PRNG with its current seed from 6.2.5 plus r2

  • Using PRNG, pick a deep evaluation point in Fp, checking that it is not one of the xi or the yi

  • Generate FRI branch locations with PRNG

6.2.8. FRI low degree proof.

  • Evaluate the verification polynomials and all the polynomials interpolating the columns at the deep point and its ‘time step predecessor’, deep point dividing by ζ4; these are the deep taps

  • Generate FRI degree proof with branch locations from 6.2.7

6.2.9. Prover Verifier. The proof of correct program execution consists of

  • Merkle tree roots r1 and r2

  • Deep taps

  • FRI low degree proof for the verification polynomial, consisting of a number of branches for the Merkle trees

6.3. Verifier code for configured VM.

  • Get r1, r2, deep taps and the FRI proof as in 6.2.9 from Prover

  • Set the same global settings as in 6.2.1

  • From r1, set the same Fiat-Shamir parameters as in 6.2.5

  • From r2, set the same FRI parameters as in 6.2.7

  • Interpolate polynomials for columns that are not determined by the constraints but shared between Prover and Verifier (e.g. InstProg, EvenBits)

  • From these values and the deep taps compute the values of all column polynomials at the deep point

  • Deduce the values of the verification polynomial at all the FRI branch locations

  • Check that these values together with the column values hash to the leaves of the FRI branches

  • Check the all the FRI lead to the correct Merkle roots

  • Check that the FRI low degree proof is correct


[A+22]    H. M. Ardevol, J. B. Mel�, D. Lubarov, J. L. Mu�oz-Tapia, RapidUp: Multi-Domain Permutation Protocol for Lookup Tables, Cryptology ePrint Archive, Report 2022/1050, 2022,

[BS+13]    E. Ben-Sasson, A. Chiesa, E. Tromer, M. Virza, Succinct non-interactive arguments for a von neumann architecture, Cryptology ePrint Archive, Report 2013/879, 2013,

[BS+18]    E. Ben-Sasson, I. Bentov, Y. Horesh, M. Riabzev, Fast reed-solomon interactive oracle proofs of proximity, in: 45th international colloquium on automata, languages, and programming (icalp 2018), Schloss Dagstuhl-Leibniz-Zentrum f�r Informatik, 2018.

[BS+19]    E. Ben-Sasson, L. Goldberg, S. Kopparty, S. Saraf, DEEP-FRI: Sampling outside the box improves soundness, preprint, 2019, arXiv: 1903.12243.

[B+18]    J. Bootle, A. Cerulli, J. Groth, S. Jakobsen, M. Maller, Arya: Nearly Linear-Time Zero-Knowledge Proofs for Correct Program Execution, Cryptology ePrint Archive, Report 2018/380, 2018,

[BD23]    A. Bulkin, T. Dokchitser, Plonkup scheme with multiple queries, Cryptology ePrint Archive, Report 2023/066, 2023,

[B+23]    B. B�nz, J. Bootle, D. Boneh, A. Poelstra, P. Wuille, G. Maxwell, Bulletproofs: Short Proofs for Confidential Transactions and More, 2018 IEEE Symposium on Security and Privacy (2018), 315–334.

[Ce19]    Efficient Zero-Knowledge Proofs and their Applications, PhD thesis, UCL, 2019.

[GW20]    A. Gabizon, Z. J. Williamson, plookup: A simplified polynomial protocol for lookup tables, Cryptology ePrint Archive, Report 2020/315, 2020,

[GMR85]    S. Goldwasser, S. Micali, C. Rackoff, The knowledge complexity of interactive proof-systems (extended abstract), in: Proceedings of the 17th Annual ACM Symposium on Theory of Computing, May 6-8, 1985, Providence, Rhode Island, USA, pp 291–304, 1985.

[KZG]    A. Kate, G. M. Zaverucha, I. Goldberg, Constant-Size Commitments to Polynomials and Their Applications, in: Abe, M. (eds) Advances in Cryptology - ASIACRYPT 2010, LNCS, vol 6477 (2010), Springer, Berlin, Heidelberg.

[P+22]    L. Pearson, J. Fitzgerald, H. Masip, M. Bell�s-Mu�oz, J. L. Mu�oz-Tapia, PlonKup: Reconciling PlonK with plookup, Cryptology ePrint Archive, Report 2022/086, 2022,

Appendix A. VM Configuration

Configurator commands
One line per command, with the exception of multi-line macros

const k1, ... global constant declaration (N=#time steps, W=#bits, ...)
const FiatShamir α, ... random constants used in constraints, computed after the
first commitment via Fiat-Shamir heuristic
var c1=C1 [except t=t0], ... new derived column and its polynomial expression
constraint C1=C2 [except t=t1], polynomial constraint between columns
boundary c1(t1)=v1, ... boundary condition on a column
external c1,c2, ... : cmd call C++ code at every time t, replacing c1↦→C[c_c1][t],...
columns in rhs (except c1,..., and those referred with ~)
need already to be computed at this time
external c1=cmd shorthand for external c1: c1=cmd
global c1,c2, ... : cmd call C++ code with strings replaced c1↦→ c_c1,...
all columns in rhs (except c1,..., and those referred with ~)
need to be computed first for all t
global c1=cmd shorthand for global c1: c1=cmd
control c1[=v1],... define a control column and its default value if 0
inst name i1,... : e1=v1,... define a new instruction name with operands i1,...
and control flags e1=v1,...
permutation[β] c1,c2,... ~d1,d2,... permutation constraint ci ~ di with F-S constant α
lookup[β,γ] T F1,F2,... add F1,... to the lookup “F1<T” with F-S constants β, γ
ilookup[β,μ] T F1,F2,... add F1,... to the injective lookup “F1<T except μ
with F-S constants β, μ
macro name par1,par2,...= { code } define macro name with parameters par1,... and a given code
macros are processed before “ ” are merged in identifiers
// ... inline comment, to end of line
line \ line \ ... ignore line breaks, merging several lines into one command

Configuration code

const N,W,W4
const FiatShamir mu,alpha,beta,gamma         // from Verifier via Fiat-Shamir

var t=t#+1 except t=0                        // time column
boundary t(0)=0

external v1,v4,flag : execute(c_prev,c,~v1,~v4,~flag,v2,v3)   // v2,v3 -> v1(output),v4(advice),flag
global EvenBits = EvenBitsTable(k,Round(Log(2,W)),N)          // EvenBits: sums of powers of 4
global MeInit: for i:=1 to #M do C[MeInit][i]:=k!M[i]; end for;   // MeInit: Initial Memory (private)

// --- Macros, columns and constraints

macro word C = {                // decompose C into evenbits and odd bits
  external C_o = oddbits(C)     // and check that their values are in [0..W-1]
  var C_e = C-2*C_o
  lookup[beta,gamma] EvenBits C_e,C_o

macro Mem v,address,t = {
  external2 v_vlink, v_tlink
  word v
  var v_addr = address
  var v_Me = v_usd+alpha*v_addr+alpha^2*v_vinit
  var v_Me0 = (1-v_usd)*v_Me+mu*v_usd
  var v_Me1 = v_Me+mu-v_usd-v_Me0
  constraint (v_Me0-mu)*(v_Me1-mu)=0
  ilookup[beta,mu] MeInit v_Me0
  lookup[beta,gamma] MeInit v_Me1
  var v_mpos=(v_usd-1)*(t-v_tlink)+v_usd*(t-v_tlink-1)
  word v_mpos
  var v_piL = v_addr + alpha*(v_tlink) + alpha^2*v_vlink
  var v_piR = v_addr + alpha*(t)       + alpha^2*v

macro sign v = {
  external v_sigma = highestbit(v)
  constraint v_sigma*(1-v_sigma)=0
  var v_signed=v-v_sigma*W
  var v_sgncheck=v_o+(1-2*v_sigma)*W4

Mem v1,i1+s1*v2,3*t        // v1: for writing, v2,v3: for reading
Mem v2,i2      ,3*t-2      // v4: advice variable
Mem v3,i3+s3*v2,3*t-1      // process v2, then v3, then write to v1
word v4                    // all 4 have to be words

var v2=v2_vinit+v2_usd*(v2_vlink-v2_vinit)      // memory load constraints
var v3mem=v3_vinit+v3_usd*(v3_vlink-v3_vinit)   // determine v2 and v3
var v3=v3mem*(1-eCONST)+eCONST*i4               // unless v3 is replaced by i4 (imm) by XXXi instructions

external v2_vinit, v2_usd: memload(3*t-2,~V‘M,v2_addr,   ~v2_vinit,~v2_vlink,~v2_tlink,~v2_usd)
external v3_vinit, v3_usd: memload(3*t-1,~V‘M,v3_addr,   ~v3_vinit,~v3_vlink,~v3_tlink,~v3_usd)
external v1_vinit, v1_usd: memsave(3*t  ,~V‘M,v1_addr,v1,~v1_vinit,~v1_vlink,~v1_tlink,~v1_usd)

global v1_vlink, v1_tlink, v2_vlink, v2_tlink, v3_vlink, v3_tlink:      \
  CompleteMemoryCycles(~V‘C,~V‘M,<3,[0,-2,-1]>,[v1_usd,v2_usd,v3_usd],  \

permutation[beta] v1_piL,v2_piL,v3_piL ~ v1_piR,v2_piR,v3_piR

control i1,i2,i3,i4            // immediates and their defaults; for i1,i2,i3 this
                               // refers to Mem[0]=Mem[N] - advice memory cell
control epsilon,delta,lambda   // reusable control columns for various instructions
control s1,s3                  // dynamic memory address control columns

sign v1     // signs (highest bits) of v1,v2,v3 for
sign v2     // SMULH, CMPG, CMPGE (signed operations)
sign v3

var r1 = (eAND+eOR+eXOR)*(v2_o+v3_o) + eMOD*MODrange + (eSSUM+eSPROD)*v1_sgncheck
var r2 = (eAND+eOR+eXOR)*(v2_e+v3_e)                 + (eSSUM+eSPROD)*v2_sgncheck
var r3 =                                               (eSSUM+eSPROD)*v3_sgncheck

word r1     // auxiliary variables for range checks, depending on the instruction
word r2
word r3

control  j1=0,j2=0           // controls for conditional and unconditional jumps
var      jsel = j1+j2*flag   // =0 (pc->pc+1), =1 (jmp), =flag(jmpc), =1-flag(jmpnc)
var      pc = (1-jsel)*(pc#+1)+jsel*v3 except t=0    // change of pc constraint
boundary pc(0)=1

control    cFLAG
constraint flag*(1-flag)=0                // flag = 0 or 1
constraint (flag-flag#)*(1-cFLAG) = 0     // only allowed to change if control cFLAG=1

var        zer = (eAND+eOR+eXOR+eSPROD+eSSUM)*v1 + ePROD*hi + eMOD*v3 \
               + (1-(eAND+eOR+eXOR+eSPROD+eSSUM)-ePROD-eMOD)*(1-flag)
constraint flag*zer=0                     // zer=0 <=> flag=1
external   Inv = invert(flag+zer)         // with zer set to v1 or hi when necessary
constraint (flag+zer)*Inv=1               // or 1-flag otherwise (no effect)

var        SUM = v1-(delta*v2+epsilon*v3+lambda*(i4-W*flag))
var        hi = epsilon*v4+(1-epsilon)*v1
var        lo = epsilon*v1+(1-epsilon)*v4     // hi=v1, lo=v4 for epsilon=0 (default)
var        PROD = v2*v3 - (lo+W*hi)
var        AND = v1-(2*r1_o+r2_o)
var        XOR = v1-(2*r1_e+r2_e)
var        OR  = v1-(2*r1_o+r2_o+2*r1_e+r2_e)
var        rem=hi   // re-used, aliases   // UMOD => v1=rem (output), v4=quo (advice)
var        quo=lo                         // UDIV => v1=quo (output), v4=rem (advice)
var        MOD = v2-(quo*v3+rem)                             // v2=quo*v3+rem
var        MODrange = (1-flag)*(v3-rem-1) + flag*(-quo)      // flag=0 => word v3-rem-1 <=> rem<v3,
var        SSUM  = v1 - (v2_signed+i4-v3_signed+W*flag)      // flag=1 => quo=0 => rem=0
var        SPROD = v2_signed*v3_signed - (v4+W*v1_signed)    // for cmpg,cmpge; for smulh
var        CMOV = v1-flag*v2-(1-flag)*v3
constraint SUM*eSUM=0, PROD*ePROD=0, eSSUM*SSUM=0, eSPROD*SPROD=0, eMOD*MOD=0
constraint eOR*OR=0, eAND*AND=0, eXOR*XOR=0, eCMOV*CMOV=0

Instruction set
Operands and control variables

instand i1 i2 i3 : eand=1, cflag=1
inst andi i1 i2 i4 : eand=1, cflag=1, econst=1
inst or i1 i2 i3 : eor=1, cflag=1
inst ori i1 i2 i4 : eor=1, cflag=1, econst=1
inst xor i1 i2 i3 : exor=1, cflag=1
inst xori i1 i2 i4 : exor=1, cflag=1, econst=1
inst add i1 i2 i3 i4 : esum=1, cflag=1, δ=1, λ=1, ϵ=1
inst addi i1 i2 i4 : esum=1, cflag=1, δ=1, λ=1
inst sub i1 i2 i3 i4 : esum=1, cflag=1, δ=1, λ=-1, ϵ=-1
inst subi i1 i2 i4 : esum=1, cflag=1, δ=1, λ=-1
inst mull i1 i2 i3 : eprod=1, ϵ=1, cflag=1
inst mulli i1 i2 i4 : eprod=1, ϵ=1, cflag=1, econst=1
inst umulh i1 i2 i3 : eprod=1, cflag=1
inst umulhi i1 i2 i4 : eprod=1, cflag=1, econst=1
inst smulh i1 i2 i3 : esprod=1, cflag=1
inst smulhi i1 i2 i4 : esprod=1, cflag=1, econst=1
inst udiv i1 i2 i3 : emod=1, ϵ=1, cflag=1
inst udivi i1 i2 i4 : emod=1, ϵ=1, cflag=1, econst=1
inst umod i1 i2 i3 : emod=1, cflag=1
inst umodi i1 i2 i4 : emod=1, cflag=1, econst=1
inst cmpe i2 i3 : exor=1, cflag=1
inst cmpei i2 i4 : exor=1, cflag=1, econst=1
inst cmpa i3 i2 i4 : esum=1, cflag=1, δ=1, ϵ=-1, λ=-1
inst cmpai i3 i4 : esum=1, cflag=1, ϵ=-1, λ=-1
inst cmpg i3 i2 i4 : essum=1, cflag=1
inst mov i1 i3 s1 s3 i2: esum=1, ϵ=1
inst movi i1 i4 s1 i2 : esum=1, ϵ=1, econst=1
inst cmov i1 i2 i3 : ecmov=1
inst cmovi i1 i2 i4 : ecmov=1, econst=1
inst jmp i3 s3 i2 : j1=1, j2=0
inst jmpi i4 : j1=1, j2=0, econst=1
inst jmpc i3 s3 i2 : j1=0, j2=1
inst jmpci i4 : j1=0, j2=1, econst=1
inst jmpnc i3 s3 i2 : j1=1, j2=-1
inst jmpnci i4 : j1=1, j2=-1, econst=1

Adapt Framework solutions