Zero Knowledge Virtual Machine step by step

ZERO KNOWLEDGE VIRTUAL MACHINE STEP BY STEP

TIM DOKCHITSER AND ALEXANDR BULKIN

Date: June 30, 2023.

Contents
1.  Introduction
   1.1.  Configurability
   1.2.  The primer
   1.3.  Applications of ZKPs
   1.4.  The structure of this paper
   1.5.  Further information
2.  Low degree proof example - Fibonacci Virtual Machine
   2.1.  Setup
   2.2.  Interpolating values
   2.3.  Constraints and relaxing them
   2.4.  Boundary conditions
   2.5.  Mixing
   2.6.  FRI Low Degree Testing
   2.7.  Deep variant
   2.8.  Fiat-Shamir heuristic
   2.9.  Summary
3.  General Compulation
   3.1.  Goal
   3.2.  Program termination
   3.3.  Public input
   3.4.  Output
   3.5.  Recursion
4.  Primitives
   4.1.  Polynomial constraints
   4.2.  Degree
   4.3.  Control, derived and external columns
   4.4.  Instructions
   4.5.  Permutation
   4.6.  Multiple permutation
   4.7.  Lookup
   4.8.  Injective lookup
   4.9.  Permutation and lookups with multiple columns
5.  Instruction set
   5.1.  Operands and flag
   5.2.  Unsigned multiplication
   5.3.  Setting flag for zero result
   5.4.  Unsigned arithmetic, comparisons and move
   5.5.  Logic
   5.6.  Word range
   5.7.  Signed arithmetic
   5.8.  Memory
   5.9.  Program lookup
6.  Prover/Verifier protocol
   6.1.  VM configuration
   6.2.  Prover code for configured VM
   6.3.  Verifier code for configured VM
References
Appendix A. VM Configuration

1. Introduction

A zero-knowledge proof is a cryptographic concept that allows one party (Prover), to demonstrate knowledge of a certain fact or statement to another party (Verifier), while keeping some sensitive information confidential. In practice, the goal of a zero-knowledge proof is to establish trust and confidence in the validity of a claim without divulging sensitive or confidential data.

In this paper, we describe how to implement of a Zero Knowledge Virtual Machine (ZKVM) for generic computation. The setting is that Prover runs a public program on private inputs and wants to convince Verifier that the program has executed correctly and produces an asserted output, without revealing anything about the computation’s input or intermediate state.

IPOPPnrurupotibugpvltruaiattcme

As an example, Prover may want to convince Verifier that they have a password that hashes to a given value, without revealing anything about the password:

PHHPPaaarusssibshhvlwaioftcruednction

Generally, Prover should be able to run some code that performs the hashing operation, and produce data called ‘the proof’, which would allow anyone to verify that the proof is correct, that is, that the Prover has, in fact, a valid pre-image of the given hash value.

ZK problems span a variety of use cases, from more specific to more general. The specific proof-of-hash-pre-image problem above has very practical applications in credential verification. Another narrow case for ZK-proofs are range-proofs used by the ZCash cryptocurrency network to prove availability of funds in a given account without revealing the exact amount of funds or the sequence of transactions that led to the current state.

If we generalize this further, we will discover a wide set of use cases for such primitives. For example, given a hash as a public key K_p = H(K_s), and its pre-image K_s as a private (secret) key, it is not hard to construct an asymmetrical signature scheme using zero-knowledge proofs: signature function S(m,K_s) on message m with key K_s is defined as a zero knowledge proof that H(K_s) = K_p and H(K_s + m) = X that reveals m and X, but not K_s.

So what if we could run any program, and prove, in zero knowledge, that it ran correctly and produced the output we claimed it had, without revealing any inputs? This is what we call zero-knowledge proofs for general computation. The first scheme that achieved this goal is due to Ben-Sasson at al. [BS+13]. It supports any program written in a specially-designed assembly language called TinyRAM that comes close to being viable for practical applications, perhaps with the exception of not supporting non-constant jumps and memory accesses. A precise description of how to implement this scheme is due to Bootle, Cerulli et al. [B+18, Ce19], and they also introduce a primitive called lookup, that is very useful in ZK context. Since that time, there have been quite a few advances in the area, and we feel it is useful to revisit a ZKVM implementation with new techniques.

This paper’s primary purpose is to provide a source of introductory information into building a ZK proof system for general computation. Anyone with practical knowledge of numerical programming, starting from no knowledge about how ZK proofs work, should be able to

understand the general approach to generalized computation ZK proofs
understand all the “ingredients” of a full-functioning general-purpose ZKVM
be able to implement their own ZKVM from scratch

While we aim to inform interested contributors just starting to learn about this technology, and to smooth the otherwise steep learning curve they face, we also make some contributions of our own. We, specifically, introduce:

A new ZK-friendly memory scheme (see �5.8)
A concept of a fully configurable ZKVM (a “ZKVM compiler”), introduced below and demonstrated further in �6.1

1.1. Configurability. In familiarizing ourselves with the current state of the art of general computation ZK-proof schemes, we have observed that, at the moment, the space of existing ZKVMs consists of two distinct categories. On one hand, we have TinyRAM already mentioned, and a few others, that use a “ZK-friendly” command set. On the other hand, there are implementations, such as the one provided by Risc0 and zkWASM, that implement ZK-proofs for the RiskV and WASM command sets respectively, both of which are standard LLVM-compatible targets. These two categories are different in terms of the tradeoffs they introduce. Some operations supported by modern CPUs (and WASM) introduce significant additional complexity into the Prover and, consequently, cause a significant degradation of performance. On the other hand, ZK-friendly ZKVMs typically lack the possibility of support by either LLVM or GCC toolchains, which means they can only be used for small custom-written software components. In order for a ZKVM to be both ZK-efficient and viable as an LLVM target, we need to find a viable balance between these tradeoffs.

This introduces the concept of VM configurability. Essentially, we aim to build LLVM for ZK, where a high-level description of a command set can be used to auto-generate both the Prover/Verifier components and an LLVM backend targeting the command set. Doing this would enable contributors to create new ZKVM command sets with minimal effort, and so run multiple experiments to determine this optimal balance. In practice, such “ZKVM compiler” is more than feasible, as we will see below, due to the modular nature of the constraints associated with proving micro-operations that comprise individual instructions.

But there is an even more important reason to build such ZKVM compiler. We are referring to the issue of fragility that challenges the ongoing security of ZK schemes. The issue can be expressed briefly as follows: there must be an efficient process to update Prover/Verifier pairs in response to security incidents and the fast pace of innovation in ZK. Consider the fact that a missing constraint (for example, to prove that the value of an overflow flag can not be anything other than zero or one) can lead to the possibility of constructing invalid proofs. It is not realistic to expect that a set of constraints used to describe software is complete from the get go. Consequently, we must build our ZK frameworks expecting to have to promptly respond to discoveries of such omissions and any other type of exploit.

Furthermore, the fast pace of research in ZK means that the foundational components of any ZK scheme may fast become out of date, and, what is even more concerning, may be discovered to be exploitable. All of this points towards the need to significantly streamline the reconfigure/recompile/redeploy process in any network using ZKPs. The ZKVM compiler being discussed aims to facilitate this.

1.2. The primer. For the Prover/Verifier scheme there are various choices, based on polynomial commitments, notably Bulletproofs [B+23], KZG [KZG] and FRI [BS+18]. For our primer, we chose FRI (with a DEEP modification [BS+19]) as it needs no trusted setup, it is good for constructing recursive proofs, and it is post-quantum in that its security only relies on a (flexible choice of a) collision-resistant hash function. There are also choices of a prime field for it, such as p = 2⁶⁴-2³²+1 that make for very fast arithmetic and hence efficient Prover/Verifier code. For lookups, we use our scheme [BD23] based on the extensions Plookup [GW20] and Plonkup [P+22] of the original lookup protocol in [B+18].

We note, however, that these choices are mostly for clarity of demonstration and are not consequential in the context of a configurable ZK-compiler. After all, we should be able to easily regenerate our Prover/Verifier with different schemes, as needed.

1.3. Applications of ZKPs. We are most interested in applications of ZKPs to use cases in the financial and legal industries. In both of these contexts, the need for reliable verification of accuracy of facts pertaining to information is critical. Yet, it stands in stark opposition to the need for privacy, which is no less critical. To demonstrate this opposition on a specific example, take two imaginary financiers, Rocky and Morgan. Rocky has invented a cutting edge financial strategy which brings him annual returns of 25%, and has invested all his personal funds into it. He now needs to borrow cash to finance a further increase in trading volume. He believes he can borrow at about 7%, and make a whopping 17% margin. There’s only one problem. Morgan, whose business it is to lend capital, requires Rocky to put up collateral to guarantee the loan. But Rocky doesn’t want to disclose to Morgan the contents of his portfolio, as that would leak information about his wonderful strategy.

After some negotiation it becomes apparent that Rocky would either need to give up his dreams of getting a loan, or would have to disclose his portfolio to Morgan and take the risk of the latter replicating his strategy and eroding his edge. Rocky comes back to his office in a gloomy mood, downs a glass of Scotch, and spills the whole story to his secretary, who then goes ”Wait!”, and tells him that her husband is working on this new technology called zero-knowledge proofs that could potentially help. Rocky agrees to have a meeting with the husband, at which he learns that there is a way to prove the worth of his portfolio to Morgan without disclosing it.

Here’s how it works. Morgan would provide to Rocky some computer code that can determine the value of Rocky’s portfolio. Rocky would run the code and provide its output to Morgan along with the zero-knowledge proof that the code has been run correctly. This output would go to Morgan daily, and if the value is ever too low, Rocky would face a margin call that may require him to sell his entire portfolio in order to repay Morgan.1

This example makes clear the following general position: zero-knowledge proof technology is an essential tool that enables safe and meaningful balance between privacy and verification in the financial and legal context. Here are some more examples that illustrate this position:

Revenue-based lending: verify an organization’s revenue stream, without knowing the specific transaction details or channels.
Regulatory surveillance: verify that an organization is adhering to regulation without requiring access to the details of its activities.
KYC/AML verification without privacy violation: verify age and citizenship of individuals without requiring them to supply their name or any other personally identifiable information.
Contract verification: verify that an organization has entered into a valuable contract with a counterparty without disclosing who the counterparty is.
Vice clause verification: verify that an organization has not been trading with unethical organizations, without having access to the list of organization’s trading counterparties.

1.4. The structure of this paper.

In Section 2 we use a simple example of a Fibonacci sequence to demonstrate the fundamental structure and functioning of ZK proofs
In Section 3 we move from the simple example to introducing the way general computation proofs are achieved
In Section 4 we discuss how to construct the constraints (“primitives”) that can be used as “ingredients” in the proof scheme for various operations within the command set
In Section 5 we discuss specific operations in a fully-functional command set and describe how the primitives from Section 4 are used to generate appropriate proofs
In Section 6 we tie everything together to describe the complete ZK-proof protocol
In the Appendix we provide a glimpse into the configuration language for the ZK-compiler

1.5. Further information. This paper was written as a contribution to the global ZK community by the research team at ADAPT Framework Solutions. To learn more about ADAPT, a high-level developer framework for building decentralized systems, please see the ADAPT whitepaper.

Acknowledgements. We would like to thank Delendum Ventures and especially Daniel Lubarov for an opportunity for us to use a Delendum Fellowship and fruitful discussions.

2. Low degree proof example - Fibonacci Virtual Machine

2.1. Setup. Before going into the technical details of the actual protocol of verifying general program execution, we illustrate the construction with a simple one-column one-constraint ‘Fibonacci Virtual Machine’. Its ‘execution’ is a sequence of integers

a ,a , a ,a ,a ,a ,a , a 1 2 3 4 5 6 7 8

that obey the Fibonacci rule a_n = a_n-1 + a_n-2 for n = 3,..., 8. The input is a₁,a₂ and a₈ is the output. Suppose Prover wants to convince Verifier that they know such a sequence with output a₈ = 47. They have in mind

(1,3, 4,7,11,18, 29,) 47

but do not want to reveal the input a₁ = 1,a₂ = 3 or any intermediate values a₃,...,a₇.

2.2. Interpolating values. We represent the a_i as integers modulo a prime p, that is as elements of the finite field F = ℤ∕pℤ = F_p. In this example, we take p = 97. Additively,

F = ℤ∕97ℤ = {0,1,2,3...,96 }.

Multiplicatively, s = 5 is a primitive root in F. In other words,

× 2 3 96 F \ {0} = (ℤ ∕97ℤ) = {5,5 ,5 ,...,5 }.

The element ζ = 5³ is then of order 32, so it is a 2-power root of unity, and we let

x = (ζ4,ζ8,ζ12,...,ζ32), y = (sζ,sζ2,sζ3, ...,sζ32).

The x_i (i = 1,..., 8) are the 8th roots of unity in F, and the y_i (i = 1,..., 32) are the 32nd roots of unity shifted by s.

Let a = (1, 3, 4, 7, 11, 18, 29, 47) be the column of values that Prover wishes to encode. With Langrange interpolation, we can find the unique polynomial f(x) ∈ F[x] of degree < 8 with f(x_i) = a_i. It turns out to be

f(x) = 67x7 + 7x6 + 2x5 + 28x4 + 6x3 + 74x2 + 42x + 15.

There is a very efficient way to do this, ‘the number-theoretic transform NTT’ (essentially a version of the finite Fourier transform), that is specifically tailored to domains consisting of 2-power roots of unity in a field. Knowing f is equivalent to knowing the a_i. Prover evaluates f on the 32 points y₁,...,y₃₂ using NTT again, and these values are plotted in Figure 1. On the horizontal axis we plot F \{0} multiplicatively (to make the x_i and y_i equally spaced), and on the vertical axis we plot F additively.

∙x∙x∙x∙x∙x∙x∙x∙x∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙y∙y∙y∙y∙y∙y∙y∙y∙y∙12345678123456789

Figure 1: Values of f(x) = 67x⁷ + 7x⁶ + 2x⁵ + 28x⁴ + 6x³ + 74x² + 42x + 15 on x_i = ζ⁴ⁱ and y_i = sζⁱ in F₉₇.

As one (and only) black box, we will use the existence of the following (cf. also �2.6, �2.8)

FRI low degree testing algorithm. It is possible for a Verifier efficiently, and in zero knowledge, to check that a given list of 32 values lie close to the values of some polynomial of degree < 8 evaluated at the y_i.

In practice, 8 and 32 will be, say, 2²² and 2²⁶. Essentially, Prover commits all the values of the disguised trace in a Merkle tree, sends its root to Verifier, and Verifier, by quering very few of the values (~log of their number) is able to convince themselves that they lie close to some polynomial. This number of queries is so small that it is does not reveal enough information about the polynomial to be able to reconstruct its values at the x_i (the original trace), which is where ‘zero knowledge’ comes from.

2.3. Constraints and relaxing them. The Fibonacci rule

an = an- 1 + an- 2 for n ⁄= 1,2

is equivalent to the ‘constraint polynomial’

C(x) = (x - x1)(x - x2)(f (x ) - f (x∕ζ4) - f(x∕ζ8))

to be 0 at x = x₁,...,x₈. The condition f(x) - f(x∕ζ⁴) - f(x∕ζ⁸) = 0 guarantees that a_n - a_n-1 - a_n-2 = 0, and the ‘relaxing’ terms x - x₁ and x - x₂ allow this to be violated for n = 1, 2.

Equivalently, C(x) is divisible by the ‘zeroes polynomial’

∏8 Z(x ) = (x - xi) = x8 - 1. i=1

As2 deg C(x) = 2 + 2 deg f ≤ 16, this is equivalent to the ‘verification polynomial’

V(x ) = C (x)∕Z (x)

to be inded a polynomial of degree ≤ 8 in x. To summarise, the conditions

{f(y_i)}_i=1³² are values of a polynomial of degree ≤ 7
{V (y_i)}_i=1³² are values of a polynomial of degree ≤ 8
V (y_i) = f(y_i) - f(y_i-4) - f(y_i-8) for i = 1,..., 32

are what Verifier needs to know that the execution trace is valid.

2.4. Boundary conditions. We used multipliers x - x₁, x - x₂ to relax the constraint at n = 1, 2. We can also do the opposite, and divide by a linear factor to enforce a value for n = 8. For any function F(x),

F(x) is a polynomial and F(c) = b ⇔ F-(x)---b
x - c is a polynomial.

In our example, the output value a₈ = 47 guarantees that f(x)--47
x-x8 is still a polynomial, so we replace the condition (1) by

{}_i=1³² are values of a polynomial of degree ≤ 6.

2.5. Mixing. There is a simple observation (‘mixing’) that

F(x),G(x) are polynomials	⇔	F(x) + δG(x) is a polynomial
of degree ≤ d		of degree ≤ d for any δ ∈ F.

Moreover, when F is large, one random choice of δ ∈ F is enough, as long as (a) δ comes from Verifier rather than Prover, and (b) it is given to Prover after F and G are committed.

As the low degree testing is quite expensive, we use mixing to get (1^′) and (2) with one test, replacing them by

{y_i² + δV (y_i)}_i=1³² are values of a polynomial of degree ≤ 8.

Same idea can also be used to deal with multiple constraints, mixing them into one in the same way, say, with successive powers of δ.

2.6. FRI Low Degree Testing. This is also basically how FRI low degree testing works. Take a polynomial F(x) = ∑ c_ixⁱ, write every term xⁱ as x^jx^4k with j ∈{0, 1, 2, 3} and group those with the same j together. We get

4 4 2 4 3 4 F (x ) = f0(x ) + xf1 (x ) + x f2(x ) + x f3(x ),

(2.1)

and F is a polynomial if only if all the f_j are. Heuristically, if δ ∈ F is random, it is enough to check (2.1) for a number of x-coordinates, and that

f (x) + δf (x) + δ2f (x) + δ3f (x ) 0 1 2 3

is a polynomial for a random δ ∈ F. This reduces verifying that a function is a polynomial of degree ≤ d to verifying that some other function is a polynomial of degree ≈ d∕4, and FRI is a recursive algorithm based on this. We refer the reader to [BS+18] for details and security analysis.

2.7. Deep variant. We are now down to two verifications:

{y_i² + δV (y_i)}_i=1³² are values of a polynomial of degree ≤ 8,
V (y_i) = f(y_i) - f(y_i-4) - f(y_i-8) for i = 1,..., 32.

The original FRI scheme would check (3) for few indices i, chosen at random by Verifier. The DEEP modification adds extra security in verifying this relation so that just one query suffices. We test the equation at a random point outside the range of committed values y_i. In our example, let

2 j ξ = s ζ (deep x-coordinate)

for a random j ∈{1,..., 32} provided by Verifier. This is shifted away from the x_i and the y_i, so it is a different point in our base field. (In fact, we could even go a bit further, and take it from an extension F_p^d over F_p for some small d, improving security even further though at a cost of Verifier running time.) Because (3) comes from a relation between polynomials, it holds with y_i replaced by ξ as well. In other words,

v₁ = (v₂ - v₃ - v₄), (v₁,...,v₄) = (V (ξ),f(ξ),f(ξ∕ζ),f(ξ∕ζ²)).

The relation on the left is the one we test. We have to ensure that the values on the right are correct, and this is again of the type ‘prove that a polynomial passes through a given point’, so it can be incorporated into (2^′). Using Lagrange interpolation, construct the unique polynomial q of degree ≤ 3 so that

2 q(x8) = 48, q(ξ) = v2, q(ξ∕ ζ) = v3, q(ξ∕ζ ) = v4

and replace (2^′) with

{y_i⁴ + δ}_i=1³² are values of a polynomial of degree ≤ 7.

Thus, we reduced everything to one low degree test (2^′′) and one additional check (3^′).

2.8. Fiat-Shamir heuristic. The introduction of the mixing constant δ and the DEEP parameter ξ that come from Verifier has an undesired effect to making the protocol interactive. We would like a non-interactive protocol in which Prover produces a proof, and Verifier (or anybody else) could verify its correctness at any time afterwards, with no interaction. There is a standard trick how to achieve this, the Fiat-Shamir heuristic. It asserts that random input from an honest Verifier could be replaced with a hash coming from the preceding part of the protocol. In our case, Prover commits {f(y_i)}_y=1³² to a Merkle tree, and its Merkle root could be used as a source of the constants ξ and δ used in the rest of the protocol.

2.9. Summary. Let us collect all the steps for the validity of the trace for the Fibonacci Virtual Machine. Set F = ℤ∕97ℤ, s = 5 ∈ F, ζ = 5³ ∈ F (of order 32), x_i = ζ⁴ⁱ, y_i = sζⁱ.

Prover.

‘Execute the code’ by applying the Fibonacci rule starting from a₁ = 1, a₂ = 3 and up to a₈ = 47. This is the ‘original execution trace’
$a = (1, 3,4,7,11,18,29, 47).$
Interpolate the trace with a polynomial f, so that f(x_i) = a_i for i = 1,..., 8:
$f(x) = 67x7 + 7x6 + 2x5 + 28x4 + 6x3 + 74x2 + 42x + 15.$
Compute f(y_i) for i = 1,..., 32, the ‘disguised execution trace’.
Compute V (y_i) = f(y_i) - f(y_i-4) - f(y_i-8) for i = 1,..., 32.
Store triples (i,f(y_i),V (y_i)) as leaves of a Merkle tree and compute Merkle root δ. This δ will be used as a mixing parameter.
Let ξ = s²ζ^δ, or some other pre-arranged function of δ (‘deep x-coordinate’).
Set (v₁,...,v₄) = (V (ξ),f(ξ),f(ξ∕ζ),f(ξ∕ζ²)) (‘deep taps’).
Using Lagrange interpolation construct the unique polynomial q of degree 3 so that q(x₈) = 48,q(ξ) = v₂,q(ξ∕ζ) = v₃,q(ξ∕ζ²) = v₄.
Generate a FRI low degree proof P (=some branches of the Merkle tree) that
$y ↦→ y4------------f-(yi)---q(yi)----------- + δ V(yi) --v1- i i(yi - x8)(yi - ξ)(yi - ξ∕ζ)(yi - ξ∕ζ2) yi - ξ$
is a polynomial of degree ≤ 7.
The proof of validity of the execution trace is P plus the deep taps v₁,...,v₄.

Verifier. Given P and v₁,...,v₄,

Compute q(x) as in (8) from v₂,v₃,v₄.
Check that all branches in P lead to the same Merkle root.
Let δ be the Merkle root, and compute ξ = s²ζ^δ (or some pre-defined function of δ).
Check that v₁ = (v₂ - v₃ - v₄).
Check that P is a valid FRI proof for the function in (9).

3. General Compulation

Now we turn to the general setup. We work with a virtual machine running some assembly language. Prover executes a program, and generates an execution trace. For every time t = 1, 2,...,N, it stores the values of the program line counter pc, flag, registers (if any) and various auxiliary variables after the command at time t has been executed. All values lie in a finite field F = F_p with p elements, and are stored in columns of size N, say C₁,C₂,….

Program

pc	instruction	imm

1	ax:=ax+17	17
2	jmpnc 1	1

Execution trace



C₁	C₂	C₃	C₄	C₅	C₆
t	pc _t	flag _t	ax _t	bx _t	imm _t

1	2	0	17	0	17
2	1	0	17	0	1
3	2	0	34	0	17

Table 1: Source program and an execution trace

We will fix the following primary constants for the virtual machine:


const	description	example

W	VM word size, even power of 2	2³²
N	number of memory words & bound on the number of time steps	2²⁶
p	prime modulo which execution trace entries are computed	2⁶⁴-2³²+1

3.1. Goal. Our goal is to describe a protocol for Prover to convince Verifier that

Prover possesses a valid execution trace from a program execution.
The program started with an asserted public input (plus, generally, some private input, unknown to the Verifier).
The program terminated at an asserted place, with an asserted output.

Validity of the execution trace will be ensured with polynomial constraints (see �4), which are polynomial relations between columns, and boundary conditions, which are statements of the form ‘column C at time t has value a’. Let us first deal with (b) and (c) — program termination, input and output.

3.2. Program termination. To verify where the program has terminated, and that it has terminated correctly, we can have a special instruction, say halt (or answer), for a correct termination, that simply works as a jump back to the same instruction. There could be several of those in the program:


pc	instruction


121	halt

394	halt

If, say, the first one is reached, then the program enters an infinite loop until the time t = N. So we just need to check that the value of pc at time N is 121, say. This is a boundary constraint on the pc column.

3.3. Public input. We precede the program with a code that hashes the public part of the memory with some ZK-friendly hash (say Poseidon) and asserts that the hash agrees with a given constant h. In other words, the program cannot continue (with a valid execution trace) unless the value is h. Verifier checks that the expected public input hashes to that value.

3.4. Output. Similarly, at the end of the program, right before the termination block, we introduce a code that hashes the part of the memory responsible for the program output. Again, the code checks that the hash agrees with some value that Verifier possesses.

3.5. Recursion. These steps also allow to break the trace into ‘pieces’, which is necessary if the program takes more than N steps to execute. Each piece starts with a code that verifies that the previous one has executed correctly (which we will get to), and that its output hashes to the same value as the input of the current one.

4. Primitives

We are left to deal with (a) of �3.1 — how Prover can show the trace satisfies set of constraints with boundary conditions (�6), and how to formulate those constraints that ensure validity of the program execution for our chosen VM architecture (�5). A collection of columns, constraints and instruction definitions for a given VM forms a ‘configuration script’, and we begin by describing the primitives that make such a script.

4.1. Polynomial constraints. Validity of the trace will be mostly ensured with polynomial constraints. These are polynomial relations between the values of several columns at time t and t - 1. We use # to indicate ‘t - 1’, so that, for example,

constraint C₁C₂ + C₃^#C₄^# = 0

\text{[math]}

stands for

C1 (t)C2 (t) + C3 (t - 1)C4 (t - 1) for all t = 1,...,N.

We use the convention that the time ‘loops’, so C_i(0) := C_i(N). We also allow exceptions at some specified times t:

Example 4.1 (Time column). The following constraints with a boundary condition at t = 1 force C to be the ‘time’ column C = (1, 2, 3,...,N).

constraint	C = C^# + 1	for t≠1
boundary	C(1) = 1

4.2. Degree. The degree of a constraint is its degree as a polynomial in the variables C₁,C₁^#,C₂,.... For example,

degree(C = C^# + 1) = 1, degree(C₁² = C₂C₃) = 2, degree(C₁C₂C₃ = 0) = 3.

Higher degree constraints are normally expensive, and in our sample VM based on FRI we have all constraints of degree 2 (quadratic). Constraints of degree 1, or, more generally, variables that occur only once and linearly in some constraint are not particularly useful, as those variables can be eliminated. Higher degree constraints, when we need them, are broken into quadratic ones, at the cost of introducing new columns. For example, for

C1C2C3 = 0

we can introduce a new column Q = C₁C₂, and replace the constraint by an equivalent quadratic pair

Q = C1C2, C3Q = 0.

If we opt for a higher degree limit, say 3 or 4 instead of 2, we can reverse this and remove some columns. The optimal choice depends on how the running time grows with the number of columns and the degree bound.

4.3. Control, derived and external columns. To formulate a set of constraints for the execution trace, it will be useful to distinguish between 3 types of columns:


Column type	description	Table 1
		example

control	Columns shared with the program (except pc)	imm
	that depend on the instruction and its operands
derived	Columns C that are polynomial expressions in	t, pc
(or variable)	other columns, and possibly C^#
external	Columns whose values are computed ‘externally’,	flag
	outside the configuration script

Control columns come from the program (e.g. all immediate values of the instructions), derived ones are polynomials in other columns, and the rest are external. Write

control	C₁,C₂,…
var	C₅ = C₁C₂ + C₃C₄,…

to specify control and derived columns. Every var creates a column and also generates the corresponding constraint. If we wish to store, say, the highest bit of the values of one column C in another column H, then H is not a polynomial expression in C, so H is external. Prover can have a function highestbit to compute the highest bit of a machine word w ∈ [0,...,W-1], and the configuration command

external H=highestbit(C)

\text{[math]}

states that H is computed externally, and is a function of C (needed to order the computation of the derived columns and such external calls). This generates no constraints, so they have to be specified manually (say H(H-1) = 0 etc., in our example).

Example 4.2 (Jumps). This is how conditional (on flag) and unconditional jumps can be implemented; cf jmp, cjmp, cnjmp in Table 2:

external	flag
control	j1, j2, addr
var	j = j1+j2*flag
var	pc = (1-j)(pc^#+1)+jaddr	for t≠0
boundary	pc(0)=1

Here pc is a derived column, controlled by the values of j1, j2 and addr. When j1=j2=0, this forces pc_t=pc_t-1+1, so the program counter is incremented by one (default behaviour). If j1=1, j2=0, it becomes an unconditional jump to addr. For j1=0, j2=1, this is cjmp, and for j1=1, j2=-1, this is cnjmp.

4.4. Instructions. Generally, everything about an instruction is encoded in the control variables, either via immediates (e.g. addr in 4.2) or control flags (e.g. j1, j2 in 4.2). We will write instructions as follows:

inst ⟨instructionname⟩⟨operand1⟩⟨operand2⟩... : ⟨flag1⟩=value, ⟨flag2⟩=value, ...

\text{[math]}

with operands and flags from control columns; for example (see 4.2 again),

inst	jmp addr	: j1=1, j2=0
inst	jmpc addr	: j1=0, j2=1
inst	jmpnc addr	: j1=1, j2=-1

4.5. Permutation. The last primitives that we need are permutation and lookup, two very useful tools derived from polynomial constraints with boundary.

Let us start with permutation. Suppose we have two columns C₁, C₂ and we want to show that they are permutations of one another. That is, they have the same N elements,

{C1,1,...,C1,N} = {C2,1,...,C2,N }

as unordered multisets, but possibly in different order. Let us denote this as

C1 ~ C2.

Note that we do not want to reveal any of the elements or anything about the permutation. The trick is to note that having such a permutation is equivalent to the equality of polynomials ∏ _t=1^N(x-C_1,i) = ∏ _t=1^N(x-C_2,i), which can in turn be checked by substituting a random value of β for x. Thus,

N∏ β - C1,i C1 ~ C2 ⇐ ⇒ -------- = 1. t=1 β - C2,i

Like in the Fibonacci example, we get β through the Fiat-Shamir heuristic, as a Merkle root of previously committed columns. We view the probability that β happens to equal to one of the C_2,i (or C_1,i) as negligible.

Build an auxiliary column H whose values accummulate the partial products H_t = ∏ _i=1^t β--C1,i
β- C2,i . Then H satisfies a recursion (which is how we compute it) H_t = H_t-1 β-C
β-C12,t,t- , so we have a quadratic constraint with boundary

H# (β - C1) = H (β - C2 ), H (0) = H (N ) = 1.

If C₁,C₂ and H satisfy this constraint, then C₁ and C₂ are permutations of one another. To summarise, we write

permutation[β] C₁ ~ C₂

\text{[math]}

for

set	H = (∏ _i=1^t)_t=1^N
boundary	H(0) = H(N) = 1
constraint	H^#(β - C₁) = H(β - C₂)

4.6. Multiple permutation. As a slight extension, say we have two lists of m columns C₁⁽¹⁾,...,C₁^(m) and C₂⁽¹⁾,...,C₂^(m). We can show that the unions ⋃ _j=1^mC₁^(j) and ⋃_j=1^mC₂^(j) have the same mN values as multisets, with an obvious generalisation

set	H = (∏ _i=1^t ∏_j=1^m)_t=1^N
boundary	H(0) = H(N) = 1
constraint	H^# ∏_j=1^m(β - C₁^(j)) = H ∏_j=1^m(β - C₂^(j))

except that the resulting constraint is of degree m + 1 and has to be split into m quadratics. We write

permutation[β] C₁⁽¹⁾,...,C₁^(m) ~ C₂⁽¹⁾,...,C₂^(m)

\text{[math]}

for the resulting scheme.

4.7. Lookup. Say we have two columns F = (F_i)_i=1ⁿ (‘query’) and T = (T_i)_i=1^d (‘table’). We want to prove that F ⊂ T as sets, ignoring repetitions. Let S be a vector of length n + d. Gabizon et al. [GW20] prove that the three conditions

F ⊂ T as a set (unordered, ignoring repetitions)
S = F ∪ T as a multiset (unordered, with correct multiplicities)
S is sorted by T (if S_i = T_i′,S_j = T_j′, then i<j ⇔ i′<j′)

are satisfied if and only if the following polynomials in F[β,γ] are equal:

∏n d∏-1 n+∏d-1 (1 + β)n (γ + Fi ) (γ(1 + β) + Ti + βTi+1 ) = (γ(1 + β) + Si + βSi+1). i=1 i=1 i=1

We will take d = n + 1 = N, the only case we need. Then this takes the form

n n 2n ∏ ∏ ∏ ⋅⋅⋅ ⋅⋅⋅∕ ⋅⋅⋅ = 1. i=1 i=1 i=1

Splitting the last product into pairs, we get a condition of the form ∏ _i=1ⁿQ_i = 1, which can be verified by building a column with partial products Z_i = ∏ _j=1^i-1Q_j, and imposing the conditions that Z₁ = Z_n+1 = 1 and a recursive formula Z_i+1 = Z_iQ_i; this is very similar to the permutation scheme in �4.5-4.6.

In ∏ _i=1²ⁿ ⋅⋅⋅ , Bootle et al. group the ith and (i + n)th term together, and Pearson et al. [GW20] observe that it is more efficient to group ith and (i + 1)st, saving one boundary condition. The ‘cost’ of the lookup in terms of quadratic constraints is

4 additional columns: Z, Q, and two halves H₁, H₂ of S,
2 quadratic constraints Z_i+1 = Z_iQ_i and the expression for Q_i,
boundary conditions Z(1) = Z(n + 1) = 1.

Similarly to �4.6, we can generalise this scheme to multiple queries with the same table:

Algorithm 4.3. Let (F_i⁽¹⁾)_i=1ⁿ,...,(F_i^(k))_i=1ⁿ be k queries for the same table T = (T_i)_i=1ⁿ⁺¹.

Prover:

Set S = T ∪ F⁽¹⁾ ∪… ∪ F^(k) as a multiset, sorted by T.
Set H^(j) = (S_j,S_j+k+1,S_j+2(k+1),..., ) for j = 1,...,k + 1.
Thus, |S| = kn + n + 1, |H⁽¹⁾| = n + 1 and |H⁽²⁾| = ... = |H^(k+1)| = n.
Commit H⁽¹⁾,...,H^(k+1).
Get random constants β,γ ∈ k from Verifier3 .
Compute columns Z, Q⁽¹⁾,...,Q^(k) recursively, as follows.
10 111:   Z₁ ← 1
112:   for i ← 1 to n do
113:    v ← γ(1+β)+H_i^(k+1)+βH_i+1⁽¹⁾
114:    for j ← k to 1 by - 1 do
115:    v *:= (γ(1 + β)+H_i^(j)+βH_i^(j+1))∕(γ+F_i^(j))
116:    Q_i^(j) := v
117:    end for
118:    Z_i+1 := Z_i(γ(1+β)+T_i+βT_i+1)(1+β)^k∕v
119:   end for
Commit Z, Q⁽¹⁾,...,Q^(k).

Verifier: Check that Z₁ = Z_n+1 = 1, and the following k + 1 quadratic constraints. For all i = 1,...,n,

Zi+1Q (1i) = Zi(1 + β)k(γ(1+ β )+Ti +βTi+1 ), (j) (j+1) (j+1) (j) (j) (γ(1+ β)+H i(k+) βH i(k+1))Qi = (γ+F(ik+1))Qi (fo1)r j = 1..(.kk) -(1k,) (γ(1+ β)+H i + βH i )(γ(1 + β)+H i +βH i+1) = (γ +F i )Q i .

We write

lookup[β,γ] T F⁽¹⁾,...,F^(k)

\text{[math]}

for the resulting scheme, and group lookups with the same table. We will use this primitive

to check that numbers are in the word range in �5.6, and
to check that we are executing the correct program in �5.9.

4.8. Injective lookup. There is a version of this, called ‘bounded lookup’ [BD23], that puts a cap m on how many times an entry in T can occur in the F^(j) in total. In practice, the F^(j) and T can have different sizes (≤ n), and we pad them with a ‘dummy’ value μ ∈ T, which will be yet another Fiat-Shamir constant. We only need the m = 1 case, ‘injective lookup’. This is the case used for memory accesses in [B+18, Ce19], and we will use it in a very similar way. The algorithm then shows that every value T_i≠μ occurs at most once in ⋃ _jF^(j). This case is simpler, and has a cost of 2k auxiliary columns.

View F^(j) as having length n + 1, padding them with μ if necessary. What we want to show is that there are pairwise disjoint sets of indices X^(j) ⊂{1,...,n + 1} such that

F^(j) = permutation of T_i if i ∈ X^(j) and μ if iX^(j)_i=1ⁿ⁺¹.

We already described how permutation can be implemented, and we get:

Algorithm 4.4 (Multiple Queries Injective Lookup). Let (F_i⁽¹⁾)_i=1ⁿ⁺¹,...,(F_i^(k))_i=1ⁿ⁺¹ and T = (T_i)_i=1ⁿ⁺¹ be vectors in a field F, and μ ∈ T. The algorithm provides a proof that F_i^(j) ∈ T, and every T_i≠μ occurs at most once as an entry.

Prover:

Let X^(j)⊂{1,...,n + 1} be pairwise disjoint sets of indices for j =1,...,k so that

F^(j) = permutation of T^(j), T^(j) = {T_i if i ∈ X^(j) and μ if iX^(j)}_i=1ⁿ⁺¹.
Commit F^(j), T and T^(j) for j = 1,…,k.
Get random constant β ∈ F from Verifier.
Define vectors z^(j) = ∏_l=1ⁱ_i=1ⁿ⁺¹ for j =1,...,k, and commit them.

Verifier: Verify that

z_i^(j)(β-T_i^(j)) = z_i-1^(j)(β-F_i^(j)) for i = 1,..,n+1 and z_n+1^(j) = 1, for j = 1,...,k.
(T_i^(j) - T_i)(T_i^(j) - μ) = 0 for i = 1,..,n + 1, and j = 1,...,k.
T_i + (k - 1)μ -∑ _j=1^kT_i^(j)kμ -∑_j=1^kT_i^(j) = 0 for i = 1,..,n + 1.

The first condition shows that F^(j) and T^(j) are permutations of one another. The second shows that T^(j) either agrees with T or equals μ at every place. The last one guarantees that at every place T^(j) can agree with T for at most one j, which is equivalent to the X^(j) being disjoint.

We refer the reader to [A+22] for the discussion of lookup in the context when the F^(j) and T have very different size, and optimizations for Prover in that case.

We write

ilookup[β,μ] T F⁽¹⁾,...,F^(k)

\text{[math]}

for the resulting injective lookup scheme, and, again, group ilookups with the same table. This will be used in memory access checks in �5.8.

4.9. Permutation and lookups with multiple columns. Both permutation and lookup schemes can be extended to cover multiple columns. For example, if we we want to show that the sets of rows in a collection of columns C⁽¹⁾,...,C^(k) is the same as a the set of rows in a different collection D⁽¹⁾,...,D^(k), we can mix the columns together with a Fiat-Shamir constant α and check for permutation:

var	L = C⁽¹⁾ + αC⁽²⁾ + α²C⁽³⁾ + … + α^k-1C^(k)
var	R = D⁽¹⁾ + αD⁽²⁾ + α²D⁽³⁾ + … + α^k-1D^(k)
permutation[β]	L ~ R

The same idea works for lookups and injective lookups as well.

5. Instruction set

We now describe how to implement a full instruction set in terms of the above primitives (apart from jumps that have already been covered in Example 4.2 and �4.4). See Table 2 for the descriptions of what they do, and a summary of control column settings. We follow [B+18] closely here, and the instruction set is very close that of TinyRam [BS+13]. The only substantial change is that we propose a memory model that is quite different: we have no registers, as every memory cell can work as a register. As memory is quite ‘cheap’ in terms of ZK costs and every register adds a few additional columns, we found this scheme more efficient; see �5.8.


Instruction	Operands	Effect	flag set if
	o1 o2 o3

and	o1 o2 o3	o1:=o2 BitwiseAND o3	result=0
or	o1 o2 o3	o1:=o2 BitwiseOR o3	result=0
xor	o1 o2 o3	o1:=o2 BitwiseXOR o3	result=0
not	o1 o3	o1:=BitwiseNOT o2	result=0

add	o1 o2 o3	o1:=o2+o3 mod 2^W	overflow: o2+o3≥ 2^W
sub	o1 o2 o3	o1:=o2-o3 mod 2^W	underflow: o2-o3< 0
mull	o1 o2 o3	o1:=o2×o3 mod 2^W	�underflow: o2×o3< 2^W
umulh	o1 o2 o3	o1:=o2×o3 div 2^W	�underflow: o2×o3< 2^W
smulh	o1 o2 o3	o1:=o2×_signedo3 div 2^W	�underflow: result=0
udiv	o1 o2 o3	o1:=o2 div o3	division by 0: o3=0
umod	o1 o2 o3	o1:=o2 mod o3	division by 0: o3=0

cmpe	o1 o2	compare if =	o1=o2
cmpa	o1 o2	compare if >	o1>o2
cmpae	o1 o2	compare if ≥	o1≥o2
cmpg	o1 o2	signed compare if >	o1> _signedo2
cmpge	o1 o2	signed compare if ≥	o1≥_signedo2

mov	o1 o2	o1:=o2	unchanged
cmov	o1 o2	o1:=o2 if flag=1	unchanged

jmp	o1	pc:=o1	unchanged
cjmp	o1	pc:=o1 if flag=1	unchanged
cnjmp	o1	pc:=o1 if flag=0	unchanged

answer	o1	halt with exitcode o1	unchanged


Inst	Operands	cFLAG	δ	ϵ	j₁	j₂	λ	eAND	eMOD	eOR	ePROD	eSPROD	eSSUM	eSUM	eXOR

and	[i₁] [i₂] [i₃]\|i₄	1						1
or	[i₁] [i₂] [i₃]\|i₄	1								1
xor	[i₁] [i₂] [i₃]\|i₄	1													1
add	[i₁] [i₂] {[i₃]}{i₄}	1	1	1			1							1
sub	[i₁] [i₂] {[i₃]}{i₄}	1	1	-1			-1							1
mull	[i₁] [i₂] [i₃]\|i₄	1		1							1
umulh	[i₁] [i₂] [i₃]\|i₄	1									1
smulh	[i₁] [i₂] [i₃]\|i₄	1										1
udiv	[i₁] [i₂] [i₃]\|i₄	1		1					1
umod	[i₁] [i₂] [i₃]\|i₄	1							1
cmpe	[i₂] [i₃]\|i₄	1													1
cmpa	[i₃] {[i₂]}{i₄}	1		-1			-1							1
cmpg	[i₃] [i₂] i₄	1											1
mov	[i₁{+s₁[i₂]}] [i₃{+s₃[i₂]}]			1										1
jmp	i₄{+[i₃{+s₃[i₂]}]}				1
jmpc	i₄{+[i₃{+s₃[i₂]}]}					1
jmpnc	i₄{+[i₃{+s₃[i₂]}]}				1	-1

Table 2: Instruction set and control variable settings

5.1. Operands and flag. All our instructions have ≤ 3 operands v1 (output), v2 (input 1), v3 (input 2), and some need an advice variable v4, and/or use or affect flag:

v2 (input 1), v3 (input 2), flag_t-1 v1 (output), flag_t.

We will add range checks that v1,...,v4 are all in the word range [0,...,W-1] for every t (see below), and the following constraints for the flag:

control	c_flag
constraint	flag*(1-flag)=0	← flag = 0 or 1
constraint	(flag-flag^#)*(1-c_flag) = 0	← flag allowed to change if c_flag=1

5.2. Unsigned multiplication. Here is an example of a pair of instructions mull (product - lower word) and umulh (product - higher word, unsigned); they affect the flag, so c_flag is set:

control	e_prod, ϵ
var	hi = ϵv4+(1-ϵ)v1
var	lo = ϵv1+(1-ϵ)v4
var	prod = v2v3 - (lo+Whi)
constraint	e_prod*prod=0
inst	mull	: e_prod=1, c_flag=1, ϵ=1
inst	umulh	: e_prod=1, c_flag=1, ϵ=0

Control variable ϵ determines whether v1 (output) is the higher word hi or the lower word lo; v4 (advice) is the other one. Condition prod=0 and range checks ensure that lo and hi are computed correctly. Control variable e_prod is set to 1 for mull/umulh and to 0 for all other instructions, turning the prod=0 check on/off.

5.3. Setting flag for zero result. Multiplication and several other instructions (see Table 2) set the flag when either v1 or hi (zero result) or input v3 (division by 0) is zero. To check that this is done correctly, we introduce a column zer that keeps control of the expression whose vanishing sets the flag. For example, in the case of unsigned multiplication, e_prod = 1 and zer=hi. In the case of an instruction which does not need this check all the control flags in the expression for zer are zero, and zer=1-flag (which passes the check).

var	zer = (e_and+e_or+e_xor+e_sprod+e_ssum)v1 + e_prodhi + e_mod*v3
	+ (1-(e_and+e_or+e_xor+e_sprod+e_ssum)-e_prod-e_mod)*(1-flag)
constraint	flag*zer=0
external	Inv = invert(flag+zer)
constraint	(flag+zer)*Inv=1

The last 3 lines ensure that zer=0 ⇔ flag =1. Here invert is an external function that sends an element a ∈ F to its inverse 1∕a if a≠0 (and has unspecified effect when a = 0).

5.4. Unsigned arithmetic, comparisons and move. Addition and unsigned comparison are done in a similar way to unsigned multiplication. Control variables δ, ϵ, λ determine whether we are doing addition, subtraction or comparison or simply move v1:=v3, and the overall check is otherwise the same. The check also ensures that the flag is set correctly.

control	e_sum, δ, ϵ, λ
var	SUM = v1-(δv2+ϵv3+λ(i4-Wflag))
inst	add	: e_sum=1, c_flag=1, δ=1, λ=1
inst	sub	: e_sum=1, c_flag=1, δ=1, λ=-1
inst	cmpa	: e_sum=1, c_flag=1, ϵ=-1, λ=-1
inst	mov	: e_sum=1, ϵ=1
constraint	SUM*e_sum=0

Next we turn to unsigned division (udiv) and remainder (umod),

quo = v2 div v3, rem = v2 mod v3.

As for the lower/higher word dichotomy in case of multiplication, we use the same verification block for the two instructions, with a flag ϵ that control which of the variables quo and rem is the output v1, and which one is an advice word v4.

control	e_mod, ϵ
var	rem = ϵv4+(1-ϵ)v1
var	quo = ϵv1+(1-ϵ)v4
var	MOD = v2-(quo*v3+rem)
var	MOD_range = (1-flag)(v3-rem-1) + flag(-quo)
constraint	e_mod*MOD=0
inst	udiv	: e_mod=1, ϵ=1, c_flag=1
inst	umod	: e_mod=1, ϵ=0, c_flag=1

As above, we first check that flag is set if and only v3=0 (division by 0). We will check below that the column MOD_range is in the word range [0,...,W -1]. When flag=1 and v3=0,

MOD = v2-rem, MOD_range=-quo,

so MOD=0, 0≤MOD_range ≤ W -1 test that the quotient is 0 and the remainder is v2 in the division-by-0 case. In the normal case flag=0 and v3≠0, MOD=0, 0≤MOD_range ≤ W-1 test that v2,v3 satisfy the quotient/remainder relation and that the remainder, which is either v1,v4 and so ≥ 0, is moreover <v3.

5.5. Logic. There is a neat way to convert F_p-unfriendly logic operations to F_p-friendly arithmetic ones [B+18]. We can replace NOT by XOR with -1 at the stage of reading the code, so we just need AND, XOR and OR. The key observation is that for a,b ∈{0, 1},

a + b = 2(aAND b) + (a XOR b).

The same is true if a,b have only even bits in their binary expansion, so splitting machine words into even and odd bits helps convert addition to AND and XOR.

Notation 5.1. For a machine word a decomposed into bits, we write

∑ i a = ai2 = a0+2a1 +4a2 +8a3 +...= ae + 2ao, i

with a_e = ∑ _{i even}a_i2ⁱ the ‘even bits’ and a_o = ∑ _{i odd}a_i2^i-1 the shifted ‘odd bits’.

In this notation, for machine words a,b,c with a + b = c, we have

aAND b = 2(ao + bo)o + (ae + be)o, aXOR b = 2(a + b ) + (a + b ), o o e e e e aOR b = a AND b + aXOR b.

This converts to the following code and instructions:

control	e_and, e_xor, e_or
var	r1 = v2_o+v3_o
var	r2 = v2_e+v3_e
var	AND = v1-(2*r1_o+r2_o)
var	XOR = v1-(2*r1_e+r2_e)
var	OR = v1-(2r1_o+r2_o+2r1_e+r2_e)
constraint	e_andAND=0, e_xorXOR=0, e_or*OR=0
inst	and	: e_and=1, c_flag=1
inst	or	: e_or=1, c_flag=1
inst	xor	: e_xor=1, c_flag=1

5.6. Word range. To finish off the logic checks, it remains to verify that all x ∈{v₁,v₂,v₄,v₄,r₁,r₂} decompose as

x = xe + 2xo, xe,xo ∈ EvenBits,

where EvenBits is the table of all numbers in the word range that have only even bits in them. The equality on the left is just another constraint, and we implement the range check on the right with a table lookup (see �4.7),

lookup[β,γ]	EvenBits x_e,x_o

5.7. Signed arithmetic. Recall that our operand variables v_i are unsigned, lying in the word range [0,...,W-1]. To implement signed arithmetic, for v ∈{v₁,v₂,v₃} define a columm v_σ for the highest significant bit of v, and v_signed=v-v_σW ∈ [-W∕2,...,W∕2-1] for the signed version of v:

external	v_σ = highestbit(v)	← external call to compute v_σ
constraint	v_σ*(1-v_σ)=0	← must be 0 or 1
var	v_signed=v-v_σ*W	← in [-W/2,...,W/2-1]
var	v_sgncheck=v_o+(1-2v_σ)W/4	← provided this is non-negative

We check that v_sgncheck is in word range, in the same way as for the v_i. Then we can define signed multiplication and signed comparison:

var	ssum = v1 - (v2_signed+i4-v3_signed+W*flag)
var	sprod = v2_signedv3_signed- (v4+Wv1_signed)
constraint	e_ssum*ssum=0
constraint	e_sprod*sprod=0
inst	smulh	: e_sprod=1, c_flag=1
inst	cmpg	: e_ssum=1, c_flag=1, i4=0
inst	cmpge	: e_ssum=1, c_flag=1, i4=1

5.8. Memory. Recall that most of the instructions work as

v₂ (input 1), v₃ (input 2) v₁ (output).

We have no registers, and every memory cell can be any of the operands. At every time t we always read v₂ and v₃ and write v₁. We reserve Mem[0](=Mem[N]) for a ‘dummy’ memory cell that we read or write when needed, but which is not used by the program. For example, when executing

mov 10,20 (Mem[10]:=Mem[20])

we need to ensure that v₃ is read from Mem[20], v₂ from Mem[0] and v₁ written to Mem[10], while the instruction (mov) checks that v₁ = v₃ (see �5.4).

This memory scheme is implemented as follows. For every operand index i = 1, 2, 3 we define columns


v_i,addr	memory address accessed
v_i	value in Mem[v_i,addr] after access
v_i,vinit	initial value in Mem[v_i,addr] when program starts
v_i,usd	1 if memory has been accessed before, 0 else
v_i,tlink	previous time memory was accessed if v_i,usd=1, and
	last time when memory will be accessed if v_i,usd=0
v_i,vlink	value in Mem[v_i,addr] after access at time v_i,tlink

Column t_m and the word ‘time’ in the table refer to the ‘memory access clock’ that runs at the triple speed to the time t column. When executing an instruction at time t, we

read v₂ at t_m = 3t-2,
read v₃ at t_m = 3t-1, and
write v₁ at t_m = 3t.

For example, suppose that Mem[1] is initialised to 0, Mem[2] to 10, and Mem[1], Mem[2] are accessed twice during program execution at times t₁ < t₂, once to store Mem[1]:=Mem[2] (:=10) and once to add Mem[1]:=Mem[1]+Mem[2] (:=20). Here are the values of the memory columns in this case (*’s in the first row stand for some values that are not important here):


t	instruction	(t_mⁱ)	(i)	v_i,addr	v_i	v_i,vinit	v_i,usd	v_i,tlink	v_i,vlink


t₁	mov 1, 2	3t₁-2	2	0	*	*	*	*	*
		3t₁-1	3	2	0	10	0	3t₂-1	10
		3t₁	1	1	10	0	0	3t₂	20

t₂	add 1,1,2	3t₂-2	2	1	10	0	1	3t₁	10
		3t₂-1	3	2	10	10	1	3t₁-1	10
		3t₂	1	1	20	0	1	3t₂-2	10

Here are the constraints on the valid memory access:

(M1) Memory access form cycles. In our example, we have cycles 3t₁ → 3t₂-2 → 3t₂ → 3t₁ of length 3 for Mem[1] and 3t₁-1 → 3t₂-1 → 3t₁-1 of length 2 for Mem[2], such that tlink at every point agrees with time t_mⁱ of the previous one, and vlink agrees with v_i of the previous one. In other words, if we look at the overall set of 3N triples

( ) vi,addr,t,vi,tlink,t,vi,vlink,t , i=1,2,3,t=1,...,N

it is a permutation of the 3N triples

( ) vi,addr,t,3(t- 1)+i, vi,t . i=1,2,3,t=1,...,N

Mixing the three columns into one with some Fiat-Shamir constant α as in �4.9, this becomes a permutation of the type

union of 3 columns ~ union of 3 other columns

of the type discussed in �4.6 (with m = 3). Here is the resulting code:

var	v1_π^L = v1_addr + α(v1_tlink) + α²v1_vlink
var	v1_π^R = v1_addr + α(3t) + α²*v1
var	v2_π^L = v2_addr + α(v2_tlink) + α²v2_vlink
var	v2_π^R = v2_addr + α(3t-2) + α²*v2
var	v3_π^L = v3_addr + α(v3_tlink) + α²v3_vlink
var	v3_π^R = v3_addr + α(3t-1) + α²*v3
permutation[β]	v1_π^L,v2_π^L,v3_π^L ~ v1_π^R,v2_π^R,v3_π^R

(M2) Memory locations cannot be in more than one cycle. In our example, we need to check that there are no times t≠t₁,t₂ for which v_i,addr,t = 1 or 2 for some i. To do this, let MeInit be the initial state of the memory, and consider tables

(0, a, MeInit[a])_a=1^N and (1, a, MeInit[a])_a=1^N

We want to show that for every i and t, the triple

(v_i,usd,t, v_i,addr,t, v_i,vinit,t)

is contained in the first table exactly once when v_i,usd,t = 0 and contained in the second table when v_i,usd,t = 1. This ensures that all memory accesses belong to valid cycles, with each cell belonging to one cycle. This can be done with an injective lookup and a lookup, respectively. Again we mix the three columns using α:

global	MeInit	← Initial Memory, private
var	MeTable = 0 + αt + α²MeInit

Here MeTable is the mixed column corresponding to the first table, and for the second table it is simply MeTable+1 (replace 0+ by 1+ in its definition).

Let μ be another Fiat-Shamir constant. This is the code that we use for v ∈{v₁,v₂,v₃}:

var	v_Me = v_usd+αv_addr+α²v_vinit	← mix (usd,addr,vinit) for v
var	v_Me0 = (1-v_usd)v_Me+μv_usd	← Replace usd=1 entries by μ
var	v_Me1 = v_Me+μ-v_usd-v_Me0	← Replace usd=0 entries by μ
constraint	(v_Me0 - μ)*(v_Me1 - μ)=0	← check one of them holds
ilookup[β,μ]	MeTable v_Me0	← injective lookup when usd=0
lookup[β,γ]	MeTable v_Me1	← lookup when usd=1

(M3) Access times are in increasing order. We need to verify that times in the cycle are ordered correctly. That is, t_mⁱ < v_i,tlink when v_i,usd = 1, and t_mⁱ ≥ v_i,tlink when v_i,usd = 0. This guarantees that there is only one usd=0 entry in every cycle, at the smallest time. We can do both checks in one with a word range verification (see �5.6)

(1 - v ) * (v - ti ) + v * (ti - v - 1 ) ∈ [0,...,W - 1]. i,usd i,tlink m i,usd m i,tlink

(M5) Load instructions are correctly executed. For every memory load operation (in other words, for v₂ and v₃), we have to check that v_i = v_i,vinit when v_i,usd = 0 and v_i = v_i,vlink when v_i,usd = 1:

var	v2=v2_vinit+v2_usd*(v2_vlink-v2_vinit)	← for v2
var	v3mem=v3_vinit+v3_usd*(v3_vlink-v3_vinit)	← for v3, unless v3 replaced by i4
var	v3=v3mem(1-e_const)+e_consti4	← in XXXi instructions (e_const=1)

(M6) Memory addresses are correct. Finally, we need to check that we are accessing the right memory addresses.4 For most instructions the addresses for v₁, v₂, v₃ are simply the immediates i₁,i₂,i₃, but we wish to allow additional flexibility for mov’s and jmp’s; this is important for an assembly language that wants to support pointers of any kind. Specifically, in addition to

mov	Mem[i₁]:=Mem[i₃]
jmp/jmpc/jmpnc	i₃

we allow a more general version

mov	Mem[i₁ + s₁ Mem[i₂]]:=Mem[i₃ + s₃ Mem[i₂]]
jmp/jmpc/jmpnc	Mem[i₃ + s₃ Mem[i₂]]

This is easy to implement as follows:

var	v1_addr = i1+s1*v2
var	v2_addr = i2
var	v3_addr = i3+s3*v3

5.9. Program lookup. Finally, we mix the program counter column pc and all the control columns into one column InstExe, with some Fiat-Shamir constant α. We also mix the corresponding columns in the original source program into a column InstProg, and we note that this column (like, for example, EvenBits as well) is accessible to Verifier. Verifier knows what the program is, and needs to be certain that this is the one that Prover has executed. In other words, Verifier needs to be certain that all entries in InstProg are contained in InstExe. This is a lookup

lookup[β,γ]	InstProg InstExe

Not that is is certainly not an injective lookup, as the same program entry is generally executed multiple times, at different times t. The lookup does not reveal those times, only that the correct program was executed.

6. Prover/Verifier protocol

In Appendix A we summarise all the column declarations, constraints and instructions described in �5 in a configuration file. We are now ready to explain what the Prover/Verifier code looks like for verifying the correctness of an execution trace for our VM. This is done in two steps:

The configuration file is analysed to generate the number and the names of columns and the set of all primitives, correctly ordered. These are the quadratic constraints on the columns (possibly relaxed or with boundary), permutations, lookups and instruction settings. This is done once per configuration, and the output of this process is pieces of code (say, in C or assembly), that are configuration specific, inserted into the Prover/Verifier code.
The Prover/Verifier code is generally independent of the configuration apart from the pieces in (1) and a few external functions (such as highestbit, oddbits, CompleteMemoryCycles,... in our configuration). It handles the FRI machinery, but is agnostic to the exact columns and constraints.

The reason for this split is that the Prover/Verifier code that we get in the end can be highly optimised, and also adapted to a specific configuration. If, for example, a VM with registers or a VM with a tailored set of instructions is needed for a certain application, its configuration can be compiled quickly to an optimised Prover/Verifier code.

6.1. VM configuration. This is how the configuration file (Appendix A) is processed:

The configuration file is read, macros are processed and comments are removed. What remains is a sequence of commands var, constraint, inst, etc., one line per command. Underline signs are merged into identifiers
Lookups and injective lookups with the same table are grouped together
Permutations and lookups are replaced by column and constraint generating primitives (see �4.5–�4.9)
Derived columns for which there is a linear expression in other columns can be eliminated; more generally, at this stage columns and constraints can be rewritten to any specified degree bound (not just quadratic, like in our example)
Derived and external columns are ordered in order that they can be generated
Column-dependent pieces of Prover/Verifier code are generated

6.2. Prover code for configured VM.

6.2.1. Global settings.

Set global constants p (prime), N (memory size and bound the number of time steps) and W (VM word size)
Let ζ ∈ F_p be an element of order 4N, and s ∈ F_p of order not dividing 4N
Let x_i = ζ⁴ⁱ for i = 1,...,N, coordinates for the original trace
Let y_i = sζⁱ for i = 1,..., 4N, coordinates for the disguised trace
Choose a hash function to be used by the Merkle trees
Choose a pseudo-random generator (PRNG) for elements of F_p

6.2.2. Initialisation and program loading.

Create column array and fill it with zeroes
Read source program, initialising all control columns
Initialise memory and other initial tables (e.g. MeInit, EvenBits)

6.2.3. Program execution.

Execute program step by step
Along the way, populate all derived and external columns, excepts those that are global or rely on other global columns or Fiat-Shamir constants in their derivation
After the program has been executed, populate global columns except those that rely on Fiat-Shamir constants (e.g. complete memory cycles)

6.2.4. Stage 1 column commitment.

Interpolate all columns C_j computed so far from the execution trace with polynomials f_j of degree N that have those values at the x_i (using NTT, the number-theoretic transform)
Evaluate each f_j at all the y_i, using NTT again
Create a Merkle tree with 4N leaves Hash(i,f₁(y_i),f₂(y_i),...) for i = 1,..., 4N
Let r₁ be the Merkle tree root, viewed as an element of F_p

6.2.5. Fiat-Shamir constants.

Compute all Fiat-Shamir constants used for mixing, permutations, lookups etc. (in our implementation α, β, γ, μ) with PRNG seeded to the Merkle root r₁

6.2.6. All other columns, verification polynomial and commitment #2.

Compute all remaining columns (e.g. InstProg) in the execution trace (*)
Compute the values of verification polynomial using the constraints
Interpolate the verification polynomial at x_i and evaluate it at the y_i (**)
Commit all columns (*) and (**) as in �6.2.4 to a new Merkle tree with root r₂.

6.2.7. FRI parameters.

Seed PRNG with its current seed from �6.2.5 plus r₂
Using PRNG, pick a deep evaluation point in F_p, checking that it is not one of the x_i or the y_i
Generate FRI branch locations with PRNG

6.2.8. FRI low degree proof.

Evaluate the verification polynomials and all the polynomials interpolating the columns at the deep point and its ‘time step predecessor’, deep point dividing by ζ⁴; these are the deep taps
Generate FRI degree proof with branch locations from �6.2.7

6.2.9. Prover → Verifier. The proof of correct program execution consists of

Merkle tree roots r₁ and r₂
Deep taps
FRI low degree proof for the verification polynomial, consisting of a number of branches for the Merkle trees

6.3. Verifier code for configured VM.

Get r₁, r₂, deep taps and the FRI proof as in �6.2.9 from Prover
Set the same global settings as in �6.2.1
From r₁, set the same Fiat-Shamir parameters as in �6.2.5
From r₂, set the same FRI parameters as in �6.2.7
Interpolate polynomials for columns that are not determined by the constraints but shared between Prover and Verifier (e.g. InstProg, EvenBits)
From these values and the deep taps compute the values of all column polynomials at the deep point
Deduce the values of the verification polynomial at all the FRI branch locations
Check that these values together with the column values hash to the leaves of the FRI branches
Check the all the FRI lead to the correct Merkle roots
Check that the FRI low degree proof is correct

References

[A⁺22] H. M. Ardevol, J. B. Mel�, D. Lubarov, J. L. Mu�oz-Tapia, RapidUp: Multi-Domain Permutation Protocol for Lookup Tables, Cryptology ePrint Archive, Report 2022/1050, 2022, https://eprint.iacr.org/2022/1050.

[BS⁺13] E. Ben-Sasson, A. Chiesa, E. Tromer, M. Virza, Succinct non-interactive arguments for a von neumann architecture, Cryptology ePrint Archive, Report 2013/879, 2013, https://eprint.iacr.org/2013/879.

[BS⁺18] E. Ben-Sasson, I. Bentov, Y. Horesh, M. Riabzev, Fast reed-solomon interactive oracle proofs of proximity, in: 45th international colloquium on automata, languages, and programming (icalp 2018), Schloss Dagstuhl-Leibniz-Zentrum f�r Informatik, 2018.

[BS⁺19] E. Ben-Sasson, L. Goldberg, S. Kopparty, S. Saraf, DEEP-FRI: Sampling outside the box improves soundness, preprint, 2019, arXiv: 1903.12243.

[B⁺18] J. Bootle, A. Cerulli, J. Groth, S. Jakobsen, M. Maller, Arya: Nearly Linear-Time Zero-Knowledge Proofs for Correct Program Execution, Cryptology ePrint Archive, Report 2018/380, 2018, https://eprint.iacr.org/2018/380.

[BD23] A. Bulkin, T. Dokchitser, Plonkup scheme with multiple queries, Cryptology ePrint Archive, Report 2023/066, 2023, https://eprint.iacr.org/2023/066.

[B⁺23] B. B�nz, J. Bootle, D. Boneh, A. Poelstra, P. Wuille, G. Maxwell, Bulletproofs: Short Proofs for Confidential Transactions and More, 2018 IEEE Symposium on Security and Privacy (2018), 315–334.

[Ce19] Efficient Zero-Knowledge Proofs and their Applications, PhD thesis, UCL, 2019.

[GW20] A. Gabizon, Z. J. Williamson, plookup: A simplified polynomial protocol for lookup tables, Cryptology ePrint Archive, Report 2020/315, 2020, https://eprint.iacr.org/2020/315.

[GMR85] S. Goldwasser, S. Micali, C. Rackoff, The knowledge complexity of interactive proof-systems (extended abstract), in: Proceedings of the 17th Annual ACM Symposium on Theory of Computing, May 6-8, 1985, Providence, Rhode Island, USA, pp 291–304, 1985.

[KZG] A. Kate, G. M. Zaverucha, I. Goldberg, Constant-Size Commitments to Polynomials and Their Applications, in: Abe, M. (eds) Advances in Cryptology - ASIACRYPT 2010, LNCS, vol 6477 (2010), Springer, Berlin, Heidelberg.

[P⁺22] L. Pearson, J. Fitzgerald, H. Masip, M. Bell�s-Mu�oz, J. L. Mu�oz-Tapia, PlonKup: Reconciling PlonK with plookup, Cryptology ePrint Archive, Report 2022/086, 2022, https://eprint.iacr.org/2022/086.

Appendix A. VM Configuration

Configurator commands
One line per command, with the exception of multi-line macros

const k1, ...	global constant declaration (N=#time steps, W=#bits, ...)
const FiatShamir α, ...	random constants used in constraints, computed after the
	first commitment via Fiat-Shamir heuristic
var c1=C1 [except t=t0], ...	new derived column and its polynomial expression
constraint C1=C2 [except t=t1], ...	new polynomial constraint between columns
boundary c1(t1)=v1, ...	boundary condition on a column
external c1,c2, ... : cmd	call C++ code at every time t, replacing c1C[c_c1][t],...
	columns in rhs (except c1,..., and those referred with ~)
	need already to be computed at this time
external c1=cmd	shorthand for external c1: c1=cmd
global c1,c2, ... : cmd	call C++ code with strings replaced c1 c_c1,...
	all columns in rhs (except c1,..., and those referred with ~)
	need to be computed first for all t
global c1=cmd	shorthand for global c1: c1=cmd
control c1[=v1],...	define a control column and its default value if ≠0
inst name i1,... : e1=v1,...	define a new instruction name with operands i1,...
	and control flags e1=v1,...
permutation[β] c1,c2,... ~d1,d2,...	permutation constraint ⋃ c_i ~⋃ d_i with F-S constant α
lookup[β,γ] T F1,F2,...	add F1,... to the lookup “F1<T” with F-S constants β, γ
ilookup[β,μ] T F1,F2,...	add F1,... to the injective lookup “F1<T except μ”
	with F-S constants β, μ
macro name par1,par2,...= { code }	define macro name with parameters par1,... and a given code
	macros are processed before “ ” are merged in identifiers
// ...	inline comment, to end of line
⟨line⟩ \ ⟨line⟩ \ ...	ignore line breaks, merging several lines into one command

Configuration code

const N,W,W4
const FiatShamir mu,alpha,beta,gamma         // from Verifier via Fiat-Shamir

var t=t#+1 except t=0                        // time column
boundary t(0)=0

external v1,v4,flag : execute(c_prev,c,~v1,~v4,~flag,v2,v3)   // v2,v3 -> v1(output),v4(advice),flag
global EvenBits = EvenBitsTable(k,Round(Log(2,W)),N)          // EvenBits: sums of powers of 4
global MeInit: for i:=1 to #M do C[MeInit][i]:=k!M[i]; end for;   // MeInit: Initial Memory (private)

// --- Macros, columns and constraints

macro word C = {                // decompose C into evenbits and odd bits
  external C_o = oddbits(C)     // and check that their values are in [0..W-1]
  var C_e = C-2*C_o
  lookup[beta,gamma] EvenBits C_e,C_o
}

macro Mem v,address,t = {
  external2 v_vlink, v_tlink
  word v
  var v_addr = address
  var v_Me = v_usd+alpha*v_addr+alpha^2*v_vinit
  var v_Me0 = (1-v_usd)*v_Me+mu*v_usd
  var v_Me1 = v_Me+mu-v_usd-v_Me0
  constraint (v_Me0-mu)*(v_Me1-mu)=0
  ilookup[beta,mu] MeInit v_Me0
  lookup[beta,gamma] MeInit v_Me1
  var v_mpos=(v_usd-1)*(t-v_tlink)+v_usd*(t-v_tlink-1)
  word v_mpos
  var v_piL = v_addr + alpha*(v_tlink) + alpha^2*v_vlink
  var v_piR = v_addr + alpha*(t)       + alpha^2*v
}

macro sign v = {
  external v_sigma = highestbit(v)
  constraint v_sigma*(1-v_sigma)=0
  var v_signed=v-v_sigma*W
  var v_sgncheck=v_o+(1-2*v_sigma)*W4
}

Mem v1,i1+s1*v2,3*t        // v1: for writing, v2,v3: for reading
Mem v2,i2      ,3*t-2      // v4: advice variable
Mem v3,i3+s3*v2,3*t-1      // process v2, then v3, then write to v1
word v4                    // all 4 have to be words

var v2=v2_vinit+v2_usd*(v2_vlink-v2_vinit)      // memory load constraints
var v3mem=v3_vinit+v3_usd*(v3_vlink-v3_vinit)   // determine v2 and v3
var v3=v3mem*(1-eCONST)+eCONST*i4               // unless v3 is replaced by i4 (imm) by XXXi instructions

external v2_vinit, v2_usd: memload(3*t-2,~V‘M,v2_addr,   ~v2_vinit,~v2_vlink,~v2_tlink,~v2_usd)
external v3_vinit, v3_usd: memload(3*t-1,~V‘M,v3_addr,   ~v3_vinit,~v3_vlink,~v3_tlink,~v3_usd)
external v1_vinit, v1_usd: memsave(3*t  ,~V‘M,v1_addr,v1,~v1_vinit,~v1_vlink,~v1_tlink,~v1_usd)
                                                                                          
                                                                                          

global v1_vlink, v1_tlink, v2_vlink, v2_tlink, v3_vlink, v3_tlink:      \
  CompleteMemoryCycles(~V‘C,~V‘M,<3,[0,-2,-1]>,[v1_usd,v2_usd,v3_usd],  \
  [v1_tlink,v2_tlink,v3_tlink],[v1_vlink,v2_vlink,v3_vlink])

permutation[beta] v1_piL,v2_piL,v3_piL ~ v1_piR,v2_piR,v3_piR

control i1,i2,i3,i4            // immediates and their defaults; for i1,i2,i3 this
                               // refers to Mem[0]=Mem[N] - advice memory cell
control epsilon,delta,lambda   // reusable control columns for various instructions
control s1,s3                  // dynamic memory address control columns

sign v1     // signs (highest bits) of v1,v2,v3 for
sign v2     // SMULH, CMPG, CMPGE (signed operations)
sign v3

var r1 = (eAND+eOR+eXOR)*(v2_o+v3_o) + eMOD*MODrange + (eSSUM+eSPROD)*v1_sgncheck
var r2 = (eAND+eOR+eXOR)*(v2_e+v3_e)                 + (eSSUM+eSPROD)*v2_sgncheck
var r3 =                                               (eSSUM+eSPROD)*v3_sgncheck

word r1     // auxiliary variables for range checks, depending on the instruction
word r2
word r3

control  j1=0,j2=0           // controls for conditional and unconditional jumps
var      jsel = j1+j2*flag   // =0 (pc->pc+1), =1 (jmp), =flag(jmpc), =1-flag(jmpnc)
var      pc = (1-jsel)*(pc#+1)+jsel*v3 except t=0    // change of pc constraint
boundary pc(0)=1

control    cFLAG
constraint flag*(1-flag)=0                // flag = 0 or 1
constraint (flag-flag#)*(1-cFLAG) = 0     // only allowed to change if control cFLAG=1

var        zer = (eAND+eOR+eXOR+eSPROD+eSSUM)*v1 + ePROD*hi + eMOD*v3 \
               + (1-(eAND+eOR+eXOR+eSPROD+eSSUM)-ePROD-eMOD)*(1-flag)
constraint flag*zer=0                     // zer=0 <=> flag=1
external   Inv = invert(flag+zer)         // with zer set to v1 or hi when necessary
constraint (flag+zer)*Inv=1               // or 1-flag otherwise (no effect)

var        SUM = v1-(delta*v2+epsilon*v3+lambda*(i4-W*flag))
var        hi = epsilon*v4+(1-epsilon)*v1
var        lo = epsilon*v1+(1-epsilon)*v4     // hi=v1, lo=v4 for epsilon=0 (default)
var        PROD = v2*v3 - (lo+W*hi)
var        AND = v1-(2*r1_o+r2_o)
var        XOR = v1-(2*r1_e+r2_e)
var        OR  = v1-(2*r1_o+r2_o+2*r1_e+r2_e)
var        rem=hi   // re-used, aliases   // UMOD => v1=rem (output), v4=quo (advice)
var        quo=lo                         // UDIV => v1=quo (output), v4=rem (advice)
var        MOD = v2-(quo*v3+rem)                             // v2=quo*v3+rem
var        MODrange = (1-flag)*(v3-rem-1) + flag*(-quo)      // flag=0 => word v3-rem-1 <=> rem<v3,
var        SSUM  = v1 - (v2_signed+i4-v3_signed+W*flag)      // flag=1 => quo=0 => rem=0
var        SPROD = v2_signed*v3_signed - (v4+W*v1_signed)    // for cmpg,cmpge; for smulh
var        CMOV = v1-flag*v2-(1-flag)*v3
control    ePROD,eSUM,eCONST,eMOD,eSSUM,eSPROD,eAND,eOR,eXOR,eCMOV
constraint SUM*eSUM=0, PROD*ePROD=0, eSSUM*SSUM=0, eSPROD*SPROD=0, eMOD*MOD=0
constraint eOR*OR=0, eAND*AND=0, eXOR*XOR=0, eCMOV*CMOV=0

Instruction set
Operands and control variables

inst	and i1 i2 i3	: e_and=1, c_flag=1
inst	andi i1 i2 i4	: e_and=1, c_flag=1, e_const=1
inst	or i1 i2 i3	: e_or=1, c_flag=1
inst	ori i1 i2 i4	: e_or=1, c_flag=1, e_const=1
inst	xor i1 i2 i3	: e_xor=1, c_flag=1
inst	xori i1 i2 i4	: e_xor=1, c_flag=1, e_const=1
inst	add i1 i2 i3 i4	: e_sum=1, c_flag=1, δ=1, λ=1, ϵ=1
inst	addi i1 i2 i4	: e_sum=1, c_flag=1, δ=1, λ=1
inst	sub i1 i2 i3 i4	: e_sum=1, c_flag=1, δ=1, λ=-1, ϵ=-1
inst	subi i1 i2 i4	: e_sum=1, c_flag=1, δ=1, λ=-1
inst	mull i1 i2 i3	: e_prod=1, ϵ=1, c_flag=1
inst	mulli i1 i2 i4	: e_prod=1, ϵ=1, c_flag=1, e_const=1
inst	umulh i1 i2 i3	: e_prod=1, c_flag=1
inst	umulhi i1 i2 i4	: e_prod=1, c_flag=1, e_const=1
inst	smulh i1 i2 i3	: e_sprod=1, c_flag=1
inst	smulhi i1 i2 i4	: e_sprod=1, c_flag=1, e_const=1
inst	udiv i1 i2 i3	: e_mod=1, ϵ=1, c_flag=1
inst	udivi i1 i2 i4	: e_mod=1, ϵ=1, c_flag=1, e_const=1
inst	umod i1 i2 i3	: e_mod=1, c_flag=1
inst	umodi i1 i2 i4	: e_mod=1, c_flag=1, e_const=1
inst	cmpe i2 i3	: e_xor=1, c_flag=1
inst	cmpei i2 i4	: e_xor=1, c_flag=1, e_const=1
inst	cmpa i3 i2 i4	: e_sum=1, c_flag=1, δ=1, ϵ=-1, λ=-1
inst	cmpai i3 i4	: e_sum=1, c_flag=1, ϵ=-1, λ=-1
inst	cmpg i3 i2 i4	: e_ssum=1, c_flag=1
inst	mov i1 i3 s1 s3 i2	: e_sum=1, ϵ=1
inst	movi i1 i4 s1 i2	: e_sum=1, ϵ=1, e_const=1
inst	cmov i1 i2 i3	: e_cmov=1
inst	cmovi i1 i2 i4	: e_cmov=1, e_const=1
inst	jmp i3 s3 i2	: j1=1, j2=0
inst	jmpi i4	: j1=1, j2=0, e_const=1
inst	jmpc i3 s3 i2	: j1=0, j2=1
inst	jmpci i4	: j1=0, j2=1, e_const=1
inst	jmpnc i3 s3 i2	: j1=1, j2=-1
inst	jmpnci i4	: j1=1, j2=-1, e_const=1

Adapt Framework solutions

Email address: [email protected]