Polynomial Protocols for R1CS String Manipulation
In this post, we are going to describe two polynomial protocols that Alin Tomescu and I came up with when working on the Aptos Keyless circuit, an R1CS program written in the Circom domain-specific language to enable verification of OIDC credentials, which requires parsing a JWT string after verifying a signature computed over it. This is more difficult than it sounds - simple array assignment in Circom can create non-quadratic constraints, which are not allowed when programming in R1CS. The other solutions we found for string manipulation were either not efficient, or difficult to understand (and therefore difficult to implement, a frightening thing given how bug-prone handrolling zk proof circuits can be). So we created the following two protocols, one for concatenating two strings, and another for substring extraction.
For notation, when dealing with a string or array
Concatenating Two Strings
Our first problem is as follows: given two strings
In R1CS-land, however, we aren’t interested in doing computations ourselves but in verifying the correctness of a computation. Often this is easier to implement than performing the computation directly in Circom. This gives us the following problem: Given three strings
There’s another subtley here to deal with: because we are writing an arithmetic circuit, the type and number of constraints must be fixed in advance. This means that
Just to make things very explicit, if we have original strings
First we compute a random challenge
Finally, we denote the actual length of
and also check that all values of of
Correctness of the protocol above is easy to verify. And as might be expected, soundness here follows from the Schwartz-Zippel lemma.
As measured by the Circom compiler, this protocol uses around
NOTE: This protocol does verify the actual length of
Extracting a Substring
Our next problem is: Given a larger string
As before, the context of needing to verify computation translates this problem into the following: given a string
As before, we will need
Let’s give a concrete example, as we did before. If we have original strings
This protocol is slightly more involved than the last one, as we will need to make use of some interesting quirks of developing R1CS circuits. As mentioned before, we cannot simply copy parts of an array. To get around this, first we compute an array
Next we compute a string
Similar to our concatenation protocol, we next define and compute the polynomials
To verify, we check that
As above, completeness is easy to verify, and soundness follows from Schwartz-Zippel.
As measured by the Circom compiler, this uses roughly