Blockchain-Privacy-Sok-Notes

Background

Blockchain

A blockchain is an append-only log, aka ledger. In the context of a cryptocurrency, this log records all the transactions in the system. Miners compete to mine the next block and collect rewards when they do. The current state of the chain is determined by a consensus protocol run by the miners. End users broadcast transactions to the miners, and track only their own transactions, rather than the full chain.

Cryptocurrency

Track currency ownership in one of two ways:
- UTXO (Unspent transaction output): pioneered by Bitcoin. Miners keep track of all unspent transactions, and a client's total balance is the sum of these.
- Account model: pioneered by Ethereum. Each address has an account on the blockchain with a maintained balance value.
Functionality:
- Only currency transfer, perhaps with limited scripting for conditional payments
  - "Bitcoin-like"
- Arbitrary programs on chain for miners to execute
  - "Ethereum-like"
Ledger $\mathcal{L}$ is secure if it satisfies:
- Persistence: Given honest parties $P_1, P_2$ and time rounds $t_1 \le t_2$ , stable state of $\mathcal{L}$ maintained by $P_1$ at $t_1$ is a prefix of the stable state of $\mathcal{L}$ by $P_2$ at $t_2$ with overwhelming probability.
- Liveness: Given liveness parameter $u$ , if transaction $tx$ is broadcast at time round $t$ , then $tx$ is recorded on $\mathcal{L}$ at time at most $t+u$ with overwhelming probability.

Privacy

In addition to security of the ledger and consensus protocol, a privacy-preserving cryptocurrency must satisfy:

Ledger indistinguishability: adversary cannot distinguish between two versions of a ledger that differ in at least one transaction (with non-negligible probability)
Overdraft safety: adversary cannot spend more currency than they own

Smart contracts are general programs, thus achieving privacy is very non-trivial. Some challenges:

Handling concurrency with encrypted values
Inputs from various users with different public key encryptions
What if a given smart contract has security vulnerabilities?
What about interactions between private and public accounts?

Building blocks

Commitments:
- Setup(k) -> pp: takes security param $k$ and outputs public params $pp$ .
- Commit(pp, m, r) -> c: takes message $m$ and random $r$ and outputs commitment $c$ to $m$ .
- Open(pp, c) -> (m, r): outputs decommitment
- Must satisfy:
  - Hiding: $c$ does not reveal $m$
  - Binding: $c$ cannot open to $m'$ where $m \ne m'$ .
- Some are additively homomorphic, which is useful for updating private account balances.
Homomorphic Encryption
Zero Knowledge Proofs: a non interactive ZKP allows a prover to convince a verifier that it knows a witness $\omega$ for some statement $x$ without revealing anything about the witness, beyond what can be inferred by $x$ itself.
- Setup(k, np_spec) -> pp: takes security param $k$ and specifications of NP relation, which determine the set of all valid statements $x$ ; outputs public params pp.
- Prove(pp, x, w) -> pi: outputs proof $\pi$ proving correctness of $x$ (which means it satisfies the NP relation).
- Verify(pp, x, pi) -> {0, 1}: outputs whether or not proof is valid
- Must satisfy:
  - Completeness: any honestly generated proof is accepted by the verifier
  - Soundness: prover cannot convince verifier of false statements
  - Zero Knowledge: the proof $\pi$ reveals nothing about the witness $\omega$ .
- Can optionally satisfy
  - Succinctness: proof size is constant, verification time is linear in size of input
- When all four are satsified, the system is a zk-SNARK
  - aka Zero Knowledge Succinct Noninteractive ARgument of Knowledge
Multi Party Computation: a protocol that allows a set of mutually-distrusted parties to evaluate a function over their private inputs without revealing anything about these inputs, other than what can be deduced from the output.
- Must satisfy:
  - Correctness: outputs must be the same as if computation $f$ computed in the clear
  - Privacy: other than the outputs, no info about intputs is leaked
  - Fairness: either all or none of the parties learn the output

Desirable ZKP Properties

Universality: the property that a single reference string may be used to prove any NP statement.
- Hard to achieve with lightweight ZKPs.
- Mostly important for general computation, not payment transfers. General computation with a non-universal ZKP system means that new reference strings need to be generated for each new application, which is expensive.
- Yet still, 3 out of 4 of the private computing solutions using ZKP propose proof systems with non-universal reference strings.
Trust: many ZKPs require a "trusted setup", wherein a trusted party generates initial parameters. This preprocessing step helps future computations to be more efficient. Those without this requirement are called transparent. In the systems of this survey, such non-transparent ZKPs are also non-universal.
- This required trusted party represents a security issue.
- Motivates transparent proof systems like Bulletproofs.
- Non-transparent systems are still popular because of the efficiency gains.

Existing Solutions

Existing solutions have trended from simpler to more advanced functionality over the last decade; in increasing order,

Private Payments: a response to Bitcoin's lack of privacy. Includes confidentiality and sender/recipient anonymity.
Private Computation: extends the above functionality to allow arbitrary computation, preserving privacy of inputs/outputs. Flavors:
- C1: "On-chain" aka homomorphic encryption: users instruct miners to execute computations over private inputs, producing private outputs
- C2.1 "Off-chain" via ZKP; work is offloaded to a user who does the computation locally and produces a ZKP for minors to verify before accepting the output and updating blockchain state.
- C2.2 "Off-chain" without ZKP delegate computation to trusted entities.
Function privacy: extension to also keep the computation itself private.

Tradeoffs

Prioritizing privacy: prefer C1, C2.1
Prioritizing lightweight users: prefer C1, C2.2
Prioritizing high throughput: prefer C2.1
Balancing lightweight users and high throughput: perhaps C2.2
To maximize user experience and user privacy: UTXO over account model, to simplify things.
Minimizing system fees: prefer C2.1, C2.2

Future Solutions

Likely to be standalone systems; building on top of Ethereum is challenging while preserving privacy.
Each paradigm C1, C2.1, C2.2 has its own strengths and likely will continue to develop.
Multi-user input:
- Multi-key FHE: inefficient schemes, practical off-chain coordination is challenging.
- Extending ZKP with MPC: increases the load on the user, requires staying online throughout execution
ZKP trust spectrum: rather than black-and-white, transparent or trusted setup, new research has begun on a hybrid version with updateable reference strings.