For ethereum, 2018 is the year of infrastructure. This is the year when early adoption will test the limits of the network, renewing focus on technologies built to scale ethereum.
Ethereum is still in its infancy. Today, it isn’t safe or scalable. This is well understood by anyone who works closely with the technology. But over the last year, the ICO-driven hype has begun to greatly exaggerate the current capabilities of the network. The promise of ethereum and web3 (a safe, easy-to-use decentralized internet, bound by a common set of economic protocols, and used by billions of people) is still on the horizon, and will not be realized until critical infrastructure is built.
The projects working to build this infrastructure and expand the capabilities of ethereum are commonly referred to as scaling solutions. These take many different forms, and are often compatible with or complementary to each other.
In this long post I want to dive deep into one category of scaling solution: “off-chain” or “layer 2” solutions.
- First, we’ll discuss the scaling challenges of ethereum (and all public blockchains) in general.
- Second, we’ll cover the different approaches to solving the scaling challenge, distinguishing between “layer 1” and “layer 2” solutions.
- Third, we’ll delve into layer 2 solutions and explain how they work. Specifically, we’ll talk about state channels, Plasma, and Truebit.
This article focuses on giving the reader a thorough and detailed conceptual understanding of how layer 2 solutions work. But we won’t dig into code or specific implementations. Rather, we focus on understanding the economic mechanisms used to build these systems, and the common insights that are shared between all layer 2 technologies.
1. The scaling challenges of public blockchains
First, it’s important to understand that “scaling” isn’t a single, specific problem. It refers to a collection of challenges that must be overcome to make ethereum useful to a global user base of billions of people.
The most commonly discussed scaling challenge is transaction throughput. Currently, ethereum can process roughly 15 transactions per second, while Visa, in comparison, processes approximately 45,000 transactions per second. In the last year, some applications, like Cryptokitties or the occasional ICO, have been popular enough to “slow down” the network and raise gas prices.
The core limitation is that public blockchains like ethereum require every transaction to be processed by every single node in the network. Every operation that takes place on the ethereum blockchain — a payment, the birth of a Cryptokitty, deployment of a new ERC20 contract — must be performed by every single node in the network in parallel. This is by design — it’s part of what makes public blockchains authoritative. Nodes don’t have to rely on someone else to tell them what the current state of the blockchain is — they figure it out for themselves.
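To make this concrete, here is a minimal Python sketch of full replication. The `Node` class, the balance-map state, and the transaction tuples are hypothetical simplifications chosen for illustration; they are not how an ethereum client is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy full node that keeps its own copy of the state (hypothetical model)."""
    balances: dict = field(default_factory=dict)
    ops_processed: int = 0

    def apply(self, tx):
        sender, receiver, amount = tx
        # Every node does this work itself, whether or not the transfer is valid.
        self.ops_processed += 1
        if self.balances.get(sender, 0) < amount:
            return  # the sender cannot afford it, so the state is unchanged
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount


def replicate(nodes, txs):
    """Every node processes every transaction, in the same order."""
    for node in nodes:
        for tx in txs:
            node.apply(tx)


genesis = {"alice": 10, "bob": 0}
nodes = [Node(dict(genesis)) for _ in range(3)]
txs = [("alice", "bob", 4), ("bob", "alice", 1), ("bob", "carol", 5)]
replicate(nodes, txs)

# All nodes arrive at the same state without trusting each other...
same_state = all(n.balances == nodes[0].balances for n in nodes)
# ...but the network as a whole did len(nodes) * len(txs) units of work.
total_work = sum(n.ops_processed for n in nodes)
```

Adding more nodes to this toy network increases `total_work` but never lets the network handle more transactions: each node still has to process the whole list.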
This puts a fundamental limit on ethereum’s transaction throughput: it cannot be higher than what we are willing to require from an individual node.
We could ask every individual node to do more work. If we doubled the block size (i.e., the block gas limit), it would mean that each node is doing roughly double the amount of work processing each block. But this comes at the cost of decentralization: requiring more work from nodes means that less powerful computers (like consumer devices) may drop out of the network, and mining becomes more centralized among powerful node operators.
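A back-of-the-envelope calculation shows the trade-off. The inputs are illustrative assumptions, not measurements: a block gas limit of 8,000,000, a block time of 15 seconds, and 21,000 gas per transaction (the cost of a plain ether transfer; contract calls cost more, so real throughput is lower than this upper bound).

```python
def max_tps(block_gas_limit, gas_per_tx, block_time_s):
    """Upper bound on transactions per second for a given block gas limit."""
    txs_per_block = block_gas_limit // gas_per_tx
    return txs_per_block / block_time_s


# Assumed values, for illustration only.
BLOCK_GAS_LIMIT = 8_000_000
GAS_PER_TX = 21_000   # plain ether transfer
BLOCK_TIME_S = 15

base = max_tps(BLOCK_GAS_LIMIT, GAS_PER_TX, BLOCK_TIME_S)
doubled = max_tps(2 * BLOCK_GAS_LIMIT, GAS_PER_TX, BLOCK_TIME_S)

# Throughput roughly doubles, but so does the gas each node must execute per block.
work_ratio = (2 * BLOCK_GAS_LIMIT) / BLOCK_GAS_LIMIT
```

Under these assumptions the bound moves from about 25 to about 51 transactions per second, and `work_ratio` is 2: the extra throughput is paid for entirely by extra work on every node.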
Instead, we need a way for blockchains to do more useful stuff without increasing the workload on individual nodes.
Conceptually, there are two ways we might go about solving this problem:
I. What if each node didn’t have to process every operation in parallel?
The first approach rejects our premise: what if we could build a blockchain where not every node had to process every operation? What if, instead, the network were divided into two sections, which could operate semi-independently?
Section A could process one batch of transactions, while Section B processed another batch. This would effectively double the transaction throughput of a blockchain, since our limit is now what can be processed by two nodes at the same time. If we can split a blockchain into many different sections, then we can increase the throughput of a blockchain by many multiples.
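The same idea can be sketched as a toy in Python. Assigning a transaction to a section by hashing the sender's account name is an assumption made here for illustration, and the sketch ignores the hard parts of splitting a real blockchain, such as transactions that touch more than one section.

```python
import hashlib
from collections import defaultdict


def section_of(account, num_sections):
    """Map an account to a section (hypothetical rule: hash of the account name)."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_sections


def work_per_section(txs, num_sections):
    """Count how many transactions the nodes of each section must process."""
    load = defaultdict(int)
    for sender, _receiver, _amount in txs:
        load[section_of(sender, num_sections)] += 1
    return dict(load)


txs = [(f"user{i}", f"user{i + 1}", 1) for i in range(1000)]

unsplit = work_per_section(txs, 1)  # one section: its nodes process all 1000
split = work_per_section(txs, 2)    # two sections: each handles part of the batch
```

With two sections, no single node has to process the full batch, so the same per-node workload supports a larger total number of transactions.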
This is the insight behind “sharding”, a scaling solution being pursued by Vitalik’s Ethereum Research group and others. A blockchain is split into different sections called shards, each of which can independently process transactions. Sharding is often referred to as a Layer 1 scaling solution because it is implemented at the base-level protocol of ethereum itself. If you want to learn more about sharding, I recommend this extensive FAQ and this blog post.