For any developer who has tried to build a decentralized application for a mass audience, it is clear that Ethereum, in its current form, isn’t quite ready. Transactions take a long time to clear, and paying for every basic function is expensive and makes for a poor user experience. It all boils down to a general ‘scalability’ problem: poor throughput and high cost have both been massive barriers to meaningful adoption.
Today, Ethereum processes roughly 500,000 transactions per day, and at full capacity it can process about 13 transactions per second. While these transactions don’t require a third party to validate them, centralized counterparts can process transactions much more efficiently. For example, Visa’s payment network processes 150 million transactions per day, orders of magnitude more than any decentralized blockchain network has been able to achieve.
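To put those figures on a common scale, here is a minimal back-of-the-envelope sketch. It uses only the numbers quoted above; the function and variable names are illustrative, and the Visa figure is converted to a daily average, not a peak rate.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures quoted in the text above.
ETH_TX_PER_DAY = 500_000
ETH_MAX_TPS = 13
VISA_TX_PER_DAY = 150_000_000


def per_second(tx_per_day: float) -> float:
    """Average transactions per second implied by a daily total."""
    return tx_per_day / SECONDS_PER_DAY


def per_day(tps: float) -> float:
    """Daily total implied by a sustained per-second rate."""
    return tps * SECONDS_PER_DAY


eth_daily_capacity = per_day(ETH_MAX_TPS)        # 1,123,200 transactions per day
eth_utilization = ETH_TX_PER_DAY / eth_daily_capacity
visa_avg_tps = per_second(VISA_TX_PER_DAY)       # roughly 1,736 per second
visa_to_eth_ratio = VISA_TX_PER_DAY / ETH_TX_PER_DAY

print(f"Ethereum capacity: {eth_daily_capacity:,.0f} tx/day")
print(f"Ethereum utilization: {eth_utilization:.0%}")
print(f"Visa average: {visa_avg_tps:,.0f} tx/s")
print(f"Visa handles {visa_to_eth_ratio:.0f}x Ethereum's daily volume")
```

On these numbers, the gap in daily volume is a factor of 300, and even at full capacity Ethereum would fall short of Visa’s average rate by more than a hundred times.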
The main reason behind Ethereum’s scalability bottleneck is that each node in the network has to process each transaction. Remember that nodes perform the job of verifying that the miners’ work is valid. They play an integral role within the network as the main check on miners who decide to act maliciously. Each node also keeps an accurate copy of the current network state, meaning it doesn’t need to rely on a third party to confirm the balance of every account and smart contract.
When it comes to scaling Ethereum, the most important question is…
How much work is each node required to do? The more transactions that occur on the network, the more work nodes have to perform. This work isn’t easy: node operators incur high fixed costs to purchase the equipment, and it takes a high degree of technical know-how to set up and maintain a node.
If the status quo persists, it is likely that the node count will fall off significantly. Fewer nodes on the network means fewer checks on miners and more centralization in the network overall.
On-Chain vs Off-Chain Scaling
There are generally two schools of thought when it comes to scaling public blockchain networks: on-chain scaling and off-chain scaling.
On-chain scaling refers to any increase in capacity at the core blockchain layer. The most common on-chain scaling prescription is increasing the amount of data that can fit in each block. By raising the data limit, you can fit more transactions in each 13-second block interval. Other on-chain scaling proposals center on signatures and on reducing the amount of data required for a valid transaction.
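The effect of a larger block can be sketched with simple arithmetic. This is a rough model, not a description of the protocol: it assumes the number of transactions per block grows linearly with the data limit, and it takes the 13 transactions per second and 13-second interval quoted in this article as the baseline. The `scaled` function and its field names are hypothetical.

```python
BLOCK_INTERVAL_S = 13   # block interval quoted in the article
BASE_TPS = 13           # throughput at full capacity quoted in the article
SECONDS_PER_DAY = 86_400


def scaled(block_size_factor: float) -> dict:
    """Throughput and per-node workload if the block data limit is multiplied.

    Assumption: transactions per block scale linearly with the data limit.
    """
    tx_per_block = BASE_TPS * BLOCK_INTERVAL_S * block_size_factor
    tps = tx_per_block / BLOCK_INTERVAL_S
    return {
        "tx_per_block": tx_per_block,
        "tps": tps,
        # Every full node processes every transaction, so this is per node.
        "tx_per_node_per_day": tps * SECONDS_PER_DAY,
    }


for factor in (1, 2, 10):
    result = scaled(factor)
    print(
        f"{factor:>2}x blocks: {result['tx_per_block']:,.0f} tx/block, "
        f"{result['tps']:,.0f} tx/s, "
        f"{result['tx_per_node_per_day']:,.0f} tx per node per day"
    )
```

Under this model, throughput and the work each node must do rise by the same factor, because every node still processes every transaction.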
Critics of on-chain scaling point out that it raises the computational requirements node operators must meet to participate in the network, which squeezes out full nodes. The fewer full nodes there are, the more centralized the network becomes, since it’s easier for miners to behave maliciously. On-chain scaling also requires strong consensus from the community before it’s implemented. This has proved difficult to achieve, since any change would need to be approved through a hard fork.
On the other side of the aisle is off-chain scaling, which commonly refers to building additional layers that can handle transactions without using the core blockchain. Common examples of off-chain scaling include batching multiple payments into one transaction, payment channels, and sidechains. The core idea behind off-chain scaling is that the main blockchain should only be used as a trust and arbitration layer. Proponents argue that if we want transactions to persist across every node in the network, they should be limited to high-value transactions.
While both on-chain and off-chain scaling have supporters, a larger portion of the community has coalesced around off-chain scaling as the most immediate way forward. The main reason is that off-chain scaling better preserves decentralization, which is the trait the Ethereum community ultimately wants to protect over the long term.
Off-chain scaling is frequently called ‘layer 2’ scaling because it involves moving transactions to layers that sit on top of the base Ethereum blockchain. This could theoretically extend to 3rd and 4th layers, but so far, development has focused on the layer immediately above Ethereum. Layer 2 scaling requires additional hardware and complex software to be built, so it often takes longer for the network to feel its effects.
It’s also important to note that on-chain scaling hasn’t been completely ruled out the same way it has in Bitcoin. One of the highly anticipated scaling solutions on Ethereum’s roadmap, Sharding, is an optimization to the core blockchain. The issue with Sharding, and on-chain scaling in general, is that changing the core protocol is technically difficult. As such, the community isn’t willing to rely solely on on-chain scaling. Moving activity to layers that sit above the main chain offers the quickest path to scalable applications.
Ethereum has more or less decided on a way forward. Ethereum 2.0 promises to be one of the most ambitious undertakings in the blockchain space, and if it works as intended, it will be one of the biggest breakthroughs for decentralization. At the core of Ethereum 2.0 are three scalability solutions designed to make Ethereum more widely used.
State Channels are a way for users to transact peer to peer ‘off-chain,’ only sending messages onto the main chain when they want to exit the channel. They are constructed similarly to Bitcoin’s Lightning Network. Essentially, they are payment channels in which users transact value outside of the main chain and only return to it when they want to settle the channel. It’s very much like a bar tab. State channels are particularly interesting because they allow users to send state updates, like updates to a smart contract, rather than just money.
State Channels are functionally simple. Let’s say Alice and Bob want to send each other some ETH. If they expect to be sending each other multiple transactions, they could conduct all of those payments off-chain and then come back to the main chain when they stop transacting. To open a channel, Alice and Bob would both send ETH to a multisig address. They can then send each other as many transactions as they want, as long as they store each signed transaction off-chain. When they want to close the channel, they pay one on-chain transaction fee and receive their funds.
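The Alice and Bob flow can be sketched as a toy simulation. This is an illustration under stated assumptions, not a real channel implementation: the `Channel` class and its methods are hypothetical, HMAC with a shared secret stands in for the public-key signatures a real channel would use, and the multisig contract is reduced to a counter of on-chain transactions.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, state: dict) -> str:
    """Stand-in signature: HMAC over the canonical JSON encoding of a state."""
    payload = json.dumps(state, sort_keys=True).encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


class Channel:
    """Toy two-party channel; balances are integers in an arbitrary unit."""

    def __init__(self, secrets: dict, deposits: dict):
        self.secrets = secrets
        self.total = sum(deposits.values())
        # Opening: each party sends one deposit to the multisig address.
        self.on_chain_txs = len(deposits)
        self.latest = self._co_sign({"nonce": 0, "balances": dict(deposits)})

    def _co_sign(self, state: dict) -> dict:
        sigs = {name: sign(key, state) for name, key in self.secrets.items()}
        return {"state": state, "sigs": sigs}

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        """Off-chain update: a new co-signed state with a higher nonce."""
        balances = dict(self.latest["state"]["balances"])
        if amount <= 0 or balances[sender] < amount:
            raise ValueError("invalid amount or insufficient channel balance")
        balances[sender] -= amount
        balances[receiver] += amount
        nonce = self.latest["state"]["nonce"] + 1
        self.latest = self._co_sign({"nonce": nonce, "balances": balances})

    def close(self) -> dict:
        """Settlement: check both signatures, then pay out in one on-chain tx."""
        state, sigs = self.latest["state"], self.latest["sigs"]
        for name, key in self.secrets.items():
            if not hmac.compare_digest(sigs[name], sign(key, state)):
                raise ValueError(f"bad signature from {name}")
        if sum(state["balances"].values()) != self.total:
            raise ValueError("balances do not match the deposited total")
        self.on_chain_txs += 1
        return dict(state["balances"])


channel = Channel(
    secrets={"alice": b"alice-secret", "bob": b"bob-secret"},
    deposits={"alice": 10, "bob": 10},
)
channel.pay("alice", "bob", 3)
channel.pay("bob", "alice", 1)
channel.pay("alice", "bob", 2)
final = channel.close()
print(final, "on-chain transactions:", channel.on_chain_txs)
```

Three payments are exchanged here, but only the two deposits and the single closing transaction would touch the main chain. The nonce is what lets a settlement layer prefer the most recent co-signed state over an older one.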
There are a number of interesting teams pushing state channel technology forward. Counterfactual is making it easy for developers to integrate state channels into their dApps, Spankchain is making it easy for their users to access scalable microtipping, and Connext is allowing every blockchain application to set up and use a state channel hub.
Plasma is a framework for the creation of child blockchains connected to Ethereum that allow for more scalable and complex usage. Plasma is a ‘layer 2’ technology because it enables blockchains to operate on top of the main chain. Child chains are anchored to the main chain through a root smart contract. This contract creates a permanent record of the state and stipulates the rules for the child chain. Users must follow the rules set out in the root contract if they want to get their assets back once they move back to the main chain.
Plasma seems a bit convoluted at first glance, but the concept is quite simple: you can create smaller blockchains of arbitrary complexity as long as the Ethereum network can verify that everything on those blockchains is valid. While it may seem like there are a lot of security threats, having Ethereum as an arbitration layer forces economically rational actors to behave honestly.
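The article does not spell out how the main chain can check a child chain without replaying it. One common approach in Plasma designs is for the child chain to publish a Merkle root of each block to the root contract, so that a user can later prove a transaction was included using a short proof. The sketch below shows that idea only; the function names and sample transactions are hypothetical, and SHA-256 stands in for the hash function Ethereum actually uses.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list) -> list:
    """Hash adjacent pairs, duplicating the last node if the count is odd."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list) -> bytes:
    """Root commitment for a non-empty list of child-chain transactions."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list, index: int) -> list:
    """Sibling hashes from leaf to root, each tagged with its side."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """What a root contract could check: recompute the root from the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


child_block = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
root = merkle_root(child_block)          # the only data posted on-chain
proof = merkle_proof(child_block, 1)     # kept by the user, off-chain

print(verify(child_block[1], proof, root))      # genuine transaction
print(verify(b"bob->carol:200", proof, root))   # tampered transaction
```

The main chain stores one hash per child block regardless of how many transactions the block holds, and the proof grows only with the logarithm of the block size.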
The Ethereum community is extremely excited about Plasma, and there are some notable teams building out the core infrastructure. OmiseGo has long been working on a Plasma implementation to better scale their payment network, Loom Network is creating Plasma sidechains to enhance the user experience of their collectible game, and the Ethereum Foundation’s Karl Floersch recently put out a simple Plasma Cash spec.
Sharding is one of the last components of Ethereum’s scaling roadmap, and it’s one of the most ambitious. Sharding is different from State Channels and Plasma in that it’s an on-chain performance improvement for the Ethereum blockchain. Sharding refers to splitting the entire Ethereum network into multiple portions called ‘shards.’ Each shard is essentially a separate blockchain with its own state.
Sharding is also one of the most complicated solutions on Ethereum’s roadmap, and a lot of research and testing still needs to be done before it can be implemented. Specifically, Ethereum developers need to build a cross-shard communication mechanism that will allow smart contracts on one shard to talk to smart contracts on other shards.
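The idea that each shard keeps its own state can be illustrated with a small sketch. Everything here is an assumption made for illustration: the shard count, the rule that maps an address to a shard by hashing it, and the `ShardedLedger` class are hypothetical and are not taken from Ethereum’s sharding design. Cross-shard communication, the open problem described above, is deliberately left out.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(address: str, num_shards: int = NUM_SHARDS) -> int:
    """Deterministically map an address to one shard (illustrative rule)."""
    digest = hashlib.sha256(address.lower().encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedLedger:
    """Each shard is a separate balance table; no shard sees another's state."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.shards = [dict() for _ in range(num_shards)]

    def credit(self, address: str, amount: int) -> None:
        state = self.shards[shard_for(address, len(self.shards))]
        state[address] = state.get(address, 0) + amount

    def balance(self, address: str) -> int:
        state = self.shards[shard_for(address, len(self.shards))]
        return state.get(address, 0)


ledger = ShardedLedger()
addresses = [f"0xaccount{i:02d}" for i in range(20)]
for i, address in enumerate(addresses):
    ledger.credit(address, i + 1)

for shard_id, state in enumerate(ledger.shards):
    print(f"shard {shard_id}: {len(state)} accounts")
```

A node that follows a single shard only has to store and process that shard’s table, which is where the capacity gain comes from. A transfer between accounts on different shards would need exactly the kind of cross-shard mechanism the article says is still being researched.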
While Sharding is the most technically complex solution, there are already 8 different clients being built by ecosystem developers. For a full list of each client, check out this list.
Wrapping It Up
There’s no debate that Ethereum has a stronghold over developer mindshare. It is the first network that enabled developers to build truly unstoppable applications with global distribution from day one. But competition is coming fast, and at the end of the day, the crown jewel for dApp developers is users.
As it stands today, Ethereum won’t be able to reach the scale necessary for millions of users. If it wants to retain the same level of decentralization, it will have to look for new ways to structure use around the main blockchain. The solutions set out in Ethereum’s scaling roadmap are extremely ambitious, but if successful, they will leave Ethereum ready for mainstream usage. The main caveat is that this roadmap is very experimental, and there is no empirical proof that it will all work as planned.
Regardless of any doubt, the Ethereum community is pushing ahead and has thousands of the smartest people working on scaling. We should start to see tangible benefits in the next 3 to 6 months, primarily in transaction fees, finality, and user experience. From there, we will start experimenting on sidechains connected to a more robust and trusted Ethereum. Once it is cheap and easy to move between sidechains and Ethereum, we should start to see some breakout usage.