Aurora Continues To Push EVM Scaling With Engine 2.4.0 Release
We are happy to announce the latest release of Aurora’s EVM Engine, version 2.4.0. Despite its innocuous-seeming version number, this release has had a big impact on several of our partners and thus furthers Aurora’s mission to expand the Ethereum ecosystem. Let’s get to it!
Summary:
- Quick reminder
- Why is this update important for both users and developers?
- The Aurigami case
- Technical details
1) Quick (and useful) reminder
Aurora is built on the NEAR blockchain, which differs from Ethereum 1.0 in a number of ways. The important one for this story is that the NEAR protocol places limits on how many things each transaction can do, while the Ethereum protocol only sets limits on each block (though users can set limits for their own individual transactions).
Therefore, Aurora must (in theory) fit an entire Ethereum block into a single NEAR transaction. This is quite challenging, but with Engine release 2.4.0 we have pushed some performance optimizations which move us closer to this ambitious goal.
2) Why is this update important for both users and developers?
Many defi projects were running into the error “Exceeded the maximum amount of gas allowed to burn per contract” and were confused because “gas” and “contract” are also terms in Ethereum.
However, this error was actually coming from the NEAR transaction; “gas” was referring to NEAR gas (not EVM gas), and “contract” was specifically referring to Aurora’s Engine smart contract on NEAR. Given this context, the error message is pretty clear; there is some maximum amount of gas a contract is allowed to burn in a single transaction and our Engine is exceeding it trying to run these defi transactions.
Well guess what? With the 2.4.0 Engine release, the problem is slowly coming to an end - we reduced the Engine’s gas usage by a factor of 2, opening up some very important use cases for developers and making life easier for users, since it is now twice as hard to accidentally hit the limit.
In particular, the amount of EVM computation that can fit in a single transaction now is sufficient to unblock some defi use cases that previously were simply not possible on Aurora.
3) The Aurigami case
We would like to shout out Aurigami, one of our defi partners, for devoting development resources to help us identify some of the opportunities for optimization that led to this new release.
Thanks to the Engine 2.4.0 release, Aurigami was able to launch on Aurora’s testnet. As they are a borrowing and lending platform, this update was a core requirement for running their platform as smoothly as possible. More broadly, it matters because this type of platform is crucial to the decentralized financial infrastructure of the Aurora ecosystem.
We are excited to see our ecosystem continue to grow and look forward to future updates that enable yet more defi use cases you know and love from Ethereum!
4) Technical details
If you’re still here, that’s great! You’re in for a real treat. Let’s get technical.
Let’s start with a quick recap of what the Aurora Engine is and how it works. The Engine is a smart contract written in Rust on the NEAR blockchain. It contains a full EVM interpreter to be able to execute transactions exactly the same as Ethereum, as well as all the auxiliary logic for validating transactions before execution (checking signature, nonce, account balance vs gas price, etc).
When you send a transaction to an Aurora RPC endpoint, our infrastructure wraps your signed Ethereum transaction into a NEAR transaction for the Engine. This means each Aurora transaction becomes a NEAR transaction, and therefore must follow the rules of the NEAR protocol. As mentioned in the quick reminder above, this is where the gas limit comes from: it is imposed by NEAR, not by Ethereum.
Naturally this raises the question: “how can the Engine be made more efficient so that it can do more EVM work within the same amount of NEAR gas?”
This is not an easy question to answer (and indeed we continue the optimization effort), but is critical for Aurora’s success. Fortunately for us, core developers at NEAR and developers at Aurigami, one of the affected defi projects, were willing to help us find answers to that question.
There turned out to be a few pieces of low-hanging fruit that together had a big impact.
- Update to a more recent version of Rust. The Rust compiler is good at optimizing the performance of code, and is always getting better. Simply moving to a later version of Rust gave about a 1% improvement essentially for free.
- Use little-endian representation of numbers on the EVM stack. The EVM technically is defined to use big-endian encoding; however, most modern architectures use little endian. Constantly switching the byte order of numbers going on and off the stack for arithmetic operations was quite costly compared to using little endian on the stack and only switching to big endian when necessary (e.g. writing to storage or returning a result).
- Short-circuit the account empty check. An Ethereum account is considered “empty” if its nonce and balance are both 0 and no code is deployed there. Of course, as soon as any one of those conditions is not met we need not check the others.
- Caching values read from the NEAR state by the Engine contract (#438) (#446). This improvement had the biggest impact on gas usage of the Engine. The full details as to why are given below.
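To make the little-endian item more concrete, here is a minimal sketch of the idea, not the Engine's actual code. The `Word` type and its methods are hypothetical names chosen for illustration: a 256-bit value is kept as four 64-bit limbs with the least significant limb first, arithmetic works directly on the limbs, and big-endian bytes are produced only at the boundary.

```rust
/// Hypothetical 256-bit stack word: four 64-bit limbs, least significant first.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Word([u64; 4]);

impl Word {
    /// Boundary conversion: big-endian bytes (as the EVM defines them) into limbs.
    fn from_be_bytes(bytes: [u8; 32]) -> Self {
        let mut limbs = [0u64; 4];
        for i in 0..4 {
            // Limb 0 is least significant, so it comes from the last 8 bytes.
            let start = 32 - 8 * (i + 1);
            let mut chunk = [0u8; 8];
            chunk.copy_from_slice(&bytes[start..start + 8]);
            limbs[i] = u64::from_be_bytes(chunk);
        }
        Word(limbs)
    }

    /// Boundary conversion: limbs back into big-endian bytes,
    /// e.g. when writing to storage or returning a result.
    fn to_be_bytes(self) -> [u8; 32] {
        let mut out = [0u8; 32];
        for i in 0..4 {
            let start = 32 - 8 * (i + 1);
            out[start..start + 8].copy_from_slice(&self.0[i].to_be_bytes());
        }
        out
    }

    /// Addition modulo 2^256, performed limb by limb with no byte swapping.
    fn wrapping_add(self, other: Word) -> Word {
        let mut out = [0u64; 4];
        let mut carry = false;
        for i in 0..4 {
            let (s1, c1) = self.0[i].overflowing_add(other.0[i]);
            let (s2, c2) = s1.overflowing_add(carry as u64);
            out[i] = s2;
            carry = c1 || c2;
        }
        Word(out)
    }
}
```

With this layout, a chain of arithmetic operations never touches the byte order; only the first `from_be_bytes` and the final `to_be_bytes` pay the conversion cost.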
All these changes together made the Engine consume around half the NEAR gas it did previously when executing certain defi transactions. This 2x improvement was enough to unblock Aurigami and allow them to launch on Aurora’s testnet.
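The short-circuit item from the list above can be sketched as follows. This is an illustration under assumptions, not the Engine's implementation: the `AccountState` trait and `CountingAccount` type are hypothetical, each method stands in for one state read, and the balance is simplified to a `u128` (Ethereum balances are 256-bit).

```rust
use std::cell::Cell;

/// Hypothetical view of an account; each method stands in for one state read.
trait AccountState {
    fn nonce(&self) -> u64;
    fn balance(&self) -> u128;
    fn code_len(&self) -> usize;
}

/// An account is empty only if nonce and balance are 0 and no code is deployed.
/// `&&` evaluates left to right and stops at the first false condition,
/// so the later reads are skipped as soon as one check fails.
fn is_account_empty<S: AccountState>(state: &S) -> bool {
    state.nonce() == 0 && state.balance() == 0 && state.code_len() == 0
}

/// Test helper that counts how many reads were actually performed.
struct CountingAccount {
    nonce: u64,
    balance: u128,
    code_len: usize,
    reads: Cell<u32>,
}

impl CountingAccount {
    fn new(nonce: u64, balance: u128, code_len: usize) -> Self {
        CountingAccount { nonce, balance, code_len, reads: Cell::new(0) }
    }

    /// Records one read and passes the value through.
    fn read<T>(&self, value: T) -> T {
        self.reads.set(self.reads.get() + 1);
        value
    }
}

impl AccountState for CountingAccount {
    fn nonce(&self) -> u64 { self.read(self.nonce) }
    fn balance(&self) -> u128 { self.read(self.balance) }
    fn code_len(&self) -> usize { self.read(self.code_len) }
}
```

In this sketch an account with a nonzero nonce costs a single read, while only a genuinely empty account pays for all three.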
To understand why the caching change was so impactful, we need to know a little about how NEAR represents state and how it determines gas costs. As many blockchains do, NEAR uses a trie to store the state as this allows creating fairly compact proofs of what is stored on chain. Each contract on NEAR has its own key-value storage for its state, and this is embedded within the full NEAR state trie.
NEAR is very careful about setting its gas costs. The goal is to ensure they can maintain the 1-second block time which enables the web2-like UX of nearly instant transaction confirmations. To achieve this goal it must be true that no block can take more than 1 second to process, and in particular all the transactions in the block must also fit in this time. Since gas is the measurement of computational work, the idea is simple: ensure the gas costs are carefully measured so that 1 second of work done by a transaction corresponds to a fixed amount of gas, and then set the limits much lower than that.
To have such a stringent connection between gas cost and wall-clock time, there must be a separate cost for essentially all operations a transaction could cause the node to perform. For example, there is read_base, which is the cost for doing any contract state read whatsoever, as well as read_byte, which is the cost on top of the base cost per byte read. Getting back to the state details, there is also the touching_trie_node cost, which is incurred for each node of the state trie that must be touched to read or write a value from storage.
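As a rough sketch of how these three costs combine for a single read, consider the function below. The struct and function names are hypothetical, the numbers used with it are placeholders rather than NEAR's real fees, and the real fee schedule may have further components.

```rust
/// Hypothetical per-operation costs in NEAR gas units.
/// Field names follow the text; any values plugged in are placeholders.
struct ReadCosts {
    read_base: u64,
    read_byte: u64,
    touching_trie_node: u64,
}

/// Gas charged for one state read in this simplified model:
/// a fixed base cost, a cost per byte read, and a cost per trie node touched.
fn storage_read_gas(costs: &ReadCosts, value_len: u64, trie_nodes_touched: u64) -> u64 {
    costs.read_base
        + costs.read_byte * value_len
        + costs.touching_trie_node * trie_nodes_touched
}
```

Because the base and trie-node terms are paid on every read regardless of how small the value is, reading the same value three times costs three times as much in this model, which is why removing redundant reads pays off.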
As it turns out, these IO costs made up a large fraction of the total gas used by the Engine; more than 50% in some cases. Therefore, reducing the number of state reads necessary by introducing caching was a net win. Even though the cache logic itself has a cost (not to mention its memory footprint also has a cost), this was less than the amount saved by doing fewer reads.
To add even more nuance here, it turned out that some types of values required different caching logic than others. One particular value, called the “generation” is an implementation detail of the Engine. Each address has a “generation” associated with it. This value allows us to “delete” an entire address without actually incurring much IO cost; we simply increment the generation and ignore any state that was associated with lower generation values.
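The generation trick can be illustrated with a small in-memory model. This is a sketch under assumptions, not the Engine's storage layout: the `Store` type, its key structure, and the use of `HashMap` in place of the NEAR state are all hypothetical.

```rust
use std::collections::HashMap;

type Address = [u8; 20];
type Slot = [u8; 32];

/// Hypothetical in-memory stand-in for the Engine's state.
struct Store {
    /// Current generation per address (absent means 0).
    generations: HashMap<Address, u32>,
    /// Storage values keyed by (address, generation, storage key).
    slots: HashMap<(Address, u32, Slot), Slot>,
}

impl Store {
    fn new() -> Self {
        Store { generations: HashMap::new(), slots: HashMap::new() }
    }

    fn generation(&self, addr: &Address) -> u32 {
        *self.generations.get(addr).unwrap_or(&0)
    }

    fn write(&mut self, addr: &Address, key: Slot, value: Slot) {
        let generation = self.generation(addr);
        self.slots.insert((*addr, generation, key), value);
    }

    /// Only values stored under the current generation are visible.
    fn read(&self, addr: &Address, key: &Slot) -> Option<Slot> {
        self.slots.get(&(*addr, self.generation(addr), *key)).copied()
    }

    /// "Deletes" the account by bumping its generation; old entries are
    /// left in place and simply never looked up again.
    fn delete_account(&mut self, addr: &Address) {
        *self.generations.entry(*addr).or_insert(0) += 1;
    }
}
```

Deleting an account here is a single write, however many storage slots it had. The price is that every storage read first needs the address's generation, which is the extra read discussed next.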
This means that reading a value from an EVM contract’s storage (which looks like a single read to devs at the Solidity level) turned out to be multiple reads, because the Engine needed to read the generation first. However, the generation does not change over the course of a transaction (if an account is deleted and recreated in the same transaction, these changes are actually cached in memory by the EVM interpreter, so the generation value written in the NEAR state never changes during a transaction). Moreover, a single transaction generally doesn’t touch many different addresses (even complex defi transactions rarely call more than 10 different EVM contracts). Therefore, we can cache the complete address-to-generation mapping, adding each address’s generation the first time it is read, and this is exactly what was implemented.
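A per-transaction generation cache along these lines might look like the sketch below. The names `CountingState` and `GenerationCache` are hypothetical, and the backing state is a plain `HashMap` with a read counter standing in for real NEAR state reads.

```rust
use std::collections::HashMap;

type Address = [u8; 20];

/// Hypothetical stand-in for the NEAR state, counting reads so the
/// effect of the cache is visible.
struct CountingState {
    generations: HashMap<Address, u32>,
    reads: u32,
}

impl CountingState {
    fn read_generation(&mut self, addr: &Address) -> u32 {
        self.reads += 1;
        *self.generations.get(addr).unwrap_or(&0)
    }
}

/// Cache of address -> generation, intended to live for one transaction only.
/// It relies on the generation in state not changing during a transaction.
struct GenerationCache {
    cached: HashMap<Address, u32>,
}

impl GenerationCache {
    fn new() -> Self {
        GenerationCache { cached: HashMap::new() }
    }

    /// Returns the cached generation, reading from state only on a miss.
    fn get(&mut self, state: &mut CountingState, addr: &Address) -> u32 {
        if let Some(generation) = self.cached.get(addr) {
            return *generation;
        }
        let generation = state.read_generation(addr);
        self.cached.insert(*addr, generation);
        generation
    }
}
```

Since a transaction rarely touches more than about 10 addresses and each entry is a 32-bit number, the cache stays tiny while removing one state read from every storage access after the first per address.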
On the other hand, EVM contract storage lookups themselves could not be fully cached without significant overhead, both because the values could potentially be much larger than the 32-bit number representing the generation, and because the keys could be much more numerous; if a transaction invokes 10 different contracts and each had to read 10 values from storage then that is 100 different keys to cache along with their values.
Moreover, it’s not clear all this caching would be worthwhile since a given value from the contract storage may not be read multiple times, whereas we were essentially guaranteed to need the generation value for an address multiple times. Therefore, the same kind of cache we used for generation values was not suitable for more general caching. However, contract storage reads did benefit from a different kind of cache. Due to an inefficiency in the EVM interpreter (which we will address properly in the future), it turned out that what looked like reading a value once at the Solidity level translated into consecutive, repeated reads of that value at the Engine level. There were examples where the same value could be read 3 or 4 times in a row! Fortunately, consecutive, repeated reads can be addressed with an extremely simple LRU cache of size 1, and this is what was implemented.
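A size-1 cache of this kind is small enough to sketch in full. The version below is an illustration, not the Engine's code: `LastReadCache` and its methods are hypothetical names, and the `invalidate` method reflects an assumption that a write to storage must drop the remembered value.

```rust
/// Hypothetical size-1 cache: remembers only the most recent key and value.
struct LastReadCache<K, V> {
    last: Option<(K, V)>,
}

impl<K: PartialEq + Clone, V: Clone> LastReadCache<K, V> {
    fn new() -> Self {
        LastReadCache { last: None }
    }

    /// Returns the remembered value if `key` matches the previous read;
    /// otherwise calls `read` (the expensive state read) and remembers the result.
    fn get_or_read<F: FnOnce(&K) -> V>(&mut self, key: &K, read: F) -> V {
        if let Some((last_key, last_value)) = &self.last {
            if last_key == key {
                return last_value.clone();
            }
        }
        let value = read(key);
        self.last = Some((key.clone(), value.clone()));
        value
    }

    /// Assumption: called on any storage write so a stale value is never served.
    fn invalidate(&mut self) {
        self.last = None;
    }
}
```

This only helps when the repeated reads are back to back: reading key A, then B, then A again costs three state reads. That matches the access pattern described above, and it keeps the memory and bookkeeping cost close to zero.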
Whew, congrats on making it to the end! I hope you enjoyed this deep dive into the Aurora Engine. We continue this kind of optimization work as we advance our mission to expand Aurora to include more and more use cases. If this sounded interesting to you and you are proficient in Rust, consider taking a look at our careers page; we’d love to hear from you.