We last visited the topic of transaction prediction in Part IV. I hoped to get here sooner, but it became clear to me as I worked that single-threaded (read: slow as shit) liquidity fetching was going to be a roadblock.
I took some time to integrate process pools for a multiprocessor-aware bot, which enabled high-frequency arbitrage testing at scale on Arbitrum, and big boosts in performance.
With that mostly sorted, and after squishing several bugs, I am ready to return to the topic. This will not be the final post in the series, as I have some touch-up work to do before introducing the next project (a mempool-aware Uniswap V2/V3 backrun bot).
I’ve continued working on the UniswapTransaction class during that time, and am pleased to report that the class is well-tested. While it is not perfect, I have thrown tens of thousands of transactions from the mempool at it and observed that it predicts their results accurately.
A Reintroduction to UniswapTransaction
The end goal of any class in the degenbot repo is to simplify and abstract away the complicated parts of automated arbitrage.
The early mempool-aware backrun bot did a lot of hand-calculation to predict the results of a particular transaction. It was a simple proof-of-concept that was appropriate for the V2-forked DEX available on Avalanche at the time.
However, there were several issues:
Error handling was poor
There was a lot of repeated code
It was only reusable for V2 pools
The calculation logic lived on the “business” side of the bot
It was completely procedural and could not be easily parallelized
Following my exploration of Uniswap V3, I tunneled on the V3LiquidityPool helper. Developing that class demonstrated the power of abstraction. The advantage of moving complexity into a class is that you only have to get it right once, then you can stop and just use the damn thing. V3LiquidityPool is the most complicated helper in the repo because V3 pools are so complicated, and it’s a relief to create a helper and know that it can handle all of the moving parts.
UniswapTransaction is similar. It is built to predict the results of mempool transactions going through Uniswap via a router contract.
Uniswap is a complex ecosystem, with six different router contracts in use at the time of writing (V2 Router, V2 Router2, V3 Router, V3 Router2, the old Universal Router and the new Universal Router). In case you had not seen the news, the first Universal Router had a re-entrancy vulnerability which was fixed by the new deployment. The Uniswap router contracts are immutable, so both will work as long as pools are still around for them to swap tokens through. There are still plenty of folks using both versions at the moment.
At its core, Uniswap still consists of two pool types (V2 & V3). Routers are convenience contracts that allow users to interact with these pools indirectly. We care about the pools and their states, so if we can decode what the router intends to do, we can predict the future state of the pools that it will use. If we can predict the future state of the pools, we can identify arbitrage opportunities.
Dealing with six routers is complex, but they are essentially wrappers over several standalone “tasks” involved with swapping tokens around. Here are the major ones:
Wrap ETH to WETH
Unwrap WETH to ETH
Swap an exact number of tokens into a V2 pool, receive a guaranteed minimum amount out (or revert)
Swap a maximum amount of tokens into a V2 pool, receive a guaranteed amount out (or revert)
Swap an exact number of tokens into a V3 pool, receive a guaranteed minimum amount out (or revert)
Swap a maximum amount of tokens into a V3 pool, receive a guaranteed amount out (or revert)
Sweep the balance of tokens held by the router contract to the caller
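To make the two V2 swap tasks concrete, here is a minimal sketch of the constant-product math behind them, assuming a standard Uniswap V2 pool with the 0.3% fee. The function names, the SwapWouldRevert exception, and the tuple return value are illustrative assumptions for this sketch, not the degenbot API.

```python
# Constant-product swap math for a standard Uniswap V2 pool (0.3% fee).
# All names here are hypothetical and chosen for illustration only.
FEE_NUMERATOR = 997
FEE_DENOMINATOR = 1000


class SwapWouldRevert(Exception):
    """Hypothetical exception: the router's slippage check would fail."""


def get_amount_out(amount_in: int, reserve_in: int, reserve_out: int) -> int:
    """Output for an exact input, rounded down (integer math, like the EVM)."""
    amount_in_with_fee = amount_in * FEE_NUMERATOR
    numerator = amount_in_with_fee * reserve_out
    denominator = reserve_in * FEE_DENOMINATOR + amount_in_with_fee
    return numerator // denominator


def get_amount_in(amount_out: int, reserve_in: int, reserve_out: int) -> int:
    """Input required for an exact output, rounded up."""
    numerator = reserve_in * amount_out * FEE_DENOMINATOR
    denominator = (reserve_out - amount_out) * FEE_NUMERATOR
    return numerator // denominator + 1


def predict_exact_in(
    amount_in: int, amount_out_min: int, reserve_in: int, reserve_out: int
) -> tuple[int, int]:
    """Exact tokens in, minimum out: return the predicted reserves or raise."""
    amount_out = get_amount_out(amount_in, reserve_in, reserve_out)
    if amount_out < amount_out_min:
        raise SwapWouldRevert("output below the guaranteed minimum")
    return reserve_in + amount_in, reserve_out - amount_out


def predict_exact_out(
    amount_out: int, amount_in_max: int, reserve_in: int, reserve_out: int
) -> tuple[int, int]:
    """Exact tokens out, maximum in: return the predicted reserves or raise."""
    amount_in = get_amount_in(amount_out, reserve_in, reserve_out)
    if amount_in > amount_in_max:
        raise SwapWouldRevert("input above the allowed maximum")
    return reserve_in + amount_in, reserve_out - amount_out
```

The returned pair is the predicted future state of the pool, which is the only thing an arbitrage search needs from the pending transaction. The V3 tasks follow the same "exact in" and "exact out" pattern, but the swap math depends on tick liquidity and is not shown here.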
Improvements
There are various gas-saving shortcuts used by the newer routers. One in particular is using “shortcut” addresses to reduce calldata. The Universal Router uses a value of 0x0000000000000000000000000000000000000001 wherever it intends to represent msg.sender and a value of 0x0000000000000000000000000000000000000002 wherever it intends to represent its own address. Refer to my introduction to Packed Calldata for more on the gas savings associated with these zero values.
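Substituting these shortcut addresses can be sketched in a few lines. The function name and its parameters are hypothetical; the two flag values are the ones quoted above.

```python
# Flag addresses used in router calldata, as described above.
MSG_SENDER_FLAG = "0x0000000000000000000000000000000000000001"
ADDRESS_THIS_FLAG = "0x0000000000000000000000000000000000000002"


def resolve_address(address: str, sender: str, router: str) -> str:
    """Replace a shortcut address with the real one; pass others through.

    `sender` is the address that submitted the transaction and `router`
    is the router contract being called (both supplied by the caller).
    """
    flag = address.lower()
    if flag == MSG_SENDER_FLAG:
        return sender
    if flag == ADDRESS_THIS_FLAG:
        return router
    return address
```

Any decoded recipient or payer would be passed through a function like this before the swap is simulated, so the rest of the prediction logic only ever sees real addresses.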
The Universal Router uses a similar technique to represent the “contract balance” of a token: instead of an explicit amount, the calldata carries a flag value. The value 0x8000000000000000000000000000000000000000000000000000000000000000 represents this balance. I was very confused the first time I saw this value in testing, since the equivalent integer is 57896044618658097711785492504343953926634992332820282019728792003956564819968 (2**255, a uint256 with only its highest bit set).
That’s all fine and good for the contract, which knows what its balance is, but we have to work a little harder. To interpret these values correctly, the helper must know the balance of the contract at all points during the swap. And the only way to know this is to track the balance at each step. So the helper now maintains a ledger of pre/post-swap balances for all tokens and addresses involved with the transfer.
Whenever the helper encounters one of these special values, it will substitute the appropriate address or balance. At the end of the simulation, it checks the ledger to ensure that the net balances of the router and all pools along the swap path are zero.
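A balance ledger of this kind can be sketched as a dictionary keyed by holder and token. Everything below is a hypothetical illustration rather than the helper's real implementation: the Ledger class, its method names, and the CONTRACT_BALANCE constant (assumed here to be 2**255, which is my understanding of the Universal Router's flag) are all assumptions.

```python
# Assumption: the Universal Router's "use my whole balance" flag is 2**255.
CONTRACT_BALANCE = 1 << 255


class Ledger:
    """Hypothetical running ledger of token balances per address."""

    def __init__(self) -> None:
        # Maps (holder, token) to the balance accumulated during simulation.
        self._balances: dict[tuple[str, str], int] = {}

    def balance(self, holder: str, token: str) -> int:
        return self._balances.get((holder, token), 0)

    def adjust(self, holder: str, token: str, amount: int) -> None:
        """Credit (positive) or debit (negative) a holder's balance."""
        self._balances[(holder, token)] = self.balance(holder, token) + amount

    def transfer(self, token: str, sender: str, recipient: str, amount: int) -> None:
        self.adjust(sender, token, -amount)
        self.adjust(recipient, token, amount)

    def resolve_amount(self, amount: int, holder: str, token: str) -> int:
        """Swap the flag value for the holder's tracked balance."""
        if amount == CONTRACT_BALANCE:
            return self.balance(holder, token)
        return amount

    def unaccounted(self, holders: set[str]) -> dict[tuple[str, str], int]:
        """Non-zero balances left with the given holders after simulation."""
        return {
            key: value
            for key, value in self._balances.items()
            if key[0] in holders and value != 0
        }


# Usage: the router wraps 1 ETH, then pays its whole WETH balance to a pool.
ledger = Ledger()
ledger.adjust("router", "WETH", 10**18)
amount = ledger.resolve_amount(CONTRACT_BALANCE, "router", "WETH")
ledger.transfer("WETH", "router", "pool", amount)
ledger.adjust("pool", "WETH", -amount)  # the pool's swap consumes the input
leftover = ledger.unaccounted({"router", "pool"})
```

If leftover is empty at the end of the simulation, every token that moved through the router and pools has been accounted for. Anything else points to a step the simulation handled incorrectly.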
It’s a crude technique to be sure, but I currently call sys.exit() if an unaccounted balance is found. It’s common to observe several hundred new transactions in the space of a few blocks, so I prefer to kill the script immediately. This forces me to investigate, step through the transaction, and fix the behavior of the helper. The visibility has allowed me to catch many bugs that would have otherwise slipped by in a stream of messages.
If you want to help me test, please be aware of this behavior and report any transactions that hard-terminate in this way! Eventually the bugs will be found, rendering the hard-kill unnecessary, and I will remove it.
LOVE ❤️ NOTE: Big thanks to reader and Discord regular salparadi, who shared many transactions that gave the helper trouble. It’s much easier for me to fix these quickly with a target list.
Synergistic Helpers
A “pre-loaded” bot starts with a list of arbitrage paths through known pools and tokens. A key requirement of a mempool-watcher is a way to deal with pools that does not depend on a clean starting point.