Uniswap V3 — Pool Fetcher, Arb Path Builder, Liquidity Event Listener
Fetch, Build, Listen! Apparently We're Training Dogs?
The V3 contracts, designed apparently by the most giga-brained of giga-brains, require a nuanced understanding.
I’m mostly done with the V3 cycle arb helper (Part I, Part II), but there are a few more things that I need to build before it can go into production.
Pool Fetcher
First, we need a method of finding V3 pools. V2 made it easy for us: simply get the total number of pools from the factory using allPairsLength, retrieve each pool address by index from allPairs (stopping at the last pool reported by allPairsLength), and you’re done!
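For reference, the whole V2 discovery procedure fits in a few lines. Here is a minimal sketch using Brownie, assuming the same "mainnet-local" network and an Etherscan token configured as in the fetcher scripts below; pulling every pair one call at a time is slow in practice, but it shows that V2 exposes a count and an index:
from brownie import network, Contract

network.connect("mainnet-local")

# the V2 factory keeps a public count and index of every pair it has created
v2_factory = Contract.from_explorer("0x5C69bEe701ef814a2B6a3EDD4B1652CB9cc5aA6f")

# retrieve each pool address by index (slow one call at a time, but complete)
pool_count = v2_factory.allPairsLength()
v2_pools = [v2_factory.allPairs(i) for i in range(pool_count)]
print(f"Found {len(v2_pools)} V2 pools")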
V3 does not make it easy, and it complicates things tremendously because several pools can be created for the same token pair with different fees. What’s worse, the V3 Factory does not store a count or a public index of initialized pools!
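The closest thing V3 offers is the factory's getPool lookup, which only helps if you already know the token pair and fee tier in advance. A minimal sketch, assuming factory is the V3 factory contract loaded through Brownie as in the script below (the token addresses are WETH and USDC, which also appear in the results further down):
# getPool requires knowing the pair and fee tier ahead of time
WETH = "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2"
USDC = "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"

for fee in (100, 500, 3000, 10000):
    # returns the zero address if no pool exists at that fee tier
    print(fee, factory.getPool(USDC, WETH, fee))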
However, it does emit a helpful event (PoolCreated) each time a new pool is created, and that event carries all of the data needed to keep track of each pool's address. The event is defined in IUniswapV3Factory.sol:
/// @notice Emitted when a pool is created
/// @param token0 The first token of the pool by address sort order
/// @param token1 The second token of the pool by address sort order
/// @param fee The fee collected upon every swap in the pool, denominated in hundredths of a bip
/// @param tickSpacing The minimum number of ticks between initialized ticks
/// @param pool The address of the created pool
event PoolCreated(
address indexed token0,
address indexed token1,
uint24 indexed fee,
int24 tickSpacing,
address pool
);
We’re familiar with setting up eth_subscription notifications for watching events on a live blockchain, but how about retrieving historical events?
web3.py provides a convenient method for this task called getLogs. It is exposed on each event (accessed through a contract's events attribute) and accepts a starting block (fromBlock) and an ending block (toBlock).
You can provide any range you like for these, but there is a limit to how much work the RPC will do before it errors or times out.
We know the V3 factory was deployed on block 12,369,621, so that establishes the first block where PoolCreated events could have been emitted. I have determined that a block span of 50,000 works on my local geth node, but you may have different results if you’re using an external RPC. Please experiment and adjust as needed.
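Before looking at the full script, here is what a single bounded query looks like. This is a minimal sketch, assuming factory_contract is a web3 contract object built from the factory ABI (as in the script below), covering the first 50,000-block span after deployment:
# fetch all PoolCreated events within one bounded block range
pool_created_events = factory_contract.events.PoolCreated.getLogs(
    fromBlock=12_369_621,
    toBlock=12_419_620,
)
for event in pool_created_events:
    print(event.args.pool, event.args.fee, event.blockNumber)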
ethereum_lp_fetcher_uniswapv3_json.py
from brownie import network, Contract
import sys
import os
import json
import web3
BROWNIE_NETWORK = "mainnet-local"
os.environ["ETHERSCAN_TOKEN"] = "[redacted]"
# maximum blocks to process with getLogs
BLOCK_SPAN = 50_000
FACTORY_DEPLOYMENT_BLOCK = 12369621
try:
network.connect(BROWNIE_NETWORK)
except:
sys.exit("Could not connect!")
exchanges = [
{
"name": "Uniswap V3",
"filename": "ethereum_uniswapv3_lps.json",
"factory_address": "0x1F98431c8aD98523631AE4a59f267346ea31F984",
},
]
w3 = web3.Web3(web3.WebsocketProvider())
for name, factory_address, filename in [
(
exchange["name"],
exchange["factory_address"],
exchange["filename"],
)
for exchange in exchanges
]:
print(f"DEX: {name}")
try:
factory = Contract(factory_address)
except:
try:
factory = Contract.from_explorer(factory_address)
except:
factory = None
finally:
if factory is None:
sys.exit("FACTORY COULD NOT BE LOADED")
try:
with open(filename) as file:
lp_data = json.load(file)
except FileNotFoundError:
lp_data = []
if lp_data:
previous_block = lp_data[-1].get("block_number")
print(f"Found pool data up to block {previous_block}")
else:
previous_block = FACTORY_DEPLOYMENT_BLOCK
factory_contract = w3.eth.contract(
address=factory.address, abi=factory.abi
)
current_block = w3.eth.get_block_number()
previously_found_pools = len(lp_data)
print(f"previously found {previously_found_pools} pools")
for i in range(previous_block + 1, current_block + 1, BLOCK_SPAN):
        # end each chunk one block before the next begins, so the boundary
        # block is not fetched twice, and clamp the final chunk to the chain head
        end_block = min(i + BLOCK_SPAN - 1, current_block)
if pool_created_events := factory_contract.events.PoolCreated.getLogs(
fromBlock=i, toBlock=end_block
):
for event in pool_created_events:
lp_data.append(
{
"pool_address": event.args.pool,
"fee": event.args.fee,
"token0": event.args.token0,
"token1": event.args.token1,
"block_number": event.blockNumber,
"type": "UniswapV3",
}
)
with open(filename, "w") as file:
json.dump(lp_data, file, indent=2)
print(f"Saved {len(lp_data) - previously_found_pools} new pools")
After running that in an empty directory (no previously saved JSON), the fetcher finds 9611 new V3 pools:
[
{
"pool_address": "0x1d42064Fc4Beb5F8aAF85F4617AE8b3b5B8Bd801",
"fee": 3000,
"token0": "0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984",
"token1": "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",
"block_number": 12369739,
"type": "UniswapV3"
},
{
"pool_address": "0x6c6Bc977E13Df9b0de53b251522280BB72383700",
"fee": 500,
"token0": "0x6B175474E89094C44Da98b954EedeAC495271d0F",
"token1": "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48",
"block_number": 12369760,
"type": "UniswapV3"
},
{
"pool_address": "0x7BeA39867e4169DBe237d55C8242a8f2fcDcc387",
"fee": 10000,
"token0": "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48",
"token1": "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",
"block_number": 12369811,
"type": "UniswapV3"
},
...
]
If you run the fetcher again, it will start from the block after the most recently found pool.
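The saved JSON is easy to consume from other scripts. Here is a minimal sketch that loads the V3 file and groups pools by token pair, which makes it easy to see every fee tier deployed for a given pair:
import json

with open("ethereum_uniswapv3_lps.json") as file:
    v3_pools = json.load(file)

# group pools by token pair to see every deployed fee tier
pools_by_pair = {}
for pool in v3_pools:
    pair = (pool["token0"], pool["token1"])
    pools_by_pair.setdefault(pair, []).append((pool["fee"], pool["pool_address"]))

print(f"{len(v3_pools)} pools across {len(pools_by_pair)} token pairs")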
Here is an updated version of the V2 LP fetcher, which uses the same getLogs approach and adds a "type": "UniswapV2" entry to each LP dictionary. One non-obvious detail: the V2 factory's PairCreated event declares its final argument (the running pair count) without a name, so web3 exposes it under an empty-string key, which is why the script reads it with event.args.get("").
ethereum_lp_fetcher_uniswapv2_json.py
from brownie import network, Contract
import sys
import os
import json
import web3
BROWNIE_NETWORK = "mainnet-local"
os.environ["ETHERSCAN_TOKEN"] = "[redacted]"
# maximum blocks to process with getLogs
BLOCK_SPAN = 50_000
# number of pools to process at a time before flushing to disk
CHUNK_SIZE = 1000
try:
network.connect(BROWNIE_NETWORK)
except:
sys.exit("Could not connect!")
exchanges = [
{
"name": "SushiSwap",
"filename": "ethereum_sushiswap_lps.json",
"factory_address": "0xC0AEe478e3658e2610c5F7A4A2E1777cE9e4f2Ac",
"factory_deployment_block": 10794229,
},
{
"name": "Uniswap V2",
"filename": "ethereum_uniswapv2_lps.json",
"factory_address": "0x5C69bEe701ef814a2B6a3EDD4B1652CB9cc5aA6f",
"factory_deployment_block": 10000835,
},
]
w3 = web3.Web3(web3.WebsocketProvider())
current_block = w3.eth.get_block_number()
for name, factory_address, filename, deployment_block in [
(
exchange["name"],
exchange["factory_address"],
exchange["filename"],
exchange["factory_deployment_block"],
)
for exchange in exchanges
]:
print(f"DEX: {name}")
try:
factory_contract = Contract(factory_address)
except:
try:
factory_contract = Contract.from_explorer(factory_address)
except:
factory_contract = None
finally:
if factory_contract is None:
sys.exit("FACTORY COULD NOT BE LOADED")
try:
with open(filename) as file:
lp_data = json.load(file)
except FileNotFoundError:
lp_data = []
if lp_data:
previous_pool_count = len(lp_data)
print(f"Found previously-fetched data: {previous_pool_count} pools")
previous_block = lp_data[-1].get("block_number")
print(f"Found pool data up to block {previous_block}")
else:
previous_pool_count = 0
previous_block = deployment_block
for i in range(previous_block + 1, current_block + 1, BLOCK_SPAN):
        # end each chunk one block before the next begins, so the boundary
        # block is not fetched twice, and clamp the final chunk to the chain head
        end_block = min(i + BLOCK_SPAN - 1, current_block)
if pool_created_events := factory_contract.events.PairCreated.getLogs(
fromBlock=i, toBlock=end_block
):
for event in pool_created_events:
lp_data.append(
{
"pool_address": event.args.get("pair"),
"token0": event.args.get("token0"),
"token1": event.args.get("token1"),
"block_number": event.get("blockNumber"),
"pool_id": event.args.get(""),
"type": "UniswapV2",
}
)
with open(filename, "w") as file:
json.dump(lp_data, file, indent=2)
Arb Path Builder
Now that we have a bunch of V3 pools, we should do something useful with them. Building some 2-pool arbs would be a good plan! The latest UniswapLpCycle helper will accept both V2 and V3 pools, so we need an arb path builder that can generate parsable paths for both types.
New Arb Format
The V2 arb path format worked fine, but it’s insufficiently generic and difficult to extend. I did a bunch of clumsy manipulations in the last bot example to transform “flash borrow” arbs to “cycle” arbs by re-purposing the arb path dictionary.
I didn’t like it, but it worked and I knew I’d get around to fixing it once V3 was ready.
Well, V3 is mostly ready!
First, let’s compare the old format and the new.
Old Format (JSON)
"0x277d530218d51adeea7d77d49bcee597e624690bd7023418dd53b28a7ecb1159": {
"id": "0x277d530218d51adeea7d77d49bcee597e624690bd7023418dd53b28a7ecb1159",
"borrow_pool": "0x06da0fd433C1A5d7a4faa01111c044910A184553",
"swap_pools": [
"0x0d4a11d5EEaaC28EC3F61d100daF4d40471f1852"
],
"borrow_token": "0xdAC17F958D2ee523a2206206994597C13D831ec7",
"repay_token": "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2"
}
New Format (JSON)
"0x277d530218d51adeea7d77d49bcee597e624690bd7023418dd53b28a7ecb1159": {
"id": "0x277d530218d51adeea7d77d49bcee597e624690bd7023418dd53b28a7ecb1159",
"pools": {
"0x06da0fd433C1A5d7a4faa01111c044910A184553": {
"pool_address": "0x06da0fd433C1A5d7a4faa01111c044910A184553",
"token0": "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",
"token1": "0xdAC17F958D2ee523a2206206994597C13D831ec7",
"type": "UniswapV2"
},
"0x0d4a11d5EEaaC28EC3F61d100daF4d40471f1852": {
"pool_address": "0x0d4a11d5EEaaC28EC3F61d100daF4d40471f1852",
"token0": "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",
"token1": "0xdAC17F958D2ee523a2206206994597C13D831ec7",
"type": "UniswapV2"
}
},
"arb_types": [
"cycle",
"flash_borrow_lp_swap"
],
"path": [
"0x06da0fd433C1A5d7a4faa01111c044910A184553",
"0x0d4a11d5EEaaC28EC3F61d100daF4d40471f1852"
]
}
The advantage of this structure is that the arbitrage entry is more generic and can be extended later. Instead of hard-coding borrow_token and repay_token, I have left that logic to be implemented by the arb helper itself. The arb should be as generic as possible, consisting only of a unique id, the path (a list of pool addresses), the compatible strategies for that path, and a sub-dictionary of relevant information for each pool, keyed by address.
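To see how a helper might consume this, here is a minimal sketch that takes one entry from a dictionary of these arbs (the arb_paths name is assumed), walks its path, and pulls the matching pool record for each step:
# look up a single arb entry by its id (hypothetical arb_paths dictionary,
# keyed by id as in the JSON examples above)
arb = arb_paths["0x277d530218d51adeea7d77d49bcee597e624690bd7023418dd53b28a7ecb1159"]

# the path is an ordered list of pool addresses; the pools sub-dictionary
# holds the details for each one
for pool_address in arb["path"]:
    pool = arb["pools"][pool_address]
    print(pool["type"], pool["pool_address"], pool["token0"], pool["token1"])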
I may tweak the arb_types entries later, but this is a good starting point. My end goal is to build a class that translates these arb dictionaries into helper objects, instead of hand-modifying each one for the various arb strategies.
Here is the arb builder, built from the familiar Graph Theory / NetworkX foundation:
ethereum_parser_2pool_univ3.py