Arbitrum is getting a lot of attention lately, given the success of their recent governance token airdrop.
I was a big fan even before they dropped free money in my wallet (I mean, uh… the ability to responsibly govern the DAO!)
I first tried it out during my experimentation doing sSPELL arbitrage by hand using the bridge. I was a fan of the low fees, high throughput, and inherited security from mainnet.
Being the most popular Ethereum L2, it has a lot of users and many useful contracts deployed. It is home to both Sushiswap and Uniswap V3, which makes it a fun place for me to do simple (and cheap) “copy-paste” smart contract botting for testing against live participants.
Since Arbitrum is a rollup that records final state to Ethereum, it must have a mechanism to determine what state data is recorded at regular intervals. Offchain Labs have decided to determine and enforce the ordering of this data by means of a centralized gatekeeper called a sequencer. After users on Arbitrum submit their transactions to the RPC or node, it passes their transaction to the sequencer. Once the sequencer has a transaction, it determines how to order it.
The sequencer orders transactions (at least for now) with a first-come, first-served queuing system, then submits them in order to the inbox, which is a contract on mainnet Ethereum that finalizes this data.
You can read about the sequencer in depth HERE.
There is a catch for the MEV enjoyoooor, however. No mempool! Instead of transactions being propagated across the network and evaluated/included by multiple nodes via a consensus mechanism, the nodes simply pass user transactions to the sequencer and it handles the rest.
The lack of mempool is similar to Avalanche, where non-validator nodes have marginalized access to the mempool gossip protocol.
Arbitrum, having no mempool at all, is at least “fair” in this way. However, they do pay some lip service to the dark forest-y nature of it all by asserting that the transaction ordering performed by the sequencer can be verified as honest (albeit after the fact).
The sequencer publishes a continuous feed of every payload it generates. Arbitrum have published an extremely bare-bones guide to access this feed HERE.
I’ve spent the last few days poking at the feed, figuring out how to mine the coded payloads for useful nuggets.
What Does the Feed Look Like?
Let’s write up a very simple script that connects to the sequencer feed via websocket and watches the output:
arbitrum_feed_watcher.py
import asyncio
import json

import websockets

SEQUENCER_URI = "wss://arb1.arbitrum.io/feed"


async def watch_sequencer_feed():
    print("Starting sequencer feed watcher")

    # Iterating over connect() opens a fresh connection whenever the old one drops
    async for websocket in websockets.connect(
        uri=SEQUENCER_URI, ping_timeout=None
    ):
        while True:
            try:
                sequencer_payload = json.loads(await websocket.recv())
            except Exception as e:
                print(f"(watch_sequencer_feed) websocket.recv(): {e}")
                break
            else:
                try:
                    messages = sequencer_payload["messages"]
                except KeyError:
                    # Payloads without a "messages" key are skipped
                    continue
                else:
                    for message in messages:
                        print(message)


if __name__ == "__main__":
    asyncio.run(watch_sequencer_feed())
    print("Complete")
Run it and you’ll get a bunch of output like this:
{'sequenceNumber': 58388656, 'message': {'message': {'header': {'kind': 3, 'sender': '0xa4b000000000000000000073657175656e636572', 'blockNumber': 17050178, 'timestamp': 1681534314, 'requestId': None, 'baseFeeL1': None}, 'l2Msg': 'BPkBToIE9oQF9eEAgy3kH5RFIZFpcqdtW/pl+1Oc96DCWSBQrIC45IPXeOsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABgNgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACiN69e////////////////////////////////+jMGoOvAykYAAAAAAAAAAAAAAAAAAAAAAAAAAAAC/TUIyUc8IWb07QAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYMBSYagO6bng+LApyvTLudcLe4SibpnaGvXOXJn3ribVgLQYa+gNH7iSw1Cs04vYARm0HhnZwojDecBEwoOCR3m0pPGRcs='}, 'delayedMessagesRead': 767519}, 'signature': None}
{'sequenceNumber': 58388657, 'message': {'message': {'header': {'kind': 3, 'sender': '0xa4b000000000000000000073657175656e636572', 'blockNumber': 17050178, 'timestamp': 1681534314, 'requestId': None, 'baseFeeL1': None}, 'l2Msg': 'AwAAAAAAAADyBPjvgxG2fIQHJw4AgywCHpSgzDPdb0gZ1HMiYld5Kv4jDsPGf4C4hEb4O1AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAbMgshaAD5zEs46wSXbldfaV7p1UBYQtEunZ8NKbd2guAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHoSDILIWgA+cxLOOsEl25XX2le6dVAWELRLp2fDSm3doLgIMBSYWgr8nuDtpcxK4lsJoofszn0fZyDkPbPKaVIlzEzAkv6vSgLRnSn2q58VIzmTn89M9jajVkGPrJeOwKNHHZWuUlOugAAAAAAAAAdgQC+HKCpLGCAeyAhAgL78CDBj1wlOTtsnfkHciasHah8En0o++nALzoh//LnlfUIzaAwAGgPjv2xvnImXqEckfM5rM9fJFFphjM8BYeHI6Kqlxu7a+gUoGW6VG8V5qBNarHH1oip6YeG/DnirK8sm+0DoDg9TE='}, 'delayedMessagesRead': 767519}, 'signature': None}
{'sequenceNumber': 58388658, 'message': {'message': {'header': {'kind': 3, 'sender': '0xa4b000000000000000000073657175656e636572', 'blockNumber': 17050178, 'timestamp': 1681534314, 'requestId': None, 'baseFeeL1': None}, 'l2Msg': 'AwAAAAAAAAD5BAL49YKksUmAhAgL78CDDENqlPstxYDu2VW1KEB7TTb/r+PaaFQBh0cpLcrZ4AC4hL65rd8AAAAAAAAAAAAAAAB6AW2qYOURdRslXHcUmKb5dMc0WQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAG0jrV+AAaS7nX3dlegeMCzS9JaU8wVIU1u+HrBWtbDUm/FDRprrAaqCahEtU7EaLdrwPua+jgAn6C92VVF7TMwPT3Pyu4sABoEsfU91rSSCGE1F9aR+a2PTwwvGQPyp7K6cMGqtp9YdFoCBQ0wmo63gH4ICRN1WszcpgmcnROB6lCEklQEhSjeZBAAAAAAAABp8EAvkGmoKksYIRhIQF9eEAhQSuDakAgzB9WpQRERESVO6yVHe2j7he2Sn3OpYFgoC5BigSqjyvAAAAAAAAAAAAAAAAZHaKOiRT8ejenkPpLWX8NuTJhy0AAAAAAAAAAAAAAACt1WIAVzNvho6ueKRRxQOue1drrQAAAAAAAAAAAAAAAP+XCmGgSxyhSDSkP13kUz6921zIAAAAAAAAAAAAAAAAZHaKOiRT8ejenkPpLWX8NuTJhy0AAAAAAAAAAAAAAABlqPB72ahZjhtbbAqI9HedvAd2dQAAAAAAAAAAAAAAAAAAAAAAAAAAAABEZKDNhprK9BKaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABP1yAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAFAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAASNAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARvAARBAKAH5cDSAAAAAAAAAAAAAAAAAAAAAAAAAAQdAAQDAAFqAAFQUSDIc/7L01T1pW4A5xC5DvQgHbJEja3VYgBXM2+Gjq54pFHFA657V2utAASsOJO6AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACN3OgkM3eqgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAAAAAAAAAAAAAAAAAZHaKOiRT8ejenkPpLWX8NuTJhy0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABkQMTqAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAACt1WIAVzNvho6ueKRRxQOue1drrQAAAAAAAAAAAAAAAIKvSUR9igfjvZW9DVbzUkFSP7qxACDWvb94gq9JRH2KB+O9lb0NVvNSQVI/urEAoIYKMuwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACPS76G1E+AgACcFEgH3cvo7wmMWDqCbsWzhprj8D6s2qCr0lEfYoH472VvQ1W81JBUj+6sQDk8CEJKQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAAAAAAAAAcdFDfeu2vVo3hA7J6d4nC0NXkFMAAAAAAAAAAAAAAAAAh7uALZwOND8AUQAAcpAxzgC/JwAAAAAAAAAAAA
AAAGR2ijokU/Ho3p5D6S1l/DbkyYctAAAAAAAAAAAAAAAAZajwe9moWY4bW2wKiPR3nbwHdnUAAAAAAAAAAAAAAACCr0lEfYoH472VvQ1W81JBUj+6sQAAAAAAAAAAAAAAAP+XCmGgSxyhSDSkP13kUz6921zIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACPS76G1E+AgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAUD3cFAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGQ6LYMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGHg0FmMQJnYW5kYWxmdGhlYnJvd25neG14bmkAGCTafjH1BgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQXAQEYD2Vlle/RpAFqm1VLu67vdbsqn5o29h8pEVIIvDStSnp4bbqr63GaVCJKa1TbJA5dz2/M54yX7WyM/fVCwbAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAASNwAg1r2/eP+XCmGgSxyhSDSkP13kUz6921zIgKBsTson/5cKYaBLHKFINKQ/XeRTPr3bXMgRERESVO6yVHe2j7he2Sn3OpYFggAAAAAAAAAAAAAAAAAAAAAAAADP7nwIwAGgSbWQ/V+sBvgOObIGM+notLCeycEsG4mpE+yw3HWAuJygKk98hpUhS4DXRVNReK2s6VbDJN3AYxvQgMIcE2Icor4AAAAAAAABkgT5AY6B4oQF9eEAgyJVOJSebvf3WtiNTttMmSXJS3acWw1igYC5ASQuTb6PAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABOUYAAAAAAAAAAAAAAACPaP6j/ZEIxRM/Ye6sQHpeUlsmpQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHZSlJAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABOUYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBtndtsNRoUQT77ONYZ24l4b0FwlQjykbJPOXDeDT6U5M+74hejm2om43Npxp6KjkE2zo0X1zK43UzJa5QO2YyDhsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgwFJhqDn7TK9WLrtIGo3rsvUGWCced9ZVYPNbo30Sus64SyUT6BmGQ0gd4ci5GfxpEZIm8ftnh0JnOC0tlSd4vC56CBoZwAAAAAAAAN7BAL5A3aCpLEEgIQIC+/Agxnn6JRMYAUThL0tPAG/yEXPX0tEvL6d5YcFHV88fj0CuQMENZNWTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAZDouegAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADCwEMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABR1fPH49AgAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC3GwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAFHV88fj0CAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAr/5cKYaBLHKFINKQ/XeRTPr3bXMgAAfSCr0lEfYoH472VvQ1W81JBUj+6sQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMABoC1nfQzfRCV3po8HW+bgLthwoNMTy9cHK7EaHyZwxtDsoClanCA3KXhT2vnvutsGth9VpoGfzdPBeELn+6wwesia'}, 'delayedMessagesRead': 767519}, 'signature': None}
[...]
This is where all the fun happens so let’s explore. First, the payload is a JSON-formatted dictionary which is pretty standard for data sent over a websocket.
The previous link about reading the feed describes the various labels for this structured data. We need to be most aware of the type L1IncomingMessage. It contains all of the relevant info necessary to catch and process Arbitrum transactions early.
Unfortunately it’s barely documented beyond a link to a source file written in Golang. I’m all for trying out new stuff so I taught myself a bit of basic Golang and started poking around.
The first thing you’ll discover after reading through incomingmessage.go (LINK) is a function called ParseL2Transactions. This function is executed by the larger ArbOS system, which is described with a big mess of jargon HERE.
Here’s what you need to know — The Arbitrum sequencer runs a modified version of geth, so take all of the familiar geth structures and functions and you’ll mostly know how to deal with the sequencer source code.
L1 and L2 Messages
There are two groups of messages within the sequencer: L1 and L2. The sequencer itself listens for L2 transactions (generated within the Arbitrum network), but interacts primarily with Ethereum to drive the inbox contract.
Arbitrum blocks are issued at regular intervals, and always include a “base” transaction from address 0x00000000000000000000000000000000000a4b05 that performs a 0 wei transfer to itself.
NERD NOTE: a4b05 is l33t for “ArbOS”, so kudos to Offchain Labs (buncha nerds fr fr) who obviously spent a lot of time on their computers in the 90s.
The Go file defines the L1 messages as:
const (
    L1MessageType_L2Message             = 3
    L1MessageType_EndOfBlock            = 6
    L1MessageType_L2FundedByL1          = 7
    L1MessageType_RollupEvent           = 8
    L1MessageType_SubmitRetryable       = 9
    L1MessageType_BatchForGasEstimation = 10
    L1MessageType_Initialize            = 11
    L1MessageType_EthDeposit            = 12
    L1MessageType_BatchPostingReport    = 13
    L1MessageType_Invalid               = 0xFF
)
And the L2 messages:
const (
    L2MessageKind_UnsignedUserTx  = 0
    L2MessageKind_ContractTx      = 1
    L2MessageKind_NonmutatingCall = 2
    L2MessageKind_Batch           = 3
    L2MessageKind_SignedTx        = 4
    // 5 is reserved
    L2MessageKind_Heartbeat          = 6 // deprecated
    L2MessageKind_SignedCompressedTx = 7
    // 8 is reserved for BLS signed batch
)
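Since the rest of this post works in Python, it can help to mirror those two Go constant blocks as enums. This is only a sketch: the class names, member names, and the l2_kind_name helper are my own invention, and only the numeric values come from the Go source.

```python
from enum import IntEnum


class L1MessageType(IntEnum):
    """Values copied from the L1MessageType_* constants above."""
    L2_MESSAGE = 3
    END_OF_BLOCK = 6
    L2_FUNDED_BY_L1 = 7
    ROLLUP_EVENT = 8
    SUBMIT_RETRYABLE = 9
    BATCH_FOR_GAS_ESTIMATION = 10
    INITIALIZE = 11
    ETH_DEPOSIT = 12
    BATCH_POSTING_REPORT = 13
    INVALID = 0xFF


class L2MessageKind(IntEnum):
    """Values copied from the L2MessageKind_* constants above."""
    UNSIGNED_USER_TX = 0
    CONTRACT_TX = 1
    NONMUTATING_CALL = 2
    BATCH = 3
    SIGNED_TX = 4
    HEARTBEAT = 6  # deprecated
    SIGNED_COMPRESSED_TX = 7


def l2_kind_name(value: int) -> str:
    """Return a readable name, tolerating reserved or unknown values (5, 8, ...)."""
    try:
        return L2MessageKind(value).name
    except ValueError:
        return f"UNKNOWN({value})"


if __name__ == "__main__":
    print(L1MessageType(3).name, l2_kind_name(4), l2_kind_name(5))
```

The lookup helper exists because the reserved values are not enum members, so calling L2MessageKind(5) directly would raise a ValueError in the middle of a feed loop.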
L1 messages are all received by the sequencer and deal with Ethereum mainnet interaction.
L2 messages may be received by participating nodes in the network, but ultimately end up at the sequencer just the same.
There is one kind of message we really care about, which is L2MessageKind_SignedTx.
L2 Signed Transactions
How is a signed TX broadcast by the sequencer feed? First it arrives as an L1 message of type 3 (L1MessageType_L2Message), and the encoded transaction will be marked as type 4 (L2MessageKind_SignedTx).
The feed-reading guide from Arbitrum lists the structure of L1IncomingMessage as:
type L1IncomingMessage struct {
    Header *L1IncomingMessageHeader `json:"header"`
    L2msg  []byte                   `json:"l2Msg"`

    // Only used for `L1MessageType_BatchPostingReport`
    BatchGasCost *uint64 `json:"batchGasCost,omitempty" rlp:"optional"`
}
Here the Header attribute is where we look to determine the L1 message type. If you look up at the watcher output above, you’ll see several parameters:
kind
sender
blockNumber
timestamp
requestId
baseFeeL1
Most of these don’t apply to us, but the kind parameter is useful and gives us the message classification. Probably what you expected.
Reading the output above, you’ll notice that the examples are set to 'kind': 3, signifying that they are L2 messages.
But what kind of L2 message?
This is where it gets really wacky.
Decoding l2Msg
Each message emitted by the sequencer feed with an L2 message will include a dictionary key called 'l2Msg' within the 'message' sub-dictionary of the surrounding 'message' dictionary. There’s a lot of redundant naming here so please just roll with it.
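To make the nesting concrete, here is a small sketch that digs the l2Msg out of one feed message and base64-decodes it, skipping anything whose header kind is not 3. The function name extract_l2_msg and the tiny sample dictionary are made up for illustration. The sample only mimics the shape of the real output, and its payload is two placeholder bytes.

```python
import base64
from typing import Optional

L1_MESSAGE_TYPE_L2_MESSAGE = 3  # value of L1MessageType_L2Message


def extract_l2_msg(feed_message: dict) -> Optional[bytes]:
    """Return the decoded l2Msg bytes, or None if this is not an L2 message."""
    inner = feed_message["message"]["message"]  # 'message' inside 'message'
    if inner["header"]["kind"] != L1_MESSAGE_TYPE_L2_MESSAGE:
        return None
    return base64.b64decode(inner["l2Msg"])


# Hypothetical stand-in with the same layout as a real feed message
sample = {
    "sequenceNumber": 1,
    "message": {
        "message": {
            "header": {"kind": 3},
            "l2Msg": base64.b64encode(b"\x04\xc0").decode(),
        },
        "delayedMessagesRead": 0,
    },
    "signature": None,
}

if __name__ == "__main__":
    print(extract_l2_msg(sample))
```

In the watcher loop you would call this on each message in place of the bare print.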
Taking the first sequence above (#58388656) as an example for demonstration, we see that l2Msg is a pretty funky looking character string:
'BPkBToIE9oQF9eEAgy3kH5RFIZFpcqdtW/pl+1Oc96DCWSBQrIC45IPXeOsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABgNgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACiN69
[...]
SYagO6bng+LApyvTLudcLe4SibpnaGvXOXJn3ribVgLQYa+gNH7iSw1Cs04vYARm0HhnZwojDecBEwoOCR3m0pPGRcs='
This is not an encoding we’ve seen before. I’ll save you some trouble: it’s base64.
So now fire up a Python console and decode it:
>>> import base64
>>> base64.b64decode(
'BPkBToIE9oQF9eEAgy3kH5RFIZFpcqdtW/pl+1Oc96DCWSBQrIC45IPXeOsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABgNgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACiN69e////////////////////////////////+jMGoOvAykYAAAAAAAAAAAAAAAAAAAAAAAAAAAAC/TUIyUc8IWb07QAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYMBSYagO6bng+LApyvTLudcLe4SibpnaGvXOXJn3ribVgLQYa+gNH7iSw1Cs04vYARm0HhnZwojDecBEwoOCR3m0pPGRcs=')
b'\x04\xf9\x01N\x82\x04\xf6\x84\x05\xf5\xe1\x00\x83-\xe4\x1f\x94E!\x91ir\xa7m[\xfae\xfbS\x9c\xf7\xa0\xc2Y P\xac\x80\xb8\xe4\x83\xd7x\xeb\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00`6\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa27\xaf^\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfa3\x06\xa0\xeb\xc0\xcaF\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\xfd5\x08\xc9G<!f\xf4\xed\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x83\x01I\x86\xa0;\xa6\xe7\x83\xe2\xc0\xa7+\xd3.\xe7\\-\xee\x12\x89\xbaghk\xd79rg\xde\xb8\x9bV\x02\xd0a\xaf\xa04~\xe2K\rB\xb3N/`\x04f\xd0xgg\n#\r\xe7\x01\x13\n\x0e\t\x1d\xe6\xd2\x93\xc6E\xcb'
Looks more familiar! Let’s look at it in hex:
>>> base64.b64decode(' [...] ').hex()
'04f9014e8204f68405f5e100832de41f944521916972a76d5bfa65fb539cf7a0c2592050ac80b8e483d778eb000000000000000000000000000000000000000000000000000000000000603600000000000000000000000000000000000000000000000000000000a237af5efffffffffffffffffffffffffffffffffffffffffffffffffa3306a0ebc0ca4600000000000000000000000000000000000000000002fd3508c9473c2166f4ed00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000183014986a03ba6e783e2c0a72bd32ee75c2dee1289ba67686bd7397267deb89b5602d061afa0347ee24b0d42b34e2f600466d07867670a230de701130a0e091de6d293c645cb'
But what kind of signed TX is this?
Here is the part that hung me up for a long while.
All attempts to decode it failed until I decided to just watch the feed for 5 minutes looking for patterns. Eventually I saw that the first byte (b'\x04' above) was fairly predictable. It was primarily 3 or 4, and then I realized that this was an encoded byte that signaled the kind of L2 message.
If I were better at Go I would have realized this sooner, but oh well. The Go function sends the encoded data into an io.Reader, which is used for accessing streams of data. As you read from it, the reader advances along the stream until it reaches the end and returns an io.EOF error. ArbOS reads the first byte into a 1-byte buffer, determines the kind of L2 message from it, then processes the rest of the encoded l2Msg separately.
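The same read-one-byte-then-the-rest pattern is easy to imitate in Python with io.BytesIO standing in for Go's io.Reader. This is a rough sketch of the idea, not a port of the ArbOS code, and split_l2_message is a name I made up. The example bytes are the first seven bytes of the hex dump above.

```python
import io
from typing import Tuple


def split_l2_message(l2_msg: bytes) -> Tuple[int, bytes]:
    """Split decoded l2Msg bytes into (kind, remaining payload)."""
    reader = io.BytesIO(l2_msg)
    kind_byte = reader.read(1)  # advances the stream position by one byte
    if len(kind_byte) != 1:
        raise ValueError("l2Msg is empty, no kind byte to read")
    return kind_byte[0], reader.read()  # read() returns everything left


if __name__ == "__main__":
    kind, payload = split_l2_message(bytes.fromhex("04f9014e8204f6"))
    print(kind, payload.hex())
```

For a kind of 4 the remaining payload is the raw signed transaction, which is what the next sections decode.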
Decoding the TX
But the question remains — since we’re not using Go, what can I do with that hex string?
We need to determine one more thing before answering that question.
Did you know that Legacy (type0) transactions are RLP-encoded differently from EIP-1559 (type2) transactions? I didn’t.
I’ve taken for granted that I can just fetch a transaction from a network and decode it, but when presented with a random raw TX from a websocket, I was unsure how to proceed.
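One way to tell the two apart is to peek at the first byte of the raw transaction (after the L2 kind byte has been stripped). A legacy transaction is a bare RLP list, so it starts with a byte of 0xc0 or higher, while a typed transaction (EIP-2718) starts with a small type byte such as 0x02 for EIP-1559. Here is a sketch of that check, with classify_raw_tx being a hypothetical helper name of mine:

```python
def classify_raw_tx(raw_tx: bytes) -> str:
    """Guess the transaction format from the first byte of the raw bytes."""
    if not raw_tx:
        raise ValueError("empty transaction payload")
    first = raw_tx[0]
    if first >= 0xC0:
        # RLP list prefix: the whole payload is one RLP-encoded legacy TX
        return "legacy"
    if first == 0x01:
        return "eip-2930"
    if first == 0x02:
        # Type byte, followed by an RLP list with the EIP-1559 fields
        return "eip-1559"
    return f"unknown (0x{first:02x})"


if __name__ == "__main__":
    print(classify_raw_tx(bytes.fromhex("f9014e")))  # start of the example TX
    print(classify_raw_tx(bytes.fromhex("02f872")))
```

The example transaction from sequence #58388656 begins with f9 once the 04 kind byte is removed, so it lands in the legacy bucket.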
Legacy Transaction Decoding
So once again I scoured the docs and learned that Legacy transactions can be decoded somewhat simply using the rlp module, which you’ll already have access to if you’ve installed Brownie and related tools:
>>> import rlp
>>> from ethereum.transactions import Transaction
>>> from hexbytes import HexBytes
>>> raw_tx = '04f9014e8204f68405f5e100832de41f944521916972a76d5bfa65fb539cf7a0c2592050ac80b8e483d778eb000000000000000000000000000000000000000000000000000000000000603600000000000000000000000000000000000000000000000000000000a237af5efffffffffffffffffffffffffffffffffffffffffffffffffa3306a0ebc0ca4600000000000000000000000000000000000000000002fd3508c9473c2166f4ed00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000183014986a03ba6e783e2c0a72bd32ee75c2dee1289ba67686bd7397267deb89b5602d061afa0347ee24b0d42b34e2f600466d07867670a230de701130a0e091de6d293c645cb'
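>>> # raw_tx[2:] drops the leading "04" (the L2 message kind byte), which is not part of the transaction
>>> rlp.decode(
...     HexBytes(raw_tx[2:]),
...     Transaction
... ).as_dict()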
{'nonce': 1270, 'gasprice': 100000000, 'startgas': 3007519, 'to': HexBytes('0x4521916972a76d5bfa65fb539cf7a0c2592050ac'), 'value': 0, 'data': HexBytes('0x83d778eb000000000000000000000000000000000000000000000000000000000000603600000000000000000000000000000000000000000000000000000000a237af5efffffffffffffffffffffffffffffffffffffffffffffffffa3306a0ebc0ca4600000000000000000000000000000000000000000002fd3508c9473c2166f4ed000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001'), 'v': 84358, 'r': 26981352538204881499297968755808149086952976875988935691529692733840859095471, 's': 23744452674558802074226830982840600658335803357639654388979651055823217903051}
You’ll notice that the hash is not included here to verify against the blockchain, so we have to generate it ourselves:
>>> import web3
>>> web3.Web3.keccak(hexstr=raw_tx[2:])
HexBytes('0xfda105b7d91edb9e6b2c4060ff8909099c3f6d94b55df7b7af4e08b4b7605f27')
Moving over to Arbiscan, you can verify the transaction matches the dictionary generated by rlp.decode above.
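If you are curious what the rlp module is doing under the hood, or don't have it installed, a legacy transaction can be unpacked with nothing but the standard library. What follows is a bare-bones sketch and not the rlp library's implementation: all function names are mine, the field names just follow the dictionary output above, and it skips the strictness checks a real decoder performs. It also recovers the chain ID from v using the EIP-155 rule v = chain_id * 2 + 35 + parity.

```python
from typing import List, Optional, Tuple, Union

RLPItem = Union[bytes, List["RLPItem"]]
LEGACY_FIELDS = ("nonce", "gasprice", "startgas", "to", "value", "data", "v", "r", "s")


def _decode_item(data: bytes, pos: int) -> Tuple[RLPItem, int]:
    """Decode one RLP item starting at data[pos]; return (item, next position)."""
    prefix = data[pos]
    if prefix < 0x80:  # a single byte below 0x80 encodes itself
        return data[pos:pos + 1], pos + 1
    is_list = prefix >= 0xC0
    base = 0xC0 if is_list else 0x80
    if prefix - base <= 55:  # short form: length lives in the prefix
        length, start = prefix - base, pos + 1
    else:  # long form: prefix gives the size of the length field
        len_of_len = prefix - base - 55
        length = int.from_bytes(data[pos + 1:pos + 1 + len_of_len], "big")
        start = pos + 1 + len_of_len
    end = start + length
    if end > len(data):
        raise ValueError("truncated RLP payload")
    if not is_list:
        return data[start:end], end
    items: List[RLPItem] = []
    cursor = start
    while cursor < end:  # a list payload is just items back to back
        item, cursor = _decode_item(data, cursor)
        items.append(item)
    return items, end


def rlp_decode(data: bytes) -> RLPItem:
    item, end = _decode_item(data, 0)
    if end != len(data):
        raise ValueError("trailing bytes after RLP item")
    return item


def decode_legacy_tx(raw_tx: bytes) -> dict:
    """raw_tx is the signed TX without the leading L2 message kind byte."""
    fields = rlp_decode(raw_tx)
    if not isinstance(fields, list) or len(fields) != len(LEGACY_FIELDS):
        raise ValueError("not a 9-field legacy transaction")
    tx = {}
    for name, value in zip(LEGACY_FIELDS, fields):
        if name in ("to", "data"):
            tx[name] = "0x" + value.hex()
        else:
            tx[name] = int.from_bytes(value, "big")  # empty bytes decode to 0
    return tx


def chain_id_from_v(v: int) -> Optional[int]:
    """EIP-155: v = chain_id * 2 + 35 + parity. Pre-EIP-155 TXs use 27 or 28."""
    return None if v in (27, 28) else (v - 35) // 2


if __name__ == "__main__":
    print(chain_id_from_v(84358))  # 42161, the Arbitrum One chain ID
```

Feeding decode_legacy_tx the bytes of raw_tx[2:] from the console session should give the same fields that rlp.decode returned, with to and data as hex strings.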
EIP-1559 Transaction Decoding
Decoding these transactions is more complex because of course it is!