Customizing and Extending Python Classes
Beyond The Repo: How To Modify Degenbot Components Using Inheritance
The degenbot code base is a foundation for building EVM trading bots. I build a lot of features into the classes there, but there is a natural limit to how much I can add. For one, I do not have experience with every chain and every exchange on those chains. I target the most common platforms (Uniswap variants & Curve) and chains (Ethereum mainnet, Arbitrum, and Optimism-based L2s) because that effort will be the most helpful for the greatest number of readers.
Occasionally someone will ask for support with some obscure DEX. I’m always glad to dig into the specifics of an exchange and help them along, but I can’t spend time developing and testing a set of classes for a one-off fork.
What should users who want to build with degenbot, but need some tweaks to support their target exchange, do?
They can use the object-oriented features of Python to make “in-place” modifications and extensions without waiting on me.
But that’s easier said than done. If you’re new to programming you might not know that this is possible, or how to do it.
What About Base Szn?
Understanding objects, classes, and inheritance is broadly useful. I am presenting this on its own instead of burying it in a long post for a specific project.
It may seem that I’m getting off-track, but it will come together later. The results of the lesson will be used directly in the Base backrun project.
Objects, Classes, Inheritance
I will not cover object-oriented programming in detail here, but I do need to review some basics.
Object
Python is an object-oriented language. An object is some discrete “thing” that can be referenced by the programmer and looked up by the interpreter. It has a location in memory, and may contain data, functions, or both.
When we write a program, we are mostly creating objects, keeping track of them, and using them to perform some task.
Class
The foundation for most Python objects is a class. Classes are defined structures that encapsulate data and functions.
When we refer to the “type” of an object, we are mostly asking “what class was this object created from?”
Constructor
All classes require a special function called the constructor. In Python, the constructor is the __new__ method, which creates a new instance of the class in memory. That alone is sufficient to create an “instance” of the class, which is called an object.
Initializer
All objects in Python can be further initialized after they are instantiated. The Python interpreter will build the object by calling __new__, followed by __init__ if it is defined.
Many coders call __init__ a constructor as an accepted shorthand (I have done this, too). Since Python automatically includes a generic __new__ method that does what you want, we sometimes forget it is there and assume that the initializer performs the construction of the object in memory.
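You can observe the construction order directly. Here is a minimal sketch that records each call, assuming nothing beyond standard Python:

```python
# Demonstrates construction order: Python calls __new__ to allocate
# the object, then __init__ to initialize it.
calls = []

class Demo:
    def __new__(cls, *args):
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")
        self.value = value

d = Demo(42)
print(calls)    # ['__new__', '__init__']
print(d.value)  # 42
```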
Inheritance
A class can be built from other classes using a mechanism called inheritance. Definitions flow down from “parent” to “child”. Classes that inherit from another class are called derived classes. They automatically receive access to the functions defined in their parent classes, and can choose to provide new functions or override the ones from their parent. Overriding a method in a derived class does not modify the parent class itself, however.
In Python, inheritance is done at the class definition by providing the parent class or classes as arguments to the derived class.
Here is an example of a class to represent a message about some event that has occurred. I create a class called Event with a single attribute and an initializer:
class Event:
def __init__(self, message):
self.message = message
Simple enough. Now let’s say that I’m building an app that passes events around, but want to make them more specific.
I could make the Event class more specific by adding an attribute:
class Event:
def __init__(self, message, type):
self.message = message
self.type = type
If I have code that handles messages differently depending on their type, it would need to inspect the type attribute each time to decide how to process them. The code might not need to handle every message type, but every time you extended the Event class with a new attribute, you would need to review the handling code and adjust its logic. And what about messages that have more than one type?
For small projects this is fine, but you can see that extending a common class by adding attributes can get very complicated.
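To make the problem concrete, here is a sketch of attribute-based dispatch. Every new event type added to the shared Event class forces another branch in the handler (the handler and its return strings are my own illustration):

```python
# Attribute-based dispatch: the handler must grow a new branch for
# every event type added to the shared Event class.
class Event:
    def __init__(self, message, type):
        self.message = message
        self.type = type

def handle(event):
    if event.type == "failure":
        return f"ALERT: {event.message}"
    elif event.type == "success":
        return f"OK: {event.message}"
    else:
        return f"UNHANDLED: {event.message}"

print(handle(Event("disk full", "failure")))  # ALERT: disk full
print(handle(Event("job done", "success")))   # OK: job done
```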
As an alternative, you could use inheritance to create different event types:
class Failure(Event): ...
class Success(Event): ...
My derived Failure and Success classes do not override anything defined by the Event class, so they are entered with the most limited definition possible: a class name with the parent class in parentheses, and nothing else.
Note that the class definition requires the colon, which the Python parser will expect to be followed by something. I’m not overriding anything, so no definitions are included below it. The typical “no-op” Python keyword is pass, but an ellipsis (...) is valid syntax as well. Both will satisfy the parser, but otherwise have no effect.
Derived Class Type Checks
A nice behavior of derived classes is that they pass type checks against their parent classes. Calling isinstance(failure_evt, Event) or isinstance(success_evt, Event) would return True. So you can think of Failure and Success as an Event, but with some extra identity that comes without needing to modify the parent class.
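Putting the pieces together, a quick runnable check confirms this behavior:

```python
# Derived instances pass isinstance() checks against their parent class,
# but the parent does not pass checks against its children.
class Event:
    def __init__(self, message):
        self.message = message

class Failure(Event): ...
class Success(Event): ...

failure_evt = Failure("it broke")
success_evt = Success("it worked")

print(isinstance(failure_evt, Event))    # True
print(isinstance(success_evt, Event))    # True
print(isinstance(failure_evt, Success))  # False
```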
Overriding Specific Methods
If I wanted to create a special event that performed extra behavior in the initializer, I could override it:
class LoggedEvent(Event):
def __init__(self, message):
self.message = message
log_message(message) # log_message is defined elsewhere
In this case, LoggedEvent defines its own initializer, which performs the same step as the parent class, but also calls log_message.
Accessing Parent Methods
The difficulty with overriding methods is that sometimes you can get out of sync. If I modify Event to include a new attribute in the initializer, I would also have to apply that same attribute to the derived classes that override their own initializers.
Fortunately there is a clean solution: the built-in super() function. When a derived class calls super(), it returns a proxy for the parent class, which allows the derived class to execute the parent’s methods “as is” without having to copy-paste them.
For example, the same LoggedEvent class above could be written to initialize itself using super(), and then perform the logging:
class LoggedEvent(Event):
def __init__(self, message):
super().__init__(message)
log_message(message) # log_message is defined elsewhere
This way, all changes to the parent class are automatically applied. Calling super().__init__(message) is like copy-pasting the parent’s initializer directly into the code, but of course it’s more sophisticated and powerful. If you want to learn how Python resolves complicated inheritance with super(), look up “Method Resolution Order”.
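You can inspect the Method Resolution Order yourself via the __mro__ attribute, which lists the classes Python will search, in order, when resolving a method call:

```python
# The MRO determines which class super() delegates to at each step.
class Event:
    def __init__(self, message):
        self.message = message

class LoggedEvent(Event):
    def __init__(self, message):
        super().__init__(message)

print([cls.__name__ for cls in LoggedEvent.__mro__])
# ['LoggedEvent', 'Event', 'object']
```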
Extending UniswapLpCycle
From here, I will demonstrate how to use inheritance to develop a new arbitrage helper class.
I will extend the UniswapLpCycle class for two purposes:
Compatibility with an EIP-1153 Transient Storage executor contract.
Implementation of the recursive callback necessary to do optimized Uniswap V3 pool transfers.
Please refer to these posts if you are unfamiliar.
Since the class will derive from UniswapLpCycle, it will pass all type checks for that class, and for the more generic classes at the top of the degenbot inheritance tree.
To begin the modifications, we must first understand how UniswapLpCycle works. The basic structure that we will build from is this:
The arbitrage helper is built from an ordered iterable of Uniswap V2 or V3 liquidity pool helpers and a given input token.
When the calculate() method is called, the arbitrage helper performs a pre-calculation check using the _pre_calculation_check() method, which evaluates the instantaneous rate of exchange for the full path, starting with a swap of the input token at the first pool and proceeding through the remaining pools until the last swap returns the same token. If the pre-calculation check indicates that the rate of exchange is less than 1.0, it raises ArbitrageError to indicate that the full calculation is not worth doing.
If the pre-calculation check passes (by not raising the exception), the arbitrage helper calls an internal method _calculate(), which performs an optimization using the minimize_scalar function from SciPy to determine the initial swap amount that results in the greatest possible rate of exchange. It then builds the swap amounts for each pool from this ideal input and packages the result into an ArbitrageCalculationResult, which contains information about the arbitrage helper, the tokens, the input amount, the profit, and the swaps at each pool.
The swaps from the calculation result can be given as an input to the generate_payloads() method, which generates a set of payloads for the complete arbitrage from a given address. Each payload is a tuple with the format (target: address, calldata: bytes, value: uint256), where the target is an address, the calldata is an array of bytes sent as a raw call to that address, and the value is the amount of Ether to send with it. This is the same payload structure outlined in the Vyper Bundle Executor lesson, which I encourage you to revisit if unfamiliar.
After the payloads are generated, the job of the arbitrage helper is complete, and it is the user’s responsibility to send those payloads to a deployed executor contract.
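The pre-calculation check described above can be illustrated with a toy function. This is my own simplified sketch, not the degenbot implementation: the idea is simply that if the product of instantaneous exchange rates along the path is not greater than 1.0, the full optimization is skipped.

```python
# Toy illustration of the pre-calculation check: multiply the
# instantaneous exchange rate at each pool along the path, and bail
# out early if the round trip is unprofitable before fees.
from math import prod

class ArbitrageError(Exception): ...

def pre_calculation_check(exchange_rates: list[float]) -> None:
    if prod(exchange_rates) <= 1.0:
        raise ArbitrageError("Unprofitable path, skipping full calculation")

pre_calculation_check([1.02, 0.99])  # 1.02 * 0.99 = 1.0098 > 1.0, passes
try:
    pre_calculation_check([0.98, 1.01])  # 0.98 * 1.01 = 0.9898 <= 1.0
except ArbitrageError as exc:
    print(exc)
```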
What Needs To Change?
I will make two key changes.
The first is to change the structure of the payload format. The payloads sent to this contract will only be for swaps at a Uniswap pool, so I can make it more specific. I have settled on the format (target: address, calldata: bytes, will_callback: bool), preserving the payload size by replacing the Ether value field with a flag that controls whether the call will trigger a callback.
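For clarity, the two payload shapes can be written as NamedTuples. The class names here are my own; on-chain, each payload is just a positional 3-tuple:

```python
# The original and new payload formats, as NamedTuples. Class names
# are illustrative only; the executor contract sees plain tuples.
from typing import NamedTuple

class EtherPayload(NamedTuple):      # original UniswapLpCycle format
    target: str          # address
    calldata: bytes
    value: int           # uint256, Ether sent with the call

class CallbackPayload(NamedTuple):   # new derived-class format
    target: str          # address
    calldata: bytes
    will_callback: bool  # whether this call triggers a V3 callback

payload = CallbackPayload("0x" + "00" * 20, b"\x12\x34", True)
print(payload.will_callback)  # True
print(len(payload))           # 3, same size as the original format
```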
The second is to improve the flexibility of specifying pools for the arbitrage helper at initialization. UniswapLpCycle expects all pools to be submitted in a first → middle → last ordering, and never deviates from that expectation. This means that monitoring both directions of an X-WETH arbitrage between two different exchanges commits you to building two arbitrage helpers: one with X-WETH (exchange A) → X-WETH (exchange B), and another with X-WETH (exchange B) → X-WETH (exchange A). This chews up memory and processor time, and it would be better if the arbitrage helper could evaluate when the reverse path is profitable and adjust its behavior accordingly.
What Stays The Same?
The initializer can be reused. Hooray! All of the same pool and token checks are still valid, though it does a bit of unnecessary work building directional vectors for swapping calculations. It is not worth rewriting the whole initializer to eliminate the extra work, so I will just ignore it.
Similarly, the _build_amounts_out() method is not used by the derived class; it simply remains in place.
The magic methods __getstate__ and __setstate__ remain, which allow the derived class to be pickled and unpickled for use in multiprocessing pools.
The internal helper method _sort_overrides() remains, which allows the derived class to format pool state overrides into an easily-used dictionary.
The notify() method remains, which allows the derived class to receive events from other objects that implement a publisher/subscriber protocol. I have not talked about this yet, but will include a discussion in the next project writeup.
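As a quick aside on the __getstate__ and __setstate__ methods mentioned above, here is a simplified, self-contained example of why they matter. They let an object control what gets pickled, which multiprocessing relies on to move objects between processes (this is my own illustration, not the degenbot implementation):

```python
# __getstate__/__setstate__ let a class drop unpicklable attributes
# (like a threading.Lock) before pickling and rebuild them afterward.
import pickle
import threading

class Helper:
    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()  # locks cannot be pickled

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_lock"]             # remove the unpicklable attribute
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._lock = threading.Lock()  # rebuild it on the other side

restored = pickle.loads(pickle.dumps(Helper("arb-1")))
print(restored.name)  # arb-1
```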
Caveats, Bugs, and To-Dos
The class is still being developed and tested, so it is not perfect. For one, the class only supports two-pool arbitrage. I have not fully tested arbitrage using three or more pools with the transient storage executor, but am confident that two-pool arbitrage works as expected.
The pre-calculation check needs to be written to support a bi-directional check, instead of assuming the swap path always follows the ordering provided at build time.
It does not support V2 <> V2 arbitrage, simply because I developed it with V3 specifically in mind and wrote the executor contract to only support a V3 callback. It is simple enough to extend this to add a V2 callback and write a V2 <> V2 swap optimization calc. I want this new class and contract to support no-balance arbitrage, which would be a huge win for newcomers. Currently this is supported for V3 <> V2 and V3 <> V3, but I just need to develop and test the flash borrow capability necessary to support V2 <> V2.
There is also some ugly hackiness around establishing the bounds of the optimization calculation for V3 pools, which I’m just working around by calling get_balance() on the forward token. I will find a better way to do this.
Finally, I dislike having the payload generator tied so closely to the arbitrage helper. I want to cleanly separate these. My goal is to allow the arbitrage helper to calculate and generate a set of fully abstracted swaps. This would require users to write or use an encoder that would translate these into valid inputs to their contract, but it solves the problem of tying an arbitrage helper to a particular contract. We will see if I can pull it off!
Derived Class Helper
Here is the full definition of the new class with the two overridden methods.
Please pay particular attention to the pool type checking in _calculate and the directional logic that builds different payloads starting at the V3 pool, depending on the rate of exchange (abbreviated ROE in the comments).
I have been using this class for a week or so and it is robust and works as expected. Please review, test, make suggestions, and report bugs!