A lot of people use mock.patch() in their tests, but it's also sometimes useful to monkey-patch code at runtime. This blog post talks about why and how.
Let's imagine that you're using some library (perhaps something big, like a web framework), and for whatever reason, you're unable to update the version you're using. Meanwhile, someone comes along and reports a major vulnerability. You need to somehow deal with the vulnerability, but you're in a situation where it's really hard to update.
So, you go find the actual change that fixed the vulnerability. You want to apply it to your version of the code. What do you do?
Well, you could fork the library, but that's kind of a pain to manage. When will you move off that fork? What's the difference between your fork and the original library? What do you do if you need to update versions slightly but you still need the fork?
Or, you can grab the bits of code you actually care about, and patch the system at runtime.
Some basics of patching
Somewhere, near the start of your application, you make a call to some function, apply_all_patches(). Then, you write a function called apply_all_patches() that calls other functions like apply_patch_for_this_thing() and apply_patch_for_that_thing(), etc.
Now, let's say there's a class, SomeClass, with a method, some_function. Let's suppose there's a vulnerability in some_function, and you can see the newer version of it with the fix.
You basically do:
# fix_for_some_thing_code.py
# This module mostly contains third-party code wrapped in functions.
# Include the original license since this is mostly third-party code.

# This standalone function has the code for the method that I'm trying to replace.
# Note, even though it's a top-level function, it still accepts self because I'm
# going to inject the function into the existing class later.
def some_function(self, ...):
    ...

# fix_for_some_thing_patch.py
# This takes the above third-party code and monkey-patches it in.
import fix_for_some_thing_code

# Here, I'm injecting that code:
def apply_patch_for_this_thing():
    SomeClass._orig_some_function = SomeClass.some_function
    SomeClass.some_function = fix_for_some_thing_code.some_function
Here are a couple of trivial functions to make it easier:
def patch(obj, attribute_name, new_value):
    setattr(obj, f"_orig_{attribute_name}", getattr(obj, attribute_name, None))
    setattr(obj, attribute_name, new_value)

def patch_multiple(obj, attribute_names, copy_from_obj):
    for attribute_name in attribute_names:
        new_value = getattr(copy_from_obj, attribute_name, None)
        patch(obj, attribute_name, new_value)
Now, we can just write:
def apply_patch_for_this_thing():
    patch(SomeClass, "some_function", fix_for_some_thing_code.some_function)
    # Alternatively, if you have a bunch of patches:
    patch_multiple(SomeClass, ["some_function", "some_other_function"], fix_for_some_thing_code)
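To see the helper in action, here's a small self-contained sketch. The SomeClass below is a toy stand-in for the library class, and the "vulnerability" (missing escaping) and fixed_some_function are invented purely for illustration.

```python
def patch(obj, attribute_name, new_value):
    # Save the original under _orig_<name>, then install the replacement.
    setattr(obj, f"_orig_{attribute_name}", getattr(obj, attribute_name, None))
    setattr(obj, attribute_name, new_value)


class SomeClass:  # toy stand-in for the third-party class
    def some_function(self, text):
        return text  # pretend this is vulnerable: no escaping


def fixed_some_function(self, text):
    # Hypothetical fix: escape angle brackets.
    return text.replace("<", "&lt;")


instance = SomeClass()  # created before the patch is applied
patch(SomeClass, "some_function", fixed_some_function)

# Methods are looked up on the class at call time, so even an instance
# created before the patch picks up the new behavior.
patched_result = instance.some_function("<b>")
original_result = instance._orig_some_function("<b>")
```

Because the original is kept as _orig_some_function on the class, you can still call it (or restore it) later.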
Dealing with imports
Python's from a import b can make a patcher's life difficult.
Let's say you have two modules, module_with_vuln and module_that_imported_from_module_with_vuln.
Depending on how module_that_imported_from_module_with_vuln is written, it can make your life either more or less painful. And, let's imagine there are a ton of modules that import from module_with_vuln.
If the problem is in a class's method, it's no big deal. You can just replace the method in the class.
If the thing that you have to replace is something immutable like an int, function, or enum, life becomes harder, because you can't fix it in place; you can only rebind the names that point to it.
Let's imagine the problem is in some top_level_function inside module_with_vuln. Let's imagine that module_that_imported_from_module_with_vuln has code like from module_with_vuln import top_level_function. Even if you update module_with_vuln.top_level_function, it won't matter because module_that_imported_from_module_with_vuln.top_level_function still points to the original function. Anyone who used mock.patch in their tests is familiar with this problem.
To deal with it, you have to focus on replacing module_that_imported_from_module_with_vuln.top_level_function with your new module_with_vuln.top_level_function after you've already patched module_with_vuln. Basically, you have two places that you have to monkey-patch.
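The two-place problem can be reproduced in a few lines. This sketch builds both modules in memory with types.ModuleType and exec so it runs without any files on disk; that setup, plus the names importer, call_it, and fixed_top_level_function, are assumptions for the demo and not part of any real library.

```python
import sys
import types

# Throwaway module_with_vuln, registered so "from ... import" can find it.
module_with_vuln = types.ModuleType("module_with_vuln")
exec(
    "def top_level_function():\n"
    "    return 'vulnerable'\n",
    module_with_vuln.__dict__,
)
sys.modules["module_with_vuln"] = module_with_vuln

# A second module that copies the reference at import time.
importer = types.ModuleType("module_that_imported_from_module_with_vuln")
exec(
    "from module_with_vuln import top_level_function\n"
    "def call_it():\n"
    "    return top_level_function()\n",
    importer.__dict__,
)


def fixed_top_level_function():
    return "fixed"


# Patch #1: the defining module.
module_with_vuln.top_level_function = fixed_top_level_function
via_module = module_with_vuln.top_level_function()
via_importer = importer.call_it()  # still bound to the old function

# Patch #2: the importing module's own reference.
importer.top_level_function = fixed_top_level_function
via_importer_after = importer.call_it()

sys.modules.pop("module_with_vuln", None)  # clean up the demo module
```

After the first patch, via_module is "fixed" while via_importer is still "vulnerable"; only the second patch brings the importing module in line.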
If you have a lot of things to patch, you might be asking if you can just swap out the entire module in sys.modules, but that actually won't help if other modules have already run and done their imports. If you can really be the first thing to run, then you might be able to pull this trick off, but it's actually subtly harder than you might think.
Anyway, if what you have to patch is a method in a class, it's easy to just patch that one method in the class, but if what you have to patch is something like an int at the top level, you have no choice but to chase down all the places that import it and patch their references too.
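If there are many importing modules, one way to chase down the references is to scan sys.modules. This is a hedged sketch, not a complete solution: replace_references is a hypothetical helper, it only rebinds module-level names, and it matches by identity, so it's only safe for objects like functions or classes. Don't use it for small ints or strings, which CPython may share between unrelated names.

```python
import sys
import types


def replace_references(old_value, new_value):
    """Rebind every module-level name currently bound to old_value."""
    patched = []
    for module_name, module in list(sys.modules.items()):
        if not isinstance(module, types.ModuleType):
            continue  # sys.modules can hold None or other placeholders
        for attr_name, value in list(vars(module).items()):
            if value is old_value:
                setattr(module, attr_name, new_value)
                patched.append((module_name, attr_name))
    return patched


# Demo with two in-memory modules (names are made up for the example).
demo_vuln = types.ModuleType("demo_module_with_vuln")
exec("def top_level_function():\n    return 'vulnerable'\n", demo_vuln.__dict__)
demo_importer = types.ModuleType("demo_importer")
demo_importer.top_level_function = demo_vuln.top_level_function
sys.modules["demo_module_with_vuln"] = demo_vuln
sys.modules["demo_importer"] = demo_importer


def fixed_top_level_function():
    return "fixed"


patched = replace_references(demo_vuln.top_level_function, fixed_top_level_function)

sys.modules.pop("demo_module_with_vuln", None)
sys.modules.pop("demo_importer", None)
```

References stored elsewhere (class attributes, default arguments, dictionaries, closures) won't be found by this scan, so you'd still need to check those by hand.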
By the way, you have to be really careful about entirely redefining classes or modules. If you have some class, ClassWithVuln, and you entirely redefine it, there might be some code out there that imported the old version of ClassWithVuln and is doing stuff like isinstance(some_object, ClassWithVuln). If some_object is an instance of the new ClassWithVuln, but the import is for the old ClassWithVuln, then isinstance is going to return False.
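Here's a tiny sketch of that isinstance trap. OldClassWithVuln stands in for the reference some other module grabbed before the redefinition; the method bodies are invented for the example.

```python
class ClassWithVuln:
    def method(self):
        return "vulnerable"


# Pretend another module did "from lib import ClassWithVuln" at this point.
OldClassWithVuln = ClassWithVuln


# Redefining the class creates a brand-new, unrelated class object.
class ClassWithVuln:
    def method(self):
        return "fixed"


some_object = ClassWithVuln()
check_against_old = isinstance(some_object, OldClassWithVuln)
check_against_new = isinstance(some_object, ClassWithVuln)


# Patching the original class in place avoids the identity split.
def fixed_method(self):
    return "fixed"


OldClassWithVuln.method = fixed_method
old_instance = OldClassWithVuln()
in_place_result = old_instance.method()
still_old_class = isinstance(old_instance, OldClassWithVuln)
```

Here check_against_old is False even though both classes share a name, while the in-place patch fixes the method and keeps isinstance checks working.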
There's another weird edge case. Let's say that we're replacing some_function_with_vuln, and the code is a closure that uses some globals like SomeHarmlessOtherClass. You want to make sure that the old code and the new code reference the exact same SomeHarmlessOtherClass. So, in your fix_for_some_thing_code.py, you may want to import things from the original module that had the vulnerability:
# fix_for_some_thing_code.py
from module_with_vuln import SomeHarmlessOtherClass
# This standalone function has the code for the method that I'm trying to replace:
def some_function(self, ...):
    ...
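To make the difference concrete, this sketch compares a fix module that copies the class definition with one that imports it from the original module. The modules are built in memory for the demo, and fix_with_copied_class and fix_with_shared_class are hypothetical names.

```python
import sys
import types

# In-memory stand-in for the original library module.
module_with_vuln = types.ModuleType("module_with_vuln")
exec("class SomeHarmlessOtherClass:\n    pass\n", module_with_vuln.__dict__)
sys.modules["module_with_vuln"] = module_with_vuln

# Fix module that pasted in its own copy of the class definition.
copied = types.ModuleType("fix_with_copied_class")
exec(
    "class SomeHarmlessOtherClass:\n"
    "    pass\n"
    "def some_function():\n"
    "    return SomeHarmlessOtherClass()\n",
    copied.__dict__,
)

# Fix module that imports the class from the original module.
shared = types.ModuleType("fix_with_shared_class")
exec(
    "from module_with_vuln import SomeHarmlessOtherClass\n"
    "def some_function():\n"
    "    return SomeHarmlessOtherClass()\n",
    shared.__dict__,
)

Original = module_with_vuln.SomeHarmlessOtherClass
copied_is_compatible = isinstance(copied.some_function(), Original)
shared_is_compatible = isinstance(shared.some_function(), Original)

sys.modules.pop("module_with_vuln", None)  # clean up the demo module
```

A function looks up globals in the module where it was defined, so the copied version produces objects the rest of the library won't recognize, while the imported version stays compatible.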
One more trick. Write a test like:
def test_remember_the_patch_when_upgrading_the_library(self):
    if some_library.__version__ != "1.2.3":
        raise AssertionError("Remember to update or remove the patch for some_library")
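Here's a runnable variation of that reminder, pulled into a plain function so you can see it both pass and fail. The helper name assert_patch_still_matches and the SimpleNamespace stand-ins for some_library are assumptions for the demo. Not every library exposes __version__; importlib.metadata.version() from the standard library is another way to look up an installed distribution's version.

```python
import types

PATCHED_VERSION = "1.2.3"  # the version the patch was written against


def assert_patch_still_matches(library, patched_version=PATCHED_VERSION):
    """Fail loudly when the library version no longer matches the patch."""
    if library.__version__ != patched_version:
        raise AssertionError(
            "Remember to update or remove the patch for some_library "
            f"(patched {patched_version}, found {library.__version__})"
        )


# Stand-ins so the sketch runs without a real some_library installed.
some_library = types.SimpleNamespace(__version__="1.2.3")
upgraded_library = types.SimpleNamespace(__version__="1.2.4")

assert_patch_still_matches(some_library)  # same version: passes silently

try:
    assert_patch_still_matches(upgraded_library)
    reminder = None
except AssertionError as error:
    reminder = str(error)
```

Comparing for exact equality is deliberate: any version change, even a patch release, should make someone look at whether the monkey-patch is still needed.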
Summary
In summary, my advice is:
- Remember to save a reference to the thing you're replacing, like _orig_function_with_vuln.
- Whenever possible, stick to replacing individual functions/methods.
- Avoid creating new modules; patch the existing ones in place instead.
- Avoid creating new classes; patch the existing ones in place instead.
- When creating a new function, make sure its closure is closing around the same instances that the original function was closing around.
- If the thing you're trying to replace is something immutable, like an int, function, enum, etc., and other modules are using from module_with_vuln import something_immutable, you're going to have to chase down those other modules and replace those references.
- Use a test to remind future developers to update or remove your patch when they update the library.
So, dynamically patching your libraries to work around vulnerabilities is definitely a useful technique. But, if the patching gets too extensive, you might decide to just bite the bullet and do the upgrade. In some cases, you might also decide that the vulnerability just isn't severe enough to worry about.