Exocet: A Second Look

What I didn't really talk about last time is that, beyond letting you express module loading directly, Exocet also implements:

Parameterized Modules



In Python, classes and functions take parameters, but modules don't. When you load a module, you don't have any opportunity to tell it what you want, or what context it's being loaded in. A lot of times, this matters.

For example, some code has optional dependencies. This idiom can be seen a lot in some parts of Twisted:

try:
    from OpenSSL import SSL
except ImportError:
    SSL = None


Code following this checks if 'SSL' is None to decide whether to define methods and classes that provide support for SSL connections in Twisted.
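
As a rough illustration of that pattern (the `Connector` class and its method names are made up for this sketch and are not Twisted's actual API), the check tends to look something like this:

```python
try:
    from OpenSSL import SSL
except ImportError:
    SSL = None


class Connector:
    """Hypothetical class, used only to illustrate the idiom."""

    def connect_tcp(self, host, port):
        return ("tcp", host, port)

    if SSL is not None:
        # This method only exists when the OpenSSL import above succeeded.
        def connect_ssl(self, host, port):
            return ("ssl", host, port)


def supports_ssl():
    """Report whether the SSL-only method was defined at import time."""
    return hasattr(Connector, "connect_ssl")
```

The shape of the class is decided once, at import time, by whatever happens to be installed, and the code importing it has no say in the matter.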


Other code can depend on one of multiple providers of an interface. Here's the famous example from the docs for the magnificent lxml library:


try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        # Python 2.5
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        try:
            # Python 2.5
            import xml.etree.ElementTree as etree
            print("running with ElementTree on Python 2.5+")
        except ImportError:
            try:
                # normal cElementTree install
                import cElementTree as etree
                print("running with cElementTree")
            except ImportError:
                try:
                    # normal ElementTree install
                    import elementtree.ElementTree as etree
                    print("running with ElementTree")
                except ImportError:
                    print("Failed to import ElementTree from any known place")


The effect of this code is to try to import, in order, one of:

  1. lxml
  2. cElementTree from the stdlib
  3. ElementTree from the stdlib
  4. cElementTree installed separately
  5. ElementTree installed separately


I don't think it's a stretch to say this is rather silly. How would you feel if you saw a function that tried to access five different global variables in a row in order to decide what to do?

And though all of these modules implement the same interface, they're still different code, and you might hit some edge case where their behaviour differs. How do you test your code using each possible ElementTree implementation? As Glyph points out, it's always sunny in Python, and every piece of bad code can be worked around by writing worse code; you could have your unit tests fool around with sys.modules. But can't there be something better?
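
For the record, here is roughly what that sys.modules fooling-around looks like. This is a sketch with made-up helper names (`pick_etree`, `pick_etree_without_lxml`); it relies on the fact that a `None` entry in sys.modules makes importing that name raise ImportError:

```python
import sys
from unittest import mock


def pick_etree():
    """Return lxml.etree if it can be imported, else the stdlib ElementTree."""
    try:
        from lxml import etree
    except ImportError:
        import xml.etree.ElementTree as etree
    return etree


def pick_etree_without_lxml():
    """Run pick_etree() while pretending lxml is not installed."""
    # patch.dict puts sys.modules back the way it was when the block exits.
    with mock.patch.dict(sys.modules, {"lxml": None, "lxml.etree": None}):
        return pick_etree()


print(pick_etree_without_lxml().__name__)
```

It works, but the test is now reaching into interpreter-wide state to change what a single module sees.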

Here's a different way of thinking about it entirely.

In languages that aren't as good as Python, dependency injection is a technique that gets used to deal with this. Dependency injection has many forms and can be rather complicated, but the general idea is that code declares the things it needs, and something else (unfortunately, often an XML file!) describes what objects to provide to satisfy those dependencies.
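
In plain Python, the simplest version of that idea is passing the dependency in as an argument. A minimal sketch, where `count_items` is a made-up function and `etree` can be any module that implements the ElementTree interface:

```python
import io
import xml.etree.ElementTree as stdlib_etree


def count_items(source, etree):
    """Count top-level <item> elements using whichever etree module is passed in."""
    tree = etree.parse(source)
    return len(tree.getroot().findall("item"))


doc = io.BytesIO(b"<root><item/><item/></root>")
print(count_items(doc, stdlib_etree))  # 2
```

That's fine for one function, but it gets tedious when every function in a module needs the same argument threaded through it.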


With Exocet, we hijack the import statement to describe named parameters, indicating the dependencies our code has:

from exocet.parameters import etree

x = etree.parse(open("mydata.xml"))


Now, if you look in the Exocet tarball, you won't find a parameters.py file. This name doesn't correspond to anything on the filesystem.

So how does this help us? Well, it means you can load your ElementTree-using module like this:

import exocet
import xml.etree.cElementTree

m = exocet.pep302Mapper.withOverrides({"exocet.parameters.etree":
                                       xml.etree.cElementTree})
my_etree_using_module = exocet.loadNamed("my_etree_using_module", m)


Or this:

import lxml.etree

m = exocet.pep302Mapper.withOverrides({"exocet.parameters.etree":
                                       lxml.etree})
my_etree_using_module = exocet.loadNamed("my_etree_using_module", m)


This way, you can test your code without having to resort to sys.modules hackery, and you can better factor your applications by separating configuration and environment concerns from the rest of your code.

As you may recall from last time, pep302Mapper provides the default Python implementation of module loading. In this case, we've just added another name that can be imported, exocet.parameters.etree.
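
To get a feel for how a name can be importable without existing on disk, here is a toy loader. It is emphatically not how Exocet is implemented; `load_with_overrides` and the `toy.parameters` name are invented for this sketch. It just runs a module's source with an `__import__` hook that consults a dict of overrides first:

```python
import builtins
import types
import xml.etree.ElementTree as stdlib_etree


def load_with_overrides(name, source, overrides):
    """Run `source` as a new module whose imports check `overrides` first.

    `overrides` maps dotted names such as "toy.parameters.etree" to the
    objects that should be bound when the module imports them.
    """
    def import_hook(modname, globals=None, locals=None, fromlist=(), level=0):
        keys = [modname + "." + attr for attr in (fromlist or ())]
        if keys and all(key in overrides for key in keys):
            # Build a throwaway namespace holding just the requested names;
            # nothing is read from the filesystem or from sys.modules.
            namespace = types.ModuleType(modname)
            for attr, key in zip(fromlist, keys):
                setattr(namespace, attr, overrides[key])
            return namespace
        # Anything not overridden goes through the normal import machinery.
        return builtins.__import__(modname, globals, locals, fromlist, level)

    module = types.ModuleType(name)
    # Code executed with these globals resolves `import` through the hook.
    module.__dict__["__builtins__"] = dict(vars(builtins), __import__=import_hook)
    exec(compile(source, name, "exec"), module.__dict__)
    return module


SOURCE = """
from toy.parameters import etree

def root_tag(text):
    return etree.fromstring(text).tag
"""

# Same source, two different providers for the "etree" parameter.
real = load_with_overrides("demo", SOURCE, {"toy.parameters.etree": stdlib_etree})
stub_etree = types.SimpleNamespace(
    fromstring=lambda text: types.SimpleNamespace(tag="stub"))
stubbed = load_with_overrides("demo", SOURCE, {"toy.parameters.etree": stub_etree})

print(real.root_tag("<hello/>"))     # hello
print(stubbed.root_tag("<hello/>"))  # stub
```

The module's source never changes; only the mapping handed to the loader does, and nothing global gets touched along the way.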

Being able to provide parameters to modules opens up the possibility of eliminating all use of global objects from your code, and passing objects only to the code that needs them. I believe that with experimentation and refinement of these tools, this technique is going to enable a lot of simpler methods for organizing complex applications and remove a lot of the complications people have to deal with in Python now.
