IronPython - tackling some unexpected Exceptions

After I listened to an episode of Talk Python To Me featuring Alex Earl from the IronPython project I learned that not only is IronPython not dead/dying, but it's actually seeing a bit of a resurgence recently. I grabbed the sources from the IronLanguages GitHub, setup my dev environment, opened it up and launched the IronPythonConsole project hoping to see the familiar python REPL.

However instead I saw that it had hit an exception:

I was frustrated at first, thinking I'd done something wrong, but realised that getting to the bottom of an Exception was a fine way to introduce yourself to a new codebase.

The exception itself is a ZipImportError with the text "not a Zip file" and is thrown in the constructor for zipimporter.

Python Confession #1: I had never heard of or used zipimporter before.

Since I'd never heard of the class before I had no idea why the IronPython runtime would be calling this and especially on something which didn't appear to exist. So it's time to dig through the call stack to see where this comes from:

It appears that PythonCommandLine.ImportSite kicks this process off so that's where I started looking:

    private void ImportSite() {
        if (Options.SkipImportSite)
            return;
    
        try {
            Importer.ImportModule(PythonContext.SharedContext, null, "site", false, -1);
        } catch (Exception e) {
            Console.Write(Language.FormatException(e), Style.Error);
        }
    }

It turns out that site is a special Python module which is imported by default when the interpreter starts (for all implementations - IronPython, JythonPyPy and good old vanilla CPython). It's responsible for doing some platform-specific module path setup.

Python Confession #2: I had never heard of the site module before.

So how does importing site cause us to execute some code relating to zipimporter? Searching through the call stack at the point of the Exception shows that all this seems to come from FindImporterForPath which takes every function in path_hooks and attempts to apply it to the path we're importing.

    /// 
    /// Finds a user defined importer for the given path or returns null if no importer
    /// handles this path.
    /// 
    private static object FindImporterForPath(CodeContext/*!*/ context, string dirname) {
        List pathHooks = PythonContext.GetContext(context).GetSystemStateValue("path_hooks") as List;

        foreach (object hook in (IEnumerable)pathHooks) {
            try {
               object handler = PythonCalls.Call(context, hook, dirname);

                if (handler != null) {
                    return handler;
                }
            } catch (ImportException) {
                // we can't handle the path
            }
        }

So we call every path_hook with the module we're importing as an argument using PythonCalls.Call()The path_hooks themselves come from the sys module:

    sys.path_hooks

A list of callables that take a path argument to try to create a finder for the path. If a finder can be created, it is to be returned by the callable, else raise ImportError.

Python Confession #3: I had never heard of or used path_hooks before.

So what is in path_hooks? If I keep hitting continue on the Visual Studio debugger the Exception is caught, I reach the python REPL and can inspect what is in sys.path_hooks:

And there it is - zipimporter. Now we're approaching an explanation - when the IronPython interpreter is initialised it imports the site module which takes the everything in path_hooks and applies them to all the modules in our path - but since there are no .zip files anywhere in our path zipimporter (the only path hook) cannot find anything to operate on, so throws an exception which is normally caught and handled.

So this is normal behaviour - the exception is even expected, since path_hooks' documentation states that if a path_hook fails it raises an Exception.

Conclusion

OK nothing special has happened here since IronPython is behaving exactly as it should, however unexpected it was to me. That said, this is a very nice way to learn some new Python concepts, like:

  1. the zipimport module
  2. the site module
  3. sys.path_hooks 

I even have a half-working clone of zipimporter for tarballs called tgzimporter, but there's little need for such functionality as I suspect that even zipimporter is seldom used. 

It would've been easy to just keep hitting the "F5" key until I hit the REPL, but then I would likely have struggled to find a way to approach the source code and perhaps would've put it off indefinitely. Hopefully now I'll find some way to continue to improve my understanding and contribute to the IronPython project.

1 response
Sean, you're absolutely right except one thing: this damn "IronyPython" is a complete non-working junk :) Devs might think (but I bet they don't!) "it's not important, we're did a great job and waste a lot of money", but this is BS too :) It's not a product at all! You might download and automate normal Python (use console redirection and other trick) and don't have these EXTREMILY stupid bugs.