Python -- import
How does import work in Python
The findings below are based on CPython at commit 6066739ff7794e54c98c08b953a699cbc961cd28.
Bootstrap
This step generates two C files, Python/frozen.c and Python/deepfreeze/deepfreeze.c, from the importlib Python files. See the relevant part of the Makefile below.
############################################################################
# frozen modules (including importlib)
#
# Freezing is a multi step process. It works differently for standard builds
# and cross builds. Standard builds use Programs/_freeze_module and
# _bootstrap_python for freezing and deepfreezing, so users can build Python
# without an existing Python installation. Cross builds cannot execute
# compiled binaries and therefore rely on an external build Python
# interpreter. The build interpreter must have same version and same bytecode
# as the host (target) binary.
#
# Standard build process:
# 1) compile minimal core objects for Py_Compile*() and PyMarshal_Write*().
# 2) build Programs/_freeze_module binary.
# 3) create frozen module headers for importlib and getpath.
# 4) build _bootstrap_python binary.
# 5) create remaining frozen module headers with
# ``./_bootstrap_python Programs/_freeze_module.py``. The pure Python
# script is used to test the cross compile code path.
# 6) deepfreeze modules with _bootstrap_python
#
# Cross compile process:
# 1) create all frozen module headers with external build Python and
# Programs/_freeze_module.py script.
# 2) deepfreeze modules with external build Python.
#
The steps are as follows:
- Compile Programs/_freeze_module.c to an executable, _freeze_module. This executable compiles Python code to byte code.
- Call _freeze_module with all freeze-in files, namely the FROZEN_FILES_IN list. The output files are FROZEN_FILES_OUT. For example,
  ./_freeze_module importlib._bootstrap $(srcdir)/Lib/importlib/_bootstrap.py Python/frozen_modules/importlib._bootstrap.h
  generates importlib._bootstrap.h from _bootstrap.py. The output file contains the compiled byte code of the input file.
- Call freeze_module.py on FROZEN_FILES_IN to generate Python/frozen.c.
- Call deepfreeze.py on FROZEN_FILES_OUT to generate Python/deepfreeze/deepfreeze.c.
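The core idea of freezing can be sketched in a few lines of Python: compile the source to a code object, serialize it with marshal, and store the bytes where C code can reach them. This is only an illustration, not the real tool. The helper names (freeze_source, as_c_array, load_frozen) and the C array layout are assumptions made for the example, and the real headers generated by _freeze_module differ in format.

```python
import marshal


def freeze_source(source: str, module_name: str) -> bytes:
    """Compile Python source and serialize the resulting code object."""
    code = compile(source, f"<frozen {module_name}>", "exec")
    return marshal.dumps(code)


def as_c_array(data: bytes, symbol: str) -> str:
    """Render bytes as a C array literal (hypothetical layout for illustration)."""
    body = ",".join(str(byte) for byte in data)
    return f"const unsigned char {symbol}[] = {{{body}}};"


def load_frozen(data: bytes) -> dict:
    """Deserialize the code object and run it in a fresh namespace."""
    namespace = {}
    exec(marshal.loads(data), namespace)
    return namespace


frozen = freeze_source("ANSWER = 6 * 7\n", "demo")
header = as_c_array(frozen, "_Py_M__demo")  # symbol name is made up
namespace = load_frozen(frozen)
```

Loading the marshalled bytes and executing them is roughly what the interpreter does with a frozen module at startup, which is why no .py file needs to exist on disk for importlib itself.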
Registration
__import__ is a builtin function. It is registered during the Python interpreter's initialization stage. See the code snippet below.
// bltinmodule.c.h
#define BUILTIN___IMPORT___METHODDEF \
{"__import__", (PyCFunction)(void(*)(void))builtin___import__, METH_FASTCALL|METH_KEYWORDS, builtin___import____doc__},
It is assigned to the interpreter state field interp->import_func in the function pylifecycle.c#pycore_init_builtins.
// Get the __import__ function
PyObject *import_func = _PyDict_GetItemStringWithError(interp->builtins,
"__import__");
if (import_func == NULL) {
goto error;
}
interp->import_func = Py_NewRef(import_func);
The actual implementation call chain is shown below.
builtin___import__
-> PyImport_ImportModuleLevelObject
-> mod = import_find_and_load(tstate, abs_name)
-> mod = PyObject_CallMethodObjArgs(interp->importlib, &_Py_ID(_find_and_load),abs_name, interp->import_func, NULL);
So in the end, it calls the _find_and_load function in Lib/importlib/_bootstrap.py. How can C code call into a Python file? As described in the Bootstrap section, this Python file is compiled to byte code and saved in a C file. See the Initialization section below to understand how Python loads this byte code when the interpreter starts. Note that registration happens before the Initialization step. This is harmless, because registration only stores the function; __import__ is not called yet.

Also, within the call stack above, Python resolves a relative import path into an absolute module name, and it takes different strategies for the syntax import a.b.c and from a.b import c.
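The difference between the two syntaxes is visible from Python by calling __import__ directly. The sketch below uses the standard library package importlib.util purely as a convenient example target. With an empty fromlist (the import a.b form) the top-level package is returned, while a non-empty fromlist (the from a.b import c form) returns the submodule itself.

```python
import builtins
import importlib
import importlib.util

# Equivalent of "import importlib.util": the top-level package is returned,
# because that is the name bound in the importing namespace.
top_level = builtins.__import__("importlib.util")

# Equivalent of "from importlib.util import find_spec": the submodule is
# returned so that the requested attribute can be fetched from it.
submodule = builtins.__import__("importlib.util", fromlist=["find_spec"])
```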
Initialization
importlib is initialized when the Python interpreter is initialized. There are a few steps.
- The first step is to load the frozen module. The main function is import.c#PyImport_ImportFrozenModule("_frozen_importlib"). It first finds the module information for _frozen_importlib, which is inside frozen.c. More precisely, frozen.c borrows deepfreeze.c#_Py_get_importlib__bootstrap_toplevel for loading the frozen byte code. Call sequence:
  pylifecycle.c#init_importlib
  -> import.c#PyImport_ImportFrozenModule("_frozen_importlib")
    -> find_frozen("_frozen_importlib", &info)
    -> unmarshal_frozen_code(&info)
    -> d = module_dict_for_exec(tstate, name); // get the dictionary associated with _frozen_importlib module
      -> m = import_add_module(tstate, name)
         // "_frozen_importlib" module is not found, so create a new module
         // and add it to "tstate->interp->modules".
         // The newly created "_frozen_importlib" does not have builtin functions,
         // so after creating it, we also add builtins to its "md_dict".
    -> m = exec_code_in_module(tstate, name, d, co);
      -> v = PyEval_EvalCode(code_object, module_dict, module_dict); // run the frozen byte code, so get the _frozen_importlib
      -> m = import_get_module(tstate, name)
- After the _frozen_importlib module is found, it is added to tstate->interp->modules. Also, interp->importlib is set to the loaded _frozen_importlib.
- Bootstrap the _imp module. (TODO: study in detail)
- Call the importlib._install method.
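The result of these initialization steps can be observed from a running interpreter. The sketch below assumes CPython, where the frozen bootstrap module is registered in sys.modules under the name _frozen_importlib and is the same object that the importlib package exposes as importlib._bootstrap. Other Python implementations may behave differently.

```python
import importlib
import sys

# The module created from the frozen byte code during interpreter startup.
frozen_bootstrap = sys.modules.get("_frozen_importlib")

# On CPython, importlib reuses the frozen module instead of re-importing
# Lib/importlib/_bootstrap.py from disk.
same_module = frozen_bootstrap is importlib._bootstrap

# _find_and_load is the function that the C import machinery calls into.
has_entry_point = hasattr(frozen_bootstrap, "_find_and_load")

# _imp is compiled into the interpreter rather than loaded from a file.
imp_is_builtin = "_imp" in sys.builtin_module_names
```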
Execution
When Python compiles import a.b.c, it emits the byte code IMPORT_NAME. The interpreter loop in ceval.c handles this instruction by calling import_name:
static PyObject *
import_name(PyThreadState *tstate, _PyInterpreterFrame *frame,
PyObject *name, PyObject *fromlist, PyObject *level)
{
PyObject *import_func, *res;
PyObject* stack[5];
import_func = _PyDict_GetItemWithError(frame->f_builtins, &_Py_ID(__import__));
if (import_func == NULL) {
if (!_PyErr_Occurred(tstate)) {
_PyErr_SetString(tstate, PyExc_ImportError, "__import__ not found");
}
return NULL;
}
PyObject *locals = frame->f_locals;
/* Fast path for not overloaded __import__. */
if (import_func == tstate->interp->import_func) {
int ilevel = _PyLong_AsInt(level);
if (ilevel == -1 && _PyErr_Occurred(tstate)) {
return NULL;
}
res = PyImport_ImportModuleLevelObject(
name,
frame->f_globals,
locals == NULL ? Py_None :locals,
fromlist,
ilevel);
return res;
}
Py_INCREF(import_func);
stack[0] = name;
stack[1] = frame->f_globals;
stack[2] = locals == NULL ? Py_None : locals;
stack[3] = fromlist;
stack[4] = level;
res = _PyObject_FastCall(import_func, stack, 5);
Py_DECREF(import_func);
return res;
}
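You can see that it looks up __import__ in the frame's builtins and stores it in the variable import_func. It then compares import_func with tstate->interp->import_func. In the Registration section, we saw that interp->import_func is set to the builtin __import__, so it may seem that this comparison is always true. However, user code can replace builtins.__import__ with its own function. In that case the comparison fails, the fast path is skipped, and the replacement is called through _PyObject_FastCall with the same five arguments. Import hooks installed through sys.meta_path or sys.path_hooks do not replace __import__, so they do not affect this check.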
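Both halves of this behavior can be checked from Python. The sketch below first uses dis to confirm that an import statement compiles to IMPORT_NAME, then temporarily replaces builtins.__import__ with a wrapper to show that import statements are routed through it. The wrapper name tracing_import and the calls list are made up for the example.

```python
import builtins
import dis

# An import statement compiles to the IMPORT_NAME instruction.
code = compile("import a.b.c", "<demo>", "exec")
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

calls = []
original_import = builtins.__import__


def tracing_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Record each request, then delegate to the real __import__."""
    calls.append((name, tuple(fromlist or ()), level))
    return original_import(name, globals, locals, fromlist, level)


# With __import__ overridden, import_name takes the generic call path.
builtins.__import__ = tracing_import
try:
    import json
    from os import path
finally:
    # Always restore the original so later imports use the fast path again.
    builtins.__import__ = original_import
```

Because the wrapper is installed in the shared builtins namespace, nested imports triggered while a module is loading are recorded as well, so calls may contain more entries than the two statements shown.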
Summary
Having walked through the import mechanism in Python, we now know that Lib/importlib/_bootstrap.py is the place to find all the detailed logic.