Python’s Innards: Hello, ceval.c!
2010/09/02
The “Python’s Innards” series owes its existence, at least in part, to hearing one of the Python-Fu masters in my previous workplace say something about a switch statement so large that it had to be broken up just so some compilers wouldn’t choke on it. I remember thinking then: “Choke the compiler with a switch? Hrmf, let me see that code.” Turns out that this switch can be found in ./Python/ceval.c: PyEval_EvalFrameEx and it switches over the current opcode, invoking its implementation. If I had to summarize all of CPython into one line, I’d probably choose that switch (actually I’d refuse, but humour me by assuming I was at gunpoint or something). This choice is rather subjective, as arguably there are more complex/interesting bits in Python’s object system (explored here and there) or parser/compiler related code. But I can’t help seeing that line, and its surrounding function and file, as the ‘do-work’ heart of CPython.
The reason I didn’t start the series from this heart is that I thought it would be too hard (mostly for the author…). Thanks to what we (well, at least I) learned in the previous posts, I think we can now understand it quite well. I’ll try to link backwards as necessary throughout the article, but if you haven’t followed the series so far, you’d probably do much better if you went back and read some of the previous articles before tackling this one. Also, for brevity’s sake in this post, I won’t qualify the file ./Python/ceval.c and the function PyEval_EvalFrameEx in it. Finally, remember that usually in the series when I quote code, I may note that I edited it, and in that case I often prefer clarity and brevity over accuracy; this is true for this post as well, only much more so, excerpts here might bear only slight resemblance to the real code.
So, where were we… Ah, yes, monstrous switch statement. Well, as I said, this switch can be found in the rather lengthy file ceval.c, in the rather lengthy function PyEval_EvalFrameEx, which takes more than half the file’s lines (it’s roughly 2,250 lines, the file is about 4,400). PyEval_EvalFrameEx implements CPython’s evaluation loop, which is to say that it’s a function that takes a frame object and iterates over each of the opcodes in its associated code object, evaluating (interpreting, executing) each opcode within the context of the given frame (this context is chiefly the associated namespaces and interpreter/thread states). There’s more to ceval.c than PyEval_EvalFrameEx, and we may discuss some of the other bits later in this post (or perhaps a follow-up post), but PyEval_EvalFrameEx is obviously the most important part of it.
Having described the evaluation loop in the previous paragraph, let’s see what it looks like in C (edited):
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
    /* variable declaration and initialization stuff */
    for (;;) {
        /* do periodic housekeeping once in a few opcodes */
        opcode = NEXTOP();
        if (HAS_ARG(opcode)) oparg = NEXTARG();
        switch (opcode) {
        case NOP:
            goto fast_next_opcode;
        /* lots of more complex opcode implementations */
        default:
            /* become rather unhappy */
        }
        /* handle exceptions or runtime errors, if any */
    }
    /* we are finished, pop the frame stack */
    tstate->frame = f->f_back;
    return retval;
}
As you can see, iteration over opcodes is infinite (forever: fetch next opcode, do stuff); breaking out of the loop must be done explicitly. CPython (reasonably) assumes that evaluated bytecode is correct in the sense that it terminates itself by raising an exception, returning a value, etc. Indeed, if you were to synthesize a code object without a RETURN_VALUE at its end and execute it (exercise to reader: how?), you’re likely to execute rubbish, reach the default handler (which raises a SystemError) or maybe even segfault the interpreter (I didn’t check this thoroughly, but it looks plausible).
The evaluation loop may look fairly simple so far, but I kept back an important piece: I snipped about 1,450 lines of opcode implementations from within that big switch, all of them presumably more complex than a NOP. In order for you to be able to get a feel for what more serious opcode implementations look like, here’s the (edited) implementation of three more opcodes, illustrating a few more principles:
case BINARY_SUBTRACT:
    w = *--stack_pointer; /* value stack POP */
    v = stack_pointer[-1];
    x = PyNumber_Subtract(v, w);
    stack_pointer[-1] = x; /* value stack SET_TOP */
    if (x != NULL) continue;
    break;
case LOAD_CONST:
    x = PyTuple_GetItem(f->f_code->co_consts, oparg);
    *stack_pointer++ = x; /* value stack PUSH */
    goto fast_next_opcode;
case SETUP_LOOP:
case SETUP_EXCEPT:
case SETUP_FINALLY:
    PyFrame_BlockSetup(f, opcode, INSTR_OFFSET() + oparg, STACK_LEVEL());
    continue;
We see several things. First, we see a typical value manipulation opcode, BINARY_SUBTRACT. This opcode (and many others) works with values on the value stack as well as with a few temporary variables, using CPython’s C-API abstract object layer (in our case, a function from the number-like object abstraction) to replace the two top values on the value stack with the single value resulting from subtraction. As you can see, a small set of temporary variables, such as v, w and x, are used (and reused, and reused…) as the registers of the CPython VM. The variable stack_pointer represents the current top of the stack (it points at the next free slot in the stack). This variable is initialized at the beginning of the function like so: stack_pointer = f->f_stacktop;. In essence, together with the room reserved in the frame object for that purpose, the value stack is this pointer. To make things simpler and more readable, the real (unedited by me) code of ceval.c defines several value stack manipulation/observation macros, like PUSH, TOP or EMPTY. They do what you imagine from their names.
Next, we see a very simple opcode that loads values from somewhere into the value stack. I chose to quote LOAD_CONST because it’s very brief and simple, although it’s not really a namespace related opcode. “Real” namespace opcodes load values into the value stack from a namespace and store values from the value stack into a namespace; LOAD_CONST loads constants, but doesn’t fetch them from a namespace and has no STORE_CONST counterpart (we explored all this at length in the article about namespaces). The final opcode I chose to show is actually the single implementation of several different control-flow related opcodes (SETUP_LOOP, SETUP_EXCEPT and SETUP_FINALLY), which offload all details of their implementation to the block stack manipulation function PyFrame_BlockSetup; we discussed the block stack in our discussion of interpreter stacks.
Something we can observe looking at these implementations is that different opcodes exit the switch statement differently. Some simply break, and let the code after the switch resume. Some use continue to start the for loop from the beginning. Some goto various labels in the function. Each exit has different semantic meaning. If you break out of the switch (the ‘normal’ route), various checks will be made to see if some special behaviour should be performed – maybe a code block has ended, maybe an exception was raised, maybe we’re ready to return a value. Continuing the loop or going to a label lets certain opcodes take various shortcuts; no use checking for an exception after a NOP or a LOAD_CONST, for instance.
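To make the shape of the loop concrete, here is a toy model of it in Python. This is only a sketch under loud assumptions: the opcode numbers, the (opcode, oparg) tuple encoding and the name toy_eval are all made up for illustration and have nothing to do with CPython's real bytecode format. It only mirrors the idea of an endless fetch/dispatch loop working on a value stack and exiting explicitly.

```python
# Toy opcodes: the names mirror CPython's, but the numbers and the
# (opcode, oparg) instruction encoding are invented for this sketch.
NOP, LOAD_CONST, BINARY_SUBTRACT, RETURN_VALUE = range(4)


def toy_eval(code, consts):
    """Evaluate a list of (opcode, oparg) pairs on a value stack."""
    stack = []   # plays the role of the value stack
    pc = 0       # index of the next instruction, like next_instr
    while True:  # infinite loop; leaving it must be explicit
        opcode, oparg = code[pc]
        pc += 1
        if opcode == NOP:
            continue
        elif opcode == LOAD_CONST:
            stack.append(consts[oparg])  # PUSH
            continue
        elif opcode == BINARY_SUBTRACT:
            w = stack.pop()              # POP
            v = stack[-1]                # TOP
            stack[-1] = v - w            # SET_TOP
            continue
        elif opcode == RETURN_VALUE:
            return stack.pop()
        else:
            # the "become rather unhappy" default case
            raise SystemError("unknown opcode %r" % (opcode,))


program = [
    (LOAD_CONST, 0),
    (LOAD_CONST, 1),
    (NOP, None),
    (BINARY_SUBTRACT, None),
    (RETURN_VALUE, None),
]
result = toy_eval(program, (10, 5))
print(result)
```

Note how a program lacking a RETURN_VALUE would simply run off the end of the list here (an IndexError in this toy), which is the tame cousin of the "execute rubbish" scenario mentioned earlier.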
That’s pretty much it. I can’t really say we’re done (not at all), but this is pretty much the gist of PyEval_EvalFrameEx. Simple, eh? Well, yeah, simple, but I lied a bit with the editing to make it simpler. For example, if you look at the code itself, you will see that none of the case expressions for the big switch are really there. The code for the NOP opcode is actually (remember this series is about Python 3.x unless noted otherwise, so this snippet is from Python 3.1.2):
TARGET(NOP) FAST_DISPATCH();
TARGET? FAST_DISPATCH? What are these? Let me explain. Things may become clearer if we’d look for a moment at the implementation of the NOP opcode in ceval.c of Python 2.x. Over there the code for NOP looks more like the samples I’ve shown you so far, and it actually seems to me that the code of ceval.c gets simpler and simpler as we look backwards at older revisions of it. The reason is that although I think PyEval_EvalFrameEx was originally written as a really exceptionally straightforward piece of code, over the years some necessary complexity crept into it as various optimizations and improvements were implemented (I’ll collectively call them ‘additions’ from now on, for lack of a better term).
To further complicate matters, many of these additions are compiled conditionally with preprocessor directives, so several things are implemented in more than one way in the same source file. In the larger code samples I quoted above, I liberally expanded some preprocessor directives using their least complex expansion. However, depending on compilation flags, these and other preprocessor directives might expand to something else, possibly a more complicated something. I can understand trading simplicity to optimize a tight loop which is used very often, and the evaluation loop is probably one of the more used loops in CPython (and probably as tight as its contributors could make it). So while this is all very warranted, it doesn’t help the readability of the code.
Anyway, I’d like to enumerate these additions here explicitly (some in more depth than others); this should aid future discussion of ceval.c, as well as prevent me from feeling like I’m hiding too many important things with my free spirited editing of quoted code. Fortunately, most if not all of these additions are very well commented; actually, some of the explanations below will be just summaries of these comments or even taken verbatim from them, as I believe that they’re accurate (eek!). So, as you read PyEval_EvalFrameEx (and indeed ceval.c in general), you’re likely to run into any of these:
“Threaded Code” (Computed-GOTOs)
Let’s start with the addition that gave us TARGET, FAST_DISPATCH and a few other macros. The evaluation loop uses a “switch” statement, which decent compilers optimize as a single indirect branch instruction with a lookup table of addresses. Alas, since we’re switching over rapidly changing opcodes (it’s uncommon to have the same opcode repeat), this would have an adverse effect on the success rate of CPU branch prediction. Fortunately gcc supports the use of C-goto labels as values, which you can generally pass around and place in an array (restrictions apply!). Using an array of addresses in memory obtained from labels, as you can see in ./Python/opcode_targets.h, we create an explicit jump table and place an explicit indirect jump instruction at the end of each opcode. This improves the success rate of CPU prediction and can yield as much as a 20% boost in performance.
Thus, for example, the NOP opcode is implemented in the code like so:
TARGET(NOP) FAST_DISPATCH();
In the simpler scenario, this would expand to a plain case statement and a goto, like so:
case NOP: goto fast_next_opcode;
But when threaded code is in use, that snippet would expand to the following (I marked with a comment the line where we actually move on to the next opcode, using the dispatch table of label-values):
TARGET_NOP:
    opcode = NOP;
    if (HAS_ARG(NOP))
        oparg = NEXTARG();
case NOP:
    {
        if (!_Py_TracingPossible) {
            f->f_lasti = INSTR_OFFSET();
            goto *opcode_targets[*next_instr++]; /* <-- dispatch to the next opcode */
        }
        goto fast_next_opcode;
    }
Same behaviour, somewhat more complicated implementation, up to 20% faster Python. Nifty.
Opcode Prediction
Some opcodes tend to come in pairs. For example, COMPARE_OP is often followed by JUMP_IF_FALSE or JUMP_IF_TRUE, themselves often followed by a POP_TOP. What’s more, there are situations where you can determine that a particular next-opcode can be run immediately after the execution of the current opcode, without going through the ‘outer’ (and expensive) parts of the evaluation loop. PREDICT (and a few others) are a set of macros that explicitly peek at the next opcode and jump to it if possible, shortcutting most of the loop in this fashion (i.e., if (*next_instr == op) goto PRED_##op). Note that there is no relation to real hardware here, these are simply hardcoded conditional jumps, not an exploitation of some mechanism in the underlying CPU (in particular, it has nothing to do with “Threaded Code” described above).
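If you want a rough feel for which opcodes tend to follow one another, you can count adjacent pairs statically with the dis module. The following is a minimal sketch, not what PREDICT itself does: it assumes a Python recent enough to have dis.get_instructions, the helper name opcode_pairs and the sample function are made up, and the opcode names you get depend on the interpreter version you run it on.

```python
import dis
from collections import Counter


def opcode_pairs(func):
    """Count adjacent (opcode, next opcode) name pairs in func's bytecode."""
    names = [ins.opname for ins in dis.get_instructions(func)]
    # zip the sequence with itself shifted by one to get adjacent pairs
    return Counter(zip(names, names[1:]))


def sample(values):
    # an arbitrary function with a comparison and a loop, just to have
    # some bytecode to look at
    total = 0
    for v in values:
        if v > 0:
            total += v
    return total


pairs = opcode_pairs(sample)
for (first, second), count in pairs.most_common(3):
    print(first, "->", second, count)
```

This counts pairs as they appear in the code object, not as they are executed, so it is only a crude stand-in for the dynamic pair statistics discussed further below.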
Low Level Tracing
An addition primarily geared towards those developing CPython (or suffering from a horrible, horrible bug). Low Level Tracing is controlled by the LLTRACE preprocessor name, which is enabled by default on debug builds of CPython (see --with-pydebug). As explained in ./Misc/SpecialBuilds.txt: when this feature is compiled-in, PyEval_EvalFrameEx checks the frame’s global namespace for the variable __lltrace__. If such a variable is found, mounds of information about what the interpreter is doing are sprayed to stdout, such as every opcode and opcode argument and values pushed onto and popped off the value stack. Not useful very often, but very useful when needed.
This is what the low level trace output looks like (slightly edited):
>>> def f():
...     global a
...     return a - 5
...
>>> dis(f)
  3           0 LOAD_GLOBAL              0 (a)
              3 LOAD_CONST               1 (5)
              6 BINARY_SUBTRACT
              7 RETURN_VALUE
>>> exec(f.__code__, {'__lltrace__': 'foo', 'a': 10})
0: 116, 0
push 10
3: 100, 1
push 5
6: 24
pop 5
7: 83
pop 5
# trace of the end of exec() removed
>>>
As you can guess, you’re seeing a real-time disassembly of what’s going through the VM as well as stack operations. For example, the first line says: at bytecode offset 0, do opcode 116 (LOAD_GLOBAL) with the operand 0 (which resolves to the global variable a), and so on, and so forth. This is a bit like (well, little more than) adding a bunch of printf calls to the heart of the VM.
Advanced Profiling
Under this heading I’d like to briefly discuss several profiling related additions. The first relies on the fact that some processors (notably Pentium descendants and at least some PowerPCs) have built-in wall time measurement capabilities which are cheap and precise (correct me if I’m wrong). As an aid in the development of a high-performance CPython implementation, Python 2.4’s ceval.c was instrumented with the ability to collect per-opcode profiling statistics using these counters. This instrumentation is controlled by the somewhat misnamed --with-tsc configuration flag (TSC is an Intel Pentium specific name, and this feature is more general than that). Calling sys.settscdump(True) on an instrumented interpreter will cause the function ./Python/ceval.c: dump_tsc to print these statistics every time the evaluation loop loops.
The second advanced profiling feature is Dynamic Execution Profiling. This is only available if Python was built with the DYNAMIC_EXECUTION_PROFILE preprocessor name. As ./Tools/scripts/analyze_dxp.py says, [this] will tell you which opcodes have been executed most frequently in the current process, and, if Python was also built with -DDXPAIRS, will tell you which instruction _pairs_ were executed most frequently, which may help in choosing new instructions.
One last thing to add here is that enabling Dynamic Execution Profiling implicitly disables the “Threaded Code” addition.
The third and last addition in this category is function call profiling, controlled by the preprocessor name CALL_PROFILE. Quoting ./Misc/SpecialBuilds.txt again: When this name is defined, the ceval mainloop and helper functions count the number of function calls made. It keeps detailed statistics about what kind of object was called and whether the call hit any of the special fast paths in the code.
Extra Safety Valves
Two preprocessor names, USE_STACKCHECK and CHECKEXC, include extra assertions. Testing an interpreter with these enabled may catch a subtle bug or regression, but they are usually disabled as they’re too expensive.
These are the additions I found, grepping ceval.c for #ifdef. I think we’ll call it a day here, although we’re by no means finished. For example, I’d like to devote a separate post to exceptions, which is where we can discuss the tail of the evaluation loop (everything after the big switch and before the end of the big for), which we merely skimmed today. I’d also like to devote a whole post to locking and synchronization (including the GIL), which we touched upon before but never covered properly. Last but really not least, there’s about 2,000 other lines in ceval.c which we didn’t cover today; none of them are as important as PyEval_EvalFrameEx, but we need to talk at least about some of them.
All these things taken into account, I think we can say that today we finally conquered the evaluation loop. This isn’t the end of the series, far from it, but I do see it as a milestone. “Hooray”, I believe the saying goes. I hope you’re enjoying the show, thanks for the supportive comments (they keep me going), and I’ll see you in the next post.
I would like to thank Nick Coghlan for reviewing this article; any mistakes that slipped through are my own.
Python’s Innards: Code Objects
2010/07/03
This article, part of a series of articles about Python’s internals, will continue our preparation to engage the machinery of code evaluation by discussing Code Objects. To those of you who just now joined in and didn’t even read the introduction (but why?!), please note an important disclaimer: while the series as a whole is CPython 3.x centric and might not ‘apply cleanly’ to other Python implementations, matters of bytecode and evaluation (like this article discusses) are even more likely to deviate between implementations. So some of what I say in this post may apply to other implementations, some not – I’m not even checking at the moment; if and when we’ll discuss implementations like PyPy, Jython, IronPython, etc, I’ll highlight some of the differences. With this disclaimer in mind, we can get back to the plot: Code Objects. The compilation of Python source code emits Python bytecode, which is evaluated at runtime to produce whatever behaviour the programmer implemented. I guess you can think of bytecode as ‘machine code for the Python virtual machine’, and indeed if you look at some binary x86 machine code (like this one: 0x55 0x89 0xe5 0xb8 0x2a 0x0 0x0 0x0 0x5d) and some Python bytecode (like that one: 0x64 0x1 0x0 0x53) they look more or less like the same sort of gibberish. Along with the actual bytecode, Python’s compiler emits additional fields, most of them must be coupled with the bytecode (otherwise it would be meaningless). The bytecode and these fields are lumped together in an object called a code object, our subject for this article.
You might initially confuse function objects with code objects, but shouldn’t. Functions are higher level creatures that execute code by relying on a lower level primitive, the code object, but adding more functionality on top of that (in other words, every function has precisely one code object directly associated with it, this is the function’s __code__ attribute, or func_code in Python 2.x). For example, among other things, a function keeps a reference to the global namespace (remember that?) in which it was originally defined, and knows the default values of arguments it receives. You can sometimes execute a code object without a function (see eval and exec), but then you will have to provide it with a namespace or two to work in. Finally, just for accuracy’s sake, please note that tp_call of a function object isn’t exactly like exec or eval; the latter don’t pass in arguments or provide free argument binding (more below on these). If this doesn’t sit well with you yet, don’t panic, it just means functions’ code objects won’t necessarily be executable using eval or exec. I hope we have that settled.
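A small experiment may help separate the two. The sketch below (the names compute, namespace, other_globals and rebound are made up for illustration) runs a function’s code object directly with exec and a hand-made namespace, and then wraps the very same code object in a new function bound to different globals using types.FunctionType. It only works because this particular code object takes no arguments and has no free variables.

```python
import types


def compute():
    # 'a' is looked up in, and 'result' stored to, whatever global
    # namespace the code ends up running in
    global result
    result = a - 5


# Run the bare code object: we must supply the namespace ourselves.
namespace = {"a": 10}
exec(compute.__code__, namespace)
print(namespace["result"])

# Wrap the same code object in a new function with other globals.
other_globals = {"a": 100}
rebound = types.FunctionType(compute.__code__, other_globals)
rebound()
print(other_globals["result"])
```

The code object is identical in both runs; only the namespace it was handed differs, which is exactly the extra baggage a function object carries around.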
Let’s see when code objects are created. Code objects are created whenever a block of Python code is compiled. We have mentioned blocks briefly before, the fine material defines them as “a piece of Python program text that is executed as a unit. The following are blocks: a module, a function body, and a class definition.” (the fine material also lists other but less-interesting-to-us code blocks, like every command in the interactive interpreter, the string passed to Python’s executable’s -c switch, etc). As usual, I don’t want to dig too deeply into compilation, but basically when a code block is encountered, it has to be successfully transformed into an AST (which requires mostly that its syntax be correct), which is then passed to ./Python/compile.c: PyAST_Compile, the entry point into Python’s compilation machinery. A kind comment in ./Python/compile.c explains the general execution flow of this function.
Next, let’s discuss what is in a code object; I said it has stuff other than bytecode, but what? To whet our appetite about the various fields of a code object, we can look at the compiled Python sample from the first paragraph and disassemble it ourselves; it’s easier if we know beforehand that both samples implement a function which simply returns the value 42. Unlike the x86 machine code sample, which is self-contained and should be ready to run (<cough>assuming I didn’t botch it</cough>) the Python bytecode sample doesn’t include the constant value 42 in it at all. You absolutely can’t run this code meaningfully without its constants, and indeed 42 is referred to by one of the extra fields of the code object. We will best see the interaction between the actual bytecode and the accompanying fields as we do a manual disassembly.
From the interpreter (as usual, slight editing for readability):
# the opcode module has a mapping of opcode
# byte values to their symbolic names
>>> import opcode
>>> def return42(): return 42
...
# this is the function's code object
>>> return42.__code__
<code object return42 ... >
# this is the actual bytecode
>>> return42.__code__.co_code
b'd\x01\x00S'
# this is the field holding constants
>>> return42.__code__.co_consts
(None, 42)
# the first opcode is LOAD_CONST
>>> opcode.opname[return42.__code__.co_code[0]]
'LOAD_CONST'
# LOAD_CONST has one word as an operand
# let's get its value
>>> return42.__code__.co_code[1] + \
...     256 * return42.__code__.co_code[2]
1
# and which constant can we find in offset 1?
>>> return42.__code__.co_consts[1]
42
# finally, the next opcode
>>> opcode.opname[return42.__code__.co_code[3]]
'RETURN_VALUE'
>>>
I hope this was educational, albeit doing it all the time could get boring. Fortunately, we have dis to do this work for us (>>> from dis import dis, you already saw I aliased and augmented it as diss). In addition to dis, the function show_code from the same module is useful to look at code objects (I aliased and augmented a bit as ssc). So let’s look at return42 with diss and ssc:
>>> diss(return42)
  1           0 LOAD_CONST               1 (42)
              3 RETURN_VALUE
>>> ssc(return42)
Name:              return42
Filename:          <stdin>
Argument count:    0
Kw-only arguments: 0
Number of locals:  0
Stack size:        1
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
   1: 42
>>>
We see diss and ssc generally agree with our disassembly, though ssc further parsed all sorts of other fields of the code object which we didn’t handle so far (you can run dir on a code object to see them yourself). We have also seen that our value of 42 is indeed referred to by a field of the code object, rather than somehow be encoded in the bytecode.
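The byte-level arithmetic above is tied to the bytecode layout of the Python 3.1 era; the encoding is an implementation detail and differs in other versions. A version-tolerant way to see the same relation between operands and co_consts is to let dis decode the instructions and compare the raw operand with the value it resolves to. This is a sketch assuming a Python that has dis.get_instructions; the function return_text is made up, and it returns a string rather than 42 because some newer interpreters may encode small integers without going through co_consts.

```python
import dis


def return_text():
    return "forty-two"


code = return_text.__code__
resolved = []
for ins in dis.get_instructions(code):
    # ins.arg is the raw operand found in the bytecode (or None),
    # ins.argval is the object dis resolved it to, e.g. via co_consts
    resolved.append((ins.offset, ins.opname, ins.arg, ins.argval))
    print(ins.offset, ins.opname, ins.arg, repr(ins.argval))

# the constant lives in a field of the code object, not in the bytecode
print(code.co_consts)
```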
Code objects are immutable and their fields don’t hold any references (directly or indirectly) to mutable objects. This immutability is useful in simplifying many things, one of which is the handling of nested code blocks. An example of a nested code block is a class with two methods: the class is built using a code block, and this code block nests two inner code blocks, one for each method. This situation is recursively handled by creating the innermost code objects first and treating them as constants for the enclosing code object (much like an integer or a string literal would be treated). You may be wondering how mutable object literals (a = [1, 2, 3]) are represented in a code object, and the answer is that rather than referring to the mutable object with the code object, the ‘recipe’ to prepare it is kept (try >>> import dis; dis.dis(compile("a=(1,2,[3,4,{5:6}])", "string", "exec")) to make this immediately clear).
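The nesting described above is easy to observe from Python itself. The following sketch compiles a small module containing a class with two methods plus a list literal, then walks co_consts recursively looking for nested code objects. The source text and the helper name nested_names are made up for illustration.

```python
import types

source = """
class C:
    def a(self): return 1
    def b(self): return 2
values = [1, 2, 3]
"""
module_code = compile(source, "<sketch>", "exec")


def nested_names(code):
    """Recursively collect co_name of code objects found in co_consts."""
    names = []
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names.append(const.co_name)
            names.extend(nested_names(const))
    return names


found = nested_names(module_code)
print(found)

# the list literal is built at runtime from a 'recipe'; no list object
# is stored among the module's constants
has_list_constant = any(isinstance(c, list) for c in module_code.co_consts)
print(has_list_constant)
```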
Now that we have seen the relation between the bytecode and a code object field (co_consts), let’s take a look at the myriad of other fields in a code object. To be honest, I’m not sure this list would be particularly exciting. Many of these fields are just integer counters or tuples of strings representing how many or which variables of various sorts are used in a code object. But looking to the horizon where ceval.c and frame object evaluation is waiting for us, I can tell you that we need an immediate and crisp understanding of all these fields and their exact meaning, subtleties included. So I’ll (tediously?) list and categorize them all, building on the rather terse description you can find in the standard type hierarchy. If this seems too boring right now, you’d best skim it now but keep it as a reference for later posts; trust me, it’s useful.
- co_name
- A name (a string) for this code object; for a function this would be the function’s name, for a class this would be the class’ name, etc. The compile builtin doesn’t let you specify this, so all code objects generated with it carry the name <module>.
- co_filename
- The filename from which the code was compiled. Will be <stdin> for code entered in the interactive interpreter or whatever name is given as the second argument to compile for code objects created with compile.
- co_varnames
- A tuple containing the names of the local variables (including arguments). To parse this tuple properly you need to look at co_flags and the counter fields listed below, so you’ll know which item in the tuple is what kind of variable. In the ‘richest’ case, co_varnames contains (in order): positional argument names (including optional ones), keyword only argument names (again, both required and optional), varargs argument name (i.e., *args), kwds argument name (i.e., **kwargs), and then any other local variable names. So you need to look at co_argcount, co_kwonlyargcount and co_flags to fully interpret this tuple.
- co_cellvars
- A tuple containing the names of local variables that are stored in cells (discussed in the previous article) because they are referenced by lexically nested functions.
- co_freevars
- A tuple containing the names of free variables. Generally, a free variable means a variable which is referenced by an expression but isn’t defined in it. In our case, it means a variable that is referenced in this code object but was defined and will be dereferenced to a cell in another code object (also see co_cellvars above and, again, the previous article).
- co_names
- A tuple containing the names which aren’t covered by any of the other fields (they are not local variables, they are not free variables, etc) used by the bytecode. This includes names deemed to be in the global or builtin namespace as well as attributes (i.e., if you do foo.bar in a function, bar will be listed in its code object’s names).
- co_argcount
- The number of positional arguments the code object expects to receive, including those with default values. For example, def foo(a, b, c=3): pass would have a code object with this value set to three. The code objects of classes accept one argument, which we will explore when we discuss class creation.
- co_kwonlyargcount
- The number of keyword-only arguments the code object can receive.
- co_nlocals
- The number of local variables used in the code object (including arguments).
- co_firstlineno
- The line offset where the code object’s source code began, relative to the module it was defined in, starting from one. In this (and some but not all other regards), each input line typed in the interactive interpreter is a module of its own.
- co_stacksize
- The maximum size required of the value stack when running this object. This size is statically computed by the compiler (./Python/compile.c: stackdepth) when the code object is created, by looking at all possible flow paths searching for the one that requires the deepest value stack. To illustrate this, look at the diss and ssc outputs for a = 1 and a = [1,2,3]. The former has at most one value on the value stack at a time, the latter has three, because it needs to put all three integer literals on the stack before building the list.
- co_code
- A bytes object (a string in Python 2.x) representing the sequence of bytecode instructions; it contains a stream of opcodes and their operands (or rather, indexes which are used with other code object fields to represent their operands, as we saw above).
- co_consts
- A tuple containing the literals used by the bytecode. Remember that everything in a code object must be immutable; to dig this field, I recommend running diss and ssc on the code snippets a=(1,2,3) versus a=[1,2,3] and yet again versus a=(1,2,3,[4,5,6]).
- co_lnotab
- A bytes object (a string in Python 2.x) encoding the mapping from bytecode offsets to line numbers. If you happen to really care how this is encoded you can either look at ./Python/compile.c or ./Lib/dis.py: findlinestarts.
- co_flags
- An integer encoding a number of flags regarding the way this code object was created (which says something about how it should be evaluated). The possible flags are listed in ./Include/code.h; as a small example I can give CO_NESTED, which marks a code object which was compiled from a lexically nested function. Flags also have an important role in the implementation of the __future__ mechanism, which is still unused in Python 3.1 at the time of this writing, as no “future syntax” exists in Python 3.1. However, even when thinking in Python 3.x terms co_flags is still important as it facilitates the migration from the 2.x branch. In 2.x, __future__ is used when enabling Python 3.x like behaviour (e.g., from __future__ import print_function in Python 2.7 will disable the print statement and add a print function to the builtins module, just like in Python 3.x). If we come across flags from now on (in future posts), I’ll try to mention their relevance in the particular scenario.
- co_zombieframe
- This field of the PyCodeObject struct is not exposed in the Python object; it (optionally) points to a stack frame object. This can aid performance by maintaining an association between a code object and a stack frame object, so as to avoid reallocation of frames by recycling the frame object used for a code object. There’s a detailed comment in ./Objects/frameobject.c explaining zombie frames and their reanimation, we may mention this issue again when we discuss stack frames.
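The description of co_varnames in the list above says that you need co_argcount, co_kwonlyargcount and co_flags to interpret the tuple. Here is a minimal sketch of that parsing; the helper name split_varnames and the sample function are made up for illustration, and the flag constants are taken from the inspect module.

```python
import inspect


def split_varnames(code):
    """Split co_varnames into the kinds of variables it contains."""
    names = list(code.co_varnames)
    npos = code.co_argcount
    nkw = code.co_kwonlyargcount
    parts = {
        "positional": names[:npos],
        "kwonly": names[npos:npos + nkw],
        "varargs": None,
        "varkw": None,
    }
    idx = npos + nkw
    # *args and **kwargs names are present only if the flags say so
    if code.co_flags & inspect.CO_VARARGS:
        parts["varargs"] = names[idx]
        idx += 1
    if code.co_flags & inspect.CO_VARKEYWORDS:
        parts["varkw"] = names[idx]
        idx += 1
    # whatever remains are plain local variables
    parts["locals"] = names[idx:]
    return parts


def sample(a, b=2, *args, c, d=4, **kwargs):
    x = a
    return x


parts = split_varnames(sample.__code__)
for kind, value in parts.items():
    print(kind, value)
```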
Phew! This is everything in a code object. In making this list I’ve compiled quite a few code blocks, looking how changes in the Python source changes the resulting code object. I recommend you do something similar, and I actually bothered to make it ultra-easy for you to look into how various code blocks affect these fields: in the Mercurial repository I have for this series I created a directory called code_objects, within it you can find a self-explaining little utility that can facilitate looking at a few sample code blocks I wrote alongside with their disassembly and show_code output. Not all fields are necessarily covered in the sample code blocks I provided, you should be able to add a few more samples (if anything intrigues you) yourself and see them disassembled/analyzed. Also, I’m sorry, I’m totally a *NIX bigot (and will erase all flame or even flame-ish comments about that) and this toy might not run on Windows. There’s no good reason for that, I just wanted to use less for pagination, etc, and couldn’t be bothered with achieving the same effect on Windows.
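In the same experimental spirit, and without needing the repository, here is a tiny sketch showing how the name-related fields react to a closure. The functions outer and inner are made up for illustration; the point is only which tuple each name lands in.

```python
def outer():
    x = 1  # referenced by inner, so outer keeps it in a cell

    def inner():
        # x is a free variable here; len and str are neither local
        # nor free, so they end up in co_names
        return x + len(str(x))

    return inner


inner = outer()
print("outer co_cellvars:", outer.__code__.co_cellvars)
print("inner co_freevars:", inner.__code__.co_freevars)
print("inner co_names:   ", inner.__code__.co_names)
print("inner co_varnames:", inner.__code__.co_varnames)
```

Changing inner so that it assigns to a new name, or takes an argument, and re-running is a quick way to watch co_varnames and co_nlocals change.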
This pretty much sums up what I have to say about code objects at the moment. Time permitting, I sincerely hope we’ll soon reach the next article, where we’ll tackle the final frontier before ./Python/ceval.c: PyEval_EvalFrameEx itself: frame objects. ¡Olé!
I would like to thank Nick Coghlan for reviewing this article; any mistakes that slipped through are my own.