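There are three file types in Cython:
- Implementation files, which carry a .pyx suffix
- Definition files, which carry a .pxd suffix
- Include files, which carry a .pxi suffix
A definition file does not need public declarations to share names with other Cython modules:
- This is not necessary, as it is automatic.
- A public declaration is only needed to make a name accessible to external C code.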
As a dynamic language, Python encourages a programming style of considering classes and objects in terms of their methods and attributes, more than where they fit into the class hierarchy.
This can make Python a very relaxed and comfortable language for rapid development, but with a price - the ‘red tape’ of managing data types is dumped onto the interpreter. At run time, the interpreter does a lot of work searching namespaces, fetching attributes and parsing argument and keyword tuples. This run-time ‘late binding’ is a major cause of Python’s relative slowness compared to ‘early binding’ languages such as C++.
However, with Cython it is possible to gain significant speed-ups through the use of ‘early binding’ programming techniques.
Note
Typing is not a necessity
Providing static typing for parameters and variables is a convenience to speed up your code, but it is not a necessity. Optimize where and when needed.
The cdef statement is used to make C level declarations for:
Variables:
cdef int i, j, k
cdef float f, g[42], *h
Structs:
cdef struct Grail:
    int age
    float volume
Note
Structs can be declared as cdef packed struct, which has the same effect as the C directive #pragma pack(1).
Unions:
cdef union Food:
    char *spam
    float *eggs
Enums:
cdef enum CheeseType:
    cheddar, edam,
    camembert
cdef enum CheeseState:
    hard = 1
    soft = 2
    runny = 3
Functions:
cdef int eggs(unsigned long l, float f):
    ...
Extension Types:
cdef class Spam:
    ...
Note
Constants
Constants can be defined by using an anonymous enum:
cdef enum:
    tons_of_spam = 3
A series of declarations can be grouped into a cdef block:
cdef:
    struct Spam:
        int tons
    int i
    float f
    Spam *p
    void f(Spam *s):
        print s.tons, "Tons of spam"
Note
ctypedef statement
The ctypedef statement is provided for naming types:
ctypedef unsigned long ULong
ctypedef int *IntPtr
Both C and Python functions can be declared with parameters that have C data types.
Use normal C declaration syntax:
def spam(int i, char *s):
    ...
cdef int eggs(unsigned long l, float f):
    ...
When such parameters are passed into a function declared with def, the Python values are automatically converted to the specified C type.
- This holds true only for numeric and string types
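If no type is specified for a parameter or a return value, it is assumed to be a Python object. The following takes two Python objects as parameters and returns a Python object:
cdef spamobjs(x, y):
    ...
Note
This is different from C language behavior, where an unspecified type is an int by default.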
Reference counting for Python object types follows the standard Python/C API rules:
- Borrowed references are taken as parameters
- New references are returned
For the sake of code clarity, it is recommended to always use object explicitly in your code.
This is also useful for cases where the name being declared would otherwise be taken for a type:
cdef foo(object int):
    ...
As a return type:
cdef object foo(object int):
    ...
When in a .pyx file, the signature is the same as it is in Python itself:
cdef class A:
    cdef foo(self):
        print "A"
cdef class B(A):
    cdef foo(self, x=None):
        print "B", x
cdef class C(B):
    cpdef foo(self, x=True, int k=3):
        print "C", x, k
When in a .pxd file, the signature is different, with each default value written as "*", as in cdef foo(x=*):
cdef class A:
    cdef foo(self)
cdef class B(A):
    cdef foo(self, x=*)
cdef class C(B):
    cpdef foo(self, x=*, int k=*)
- The number of arguments may increase when subclassing, but the argument types and order must be the same.
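The subclassing rule can be seen in a plain-Python analogue. This is a minimal sketch in ordinary Python 3, not Cython, so it has no cdef or cpdef methods. The class names mirror the example above, and call_like_a is a hypothetical helper introduced only for illustration. It shows why each subclass can append defaulted parameters: callers that only know the base signature keep working.

```python
class A:
    def foo(self):
        return ("A",)


class B(A):
    # Appends one optional parameter; the inherited call form foo() stays valid.
    def foo(self, x=None):
        return ("B", x)


class C(B):
    # Keeps x in the same position and appends another optional parameter.
    def foo(self, x=True, k=3):
        return ("C", x, k)


def call_like_a(obj):
    # Hypothetical caller that only knows A's signature.
    return obj.foo()


results = [call_like_a(obj) for obj in (A(), B(), C())]
```

Each override is called with no extra arguments here, so the defaults declared in the subclass fill in the added parameters.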
As in Python 3, def functions can have keyword-only arguments listed after a "*" parameter and before a "**" parameter, if any:
def f(a, b, *args, c, d = 42, e, **kwds):
    ...
- Shown above, the c, d and e arguments cannot be passed as positional arguments and must be passed as keyword arguments.
- Furthermore, c and e are required keyword arguments since they do not have a default value.
If the parameter name after the "*" is omitted, the function will not accept any extra positional arguments:
def g(a, b, *, c, d):
    ...
- Shown above, the signature takes exactly two positional parameters and has two required keyword parameters
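The two signatures above behave the same way in plain Python 3, which makes them easy to try outside Cython. The following is a small sketch; the function bodies and the raises_type_error helper are assumptions added for illustration.

```python
def f(a, b, *args, c, d=42, e, **kwds):
    # Return everything that was bound so the binding can be inspected.
    return {"a": a, "b": b, "args": args, "c": c, "d": d, "e": e, "kwds": kwds}


def g(a, b, *, c, d):
    return (a, b, c, d)


def raises_type_error(func, *args, **kwargs):
    """Hypothetical helper: report whether the call is rejected with TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return True
    return False


# The third positional value lands in *args; c and e must be given by keyword.
bound = f(1, 2, 3, c="x", e="y")

missing_required = raises_type_error(f, 1, 2, c="x")         # e is missing
extra_positional = raises_type_error(g, 1, 2, 3, c=0, d=0)   # g takes only two
```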
For basic numeric and string types, conversion is automatic in most situations when a Python object is used in the context of a C value, and vice versa.
The following table summarizes the conversion possibilities, assuming sizeof(int) == sizeof(long):
C types | From Python types | To Python types |
---|---|---|
[unsigned] char, [unsigned] short, int, long | int, long | int |
unsigned int, unsigned long, [unsigned] long long | int, long | long |
float, double, long double | int, long, float | float |
char * | str/bytes | str/bytes [1] |
struct | | dict |
Note
Python String in a C Context
A Python string, passed to C context expecting a char*, is only valid as long as the Python string exists.
A reference to the Python string must be kept around for as long as the C string is needed.
If this can’t be guaranteed, then make a copy of the C string.
Cython may produce an error message: Obtaining char* from a temporary Python value and will not resume compiling in situations like this:
cdef char *s
s = pystring1 + pystring2
The reason is that concatenating two strings in Python produces a temporary object.
- The temporary is decrefed, and the Python string deallocated, as soon as the statement has finished.
- Therefore the lvalue s is left dangling.
The solution is to assign the result of the concatenation to a Python variable, and then obtain the char* from that:
cdef char *s
p = pystring1 + pystring2
s = p
Note
It is up to you to be aware of this, and not to depend on Cython’s error message, as it is not guaranteed to be generated for every situation.
Note
The cast syntax uses angle brackets, which is different from the C convention:
cdef char *p
cdef float *q
p = <char*>q
Note
Cython will not stop a cast where there is no conversion, but it will emit a warning.
- With a type-checked cast such as <MyExtensionType?>x, Cython will raise an error if "x" is not an instance of MyExtensionType or of one of its subclasses
- An integer literal is treated as a C constant
It will be truncated to whatever size your C compiler thinks appropriate.
Cast to a Python object like this:
<object>10000000000000000000
The "L", "LL" and the "U" suffixes have the same meaning as in C
- For the null C pointer, use NULL. Do NOT use 0.
- NULL is a reserved word in Cython
Note
Tricks like the following will NOT work in Cython:
try:
    x = True
except NameError:
    True = 1
The above example will not work because True will always be looked up in the module-level scope. Do the following instead:
import __builtin__
try:
    True = __builtin__.True
except AttributeError:
    True = 1
The “for ... in iterable” loop works as in Python, but is even more versatile in Cython as it can additionally be used on C types.
range() is C optimized when the index value has been declared by cdef, for example:
cdef size_t i
for i in range(n):
    ...
Iteration over C arrays and sliced pointers is supported and automatically infers the type of the loop variable, e.g.:
cdef double* data = ...
for x in data[:10]:
    ...
Iterating over many builtin types such as lists and tuples is optimized.
There is also a more verbose C-style for-from syntax which, however, is deprecated in favour of the normal Python “for ... in range()” loop. You might still find it in legacy code that was written for Pyrex, though.
The target expression must be a plain variable name.
The name between the lower and upper bounds must be the same as the target name.
for i from 0 <= i < n:
    ...
Or when using a step size:
for i from 0 <= i < n by s:
    ...
To reverse the direction, reverse the conditional operation:
for i from n > i >= 0:
    ...
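When porting legacy for-from loops, it can help to see which values each form visits. The sketch below is plain Python 3 and models only the sequence of loop values, assuming a positive step; the two function names are hypothetical.

```python
def for_from_ascending(n, s=1):
    """Values visited by `for i from 0 <= i < n by s` (assuming s > 0)."""
    return list(range(0, n, s))


def for_from_descending(n):
    """Values visited by `for i from n > i >= 0`: n-1 down to 0."""
    return list(range(n - 1, -1, -1))


ascending = for_from_ascending(5)
stepped = for_from_ascending(7, 3)
descending = for_from_descending(4)
```

In the reversed form the strict bound excludes n itself, so the first value visited is n - 1.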
cpdef functions can override cdef functions:
cdef class A:
    cdef foo(self):
        print "A"
cdef class B(A):
    cdef foo(self, x=None):
        print "B", x
cdef class C(B):
    cpdef foo(self, x=True, int k=3):
        print "C", x, k
Cython compiles calls to most built-in functions into direct calls to the corresponding Python/C API routines, making them particularly fast.
Only direct function calls using these names are optimised. If you do something else with one of these names that assumes it’s a Python object, such as assign it to a Python variable, and later call it, the call will be made as a Python function call.
Function and arguments | Return type | Python/C API Equivalent |
---|---|---|
abs(obj) | object, double, ... | PyNumber_Absolute, fabs, fabsf, ... |
callable(obj) | bint | PyCallable_Check |
delattr(obj, name) | None | PyObject_DelAttr |
exec(code, [glob, [loc]]) | object | |
dir(obj) | list | PyObject_Dir |
divmod(a, b) | tuple | PyNumber_Divmod |
getattr(obj, name, [default]) (Note 1) | object | PyObject_GetAttr |
hasattr(obj, name) | bint | PyObject_HasAttr |
hash(obj) | int / long | PyObject_Hash |
intern(obj) | object | Py*_InternFromString |
isinstance(obj, type) | bint | PyObject_IsInstance |
issubclass(obj, type) | bint | PyObject_IsSubclass |
iter(obj, [sentinel]) | object | PyObject_GetIter |
len(obj) | Py_ssize_t | PyObject_Length |
pow(x, y, [z]) | object | PyNumber_Power |
reload(obj) | object | PyImport_ReloadModule |
repr(obj) | object | PyObject_Repr |
setattr(obj, name, value) | void | PyObject_SetAttr |
Note 1: Pyrex originally provided a function getattr3(obj, name, default) corresponding to the three-argument form of the Python builtin getattr(). Cython still supports this function, but its use is deprecated in favour of the normal builtin, which Cython can optimise in both forms.
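A cdef function that does not return a Python object:
- Has no way of reporting a Python exception to its caller.
- Will only print a warning message, and the exception is ignored.
To propagate exceptions from such a function to its caller, declare an exception value for it. There are three forms: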
First:
cdef int spam() except -1:
    ...
- In the example above, if an error occurs inside spam, it will immediately return with the value of -1, causing an exception to be propagated to its caller.
- Functions declared with an exception value should explicitly prevent a return of that value.
Second:
cdef int spam() except? -1:
    ...
- Used when a -1 may possibly be returned and is not to be considered an error.
- The "?" tells Cython that -1 only indicates a possible error.
- Now, each time -1 is returned, Cython generates a call to PyErr_Occurred to verify it is an actual error.
Third:
cdef int spam() except *:
    ...
A call to PyErr_Occurred happens every time the function gets called.
Note
Returning void
A function that returns void and needs to propagate errors must use this form.
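The difference between the three forms lies in the check made after each call. The following is a plain-Python model of those checks, not code that Cython generates. Every name in it is hypothetical, the module-level variable stands in for the interpreter's pending-exception state, and raising SystemError for a sentinel returned without an error is a modelling assumption.

```python
_pending = None  # stands in for the interpreter's pending-exception slot


def err_occurred():
    """Model of PyErr_Occurred(): is an exception pending?"""
    return _pending is not None


def fail(exc, sentinel):
    """Model of a failing C-level function: record the error, return the sentinel."""
    global _pending
    _pending = exc
    return sentinel


def _propagate():
    """Clear the pending exception and raise it in the caller."""
    global _pending
    exc, _pending = _pending, None
    if exc is None:
        # Sentinel returned although nothing was raised (modelled as SystemError).
        raise SystemError("error return without exception set")
    raise exc


def check_except_value(result, sentinel=-1):
    """`except -1`: the sentinel alone is taken to mean an error."""
    if result == sentinel:
        _propagate()
    return result


def check_except_maybe(result, sentinel=-1):
    """`except? -1`: the sentinel triggers a look at the error state."""
    if result == sentinel and err_occurred():
        _propagate()
    return result


def check_except_star(result):
    """`except *`: the error state is examined after every call."""
    if err_occurred():
        _propagate()
    return result
```

In this model, check_except_maybe(-1) returns -1 when nothing is pending, whereas check_except_value(-1) treats the same value as an error. This is why a function declared with a plain exception value should never return that value in normal operation.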
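Exception values can only be declared for functions returning one of the following types:
- integer
- enum
- float
- pointer type
The exception value must be a constant expression.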
Note
Function pointers
Require the same exception value specification as the function they point to has declared.
Use cases here are when used as parameters and when assigned to a variable:
int (*grail)(int, char *) except -1
Note
Python Objects
Note
C++
Do not try to raise exceptions by returning the specified value. Example:
cdef extern FILE *fopen(char *filename, char *mode) except NULL # WRONG!
- The except clause does not work that way.
- Its only purpose is to propagate Python exceptions that have already been raised by either...
    - A Cython function
    - A C function that calls Python/C API routines.
To propagate an exception for these circumstances you need to raise it yourself:
cdef FILE *p
p = fopen("spam.txt", "r")
if p == NULL:
    raise SpamError("Couldn't open the spam file")
Defined using the DEF statement:
DEF FavouriteFood = "spam"
DEF ArraySize = 42
DEF OtherArraySize = 2 * ArraySize + 17
The right hand side must be a valid compile-time expression made up of:
- Literal values
- Names defined by other DEF statements
- The following predefined names, corresponding to the values returned by os.uname():
    - UNAME_SYSNAME
    - UNAME_NODENAME
    - UNAME_RELEASE
    - UNAME_VERSION
    - UNAME_MACHINE
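To see what the UNAME_* names would hold on a given build machine, the corresponding os.uname() fields can be inspected from plain Python. This is a rough sketch: the uname_names helper is hypothetical, and the fallback to platform.uname() on systems without os.uname() is an assumption added here, not something Cython is documented to do.

```python
import os
import platform


def uname_names():
    """Map each UNAME_* name to the matching field of os.uname()."""
    if hasattr(os, "uname"):
        u = os.uname()
        values = (u.sysname, u.nodename, u.release, u.version, u.machine)
    else:
        # Assumed fallback for platforms that lack os.uname().
        p = platform.uname()
        values = (p.system, p.node, p.release, p.version, p.machine)
    names = ("UNAME_SYSNAME", "UNAME_NODENAME", "UNAME_RELEASE",
             "UNAME_VERSION", "UNAME_MACHINE")
    return dict(zip(names, values))


predefined = uname_names()
```

These values describe the machine on which the module is compiled, since DEF expressions are evaluated at compile time.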
The compile-time expression, in this case, must evaluate to a Python value of int, long, float, or str:
cdef int a1[ArraySize]
cdef int a2[OtherArraySize]
print "I like", FavouriteFood
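The following statements conditionally include or exclude sections of code at compile time:
- IF
- ELIF
- ELSE
IF UNAME_SYSNAME == "Windows":
    include "icky_definitions.pxi"
ELIF UNAME_SYSNAME == "Darwin":
    include "nice_definitions.pxi"
ELIF UNAME_SYSNAME == "Linux":
    include "penguin_definitions.pxi"
ELSE:
    include "other_definitions.pxi"
An IF block can contain any statements or declarations that would be valid in that context.
- This includes other IF and DEF statements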
[1] The conversion is to/from str for Python 2.x, and bytes for Python 3.x.