Helper classes¶
This section describes some classes that do not fit in any other section and that mainly serve for ancillary purposes.
The Filters class¶
class tables.Filters(complevel=0, complib='zlib', shuffle=True, fletcher32=False, least_significant_digit=None, _new=True)

Container for filter properties.
This class is meant to serve as a container that keeps information about the filter properties associated with the chunked leaves, that is Table, CArray, EArray and VLArray.
Instances of this class can be directly compared for equality.
Parameters: complevel : int
Specifies a compression level for data. The allowed range is 0-9. A value of 0 (the default) disables compression.
complib : str
Specifies the compression library to be used. Right now, ‘zlib’ (the default), ‘lzo’, ‘bzip2’ and ‘blosc’ are supported. Additional compressors for Blosc like ‘blosc:blosclz’ (‘blosclz’ is the default in case the additional compressor is not specified), ‘blosc:lz4’, ‘blosc:lz4hc’, ‘blosc:snappy’ and ‘blosc:zlib’ are supported too. Specifying a compression library which is not available in the system issues a FiltersWarning and sets the library to the default one.
shuffle : bool
Whether or not to use the Shuffle filter in the HDF5 library. This is normally used to improve the compression ratio. A false value disables shuffling and a true one enables it. The default value depends on whether compression is enabled or not; if compression is enabled, shuffling defaults to be enabled, else shuffling is disabled. Shuffling can only be used when compression is enabled.
fletcher32 : bool
Whether or not to use the Fletcher32 filter in the HDF5 library. This is used to add a checksum on each data chunk. A false value (the default) disables the checksum.
least_significant_digit : int
If specified, data will be truncated (quantized). In conjunction with enabling compression, this produces 'lossy', but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). Default is None, meaning no quantization.

Note: quantization is only applied if some form of compression is enabled.
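The quantization formula above can be sketched numerically. This is a minimal sketch assuming NumPy; the `quantize` helper is hypothetical and not part of the PyTables API:

```python
import numpy as np

def quantize(data, least_significant_digit):
    # Hypothetical helper illustrating the documented formula:
    # bits is chosen so that a precision of 10**-least_significant_digit
    # is retained, then data is rounded to that fixed-point grid.
    bits = int(np.ceil(np.log2(10.0 ** least_significant_digit)))
    scale = 2.0 ** bits
    return np.around(scale * data) / scale

data = np.array([1.234567, 2.345678])
quantize(data, 1)  # rounded to a 1/16 grid, since bits=4
```

After quantization the trailing mantissa bits of each value are zeroed, which is what makes the subsequent compression step so much more effective.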
Examples
This is a small example on using the Filters class:
import numpy
from tables import *

fileh = open_file('test5.h5', mode='w')
atom = Float32Atom()
filters = Filters(complevel=1, complib='blosc', fletcher32=True)
arr = fileh.create_earray(fileh.root, 'earray', atom, (0, 2),
                          "A growable array", filters=filters)
# Append several rows in only one call
arr.append(numpy.array([[1., 2.], [2., 3.], [3., 4.]],
                       dtype=numpy.float32))
# Print information on that enlargeable array
print("Result Array:")
print(repr(arr))
fileh.close()
This enforces the use of the Blosc library, a compression level of 1 and a Fletcher32 checksum filter as well. See the output of this example:
Result Array:
/earray (EArray(3, 2), fletcher32, shuffle, blosc(1)) 'A growable array'
  type = float32
  shape = (3, 2)
  itemsize = 4
  nrows = 3
  extdim = 0
  flavor = 'numpy'
  byteorder = 'little'
Filters attributes¶

fletcher32
Whether the Fletcher32 filter is active or not.

complevel
The compression level (0 disables compression).

complib
The compression filter used (irrelevant when compression is not enabled).

shuffle
Whether the Shuffle filter is active or not.
Filters methods¶
Filters.copy(**override)

Get a copy of the filters, possibly overriding some arguments.

Constructor arguments to be overridden must be passed as keyword arguments.

Using this method is recommended over replacing the attributes of an instance, since instances of this class may become immutable in the future:
>>> filters1 = Filters()
>>> filters2 = filters1.copy()
>>> filters1 == filters2
True
>>> filters1 is filters2
False
>>> filters3 = filters1.copy(complevel=1)
Traceback (most recent call last):
...
ValueError: compression library ``None`` is not supported...
>>> filters3 = filters1.copy(complevel=1, complib='zlib')
>>> print(filters1)
Filters(complevel=0, shuffle=False, fletcher32=False, least_significant_digit=None)
>>> print(filters3)
Filters(complevel=1, complib='zlib', shuffle=False, fletcher32=False, least_significant_digit=None)
>>> filters1.copy(foobar=42)
Traceback (most recent call last):
...
TypeError: __init__() got an unexpected keyword argument 'foobar'
The Index class¶
class tables.index.Index(parentnode, name, atom=None, title='', kind=None, optlevel=None, filters=None, tmp_dir=None, expectedrows=0, byteorder=None, blocksizes=None, new=True)

Represents the index of a column in a table.

This class is used to keep the indexing information for columns in a Table dataset (see The Table class). It is actually a descendant of the Group class (see The Group class), with some added functionality. An Index is always associated with one and only one column in the table.
Note
This class is mainly intended for internal use, but some of its documented attributes and methods may be interesting for the programmer.
Parameters: parentnode :
The parent Group object.

Changed in version 3.0: Renamed from parentNode to parentnode.
name : str
The name of this node in its parent group.
atom : Atom
An Atom object representing the shape and type of the atomic objects to be saved. Only scalar atoms are supported.
title :
Sets a TITLE attribute of the Index entity.
kind :
The desired kind for this index. The ‘full’ kind specifies a complete track of the row position (64-bit), while the ‘medium’, ‘light’ or ‘ultralight’ kinds only specify in which chunk the row is (using 32-bit, 16-bit and 8-bit respectively).
optlevel :
The desired optimization level for this index.
filters : Filters
An instance of the Filters class that provides information about the desired I/O filters to be applied during the life of this object.
tmp_dir :
The directory for the temporary files.
expectedrows :
A user estimate of the number of row slices that will be added to the growable dimension in the IndexArray object.
byteorder :
The byteorder of the index datasets on-disk.
blocksizes :
The four main sizes of the compound blocks in index datasets (a low level parameter).
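The kind parameter above trades index size for row-position precision. As a rough illustration, the per-row cost of tracking row positions for each kind can be sketched; the helper below is hypothetical and not part of the PyTables API:

```python
# Bit widths stated above for each index kind (hypothetical helper,
# not part of the PyTables API).
ROW_POSITION_BITS = {'full': 64, 'medium': 32, 'light': 16, 'ultralight': 8}

def row_tracking_bytes(kind, nrows):
    # Bytes needed to record the row position of nrows indexed rows.
    return ROW_POSITION_BITS[kind] * nrows // 8

row_tracking_bytes('full', 1_000_000)        # 8 MB of row pointers
row_tracking_bytes('ultralight', 1_000_000)  # 1 MB
```

This covers only the row-pointer cost; the actual on-disk size of an index also depends on the sorted values themselves and on the filters in use.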
Index instance variables¶
Index.column
The Column (see The Column class) instance for the indexed column.

Index.dirty
Whether the index is dirty or not.

Dirty indexes are out of sync with column data, so they exist but they are not usable.

Index.filters
Filter properties for this index - see Filters in The Filters class.

Index.is_csi
Whether the index is completely sorted or not.

Changed in version 3.0: The is_CSI property has been renamed into is_csi.

Index.nelements
The number of currently indexed rows for this column.
Index methods¶
Index.read_sorted(start=None, stop=None, step=None)

Return the sorted values of index in the specified range.

The meaning of the start, stop and step arguments is the same as in Table.read_sorted().
Index.read_indices(start=None, stop=None, step=None)

Return the indices values of index in the specified range.

The meaning of the start, stop and step arguments is the same as in Table.read_sorted().
Index special methods¶
Index.__getitem__(key)

Return the indices values of index in the specified range.

If the key argument is an integer, the corresponding index is returned. If key is a slice, the range of indices determined by it is returned. A negative value of step in the slice is supported, meaning that the results will be returned in reverse order.

This method is equivalent to Index.read_indices().
The IndexArray class¶
class tables.indexes.IndexArray(parentnode, name, atom=None, title='', filters=None, byteorder=None)

Represents the index (sorted or reverse index) dataset in an HDF5 file.

All NumPy typecodes are supported except for complex datatypes.
Parameters: parentnode :
The Index class from which this object will hang off.
Changed in version 3.0: Renamed from parentNode to parentnode.
name : str
The name of this node in its parent group.
atom :
An Atom object representing the shape and type of the atomic objects to be saved. Only scalar atoms are supported.
title :
Sets a TITLE attribute on the array entity.
filters : Filters
An instance of the Filters class that provides information about the desired I/O filters to be applied during the life of this object.
byteorder :
The byteorder of the data on-disk.
chunksize
The chunksize for this object.

slicesize
The slicesize for this object.
The Enum class¶
class tables.misc.enum.Enum(enum)

Enumerated type.

Each instance of this class represents an enumerated type. The values of the type must be declared exhaustively and named with strings, and they might be given explicit concrete values, though this is not compulsory. Once the type is defined, it cannot be modified.
There are three ways of defining an enumerated type. Each one of them corresponds to the type of the only argument in the constructor of Enum:
Sequence of names: each enumerated value is named using a string, and its order is determined by its position in the sequence; the concrete value is assigned automatically:
>>> boolEnum = Enum(['True', 'False'])
Mapping of names: each enumerated value is named by a string and given an explicit concrete value. All of the concrete values must be different, or a ValueError will be raised:
>>> priority = Enum({'red': 20, 'orange': 10, 'green': 0})
>>> colors = Enum({'red': 1, 'blue': 1})
Traceback (most recent call last):
...
ValueError: enumerated values contain duplicate concrete values: 1
Enumerated type: in that case, a copy of the original enumerated type is created. Both enumerated types are considered equal:
>>> prio2 = Enum(priority)
>>> priority == prio2
True
Please note that names starting with _ are not allowed, since they are reserved for internal usage:
>>> prio2 = Enum(['_xx'])
Traceback (most recent call last):
...
ValueError: name of enumerated value can not start with ``_``: '_xx'
The concrete value of an enumerated value is obtained by getting its name as an attribute of the Enum instance (see __getattr__()) or as an item (see __getitem__()). This allows comparisons between enumerated values and assigning them to ordinary Python variables:
>>> redv = priority.red
>>> redv == priority['red']
True
>>> redv > priority.green
True
>>> priority.red == priority.orange
False
The name of the enumerated value corresponding to a concrete value can also be obtained by using the __call__() method of the enumerated type. In this way you get the symbolic name to use it later with __getitem__():
>>> priority(redv)
'red'
>>> priority.red == priority[priority(priority.red)]
True
(If you ask, the __getitem__() method is not used for this purpose to avoid ambiguity in the case of using strings as concrete values.)
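To make the ambiguity argument concrete, here is a minimal pure-Python model of the two lookup directions. This is an illustrative sketch, not the PyTables implementation:

```python
# Why a separate __call__ for reverse lookup: if concrete values were
# strings, enum['foo'] could mean either "the value named 'foo'" or
# "the name of the value 'foo'". Keeping the two directions on
# different protocols avoids that ambiguity.
class MiniEnum:
    def __init__(self, mapping):
        if len(set(mapping.values())) != len(mapping):
            raise ValueError('enumerated values contain duplicate concrete values')
        self._names = dict(mapping)                        # name -> value
        self._values = {v: k for k, v in mapping.items()}  # value -> name

    def __getitem__(self, name):
        # Forward lookup: name -> concrete value.
        return self._names[name]

    def __call__(self, value, *default):
        # Reverse lookup: concrete value -> name, with optional default.
        try:
            return self._values[value]
        except KeyError:
            if default:
                return default[0]
            raise ValueError(
                'no enumerated value with that concrete value: %r' % (value,))
```

A quick usage check: `MiniEnum({'red': 20, 'green': 0})['red']` gives 20, while calling the instance with 20 gives back the name 'red'.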
Enum special methods¶
Enum.__call__(value, *default)

Get the name of the enumerated value with that concrete value.
If there is no value with that concrete value in the enumeration and a second argument is given as a default, this is returned. Else, a ValueError is raised.
This method can be used for checking that a concrete value belongs to the set of concrete values in an enumerated type.
Examples
Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> enum(5)
'T2'
>>> enum(42, None) is None
True
>>> enum(42)
Traceback (most recent call last):
...
ValueError: no enumerated value with that concrete value: 42
Enum.__contains__(name)

Is there an enumerated value with that name in the type?
If the enumerated type has an enumerated value with that name, True is returned. Otherwise, False is returned. The name must be a string.
This method does not check for concrete values matching a value in an enumerated type. For that, please use the Enum.__call__() method.

Examples

Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> 'T1' in enum
True
>>> 'foo' in enum
False
>>> 0 in enum
Traceback (most recent call last):
...
TypeError: name of enumerated value is not a string: 0
>>> enum.T1 in enum  # Be careful with this!
Traceback (most recent call last):
...
TypeError: name of enumerated value is not a string: 2
Enum.__eq__(other)

Is the other enumerated type equivalent to this one?
Two enumerated types are equivalent if they have exactly the same enumerated values (i.e. with the same names and concrete values).
Examples
Let enum* be enumerated types defined as:

>>> enum1 = Enum({'T0': 0, 'T1': 2})
>>> enum2 = Enum(enum1)
>>> enum3 = Enum({'T1': 2, 'T0': 0})
>>> enum4 = Enum({'T0': 0, 'T1': 2, 'T2': 5})
>>> enum5 = Enum({'T0': 0})
>>> enum6 = Enum({'T0': 10, 'T1': 20})
then:
>>> enum1 == enum1
True
>>> enum1 == enum2 == enum3
True
>>> enum1 == enum4
False
>>> enum5 == enum1
False
>>> enum1 == enum6
False
Comparing enumerated types with other kinds of objects produces a false result:
>>> enum1 == {'T0': 0, 'T1': 2}
False
>>> enum1 == ['T0', 'T1']
False
>>> enum1 == 2
False
Enum.__getattr__(name)

Get the concrete value of the enumerated value with that name.
The name of the enumerated value must be a string. If there is no value with that name in the enumeration, an AttributeError is raised.
Examples
Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> enum.T1
2
>>> enum.foo
Traceback (most recent call last):
...
AttributeError: no enumerated value with that name: 'foo'
Enum.__getitem__(name)

Get the concrete value of the enumerated value with that name.
The name of the enumerated value must be a string. If there is no value with that name in the enumeration, a KeyError is raised.
Examples
Let enum be an enumerated type defined as:

>>> enum = Enum({'T0': 0, 'T1': 2, 'T2': 5})
then:
>>> enum['T1']
2
>>> enum['foo']
Traceback (most recent call last):
...
KeyError: "no enumerated value with that name: 'foo'"
Enum.__iter__()

Iterate over the enumerated values.
Enumerated values are returned as (name, value) pairs in no particular order.
Examples
>>> enumvals = {'red': 4, 'green': 2, 'blue': 1}
>>> enum = Enum(enumvals)
>>> enumdict = dict([(name, value) for (name, value) in enum])
>>> enumvals == enumdict
True
The UnImplemented class¶
class tables.UnImplemented(parentnode, name)

This class represents datasets not supported by PyTables in an HDF5 file.
When reading a generic HDF5 file (i.e. one that has not been created with PyTables, but with some other HDF5 library based tool), chances are that the specific combination of datatypes or dataspaces in some dataset might not be supported by PyTables yet. In such a case, this dataset will be mapped into an UnImplemented instance and the user will still be able to access the complete object tree of the generic HDF5 file. The user will also be able to read and write the attributes of the dataset, access some of its metadata, and perform certain hierarchy manipulation operations like deleting or moving (but not copying) the node. Of course, the user will not be able to read the actual data on it.
This is an elegant way to allow users to work with generic HDF5 files despite the fact that some of their datasets are not supported by PyTables. However, if you are really interested in having full access to an unimplemented dataset, please get in contact with the developer team.
This class does not have any public instance variables or methods, except those inherited from the Leaf class (see The Leaf class).
byteorder = None
The endianness of data in memory ('big', 'little' or 'irrelevant').

nrows = None
The length of the first dimension of the data.

shape = None
The shape of the stored data.
The Unknown class¶
Exceptions module¶
The exceptions module declares exceptions and warnings that are specific to PyTables.
exception tables.HDF5ExtError(*args, **kargs)

A low level HDF5 operation failed.

This exception is raised by the low level PyTables components used for accessing HDF5 files. It usually signals that something is not going well in the HDF5 library or even at the Input/Output level.
Errors in the HDF5 C library may be accompanied by an extensive HDF5 back trace on standard error (see also tables.silence_hdf5_messages()).

Changed in version 2.4.
Parameters: message :
error message
h5bt :
This parameter (keyword only) controls the HDF5 back trace handling. Any keyword argument other than h5bt is ignored.

- if set to False, the HDF5 back trace is ignored and the HDF5ExtError.h5backtrace attribute is set to None
- if set to True, the back trace is retrieved from the HDF5 library and stored in the HDF5ExtError.h5backtrace attribute as a list of tuples
- if set to "VERBOSE" (default), the HDF5 back trace is stored in the HDF5ExtError.h5backtrace attribute and also included in the string representation of the exception
- if not set (or set to None), the default policy is used (see HDF5ExtError.DEFAULT_H5_BACKTRACE_POLICY)
format_h5_backtrace(backtrace=None)

Convert the HDF5 back trace, represented as a list of tuples (see HDF5ExtError.h5backtrace), into a string.

New in version 2.4.
DEFAULT_H5_BACKTRACE_POLICY = 'VERBOSE'

Default policy for HDF5 back trace handling:

- if set to False, the HDF5 back trace is ignored and the HDF5ExtError.h5backtrace attribute is set to None
- if set to True, the back trace is retrieved from the HDF5 library and stored in the HDF5ExtError.h5backtrace attribute as a list of tuples
- if set to "VERBOSE" (default), the HDF5 back trace is stored in the HDF5ExtError.h5backtrace attribute and also included in the string representation of the exception

This parameter can be set using the PT_DEFAULT_H5_BACKTRACE_POLICY environment variable. Allowed values are "IGNORE" (or "FALSE"), "SAVE" (or "TRUE") and "VERBOSE" to set the policy to False, True and "VERBOSE" respectively. The special value "DEFAULT" can be used to reset the policy to the default value.

New in version 2.4.
h5backtrace = None

HDF5 back trace.

Contains the HDF5 back trace as a (possibly empty) list of tuples. Each tuple has the following format:

(filename, line number, function name, text)

Depending on the value of the h5bt parameter passed to the initializer, the h5backtrace attribute can be set to None. This means that the HDF5 back trace has been simply ignored (not retrieved from the HDF5 C library error stack) or that there has been an error (silently ignored) during the HDF5 back trace retrieval.

New in version 2.4.

See also: traceback.format_list()
exception tables.ClosedNodeError

The operation can not be completed because the node is closed.
For instance, listing the children of a closed group is not allowed.
exception tables.ClosedFileError

The operation can not be completed because the hosting file is closed.
For instance, getting an existing node from a closed file is not allowed.
exception tables.FileModeError

The operation can not be carried out because the mode in which the hosting file is opened is not adequate.
For instance, removing an existing leaf from a read-only file is not allowed.
exception tables.NodeError

Invalid hierarchy manipulation operation requested.
This exception is raised when the user requests an operation on the hierarchy which can not be run because of the current layout of the tree. This includes accessing nonexistent nodes, moving or copying or creating over an existing node, non-recursively removing groups with children, and other similarly invalid operations.
A node in a PyTables database cannot be simply overwritten by replacing it. Instead, the old node must be removed explicitly before another one can take its place. This is done to protect interactive users from inadvertently deleting whole trees of data with a single erroneous command.
exception tables.NoSuchNodeError

An operation was requested on a node that does not exist.

This exception is raised when an operation gets a path name or a (where, name) pair leading to a nonexistent node.
exception tables.UndoRedoError

Problems with doing/redoing actions with the Undo/Redo feature.
This exception indicates a problem related to the Undo/Redo mechanism, such as trying to undo or redo actions with this mechanism disabled, or going to a nonexistent mark.
exception tables.UndoRedoWarning

Issued when an action not supporting Undo/Redo is run.
This warning is only shown when the Undo/Redo mechanism is enabled.
exception tables.NaturalNameWarning

Issued when a non-pythonic name is given for a node.

This is not an error and may even be very useful in certain contexts, but one should be aware that such nodes cannot be accessed using natural naming (instead, getattr() must be used explicitly).
exception tables.PerformanceWarning

Warning for operations which may cause a performance drop.

This warning is issued when an operation is made on the database which may cause it to slow down future operations (e.g. making the node tree grow too much).
exception tables.FlavorError

Unsupported or unavailable flavor or flavor conversion.

This exception is raised when an unsupported or unavailable flavor is given to a dataset, or when a conversion of data between two given flavors is neither supported nor available.
exception tables.FlavorWarning

Unsupported or unavailable flavor conversion.

This warning is issued when a conversion of data between two given flavors is neither supported nor available, and raising an error would render the data inaccessible (e.g. on a dataset of an unavailable flavor in a read-only file).
See the FlavorError class for more information.
exception tables.FiltersWarning

Unavailable filters.
This warning is issued when a valid filter is specified but it is not available in the system. It may mean that an available default filter is to be used instead.
exception tables.OldIndexWarning

Unsupported index format.

This warning is issued when an index in an unsupported format is found. The index will be marked as invalid and will behave as if it doesn't exist.