Release notes for PyTables 3.0 series
Author: PyTables Developers
Contact: pytables@googlemail.com
Changes from 2.4 to 3.0
New features
- Starting with this release, PyTables fully supports Python 3 (closes gh-188).
- The entire code base is now more compliant with the coding style guidelines described in PEP8 (closes gh-103 and gh-224). See API changes for more details.
- Basic support for HDF5 drivers. Now it is possible to open/create an HDF5 file using one of the SEC2, DIRECT, LOG, WINDOWS, STDIO or CORE drivers. Users can also set the main driver parameters (closes gh-166). Thanks to Michal Slonina.
- Basic support for in-memory image files. An HDF5 file can be set from or copied into a memory buffer (thanks to Michal Slonina). This feature is only available if PyTables is built against HDF5 1.8.9 or newer. Closes gh-165 and gh-173.
- New File.get_filesize() method for retrieving the HDF5 file size.
- Implemented methods to get/set the user block size in a HDF5 file (closes gh-123)
- Improved support for PyInstaller. Now it is easier to pack frozen applications that use the PyTables package (closes: gh-177). Thanks to Stuart Mentzer and Christoph Gohlke.
- All read methods now have an optional out argument that allows passing a pre-allocated array in which to store the data (closes gh-192)
- Added support for the floating point data types with extended precision (Float96, Float128, Complex192 and Complex256). This feature is only available if numpy provides it as well. Closes gh-51 and gh-214. Many thanks to Andrea Bedini.
- Consistent create_xxx() signatures. Now it is possible to create all datasets (Array, CArray, EArray, VLArray, and Table) from existing Python objects (closes gh-61 and gh-249). See also the API changes section.
- Complete rewrite of the nodes.filenode module. Now it is fully compliant with the interfaces defined in the standard io module. Only non-buffered binary I/O is supported currently. See also the API changes section. Closes gh-244.
- New pt2to3 tool is provided to help users to port their applications to the new API (see API changes section).
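The driver and in-memory image features listed above can be combined. The following is a minimal sketch, assuming PyTables 3.x built against HDF5 1.8.9 or newer; the file names and the node name values are placeholders chosen for illustration, and no file is written to disk because the backing store is disabled.

```python
import numpy as np
import tables

# Create a file that lives only in memory: CORE driver, no backing store.
h5 = tables.open_file("in_memory.h5", mode="w", driver="H5FD_CORE",
                      driver_core_backing_store=0)
h5.create_array("/", "values", obj=np.arange(5))
image = h5.get_file_image()  # copy of the whole HDF5 file as a bytes buffer
h5.close()

# Open a second in-memory file initialized from that buffer.
h5 = tables.open_file("in_memory_copy.h5", mode="r", driver="H5FD_CORE",
                      driver_core_image=image,
                      driver_core_backing_store=0)
values = h5.root.values.read()
h5.close()
```

The buffer returned by get_file_image() is ordinary bytes, so it can be sent over a network or stored elsewhere before being reopened.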
Improvements
- Improved runtime checks on dynamic loading of libraries: meaningful error messages are generated in case of failure. Also, PyTables no longer alters the system PATH. Closes gh-178 and gh-179 (thanks to Christoph Gohlke).
- Improved list of search paths for libraries as suggested by Nicholaus Halecky (see gh-219).
- Removed deprecated Cython include (.pxi) files. Contents of convtypetables.pxi have been moved into utilsextension.pyx. Closes gh-217.
- The internal Blosc library has been upgraded to version 1.2.3.
- Pre-load the bzip2 library on Windows (closes gh-205)
- The File.get_node() method now accepts unicode paths (closes gh-203)
- Improved compatibility with Cython 0.19 (see gh-220 and gh-221)
- Improved compatibility with numexpr 2.1 (see also gh-199 and gh-241)
- Improved compatibility with development versions of numpy (see gh-193)
- Packaging: starting with this release, the standard tar-ball package no longer includes the PDF version of the “PyTables User Guide”, so it is a little smaller. The complete, pre-built documentation in both HTML and PDF format is available in the file download area on SourceForge.net. Closes: gh-172.
- PyTables now also uses Travis-CI as a continuous integration service. All branches and all pull requests are automatically tested with different Python versions. Closes gh-212.
Other changes
- PyTables now requires Python 2.6 or newer.
- Minimum supported version of Numexpr is now 2.0.
API changes
The entire PyTables API has been made more PEP8 compliant (see gh-224).
This means that many methods, attributes, module global variables and also keyword parameters have been renamed to be compliant with PEP8 style guidelines (e.g. the tables.hdf5Version constant has been renamed to tables.hdf5_version).
We made our best effort to maintain compatibility with the old API for existing applications. In most cases, the old 2.x API is still available and usable even if it is now deprecated (see the Deprecations section).
The only important backwards incompatible API changes are in the names of function/method arguments. All uses of keyword arguments should be checked and fixed to use the new naming convention.
The new pt2to3 tool can be used to port PyTables based applications to the new API.
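To give a feel for the kind of renaming involved, here is a small sketch of a table-driven rename in plain Python. It is not the implementation of pt2to3; the helper port_source and the RENAMES table are hypothetical and cover only three renames (two taken from these notes, plus openFile to open_file).

```python
import re

# Old 2.x name -> new 3.0 name. A lookup table is used because some renames
# (e.g. whereAppend -> append_where) are not a plain camelCase conversion.
RENAMES = {
    "hdf5Version": "hdf5_version",
    "whereAppend": "append_where",
    "openFile": "open_file",
}


def port_source(text, renames=RENAMES):
    """Replace whole-word occurrences of old names with their new names."""
    names = "|".join(re.escape(name) for name in renames)
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], text)


old = 'h5 = tables.openFile("data.h5"); print(tables.hdf5Version)'
new = port_source(old)
```

For real code bases the pt2to3 tool itself is the better choice, since it knows the complete list of renamed names.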
Many deprecated features and support for obsolete modules have been dropped:
- The deprecated is_pro module constant has been removed.
- The nra module and support for the obsolete numarray module have been removed. The numarray flavor is no longer supported either (closes gh-107).
- Support for the obsolete Numeric module has been removed. The numeric flavor is no longer available (closes gh-108).
- The tables.netcdf3 module has been removed (closes gh-68).
- The deprecated exceptions.Incompat16Warning exception has been removed.
- The File.create_external_link() method no longer has a keyword parameter named warn16incompat. It was deprecated in PyTables 2.4.
Moreover:
The File.create_array(), File.create_carray(), File.create_earray(), File.create_vlarray(), and File.create_table() methods of the File objects gained a new (optional) keyword argument named obj. It can be used to initialize the newly created dataset with an existing Python object, though normally these are numpy arrays.
The atom/descriptor and shape parameters are now optional if the obj argument is provided.
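A minimal sketch of the obj argument, together with the out argument of the read methods mentioned under New features, assuming PyTables 3.x and numpy are installed. The file path and node names are placeholders.

```python
import os
import tempfile

import numpy as np
import tables

path = os.path.join(tempfile.mkdtemp(), "obj_demo.h5")
data = np.arange(12, dtype="float64").reshape(3, 4)

with tables.open_file(path, mode="w") as h5:
    # Neither atom nor shape is passed: both are inferred from obj.
    arr = h5.create_array("/", "a", obj=data)
    carr = h5.create_carray("/", "c", obj=data)

    # Read into a pre-allocated buffer instead of allocating a new array.
    out = np.empty((3, 4), dtype="float64")
    arr.read(out=out)
    carr_data = carr.read()
```

The buffer passed as out has to match the data being read in size and type, so it is allocated here with the same shape and dtype as the source array.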
The nodes.filenode module has been completely rewritten to be fully compliant with the interfaces defined in the io module.
The FileNode classes currently implemented are intended for binary I/O.
Main changes:
- the FileNode base class is no longer available,
- the new versions of the nodes.filenode.ROFileNode and nodes.filenode.RAFileNode objects no longer expose the offset attribute (the seek and tell methods can be used instead),
- the lineSeparator property is no longer available and the \n character is always used as line separator.
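The io-style interface can be sketched as follows, assuming PyTables 3.x; the file path and the node name notes are placeholders. Data is written and read as bytes, and seek takes the place of the removed offset attribute.

```python
import os
import tempfile

import tables
from tables.nodes import filenode

path = os.path.join(tempfile.mkdtemp(), "filenode_demo.h5")

with tables.open_file(path, mode="w") as h5:
    # Create a new file node and write binary data to it.
    fnode = filenode.new_node(h5, where="/", name="notes")
    fnode.write(b"first line\nsecond line\n")
    fnode.close()

    # Re-open the node read-only and use the standard io methods.
    fnode = filenode.open_node(h5.root.notes, "r")
    first = fnode.readline()   # lines always end with b"\n"
    fnode.seek(0)              # reposition instead of setting offset
    everything = fnode.read()
    fnode.close()
```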
The __version__ module constants have been removed from almost all the modules (they were not used after the switch to Git). Of course the package level constant (tables.__version__) still remains. Closes gh-112.
lrange() has been dropped in favor of xrange (gh-181).
The parameters.MAX_THREADS configuration parameter has been dropped in favor of parameters.MAX_BLOSC_THREADS and parameters.MAX_NUMEXPR_THREADS (closes gh-147).
The conditions.compile_condition() function no longer has a copycols argument; it has not been necessary since Numexpr 1.3.1. Closes gh-117.
The expectedsizeinMB parameter of File.create_vlarray() and of the VLArray.__init__() methods has been replaced by expectedrows. See also gh-35.
The Table.whereAppend() method has been renamed to Table.append_where() (closes gh-248).
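As an illustration of the renamed Table.append_where() method, the sketch below copies the rows matching a condition from one table to another. It assumes PyTables 3.x; the path, table names and column names are placeholders.

```python
import os
import tempfile

import numpy as np
import tables

path = os.path.join(tempfile.mkdtemp(), "append_where_demo.h5")
rows = np.array([(1, 0.5), (2, 1.5), (3, 2.5), (4, 3.5)],
                dtype=[("id", "i4"), ("value", "f8")])

with tables.open_file(path, mode="w") as h5:
    # Source table built from an existing structured array (obj argument).
    src = h5.create_table("/", "src", obj=rows)
    src.flush()
    # Empty destination table with the same column layout.
    dst = h5.create_table("/", "dst", description=rows.dtype)

    # Formerly Table.whereAppend(); returns the number of rows appended.
    n_copied = src.append_where(dst, "value > 1.0")
    dst.flush()
    copied_ids = dst.col("id").tolist()
```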
Please refer to the Migrating from PyTables 2.x to 3.x document for more details about API changes and for some useful hints about the migration process from the 2.x API to the new one.
Other possibly incompatible changes
All methods of the Table class that take start, stop and step parameters (including Table.read(), Table.where(), Table.iterrows(), etc.) have been redesigned to behave consistently. The start, stop and step parameters and their default values now always work exactly as in the standard slice objects. Closes gh-44 and gh-255.
Unicode attributes are no longer stored in the HDF5 file as pickled strings. They are now saved in the HDF5 file as UTF-8 encoded strings.
Although this does not introduce any API breakage, the files produced differ (for unicode attributes) from the ones produced by earlier versions of PyTables.
System attributes are now stored in the HDF5 file using the character set that reflects the native string behaviour: ASCII for Python 2 and UTF-8 for Python 3. In any case, system attributes are represented as Python strings.
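From the user's side the change in unicode attribute storage is transparent, as this small round-trip sketch suggests. It assumes PyTables 3.x; the path and the attribute name author are placeholders.

```python
import os
import tempfile

import tables

path = os.path.join(tempfile.mkdtemp(), "attrs_demo.h5")

# Store a non-ASCII user attribute on the root group.
with tables.open_file(path, mode="w") as h5:
    h5.root._v_attrs.author = u"Jos\u00e9"

# Read it back; on disk it is a UTF-8 string rather than a pickle.
with tables.open_file(path, mode="r") as h5:
    author = h5.root._v_attrs.author
```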
The iterrows() method of *Array and Table, as well as Table.itersorted(), now behave like functions in the standard itertools module. If the start parameter is provided and stop is None, the array/table is iterated from start to the last row. In PyTables < 3.0 only one element was returned.
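The start, stop and step semantics described in this section can be illustrated without PyTables. The sketch below uses a plain list as a stand-in for the rows of a table or array; the helper select is hypothetical and only mirrors how a standard slice object resolves its bounds.

```python
from itertools import islice

rows = list(range(10))  # stand-in for the rows of a table or array


def select(rows, start=None, stop=None, step=None):
    """Pick rows the way a standard slice(start, stop, step) would."""
    lo, hi, stride = slice(start, stop, step).indices(len(rows))
    return [rows[i] for i in range(lo, hi, stride)]


from_seven = select(rows, start=7)    # start only: runs to the last row
first_three = select(rows, stop=3)    # stop only: starts at row 0
strided = select(rows, 2, 9, 3)       # rows 2, 5 and 8

# itertools.islice treats stop=None the same way: iterate to the end.
tail = list(islice(rows, 7, None))
```

In PyTables 3.0 a call such as iterrows(start=7) on a ten-row dataset is expected to yield the same three rows as from_seven, where earlier versions returned a single element.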
Deprecations
- As described in API changes, all function, method and attribute names that were not compliant with the PEP8 guidelines have been changed. Old names are still available but they are deprecated.
- The use of upper-case keyword arguments in the open_file() function and the File class initializer is now deprecated. All parameters defined in the tables/parameters.py module can still be passed as keyword arguments to the open_file() function by using a lower-case version of the parameter name.
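A short sketch of the lower-case keyword form, assuming PyTables 3.x. The two parameters shown are the lower-case versions of MAX_BLOSC_THREADS and MAX_NUMEXPR_THREADS from tables/parameters.py; the path, the node name x and the thread counts are arbitrary.

```python
import os
import tempfile

import tables

path = os.path.join(tempfile.mkdtemp(), "params_demo.h5")

# Lower-case keywords replace the deprecated upper-case spelling
# (MAX_BLOSC_THREADS=1, MAX_NUMEXPR_THREADS=1).
with tables.open_file(path, mode="w",
                      max_blosc_threads=1,
                      max_numexpr_threads=1) as h5:
    h5.create_array("/", "x", obj=[1, 2, 3])
    stored = h5.root.x.read()
```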
Bugs fixed
- Better check access on closed files (closes gh-62)
- Fix for File.renameNode() where in certain cases File._g_updateLocation() was wrongly called (closes gh-208). Thanks to Michka Popoff.
- Fixed ptdump failure on data with nested columns (closes gh-213). Thanks to Alexander Ford.
- Fixed an error in open_file() when filename is a numpy.str_ (closes gh-204)
- Fixed gh-119, gh-230 and gh-232, where an index on Time64Col (only; Time32Col was ok) hid the data on selection from a Table. Thanks to Jeff Reback.
- Fixed the tables.tests.test_nestedtypes.ColsTestCase.test_00a_repr test method. Now the repr of cols on big-endian platforms is correctly handled (closes gh-237).
- Fixed a bug with completely sorted indexes where nrowsinbuf must be equal to or greater than the chunksize (thanks to Thadeus Burgess). Closes gh-206 and gh-238.
- Fixed an issue with reverse iteration in Table.itersorted() (closes gh-252 and gh-253).
Enjoy data!
—The PyTables Developers