Datatable

Latest version: v1.1.0

Page 2 of 3

0.8.0

0.7.0

[v0.7.0](https://github.com/h2oai/datatable/compare/0.7.0...v0.6.0) — 2018-11-16
Added
- Frame can now be created from a list/dict of numpy arrays.
- Filters can now be used together with groupby expressions.
- fread's verbose output now includes time spent opening the input file.
- Added ability to read/write Jay files.
- Frames can now be constructed via the keyword-args list of columns
(i.e. `Frame(A=..., B=...)`).
- Implemented logical operators "and" `&` and "or" `|` for eager evaluator.
- Implemented integer division `//` and modulo `%` operators.
- A Frame can now have a key column (or columns).
- Key column(s) are saved when the frame is saved into a Jay file.
- A Frame can now be naturally-joined with a keyed Frame.
- Columns can now be updated within join expressions.
- The error message when selecting a column that does not exist in the Frame
now refers to similarly-named columns in that Frame, if there are any. At
most 3 possible columns are reported, and they are ordered from most likely
to least likely (1253).
- Frame() constructor now accepts a list of tuples, which it treats as rows
when creating the frame.
- Frame() can now be constructed from a list of named tuples, which will
be treated as rows and field names will be used as column names.
- frame.copy() can now be used to create a copy of the Frame.
- Frame() can now be constructed from a list of dictionaries, where each
item in the list represents a single row.
- Frame() can now be created from a datetime64 numpy array (1274).
- Groupby calculations are now parallel.
- `Frame.cbind()` now accepts a list of frames as the argument.
- Frame can now be sorted by multiple columns.
- new function `split_into_nhot()` to split a string column into fragments
and then convert them into a set of indicator variables ("n-hot encode").
- ability to convert object columns into strings.
- implemented `Frame.replace()` function.
- function `abs()` to find the absolute value of elements in the frame.
- improved handling of Excel files by fread:
  * sheet name can now be used as a path component in the file name,
    causing only that particular sheet to be parsed;
  * further, a cell range can be specified as a path component after the
    sheet name, forcing fread to consider only the provided cell range;
  * fread can now handle the situation when a spreadsheet has multiple
    separate tables in the same sheet. They will now be detected automatically
    and returned to the user as separate Frame objects (the name of each
    frame will contain the sheet name and cell range from where the data was
    extracted).
- HTML rendering of Frames inside a Jupyter notebook.
- set-theoretic functions: `union`, `intersect`, `setdiff` and `symdiff`.
- support for multi-column keys.
- ability to join Frames on multiple columns.
- In Jupyter notebook columns now have visual indicators of their types.
The logical types are color-coded, and the size of each element is
given by the number of dots (1428).
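
The "n-hot encode" idea behind `split_into_nhot()` can be illustrated without datatable. The following is a plain-Python sketch of the concept only: the function name `nhot_encode`, the comma separator, and the choice to turn a missing value into a row of zeros are assumptions made for illustration, not the library's documented behavior.

```python
def nhot_encode(values, sep=","):
    """Return (labels, rows): sorted fragment labels and one 0/1 row per input."""
    # Split every string into stripped, non-empty fragments.
    # None stands in for a missing value and yields no fragments (assumption).
    fragments = [
        set() if v is None else {p.strip() for p in v.split(sep) if p.strip()}
        for v in values
    ]
    # Each distinct fragment becomes one indicator column.
    labels = sorted(set().union(*fragments))
    rows = [[int(label in frags) for label in labels] for frags in fragments]
    return labels, rows


labels, rows = nhot_encode(["cat, dog", "dog", None, "mouse,cat"])
print(labels)  # ['cat', 'dog', 'mouse']
for row in rows:
    print(row)  # [1, 1, 0] / [0, 1, 0] / [0, 0, 0] / [1, 0, 1]
```

Each input string may set several indicators at once, which is what distinguishes n-hot from one-hot encoding.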

Changed
- `names` argument in `Frame()` constructor can no longer be a string --
use a list or tuple of strings instead.
- `Frame.resize()` removed -- same functionality is available via
assigning to `Frame.nrows`.
- `Frame.rename()` removed -- the `.names` setter can be used instead.
- `Frame([])` now creates a 0x0 Frame instead of 0x1.
- Parameter `inplace` in `Frame.cbind()` removed (was deprecated).
Instead of `inplace=False` use `dt.cbind(...)`.
- `Frame.cbind()` no longer returns anything (previously it returned self,
but this was confusing w.r.t whether it modifies the target, or returns
a modified copy).
- `DT[i, j]` now returns a python scalar value if `i` is integer, and `j`
is integer/string. This is referred to as "explicit element selection".
In the unlikely scenario when a single element needs to be returned as
a frame, one can always write `DT[i:i+1, j]` or `DT[[i], j]`.
- The performance of explicit element selection improved by a factor of 200.
- Building no longer requires an LLVM distribution.
- `DT[col]` syntax has been deprecated and now emits a warning. This
will be converted to an error in version 0.8.0, and will be interpreted
as row selector in 0.9.0.
- default format for `Frame.save()` is now "jay".

Fixed
- bug in dt.cbind() where the first Frame in the list was ignored.
- bug with applying a cast expression to a view column.
- occasional memory errors caused by a lack of available mmap handles.
- memory leak in groupby operations.
- `names` parameter in Frame constructor is now checked for correctness.
- bug in fread with QR bump occurring out-of-sample.
- `import datatable` now takes only 0.13s, down from 0.6s.
- fread no longer wastes time reading the full input, if max_nrows option is used.
- bug where the `max_nrows` parameter sometimes caused a segfault.
- fread performance bug caused by memory-mapped file being accidentally
copied into RAM.
- rare crash in fread when resizing the number of rows.
- saving view frames to csv.
- crash when sorting string columns containing NA strings.
- crash when applying a filter to a 0-rows frame.
- if `x` is a Frame, then `y = dt.Frame(x)` now creates a shallow copy
instead of a copy-by-reference.
- upgraded dependency version for typesentry, the previous version was not
compatible with Python 3.7.
- rare crash when converting a string column from pandas DataFrame, when
that column contains many non-ASCII characters.
- f-column-selectors no longer throw errors, and produce only unique ids
when stringified (1241).
- crash when saving a frame with many boolean columns into CSV (1278).
- incorrect .stypes/.ltypes property after calling cbind().
- calculation of min/max values in internal rowindex upon row resizing.
- frame.sort() with no arguments no longer produces an error.
- f-expressions now do not crash when reused with a different Frame.
- g-columns can be properly selected in a join (1352).
- writing to disk of columns > 2GB in size (1387).
- crash when sorting by multiple columns and the first column was
of string type (1401).
---
Download links

- Linux X86_64
  - [datatable-0.7.0-cp37-cp37m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp37-cp37m-linux_x86_64.whl) (for Python 3.7)
  - [datatable-0.7.0-cp36-cp36m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp36-cp36m-linux_x86_64.whl) (for Python 3.6)
  - [datatable-0.7.0-cp35-cp35m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp35-cp35m-linux_x86_64.whl) (for Python 3.5)

- PowerPC PPC64
  - [datatable-0.7.0-cp37-cp37m-linux_ppc64le.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp37-cp37m-linux_ppc64le.whl) (for Python 3.7)
  - [datatable-0.7.0-cp36-cp36m-linux_ppc64le.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp36-cp36m-linux_ppc64le.whl) (for Python 3.6)
  - [datatable-0.7.0-cp35-cp35m-linux_ppc64le.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp35-cp35m-linux_ppc64le.whl) (for Python 3.5)

- MacOSX
  - [datatable-0.7.0-cp37-cp37m-macosx_10_7_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp37-cp37m-macosx_10_7_x86_64.whl) (for Python 3.7)
  - [datatable-0.7.0-cp36-cp36m-macosx_10_7_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp36-cp36m-macosx_10_7_x86_64.whl) (for Python 3.6)
  - [datatable-0.7.0-cp35-cp35m-macosx_10_7_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0-cp35-cp35m-macosx_10_7_x86_64.whl) (for Python 3.5)

- Source Distribution
  - [datatable-0.7.0.tar.gz](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.7.0/datatable-0.7.0.tar.gz)

0.6.0

[v0.6.0](https://github.com/h2oai/datatable/compare/v0.6.0...v0.5.0) — 2018-06-05
Added
- fread will detect feather file and issue an appropriate error message.
- when fread extracts data from archives into memory, it will now display
the size of the extracted data in verbose mode.
- syntax `DT[i, j, by]` is now supported.
- multiple reduction operators can now be performed at once.
- in groupby, reduction columns can now be combined with regular or computed
columns.
- during grouping, group keys are now added automatically to the select list.
- implemented `sum()` reducer.
- `==` operator now works for string columns too.
- Improved performance of groupby operations.

Fixed
- fread will no longer emit an error if there is an NA string in the header.
- if the input contains excessively long lines, fread will no longer waste time
printing a sample of first 5 lines in verbose mode.
- fixed wrong calculation of mean / standard deviation of line length in fread
if the sample contained broken lines.
- frame view will no longer get stuck in a Jupyter notebook.
---
Download links

- [datatable-0.6.0-cp36-cp36m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.6.0//datatable-0.6.0-cp36-cp36m-linux_x86_64.whl)
- [datatable-0.6.0.tar.gz](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.6.0/datatable-0.6.0.tar.gz)
- [datatable-0.6.0-cp36-cp36m-macosx_10_7_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.6.0/datatable-0.6.0-cp36-cp36m-macosx_10_7_x86_64.whl)
- [datatable-0.6.0-cp36-cp36m-linux_ppc64le.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.6.0/datatable-0.6.0-cp36-cp36m-linux_ppc64le.whl)
- [datatable-0.6.0-cp35-cp35m-macosx_10_7_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.6.0/datatable-0.6.0-cp35-cp35m-macosx_10_7_x86_64.whl)
- [datatable-0.6.0-cp35-cp35m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.6.0/datatable-0.6.0-cp35-cp35m-linux_x86_64.whl)
- [datatable-0.6.0-cp35-cp35m-linux_ppc64le.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.6.0/datatable-0.6.0-cp35-cp35m-linux_ppc64le.whl)

0.5.0

[v0.5.0](https://github.com/h2oai/datatable/compare/v0.5.0...v0.4.0) — 2018-05-25
Added
- rbind()-ing now works on columns of all types (including between any types).
- `dt.rbind()` function to perform out-of-place row binding.
- ability to change the number of rows in a Frame.
- ability to modify a Frame in-place by assigning new values to particular
cells.
- `dt.__git_version__` variable containing the commit hash from which the
package was built.
- ability to read .bz2 compressed files with fread.

Fixed
- Ensure that fread only emits messages to Python from the master thread.
- Fread can now properly recognize quoted NA strings.
- Fixed error when unbounded f-expressions were printed to console.
- Fixed problems when operating with too many memory-mapped Frames at once.
- Fixed incorrect groupby calculation in some rare cases.
---
Download links

- [datatable-0.5.0-cp35-cp35m-macosx_10_7_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.5.0//datatable-0.5.0-cp35-cp35m-macosx_10_7_x86_64.whl)
- [datatable-0.5.0-cp36-cp36m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.5.0//datatable-0.5.0-cp36-cp36m-linux_x86_64.whl)
- [datatable-0.5.0-cp36-cp36m-macosx_10_7_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.5.0//datatable-0.5.0-cp36-cp36m-macosx_10_7_x86_64.whl)
- [datatable-0.5.0.tar.gz](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.5.0//datatable-0.5.0.tar.gz)
- [datatable-0.5.0-cp35-cp35m-linux_ppc64le.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.5.0//datatable-0.5.0-cp35-cp35m-linux_ppc64le.whl)
- [datatable-0.5.0-cp35-cp35m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.5.0//datatable-0.5.0-cp35-cp35m-linux_x86_64.whl)
- [datatable-0.5.0-cp36-cp36m-linux_ppc64le.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.5.0//datatable-0.5.0-cp36-cp36m-linux_ppc64le.whl)

0.4.0

[v0.4.0](https://github.com/h2oai/datatable/compare/0.4.0...v0.3.2) — 2018-05-07
Added
- Fread now parses integers with thousands separator (e.g. "1,000").
- Added option `fread.anonymize` which forces fread to anonymize all user input
in the verbose logs / error messages.
- Allow type-casts from booleans / integers / floats into strings.
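
As a rough illustration of what parsing an integer with a thousands separator involves, the sketch below accepts strings such as "1,000" and rejects malformed groupings. It is plain Python, not fread's parser; the helper name `parse_thousands_int` and the exact acceptance rules (optional sign, groups of exactly three digits after the first) are assumptions.

```python
import re

# Optional sign, 1-3 leading digits, then one or more ",ddd" groups.
_THOUSANDS_INT = re.compile(r"[+-]?\d{1,3}(,\d{3})+")


def parse_thousands_int(text):
    """Return an int for input like "1,000"; return None if the grouping is malformed."""
    text = text.strip()
    if _THOUSANDS_INT.fullmatch(text):
        return int(text.replace(",", ""))
    return None


print(parse_thousands_int("1,000"))       # 1000
print(parse_thousands_int("12,345,678"))  # 12345678
print(parse_thousands_int("1,00"))        # None
```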
---
Download links

- [datatable-0.4.0-cp36-cp36m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.4.0/datatable-0.4.0-cp36-cp36m-linux_x86_64.whl)
- [datatable-0.4.0.tar.gz](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.4.0/datatable-0.4.0.tar.gz)
- [datatable-0.4.0-cp35-cp35m-linux_x86_64.whl](https://h2o-release.s3.amazonaws.com/datatable/stable/datatable-0.4.0/datatable-0.4.0-cp35-cp35m-linux_x86_64.whl)

0.3.2

Added
- Implemented sorting for `str64` columns.
- write_csv can now write columns of type `str64`.
- Fread can now accept a list of files to read, or a glob pattern.

Fixed
- Fix the source distribution (`sdist`) by including all the files that are
required for building from source.
- Install no longer fails with `llvmlite 0.23.0` package.
