Awkward

0.10.1

Closed a security hole and backward incompatibility in `awkward.persist.whitelist` handling.

0.10.0

A new semi-major version providing tweaks for Tables of VirtualArrays, so they can be used as lazy arrays in uproot. All changes are from PR 131.

0.9.0

As a new semi-major version, this is now the minimum version of awkward for uproot-methods and uproot. It includes:

* The first usable version of awkward-numba (jagged arrays only).
* PRs 122 and scikit-hep/uproot-methods#50, which move part of the machinery to construct jagged Lorentz vector arrays from jagged arrays into awkward itself.
* A long-standing fix to uproot's Pandas handling, which illegally depended on internals of the `JaggedArray` class to function. Now the interface is strictly through the public API.

0.9.0rc3

Added PR 123, which fixed a missing return and also updated the list of methods for `JaggedArrayNumba` to match `JaggedArray`.

0.9.0rc2

Includes PR 122, which is the first half, to be followed by scikit-hep/uproot-methods#50.

0.9.0rc1

Deploying release candidate so that uproot continuous integration is not broken.

0.8.15

PRs 117, 118, 120: new `JaggedArray.choose` and `argchoose`, as well as generalization of `concatenate` to `ObjectArrays` and `Tables`.

0.8.14

Added to the deserialization whitelist, primarily `uproot_methods.classes.*` (so that `TLorentzVectors` and such can be deserialized without manually adding it to the whitelist).
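The idea of a deserialization whitelist can be illustrated with a minimal sketch. Everything here is hypothetical: the `WHITELIST` patterns, the `is_allowed` and `check` helpers, and the dotted-name matching are assumptions made for illustration, not the actual contents or format of `awkward.persist.whitelist`.

```python
from fnmatch import fnmatchcase

# Hypothetical allow-list of dotted-name patterns (not awkward's real list).
WHITELIST = [
    "numpy.frombuffer",
    "awkward.*",
    "uproot_methods.classes.*",
]

def is_allowed(qualified_name, whitelist=WHITELIST):
    """Return True if the fully qualified name matches any whitelist pattern."""
    return any(fnmatchcase(qualified_name, pattern) for pattern in whitelist)

def check(qualified_name):
    """Refuse anything that is not explicitly allowed."""
    if not is_allowed(qualified_name):
        raise RuntimeError("refusing to call %r: not in whitelist" % qualified_name)
    return qualified_name
```

With a wildcard entry for `uproot_methods.classes.*`, every class under that package passes the check without being listed individually, while unrelated callables such as `os.system` are still rejected.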

0.8.13

PR 114: `awkward.save` and `awkward.toparquet` now have the same order of arguments: first the filename, then the array(s) to save. The order in `awkward.save` was chosen to be consistent with `numpy.save`.

0.8.12

PR 105 fixed two cases of not checking for empty arrays before calling `.max()`.

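Added `pad` and `fillna` to turn jagged arrays into Numpy arrays:

```python
a.pad(3, clip=True).fillna(999).regular()
# returns [[  1.1,   2.2,   3.3],
#          [999. , 999. , 999. ],
#          ...
```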
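For readers without awkward installed, the effect of the `pad`, `fillna`, `regular` chain can be approximated on plain Python lists. This is only a sketch of the behavior described above, not awkward's implementation; the three helper functions and the input `a` are assumptions chosen for illustration.

```python
def pad(jagged, length, clip=False):
    """Pad each sublist with None up to `length`; with clip=True, also truncate."""
    out = []
    for sub in jagged:
        sub = list(sub)
        if clip:
            sub = sub[:length]
        # A negative repeat count yields an empty list, so long sublists are kept.
        out.append(sub + [None] * (length - len(sub)))
    return out

def fillna(jagged, value):
    """Replace every None with `value`."""
    return [[value if x is None else x for x in sub] for sub in jagged]

def regular(jagged):
    """Return a rectangular list of lists, or fail if the lengths differ."""
    if len({len(sub) for sub in jagged}) > 1:
        raise ValueError("sublists have different lengths")
    return [list(sub) for sub in jagged]

# Assumed example input (not taken from the release notes).
a = [[1.1, 2.2, 3.3], [], [4.4, 5.5], [6.6, 7.7, 8.8, 9.9]]
result = regular(fillna(pad(a, 3, clip=True), 999))
# result == [[1.1, 2.2, 3.3], [999, 999, 999], [4.4, 5.5, 999], [6.6, 7.7, 8.8]]
```

Clipping is what makes the final step possible: without `clip=True`, the four-element sublist would keep its length and the result would not be rectangular.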

0.8.11

0.8.9 and 0.8.10 broke uproot's `tree.pandas.df()` because that function (illegally!) used the private method `_broadcast`. This release puts it back as an alias, which will make uproot work as long as the installed version of awkward isn't in this two-version window.

This will be handled properly soon.

0.8.10

PR 101: minor bug-fixes on version 0.8.9.

0.8.9

Various bug-fixes and improvements to broadcasting from PR 99.

The old internal member function `_broadcast` has been made part of the public API as `tojagged`. (Do not confuse this with the internal member function `_tojagged`, which will sooner or later be removed. The public `tojagged`, with no underscore, has a different definition and is intended to be maintained.)

0.8.8

All array types have an `nbytes` attribute, which determines eviction from uproot's `ArrayCache`. Without this attribute, the cache would fill up to a billion _arrays_ rather than a billion _bytes_!

The `nbytes` attribute counts only the data in arrays, not the Python objects that support those arrays (whose size differs between Python 2 and 3, and which PyPy doesn't track). It also doesn't count ephemeral attributes, even if they are arrays (like `JaggedArray._counts`, which exists only after `JaggedArray.counts` is first requested). It makes no distinction between owned and non-owned data, so views would be double-counted.

The `nbytes` algorithm always halts, even if structures have cyclic references (if `x.content is x`, the `nbytes` of `x` are not double-counted and do not lead to infinite recursion).
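One common way to obtain this halting guarantee is to remember which objects have already been visited. The sketch below is an assumption about the general technique, not awkward's code; `Node` is a hypothetical stand-in for an array wrapper with some raw bytes and an optional nested `content`.

```python
class Node:
    """Hypothetical array wrapper: raw bytes plus optional nested content."""
    def __init__(self, data=b"", content=None):
        self.data = data
        self.content = content

def nbytes(node, seen=None):
    """Sum the bytes reachable from `node`, visiting each object at most once."""
    if seen is None:
        seen = set()
    if node is None or id(node) in seen:
        return 0          # already counted (or nothing there): stop recursing
    seen.add(id(node))
    return len(node.data) + nbytes(node.content, seen)

x = Node(data=b"12345678")
x.content = x             # cyclic reference: x.content is x
total_cyclic = nbytes(x)  # 8, and the recursion terminates

y = Node(data=b"12", content=Node(data=b"abcd"))
total_nested = nbytes(y)  # 6
```

Tracking by object identity prevents infinite recursion, but two distinct wrappers that share the same underlying buffer are still both counted, which matches the double-counting of views noted above.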

0.8.7

This release adds `awkward.toarrow` and `awkward.toparquet`, renaming old functions to `awkward.fromarrow` and `awkward.fromparquet` for symmetry. They can only be used if you have `pyarrow` installed, which is not a strict dependency (must be explicitly installed). String columns can be converted from Arrow to Awkward, but not from Awkward to Arrow because of an open question (see comments).

The implemented conversion is really just between Awkward and Arrow, letting `pyarrow` convert to and from Parquet.

Top-level Awkward `Tables` (possibly under `ChunkedArray` or any `MaskedArray`) are converted into Arrow `Tables`, but deeper Awkward `Tables` are converted into Arrow `StructArrays`.

An Arrow array with an associated mask adds a `BitMaskedArray` to the Awkward structure. All Awkward `MaskedArrays` are pushed down to the deepest Arrow level that can accept them. This pushdown might not be necessary: a better understanding of how to generate Arrow buffers could remove the need for it.

Python types in Awkward `ObjectArrays` can't be saved to Arrow, as it's a multilingual serialization system.

Awkward `VirtualArrays` are evaluated before converting to Arrow. When reading _from_ Parquet, all columns of all chunks are presented as Awkward `VirtualArrays` so that they may be lazily read. By default, Awkward `VirtualArrays` are read-once: the `VirtualArray` object maintains a reference to the materialized array. That's good for multiple reading performance, but bad for memory use. The `cache` parameter of `fromparquet` lets you pass a dict-like cache, such as from the `cachetools` library.
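The trade-off between read-once behavior and an external cache can be sketched in plain Python. `LazyArray` and its `materialize` method are hypothetical names for illustration and do not reproduce the `VirtualArray` API; a plain `dict` stands in for a dict-like cache.

```python
class LazyArray:
    """Hypothetical lazily evaluated array with an optional dict-like cache."""
    def __init__(self, generator, cache=None, key=None):
        self.generator = generator
        self.cache = cache
        self.key = key if key is not None else id(self)
        self._array = None  # strong reference, used only when no cache is given

    def materialize(self):
        if self.cache is None:
            # Read-once: generate on first access and keep the result forever.
            if self._array is None:
                self._array = self.generator()
            return self._array
        try:
            return self.cache[self.key]
        except KeyError:
            # Not cached (or evicted): regenerate and store it in the cache.
            value = self.generator()
            self.cache[self.key] = value
            return value

calls = []
def load():
    calls.append(1)          # record each real read
    return [1.1, 2.2, 3.3]

read_once = LazyArray(load)
read_once.materialize()
read_once.materialize()      # no second read: the object holds the array

cache = {}
cached = LazyArray(load, cache=cache, key="column-0")
cached.materialize()
cache.clear()                # simulate eviction by a bounded cache
cached.materialize()         # evicted, so the data is read again
```

In the cached mode, memory is bounded by whatever eviction policy the cache applies, at the cost of rereading data that has been evicted.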

Awkward `ChunkedArrays` become `RecordBatches` in a `Table` in `toarrow` but separate `Tables` in `toparquet`. When reading `fromparquet`, the separate `Tables` define the level of granularity for incremental reading.

If `toparquet` is given an iterable of Awkward data, it will incrementally write the Parquet file. The same can be achieved by an Awkward `ChunkedArray` of `Tables` of `VirtualArray`, which is what `fromparquet` returns, so the output of `fromparquet` can be used as input to `toparquet`.

0.8.6

@guitargeek implemented `JaggedArray.concatenate(axis=1)`, which concatenates each subarray within two jagged arrays of the same length. (PR 80)
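On plain Python lists, the per-subarray concatenation described here looks roughly like the following. It is an illustrative sketch only; `concatenate_axis1` is a hypothetical helper, not the library method.

```python
def concatenate_axis1(first, second):
    """Concatenate the i-th sublist of `first` with the i-th sublist of `second`."""
    if len(first) != len(second):
        raise ValueError("jagged arrays must have the same length")
    return [list(x) + list(y) for x, y in zip(first, second)]

a = [[1, 2], [], [3]]
b = [[10], [20, 30], []]
joined = concatenate_axis1(a, b)
# joined == [[1, 2, 10], [20, 30], [3]]
```

The outer length is unchanged; only the inner lists grow.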

0.8.5

Adds a global switch to turn off all property checks (probably an insignificant performance cost) and validity checks (probably a major performance cost). The downside is that incorrectly-formed inputs to a calculation can render the final results meaningless.

```python
import awkward.array.base

# Turn off property checks and whole-array validity checks globally.
awkward.array.base.AwkwardArray.check_prop_valid = False
awkward.array.base.AwkwardArray.check_whole_valid = False
```

This was also one of the features [promised in the specification](https://github.com/scikit-hep/awkward-array/blob/master/specification.adoc#global-switches-and-types) but not implemented until now.

0.8.4

Fix the `JaggedArray.zip` method, bringing it up to date with the specification (PR 79).

0.8.3

Faster ufunc calculations with `JaggedArray` (#77).

0.8.2

Fixed loss of Numbaness on `__getitem__[string]` for @lgray.

0.8.1

Found and fixed a few more places where the wrong `awkward` or `numpy` module is being used.

0.8.0

Introduced [awkward-numba](https://pypi.org/project/awkward-numba/), a separately installable package to accelerate some methods with Numba's just in time compilation.

* Running `pip install awkward-numba` adds a package inside awkward: `import awkward.numba`.
* Arrays in `awkward.numba.*` behave just like arrays in `awkward.*` except that some methods are faster and require Numba to be installed.
* Reading persisted arrays with `awkwardlib="awkward.numba"` loads them in accelerated form.
* Both unaccelerated and accelerated arrays can be used in the same process; the latter are a subclass of the former.

To do this, many central facilities from `awkward.util` have been moved into the classes themselves. For instance, the Numpy module corresponding to a `JaggedArray` or a `JaggedArrayNumba` is now in `JaggedArray.numpy` or `JaggedArrayNumba.numpy`, respectively. This is to prepare the way for GPU extensions of awkward-array, in which `JaggedArrayCUDA.numpy` would actually be [CuPy](https://cupy.chainer.org).

Soon, [uproot-methods](https://github.com/scikit-hep/uproot-methods) and [uproot](https://github.com/scikit-hep/uproot) will depend on awkward 0.8.0 as a minimum version.

0.7.3

Remove README from data_files.

0.7.2

Fixed a bug in jagged fancy indexing: #66.

0.7.1

The `out` parameter of Numpy ufuncs is for in-place operations. awkward-array does not support in-place operations, so we now detect this parameter and forbid it.

0.7.0

Cleaned up and ready to become the new minimal version dependency.

uproot-methods and uproot will depend on this version.

0.6.2

Fixed #59.

0.6.1

* Tables whose `rowname` is `"tuple"` and whose fields are `map(str, range(n))` are visualized as tuples (so that you can see field values, unlike the opaque default) in `repr` and `tolist()`.
* Jagged `cross`, `pairs`, `distincts` and their `arg*` equivalents produce tuple-Tables.
* Jagged `cross`, `pairs`, `distincts` and their `arg*` equivalents have a `nested=False` option. If `nested=True`, the output is jagged one level deeper to keep track of which pairs contain the same left value. (For the explode-operate-reduce pattern.)
* Reducers are implemented on all types. Jagged reducers apply to the deepest level of jagged nesting only. Their `regularaxis=None` argument lets you send an `axis` argument to the Numpy array at the deepest level. Only rectangular arrays can be reduced in an arbitrary axis.
* Jagged `flatten` has an `axis=0` argument to determine which jagged level gets flattened. (This can't be negative.) `flatten` can happen at an arbitrary depth, but not reducers.
* The `array.at.whatever` syntax has been removed; it led to unreadable code.
* Table-tuple indexes like `["0"]`, `["1"]`, `["2"]`, etc. can be accessed by `.i0`, `.i1`, `.i2` (up to 9).
* Physics-motivated tests: jet cleaning and gen-reco matching.
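The `nested` option and the explode-operate-reduce pattern from the list above can be sketched on plain Python lists. The `cross` function below is a hypothetical reimplementation for illustration; it returns Python tuples rather than tuple-Tables and is not awkward's code.

```python
def cross(left, right, nested=False):
    """Per-event Cartesian product of two jagged lists of equal outer length."""
    out = []
    for xs, ys in zip(left, right):
        # One group per left value, holding all pairs that share that value.
        groups = [[(x, y) for y in ys] for x in xs]
        if nested:
            out.append(groups)
        else:
            out.append([pair for group in groups for pair in group])
    return out

left = [[1, 2], [], [3]]
right = [[10, 20], [30], [40]]

flat = cross(left, right)
# flat == [[(1, 10), (1, 20), (2, 10), (2, 20)], [], [(3, 40)]]

grouped = cross(left, right, nested=True)
# grouped == [[[(1, 10), (1, 20)], [(2, 10), (2, 20)]], [], [[(3, 40)]]]

# Explode (cross), operate (absolute difference), reduce (min per left value).
closest = [
    [min((abs(x - y) for x, y in group), default=None) for group in event]
    for event in grouped
]
# closest == [[9, 8], [], [37]]
```

Because the nested form keeps one group per left value, the reduction yields exactly one number per left value, which is the shape needed for tasks like matching each generated particle to its nearest reconstructed one.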

0.6.0

Added `awkward.fromiter` and `awkward.fromiterchunks` to convert arbitrary data into columnar awkward-arrays. Added a description of these functions in the specification.
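To make "columnar" concrete, a list of variable-length lists can be stored as one flat content list plus an offsets list. This is a simplified sketch of that layout, not how `awkward.fromiter` is implemented, and it handles only one level of nesting; `to_columnar` and `get` are hypothetical helpers.

```python
def to_columnar(lists):
    """Split a list of lists into (offsets, content)."""
    offsets = [0]
    content = []
    for sub in lists:
        content.extend(sub)
        offsets.append(len(content))  # end of this sublist, start of the next
    return offsets, content

def get(offsets, content, i):
    """Recover the i-th sublist from the columnar form."""
    return content[offsets[i]:offsets[i + 1]]

data = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
offsets, content = to_columnar(data)
# offsets == [0, 3, 3, 5]
# content == [1.1, 2.2, 3.3, 4.4, 5.5]
second = get(offsets, content, 2)
# second == [4.4, 5.5]
```

All values live in one contiguous sequence, so element-wise operations can run over `content` without looping over the individual sublists.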

Started a real README and stubbed out its chapters.

Renamed `Table.content` as `Table.contents` for symmetry with `UnionArray.contents`. Only product types and sum types have "contents" (plural); the rest have only a "content" (singular). In all cases, a "content" is an array (Numpy or awkward-array), but for product types, "contents" is an ordered dict of arrays and for sum types, "contents" is an ordered list of arrays.

Fixed a display bug (if `self` is an awkward-array, but `self[:3]` is a Numpy array because it's empty, it no longer causes an error).

0.5.6

Revert a change to the location of default types (`INDEXTYPE`, etc.) from `awkward.util` to `awkward.array.base.AwkwardArray`. uproot depended on its old location.

Now it's in both places, and someday it will be only in `awkward.array.base.AwkwardArray`, but only after a major awkward version update that uproot will have to depend on.

_Do not use awkward 0.5.5 with uproot!_