Fix the `JaggedArray.zip` method, bringing it up to date with the specification (PR 79).
Faster ufunc calculations with JaggedArray (77).
Fixed loss of Numbaness on `__getitem__[string]` for lgray.
Found and fixed a few more places where the wrong `awkward` or `numpy` module was being used.
Introduced [awkward-numba](https://pypi.org/project/awkward-numba/), a separately installable package to accelerate some methods with Numba's just-in-time compilation.
* Running `pip install awkward-numba` adds a package inside awkward: `import awkward.numba`.
* Arrays in `awkward.numba.*` behave just like arrays in `awkward.*` except that some methods are faster and require Numba to be installed.
* Reading persisted arrays with `awkwardlib="awkward.numba"` loads them in accelerated form.
* Both unaccelerated and accelerated arrays can be used in the same process; the latter are a subclass of the former.
To do this, many central facilities from `awkward.util` have been moved into the classes themselves. For instance, the Numpy module corresponding to a `JaggedArray` or a `JaggedArrayNumba` is now in `JaggedArray.numpy` or `JaggedArrayNumba.numpy`, respectively. This is to prepare the way for GPU extensions of awkward-array, in which `JaggedArrayCUDA.numpy` would actually be [CuPy](https://cupy.chainer.org).
Soon, [uproot-methods](https://github.com/scikit-hep/uproot-methods) and [uproot](https://github.com/scikit-hep/uproot) will depend on awkward 0.8.0 as a minimum version.
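The idea of keeping the array library on the class can be sketched in plain Python. This is only an illustration of the pattern, not the awkward-array source: the class names `BackendSketch` and `BackendSketchSwapped` are hypothetical, and the standard-library modules `math` and `cmath` stand in for Numpy and an alternative backend.

```python
import math
import cmath


class BackendSketch:
    # Hypothetical stand-in for a class such as JaggedArray: the numerical
    # library is a class attribute rather than a module-level import.
    backend = math

    def __init__(self, values):
        self.values = list(values)

    def sqrt(self):
        # Methods reach the library through the class, so a subclass that
        # overrides `backend` changes what this method calls.
        return [self.backend.sqrt(v) for v in self.values]


class BackendSketchSwapped(BackendSketch):
    # Stand-in for an accelerated or GPU subclass: only the backend differs.
    backend = cmath


plain = BackendSketch([4.0, 9.0])
swapped = BackendSketchSwapped([4.0, 9.0])

print(plain.sqrt())                         # uses math.sqrt
print(swapped.sqrt())                       # uses cmath.sqrt
print(isinstance(swapped, BackendSketch))   # the subclass is still the base type
```

Because the subclass only overrides one attribute, both kinds of array can coexist in the same process and share every method definition.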
Remove README from data_files.
Fixed a bug in jagged fancy indexing: 66.
The `out` parameter in Numpy ufuncs is for in-place operations. awkward-array does not support in-place operations, so we now detect this and forbid it.
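A minimal sketch of such a check, assuming it lives in the `__array_ufunc__` hook that Numpy calls on array-like objects. The class `NoInPlaceSketch`, the choice of `NotImplementedError`, and the error message are assumptions for illustration; the hook is called by hand here so that the example runs without Numpy.

```python
import operator


class NoInPlaceSketch:
    # Hypothetical array-like class; only the `out` check is illustrated.
    def __init__(self, data):
        self.data = list(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Numpy forwards `out=` to this hook as a keyword argument.
        if kwargs.get("out") is not None:
            raise NotImplementedError("in-place operations are not supported")
        length = len(self.data)
        # Broadcast scalars to the array length, then apply elementwise.
        columns = [x.data if isinstance(x, NoInPlaceSketch) else [x] * length
                   for x in inputs]
        return NoInPlaceSketch(ufunc(*args) for args in zip(*columns))


a = NoInPlaceSketch([1, 2, 3])

# An ordinary call returns a new array.
result = a.__array_ufunc__(operator.add, "__call__", a, 10)

# A call with `out` is rejected before any work is done.
message = None
try:
    a.__array_ufunc__(operator.add, "__call__", a, 10, out=(a,))
except NotImplementedError as err:
    message = str(err)

print(result.data)
print(message)
```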
Cleaned up and ready to become the new minimal version dependency.
uproot-methods and uproot will depend on this version.
* Tables whose `rowname` is `"tuple"` and whose fields are `map(str, range(n))` are visualized as tuples (so that you can see field values, unlike the opaque default) in `repr` and `tolist()`.
* Jagged `cross`, `pairs`, `distincts` and their `arg*` equivalents produce tuple-Tables.
* Jagged `cross`, `pairs`, `distincts` and their `arg*` equivalents have a `nested=False` option. If `nested=True`, the output is jagged one level deeper to keep track of which pairs contain the same left value. (For the explode-operate-reduce pattern.)
* Reducers are implemented on all types. Jagged reducers apply to the deepest level of jagged nesting only. Their `regularaxis=None` argument lets you send an `axis` argument to the Numpy array at the deepest level. Only rectangular arrays can be reduced along an arbitrary axis.
* Jagged `flatten` has an `axis=0` argument to determine which jagged level gets flattened. (This can't be negative.) `flatten` can happen at an arbitrary depth, but not reducers.
* The `array.at.whatever` syntax has been removed; it led to unreadable code.
* Table-tuple indexes like `["0"]`, `["1"]`, `["2"]`, etc. can be accessed by `.i0`, `.i1`, `.i2` (up to 9).
* Physics-motivated tests: jet cleaning and gen-reco matching.
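The explode-operate-reduce pattern mentioned above can be sketched with nested Python lists. This is a conceptual illustration only: plain tuples stand in for tuple-Tables, and the helper names `cross_sketch` and `reduce_deepest` are hypothetical, not awkward-array functions.

```python
def cross_sketch(left, right, nested=False):
    """Per-entry Cartesian product of two jagged lists, as (l, r) tuples."""
    out = []
    for ls, rs in zip(left, right):
        if nested:
            # One extra level of nesting: one sublist per left value.
            out.append([[(l, r) for r in rs] for l in ls])
        else:
            out.append([(l, r) for l in ls for r in rs])
    return out


def reduce_deepest(array, depth, reducer=sum):
    """Apply `reducer` to the innermost lists; `depth` counts list levels."""
    if depth == 1:
        return reducer(array)
    return [reduce_deepest(x, depth - 1, reducer) for x in array]


left = [[1, 2], [3]]
right = [[10, 20], [30]]

# Explode: all pairs, flat within each entry or grouped by left value.
flat = cross_sketch(left, right)
grouped = cross_sketch(left, right, nested=True)

# Operate: combine the two members of every pair.
products = [[[l * r for l, r in group] for group in entry] for entry in grouped]

# Reduce: summing the deepest level leaves one value per left element.
reduced = reduce_deepest(products, depth=3)

print(flat)
print(grouped)
print(reduced)
```

With the nested form, the reduced result lines up with `left`, so each left value ends up with one summary number over all of its partners.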
Added `awkward.fromiter` and `awkward.fromiterchunks` to convert arbitrary data into columnar awkward-arrays. Added a description of these functions in the specification.
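As a rough picture of what "columnar" means here, the sketch below converts a list of variable-length lists into one flat `content` list plus an `offsets` list marking where each sublist begins and ends. It covers only the list-of-lists case, and the function names `to_columnar` and `to_lists` are hypothetical; it is not the implementation of `awkward.fromiter`.

```python
def to_columnar(lists):
    """Flatten variable-length sublists into (offsets, content)."""
    offsets = [0]
    content = []
    for sub in lists:
        content.extend(sub)
        # Each offset records where the next sublist would start.
        offsets.append(len(content))
    return offsets, content


def to_lists(offsets, content):
    """Rebuild the sublists from (offsets, content)."""
    return [content[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]


data = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
offsets, content = to_columnar(data)

print(offsets)
print(content)
print(to_lists(offsets, content) == data)
```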
Started a real README and stubbed out its chapters.
Renamed `Table.content` as `Table.contents` for symmetry with `UnionArray.contents`. Only product types and sum types have "contents" (plural); the rest have only a "content" (singular). In all cases, a "content" is an array (Numpy or awkward-array), but for product types, "contents" is an ordered dict of arrays and for sum types, "contents" is an ordered list of arrays.
Fixed a display bug (if `self` is an awkward-array, but `self[:3]` is a Numpy array because it's empty, it no longer causes an error).
Revert a change to the location of default types (`INDEXTYPE`, etc.) from `awkward.util` to `awkward.array.base.AwkwardArray`. uproot depended on its old location.
Now it's in both places, and someday it will be only in `awkward.array.base.AwkwardArray`, but only after a major awkward version update that uproot will have to depend on.
_Do not use awkward 0.5.5 with uproot!_
Wrote specification and touched up the library to match. The library isn't completely in agreement with the specification, but it is close.
Optimized validity checking for JaggedArrays with `starts` and `stops` that correspond to a single `offsets` (a very common case).
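To illustrate what "correspond to a single `offsets`" means, the sketch below tests whether each sublist starts exactly where the previous one stopped, in which case `starts` and `stops` are two views of one `offsets` list. The function name is hypothetical and the check is written with plain lists, so it shows the condition rather than the library's optimized code.

```python
def offsets_from_starts_stops(starts, stops):
    """Return the single offsets list if starts/stops are contiguous, else None."""
    if len(starts) != len(stops):
        return None
    if len(starts) == 0:
        return [0]  # assumption: an empty array is represented by offsets [0]
    # No sublist may end before it begins.
    if any(start > stop for start, stop in zip(starts, stops)):
        return None
    # Contiguous: every sublist begins where the previous one ended.
    if any(starts[i + 1] != stops[i] for i in range(len(starts) - 1)):
        return None
    return list(starts) + [stops[-1]]


contiguous = offsets_from_starts_stops([0, 3, 3], [3, 3, 5])
with_gap = offsets_from_starts_stops([0, 4], [3, 6])

print(contiguous)
print(with_gap)
```

When the function returns a list, the remaining validity questions reduce to properties of that one list, such as whether its last entry fits inside the content.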
When selecting fields with a `__getitem__` string, the result would always be valid, so it is now preloaded with `_isvalid = True` (for all classes, not just JaggedArray).
Rewrote `JaggedArray._tojagged` so that it would not depend on the `content` being a Numpy array, as part of a fix of 49. (_That_ error was caused by an incompletely filled `numpy.empty` array in the old `JaggedArray._tojagged`, but now it's moot.)
Also added performance-testing options. If the following are set:
    awkward.array.base.AwkwardArray.allow_tonumpy = False
    awkward.array.base.AwkwardArray.allow_iter = False

then no arrays will be auto-converted to Numpy or be iterable in Python (except in `__str__` and `__repr__`). These are the two slowest operations, and refusing them with a `RuntimeError` may help a user search for bugs.
(The option can also be set on a particular subclass of `AwkwardArray` or a particular instance.)
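The reason one flag can be set globally, per subclass, or per instance is ordinary Python attribute lookup. The sketch below shows the mechanism with hypothetical classes `FlaggedArray` and `FlaggedJagged`; only the flag name `allow_iter` and the `RuntimeError` come from the text above.

```python
class FlaggedArray:
    allow_iter = True  # class-level default, inherited by subclasses

    def __init__(self, data):
        self.data = list(data)

    def __iter__(self):
        # Lookup order is instance, then subclass, then base class.
        if not self.allow_iter:
            raise RuntimeError("iteration is disabled (allow_iter = False)")
        return iter(self.data)

    def __repr__(self):
        # Reads self.data directly, so it works even when iteration is off.
        return "<{0} {1}>".format(type(self).__name__, self.data)


class FlaggedJagged(FlaggedArray):
    pass


def is_blocked(array):
    """Return True if iterating over `array` raises RuntimeError."""
    try:
        list(array)
    except RuntimeError:
        return True
    return False


a = FlaggedArray([1, 2, 3])
b = FlaggedJagged([4, 5])
c = FlaggedArray([6])

FlaggedJagged.allow_iter = False  # affects only the subclass
c.allow_iter = False              # affects only this instance

print(is_blocked(a), is_blocked(b), is_blocked(c))
print(repr(c))
```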
Empty list/array should not be interpreted as an empty string array.
* Access named columns via `myarray.at.x`, equivalent to `myarray["x"]`.
* Access numeric columns via `myarray.at(0)`, equivalent to `myarray["0"]`.
* All subindexes of jagged indexing are now supported.
* Added `astype` method to all classes. (39)
* Corrected use of `reduceat` in jagged reducers. (38)
* Bubble up mix-in methods from deeper nesting when selecting through column name.
* Added `test_crosscut` and `test_methods`.
Fixed 31 and 33. It is now possible to slice the second dimension of a jagged array:
Jagged arrays that happen to have fixed dimension can be cast as a regular array:
And an error was fixed in filtering jagged arrays with jagged masks that affected sublists with more than 256 elements.
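The two operations described above can be pictured with nested Python lists. This is a sketch of their meaning only: the helper names `slice_inner` and `to_regular` are hypothetical, and the choice of `ValueError` for non-rectangular input is an assumption.

```python
def slice_inner(jagged, inner_slice):
    """Apply the same slice to every sublist (the 'second dimension')."""
    return [sub[inner_slice] for sub in jagged]


def to_regular(jagged):
    """Return a rectangular list of lists, or fail if lengths differ."""
    lengths = {len(sub) for sub in jagged}
    if len(lengths) > 1:
        raise ValueError("sublists have different lengths")
    return [list(sub) for sub in jagged]


jagged = [[1.1, 2.2, 3.3], [4.4, 5.5], [6.6, 7.7, 8.8, 9.9]]

# Keep the first two items of every sublist.
first_two = slice_inner(jagged, slice(0, 2))

# After that slice every sublist has length 2, so the result is rectangular.
regular = to_regular(first_two)

# The original array is not rectangular, so the cast is refused.
refused = False
try:
    to_regular(jagged)
except ValueError:
    refused = True

print(first_two)
print(regular)
print(refused)
```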
Persist Table views, not just base Tables.
Formally migrated to pytest, fixed the infinite loop when converting awkward-arrays to Numpy (19), fixed a bug in UnionArray ufuncs (probably 15), and added a mechanism to avoid pickle in awkward-array serializations.
Released to get a Zenodo DOI.
Arrow buffers can now be viewed as awkward arrays, and by extension, Parquet files. Parquet files are read lazily as `ChunkedArray(VirtualArray(...))`.
Persistence has also been updated, both to accommodate Arrow/Parquet and also to provide a consistent, clean interface.
Persistence: awkward arrays may now be read from and written to pickle, ZIP files, HDF5 files, and anything with a dict-like interface (e.g. shelve). Compression and cross-references/cyclic-references are included.
Added `JaggedArray.fromindex` (10) and fixed `UnionArray.fromtags` (8).
Finished a basic implementation of all planned awkward array classes.
Jagged, table, object, indexed, and chunked arrays are done; union, masked, sparse, appendable, and virtual arrays are not.
Fixed a bug in which a Table.Row.(column name) returned the whole column, not the value for a single row and column. This only affected the dot notation, not the bracket notation, but is now fixed for both.