LazyArray API completely revamped; it now supports lazy arrays across files:
python
this step reads nothing except TTree headers to get a global map of numentries
(and returns immediately):
>>> la = uproot.tree.lazyarrays("tests/samples/sample-*.root", "sample")
>>> la
{'ab': <LazyArray 'ab' at 7bd3e5355650>, 'f4': <LazyArray 'f4' at 7bd3e5355cd0>,
'Ai8': <LazyArray 'Ai8' at 7bd3e5355bd0>, 'f8': <LazyArray 'f8' at 7bd3e5355d90>,
'Ai4': <LazyArray 'Ai4' at 7bd3e5355a50>, 'Ai1': <LazyArray 'Ai1' at 7bd3e5355750>,
'Ai2': <LazyArray 'Ai2' at 7bd3e53558d0>, 'ai2': <LazyArray 'ai2' at 7bd3e5355890>,
'ai8': <LazyArray 'ai8' at 7bd3e5355b90>, 'u4': <LazyArray 'u4' at 7bd3e5355a90>,
'u1': <LazyArray 'u1' at 7bd3e5355790>, 'u2': <LazyArray 'u2' at 7bd3e5355910>,
'i8': <LazyArray 'i8' at 7bd3e5355b50>, 'i1': <LazyArray 'i1' at 7bd3e53556d0>,
'Au2': <LazyArray 'Au2' at 7bd3e5355990>, 'i2': <LazyArray 'i2' at 7bd3e5355850>,
'i4': <LazyArray 'i4' at 7bd3e53559d0>, 'Au1': <LazyArray 'Au1' at 7bd3e5355810>,
'b': <LazyArray 'b' at 7bd3e5355610>, 'Af8': <LazyArray 'Af8' at 7bd3e5355e10>,
'Au4': <LazyArray 'Au4' at 7bd3e5355b10>, 'ai1': <LazyArray 'ai1' at 7bd3e5355710>,
'Af4': <LazyArray 'Af4' at 7bd3e5355d50>, 'Au8': <LazyArray 'Au8' at 7bd3e5355c90>,
'u8': <LazyArray 'u8' at 7bd3e5355c10>, 'Ab': <LazyArray 'Ab' at 7bd3e5355690>,
'n': <LazyArray 'n' at 7bd3e53555d0>, 'au1': <LazyArray 'au1' at 7bd3e53557d0>,
'au2': <LazyArray 'au2' at 7bd3e5355950>, 'af8': <LazyArray 'af8' at 7bd3e5355dd0>,
'au4': <LazyArray 'au4' at 7bd3e5355ad0>, 'str': <LazyArray 'str' at 7bd3e5355e50>,
'ai4': <LazyArray 'ai4' at 7bd3e5355a10>, 'af4': <LazyArray 'af4' at 7bd3e5355d10>,
'au8': <LazyArray 'au8' at 7bd3e5355c50>}
this object represents the "i4" branch across all files
(24 in this case; the files matching the glob pattern above; you can also use explicit lists)
>>> la["i4"]
<LazyArray 'i4' at 7bd3e53559d0>
use __getitem__ to actually read anything
>>> la["i4"][10:20]
array([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4], dtype=int32)
And it has a Dask connector:
python
>>> la["i4"].dask.array()
dask.array<array, shape=(720,), dtype=int32, chunksize=(7,)>
The default Dask chunksize is the basket size (ridiculously small in this example, just to demonstrate).