Datalad

Latest version: v1.0.2


Page 11 of 15

0.11.1

Not secure
Rushed out bugfix release to stay fully compatible with recent
[git-annex][] which introduced v7 to replace v6.

Fixes

- [install][]: be able to install recursively into a dataset ([2982][])
- [save][]: be able to commit/save changes whenever files potentially
could have swapped their storage between git and annex
([1651][]) ([2752][]) ([3009][])
- [aggregate-metadata][]:
  - the dataset itself is now not "aggregated" if specific paths are
    provided for aggregation ([3002][]). That resolves the issue of
    `-r` invocation aggregating all subdatasets of the specified dataset
    as well
  - also compare/verify the actual content checksum of aggregated metadata
    while considering subdataset metadata for re-aggregation ([3007][])
- `annex` commands are now chunked assuming 50% "safety margin" on the
maximal command line length. Should resolve crashes while operating
on too many files at once ([3001][])
- `run` sidecar config processing ([2991][])
- no double trailing period in docs ([2984][])
- correct identification of the repository with symlinks in the paths
in the tests ([2972][])
- re-evaluation of dataset properties in case of dataset changes ([2946][])
- [text2git][] procedure to use `ds.repo.set_gitattributes`
([2974][]) ([2954][])
- Switch to use plain `os.getcwd()` if inconsistency with env var
`$PWD` is detected ([2914][])
- Make sure that credential defined in env var takes precedence
([2960][]) ([2950][])
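
The command chunking mentioned above (a 50% "safety margin" on the maximal command line length) can be sketched roughly as follows. This is an illustration only, not DataLad's actual implementation: the function name `chunk_file_args`, its parameters, and the one-separator-per-argument length estimate are all assumptions.

```python
from typing import Iterator, List, Sequence


def chunk_file_args(
    base_cmd: Sequence[str],
    files: Sequence[str],
    max_cmdline_len: int,
    safety_margin: float = 0.5,
) -> Iterator[List[str]]:
    """Yield groups of files whose command line fits a reduced length budget."""
    # Only part of the maximal length is used; the rest is headroom.
    budget = int(max_cmdline_len * (1 - safety_margin))
    # Length of the fixed command part, counting one separator per argument.
    base_len = sum(len(arg) + 1 for arg in base_cmd)
    chunk: List[str] = []
    used = base_len
    for path in files:
        cost = len(path) + 1
        # Start a new chunk when adding this path would exceed the budget.
        if chunk and used + cost > budget:
            yield chunk
            chunk, used = [], base_len
        chunk.append(path)
        used += cost
    if chunk:
        yield chunk


# Hypothetical usage: ten 8-character file names and a 100-character limit.
files = ["f%03d.dat" % i for i in range(10)]
chunks = list(chunk_file_args(["git", "annex", "add"], files, max_cmdline_len=100))
```

Each chunk would then be passed to a separate invocation of the command, so that no single call comes near the operating system's limit.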

Enhancements and new features

- [shub://datalad/datalad:git-annex-dev](https://singularity-hub.org/containers/5663/view)
provides a Debian buster Singularity image with build environment for
[git-annex][]. `tools/bisect-git-annex` provides a helper for running
`git bisect` on git-annex using that Singularity container ([2995][])
- Added `.zenodo.json` for better integration with Zenodo for citation
- [run-procedure][] now provides names and help messages with a custom
renderer ([2993][])
- Documentation: point to [datalad-revolution][] extension (prototype of
the greater DataLad future)
- [run][]
  - supports injecting a detached command ([2937][])
- `annex` metadata extractor now extracts the `annex.key` metadata record.
This should make it possible to identify uses of specific files etc ([2952][])
- Test that we can install from http://datasets.datalad.org
- Proper rendering of `CommandError` (e.g. in case of "out of space"
error) ([2958][])

0.11

- [save][] fully replaces [add][] (which is obsolete now, and will be removed
in a future release).

- A new Git-annex aware [status][] command enables detailed inspection of dataset
hierarchies. The previously available [diff][] command has been adjusted to
match [status][] in argument semantics and behavior.

- The ability to configure dataset procedures before and after the execution of
particular commands has been replaced by a flexible "hook" mechanism that is able
to run arbitrary DataLad commands whenever command results are detected that match
a specification.

- Support of the Windows platform has been improved substantially. While performance
and feature coverage on Windows still fall behind Unix-like systems, typical data
consumer use cases and standard dataset operations, such as [create][] and [save][],
are now working. Basic support for data provenance capture via [run][] is also
functional.

- Support for Git-annex direct mode repositories has been removed, following the
end of support in Git-annex itself.

- The semantics of relative paths in command line arguments have changed. Previously,
a call `datalad save --dataset /tmp/myds some/relpath` would have been interpreted
as saving a file at `/tmp/myds/some/relpath` into dataset `/tmp/myds`. This has
changed to saving `$PWD/some/relpath` into dataset `/tmp/myds`. More generally,
relative paths are now always treated as relative to the current working directory,
except for path arguments of [Dataset][] class instance methods of the Python API.
The resulting partial duplication of path specifications between path and dataset
arguments is mitigated by the introduction of two special symbols that can be given
as dataset argument: `^` and `^.`, which identify the topmost superdataset and the
closest dataset that contains the working directory, respectively.

- The concept of a "core API" has been introduced. Commands situated in the module
`datalad.core` (such as [create][], [save][], [run][], [status][], [diff][])
receive additional scrutiny regarding API and implementation, and are
meant to provide longer-term stability. Application developers are encouraged to
preferentially build on these commands.
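
The change in relative path semantics described above can be illustrated with a small sketch. The helper `resolve_relpath` and its `legacy` switch are hypothetical names used only to contrast the two behaviors; the sketch does not touch the file system and does not cover the `^` and `^.` dataset symbols.

```python
from pathlib import PurePosixPath


def resolve_relpath(
    relpath: str, dataset: str, cwd: str, legacy: bool = False
) -> PurePosixPath:
    """Resolve a relative command line path under the old or new semantics."""
    # Old behavior: relative to the dataset given via --dataset.
    # New behavior: relative to the current working directory.
    base = dataset if legacy else cwd
    return PurePosixPath(base) / relpath


# For `datalad save --dataset /tmp/myds some/relpath` called from /home/me
# (the working directory here is an assumed example value):
old = resolve_relpath("some/relpath", "/tmp/myds", "/home/me", legacy=True)
new = resolve_relpath("some/relpath", "/tmp/myds", "/home/me")
```

Under the old semantics the path resolves inside `/tmp/myds`; under the new semantics it resolves under the working directory, while the dataset argument only names the dataset to save into.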

0.11.0

Not secure
[git-annex][] 6.20180913 (or later) is now required - provides a number of
fixes for v6 mode operations etc.

Major refactoring and deprecations

- `datalad.consts.LOCAL_CENTRAL_PATH` constant was deprecated in favor
of `datalad.locations.default-dataset` [configuration][config] variable
([2835][])

Minor refactoring

- `"notneeded"` messages are no longer reported by default results
renderer
- [run][] no longer shows commit instructions upon command failure when
`explicit` is true and no outputs are specified ([2922][])
- `get_git_dir` moved into GitRepo ([2886][])
- `_gitpy_custom_call` removed from GitRepo ([2894][])
- `GitRepo.get_merge_base` argument is now called `commitishes` instead
of `treeishes` ([2903][])

Fixes

- [update][] should not leave the dataset in non-clean state ([2858][])
and some other enhancements ([2859][])
- Fixed chunking of the long command lines to account for decorators
and other arguments ([2864][])
- Progress bar should not crash the process on some missing progress
information ([2891][])
- Default value for `jobs` set to be `"auto"` (not `None`) to take
advantage of possible parallel get if in `-g` mode ([2861][])
- [wtf][] must not crash if `git-annex` is not installed etc ([2865][]),
([2918][]), ([2917][])
- Fixed paths (with spaces etc) handling while reporting annex error
output ([2892][]), ([2893][])
- `__del__` should not access `.repo` but `._repo` to avoid attempts
for reinstantiation etc ([2901][])
- Fix up submodule `.git` right in `GitRepo.add_submodule` to avoid
added submodules being non git-annex friendly ([2909][]), ([2904][])
- [run-procedure][] ([2905][])
  - now provides the dataset to the procedure if called within a dataset
  - will not crash if the procedure is an executable without `.py` or `.sh`
    suffixes
- Use centralized `.gitattributes` handling while setting annex backend
([2912][])
- `GlobbedPaths.expand(..., full=True)` incorrectly returned relative
paths when called more than once ([2921][])

Enhancements and new features

- Report progress on [clone][] when installing from "smart" git servers
([2876][])
- Stale/unused `sth_like_file_has_content` was removed ([2860][])
- Enhancements to [search][] to operate on "improved" metadata layouts
([2878][])
- Output of `git annex init` operation is now logged ([2881][])
- New
  - `GitRepo.cherry_pick` ([2900][])
  - `GitRepo.format_commit` ([2902][])
- [run-procedure][] ([2905][])
  - procedures can now recursively be discovered in subdatasets as well.
    The uppermost has highest priority
  - Procedures in user and system locations now take precedence over
    those in datasets.

0.10.3.1

Not secure
Emergency bugfix to address forgotten boost of version in
`datalad/version.py`.

0.10.3

This is largely a bugfix release which addresses many (but not yet all)
issues of working with git-annex direct and version 6 modes, and operation
on Windows in general. Enhancements include support for public S3 buckets
(even with periods in their names), the ability to configure new providers
interactively, and an improved `egrep` search backend.

Although this release does not require it, we recommend making
sure that you are using a recent `git-annex`, since it also received a variety
of fixes and enhancements in the past months.

Fixes

- Parsing of combined short options has been broken since DataLad
v0.10.0. ([2710][])
- The `datalad save` instructions shown by `datalad run` for a command
with a non-zero exit were incorrectly formatted. ([2692][])
- Decompression of zip files (e.g., through `datalad
add-archive-content`) failed on Python 3. ([2702][])
- Windows:
  - colored log output was not being processed by colorama. ([2707][])
  - more codepaths now try multiple times when removing a file to deal
    with latency and locking issues on Windows. ([2795][])
- Internal git fetch calls have been updated to work around a
GitPython `BadName` issue. ([2712][]), ([2794][])
- The progress bar for annex file transferring was unable to handle an
empty file. ([2717][])
- `datalad add-readme` halted when no aggregated metadata was found
rather than displaying a warning. ([2731][])
- `datalad rerun` failed if `--onto` was specified and the history
contained no run commits. ([2761][])
- Processing of a command's results failed on a result record with a
missing value (e.g., absent field or subfield in metadata). Now the
missing value is rendered as "N/A". ([2725][]).
- A couple of documentation links in the "Delineation from related
solutions" were misformatted. ([2773][])
- With the latest git-annex, several known V6 failures are no longer
an issue. ([2777][])
- In direct mode, commit changes would often commit annexed content as
regular Git files. A new approach fixes this and resolves a good
number of known failures. ([2770][])
- The reporting of command results failed if the current working
directory was removed (e.g., after an unsuccessful `install`). ([2788][])
- When installing into an existing empty directory, `datalad install`
removed the directory after a failed clone. ([2788][])
- `datalad run` incorrectly handled inputs and outputs for paths with
spaces and other characters that require shell escaping. ([2798][])
- Globbing inputs and outputs for `datalad run` didn't work correctly
if a subdataset wasn't installed. ([2796][])
- Minor (in)compatibility with git 2.19 - (no) trailing period
in an error message now. ([2815][])
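
Rendering a missing value in a result record as "N/A", as mentioned in the fixes above, can be sketched with a `string.Formatter` subclass. This is one possible approach under stated assumptions, not necessarily how DataLad implements it; the class name `NAFormatter` and the sample record are made up for illustration.

```python
import string


class NAFormatter(string.Formatter):
    """Formatter that substitutes "N/A" for fields it cannot look up."""

    def get_field(self, field_name, args, kwargs):
        try:
            return super().get_field(field_name, args, kwargs)
        except (KeyError, IndexError, AttributeError, TypeError):
            # Absent field or subfield: fall back to a placeholder value.
            return "N/A", field_name


# Hypothetical result record that lacks the "author" subfield.
record = {"path": "a.txt", "metadata": {}}
line = NAFormatter().format("{path}: {metadata[author]}", **record)
```

A plain `str.format` call would raise `KeyError` here, which is the kind of failure the fix avoids.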

Enhancements and new features

- Anonymous access is now supported for S3 and other downloaders. ([2708][])
- A new interface is available to ease setting up new providers. ([2708][])
- Metadata: changes to egrep mode search ([2735][])
  - Queries in egrep mode are now case-sensitive when the query
    contains any uppercase letters and are case-insensitive otherwise.
    The new mode `egrepcs` can be used to perform a case-sensitive query
    with all lower-case letters.
  - Search can now be limited to a specific key.
  - Multiple queries (list of expressions) are evaluated using AND to
    determine whether something is a hit.
  - A single multi-field query (e.g., `pa*:findme`) is a hit, when any
    matching field matches the query.
  - All matching key/value combinations across all (multi-field)
    queries are reported in the `query_matched` result field.
  - egrep mode now shows all hits rather than limiting the results to
    the top 20 hits.
- The documentation on how to format commands for `datalad run` has
been improved. ([2703][])
- The method for determining the current working directory on Windows
has been improved. ([2707][])
- `datalad --version` now simply shows the version without the
license. ([2733][])
- `datalad export-archive` learned to export under an existing
directory via its `--filename` option. ([2723][])
- `datalad export-to-figshare` now generates the zip archive in the
root of the dataset unless `--filename` is specified. ([2723][])
- After importing `datalad.api`, `help(datalad.api)` (or
`datalad.api?` in IPython) now shows a summary of the available
DataLad commands. ([2728][])
- Support for using `datalad` from IPython has been improved. ([2722][])
- `datalad wtf` now returns structured data and reports the version of
each extension. ([2741][])
- The internal handling of gitattributes information has been
improved. A user-visible consequence is that `datalad create
--force` no longer duplicates existing attributes. ([2744][])
- The "annex" metadata extractor can now be used even when no content
is present. ([2724][])
- The `add_url_to_file` method (called by commands like `datalad
download-url` and `datalad add-archive-content`) learned how to
display a progress bar. ([2738][])
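
The case handling and AND evaluation of egrep mode queries described above can be sketched as follows. The helper names `compile_query` and `is_hit` are hypothetical, the record is a flat key/value dictionary chosen for illustration, and key-limited or multi-field queries are left out.

```python
import re
from typing import Dict, List


def compile_query(query: str, mode: str = "egrep") -> "re.Pattern":
    """Compile a query with "smart case" (egrep) or forced case (egrepcs)."""
    # egrepcs is always case-sensitive; egrep only if the query has uppercase.
    case_sensitive = mode == "egrepcs" or any(ch.isupper() for ch in query)
    return re.compile(query, 0 if case_sensitive else re.IGNORECASE)


def is_hit(record: Dict[str, str], queries: List[str], mode: str = "egrep") -> bool:
    """A record is a hit only if every query matches at least one value."""
    patterns = [compile_query(q, mode) for q in queries]
    return all(any(p.search(v) for v in record.values()) for p in patterns)


# Made-up metadata record for demonstration.
record = {"name": "Study of Mice", "author": "jane doe"}
lower_hit = is_hit(record, ["mice"])                # case-insensitive match
upper_miss = is_hit(record, ["MICE"])               # uppercase forces exact case
cs_miss = is_hit(record, ["mice"], mode="egrepcs")  # forced case-sensitive
```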

0.10.2

Not secure
Primarily a bugfix release to accommodate a recent git-annex release
that forbids file:// and http://localhost/ URLs, which might
reveal private files if the annex is publicly shared.

Fixes

- fixed testing to be compatible with recent git-annex (6.20180626)
- [download-url][] will now download to current directory instead of the
top of the dataset

Enhancements and new features

- do not quote ~ in URLs to be consistent with quote implementation in
Python 3.7 which now follows RFC 3986
- [run][] support for user-configured placeholder values
- documentation on native git-annex metadata support
- handle 401 errors from LORIS tokens
- `yoda` procedure will instantiate `README.md`
- `--discover` option added to [run-procedure][] to list available
procedures
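
The `~` quoting behavior noted above can be reproduced independently of the Python version by listing `~` among the safe characters. This is a minimal sketch; the helper name `quote_path` is an assumption and not part of DataLad's API.

```python
from urllib.parse import quote


def quote_path(path: str) -> str:
    """Percent-encode a URL path while leaving "/" and "~" untouched."""
    # Python 3.7+ already treats "~" as unreserved (RFC 3986); naming it in
    # `safe` gives the same result on older interpreters as well.
    return quote(path, safe="/~")


quoted = quote_path("/~user/my file.txt")
```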

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.