Opentargets-validator

Latest version: v1.0.0



2.2.2

1.0.0

Now 17× faster
The IMPC data set contains 1,332,579 lines (1.82 GB of uncompressed JSON). Running in parallel on a 12-core machine with 64 GB RAM:
* Old validator runs in **13 minutes 46 seconds** (826 seconds);
* New validator runs in **48 seconds**.

This is achieved by switching from `jsonschema` to `fastjsonschema` with compiled validator objects, and also by processing evidence strings in blocks to decrease multiprocessing overhead.
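The block-processing idea can be sketched roughly as follows. This is an illustrative, standard-library-only sketch, not the validator's actual code: `iter_blocks`, `validate_block`, and the block size are hypothetical, and the simple required-key check stands in for a compiled `fastjsonschema` validator. Each block is one task, so when the blocks are handed to worker processes the per-task overhead is paid once per block instead of once per line.

```python
import json
from itertools import islice


def iter_blocks(lines, block_size):
    """Yield (first_line_number, block) pairs, each block holding up to block_size lines."""
    iterator = iter(lines)
    line_number = 1
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield line_number, block
        line_number += len(block)


def validate_block(task, required=("id",)):
    """Validate one block and return a list of (line_number, message) errors.

    The required-key check is a placeholder for a compiled schema validator.
    """
    first_line, block = task
    errors = []
    for offset, line in enumerate(block):
        number = first_line + offset
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            errors.append((number, "not a JSON object"))
            continue
        for key in required:
            if key not in record:
                errors.append((number, f"missing '{key}'"))
    return errors


lines = ['{"id": "a"}', '{"x": 1}', 'not json', '{"id": "b"}', '{"id": "c"}']
# Sequential here; a process pool's map() could consume the same blocks.
all_errors = [err for task in iter_blocks(lines, 2) for err in validate_block(task)]
```

Carrying the first line number with each block lets every worker report errors against the original line numbers without any shared state.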

Does not hang
The `pypeln` library, which caused many problems with the validator hanging at random, is no longer used.

Helpful error messages
The new error message format immediately draws attention to the error and doesn't leave the user guessing as to what went wrong. It also includes the original evidence string that failed validation:

2023-09-07 14:08:03,454 - opentargets_validator.validator - ERROR -
Line 82 is a valid JSON object, but it does not match the schema:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃data.effects[0] must contain ['direction'] properties┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
{"id":"ENSG00000006071","event":"abnormal pulse","eventId":"HP_0031860","datasource":"Lynch et al. (2017)","effects":[{"dosing":"acute"}],"literature":"28216264","biosamples":[{"tissueLabel":"pancreas","tissueId":"UBERON_0001264"},{"tissueLabel":"renal","tissueId":"UBERON_0001008"},{"tissueLabel":"cardiovascular","tissueId":"UBERON_0004535"}]}
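A report in this style could be assembled along the following lines. This is a minimal sketch under stated assumptions: `boxed` and `format_error` are hypothetical helper names, not the validator's real functions, and the evidence string is shortened for readability.

```python
def boxed(message):
    """Wrap a single-line message in a heavy box-drawing frame."""
    bar = "━" * len(message)
    return f"┏{bar}┓\n┃{message}┃\n┗{bar}┛"


def format_error(line_number, message, evidence):
    """Build a report: header line, boxed message, then the offending line."""
    header = (
        f"Line {line_number} is a valid JSON object, "
        "but it does not match the schema:"
    )
    return "\n".join([header, boxed(message), evidence])


report = format_error(
    82,
    "data.effects[0] must contain ['direction'] properties",
    '{"effects":[{"dosing":"acute"}]}',  # shortened evidence string
)
print(report)
```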


No legacy dependencies
The validator used to have six dependencies: `requests`, `jsonschema`, `rfc3987`, `simplejson`, `pypeln`, `opentargets-urlzsource`. The last one, in particular, was moved to the Open Targets archive repository long ago and is no longer supported.

Now just two shiny and new dependencies: `pathos` and `fastjsonschema`.

Other improvements
* Updated README
* Updated CLI help message (the actual usage syntax is unaffected)
* Added `--version` argument to print version and exit
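A `--version` flag of this kind is commonly wired up with `argparse`'s built-in `version` action. The sketch below is an assumption about how it could be done, not the validator's actual CLI code; the program name `validator-sketch` and the `VERSION` constant are placeholders.

```python
import argparse

VERSION = "1.0.0"  # placeholder value for illustration


def build_parser():
    parser = argparse.ArgumentParser(prog="validator-sketch")
    # action="version" prints the version string and exits with status 0.
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {VERSION}"
    )
    return parser


try:
    build_parser().parse_args(["--version"])
    exit_status = None
except SystemExit as exc:
    exit_status = exc.code
```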

Technical changes
* Minimum Python version raised from 3.7 to 3.8 (required by dependencies)
* Configured Black formatter and formatted the entire code base
* Updated `.gitignore`
* Removed `Dockerfile`

0.8.0

* The [hanging issue](https://github.com/opentargets/platform/issues/1967) was resolved by updating the `pypeln` version.
* Updating `pypeln` unfortunately meant that support for Python 2 had to be dropped. The minimum required version is now Python 3.7.
* Test suite has been updated.

0.7.0

- Removed the `--hash` option (duplicate checking) for compatibility with the new 2.x JSON schema version
- Updated jsonschema dependency to version >=3.0.2
- Added instructions to install with Conda

0.6.0

0.4.0

New release with a few minor changes:
- bugfix for Python 3 when schemas are given as `file://` URLs
- new feature: the validator now exits with a non-zero exit code (2) when there have been any validation failures
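The exit-code behaviour can be illustrated with a small sketch. The function names and the `failure_count` parameter are hypothetical; only the exit code 2 for validation failures comes from the release note above.

```python
import sys


def exit_status(failure_count):
    """Return 2 if any line failed validation, otherwise 0."""
    return 2 if failure_count > 0 else 0


def main(failure_count):
    # In a real CLI the count would come from the validation run.
    sys.exit(exit_status(failure_count))
```

A calling script or CI job can then treat any non-zero status as a failed validation without parsing the log output.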

