Scancode-toolkit

Latest version: v32.1.0

2.9.1
-------------------

This is a minor pre-release of what will come up for 3.0 with no API change.

Licenses:
- There are new and improved licenses and license detection rules 994 991 695 983 998 969

Copyrights:
- Copyright detection has been improved 930 965

Misc:
- Improve support for JavaScript map files: they may contain both debugging
information and whole package source code.
- multiple minor bug fixes

Credits: Many thanks to everyone that contributed to this release with code and bug reports

- haikoschol
- jamesward
- JonoYang
- DennisClark
- swinslow

2.9.0b1
---------------------

This is a major pre-release of what will come up for 3.0.

It brings many new changes, including improved plugins, speed and detection.
These are not yet fully documented, but the release can be used for testing.

API changes:
- Command line API

  - `--diag` option renamed to `--license-diag`

  - `--format <format code>` option has been replaced by multiple options, one
    for each format, such as `--format-csv` and `--format-json`; multiple formats
    can be requested at once

  - new experimental `--cache-dir` option and `SCANCODE_CACHE` environment variable
    and `--temp-dir` option and `SCANCODE_TMP` environment variable to set the cache
    and temp directories.

- JSON data output format: no major changes

- programmatic API in scancode/api.py:

  - get_urls(location, threshold=50): new threshold argument

  - get_emails(location, threshold=50): new threshold argument

  - get_file_infos renamed to get_file_info

  - Resource moved to scancode.resource and significantly updated

  - get_package_infos renamed to get_package_info


Command line
- You can select multiple outputs at once (e.g. JSON and CSV, etc.) 789
- There is a new capability to reload a JSON scan to reprocess it with postscan
plugins and/or convert it to CSV or another format.
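
As a rough illustration of converting a saved JSON scan to CSV, the sketch below flattens a scan document into rows using only the standard library. The `files`, `path` and `type` keys and the `scan_json_to_csv` helper are assumptions made for this example; it is not ScanCode's own converter and does not reproduce its CSV layout.

```python
import csv
import io
import json


def scan_json_to_csv(scan_text, columns=("path", "type")):
    """Return CSV text with one row per file entry of a JSON scan.

    Assumes a top-level "files" list of objects (hypothetical layout);
    keys missing from an entry become empty cells.
    """
    scan = json.loads(scan_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for entry in scan.get("files", []):
        writer.writerow([entry.get(col, "") for col in columns])
    return out.getvalue()


# Tiny made-up scan document used only to exercise the helper.
sample = json.dumps({
    "files": [
        {"path": "src/a.py", "type": "file"},
        {"path": "src", "type": "directory"},
    ]
})
csv_text = scan_json_to_csv(sample)
```

Keeping the column list as a parameter makes it easy to adapt the sketch to whatever attributes a given scan actually contains.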


Licenses:
- There are several new and improved licenses and license detection rules 799 774 589
- Licenses data now contains the full name as well as the short name.

- License matches have a notion of "coverage", which is the number of matched
words compared to the number of words in the matched rule.
- The license cache is no longer checked for consistency once created, which
improves startup times (unless you are using a Git checkout and you are
developing with a SCANCODE_DEV_MODE tag file present).
- License category names have been improved
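
The "coverage" notion described above can be sketched as a simple ratio. Expressing it as a percentage, the rounding, and the `match_coverage` name are assumptions of this example rather than a description of ScanCode's internals.

```python
def match_coverage(matched_words, rule_words):
    """Return the share of a rule's words that were matched, as a percentage.

    Hypothetical helper: both arguments are plain word counts.
    """
    if rule_words <= 0:
        raise ValueError("rule_words must be positive")
    if not 0 <= matched_words <= rule_words:
        raise ValueError("matched_words must be between 0 and rule_words")
    return round(100.0 * matched_words / rule_words, 2)


full = match_coverage(120, 120)     # every word of the rule was matched
partial = match_coverage(90, 120)   # three quarters of the rule was matched
```

Under this reading, a match with low coverage found only part of the rule text, which is a hint that the detection deserves a closer look.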

Copyrights:
- Copyright detection in binary files has been improved
- There are several improvements to the copyright detection quality fixing these
tickets: 795 677 305 795
- There is a new post scan plugin that can be used to ignore certain copyrights in
the results

Summaries:
- Add new support for copyright summaries using smart holder deduplication 930

Misc:
- Add options to limit the number of emails and urls that are collected from
each file (with a default of 50) 384
- When configuring in dev mode, VS Code settings are created
- Archive detection has been improved
- There is a new cache and temporary file configuration with --cache-dir and
--temp-dir CLI options. The --no-cache option has been removed
- Add new --examples option to show usage examples
- Move essential configuration to a scancode_config.py module
- Only read a few pages from PDF files by default
- Improve handling of files with weird characters in their names on all OSes
- Improve detection of archive vs. compressed files
- Make all copyright tests data driven using YAML files like for license tests
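
The per-file limit on collected emails and urls mentioned in this list can be pictured with the minimal sketch below. The regular expression, the `collect_emails` helper and the treatment of a zero threshold as "no limit" are all assumptions for illustration; this is not the ScanCode implementation.

```python
import re

# Deliberately simple pattern for illustration only; it is not ScanCode's.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def collect_emails(text, threshold=50):
    """Return unique emails in order of appearance, at most `threshold` of them.

    A threshold of 0 means "no limit" (an assumption of this sketch).
    """
    found = []
    seen = set()
    for match in EMAIL_RE.finditer(text):
        email = match.group(0).lower()
        if email in seen:
            continue
        seen.add(email)
        found.append(email)
        if threshold and len(found) >= threshold:
            break
    return found


text = "a@example.com b@example.com a@example.com c@example.com"
limited = collect_emails(text, threshold=2)
unlimited = collect_emails(text, threshold=0)
```

Stopping as soon as the limit is reached keeps the cost bounded on files that contain very long lists of addresses.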


Plugins
- Prescan plugins can now exclude files from the scans
- Plugins can now contribute arbitrary command line options 787 and 748
- there is a new plugin stage called output_filter to optionally filter a scan before output.
One example is to keep "only findings" 787
- The core processing is centered now on a Codebase and Resource abstraction
that represents the scanned filesystem in memory 717 736
All plugins operate on this abstraction
- All scanners are also plugins 698 and now everything is a plugin including the scans
- The interface for output plugins is the same as other plugins 715
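
To make the output_filter stage more concrete, here is a minimal sketch of an "only findings" filter over an in-memory codebase. The `Resource` and `Codebase` classes below are simplified, hypothetical stand-ins and not the classes from scancode.resource; the attribute names are assumptions as well.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for one scanned file or directory."""
    path: str
    licenses: list = field(default_factory=list)
    copyrights: list = field(default_factory=list)

    def has_findings(self):
        return bool(self.licenses or self.copyrights)


@dataclass
class Codebase:
    """Hypothetical stand-in for the scanned filesystem held in memory."""
    resources: list = field(default_factory=list)


def only_findings(codebase):
    """Output filter: keep only the resources that carry some finding."""
    return [res for res in codebase.resources if res.has_findings()]


codebase = Codebase([
    Resource("README", licenses=["mit"]),
    Resource("empty.txt"),
    Resource("NOTICE", copyrights=["Copyright (c) Example Corp."]),
])
kept_paths = [res.path for res in only_findings(codebase)]
```

Because the filter only reads the shared abstraction, it can run after any combination of scanners without knowing which ones produced the findings.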


Credits: Many thanks to everyone that contributed to this release with code and bug reports
(and this list is likely missing some)

- SaravananOffl
- jpopelka
- yashdsaraf
- haikoschol
- jdaguil
- ajeans
- DennisClark
- susg
- pombredanne
- mjherzog
- Sidsharik
- nishakm
- yasharmaster
- techytushar
- JonoYang
- majurg
- aviral1701
- chinyeungli
- vivonk
- Chaitya62
- inishchith

2.2.1
-------------------

This is a minor release with several bug fixes, one new feature
and one (minor) API change.

API change:
~~~~~~~~~~~

- Licenses data now contains a new reference_url attribute instead of a
dejacode_url attribute. This defaults to the public DejaCode URL and
can be configured with the new --license-url-template command line
option.
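
A URL template of this kind can be pictured as a string with a placeholder that is filled in with the license key. The `{}` placeholder convention, the example.com URL and the `build_reference_url` helper are assumptions for illustration only and may not match what the option actually expects.

```python
def build_reference_url(template, license_key):
    """Fill a URL template with a license key (hypothetical helper)."""
    return template.format(license_key)


# Placeholder template and key, chosen only for this example.
url = build_reference_url("https://example.com/licenses/{}", "apache-2.0")
```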

New feature:
~~~~~~~~~~~~~~~

- There is a new "--format jsonlines" output format option.
In this format, each line in the output is a valid JSON document. The
first line contains a "header" object with header-level data such as
notice, version, etc. Each line after the first contains the scan
results for a single file formatted with the same structure as a
whole scan results JSON document but without any header-level
attributes. See also http://jsonlines.org/
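
A consumer of this format can read the header from the first line and then handle one file per line. The sketch below does that with the standard library; the `header`, `version`, `files` and `path` key names in the sample data are assumptions made for the example, as is the `read_jsonlines_scan` helper.

```python
import json


def read_jsonlines_scan(text):
    """Split JSON Lines scan output into (header, per_file_records)."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty JSON Lines document")
    header = json.loads(lines[0])
    records = [json.loads(line) for line in lines[1:]]
    return header, records


# Made-up sample: one header line followed by one line per scanned file.
sample = "\n".join([
    json.dumps({"header": {"version": "2.2.1"}}),
    json.dumps({"files": [{"path": "a.c"}]}),
    json.dumps({"files": [{"path": "b.c"}]}),
])
header, records = read_jsonlines_scan(sample)
paths = [entry["path"] for rec in records for entry in rec["files"]]
```

Since every line is a complete JSON document, a large scan can also be processed line by line from a file handle instead of being loaded in one piece.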

Other changes:
~~~~~~~~~~~~~~~

- Several new and improved license detection rules have been added.
The logic of detection has been refined to handle some rare corner
cases. The underscore character "_" is treated as part of a license
word and the handling of negative and false_positive license rules
has been simplified.

- Several issues with codebases that have non-ASCII or
non-UTF-decodable file paths, and other filesystem encoding-related
bugs, have been fixed.

- Several copyright detection bugs have been fixed.
- PHP Composer and RPM packages are now detected with --package
- Several other package types are now detected with --package even
though only a few attributes may be returned for now until full parsers
are added.
- Several NPM package parsing bugs have been fixed.
- There are some minor performance improvements when scanning some
large files for licenses.

2.1.0
-------------------

This is a minor release with several new and improved features and bug
fixes but no significant API changes.

- New plugin architecture by yashdsaraf

  - we can now have pre-scan, post-scan and output format plugins
  - there is a new CSV output format and some example, experimental plugins
  - the CLI UI has changed to better support these plugins

- New and improved licenses and license detection rules including
  support for EPL-2.0 and OpenJDK-related licensing and synchronization
  with the latest SPDX license list

- Multiple bug fixes such as:

  - Ensure that authors are reported even if there is no copyright 669
  - Fix Maven package POM parsing infinite loop 721
  - Improve handling of weird non-unicode byte paths 688 and 706
  - Improve PDF parsing to avoid some crashes 723

Credits: Many thanks to everyone that contributed to this release with code and bug reports
(and this list is likely missing some)

* abuhman
* chinyeungli
* jimjag
* JonoYang
* jpopelka
* majurg
* mjherzog
* pgier
* pkajaba
* pombredanne
* scottctr
* sschuberth
* yahalom5776
* yashdsaraf

2.0.1
-------------------

This is a minor release with minor new and improved features and bug
fixes.

- New and improved license detection, including refined match scoring
for 534
- Bug fixed in License detection leading to a very long scan time for some
rare JavaScript files. Reported by jarnugirdhar
- New "base_name" attribute returned with file information. Reported by
chinyeungli
- Bug fixed in Maven POM package detection. Reported by kalagp

2.0.0
-------------------

This is a major release with several new and improved features and bug
fixes.

Some of the key highlights include:

License detection:
~~~~~~~~~~~~~~~~~~~

- Brand new, faster and more accurate detection engine using multiple
techniques, eventually doing multiple exhaustive comparisons of
a scanned file content against all the license and rule texts.

- Several new licenses and over 2500 new and improved license
detection rules have been added, making the detection significantly
better (and, weirdly enough, faster too as a side effect of the new
detection engine)

- the matched license text can be optionally returned with the
`--license-text` option

- The detection accuracy has been benchmarked against other detection
engines and ScanCode has been shown to be more accurate and
comprehensive than all the other engines reviewed.

- improved scoring of license matches


Package and dependencies:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- new and improved detection of multiple package formats: NPM, Maven,
NuGet, PHP Composer, Python Pypi and RPM. In most cases direct,
declared dependencies are also reported.

- several additional package formats will be reported in future
versions.

- note: the structure of Packages data is evolving and should not be
considered API at this stage


Scan outputs:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- New SPDX tag/values and RDF outputs.

- new compact JSON format (the pretty printed format is still
available with the `json-pp` format).
The JSON format has been changed significantly and is closer to a
documented, standard format that we call the ABC data format.

- Minor refinements on the html and html-app format. Note that the
html-app format will be deprecated and replaced by the new AboutCode
Manager desktop app (electron-based) in future versions.


- Copyright: Improved copyright detection: several false positives are
no longer returned and copyrights are more accurate


- Archive: support for shallow extraction and support for new archive
types (such as Spring boot shell archives)


Performance:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Everything is generally faster, and license detection performance
has been significantly improved.

- Scans can run on multiple processes in parallel with the new
`--processes` option, speeding things up even further. A scan of a
full Debian pool of source packages was reported to complete in about
11 hours (on a rather beefy 144-core, 256GB machine)

- Reduced memory usage with the use of caching

Other notes:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- This is the last release with 32-bit Linux architecture support
- The scan of a file can be interrupted after a timeout with a 120
seconds default
- ScanCode is now available on the PyPI Python package index for use
as a library. The documentation for the library usage
will follow in future versions
- New `--ignore` option: You can optionally ignore certain files and
paths during a scan
- New `--diag` option: display additional debug and diagnostic data
- The scanned file paths can now be reported as relative, rooted or
absolute using new command line options; the default is a rooted
path.
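
The idea of ignoring certain files and paths during a scan can be sketched with glob patterns from the standard library. The `is_ignored` helper and its matching rules (a pattern may match the whole path or any single path segment) are assumptions for illustration; the actual `--ignore` semantics may differ.

```python
import fnmatch
from pathlib import PurePosixPath


def is_ignored(path, patterns):
    """Return True if the path, or any of its segments, matches a glob pattern."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if fnmatch.fnmatchcase(path, pattern):
            return True
        if any(fnmatch.fnmatchcase(part, pattern) for part in parts):
            return True
    return False


# Made-up paths and patterns used only to exercise the helper.
paths = ["src/main.py", "src/main.pyc", ".git/config", "docs/index.rst"]
kept = [p for p in paths if not is_ignored(p, ["*.pyc", ".git"])]
```

Matching on individual segments is what lets a single pattern such as `.git` exclude a whole directory tree in this sketch.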


Thank you to all contributors to this release and the 200+ stars
and 60+ forks on GitHub!

Credits in alphabetical order:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Alexander Lisianoi
Avi Aryan
Benedikt Spranger
Chin Yeung
Dennis Clark
Hugo Jacob
Jakub Wilk
Jericho attritionorg
Jillian Daguil
Jiri Popelka
John M. Horan
Jonathan "Jono" Yang
Li Ha
Michael Herzog
Michael Rupprecht
Nusrat Sultana
Paul Kunz
Philippe Ombredanne
Rakesh Balusa
Ranvir Singh
Richard Fontana
Sebastian Schuberth
Steven Esser
Thomas Gleixner
Tisoga forrestchang
Yash D. Saraf
Yash Sharma
