Csvkit

Latest version: v2.0.0


2.0.0
-----

This is the first major release since December 27, 2016. Thank you to all :ref:`contributors<authors>`, including 44 new contributors since 1.0.0!

Want to use csvkit programmatically? Check out `agate <https://agate.readthedocs.io/en/latest/>`__, used internally by csvkit.

**BACKWARDS-INCOMPATIBLE CHANGES:**

- :doc:`/scripts/csvclean` now writes its output to standard output and its errors to standard error, instead of to ``basename_out.csv`` and ``basename_err.csv`` files. Consequently:

  - The :code:`--dry-run` option is removed. The :code:`--dry-run` option changed error output from the CSV format used in ``basename_err.csv`` files to a prosaic format like ``Line 1: Expected 2 columns, found 3 columns``.
  - Summary information like ``No errors.``, ``42 errors logged to basename_err.csv`` and ``42 rows were joined/reduced to 24 rows after eliminating expected internal line breaks.`` is not written.

- :doc:`/scripts/csvclean` no longer reports or fixes errors by default; it errors if no checks or fixes are enabled. Opt in to the original behavior using the :code:`--length-mismatch` and :code:`--join-short-rows` options. See new options below.
- :doc:`/scripts/csvclean` no longer omits rows with errors from the output. Opt in to the original behavior using the :code:`--omit-error-rows` option.
- :doc:`/scripts/csvclean` joins short rows using a newline by default, instead of a space. Restore the original behavior using the :code:`--separator " "` option.

In brief, to restore the original behavior for :doc:`/scripts/csvclean`:

.. code-block:: bash

   csvclean --length-mismatch --omit-error-rows --join-short-rows --separator " " myfile.csv
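
The split between standard output and standard error can be pictured with a short Python sketch. It is an illustration only, not csvkit's implementation: ``check_lengths`` is a hypothetical helper, the sample data is made up, and the numbering of data rows from 1 and the wording of the error message are assumptions.

.. code-block:: python

   import csv
   import io
   import sys


   def check_lengths(text, omit_error_rows=False):
       """Return (rows, errors), flagging rows whose length differs from the header.

       Hypothetical helper that imitates the documented behavior of
       --length-mismatch and --omit-error-rows; it is not csvkit's code.
       """
       reader = csv.reader(io.StringIO(text))
       header = next(reader)
       rows, errors = [header], []
       # Assumption: data rows are numbered from 1, excluding the header row.
       for line_number, row in enumerate(reader, start=1):
           if len(row) != len(header):
               message = f"Expected {len(header)} columns, found {len(row)} columns"
               errors.append((line_number, message))
               if omit_error_rows:
                   continue  # keep the bad row out of the main output
           rows.append(row)
       return rows, errors


   rows, errors = check_lengths("a,b\n1,2\n3,4,5\n", omit_error_rows=True)
   csv.writer(sys.stdout).writerows(rows)    # data rows go to standard output
   csv.writer(sys.stderr).writerows(errors)  # error records go to standard error

Because the two streams are separate, the real command's output and errors can be redirected to different files with ordinary shell redirection.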

Other changes:

- feat: :doc:`/scripts/csvclean` adds the options:

  - :code:`--length-mismatch`, to error on data rows that are shorter or longer than the header row
  - :code:`--empty-columns`, to error on empty columns
  - :code:`--enable-all-checks`, to enable all error reporting
  - :code:`--omit-error-rows`, to omit data rows that contain errors, from standard output
  - :code:`--label LABEL`, to add a "label" column to standard error
  - :code:`--header-normalize-space`, to strip leading and trailing whitespace and replace sequences of whitespace characters by a single space in the header
  - :code:`--join-short-rows`, to merge short rows into a single row
  - :code:`--separator SEPARATOR`, to change the string with which to join short rows (default is newline)
  - :code:`--fill-short-rows`, to fill short rows with the missing cells
  - :code:`--fillvalue FILLVALUE`, to change the value with which to fill short rows (default is none)

- feat: The :code:`--quoting` option accepts 4 (`csv.QUOTE_STRINGS <https://docs.python.org/3/library/csv.html#csv.QUOTE_STRINGS>`__) and 5 (`csv.QUOTE_NOTNULL <https://docs.python.org/3/library/csv.html#csv.QUOTE_NOTNULL>`__) on Python 3.12.
- feat: :doc:`/scripts/csvformat`: The :code:`--out-quoting` option accepts 4 (`csv.QUOTE_STRINGS <https://docs.python.org/3/library/csv.html#csv.QUOTE_STRINGS>`__) and 5 (`csv.QUOTE_NOTNULL <https://docs.python.org/3/library/csv.html#csv.QUOTE_NOTNULL>`__) on Python 3.12.
- fix: :doc:`/scripts/csvformat`: The :code:`--out-quoting` option works with 2 (`csv.QUOTE_NONNUMERIC <https://docs.python.org/3/library/csv.html#csv.QUOTE_NONNUMERIC>`__). Use the :code:`--locale` option to set the locale of any formatted numbers.
- fix: :doc:`/scripts/csvclean`: The :code:`--join-short-rows` option no longer reports length mismatch errors that were fixed.
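
The numbers accepted by the quoting options correspond to constants in Python's standard ``csv`` module. The sketch below uses only the standard library, not csvkit itself, to show what mode 2 does; the sample row is made up, and the two newer constants are looked up defensively because they exist only on Python 3.12 and later.

.. code-block:: python

   import csv
   import io

   # Write one row with QUOTE_NONNUMERIC: strings are quoted, numbers are not.
   buffer = io.StringIO()
   writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
   writer.writerow(["a", 1, 2.5])
   quoted = buffer.getvalue()

   # Collect the integer values behind the constant names.
   modes = {"QUOTE_NONNUMERIC": csv.QUOTE_NONNUMERIC}
   for name in ("QUOTE_STRINGS", "QUOTE_NOTNULL"):
       if hasattr(csv, name):  # only present on Python 3.12+
           modes[name] = getattr(csv, name)

   print(repr(quoted))
   print(modes)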

1.5.0
-----

- feat: Add support for Zstandard files with the ``.zst`` extension, if the ``zstandard`` package is installed.
- feat: :doc:`/scripts/csvformat` adds a :code:`--out-asv` (:code:`-A`) option to use the ASCII unit separator and record separator.
- feat: :doc:`/scripts/csvsort` adds a :code:`--ignore-case` (:code:`-i`) option to perform case-independent sorting.
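
For readers unfamiliar with ASCII-separated values, the following sketch shows the idea using Python's standard ``csv`` module rather than csvkit: fields are delimited by the unit separator (``0x1F``) and records end with the record separator (``0x1E``). The variable names and sample rows are illustrative, and whether :code:`--out-asv` produces byte-identical output is an assumption.

.. code-block:: python

   import csv
   import io

   UNIT_SEPARATOR = "\x1f"    # ASCII 31, placed between fields
   RECORD_SEPARATOR = "\x1e"  # ASCII 30, placed after each record

   buffer = io.StringIO()
   writer = csv.writer(
       buffer,
       delimiter=UNIT_SEPARATOR,
       lineterminator=RECORD_SEPARATOR,
   )
   writer.writerows([["id", "name"], ["1", "Ada"]])
   asv = buffer.getvalue()

   print(repr(asv))

Since these control characters rarely appear in real data, fields containing commas or newlines need no quoting in this format.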

1.4.0
-----

- feat: :doc:`/scripts/csvpy` adds the options:

  - :code:`--no-number-ellipsis`, to disable the ellipsis (``…``) if max precision is exceeded, for example, when using ``table.print_table()``
  - :code:`--sniff-limit`
  - :code:`--no-inference`

- feat: :doc:`/scripts/csvpy` removes the :code:`--linenumbers` and :code:`--zero` output options, which had no effect.
- feat: :doc:`/scripts/in2csv` adds a :code:`--reset-dimensions` option to `recalculate <https://openpyxl.readthedocs.io/en/stable/optimized.html#worksheet-dimensions>`_ the dimensions of an XLSX file, instead of trusting the file's metadata. csvkit's dependency `agate-excel <https://agate-excel.readthedocs.io/en/latest/>`_ 0.4.0 automatically recalculates the dimensions if the file's metadata expresses dimensions of "A1:A1" (a single cell).
- fix: :doc:`/scripts/csvlook` only reads up to :code:`--max-rows` rows instead of the entire file.
- fix: :doc:`/scripts/csvpy` supports the existing input options:

  - :code:`--locale`
  - :code:`--blanks`
  - :code:`--null-value`
  - :code:`--date-format`
  - :code:`--datetime-format`
  - :code:`--skip-lines`

- fix: :doc:`/scripts/csvpy`: :code:`--maxfieldsize` no longer errors when :code:`--dict` is set.
- fix: :doc:`/scripts/csvstack`: :code:`--maxfieldsize` no longer errors when :code:`--no-header-row` isn't set.
- fix: :doc:`/scripts/in2csv`: :code:`--write-sheets` no longer errors when standard input is an XLS or XLSX file.
- Update minimum agate version to 1.6.3.

1.3.0
-----

- :doc:`/scripts/csvformat` adds a :code:`--skip-header` (:code:`-E`) option to not output a header row.
- :doc:`/scripts/csvlook` adds a :code:`--max-precision` option to set the maximum number of decimal places to display.
- :doc:`/scripts/csvlook` adds a :code:`--no-number-ellipsis` option to disable the ellipsis (``…``) if :code:`--max-precision` is exceeded. (Requires agate 1.9.0 or greater.)
- :doc:`/scripts/csvstat` supports the :code:`--no-inference` (:code:`-I`), :code:`--locale` (:code:`-L`), :code:`--blanks`, :code:`--date-format` and :code:`--datetime-format` options.
- :doc:`/scripts/csvstat` reports a "Non-null values" statistic (or a :code:`nonnulls` column when :code:`--csv` is set).
- :doc:`/scripts/csvstat` adds a :code:`--non-nulls` option to only output counts of non-null values.
- :doc:`/scripts/csvstat` reports a "Most decimal places" statistic (or a :code:`maxprecision` column when :code:`--csv` is set).
- :doc:`/scripts/csvstat` adds a :code:`--max-precision` option to only output the most decimal places.
- :doc:`/scripts/csvstat` adds a :code:`--json` option to output results as JSON text.
- :doc:`/scripts/csvstat` adds an :code:`--indent` option to indent the JSON text when :code:`--json` is set.
- :doc:`/scripts/in2csv` adds a :code:`--use-sheet-names` option to use the sheet names as file names when :code:`--write-sheets` is set.
- feat: Add a :code:`--null-value` option to commands with the :code:`--blanks` option, to convert additional values to NULL.
- fix: Reconfigure the encoding of standard input according to the :code:`--encoding` option, which defaults to ``utf-8-sig``. Affected users no longer need to set the ``PYTHONIOENCODING`` environment variable.
- fix: Prompt the user if additional input is expected (i.e. if no input file or piped data is provided) in :doc:`/scripts/csvjoin`, :doc:`/scripts/csvsql` and :doc:`/scripts/csvstack`.
- fix: No longer errors if a NUL byte occurs in an input file.
- Add Python 3.12 support.
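
The ``utf-8-sig`` default mentioned above matters for files that begin with a byte order mark (BOM). A minimal sketch using only Python's built-in codecs, with made-up sample bytes, shows the difference from plain ``utf-8``:

.. code-block:: python

   # A CSV payload that starts with the UTF-8 byte order mark (EF BB BF).
   raw = b"\xef\xbb\xbfid,name\n1,Ada\n"

   with_bom = raw.decode("utf-8")         # BOM survives as U+FEFF in the first header
   without_bom = raw.decode("utf-8-sig")  # BOM is stripped during decoding

   print(repr(with_bom.splitlines()[0]))
   print(repr(without_bom.splitlines()[0]))

   # Input without a BOM decodes identically under both codecs.
   plain = b"id,name\n".decode("utf-8-sig")

With plain ``utf-8``, the stray ``U+FEFF`` would become part of the first column name, which is the kind of problem the new default avoids.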

1.2.0
-----

- fix: :doc:`/scripts/csvjoin` uses the correct columns when performing a :code:`--right` join.
- Add SQLAlchemy 2 support.
- Drop Python 3.7 support (end-of-life was June 5, 2023).

1.1.1
-----

- feat: :doc:`/scripts/csvstack` handles files with columns in different orders or with different names.
