Sqlflow

Latest version: v0.15.0

0.4.2

Major Features and Improvements
- Add three pre-made Runnables: extract_ts_features (extracts time series features using tsfresh), binning, and psi
- Model Meta Design: get the model metadata (such as the Docker image name for TO TRAIN, the model type, and so on) when generating prediction workflow step code
- Distinguish XGBoost models when generating prediction workflow code
- Support configuring HTTPS for JupyterHub
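
For readers unfamiliar with the term, PSI usually stands for the population stability index, which compares how a numeric feature is distributed in two samples. The sketch below shows the textbook formula only; it is not taken from the SQLFlow source. The function name `psi`, its parameters, and the equal-width binning are all assumptions made for illustration.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population stability index between two non-empty numeric samples.

    Hypothetical helper for illustration; not the SQLFlow `psi` Runnable.
    """
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the range of the expected sample.
    width = (hi - lo) / bins or 1.0

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the expected range fall into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = fractions(expected)
    a = fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))
```

Identical samples give a PSI of zero, and the value grows as the two distributions drift apart.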

Refactorization
- Implement the end-to-end workflow of XGBoost prediction and evaluation
- Implement predict and explain in Alisa submitter at runtime
- Unify the API of local and PAI submitter
- Simplify HDFS parameters

Bug Fixes
- Fix importing the Titanic MaxCompute dataset when the FLOAT data type is not enabled.
- Fix generating the Couler evaluate step in workflow mode.
- Fix a paiio table-reading bug when running TO EXPLAIN on PAI.
- Fix an XGBoost data compatibility issue: support various CSV formats such as `a,b,c,` and `a, b, c,`, as well as strings containing `/`.
- Fix explain issue when SHAP values are not listed
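
The CSV compatibility fix above concerns feature cells that differ only in spacing or trailing separators. A minimal sketch of that kind of tolerant parsing follows. The helper name `parse_csv_floats` is hypothetical, the code is not SQLFlow's, and it does not try to cover strings containing `/`.

```python
def parse_csv_floats(cell, sep=","):
    """Parse a CSV-formatted cell such as '1,2,3' or '1, 2, 3,' into floats.

    Hypothetical helper for illustration; not part of the SQLFlow API.
    """
    # Strip padding around each token and skip empty tokens, which appear
    # when the cell ends with a trailing separator.
    tokens = (tok.strip() for tok in cell.split(sep))
    return [float(tok) for tok in tokens if tok]
```

With this approach, `"1,2,3"`, `"1, 2, 3"`, and `"1, 2, 3,"` all parse to the same list of three floats.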

0.4.1

Major Features and Improvements

- The model zoo can be used in the playground now.
- The CLI supports downloading models from the model zoo to the local machine.
- Support the GCN model in the official models repo.
- CI has been moved to GitHub Actions. Travis CI was disabled.
- The `TO RUN` syntax can use the file name instead of the absolute path.
- Non-linear optimization problems are supported by the BARON solver.
- The `CONSTRAINT` clause is now optional in the `TO MAXIMIZE|MINIMIZE` statement.

Refactorization

- End-to-end local XGBoost training can now run in workflow mode.
- Unify the DBMS APIs through the `Connection` and `ResultSet` interfaces on the Python side.

Bug Fixes

- Fix the bug that XGBoost training cannot have more than 255 feature columns.
- Fix the bug that the TiDB parser cannot parse the `LAG` function.

0.4.0

Major Features and Improvements

- The parser can remove all comments now.
- Support linear programming using `pyomo` and `optflow`.
- Add Model zoo default model definitions in image `sqlflow/sqlflow`.
- Support custom train loop, predict sample, evaluation loop in custom models.
- Move CI jobs from Travis to GitHub Actions, using a pre-setup environment to speed up builds and tests.
- Add [SQLFlow Playground](https://github.com/sql-machine-learning/playground) where users can get a quick experience of SQLFlow.

Refactorization

- WIP: refactoring `sqlflow_submitter` to `runtime`. The `runtime` library supports feature derivation, statement verification, and job submitters for various platforms; it executes the workflow step and then saves the model into the database.
- Remove `is_pai` conditions in the `runtime.tensorflow` package and move the corresponding code that runs on PAI to `runtime.pai`.

Bug Fixes

- Fix the size calculation in `fillCSVFieldDesc` always being 0 in feature derivation.

0.3.0rc.1

Major Features and Improvements

- Support `TO EVALUATE` clause to evaluate a model.
- Add the SQLFlow model zoo, which supports publicly sharing model definitions and models.
- Support mathematical programming using SQL.
- Support feature column in the XGBoost model, including training, evaluating, prediction, and explaining.
- Support incremental training for both TensorFlow and XGBoost models.
- Add logs to record runtime status.
- The command-line tool supports releasing and removing models and repos.
- Support the `SHOW TRAIN` statement to get the original SQL.
- Create the [SQLFlow Playground](https://github.com/sql-machine-learning/playground) as a quick-start environment.

Improvements

- Improve the user experience in workflow mode, including a better workflow log structure and returning selected rows and diagnostic messages to the GUI system.
- Improve some diagnostic messages in workflow mode.
- Support passing all the selected columns into the prediction result table.
- Decompose the all-in-one Docker image into separated Docker images.

0.2.0rc.1

Major Features and Improvements

1. Support parsing SQL programs and arbitrary SELECT statements in the extended syntax. https://github.com/sql-machine-learning/sqlflow/issues/1126
1. Support feature derivation. https://github.com/sql-machine-learning/sqlflow/issues/705
1. Support a highly available SQLFlow server by submitting SQL programs to Kubernetes clusters as workflows. https://github.com/sql-machine-learning/sqlflow/issues/1066
1. Enhanced REPL functionality.
1. Support more training configurations:
   1. Support configuring optimizers for TensorFlow Estimator models.
   1. Support configuring optimizers and losses for custom Keras models.
   1. Support configuring metrics for training TensorFlow Estimator models and Keras models.
1. Support explaining TensorFlow BoostedTrees models.
1. Support writing EXPLAIN results to a table.

Breaking changes:

1. The syntax extension changes from appending TRAIN/PREDICT/ANALYZE to appending TO TRAIN/PREDICT/EXPLAIN. https://github.com/sql-machine-learning/sqlflow/issues/998
1. Remove the ALPS and ElasticDL code generators to adapt to the current intermediate representation implementation.

0.1.0rc.1

SQLFlow release v0.1.0-rc.1 is the first release candidate of SQLFlow.

The current version includes the following features:

- Database Support
  - MySQL
  - Hive: [gohive](https://github.com/sql-machine-learning/gohive)
  - MaxCompute: [gomaxcompute](https://github.com/sql-machine-learning/gomaxcompute)
- Machine Learning Systems and Models Support
  - TensorFlow pre-made estimators.
  - Custom Keras Model: [contribute_models.md](https://github.com/sql-machine-learning/models/blob/develop/doc/contribute_models.md)
  - XGBoost models: https://github.com/sql-machine-learning/sqlflow/pull/765
- Feature Columns Supported When Using TensorFlow or Keras Models:
  - numeric_column
  - bucket_column
  - cross_column
  - category_id_column
  - sequence_category_id_column
- Column Data Type Support:
  - FLOAT/INT/BIGINT
  - VARCHAR/TEXT
  - CSV formatted DENSE Tensor
  - CSV formatted SPARSE Tensor
- Support Standalone Deployment and Session support: https://github.com/sql-machine-learning/sqlflow/issues/531
- Deploy on Kubernetes Cluster: https://github.com/sql-machine-learning/sqlflow/pull/537
- Unsupervised Training with Clustering Model: https://github.com/sql-machine-learning/sqlflow/pull/737
- Analyze the Machine Learning Model: [analyzer_design.md](https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/analyzer_design.md)