Kernel-tuner

Latest version: v1.0

0.1.8

Changed
- bugfix for runs that use fewer than 3 iterations
- the install procedure now uses extras, e.g. [cuda,opencl]
- option quiet makes tune_kernel completely quiet
- extensive updates to documentation

Added
- type checking for kernel arguments and answers lists
- checks for reserved keywords in tunable parameters
- checks for whether thread block dimensions are specified
- printing units for measured time with CUDA and OpenCL
- option to print all measured execution times
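
The argument and parameter checks listed above could look roughly like the following. This is a minimal sketch, not Kernel Tuner's actual implementation: the function names `check_tune_params` and `has_block_size`, the contents of `RESERVED_NAMES`, and the default block size names are all assumptions made for illustration.

```python
# Hypothetical sketch of validating a dict of tunable parameters.
# RESERVED_NAMES is an assumed example list, not the library's real one.
RESERVED_NAMES = {"grid_size_x", "grid_size_y", "grid_size_z", "time"}


def check_tune_params(tune_params):
    """Raise ValueError if a parameter name is reserved or its values are malformed."""
    for name, values in tune_params.items():
        if not isinstance(name, str):
            raise ValueError(f"parameter name {name!r} must be a string")
        if name in RESERVED_NAMES:
            raise ValueError(f"parameter name {name!r} is a reserved keyword")
        if not isinstance(values, (list, tuple)) or len(values) == 0:
            raise ValueError(f"parameter {name!r} needs a non-empty list of values")


def has_block_size(tune_params,
                   block_size_names=("block_size_x", "block_size_y", "block_size_z")):
    """Return True if at least one thread block dimension is among the parameters."""
    return any(name in tune_params for name in block_size_names)
```

Running such checks before any kernel is compiled means a misnamed parameter fails immediately with a readable message instead of producing a confusing compiler error later.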

0.1.7

Changed
- bugfix for install when scipy is not present
- bugfix for GPU cleanup when using Noodles runner
- reworked the way strings are handled internally

Added
- option to set compiler name, when using C backend

0.1.6

Changed
- actively freeing GPU memory after tuning
- bugfix for 3D grids when using OpenCL

Added
- support for dynamic parallelism when using PyCUDA
- option to use differential evolution optimization
- global optimization strategies basinhopping, minimize

0.1.5

Changed
- option to pass a fraction to the sample runner
- fixed a bug in memset for OpenCL backend

Added
- parallel tuning on single node using Noodles runner
- option to pass new defaults for block dimensions
- option to pass a Python function as code generator
- option to pass custom function for output verification
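
As an illustration of the custom output verification option, a user-supplied check might compare each kernel output against a reference within an absolute tolerance. The sketch below is an assumption about how such a function could be written: the name `verify_outputs`, its signature, and the use of plain Python lists are hypothetical, so consult the library documentation for the exact interface it expects.

```python
import math


def verify_outputs(reference, results, atol=1e-6):
    """Return True if every checked output matches its reference within atol.

    reference: list with one entry per kernel argument; None means "do not check".
    results:   list of output sequences, in the same order as reference.
    """
    for expected, actual in zip(reference, results):
        if expected is None:
            continue  # this argument is not verified
        if len(expected) != len(actual):
            return False
        for e, a in zip(expected, actual):
            # Absolute tolerance only, so values near zero compare sensibly.
            if not math.isclose(e, a, rel_tol=0.0, abs_tol=atol):
                return False
    return True
```

A custom function like this is useful when the default comparison is too strict or too loose, for example when a kernel accumulates floating-point error that depends on the thread block size.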

0.1.4

Changed
- device and kernel name are printed by runner
- tune_kernel also returns a dict with environment info
- using different timer in C vector add example

0.1.3

Changed
- changed how scalar arguments are handled internally

Added
- separate install and contribution guides
