Pytorch-metric-learning

Latest version: v2.5.0


1.0.0

Reference embeddings for tuple losses
You can separate the source of anchors from the source of positives/negatives. In the example below, anchors will be selected from embeddings, and positives/negatives will be selected from ref_emb.

```python
from pytorch_metric_learning.losses import TripletMarginLoss

loss_fn = TripletMarginLoss()
loss = loss_fn(embeddings, labels, ref_emb=ref_emb, ref_labels=ref_labels)
```
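
To make the separation concrete, here is a minimal pure-Python sketch of how triplet indices could be formed when anchors and positives/negatives come from different sets. The helper `triplets_with_reference` is hypothetical and written only for this note; it is not the library's implementation.

```python
def triplets_with_reference(labels, ref_labels):
    """Return (anchor, positive, negative) index triplets.

    Anchor indices point into `labels` (the labels of `embeddings`);
    positive and negative indices point into `ref_labels` (the labels
    of `ref_emb`). Hypothetical helper, for illustration only.
    """
    triplets = []
    for a, anchor_label in enumerate(labels):
        positives = [i for i, lab in enumerate(ref_labels) if lab == anchor_label]
        negatives = [i for i, lab in enumerate(ref_labels) if lab != anchor_label]
        for p in positives:
            for n in negatives:
                triplets.append((a, p, n))
    return triplets


labels = [0, 1]         # labels of the anchor embeddings
ref_labels = [0, 0, 1]  # labels of the reference embeddings
triplets = triplets_with_reference(labels, ref_labels)
```

Each triplet's first index refers to the anchor set, while the second and third refer to the reference set, so the two sets may have different sizes.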


Efficient mode for DistributedLossWrapper
- efficient=True: each process uses its own embeddings for anchors, and the gathered embeddings for positives/negatives. Gradients will **not** be equal to those in non-distributed code, but the benefit is reduced memory and faster training.
- efficient=False: each process uses gathered embeddings for both anchors and positives/negatives. Gradients will be equal to those in non-distributed code, but at the cost of doing unnecessary operations (i.e. doing computations where both anchors and positives/negatives have no gradient).

The default is False. You can set it to True like this:

```python
from pytorch_metric_learning import losses
from pytorch_metric_learning.utils import distributed as pml_dist

loss_func = losses.ContrastiveLoss()
loss_func = pml_dist.DistributedLossWrapper(loss_func, efficient=True)
```

Documentation: https://kevinmusgrave.github.io/pytorch-metric-learning/distributed/
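
As a rough illustration of the trade-off, the sketch below counts the anchor/reference pairs each process would compare in the two modes. It assumes every process holds the same batch size and that cost scales with the number of pairs; the function `pairs_per_process` is hypothetical and not part of the library.

```python
def pairs_per_process(batch_size, world_size, efficient):
    """Count anchor/reference pairs compared by a single process.

    Assumption: every process contributes `batch_size` embeddings.
    """
    gathered = batch_size * world_size  # embeddings gathered from all processes
    # efficient=True: only this process's own embeddings act as anchors.
    # efficient=False: all gathered embeddings act as anchors.
    num_anchors = batch_size if efficient else gathered
    return num_anchors * gathered


pairs_efficient = pairs_per_process(batch_size=32, world_size=4, efficient=True)
pairs_full = pairs_per_process(batch_size=32, world_size=4, efficient=False)
```

Under these assumptions the non-efficient mode compares `world_size` times as many pairs per process.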

Customizing k-nearest-neighbors for AccuracyCalculator
You can use a different type of faiss index:
```python
import faiss
from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator
from pytorch_metric_learning.utils.inference import FaissKNN

knn_func = FaissKNN(index_init_fn=faiss.IndexFlatIP, gpus=[0, 1, 2])
ac = AccuracyCalculator(knn_func=knn_func)
```


You can also use a custom distance function:
```python
from pytorch_metric_learning.distances import SNRDistance
from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator
from pytorch_metric_learning.utils.inference import CustomKNN

knn_func = CustomKNN(SNRDistance())
ac = AccuracyCalculator(knn_func=knn_func)
```


Relevant docs:
- [Accuracy Calculation](https://kevinmusgrave.github.io/pytorch-metric-learning/accuracy_calculation/)
- [FaissKNN](https://kevinmusgrave.github.io/pytorch-metric-learning/inference_models/#faissknn)
- [CustomKNN](https://kevinmusgrave.github.io/pytorch-metric-learning/inference_models/#customknn)


Issues resolved
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/204
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/251
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/256
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/292
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/330
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/337
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/345
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/347
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/349
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/353
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/359
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/361
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/362
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/363
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/368
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/376
https://github.com/KevinMusgrave/pytorch-metric-learning/issues/380

Contributors
Thanks to yutanakamura-tky and KinglittleQ for pull requests, and mensaochun for providing helpful code in issue 380.

0.9.99

Bug fixes

- Accuracy Calculation bug in GlobalTwoStreamEmbeddingSpaceTester (301)
- Mixed precision bug in convert_to_weights (300)

Features

- [HierarchicalSampler](https://kevinmusgrave.github.io/pytorch-metric-learning/samplers/#hierarchicalsampler)
- Improved functionality for [InferenceModel](https://kevinmusgrave.github.io/pytorch-metric-learning/inference_models/#inferencemodel) (296 and 304)
  - train_indexer now accepts a dataset
  - Also added the functions save_index, load_index, and add_to_indexer
- Added power argument to LpRegularizer (299)
- Raise an exception if labels has more than 1 dimension (307)
- [Added a global flag](https://kevinmusgrave.github.io/pytorch-metric-learning/common_functions/#collect_stats) for turning on/off collect_stats (311)
- TripletMarginLoss smooth variant uses the input margin now (315)
- [Use package-specific logger, "PML"](https://kevinmusgrave.github.io/pytorch-metric-learning/common_functions/#logger), instead of root logger (318)
- Cleaner key verification in the trainers (102)
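
Because log messages now go through a logger named "PML" rather than the root logger, their verbosity can be adjusted with the standard `logging` module without affecting other loggers. A minimal sketch using only the standard library (the choice of `WARNING` as the level is just an example):

```python
import logging

# Look up the package-specific logger by the name given above.
pml_logger = logging.getLogger("PML")

# Silence INFO and DEBUG messages from this logger only.
pml_logger.setLevel(logging.WARNING)
```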

Thanks to elias-ramzi, gkouros, vltanh, and Hummer12007

0.9.98

AccuracyCalculator breaking change (issue 290)
The k parameter in AccuracyCalculator has a new behavior. The allowed values are:

* None. This means k will be set to the total number of reference embeddings.
* An integer greater than 0. This means k will be set to the input integer.
* "max_bin_count". This means k will be set to max(bincount(reference_labels)) - self_count where self_count == 1 if the query and reference embeddings come from the same source.

The old behavior is described [here](https://kevinmusgrave.github.io/pytorch-metric-learning/accuracy_calculation/#warning-for-versions-0997).

If your dataset is large, you might find that the k-nn search is now very slow. This is because the new default behavior is to set k to len(reference_embeddings). To avoid this, you can set k to a number, like k = 1000, or try k = "max_bin_count" to get behavior similar (though not identical) to the old default.
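
The three allowed values can be summarized in a small pure-Python sketch. The helper `resolve_k` and its `same_source` argument are hypothetical names used only for illustration; they are not the library's internals.

```python
from collections import Counter


def resolve_k(k, reference_labels, same_source):
    """Map the user-facing k to the number of neighbors to retrieve.

    Hypothetical helper mirroring the three allowed values described above.
    """
    if k is None:
        # Use every reference embedding.
        return len(reference_labels)
    if k == "max_bin_count":
        # Size of the largest class, minus the query itself when the
        # query and reference embeddings come from the same source.
        self_count = 1 if same_source else 0
        return max(Counter(reference_labels).values()) - self_count
    if isinstance(k, int) and k > 0:
        return k
    raise ValueError('k must be None, an integer greater than 0, or "max_bin_count"')


reference_labels = [0, 0, 0, 1, 1]
k_default = resolve_k(None, reference_labels, same_source=True)
k_fixed = resolve_k(2, reference_labels, same_source=True)
k_bin = resolve_k("max_bin_count", reference_labels, same_source=True)
```

With a large reference set, `k_default` grows with the dataset, which is why the search can become slow.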

Apologies for the drastic change. I'm hoping to have things stable and following semantic versioning when v1.0 arrives.


Bug fixes
* lmu.convert_to_triplets has been fixed (291)
* Losses and miners should now be compatible with autocast (293)

New features / improvements
* The loss used in [Supervised Contrastive Learning](https://arxiv.org/abs/2004.11362). Documentation: [SupConLoss](https://kevinmusgrave.github.io/pytorch-metric-learning/losses/#supconloss). By fjsj (281, 288)
* Vectorized convert_to_triplets (279)

0.9.97

Bug fixes
- Small fix for NTXentLoss with no negative pairs (272)
- Fixed .detach() bug in NTXentLoss (282)
- Fixed parameter override bug in MatchFinder.get_matching_pairs() (286) by joaqo

New features and improvements

AccuracyCalculator now uses torch instead of numpy
- All the calculations (except for NMI and AMI) are done with torch. Calculations will be done on the same device and dtype as the input query tensor.
- You can still pass numpy arrays into AccuracyCalculator.get_accuracy, but the arrays will be immediately converted to torch tensors.

Faster custom label comparisons in AccuracyCalculator
- See 264 by mlopezantequera

Numerical stability improvement for DistanceWeightedMiner
See 278 by z1w

UniformHistogramMiner
This is like DistanceWeightedMiner, except that it works well with high-dimensional embeddings and with any distance metric (not just L2 normalized distance). [Documentation](https://kevinmusgrave.github.io/pytorch-metric-learning/miners/#uniformhistogramminer)

PerAnchorReducer
This converts unreduced pairs to unreduced elements. For example, NTXentLoss returns losses per positive pair. If you use PerAnchorReducer with NTXentLoss, the losses per pair are first converted to losses per batch element, and then passed to the inner reducer. See the [documentation](https://kevinmusgrave.github.io/pytorch-metric-learning/reducers/#peranchorreducer).
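
The pair-to-element conversion can be pictured with a small pure-Python sketch. It assumes the per-pair losses are averaged per anchor, which is an assumption of this sketch; the aggregation used by the library may differ. The function `per_anchor_mean` is hypothetical.

```python
from collections import defaultdict


def per_anchor_mean(pair_losses, anchor_indices, batch_size):
    """Average per-pair losses into one loss per batch element.

    Elements that never appear as an anchor get a loss of 0.0.
    Assumption: the aggregation is a plain mean.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for loss, anchor in zip(pair_losses, anchor_indices):
        sums[anchor] += loss
        counts[anchor] += 1
    return [sums[i] / counts[i] if counts[i] else 0.0 for i in range(batch_size)]


pair_losses = [0.5, 1.5, 2.0]  # one loss per positive pair
anchor_indices = [0, 0, 2]     # anchor of each pair
element_losses = per_anchor_mean(pair_losses, anchor_indices, batch_size=3)
```

The resulting list has one entry per batch element, which is the shape an element-wise inner reducer would then consume.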

BaseTester no longer converts embeddings from torch to numpy
This includes the get_all_embeddings function. If you want get_all_embeddings to return numpy arrays, you can set the return_as_numpy flag to True:
```python
embeddings, labels = tester.get_all_embeddings(dataset, model, return_as_numpy=True)
```

The embeddings are converted to numpy only for the visualizer and visualizer_hook, if specified.

Reduced usage of .to(device) and .type(dtype)
Tensors are initialized on device and with the necessary dtype, and they are moved to device and cast to dtypes only when necessary. See [this code snippet](https://github.com/KevinMusgrave/pytorch-metric-learning/blob/3c354998a218409d3ac6f00a9662696acf97ff96/src/pytorch_metric_learning/utils/common_functions.py#L460-L473) for details.

Simplified DivisorReducer
Replaced "divisor_summands" with "divisor".

0.9.96

New Features

Thanks to mlopezantequera for adding the following features!

Testers: allow any combination of query and reference sets (250)
To evaluate different combinations of query and reference sets, use the splits_to_eval argument for tester.test().

For example, let's say your dataset_dict has two keys: "dataset_a" and "train".

- The default splits_to_eval = None is equivalent to:
```python
splits_to_eval = [('dataset_a', ['dataset_a']), ('train', ['train'])]
```

- dataset_a as the query, and train as the reference:
```python
splits_to_eval = [('dataset_a', ['train'])]
```

- dataset_a as the query, and dataset_a + train as the reference:
```python
splits_to_eval = [('dataset_a', ['dataset_a', 'train'])]
```


Then pass splits_to_eval to tester.test:
```python
tester.test(dataset_dict, epoch, model, splits_to_eval=splits_to_eval)
```
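
The default described above pairs every split with itself. A minimal sketch of building that default from the keys of dataset_dict follows; the helper `default_splits_to_eval` is hypothetical, and the `None` values merely stand in for real datasets.

```python
def default_splits_to_eval(dataset_dict):
    """Pair each split with itself as (query, [references])."""
    return [(name, [name]) for name in dataset_dict]


dataset_dict = {"dataset_a": None, "train": None}  # placeholder datasets
splits_to_eval = default_splits_to_eval(dataset_dict)
```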


Note that this new feature makes the old reference_set init argument obsolete, so reference_set has been removed.



AccuracyCalculator: allow arbitrary label comparison functions (254)
AccuracyCalculator now has an optional init argument, label_comparison_fn, which is a function that compares two numpy arrays of labels and returns a boolean array. The default is numpy.equal. If a custom function is used, then you must exclude clustering based metrics ("NMI" and "AMI"). The following is an example of a custom function for two-dimensional labels. It returns True if the 0th column matches, and the 1st column does **not** match:
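```python
from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator

def example_label_comparison_fn(x, y):
    # True where column 0 matches and column 1 differs
    return (x[:, 0] == y[:, 0]) & (x[:, 1] != y[:, 1])

AccuracyCalculator(exclude=("NMI", "AMI"),
                   label_comparison_fn=example_label_comparison_fn)
```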


Other Changes

- BaseTrainer and BaseTester now take in an optional dtype argument. This is the type that the dataset output will be converted to, e.g. torch.float16. If set to the default value of None, then no type casting will be done.
- Removed self.dim_reduced_embeddings from BaseTester and the associated code in HookContainer, due to lack of use.
- tester.test() now returns all_accuracies, whereas before, it returned nothing and you'd have to access all_accuracies either through the end_of_testing_hook or by accessing tester.all_accuracies.
- tester.embeddings_and_labels is deleted at the end of tester.test() to free up memory.

0.9.95

New

BatchEasyHardMiner
This new miner is an implementation of [Improved Embeddings with Easy Positive Triplet Mining](https://openaccess.thecvf.com/content_WACV_2020/papers/Xuan_Improved_Embeddings_with_Easy_Positive_Triplet_Mining_WACV_2020_paper.pdf). See [the documentation](https://kevinmusgrave.github.io/pytorch-metric-learning/miners/#batcheasyhardminer). Thanks marijnl!

New metric added to AccuracyCalculator
The new metric is mean_average_precision, which is the commonly used k-nn based mAP in information retrieval.
Note that this differs from the already existing metric, mean_average_precision_at_r.
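
To illustrate the difference between the two metrics, here is a pure-Python sketch for a single query. It assumes the k-nn based variant normalizes by the number of matches found among the k neighbors, while the at-R variant looks only at the top R neighbors and normalizes by R. These function names and the normalization of the k-nn variant are assumptions of this sketch, not a statement of the library's exact implementation.

```python
def precision_sum(is_match):
    """Sum precision@i over the ranks i where the i-th neighbor is a match."""
    total, correct = 0.0, 0
    for rank, match in enumerate(is_match, start=1):
        if match:
            correct += 1
            total += correct / rank
    return total


def average_precision_knn(is_match):
    """k-nn based AP: normalize by the number of matches in the top k."""
    num_matches = sum(is_match)
    return precision_sum(is_match) / num_matches if num_matches else 0.0


def average_precision_at_r(is_match, r):
    """AP at R: use only the top R neighbors and normalize by R."""
    return precision_sum(is_match[:r]) / r if r else 0.0


# Whether each of the k = 4 nearest neighbors shares the query's label.
is_match = [True, False, True, False]
ap_knn = average_precision_knn(is_match)
# Suppose the reference set holds R = 3 items with the query's label.
ap_at_r = average_precision_at_r(is_match, r=3)
```

Averaging these per-query values over all queries gives the corresponding mean metric.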

Bug fixes

- dtype casting in MultiSimilarityMiner changed to work with autocast. See 233 by thinline72
- Added logic for dealing with zero rows in the weight matrix in DistanceWeightedMiner by ignoring them. For example, if the entire weight matrix is 0, then no triplets will be returned. Previously, the zero rows would cause a RuntimeError. See 230 by tpanum
