Unsloth tag for `xxxTrainer`
If you use the Unsloth library, the `unsloth` tag is now automatically pushed to the Hub.
* [`xxxTrainer`] Add unsloth tag by younesbelkada in https://github.com/huggingface/trl/pull/1130
DPO fixes
Some important fixes for DPO have been introduced to address https://twitter.com/jon_durbin/status/1743575483365699809 and to make DPO faster.
* Allow separate devices for target/ref models. by jondurbin in https://github.com/huggingface/trl/pull/1190
* Allow swapping PEFT adapters for target/ref model. by jondurbin in https://github.com/huggingface/trl/pull/1193
* Change device access order for speedup of calculating metrics in DPOTrainer by brcps12 in https://github.com/huggingface/trl/pull/1154
DDPO + PEFT
DDPO now supports PEFT.
* add: support for `peft` in ddpo. by sayakpaul in https://github.com/huggingface/trl/pull/1165
Other fixes
* add peft_module_casting_to_bf16 in DPOTrainer by sywangyi in https://github.com/huggingface/trl/pull/1143
* SFT Tokenizer Fix by ChrisCates in https://github.com/huggingface/trl/pull/1142
* Minor fixes to some comments in some examples. by mattholl in https://github.com/huggingface/trl/pull/1156
* Correct shapes in docstring of PPOTrainer's train_minibatch method by nikihowe in https://github.com/huggingface/trl/pull/1170
* Update sft_trainer.py by Hemanthkumar2112 in https://github.com/huggingface/trl/pull/1162
* Fix batch all gather by vwxyzjn in https://github.com/huggingface/trl/pull/1177
* Address issue 1122 by maneandrea in https://github.com/huggingface/trl/pull/1174
* Fix misleading variable "epoch" from the training loop from PPOTrainer Doc. by Jfhseh in https://github.com/huggingface/trl/pull/1171
* SFTTrainer: follow args.remove_unused_columns by mgerstgrasser in https://github.com/huggingface/trl/pull/1188
* Handle last token from generation prompt by pablovicente in https://github.com/huggingface/trl/pull/1153
New Contributors
* ChrisCates made their first contribution in https://github.com/huggingface/trl/pull/1142
* brcps12 made their first contribution in https://github.com/huggingface/trl/pull/1154
* mattholl made their first contribution in https://github.com/huggingface/trl/pull/1156
* sayakpaul made their first contribution in https://github.com/huggingface/trl/pull/1165
* nikihowe made their first contribution in https://github.com/huggingface/trl/pull/1170
* Hemanthkumar2112 made their first contribution in https://github.com/huggingface/trl/pull/1162
* maneandrea made their first contribution in https://github.com/huggingface/trl/pull/1174
* Jfhseh made their first contribution in https://github.com/huggingface/trl/pull/1171
* mgerstgrasser made their first contribution in https://github.com/huggingface/trl/pull/1188
* pablovicente made their first contribution in https://github.com/huggingface/trl/pull/1153
* jondurbin made their first contribution in https://github.com/huggingface/trl/pull/1190
**Full Changelog**: https://github.com/huggingface/trl/compare/v0.7.7...v0.7.8