- Fix bug in the event that a token id is supplied that overrides a default of
an inferred token.
- Add `pad_at_end` boolean setting that when True pads at the end of the
sequence, and when False pads at the beginning.
- Add dedicated `Vocab` object which replaces the dictionary of string to
integer.
- Update tokenizer integration to override `convert_tokens_to_string`
- Fix bug when trying to save the huggingface tokenizer.
- Make the third party "extras" during python packaging.
- Add better testing and batch encoding operatons.