Matthew Hernandez 6f12edb0cc Fix issue: 731 by resolving semantic error (#738) 4 months ago
README.md a22d612be6 Bonus material: extending tokenizers (#496) 10 months ago
bpe-from-scratch.ipynb 6f12edb0cc Fix issue: 731 by resolving semantic error (#738) 4 months ago
tests.py 6f12edb0cc Fix issue: 731 by resolving semantic error (#738) 4 months ago

README.md

Byte Pair Encoding (BPE) Tokenizer From Scratch

  • bpe-from-scratch.ipynb contains optional bonus code that walks through how the BPE tokenizer works under the hood.
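
For orientation, here is a minimal sketch of the core BPE idea: repeatedly merge the most frequent adjacent pair of tokens into a new token. It is not the notebook's code. The function names (`train_bpe`, `merge_pair`, `encode`, `decode`) and the byte-level starting vocabulary of IDs 0 to 255 are assumptions chosen for illustration, and the notebook's implementation may differ in these and other details.

```python
from collections import Counter


def merge_pair(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2  # skip both members of the merged pair
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merge rules from `text` (toy, byte-level)."""
    ids = list(text.encode("utf-8"))  # assumption: start from raw bytes (IDs 0..255)
    merges = {}                       # (left_id, right_id) -> new token ID
    next_id = 256
    for _ in range(num_merges):
        pair_counts = Counter(zip(ids, ids[1:]))
        if not pair_counts:
            break
        pair, count = pair_counts.most_common(1)[0]
        if count < 2:
            break  # nothing repeats any more, so further merges would not compress
        merges[pair] = next_id
        ids = merge_pair(ids, pair, next_id)
        next_id += 1
    return merges


def encode(text, merges):
    """Tokenize `text` by applying the learned merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():  # dicts preserve insertion order
        ids = merge_pair(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Map token IDs back to text by expanding each merged token into its bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = "low lower lowest low low"
    rules = train_bpe(sample, num_merges=5)
    token_ids = encode(sample, rules)
    print(len(sample.encode("utf-8")), "bytes ->", len(token_ids), "tokens")
    print(decode(token_ids, rules) == sample)
```

Applying merges in learned order keeps the sketch short. Production tokenizers typically add pre-tokenization, special tokens, and faster merge lookups, none of which are shown here.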