Implementing classic NLP models from scratch with clean code and easy-to-understand architecture.
This library is for educational purposes only and is not optimized for production use. It may still contain bugs, so feel free to contribute and report issues.
So far we only have simple tests, which is not enough; much more rigorous testing is planned. We will also add more docs so you can run the models easily, and more playgrounds so you can experiment with them and look inside the model implementations.
10 important NLP models, ranging from 2003 to 2020:
- NNLM(2003)
- Word2Vec(2013)
- Seq2Seq(2014)
- Attention(2015)
- fastText(2016)
- Transformer(2017)
- BERT(2018)
- GPT(2018)
- XLNet(2019)
- T5(2020)
Yes, there are some differences in the implementations. The goal of toynlp is to provide simple, educational implementations of these models, which may not include all the optimizations and features from the original papers.
The reason is that I want to focus on the core ideas and concepts behind each model, rather than getting bogged down in implementation details, especially when the original papers introduce complexities that are not essential for understanding their main contributions.
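To give a concrete sense of what "focusing on the core idea" means, here is a minimal, purely illustrative sketch of the scaled dot-product attention at the heart of the Transformer (2017). It is not toynlp's actual code or API, just the kind of stripped-down implementation the library aims for:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

# Toy usage: a batch of 2 sequences, length 5, hidden size 8.
q = k = v = torch.randn(2, 5, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 5, 8])
```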
But, I do need to add docs for each model to clarify these differences and provide guidance on how to use the implementations effectively. I'll do this later. Let's first make it work and then make it better.
Well, it's in toyllm!
I separated the models into two libraries: toynlp for traditional "small" NLP models, and toyllm for LLMs, which are typically larger and more complex.
Glad you asked! The "toy" style is all about simplicity and educational value. Besides toynlp and toyllm, we have two more toys: toyml for traditional machine learning models, and toyrl for deep reinforcement learning models.