My NLP Curriculum
by Kavita Ganeshan
Data Science, Machine Learning, and Deep Learning are the buzzwords of the moment.
There are a lot of MOOCs, online courses, and certifications available for these topics. I have always worked on text, and I wanted to enroll in a course focused on Natural Language Processing. Having gone through the syllabus and contents of a few courses online, I felt the need to create a curriculum of my own. The online material available is scattered, and I kept having to leave each course to fill gaps in my knowledge elsewhere. So this is my attempt at updating my NLP knowledge. Feel free to use this and modify it to your needs.
Major Topics that I need to work on:
- Programming - Python, PyTorch
- Math - Linear Algebra, Probability, Statistics
- NLP - Linguistics, Statistical NLP, Deep NLP
- Data Science - Pandas
- MOOC - Andrew Ng, Stanford NLP, Oxford DeepMind lectures, edX, Fast.ai
- Reference - Jason Brownlee, Dan Jurafsky's Speech & Language Processing book, Cracking the Coding Interview
I am going to give myself about 8 months to finish this curriculum. The test is to come up with my own project: an implementation of something interesting in NLP using deep learning (more like a thesis, if possible). I will grade myself, and I must say I am my own worst critic. So trust me, this is a difficult assignment!
I will update this list as and when I find something new to add.
- Python:
  - Generators
  - Vectorization
  - Data Structures
  - NumPy structures and their implementation
  - scikit-learn structures and their implementation
  - Algorithms
  - Indexing and searching in dictionary in Python
  - Interview Cake implementation for best time and space complexity
  - Matrix assignment
- Linear Algebra:
  - Matrix Vectors
  - Tensors
- Probability:
  - Conditional Probability - Heads Tails - Questions for interviews
- Statistics:
  - Definitions, Metrics
  - Correlation
  - Distributions
- Statistical NLP:
  - Vectorizer / Transformer - scikit-learn
  - HMM
  - CRF
  - Topic Modelling - LDA
  - Sequence labelling
  - Feature selection
  - Dimensionality reduction - PCA, ICA
- Deep NLP:
  - Word2vec - CBOW / Skip-gram
  - Representation Learning
  - CNN for text
  - RNN for text
  - Attention model for text
  - PyTorch for text
  - BERT / Transformers - Hugging Face
- Linguistics:
  - Discourse Segmentation
- Pandas:
  - Dataframe manipulations
- MOOC:
  - Andrew Ng - CNN
  - Stanford NLP - Richard Socher
  - Oxford
  - Fast.ai - Rachel Thomas