Introduction to NLP, Regular Expressions, Regular Expressions in Practical NLP, Word Tokenization, Word Normalization and Stemming, Sentence Segmentation.
Defining Minimum Edit Distance, Computing Minimum Edit Distance, Backtrace for Computing Alignments, Minimum Edit Distance in Computational Biology, Weighted Minimum Edit Distance.
Introduction to N-grams, Estimating N-gram Probabilities, Evaluation and Perplexity, Generalization and Zeros.
Add-One Smoothing, Interpolation, Good-Turing Smoothing, Kneser-Ney Smoothing.
The Spelling Correction Task, The Noisy Channel Model of Spelling, Real-Word Spelling Correction.
State-of-the-Art Systems, What is Text Classification, Text Classification and Naive Bayes, Formalizing the Naive Bayes Classifier, Naive Bayes Relationship to Language Modelling, Precision, Recall, and the F-measure, Text Classification Evaluation, Practical Issues in Text Classification.
What is Sentiment Analysis, Sentiment Analysis: A Baseline Algorithm, Sentiment Lexicons, Learning Sentiment Lexicons, Generative vs Discriminative Models, Making Features from Text for Discriminative NLP Models.
Feature-Based Linear Classifiers, Building a Maxent Model, Generative vs Discriminative Models: The Problem of Overcounting Evidence, Introduction to Information Extraction, Evaluation of Named Entity Recognition, Sequence Models for Named Entity Recognition.