Description
With a machine learning approach and less focus on linguistic details, this gentle introduction to natural language processing develops fundamental mathematical and deep learning models for NLP under a unified framework. NLP problems are systematically organised by their machine learning nature, including classification, sequence labelling, and sequence-to-sequence problems. Topics covered include statistical machine learning and deep learning models, text classification and structured prediction models, generative and discriminative models, supervised and unsupervised learning with latent variables, neural networks, and transition-based methods. Rich connections are drawn between concepts throughout the book, equipping students with the tools needed to establish a deep understanding of NLP solutions, adapt existing models, and confidently develop innovative models of their own. Featuring a host of examples, intuitions, and end-of-chapter exercises, plus sample code available as an online resource, this textbook is an invaluable tool for upper-undergraduate and graduate students.
Systematically discusses natural language processing from a machine learning perspective, delivering a deeper mathematical understanding of NLP solutions. Students can then harness this knowledge to solve NLP tasks and build better NLP models.
Provides running examples, figures, and high-level descriptions throughout, allowing students to absorb machine learning concepts and proofs in a meaningful way
Features in-depth discussion of deep learning methods and NLP
Establishes strong connections between deep learning and linear models for NLP, smoothing the steep learning curve for students as they relate these concepts within a unified framework
Explains the reasoning behind NLP models so that engineers will be able to better use, tailor, and even improve them
Table of Contents
Part I. Basics:
1. Introduction
2. Counting relative frequencies
3. Feature vectors
4. Discriminative linear classifiers
5. A perspective from information theory
6. Hidden variables
Part II. Structures:
7. Generative sequence labelling
8. Discriminative sequence labelling
9. Sequence segmentation
10. Predicting tree structures
11. Transition-based methods for structured prediction
12. Bayesian models
Part III. Deep Learning:
13. Neural networks
14. Representation learning
15. Neural structured prediction
16. Working with two texts
17. Pre-training and transfer learning
18. Deep latent variable models
Index.
Yue Zhang, Westlake University
Zhiyang Teng, Westlake University