For undergraduate or advanced undergraduate courses in Classical Natural Language Processing, Statistical Natural Language Processing, Speech Recognition, Computational Linguistics, and Human Language Processing. An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology -- at all levels and with all modern technologies -- this text takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corpora. The authors cover areas that are traditionally taught in different courses, to describe a unified vision of speech and language processing. Emphasis is on practical applications and scientific evaluation. An accompanying website contains teaching materials for instructors, with pointers to language processing resources on the Web. The Second Edition offers a significant amount of new and extended material.

Supplements (downloadable files under the "Resources" tab):
* Solutions
* PowerPoint Lecture Slides - Chapters 1-5, 8-10, 12-13 and 24 now available
* For additional resources, visit the author website: http://www.cs.colorado.edu/~martin/slp.html
| ISBN | 0135041961 |
| ISBN13 | 9780135041963 |
| Previous ISBN | 9780131227989 |
| Part volume | International Version |
| Publisher | Pearson Education (US) |
| Imprint | Pearson |
| Published in | Upper Saddle River |
| Publication date | 29 Apr 2008 |
| Format | Paperback |
| Pages | 1024 |
| Weight (grammes) | 1320 |
| Height (mm) | 235 |
| Width (mm) | 178 |
| DEWEY | 410.285 |
| DEWEY edition | DC22 |
| Academic level | Tertiary education |
Foreword
Preface
About the Authors

1 Introduction
1.1 Knowledge in Speech and Language Processing
1.2 Ambiguity
1.3 Models and Algorithms
1.4 Language, Thought, and Understanding
1.5 The State of the Art
1.6 Some Brief History
    1.6.1 Foundational Insights: 1940s and 1950s
    1.6.2 The Two Camps: 1957--1970
    1.6.3 Four Paradigms: 1970--1983
    1.6.4 Empiricism and Finite State Models Redux: 1983--1993
    1.6.5 The Field Comes Together: 1994--1999
    1.6.6 The Rise of Machine Learning: 2000--2008
    1.6.7 On Multiple Discoveries
    1.6.8 A Final Brief Note on Psychology
1.7 Summary
Bibliographical and Historical Notes

Part I Words

2 Regular Expressions and Automata
2.1 Regular Expressions
    2.1.1 Basic Regular Expression Patterns
    2.1.2 Disjunction, Grouping, and Precedence
    2.1.3 A Simple Example
    2.1.4 A More Complex Example
    2.1.5 Advanced Operators
    2.1.6 Regular Expression Substitution, Memory, and ELIZA
2.2 Finite-State Automata
    2.2.1 Using an FSA to Recognize Sheeptalk
    2.2.2 Formal Languages
    2.2.3 Another Example
    2.2.4 Non-Deterministic FSAs
    2.2.5 Using an NFSA to Accept Strings
    2.2.6 Recognition as Search
    2.2.7 Relating Deterministic and Non-Deterministic Automata
2.3 Regular Languages and FSAs
2.4 Summary
Bibliographical and Historical Notes
Exercises

3 Words and Transducers
3.1 Survey of (Mostly) English Morphology
    3.1.1 Inflectional Morphology
    3.1.2 Derivational Morphology
    3.1.3 Cliticization
    3.1.4 Non-Concatenative Morphology
    3.1.5 Agreement
3.2 Finite-State Morphological Parsing
3.3 Construction of a Finite-State Lexicon
3.4 Finite-State Transducers
    3.4.1 Sequential Transducers and Determinism
3.5 FSTs for Morphological Parsing
3.6 Transducers and Orthographic Rules
3.7 The Combination of an FST Lexicon and Rules
3.8 Lexicon-Free FSTs: The Porter Stemmer
3.9 Word and Sentence Tokenization
    3.9.1 Segmentation in Chinese
3.10 Detection and Correction of Spelling Errors
3.11 Minimum Edit Distance
3.12 Human Morphological Processing
3.13 Summary
Bibliographical and Historical Notes
Exercises

4 N-grams
4.1 Word Counting in Corpora
4.2 Simple (Unsmoothed) N-grams
4.3 Training and Test Sets
    4.3.1 N-gram Sensitivity to the Training Corpus
    4.3.2 Unknown Words: Open Versus Closed Vocabulary Tasks
4.4 Evaluating N-grams: Perplexity
4.5 Smoothing
    4.5.1 Laplace Smoothing
    4.5.2 Good-Turing Discounting
    4.5.3 Some Advanced Issues in Good-Turing Estimation
4.6 Interpolation
4.7 Backoff
    4.7.1 Advanced: Details of Computing Katz Backoff α and P*
4.8 Practical Issues: Toolkits and Data Formats
4.9 Advanced Issues in Language Modeling
    4.9.1 Advanced Smoothing Methods: Kneser-Ney Smoothing
    4.9.2 Class-Based N-grams
    4.9.3 Language Model Adaptation and Web Use
    4.9.4 Using Longer Distance Information: A Brief Summary
4.10 Advanced: Information Theory Background
    4.10.1 Cross-Entropy for Comparing Models
4.11 Advanced: The Entropy of English and Entropy Rate Constancy
4.12 Summary
Bibliographical and Historical Notes
Exercises

5 Part-of-Speech Tagging
5.1 (Mostly) English Word Classes
5.2 Tagsets for English
5.3 Part-of-Speech Tagging
5.4 Rule-Based Part-of-Speech Tagging
5.5 HMM Part-of-Speech Tagging
    5.5.1 Computing the Most-Likely Tag Sequence: An Example
    5.5.2 Formalizing Hidden Markov Model Taggers
    5.5.3 Using the Viterbi Algorithm for HMM Tagging
    5.5.4 Extending the HMM Algorithm to Trigrams
5.6 Transformation-Based Tagging
    5.6.1 How TBL Rules Are Applied
    5.6.2 How TBL Rules Are Learned
5.7 Evaluation and Error Analysis
    5.7.1 Error Analysis
5.8 Advanced Issues in Part-of-Speech Tagging
    5.8.1 Practical Issues: Tag Indeterminacy and Tokenization
    5.8.2 Unknown Words
    5.8.3 Part-of-Speech Tagging for Other Languages
    5.8.4 Tagger Combination
5.9 Advanced: The Noisy Channel Model for Spelling
    5.9.1 Contextual Spelling Error Correction
5.10 Summary
Bibliographical and Historical Notes
Exercises

6 Hidden Markov and Maximum Entropy Models
6.1 Markov Chains
6.2 The Hidd