Detail View

Foundations of statistical natural language processing (loaned 21 times)

Material type
Book (monograph)
Personal Author
Manning, Christopher D. ; Schütze, Hinrich.
Title Statement
Foundations of statistical natural language processing / Christopher D. Manning, Hinrich Schütze.
Publication, Distribution, etc.
Cambridge, Mass. : MIT Press, c1999.
Physical Medium
xxxvii, 680 p. ; 24 cm.
ISBN
0262133601
Bibliography, Etc. Note
Includes bibliographical references (p. 611-655) and index.
Subject Added Entry-Topical Term
Computational linguistics -- Statistical methods.
000 00000pam u2200205 a 4500
001 000000657802
005 20151021093011
008 990211s1999 mau b 001 0 eng
010 ▼a 99021137
020 ▼a 0262133601
040 ▼a DLC ▼c DLC ▼d UKM ▼d 211009
049 ▼l 111146434
050 0 0 ▼a P98.5.S83 ▼b M36 1999
082 0 0 ▼a 410/.285 ▼2 21
084 ▼a 410.285 ▼2 DDCK
090 ▼a 410.285 ▼b M283f
100 1 ▼a Manning, Christopher D.
245 1 0 ▼a Foundations of statistical natural language processing / ▼c Christopher D. Manning, Hinrich Schütze.
260 ▼a Cambridge, Mass. : ▼b MIT Press, ▼c c1999.
300 ▼a xxxvii, 680 p. ; ▼c 24 cm.
504 ▼a Includes bibliographical references (p. 611-655) and index.
650 0 ▼a Computational linguistics ▼x Statistical methods.
700 1 ▼a Schütze, Hinrich.

Holdings Information

No. 1
Location
Main Library / Western Books
Call Number
410.285 M283f
Accession No.
111146434
Availability
Available

Table of Contents

Ⅰ Preliminaries = 1

 1 Introduction = 3

 2 Mathematical Foundations = 39

 3 Linguistic Essentials = 81

 4 Corpus-Based Work = 117

Ⅱ Words = 149

 5 Collocations = 151

 6 Statistical Inference : n-gram Models over Sparse Data = 191

 7 Word Sense Disambiguation = 229

 8 Lexical Acquisition = 265

Ⅲ Grammar = 315

 9 Markov Models = 317

 10 Part-of-Speech Tagging = 341

 11 Probabilistic Context Free Grammars = 381

 12 Probabilistic Parsing = 407

Ⅳ Applications and Techniques = 461

 13 Statistical Alignment and Machine Translation = 463

 14 Clustering = 495

 15 Topics in Information Retrieval = 529

 16 Text Categorization = 575

List of Tables = xv

List of Figures = xxi

Table of Notations = xxv

Preface = xxix

Road Map = xxxv

Ⅰ Preliminaries = 1

 1 Introduction = 3

  1.1 Rationalist and Empiricist Approaches to Language = 4

  1.2 Scientific Content = 7

   1.2.1 Questions that linguistics should answer = 8

   1.2.2 Non-categorical phenomena in language = 11

   1.2.3 Language and cognition as probabilistic phenomena = 15

  1.3 The Ambiguity of Language : Why NLP Is Difficult = 17

  1.4 Dirty Hands = 19

   1.4.1 Lexical resources = 19

   1.4.2 Word counts = 20

   1.4.3 Zipf's laws = 23

   1.4.4 Collocations = 29

   1.4.5 Concordances = 31

  1.5 Further Reading = 34

  1.6 Exercises = 35

 2 Mathematical Foundations = 39

  2.1 Elementary Probability Theory = 40

   2.1.1 Probability spaces = 40

   2.1.2 Conditional probability and independence = 42

   2.1.3 Bayes' theorem = 43

   2.1.4 Random variables = 45

   2.1.5 Expectation and variance = 46

   2.1.6 Notation = 47

   2.1.7 Joint and conditional distributions = 48

   2.1.8 Determining P = 48

   2.1.9 Standard distributions = 50

   2.1.10 Bayesian statistics = 54

   2.1.11 Exercises = 59

  2.2 Essential Information Theory = 60

   2.2.1 Entropy = 61

   2.2.2 Joint entropy and conditional entropy = 63

   2.2.3 Mutual information = 66

   2.2.4 The noisy channel model = 68

   2.2.5 Relative entropy or Kullback-Leibler divergence = 72

   2.2.6 The relation to language : Cross entropy = 73

   2.2.7 The entropy of English = 76

   2.2.8 Perplexity = 78

   2.2.9 Exercises = 78

  2.3 Further Reading = 79

 3 Linguistic Essentials = 81

  3.1 Parts of Speech and Morphology = 81

   3.1.1 Nouns and pronouns = 83

   3.1.2 Words that accompany nouns : Determiners and adjectives = 87

   3.1.3 Verbs = 88

   3.1.4 Other parts of speech = 91

  3.2 Phrase Structure = 93

   3.2.1 Phrase structure grammars = 96

   3.2.2 Dependency : Arguments and adjuncts = 101

   3.2.3 X' theory = 106

   3.2.4 Phrase structure ambiguity = 107

   6.1.3 Building n-gram models = 195

  6.2 Statistical Estimators = 196

   6.2.1 Maximum Likelihood Estimation (MLE) = 197

   6.2.2 Laplace's law, Lidstone's law and the Jeffreys-Perks law = 202

   6.2.3 Held out estimation = 205

   6.2.4 Cross-validation (deleted estimation) = 210

   6.2.5 Good-Turing estimation = 212

   6.2.6 Briefly noted = 216

  6.3 Combining Estimators = 217

   6.3.1 Simple linear interpolation = 218

   6.3.2 Katz's backing-off = 219

   6.3.3 General linear interpolation = 220

   6.3.4 Briefly noted = 222

   6.3.5 Language models for Austen = 223

  6.4 Conclusions = 224

  6.5 Further Reading = 225

  6.6 Exercises = 225

 7 Word Sense Disambiguation = 229

  7.1 Methodological Preliminaries = 232

   7.1.1 Supervised and unsupervised learning = 232

   7.1.2 Pseudowords = 233

   7.1.3 Upper and lower bounds on performance = 233

  7.2 Supervised Disambiguation = 235

   7.2.1 Bayesian classification = 235

   7.2.2 An information-theoretic approach = 239

  7.3 Dictionary-Based Disambiguation = 241

   7.3.1 Disambiguation based on sense definitions = 242

   7.3.2 Thesaurus-based disambiguation = 244

   7.3.3 Disambiguation based on translations in a second-language corpus = 247

   7.3.4 One sense per discourse, one sense per collocation = 249

  7.4 Unsupervised Disambiguation = 252

  7.5 What Is a Word Sense ? = 256

  7.6 Further Reading = 260

  7.7 Exercises = 262

 8 Lexical Acquisition = 265

  8.1 Evaluation Measures = 267

  8.2 Verb Subcategorization = 271

  8.3 Attachment Ambiguity = 278

   8.3.1 Hindle and Rooth (1993) = 280

   8.3.2 General remarks on PP attachment = 284

  8.4 Selectional Preferences = 288

  8.5 Semantic Similarity = 294

   8.5.1 Vector space measures = 296

   8.5.2 Probabilistic measures = 303

  8.6 The Role of Lexical Acquisition in Statistical NLP = 308

  8.7 Further Reading = 312

Ⅲ Grammar = 315

 9 Markov Models = 317

  9.1 Markov Models = 318

  9.2 Hidden Markov Models = 320

   9.2.1 Why use HMMs ? = 322

   9.2.2 General form of an HMM = 324

  9.3 The Three Fundamental Questions for HMMs = 325

   9.3.1 Finding the probability of an observation = 326

   9.3.2 Finding the best state sequence = 331

   9.3.3 The third problem : Parameter estimation = 333

  9.4 HMMs : Implementation, Properties, and Variants = 336

   9.4.1 Implementation = 336

   9.4.2 Variants = 337

   9.4.3 Multiple input observations = 338

   9.4.4 Initialization of parameter values = 339

  9.5 Further Reading = 339

 10 Part-of-Speech Tagging = 341

  10.1 The Information Sources in Tagging = 343

  10.2 Markov Model Taggers = 345

   10.2.1 The probabilistic model = 345

   10.2.2 The Viterbi algorithm = 349

   10.2.3 Variations = 351

  10.3 Hidden Markov Model Taggers = 356

   10.3.1 Applying HMMs to POS tagging = 357

   10.3.2 The effect of initialization on HMM training = 359

  10.4 Transformation-Based Learning of Tags = 361

   10.4.1 Transformations = 362

   10.4.2 The learning algorithm = 364

   10.4.3 Relation to other models = 365

   10.4.4 Automata = 367

   10.4.5 Summary = 369

  10.5 Other Methods, Other Languages = 370

   10.5.1 Other approaches to tagging = 370

   10.5.2 Languages other than English = 371

  10.6 Tagging Accuracy and Uses of Taggers = 371

   10.6.1 Tagging accuracy = 371

   10.6.2 Applications of tagging = 374

  10.7 Further Reading = 377

  10.8 Exercises = 379

 11 Probabilistic Context Free Grammars = 381

  11.1 Some Features of PCFGs = 386

  11.2 Questions for PCFGs = 388

  11.3 The Probability of a String = 392

   11.3.1 Using inside probabilities = 392

   11.3.2 Using outside probabilities = 394

   11.3.3 Finding the most likely parse for a sentence = 396

   11.3.4 Training a PCFG = 398

  11.4 Problems with the Inside-Outside Algorithm = 401

  11.5 Further Reading = 402

  11.6 Exercises = 404

 12 Probabilistic Parsing = 407

  12.1 Some Concepts = 408

   12.1.1 Parsing for disambiguation = 408

   12.1.2 Treebanks = 412

   12.1.3 Parsing models vs. language models = 414

   12.1.4 Weakening the independence assumptions of PCFGs = 416

   12.1.5 Tree probabilities and derivational probabilities = 421

   12.1.6 There's more than one way to do it = 423

   12.1.7 Phrase structure grammars and dependency grammars = 428

   12.1.8 Evaluation = 431

   12.1.9 Equivalent models = 437

   12.1.10 Building parsers : Search methods = 439

   12.1.11 Use of the geometric mean = 442

  12.2 Some Approaches = 443

   12.2.1 Non-lexicalized treebank grammars = 443

   12.2.2 Lexicalized models using derivational histories = 448

   12.2.3 Dependency-based models = 451

   12.2.4 Discussion = 454

  12.3 Further Reading = 456

  12.4 Exercises = 458

Ⅳ Applications and Techniques = 461

 13 Statistical Alignment and Machine Translation = 463

  13.1 Text Alignment = 466

   13.1.1 Aligning sentences and paragraphs = 467

   13.1.2 Length-based methods = 471

   13.1.3 Offset alignment by signal processing techniques = 475

   13.1.4 Lexical methods of sentence alignment = 478

   13.1.5 Summary = 484

   13.1.6 Exercises = 484

  13.2 Word Alignment = 484

  13.3 Statistical Machine Translation = 486

  13.4 Further Reading = 492

 14 Clustering = 495

  14.1 Hierarchical Clustering = 500

   14.1.1 Single-link and complete-link clustering = 503

   14.1.2 Group-average agglomerative clustering = 507

   14.1.3 An application : Improving a language model = 509

   14.1.4 Top-down clustering = 512

  14.2 Non-Hierarchical Clustering = 514

   14.2.1 K-means = 515

   14.2.2 The EM algorithm = 518

  14.3 Further Reading = 527

  14.4 Exercises = 528

 15 Topics in Information Retrieval = 529

  15.1 Some Background on Information Retrieval = 530

   15.1.1 Common design features of IR systems = 532

   15.1.2 Evaluation measures = 534

   15.1.3 The probability ranking principle (PRP) = 538

  15.2 The Vector Space Model = 539

   15.2.1 Vector similarity = 540

   15.2.2 Term weighting = 541

  15.3 Term Distribution Models = 544

   15.3.1 The Poisson distribution = 545

   15.3.2 The two-Poisson model = 548

   15.3.3 The K mixture = 549

   15.3.4 Inverse document frequency = 551

   15.3.5 Residual inverse document frequency = 553

   15.3.6 Usage of term distribution models = 554

  15.4 Latent Semantic Indexing = 554

   15.4.1 Least-squares methods = 557

   15.4.2 Singular Value Decomposition = 558

   15.4.3 Latent Semantic Indexing in IR = 564

  15.5 Discourse Segmentation = 566

   15.5.1 TextTiling = 567

  15.6 Further Reading = 570

  15.7 Exercises = 573

 16 Text Categorization = 575

  16.1 Decision Trees = 578

  16.2 Maximum Entropy Modeling = 589

   16.2.1 Generalized iterative scaling = 591

   16.2.2 Application to text categorization = 594

  16.3 Perceptrons = 597

  16.4 k Nearest Neighbor Classification = 604

  16.5 Further Reading = 607

Tiny Statistical Tables = 609

Bibliography = 611

Index = 657


