Text mining with R : a tidy approach (Loan 15 times)

Material type
Monograph
Personal Author
Silge, Julia. Robinson, David.
Title Statement
Text mining with R : a tidy approach / Julia Silge and David Robinson.
Publication, Distribution, etc
Sebastopol, CA : O'Reilly Media, 2017.
Physical Medium
xii, 178 p. : ill. ; 24 cm.
ISBN
9781491981658
Bibliography, Etc. Note
Includes bibliographical references and index.
Subject Added Entry-Topical Term
R (Computer program language). Data mining.
000 00000nam u2200205 a 4500
001 000045912516
005 20170821150715
008 170817s2017 caua b 001 0 eng d
020 ▼a 9781491981658
040 ▼a 211009 ▼c 211009 ▼d 211009
082 0 4 ▼a 006.312 ▼2 23
084 ▼a 006.312 ▼2 DDCK
090 ▼a 006.312 ▼b S582t
100 1 ▼a Silge, Julia.
245 1 0 ▼a Text mining with R : ▼b a tidy approach / ▼c Julia Silge and David Robinson.
260 ▼a Sebastopol, CA : ▼b O'Reilly Media, ▼c 2017.
300 ▼a xii, 178 p. : ▼b ill. ; ▼c 24 cm.
504 ▼a Includes bibliographical references and index.
650 0 ▼a R (Computer program language).
650 0 ▼a Data mining.
700 1 ▼a Robinson, David.
945 ▼a KLPA

Holdings Information

No. | Location | Call Number | Accession No. | Availability | Due Date
1 | Main Library/Western Books | 006.312 S582t | 111777424 | Available | -

Contents information

Author Introduction

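Julia Silge (Author)

Julia is a data scientist at Stack Overflow. She analyzes complex datasets and communicates about technical topics with diverse audiences. She holds a PhD in astrophysics, loves Jane Austen, and enjoys making beautiful charts.

David Robinson (Author)

David works as a data scientist at Stack Overflow and holds a PhD in computational biology from Princeton University. He develops R packages such as broom, gganimate, fuzzyjoin, and widyr, mostly as open source.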

Information Provided By: Aladin

Table of Contents

CONTENTS
Preface = vii
1. The Tidy Text Format = 1
 Contrasting Tidy Text with Other Data Structures = 2
 The unnest_tokens Function = 2
 Tidying the Works of Jane Austen = 4
 The gutenbergr Package = 7
 Word Frequencies = 8
 Summary = 12
2. Sentiment Analysis with Tidy Data = 13
 The sentiments Dataset = 14
 Sentiment Analysis with Inner Join = 16
 Comparing the Three Sentiment Dictionaries = 19
 Most Common Positive and Negative Words = 22
 Wordclouds = 25
 Looking at Units Beyond Just Words = 27
 Summary = 29
3. Analyzing Word and Document Frequency : tf-idf = 31
 Term Frequency in Jane Austen's Novels = 32
 Zipf's Law = 34
 The bind_tf_idf Function = 37
 A Corpus of Physics Texts = 40
 Summary = 44
4. Relationships Between Words : N-grams and Correlations = 45
 Tokenizing by N-gram = 45
  Counting and Filtering N-grams = 46
  Analyzing Bigrams = 48
  Using Bigrams to Provide Context in Sentiment Analysis = 51
  Visualizing a Network of Bigrams with ggraph = 54
  Visualizing Bigrams in Other Texts = 59
 Counting and Correlating Pairs of Words with the widyr Package = 61
  Counting and Correlating Among Sections = 62
  Examining Pairwise Correlation = 63
 Summary = 67
5. Converting to and from Nontidy Formats = 69
 Tidying a Document-Term Matrix = 70
  Tidying Document Term Matrix Objects = 71
  Tidying dfm Objects = 74
 Casting Tidy Text Data into a Matrix = 77
 Tidying Corpus Objects with Metadata = 79
  Example : Mining Financial Articles = 81
 Summary = 87
6. Topic Modeling = 89
 Latent Dirichlet Allocation = 90
  Word-Topic Probabilities = 91
  Document-Topic Probabilities = 95
 Example : The Great Library Heist = 96
  LDA on Chapters = 97
  Per-Document Classification = 100
  By-Word Assignments : augment = 103
 Alternative LDA Implementations = 107
 Summary = 108
7. Case Study : Comparing Twitter Archives = 109
 Getting the Data and Distribution of Tweets = 109
 Word Frequencies = 110
 Comparing Word Usage = 114
 Changes in Word Use = 116
 Favorites and Retweets = 120
 Summary = 124
8. Case Study : Mining NASA Metadata = 125
 How Data Is Organized at NASA = 126
  Wrangling and Tidying the Data = 126
  Some Initial Simple Exploration = 129
 Word Co-occurrences and Correlations = 130
  Networks of Description and Title Words = 131
  Networks of Keywords = 134
 Calculating tf-idf for the Description Fields = 137
  What Is tf-idf for the Description Field Words? = 137
  Connecting Description Fields to Keywords = 138
 Topic Modeling = 140
  Casting to a Document-Term Matrix = 140
  Ready for Topic Modeling = 141
  Interpreting the Topic Model = 142
  Connecting Topic Modeling with Keywords = 149
 Summary = 152
9. Case Study : Analyzing Usenet Text = 153
 Preprocessing = 153
  Preprocessing Text = 155
 Words in Newsgroups = 156
  Finding tf-idf Within Newsgroups = 157
  Topic Modeling = 160
 Sentiment Analysis = 163
  Sentiment Analysis by Word = 164
  Sentiment Analysis by Message = 167
  N-gram Analysis = 169
 Summary = 171
Bibliography = 173
Index = 175
