Large-scale text visual analytics via machine learning and human interaction

Material type
Thesis
Personal Author
최민석 崔珉碩
Title Statement
Large-scale text visual analytics via machine learning and human interaction / Minsuk Choi
Publication, Distribution, etc
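Seoul : Graduate School, Korea University, 2019
Physical Medium
x, 82 leaves : color illustrations, charts ; 26 cm
Additional Physical Form Entry
Large-Scale Text Visual Analytics via Machine Learning and Human Interaction (DCOLL211009)000000084351
Dissertation Note
Thesis (Ph.D.) -- Korea University Graduate School: Department of Computer and Radio Communications Engineering, 2019. 8
Department Code
0510 6YD36 362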
General Note
Advisor: 주재걸 (Choo, Jae-gul)
Bibliography, Etc. Note
Bibliography: leaves 71-82
Other Available Formats
Also available as a PDF file; requires a PDF file reader (application/pdf)
Uncontrolled Subject Terms
Visual analytics, Machine learning, Human computer interaction
000 00000nam c2200205 c 4500
001 000045999193
005 20191017125247
007 ta
008 190702s2019 ulkad bmAC 000c eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 0 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6YD36 ▼c 362
100 1 ▼a 최민석 ▼g 崔珉碩
245 1 0 ▼a Large-scale text visual analytics via machine learning and human interaction / ▼d Minsuk Choi
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2019
300 ▼a x, 82장 : ▼b 천연색삽화, 도표 ; ▼c 26 cm
500 ▼a 지도교수: 주재걸
502 1 ▼a 학위논문(박사)-- ▼b 고려대학교 대학원: ▼c 컴퓨터·전파통신공학과, ▼d 2019. 8
504 ▼a 참고문헌: 장 71-82
530 ▼a PDF 파일로도 이용가능; ▼c Requires PDF file reader(application/pdf)
653 ▼a Visual analytics ▼a Machine learning ▼a Human computer interactiion
776 0 ▼t Large-Scale Text Visual Analytics via Machine Learning and Human Interaction ▼w (DCOLL211009)000000084351
900 1 0 ▼a Choi, Min-suk, ▼e
900 1 0 ▼a 주재걸 ▼g 朱宰傑, ▼e 지도교수
900 1 0 ▼a Choo, Jae-gul, ▼e 지도교수
945 ▼a KLPA

No. | Location | Call Number | Accession No. | Availability
1 | Science & Engineering Library/Stacks(Thesis) | 0510 6YD36 362 | 123062319 | Available
2 | Science & Engineering Library/Stacks(Thesis) | 0510 6YD36 362 | 123062320 | Available
3 | Sejong Academic Information Center/Thesis(5F) | 0510 6YD36 362 | 153083340 | Available

Contents information

Abstract

Users of social network services (SNS) such as Twitter, Yelp, and Facebook generate a great deal of data in the form of text documents, images, and videos. Of these forms, text documents such as messages, comments, and reviews are particularly revealing of users’ intentions in a given social environment. Thus, much recent research has focused on analyzing and predicting social phenomena using text-based SNS data. Machine learning (ML), which needs a large amount of data to perform well, is an emerging tool for mining useful information from such data. However, because fully automated ML methods neither give the data analyst a clear understanding of the data nor allow careful interaction with it, there is growing interest in interactive visual interface systems based on machine learning techniques.

This thesis introduces two systems that tightly combine human-centered interactive systems with computational methods related to machine learning using large-scale text document data.

First, to support an efficient document labeling environment, I present a system called Attentive Interactive Labeling Assistant (AILA) [1]. At its core, AILA uses the Interactive Attention Module (IAM), a novel module that visually highlights words in a document that labelers may pay attention to when labeling it. IAM utilizes attention-based deep neural networks, which not only predict which words to highlight but also let labelers indicate, while labeling, words that should be assigned high attention weights, improving the quality of future word predictions. The results of our study showed that participants’ labeling efficiency was significantly higher with IAM than without it, while the two conditions maintained roughly the same labeling accuracy.

Second, detecting anomalous events in a particular area in a timely manner is an important task. Geo-tagged social media data are a useful resource for this task, but the abundance of everyday language in them keeps it challenging. To address this challenge, I present TopicOnTiles, a visual analytics system that uses social media data to reveal information relevant to anomalous events in a multi-level tile-based map interface [2]. To this end, I adopt and improve a recently proposed topic modeling method that can extract spatio-temporally exclusive topics corresponding to a particular region and time point. Furthermore, I utilize a tile-based map interface to efficiently handle large-scale data in parallel. Our user interface highlights anomalous tiles using a novel glyph visualization that encodes the degree of anomaly computed by our exclusive topic modeling process. To show the effectiveness of the system, I present several usage scenarios using real-world datasets as well as comprehensive user study results.

Table of Contents

Abstract 
Contents i 
List of Figures iv 
1 Introduction 1
2 AILA: Attentive Interactive Labeling Assistant for Document Classification through Attention-based Deep Neural Networks 3
 2.1 Introduction 3
 2.2 Related Work 6
  2.2.1 Text classification models using attention mechanism 6
  2.2.2 UI design for interactive document labeling 8
 2.3 Interactive Attention Module  9
  2.3.1 Model pipeline 10
  2.3.2 Reliability Assessment 12
 2.4 AILA 18
  2.4.1 Design Considerations 18
  2.4.2 Architecture and implementation details  24
 2.5 Study 25
  2.5.1 Methodology  26
  2.5.2 Results 30
 2.6 Discussion 32
 2.7 Conclusion 34
3 TopicOnTiles: Tile-Based Spatio-Temporal Event Analytics via Exclusive Topic Modeling on Social Media 35
 3.1 Introduction 35
 3.2 Related Work 38
  3.2.1 Visual analytics on Social Media Analysis 38
  3.2.2 Spatio-Temporal Visual Analytics on Anomaly Detection  39
 3.3 Overall Design of TopicOnTiles 41
  3.3.1 R1. Providing tile-wise topical summary  43
  3.3.2 R2. Revealing anomalous tiles along with keyword-based topical information 43
  3.3.3 R3. Allowing access to raw data with their geospatial and temporal frequency patterns 43
 3.4 How TopicOnTiles Works 44
  3.4.1 User Interfaces for Anomalous Event Detection 44
  3.4.2 System Architecture 52
  3.4.3 System Implementation  54
 3.5 Usage Scenarios 54
  3.5.1 ING New York City Marathon  56
  3.5.2 Trayvon Martin Protest and MLB All-Star Futures Game 57
 3.6 Evaluation: User Study  60
  3.6.1 Study Design 60
  3.6.2 Analysis Results  61
 3.7 Discussions  63
  3.7.1 Determining the number of anomalous topics  65
  3.7.2 Events distributed around tile boundaries  66
  3.7.3 Facilitating user interfaces for novice users 66
 3.8 Conclusions and Future Work  67
4 Conclusion 68
Bibliography 71
Acknowledgement