Detailed Information

From text classification to keyphrase extraction

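Material type
Thesis
Personal author
이송은, 李松垠
Title / Author
From text classification to keyphrase extraction / Song-eun Lee
Publication
Seoul : Graduate School, Korea University, 2020
Physical description
iv, 40 leaves : charts ; 26 cm
Other format entry
From Text Classification to Keyphrase Extraction (DCOLL211009)000000127344
Thesis note
Thesis (Master's)-- Korea University Graduate School, Department of Computer and Radio Communications Engineering, 2020. 2
Department code
0510 6D36 1114
General note
Advisor: 이상근
Bibliography note
References: leaves 34-40
Other available formats
Also available as a PDF file; Requires PDF file reader (application/pdf)
Uncontrolled subject terms
Keyword extraction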
000 00000nam c2200205 c 4500
001 000046026240
005 20230712091750
007 ta
008 191226s2020 ulkd bmAC 000c eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 0 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6D36 ▼c 1114
100 1 ▼a 이송은, ▼g 李松垠
245 1 0 ▼a From text classification to keyphrase extraction / ▼d Song-eun Lee
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2020
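300 ▼a iv, 40 leaves : ▼b charts ; ▼c 26 cm
500 ▼a Advisor: 이상근
502 0 ▼a Thesis (Master's)-- ▼b Korea University Graduate School, ▼c Department of Computer and Radio Communications Engineering, ▼d 2020. 2
504 ▼a References: leaves 34-40
530 ▼a Also available as a PDF file; ▼c Requires PDF file reader (application/pdf)
653 ▼a Keyword extraction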
776 0 ▼t From Text Classification to Keyphrase Extraction ▼w (DCOLL211009)000000127344
900 1 0 ▼a Lee, Song-eun, ▼e
900 1 0 ▼a 이상근, ▼g 李尙根, ▼d 1971-, ▼e advisor ▼0 AUTH(211009)153285
945 ▼a KLPA

Electronic Information

No. 1
Full-text title: From text classification to keyphrase extraction (viewed 20 times)
Available views: PDF, Abstract, Table of Contents

Holdings

No. 1: Location: Science Library / Thesis Stacks; Call number: 0510 6D36 1114; Accession number: 123063749; Status: Available for loan
No. 2: Location: Science Library / Thesis Stacks; Call number: 0510 6D36 1114; Accession number: 123063750; Status: Available for loan

Contents Information

Abstract

Existing keyphrase extraction approaches often suffer from the sparsity and brevity of short text (e.g., headlines, queries, and tweets). In this paper, we propose a novel keyphrase extraction method for short text that uses recurrent neural networks. The main idea behind our approach is to classify short text into a relevant class or category and extract keyphrases from the important words in that class or category. Unlike previous supervised approaches that need annotated keyphrases, our approach requires only a text classification dataset (i.e., DBpedia), which is easier to use and requires less human effort. In our approach, we first feed short text into an attention-based neural network for text classification. We then compute the attention weight of each word in the input short text. Subsequently, we detect keyphrase candidates by chunking phrases and summing the attention weights of the constituent words in each chunked phrase. The experimental results show the efficacy of our approach on real-world datasets of headlines, queries, and tweets. The proposed method outperforms the Microsoft Cognitive Services and IBM Watson Natural Language Understanding services for keyphrase extraction in terms of F1-score and acceptable percentage on the NYT and Question datasets. Further, we confirm that the proposed method is comparable to supervised methods for keyphrase extraction from short text on the Tweet dataset.
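
The candidate-scoring step described in the abstract (chunk the text into phrases, then sum the attention weights of the words in each phrase) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function name `rank_keyphrases`, the `top_k` parameter, the example tokens, the attention weights, and the chunk spans are all hypothetical, and the sketch assumes the attention weights and phrase chunks have already been produced by a classifier and a chunker that are not shown here.

```python
from typing import List, Tuple


def rank_keyphrases(
    tokens: List[str],
    weights: List[float],
    chunks: List[Tuple[int, int]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each chunked phrase by the sum of its words' attention weights.

    tokens  : words of the short text
    weights : one attention weight per token (assumed given by a classifier)
    chunks  : (start, end) token spans, end exclusive (assumed given by a chunker)
    """
    if len(tokens) != len(weights):
        raise ValueError("tokens and weights must have the same length")

    scored = []
    for start, end in chunks:
        phrase = " ".join(tokens[start:end])
        score = sum(weights[start:end])  # phrase score = sum of word weights
        scored.append((phrase, score))

    # Highest-scoring phrases first; keep only the top_k candidates.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical headline with made-up attention weights and chunk spans.
tokens = ["apple", "unveils", "new", "iphone", "in", "california"]
weights = [0.30, 0.05, 0.10, 0.40, 0.05, 0.10]
chunks = [(0, 1), (2, 4), (5, 6)]  # "apple", "new iphone", "california"

ranked = rank_keyphrases(tokens, weights, chunks, top_k=2)
for phrase, score in ranked:
    print(f"{phrase}: {score:.2f}")
```

In this toy input the phrase "new iphone" ranks first because its two words together carry more attention weight than any other chunk. Summing favors longer phrases, so a real system might also normalize by phrase length; the abstract does not say whether the thesis does.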

Table of Contents

1 Introduction 1
2 Preliminary 4
 2.1 Attention-based LSTM for Text Classification 4
 2.2 Fine-grained Topic Knowledge 5
3 Methodology 8
 3.1 Keyphrase Extraction from Topic Classification 11
  3.1.1 Attention-based Topic Classification 11
  3.1.2 Keyphrase Selection 12
 3.2 Incorporating Fine-grained Topic Knowledge into Topic Classification 13
  3.2.1 Fine-grained Topic Embedding 13
  3.2.2 Attention Mechanism with Fine-grained Topic 14
4 Experiments 17
 4.1 Datasets 17
  4.1.1 Topic Classification Training Dataset 17
  4.1.2 Keyphrase Extraction Evaluation Dataset 17
 4.2 Experiment Settings 18
  4.2.1 Baselines 18
  4.2.2 Model Parameters 19
 4.3 Keyphrase Extraction Results 19
  4.3.1 Quantitative Evaluation 19
  4.3.2 Qualitative Evaluation 21
  4.3.3 Application to Tweet 24
 4.4 Analysis 25
  4.4.1 Visualization of Attention 25
  4.4.2 Analysis of Topic Classification Results 27
5 Related Work 29
 5.1 Supervised Approach to Keyphrase Extraction 29
 5.2 Graph-based Ranking Approach to Keyphrase Extraction 30
 5.3 Topic-based Clustering Method for Keyphrase Extraction 30
 5.4 Combining Knowledge with Neural Network 30
6 Conclusion 32
 6.1 Conclusion 32
 6.2 Future Work 32
Bibliography 34