Improving ODP-based text classification with word embeddings and DBpedia

Material type
Thesis
Personal Author
Aliyeva, Dinara
Title Statement
Improving ODP-based text classification with word embeddings and DBpedia / Dinara Aliyeva
Publication, Distribution, etc
Seoul : Graduate School, Korea University, 2018
Physical Medium
iv, 30 leaves : charts ; 26 cm
Additional Physical Form Entry
Improving ODP-based Text Classification with Word Embeddings and DBpedia (DCOLL211009)000000081711
Dissertation Note
Thesis (Master's) -- Graduate School, Korea University: Department of Computer and Radio Communications Engineering, 2018. 8
Department Code
0510 6D36 1086
General Note
Advisor: 이상근
Bibliography, Etc. Note
References: leaves 28-30
Other Available Formats
Also available as a PDF file; Requires PDF file reader(application/pdf)
Uncontrolled Subject Terms
word embeddings, text classification, knowledge base
000 00000nam c2200205 c 4500
001 000045953706
005 20180919164542
007 ta
008 180702s2018 ulkd bmAC 000 eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6D36 ▼c 1086
100 1 ▼a Aliyeva, Dinara
245 1 0 ▼a Improving ODP-based text classification with word embeddings and DBpedia / ▼d Dinara Aliyeva
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2018
300 ▼a iv, 30장 : ▼b 도표 ; ▼c 26 cm
500 ▼a 지도교수: 이상근
502 0 ▼a 학위논문(석사)-- ▼b 고려대학교 대학원: ▼c 컴퓨터·전파통신공학과, ▼d 2018. 8
504 ▼a 참고문헌: 장 28-30
530 ▼a PDF 파일로도 이용가능; ▼c Requires PDF file reader(application/pdf)
653 ▼a word embeddings, text classification, knowledge base
776 0 ▼t Improving ODP-based Text Classification with Word Embeddings and DBpedia ▼w (DCOLL211009)000000081711
900 1 0 ▼a 이상근 ▼g 李尙根, ▼e 지도교수
945 ▼a KLPA

Holdings Information

No.  Location                                      Call Number     Accession No.  Availability
1    Science & Engineering Library/Stacks(Thesis)  0510 6D36 1086  123059629      Available
2    Science & Engineering Library/Stacks(Thesis)  0510 6D36 1086  123059630      Available

Contents information

Abstract

Traditional Open Directory Project (ODP)-based text classification methods effectively capture the topics of texts by utilizing the hierarchical structure of an explicitly human-built knowledge base. However, they lack entities, which are important in text classification tasks, and they rely only on term-count-based approaches, ignoring the semantic similarity between words. In this paper, we propose a system that incorporates DBpedia entities into the Open Directory Project to improve text classification performance. First, we search for DBpedia entities in the ODP documents. Second, we train a word-entity embedding model, which projects DBpedia entities and words from the ODP into a single distributional space. Third, we incorporate the obtained word and entity embeddings into the representation of the ODP categories and documents for ODP-based text classification.
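
The third step of the abstract can be illustrated with a minimal sketch. This is not the thesis's actual method, which this record does not describe in detail. It assumes that each category and each document is represented by the average of its token vectors, that entities appear as tokens in the same space as words, and that a document is assigned to the category with the highest cosine similarity. The embedding values, the `dbpedia:` token prefix, the category names, and the function names are all hypothetical.

```python
import math

# Invented toy vectors. Words and a DBpedia-style entity token share one space,
# mirroring the "single distributional space" mentioned in the abstract.
EMB = {
    "goal": [0.9, 0.1],
    "league": [0.8, 0.2],
    "dbpedia:FC_Barcelona": [1.0, 0.0],  # hypothetical entity token format
    "election": [0.1, 0.9],
    "senate": [0.2, 0.8],
}

# Hypothetical ODP-style categories, each described by words and entities.
CATEGORIES = {
    "Top/Sports/Soccer": ["goal", "league", "dbpedia:FC_Barcelona"],
    "Top/Society/Politics": ["election", "senate"],
}


def centroid(tokens, emb):
    """Average the vectors of the tokens found in emb; None if none are found."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(doc_tokens, categories, emb):
    """Return the category whose centroid is most similar to the document's."""
    doc_vec = centroid(doc_tokens, emb)
    if doc_vec is None:
        return None  # no known word or entity in the document
    scores = {}
    for name, tokens in categories.items():
        cat_vec = centroid(tokens, emb)
        if cat_vec is not None:
            scores[name] = cosine(doc_vec, cat_vec)
    return max(scores, key=scores.get) if scores else None
```

In this sketch a document that contains only an entity token can still be classified, because the entity has its own vector next to the word vectors. A purely term-count-based representation would have no signal for such a document unless the exact term also occurred in the category text.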

Table of Contents

1. Introduction
2. Background
2.1 Open Directory Project
2.2 DBpedia
2.3 ODP-based Classifier
3. Enriching ODP with DBpedia knowledge
3.1 Searching for DBpedia entities in ODP documents
3.2 Adding DBpedia entities to the ODP documents
4. Incorporating Word Embeddings
4.1 Word Embeddings
4.2 Dual Word Embeddings
4.3 Text Classification
5. Evaluation
5.1 Datasets
5.1.1 Training Datasets
5.1.2 Test Datasets
5.2 Evaluation Metrics
5.3 Experimental Setup
5.4 Results
5.4.1 Parameter Setting
5.4.2 Performance Evaluation
5.4.3 Qualitative Analysis
6. Related Work
7. Conclusion and Future Work
7.1 Summary of This Thesis
7.2 Future Work
