000 | 00000nam c2200205 c 4500 | |
001 | 000045953706 | |
005 | 20180919164542 | |
007 | ta | |
008 | 180702s2018 ulkd bmAC 000 eng | |
040 | ▼a 211009 ▼c 211009 ▼d 211009 | |
085 | ▼a 0510 ▼2 KDCP | |
090 | ▼a 0510 ▼b 6D36 ▼c 1086 | |
100 | 1 | ▼a Aliyeva, Dinara |
245 | 1 0 | ▼a Improving ODP-based text classification with word embeddings and DBpedia / ▼d Dinara Aliyeva |
260 | ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2018 | |
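300 | ▼a iv, 30 leaves : ▼b charts ; ▼c 26 cm | |
500 | ▼a Advisor: 이상근 | |
502 | 0 | ▼a Thesis (Master's)-- ▼b Graduate School, Korea University: ▼c Department of Computer and Radio Communications Engineering, ▼d 2018. 8 |
504 | ▼a Bibliography: leaves 28-30 | |
530 | ▼a Also available as a PDF file; ▼c Requires PDF file reader(application/pdf) | |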
653 | ▼a word embeddings, text classification, knowledge base | |
776 | 0 | ▼t Improving ODP-based Text Classification with Word Embeddings and DBpedia ▼w (DCOLL211009)000000081711 |
900 | 1 0 | ▼a 이상근 ▼g 李尙根, ▼e advisor |
945 | ▼a KLPA |
Electronic Information
No. | Title | Service |
---|---|---|
1 | Improving ODP-based text classification with word embeddings and DBpedia (viewed 14 times) | PDF, abstract, table of contents |
Holdings Information
No. | Location | Call Number | Accession No. | Availability | Due Date | Make a Reservation | Service |
---|---|---|---|---|---|---|---|
1 | Science & Engineering Library/Stacks(Thesis)/ | 0510 6D36 1086 | 123059629 | Available | | | |
2 | Science & Engineering Library/Stacks(Thesis)/ | 0510 6D36 1086 | 123059630 | Available | | | |
Contents information
Abstract
Traditional Open Directory Project (ODP)-based text classification methods effectively capture the topics of texts by exploiting the hierarchical structure of an explicitly human-built knowledge base. However, they lack entities, which are important in text classification tasks, and they rely only on term-count-based approaches, ignoring the semantic similarity between words. In this paper, we propose a system that incorporates DBpedia entities into the Open Directory Project to improve text classification performance. First, we search for DBpedia entities in the ODP documents. Second, we train a word-entity embedding model, which projects DBpedia entities and words from the ODP into a single distributional space. Third, we incorporate the resulting word and entity embeddings into the representations of the ODP categories and documents for ODP-based text classification.
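The third step of the abstract can be illustrated with a small sketch. This is not the thesis's implementation, which this record does not describe. It assumes that words and entities already share one embedding space, that a document is represented by the mean of its token vectors, that a category is the centroid of its documents' tokens, and that classification picks the category with the highest cosine similarity. The toy vectors, the category names, and the `dbr:`-prefixed entity tokens are all hypothetical.

```python
import math

# Hypothetical toy embeddings: words and DBpedia-style entity tokens
# live in the same vector space (assumed, not taken from the thesis).
EMBEDDINGS = {
    "goal": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "dbr:FIFA_World_Cup": [1.0, 0.0, 0.1],
    "compiler": [0.0, 0.9, 0.2],
    "code": [0.1, 0.8, 0.1],
    "dbr:Python_(programming_language)": [0.0, 1.0, 0.1],
}

# Hypothetical ODP-like categories, each with entity-enriched training documents.
CATEGORY_DOCS = {
    "Sports/Soccer": [["goal", "match", "dbr:FIFA_World_Cup"]],
    "Computers/Programming": [
        ["compiler", "code", "dbr:Python_(programming_language)"]
    ],
}


def mean_vector(tokens):
    """Average the embeddings of known tokens; return None if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_category_vectors(category_docs):
    """Represent each category as the centroid of all tokens in its documents."""
    result = {}
    for category, docs in category_docs.items():
        vec = mean_vector([token for doc in docs for token in doc])
        if vec is not None:
            result[category] = vec
    return result


def classify(tokens, category_vectors):
    """Return the category whose centroid is most similar to the document."""
    doc_vec = mean_vector(tokens)
    if doc_vec is None:
        return None
    return max(category_vectors, key=lambda c: cosine(doc_vec, category_vectors[c]))


category_vectors = build_category_vectors(CATEGORY_DOCS)
print(classify(["match", "dbr:FIFA_World_Cup"], category_vectors))
```

Averaging is only one possible way to combine word and entity vectors. It is used here because it keeps the example short, not because the record indicates that the thesis uses it.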
Table of Contents
1. Introduction 1
2. Background 5
2.1 Open Directory Project 5
2.2 DBpedia 5
2.3 ODP-based Classifier 6
3. Enriching ODP with DBpedia knowledge 8
3.1 Searching for DBpedia entities in ODP documents 8
3.2 Adding DBpedia entities to the ODP documents 9
4. Incorporating Word Embeddings 10
4.1 Word Embeddings 10
4.2 Dual Word Embeddings 11
4.3 Text Classification 12
5. Evaluation 15
5.1 Datasets 15
5.1.1 Training Datasets 15
5.1.2 Test Datasets 16
5.2 Evaluation Metrics 17
5.3 Experimental Setup 17
5.4 Results 19
5.4.1 Parameter Setting 19
5.4.2 Performance Evaluation 19
5.4.3 Qualitative Analysis 22
6. Related Work 25
7. Conclusion and Future Work 27
7.1 Summary of This Thesis 27
7.2 Future Work 27