000 | 00000nam c2200205 c 4500 | |
001 | 000045953706 | |
005 | 20230712091505 | |
007 | ta | |
008 | 180702s2018 ulkd bmAC 000 eng | |
040 | ▼a 211009 ▼c 211009 ▼d 211009 | |
085 | ▼a 0510 ▼2 KDCP | |
090 | ▼a 0510 ▼b 6D36 ▼c 1086 | |
100 | 1 | ▼a Aliyeva, Dinara |
245 | 1 0 | ▼a Improving ODP-based text classification with word embeddings and DBpedia / ▼d Dinara Aliyeva |
260 | ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2018 | |
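300 | ▼a iv, 30 leaves : ▼b charts ; ▼c 26 cm | |
500 | ▼a Advisor: 이상근 | |
502 | 0 | ▼a Thesis (Master's)-- ▼b Graduate School, Korea University, ▼c Department of Computer and Radio Communications Engineering, ▼d 2018. 8 |
504 | ▼a Bibliography: leaves 28-30 | |
530 | ▼a Also available as a PDF file; ▼c Requires PDF file reader(application/pdf) | |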
653 | ▼a word embeddings, text classification, knowledge base | |
776 | 0 | ▼t Improving ODP-based Text Classification with Word Embeddings and DBpedia ▼w (DCOLL211009)000000081711 |
900 | 1 0 | ▼a 이상근, ▼g 李尙根, ▼d 1971-, ▼e advisor ▼0 AUTH(211009)153285 |
945 | ▼a KLPA |
Holdings information
No. | Location | Call number | Registration number | Status |
---|---|---|---|---|
1 | Science Library/Thesis Stacks/ | 0510 6D36 1086 | 123059629 | Available for loan |
2 | Science Library/Thesis Stacks/ | 0510 6D36 1086 | 123059630 | Available for loan |
Content information
Abstract
Traditional Open Directory Project (ODP)-based text classification methods effectively capture the topics of texts by exploiting the hierarchical structure of an explicitly human-built knowledge base. However, they lack entities, which are important in text classification tasks, and they rely solely on term-count-based approaches, ignoring the semantic similarity between words. In this paper, we propose a system that incorporates DBpedia entities into the Open Directory Project to improve text classification performance. First, we search for DBpedia entities in the ODP documents. Second, we train a word-entity embedding model, which projects DBpedia entities and words from the ODP into a single distributional space. Third, we incorporate the resulting word and entity embeddings into the representations of the ODP categories and documents used for ODP-based text classification.
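The three steps in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the surface-form dictionary, the toy three-dimensional vectors, the category names, and the helper functions (`annotate`, `centroid`, `cosine`, `classify`) are all hypothetical, and averaging vectors and ranking categories by cosine similarity is an assumed, simplified stand-in for the representation the thesis actually uses. Step 2 (training the word-entity embedding model) is replaced here by hand-written vectors.

```python
import math

# Hypothetical surface-form dictionary standing in for a DBpedia entity lookup.
ENTITY_SURFACE_FORMS = {
    "python": "dbpedia:Python_(programming_language)",
    "guitar": "dbpedia:Guitar",
}

# Toy vectors standing in for a trained word-entity embedding model (step 2).
# Words and entities share one vector space, as the abstract describes.
EMBEDDINGS = {
    "code": [0.9, 0.1, 0.0],
    "compiler": [0.8, 0.2, 0.1],
    "dbpedia:Python_(programming_language)": [1.0, 0.0, 0.1],
    "song": [0.1, 0.9, 0.1],
    "chord": [0.0, 0.8, 0.2],
    "dbpedia:Guitar": [0.1, 1.0, 0.0],
}

# Hypothetical ODP-style categories, each with the tokens of its documents.
CATEGORY_DOCS = {
    "Computers/Programming": ["code", "compiler", "python"],
    "Arts/Music": ["song", "chord", "guitar"],
}


def annotate(tokens):
    """Step 1: append an entity ID for every token with a known surface form."""
    entities = [ENTITY_SURFACE_FORMS[t] for t in tokens if t in ENTITY_SURFACE_FORMS]
    return tokens + entities


def centroid(items):
    """Average the vectors of all items that have an embedding (assumed scheme)."""
    vectors = [EMBEDDINGS[i] for i in items if i in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, category_docs):
    """Step 3: return the category whose centroid is closest to the text's centroid."""
    doc_vec = centroid(annotate(text.lower().split()))
    if doc_vec is None:
        return None  # no token or entity of the text has an embedding
    scores = {}
    for category, tokens in category_docs.items():
        cat_vec = centroid(annotate(tokens))
        if cat_vec is not None:
            scores[category] = cosine(doc_vec, cat_vec)
    return max(scores, key=scores.get) if scores else None
```

For example, `classify("python compiler", CATEGORY_DOCS)` links "python" to its entity ID, averages the available vectors, and compares the result with each category centroid. A text with no known words or entities yields `None` rather than a guess.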
Table of contents
Section | Page |
---|---|
1. Introduction | 1 |
2. Background | 5 |
2.1 Open Directory Project | 5 |
2.2 DBpedia | 5 |
2.3 ODP-based Classifier | 6 |
3. Enriching ODP with DBpedia knowledge | 8 |
3.1 Searching for DBpedia entities in ODP documents | 8 |
3.2 Adding DBpedia entities to the ODP documents | 9 |
4. Incorporating Word Embeddings | 10 |
4.1 Word Embeddings | 10 |
4.2 Dual Word Embeddings | 11 |
4.3 Text Classification | 12 |
5. Evaluation | 15 |
5.1 Datasets | 15 |
5.1.1 Training Datasets | 15 |
5.1.2 Test Datasets | 16 |
5.2 Evaluation Metrics | 17 |
5.3 Experimental Setup | 17 |
5.4 Results | 19 |
5.4.1 Parameter Setting | 19 |
5.4.2 Performance Evaluation | 19 |
5.4.3 Qualitative Analysis | 22 |
6. Related Work | 25 |
7. Conclusion and Future Work | 27 |
7.1 Summary of This Thesis | 27 |
7.2 Future Work | 27 |