Detailed Information

Improving ODP-based text classification with word embeddings and DBpedia

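Material type
Thesis
Personal author
Aliyeva, Dinara
Title / Statement of responsibility
Improving ODP-based text classification with word embeddings and DBpedia / Dinara Aliyeva
Publication
Seoul : Graduate School, Korea University, 2018
Physical description
iv, 30 leaves : charts ; 26 cm
Other format entry
Improving ODP-based Text Classification with Word Embeddings and DBpedia (DCOLL211009)000000081711
Dissertation note
Thesis (Master's)-- Korea University Graduate School, Department of Computer and Radio Communications Engineering, 2018. 8
Department code
0510 6D36 1086
General note
Advisor: 이상근
Bibliography note
References: leaves 28-30
Other available formats
Also available as a PDF file; requires a PDF file reader (application/pdf)
Uncontrolled subject terms
word embeddings, text classification, knowledge base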
000 00000nam c2200205 c 4500
001 000045953706
005 20230712091505
007 ta
008 180702s2018 ulkd bmAC 000 eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6D36 ▼c 1086
100 1 ▼a Aliyeva, Dinara
245 1 0 ▼a Improving ODP-based text classification with word embeddings and DBpedia / ▼d Dinara Aliyeva
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2018
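300 ▼a iv, 30 leaves : ▼b charts ; ▼c 26 cm
500 ▼a Advisor: 이상근
502 0 ▼a Thesis (Master's)-- ▼b Korea University Graduate School, ▼c Department of Computer and Radio Communications Engineering, ▼d 2018. 8
504 ▼a References: leaves 28-30
530 ▼a Also available as a PDF file; ▼c Requires PDF file reader(application/pdf)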
653 ▼a word embeddings, text classification, knowledge base
776 0 ▼t Improving ODP-based Text Classification with Word Embeddings and DBpedia ▼w (DCOLL211009)000000081711
900 1 0 ▼a 이상근, ▼g 李尙根, ▼d 1971-, ▼e advisor ▼0 AUTH(211009)153285
945 ▼a KLPA

Holdings

No. 1 Location: Science Library / Thesis Stacks, Call number: 0510 6D36 1086, Accession number: 123059629, Status: Available for loan
No. 2 Location: Science Library / Thesis Stacks, Call number: 0510 6D36 1086, Accession number: 123059630, Status: Available for loan

Content Information

Abstract

Traditional Open Directory Project (ODP)-based text classification methods effectively capture the topics of texts by utilizing the hierarchical structure of an explicitly human-built knowledge base. However, they lack entities, which are important in text classification tasks, and they rely only on term-count-based approaches, ignoring the semantic similarity between words. In this paper, we propose a system that incorporates DBpedia entities into the Open Directory Project to improve text classification performance. First, we search for DBpedia entities in the ODP documents. Second, we train a word-entity embedding model, which projects DBpedia entities and words from the ODP into a single distributional space. Third, we incorporate the obtained word and entity embeddings into the representation of the ODP categories and documents for ODP-based text classification.
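
The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the entity lexicon, the category names, and the two-dimensional vectors are all hypothetical values chosen for illustration, and the embeddings are hard-coded here rather than learned by a word-entity embedding model. Categories and texts are represented as simple centroids compared by cosine similarity, which is an assumption of this sketch.

```python
import math

# Hypothetical surface-form -> DBpedia entity lookup (illustrative only).
ENTITY_LEXICON = {
    "korea university": "dbr:Korea_University",
    "world cup": "dbr:FIFA_World_Cup",
}

# Hypothetical shared word/entity space. In the thesis these vectors are
# learned; here they are hard-coded toy values (an assumption).
EMBEDDINGS = {
    "dbr:FIFA_World_Cup": (0.9, 0.1),
    "football": (0.8, 0.2),
    "goal": (0.7, 0.3),
    "dbr:Korea_University": (0.1, 0.9),
    "thesis": (0.2, 0.8),
    "lecture": (0.3, 0.7),
}

def annotate(text, lexicon=ENTITY_LEXICON):
    """Step 1: replace known surface forms with entity tokens, then tokenize."""
    lowered = text.lower()
    for surface, entity in lexicon.items():
        lowered = lowered.replace(surface, entity)
    return lowered.split()

def centroid(tokens, embeddings=EMBEDDINGS):
    """Average the vectors of all tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(text, category_docs):
    """Step 3: return the category whose centroid is closest to the text."""
    query = centroid(annotate(text))
    if query is None:
        return None
    best, best_score = None, -1.0
    for category, docs in category_docs.items():
        cat_vec = centroid([tok for doc in docs for tok in annotate(doc)])
        if cat_vec is None:
            continue
        score = cosine(query, cat_vec)
        if score > best_score:
            best, best_score = category, score
    return best

# Hypothetical ODP-style categories with one toy document each.
CATEGORIES = {
    "Sports/Soccer": ["football goal world cup"],
    "Reference/Education": ["thesis lecture korea university"],
}

print(classify("a lecture at korea university", CATEGORIES))
```

Because the entity token shares a vector space with ordinary words, a text that mentions an entity contributes to the similarity score even when it has no words in common with the category documents.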

Table of Contents

1. Introduction 1
2. Background 5
2.1 Open Directory Project 5
2.2 DBpedia 5
2.3 ODP-based Classifier 6
3. Enriching ODP with DBpedia knowledge 8
3.1 Searching for DBpedia entities in ODP documents 8
3.2 Adding DBpedia entities to the ODP documents 9
4. Incorporating Word Embeddings 10
4.1 Word Embeddings 10
4.2 Dual Word Embeddings 11
4.3 Text Classification 12
5. Evaluation 15
5.1 Datasets 15
5.1.1 Training Datasets 15
5.1.2 Test Datasets 16
5.2 Evaluation Metrics 17
5.3 Experimental Setup 17
5.4 Results 19
5.4.1 Parameter Setting 19
5.4.2 Performance Evaluation 19
5.4.3 Qualitative Analysis 22
6. Related Work 25
7. Conclusion and Future Work 27
7.1 Summary of This Thesis 27
7.2 Future Work 27