000 | 00000nam c2200205 c 4500 | |
001 | 000045932643 | |
005 | 20230712091401 | |
007 | ta | |
008 | 180102s2018 ulkad bmAC 000c eng | |
040 | ▼a 211009 ▼c 211009 ▼d 211009 | |
085 | 0 | ▼a 0510 ▼2 KDCP |
090 | ▼a 0510 ▼b 6D36 ▼c 1070 | |
100 | 1 | ▼a 전소영 ▼g 全昭暎 |
245 | 1 0 | ▼a Utilizing probase in open directory project-based text classification / ▼d So Young Jun |
246 | 1 1 | ▼a 프로베이스를 활용한 오픈 디렉터리 프로젝트 기반의 텍스트 분류 |
260 | ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2018 | |
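300 | ▼a iv, 38 leaves : ▼b color illustrations, charts ; ▼c 26 cm | |
500 | ▼a Advisor: 이상근 | |
502 | 0 | ▼a Thesis (Master's)-- ▼b Graduate School, Korea University, ▼c Department of Computer and Radio Communications Engineering, ▼d 2018. 2 |
504 | ▼a References: leaves 35-38 | |
530 | ▼a Also available as a PDF file; ▼c Requires PDF file reader(application/pdf) | |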
653 | ▼a Text Classification ▼a Knowledge Base Integration | |
776 | 0 | ▼t Utilizing Probase in Open Directory Project-based Text Classification ▼w (DCOLL211009)000000080395 |
900 | 1 0 | ▼a Jun, So Young, ▼e author |
900 | 1 0 | ▼a 이상근, ▼g 李尙根, ▼d 1971-, ▼e advisor ▼0 AUTH(211009)153285 |
945 | ▼a KLPA |
Electronic Information
No. | Title | Service |
---|---|---|
1 | Utilizing probase in open directory project-based text classification (23 views) | |
Holdings Information
No. | Location | Call Number | Accession No. | Availability | Due Date |
---|---|---|---|---|---|
1 | Science & Engineering Library/Stacks(Thesis) | 0510 6D36 1070 | 123058293 | Available | |
2 | Science & Engineering Library/Stacks(Thesis) | 0510 6D36 1070 | 123058294 | Available | |
Contents Information
Abstract
The Open Directory Project (ODP) has been used in large-scale text classification tasks owing to its ability to represent a wide variety of categories. However, ODP contains few entities, which play an important role in identifying the category of a text, and this deficiency may degrade ODP-based classification performance. In this thesis, we propose a method to enrich ODP categories with entities by leveraging Probase, a knowledge base that contains millions of entities. To incorporate Probase entities into ODP categories, we first represent each Probase entity and each ODP category as a bag-of-concepts. Second, based on this concept representation, we compute the semantic relevance between them. Finally, based on this semantic relevance, we add Probase entities to the related ODP categories. Our experimental results on a real-world dataset show the efficacy of the proposed approach in text classification, with a significant improvement over state-of-the-art techniques.
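The three steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how semantic relevance is computed, so cosine similarity over sparse concept-weight vectors is assumed here, and the function names, the `threshold` parameter, and the toy categories, entities, and weights are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse concept-weight vectors (dicts)."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def enrich_categories(categories, entities, threshold=0.5):
    """Attach each entity to every category whose concept vector is relevant.

    categories: {category name: {concept: weight}}  (bag-of-concepts)
    entities:   {entity name:   {concept: weight}}  (bag-of-concepts)
    threshold:  assumed relevance cutoff, not taken from the thesis
    """
    enriched = {name: [] for name in categories}
    for entity, entity_vec in entities.items():
        for name, category_vec in categories.items():
            # Assumed relevance measure: cosine similarity of concept vectors.
            if cosine(entity_vec, category_vec) >= threshold:
                enriched[name].append(entity)
    return enriched


# Toy, made-up data for illustration only.
categories = {
    "Computers/Software": {"software": 3.0, "company": 1.0},
    "Sports/Soccer": {"sport": 2.0, "team": 2.0},
}
entities = {
    "Photoshop": {"software": 5.0, "tool": 1.0},
    "FC Barcelona": {"team": 4.0, "club": 1.0},
}

result = enrich_categories(categories, entities)
print(result)
```

With this toy data, each entity shares a concept with exactly one category, so it is attached only there. A real system would also need the concept weighting and the handling of rare categories that the thesis covers in Chapters 3 and 4.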
Table of Contents
Abstract
List of Figures
List of Tables
1 Introduction
2 Background
  2.1 Open Directory Project
  2.2 Probase
  2.3 ODP-based Classifier
3 Concept Representation of ODP Categories
  3.1 Searching for Probase Concepts in ODP Categories
  3.2 Representing ODP Category as a Probase Concept Vector
  3.3 Enriching Concept Vector of Rare ODP Categories
4 Enrichment of ODP Category with Probase Entities
  4.1 Concept Representation of Probase Entity
  4.2 Relevance between ODP Category and Probase Entity
5 Performance Evaluation
  5.1 Dataset
    5.1.1 ODP Dataset
    5.1.2 Probase Dataset
    5.1.3 Test Dataset
    5.1.4 Experiment Setup
  5.2 Experiment Result
    5.2.1 Experiment Result of Matching Performance
    5.2.2 Experiment Result of Classification Performance
  5.3 Qualitative Analysis
6 Related Work
7 Conclusion and Future Work
8 Bibliography