Utilizing probase in open directory project-based text classification

Material type
Thesis
Personal Author
전소영 全昭暎
Title Statement
Utilizing probase in open directory project-based text classification / So Young Jun
Publication, Distribution, etc
Seoul : Graduate School, Korea University, 2018
Physical Medium
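iv, 38 leaves : color illustrations, charts ; 26 cm
Additional Physical Form Entry
Utilizing Probase in Open Directory Project-based Text Classification (DCOLL211009)000000080395
Dissertation Note
Thesis (Master's)-- Korea University Graduate School, Department of Computer and Radio Communications Engineering, 2018. 2
Department Code
0510 6D36 1070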
General Note
Advisor: 이상근
Bibliography, Etc. Note
Bibliography: leaves 35-38
Additional Physical Form Available
Also available as a PDF file; requires a PDF file reader (application/pdf)
Uncontrolled Subject Terms
Text Classification, Knowledge Base Integration
000 00000nam c2200205 c 4500
001 000045932643
005 20230712091401
007 ta
008 180102s2018 ulkad bmAC 000c eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 0 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6D36 ▼c 1070
100 1 ▼a 전소영 ▼g 全昭暎
245 1 0 ▼a Utilizing probase in open directory project-based text classification / ▼d So Young Jun
246 1 1 ▼a 프로베이스를 활용한 오픈 디렉터리 프로젝트 기반의 텍스트 분류
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2018
300 ▼a iv, 38장 : ▼b 천연색삽화, 도표 ; ▼c 26 cm
500 ▼a 지도교수: 이상근
502 0 ▼a 학위논문(석사)-- ▼b 고려대학교 대학원, ▼c 컴퓨터·전파통신공학과, ▼d 2018. 2
504 ▼a 참고문헌: 장 35-38
530 ▼a PDF 파일로도 이용가능; ▼c Requires PDF file reader(application/pdf)
653 ▼a Text Classification ▼a Knowledge Base Integration
776 0 ▼t Utilizing Probase in Open Directory Project-based Text Classification ▼w (DCOLL211009)000000080395
900 1 0 ▼a Jun, So Young, ▼e
900 1 0 ▼a 이상근, ▼g 李尙根, ▼d 1971-, ▼e 지도교수 ▼0 AUTH(211009)153285
945 ▼a KLPA

Electronic Information

No. | Title
1 | Utilizing probase in open directory project-based text classification (viewed 23 times)

Holdings Information

No. | Location | Call Number | Accession No. | Availability
1 | Science & Engineering Library/Stacks(Thesis) | 0510 6D36 1070 | 123058293 | Available
2 | Science & Engineering Library/Stacks(Thesis) | 0510 6D36 1070 | 123058294 | Available

Contents information

Abstract

The Open Directory Project (ODP) has been used in large-scale text classification tasks because its taxonomy can represent a broad range of categories. However, ODP contains few entities, even though entities play an important role in identifying the category of a text, and this scarcity may limit the performance of ODP-based classification. In this thesis, we propose a method that enriches ODP categories with entities from Probase, a knowledge base that contains millions of entities. To incorporate Probase entities into ODP categories, we first represent each Probase entity and each ODP category as a bag-of-concepts. Second, we compute the semantic relevance between an entity and a category from these concept representations. Finally, we add each Probase entity to the ODP categories to which it is semantically relevant. Experimental results on a real-world dataset show the efficacy of the proposed approach in text classification, with a significant improvement over state-of-the-art techniques.
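
The three steps in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis's actual method: concept vectors are plain concept-to-weight counts, semantic relevance is taken to be cosine similarity, an entity is attached to a category when the relevance reaches a hypothetical `threshold`, and the category and entity data are invented toy values.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (concept -> weight)."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def enrich(categories, entities, threshold=0.5):
    """Attach each entity to every category whose relevance meets the threshold.

    The threshold value is a hypothetical parameter chosen for this sketch.
    """
    enriched = {name: [] for name in categories}
    for entity, e_vec in entities.items():
        for name, c_vec in categories.items():
            if cosine(c_vec, e_vec) >= threshold:
                enriched[name].append(entity)
    return enriched


# Step 1: bag-of-concepts representations (toy data, invented for illustration).
categories = {
    "Computers/Software": Counter({"company": 2, "software": 5, "product": 1}),
    "Recreation/Pets": Counter({"animal": 4, "pet": 3}),
}
entities = {
    "photoshop": Counter({"software": 3, "product": 2}),
    "beagle": Counter({"animal": 2, "pet": 1, "dog": 3}),
}

# Steps 2 and 3: compute relevance and add entities to the related categories.
result = enrich(categories, entities)
print(result)
```

With this toy data, each entity shares concepts with exactly one category, so it is attached only there. A real system would need a weighting scheme and a relevance measure chosen and tuned on data; the record above does not specify either.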

Table of Contents

Abstract
List of Figures
List of Tables
1 Introduction
2 Background
2.1 Open Directory Project
2.2 Probase
2.3 ODP-based Classifier
3 Concept Representation of ODP Categories
3.1 Searching for Probase Concepts in ODP Categories
3.2 Representing ODP Category as a Probase Concept Vector
3.3 Enriching Concept Vector of Rare ODP Categories
4 Enrichment of ODP Category with Probase Entities
4.1 Concept Representation of Probase Entity
4.2 Relevance between ODP Category and Probase Entity
5 Performance Evaluation
5.1 Dataset
5.1.1 ODP Dataset
5.1.2 Probase Dataset
5.1.3 Test Dataset
5.1.4 Experiment Setup
5.2 Experiment Result
5.2.1 Experiment Result of Matching Performance
5.2.2 Experiment Result of Classification Performance
5.3 Qualitative Analysis
6 Related Work
7 Conclusion and Future Work
8 Bibliography