Detail View

Word embedding technique with channel attention mechanism for Korean language

Material type
Thesis
Personal Author
권오준, 權五俊 (Kwon, Oh-joon)
Title Statement
Word embedding technique with channel attention mechanism for Korean language / Ohjoon Kwon
Publication, Distribution, etc
Seoul : Graduate School, Korea University, 2020
Physical Medium
v, 39 leaves : color illustrations, charts ; 26 cm
Additional Physical Form Entry
Word Embedding Technique with Channel Attention Mechanism for Korean Language (DCOLL211009)000000127351
Dissertation Note
Thesis (Master's)-- Graduate School, Korea University: Department of Computer and Radio Communications Engineering, 2020. 2
Department Code
0510 6D36 1105
General Note
Advisor: 이상근
Bibliography, Etc. Note
Bibliography: leaves 35-39
Other Available Formats
Also available as a PDF file; requires a PDF file reader (application/pdf)
Uncontrolled Subject Terms
typos embedding, word embedding
000 00000nam c2200205 c 4500
001 000046026314
005 20200428152911
007 ta
008 191226s2020 ulkad bmAC 000c eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 0 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6D36 ▼c 1105
100 1 ▼a 권오준, ▼g 權五俊
245 1 0 ▼a Word embedding technique with channel attention mechanism for Korean language / ▼d Ohjoon Kwon
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2020
300 ▼a v, 39장 : ▼b 천연색삽화, 도표 ; ▼c 26 cm
500 ▼a 지도교수: 이상근
502 0 ▼a 학위논문(석사)-- ▼b 고려대학교 대학원: ▼c 컴퓨터·전파통신공학과, ▼d 2020. 2
504 ▼a 참고문헌: 장 35-39
530 ▼a PDF 파일로도 이용가능; ▼c Requires PDF file reader(application/pdf)
653 ▼a typos embedding ▼a word embedding
776 0 ▼t Word Embedding Technique with Channel Attention Mechanism for Korean Language ▼w (DCOLL211009)000000127351
900 1 0 ▼a Kwon, Oh-joon, ▼e
900 1 0 ▼a 이상근, ▼g 李尙根, ▼e 지도교수
945 ▼a KLPA

Electronic Information

No. 1  Word embedding technique with channel attention mechanism for Korean language (viewed 30 times)
Service: View PDF / Abstract / Table of Contents

Holdings Information

No.  Location                                      Call Number     Accession No.  Availability
1    Science & Engineering Library/Stacks(Thesis)  0510 6D36 1105  123063725      Available
2    Science & Engineering Library/Stacks(Thesis)  0510 6D36 1105  123063726      Available

Contents Information

Abstract

Word embedding is considered an essential factor in improving the performance of various Natural Language Processing (NLP) models. However, since the word embeddings used in most research are derived from well-refined datasets, they often transfer poorly to real-world data. In particular, Hangeul (the Korean script), with its unique writing system, produces many kinds of Out-Of-Vocabulary (OOV) words arising from typos or newly coined words. In this thesis, we propose a stable Hangeul word embedding technique that maintains its performance even on noisy texts containing various typos. We create word vectors that mix correct words with intentionally generated typos and perform end-to-end training using the contextual information of the embedded words. To demonstrate the effectiveness of our model, we conduct intrinsic and extrinsic evaluations as well as attention score visualization. While existing embedding techniques lose accuracy as the noise level increases, the embedding technique developed in this thesis shows stable performance.
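The abstract outlines two concrete steps: decomposing Hangeul syllables into jamo and mixing correct words with intentionally generated typos for training. As an illustration only, not the thesis's code, the following minimal Python sketch shows both steps using the standard Unicode composition rule for precomposed Hangeul syllables (code = 0xAC00 + (initial*21 + vowel)*28 + final); the function names and the vowel-swap noise model are assumptions.

import random

# Jamo inventories for precomposed Hangeul syllables (Unicode U+AC00..U+D7A3).
CHOSEONG  = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")                      # 19 initials
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")                 # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals (incl. none)

def to_jamo(word):
    # Decompose each syllable via: code = 0xAC00 + (cho*21 + jung)*28 + jong.
    jamo = []
    for ch in word:
        code = ord(ch) - 0xAC00
        if 0 <= code <= 11171:
            cho, rest = divmod(code, 21 * 28)
            jung, jong = divmod(rest, 28)
            jamo += [CHOSEONG[cho], JUNGSEONG[jung]] + ([JONGSEONG[jong]] if jong else [])
        else:
            jamo.append(ch)  # pass non-Hangeul characters through unchanged
    return jamo

def add_typo(word, p=0.1):
    # Produce a noisy variant: with probability p per syllable, swap the vowel,
    # a simple stand-in for the keyboard-slip typos the abstract refers to.
    out = []
    for ch in word:
        code = ord(ch) - 0xAC00
        if 0 <= code <= 11171 and random.random() < p:
            cho, rest = divmod(code, 21 * 28)
            jong = rest % 28
            ch = chr((cho * 21 + random.randrange(21)) * 28 + jong + 0xAC00)
        out.append(ch)
    return "".join(out)

print(to_jamo("한글"))          # ['ㅎ', 'ㅏ', 'ㄴ', 'ㄱ', 'ㅡ', 'ㄹ']
print(add_typo("한글", p=1.0))  # a random clean/noisy pair, e.g. '훈길'

Clean/noisy pairs produced this way could then feed a jamo-level network such as the CNN with channel attention named in Chapter 3 of the table of contents below.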

Table of Contents

1 Introduction 1
2 Related Work 5
 2.1 Typo Word Embedding Methods for English 5
 2.2 Word Embedding Methods for Korean 7
3 Model 10
 3.1 Generating Korean Typo 10
 3.2 Jamo-level Convolutional Neural Network with Channel Attention 12
 3.3 Training and Deriving Word Embeddings 15
4 Experiments 17
 4.1 Experiment Settings 17
 4.2 Word Analogy Task 18
  4.2.1 Datasets 19
  4.2.2 Results 19
 4.3 Language Model Task 20
  4.3.1 Datasets 20
  4.3.2 Results 21
 4.4 Sentiment Classification Task 22
  4.4.1 Datasets 22
  4.4.2 Results 23
5 Analysis 24
 5.1 Effects of Channel Attention 24
 5.2 Nearest Neighbor of Words 26
 5.3 Robustness to Noise Level 30
 5.4 Training Time 32
6 Conclusion 34
Bibliography 35
