000 | 00000nam c2200205 c 4500 | |
001 | 000045978961 | |
005 | 20190416165557 | |
007 | ta | |
008 | 181227s2019 ulka bmAC 000c eng | |
040 | ▼a 211009 ▼c 211009 ▼d 211009 | |
085 | 0 | ▼a 0510 ▼2 KDCP |
090 | ▼a 0510 ▼b 6D36 ▼c 1097 | |
100 | 1 | ▼a 김예지 ▼g 金禮知 |
245 | 1 0 | ▼a Three-stream fusion network for first-person interaction recognition / ▼d Ye-ji Kim |
260 | ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2019 | |
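300 | ▼a 33 leaves : ▼b color illustrations ; ▼c 26 cm | |
500 | ▼a Advisor: 이성환 | |
502 | 0 | ▼a Thesis (Master's)-- ▼b Graduate School, Korea University: ▼c Department of Computer and Radio Communications Engineering, ▼d 2019. 2 |
504 | ▼a References: leaves 27-33 | |
530 | ▼a Also available as a PDF file; ▼c Requires PDF file reader(application/pdf) | |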
653 | ▼a human activity recognition ▼a first-person vision | |
776 | 0 | ▼t Three-Stream Fusion Network for First-Person Interaction Recognition ▼w (DCOLL211009)000000083456 |
900 | 1 0 | ▼a Kim, Ye-ji, ▼e author |
900 | 1 0 | ▼a 이성환 ▼g 李晟瑍, ▼e advisor |
945 | ▼a KLPA |
Electronic Information
No. | Title | Service |
---|---|---|
1 | Three-stream fusion network for first-person interaction recognition (28 views) | PDF, Abstract, Table of Contents |
Holdings Information
No. | Location | Call Number | Accession No. | Availability |
---|---|---|---|---|
1 | Science & Engineering Library/Stacks(Thesis)/ | 0510 6D36 1097 | 123060861 | Available |
2 | Science & Engineering Library/Stacks(Thesis)/ | 0510 6D36 1097 | 123060862 | Available |
Contents information
Abstract
First-person interaction recognition is a challenging task because the camera wearer’s movement makes the video unstable. For human interaction recognition from a first-person viewpoint, this paper proposes a three-stream fusion network with two main parts: a three-stream architecture and three-stream correlation fusion. The three-stream architecture captures characteristics of the target appearance, target motion, and camera ego-motion. The three-stream correlation fusion combines the feature maps of the three streams to account for correlations between the target appearance, target motion, and camera ego-motion. The fused feature vector is robust to camera movement and compensates for the noise of camera ego-motion. Short-term intervals are modeled with the fused feature vector, and an LSTM models the temporal dynamics of the videos. We evaluated the proposed method on two public benchmark datasets to show the effectiveness of our approach. In the experiments, we showed that the proposed fusion method successfully generated a discriminative feature vector, and our network outperformed all competing activity recognition methods in first-person videos where substantial camera ego-motion occurs.
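The data flow described in the abstract (three per-stream features, a fusion step that relates them, then a temporal model over short-term intervals) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's method: the abstract does not give the fusion operator, so `correlation_fuse` approximates "correlation" with pairwise element-wise products, and `temporal_summary` replaces the LSTM with a plain exponential moving average. Both function names and the `decay` parameter are hypothetical.

```python
def correlation_fuse(appearance, motion, ego_motion):
    """Fuse three equal-length feature vectors into one vector.

    Assumption: pairwise element-wise products stand in for the
    correlation fusion; the thesis's actual operator is not stated
    in the abstract.
    """
    if not (len(appearance) == len(motion) == len(ego_motion)):
        raise ValueError("all three streams must have the same length")
    # Start with the raw per-stream features.
    fused = list(appearance) + list(motion) + list(ego_motion)
    # Append one interaction term per pair of streams.
    pairs = [(appearance, motion), (appearance, ego_motion), (motion, ego_motion)]
    for x, y in pairs:
        fused.extend(a * b for a, b in zip(x, y))
    return fused


def temporal_summary(fused_sequence, decay=0.5):
    """Summarize a sequence of fused vectors over time.

    Assumption: an exponential moving average is used here as a simple
    stand-in for the LSTM mentioned in the abstract.
    """
    state = [0.0] * len(fused_sequence[0])
    for fused in fused_sequence:
        state = [decay * s + (1.0 - decay) * f for s, f in zip(state, fused)]
    return state


# One fused vector per short-term interval, then a temporal summary.
intervals = [
    correlation_fuse([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]),
    correlation_fuse([0.0, 1.0], [1.0, 0.0], [2.0, 2.0]),
]
video_descriptor = temporal_summary(intervals)
```

In a real system the three inputs would come from learned networks (for example, CNNs over frames and optical flow), and the summary would feed a classifier; those parts are omitted here.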
Table of Contents
1. Introduction 1
2. Related Work 4
3. Three-Stream Fusion Network 7
3.1. Three-Stream Architecture 7
3.2. Three-Stream Correlation Fusion 9
3.3. LSTM for Classification 12
4. Experiments 14
4.1. Datasets 14
4.2. Implementation Details 16
4.3. Performance Evaluations 18
4.4. Three-Stream Correlation Fusion Evaluations 22
5. Conclusion for Research 26