Robust voice activity detection using formant frequencies

Material type
Thesis
Personal Author
유인철 兪仁哲 (Yoo, In-chul)
Title Statement
Robust voice activity detection using formant frequencies / Inchul Yoo
Publication, Distribution, etc
Seoul : Graduate School, Korea University, 2015
Physical Medium
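viii, 83 leaves : illustrations, charts ; 26 cm
Other Format Entry
Robust Voice Activity Detection Using Formant Frequencies (DCOLL211009)000000060022
Dissertation Note
Doctoral dissertation -- Korea University Graduate School : Department of Computer and Radio Communications Engineering, 2015. 8
Department Code
0510 6YD36 294
General Note
Advisor: 陸東錫 (Yook, Dong-suk)
Bibliography, Etc. Note
References: leaves 79-83
Additional Physical Form Available
Also available as a PDF file; Requires PDF file reader(application/pdf)
Uncontrolled Subject Terms
voice activity detection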
000 00000nam c2200205 c 4500
001 000045841301
005 20150826162359
007 ta
008 150628s2015 ulkad bmAC 000c eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 0 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6YD36 ▼c 294
100 1 ▼a 유인철 ▼g 兪仁哲
245 1 0 ▼a Robust voice activity detection using formant frequencies / ▼d Inchul Yoo
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2015
300 ▼a viii, 83장 : ▼b 삽화, 도표 ; ▼c 26 cm
500 ▼a 지도교수: 陸東錫
502 1 ▼a 學位論文(博士)-- ▼b 高麗大學校 大學院 : ▼c 컴퓨터·電波通信工學科, ▼d 2015. 8
504 ▼a 참고문헌: 장 79-83
530 ▼a PDF 파일로도 이용가능; ▼c Requires PDF file reader(application/pdf)
653 ▼a voice activity detection
776 0 ▼t Robust Voice Activity Detection Using Formant Frequencies ▼w (DCOLL211009)000000060022
900 1 0 ▼a Yoo, In-chul, ▼e
900 1 0 ▼a 육동석 ▼g 陸東錫, ▼e 지도교수
900 1 0 ▼a Yook, Dong-suk, ▼e 지도교수
945 ▼a KLPA

Holdings Information

No.
1
Location
Science & Engineering Library/Stacks(Thesis)
Call Number
0510 6YD36 294
Accession No.
123052365
Availability
Available

Contents information

Abstract

Voice activity detection (VAD) distinguishes human speech from other sounds. Various applications, including speech coding and speech recognition, can benefit from VAD. To detect voice activity accurately, an algorithm must take into account the characteristic features of human speech, background noise, or both. In many real-life applications, noise occurs unexpectedly, so its characteristics are difficult to determine accurately. As a result, robust VAD algorithms that depend less on correct noise estimates are more desirable for real-life applications. Formants are the major spectral peaks of the human voice and are highly useful for distinguishing human vowel sounds. Because they are spectral peaks, formants are likely to survive in a signal even after severe corruption by noise, which makes them attractive features for voice activity detection under low signal-to-noise ratio (SNR) conditions. However, irrelevant spectral peaks from background noise make it difficult to extract formants accurately from noisy signals. In this paper, a simple formant-based VAD algorithm is proposed that overcomes the problem of formant detection under severe noise. The proposed method runs much faster than standard VAD algorithms and outperforms them under various noise conditions. Its robustness against various types of noise and its light computational load make it suitable for a wide range of applications.
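
The claim that spectral peaks survive heavy noise can be illustrated with a small, self-contained sketch. It is not the algorithm proposed in the thesis: the peak positions are assumed to be known in advance (the thesis addresses the harder case where they are not), and the frame length, the 8 kHz sample rate, the peak bins, the neighbor offset and all function names are hypothetical choices made only for this illustration. The sketch mixes three pure tones with white noise at 0 dB SNR and measures how far the peak bins stand above nearby bins, compared with a noise-only frame.

```python
import cmath
import math
import random

FRAME_LEN = 256            # samples per frame (assumed); 31.25 Hz per bin at 8 kHz
PEAK_BINS = (16, 48, 80)   # hypothetical formant-like peaks: 500, 1500, 2500 Hz at 8 kHz


def bin_power(frame, k):
    """Power of DFT bin k of a real-valued frame (direct evaluation)."""
    n = len(frame)
    acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
    return abs(acc) ** 2 / n


def peak_contrast_db(frame, peak_bins=PEAK_BINS, offset=4):
    """Mean dB gap between each peak bin and the average of two nearby bins."""
    gaps = []
    for b in peak_bins:
        peak = bin_power(frame, b)
        nearby = (bin_power(frame, b - offset) + bin_power(frame, b + offset)) / 2
        gaps.append(10 * math.log10((peak + 1e-12) / (nearby + 1e-12)))
    return sum(gaps) / len(gaps)


def make_frame(rng, with_peaks, noise_std):
    """White Gaussian noise, optionally plus unit-amplitude tones at PEAK_BINS."""
    frame = []
    for t in range(FRAME_LEN):
        sample = rng.gauss(0.0, noise_std)
        if with_peaks:
            sample += sum(math.sin(2 * math.pi * b * t / FRAME_LEN) for b in PEAK_BINS)
        frame.append(sample)
    return frame


def mean_contrast(with_peaks, frames=10, seed=0):
    """Average peak contrast over several synthetic frames."""
    rng = random.Random(seed)
    # Three unit-amplitude tones carry power 1.5, so this noise level gives 0 dB SNR.
    noise_std = math.sqrt(1.5)
    scores = [peak_contrast_db(make_frame(rng, with_peaks, noise_std))
              for _ in range(frames)]
    return sum(scores) / len(scores)


speech_like_score = mean_contrast(with_peaks=True)
noise_only_score = mean_contrast(with_peaks=False)
print(f"tones + noise: {speech_like_score:.1f} dB, noise only: {noise_only_score:.1f} dB")
```

Even when the noise carries as much power as the tones, the contrast at the peak bins stays well above the noise-only value, because each tone concentrates its energy in one bin while white noise spreads its energy across all of them.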

Table of Contents

CHAPTER 1	INTRODUCTION	1
CHAPTER 2	RELATED WORKS	5
2.1	Speech-Related Features	7
2.1.1	Energy and zero-crossing rate (ZCR)	7
2.1.2	Spectral entropy	9
2.1.3	Band-partitioned spectral entropy	10
2.2	Statistical Methods	12
2.2.1	Likelihood ratio test (LRT)-based method	12
2.2.2	Distributional modeling of speech signals	14
2.2.3	Parametric representation of speech signals	15
2.3	G.729 Annex B Algorithm	16
2.3.1	Feature extraction	17
2.3.2	Background noise parameter estimation	19
2.3.3	Multiboundary VAD decision	20
2.3.4	VAD decision smoothing	22
2.4	ETSI AMR Option 1 Algorithm	23
2.4.1	Feature extraction	24
2.4.2	Background noise parameter estimation	24
2.4.3	Initial VAD decision	25
2.4.4	Hang-over addition	25
2.5	ETSI AMR Option 2 Algorithm	26
2.5.1	Feature extraction	27
2.5.2	Background noise parameter estimation	28
2.5.3	VAD decision	29
2.5.4	Hang-over addition	31
2.6	Summary	33
CHAPTER 3	IN-DEPTH ANALYSIS OF SIGNAL CORRUPTIONS BY NOISES	34
3.1	Analysis of Spectral Peaks	36
3.2	Vector Distance Metrics	39
3.2.1	Unnormalized vector distance metric	39
3.2.2	Normalized vector distance metric by total energies	41
3.2.3	Normalized vector distance metric by maximum energies	43
3.3	Spectral Peak-Based Metric	46
3.3.1	Direct comparison of spectral peak bands	46
3.3.2	Peak extraction-based approach	48
3.4	Summary	50
CHAPTER 4	DIRECT SIMILARITY COMPUTATION BETWEEN PEAK SIGNATURE AND CORRUPTED SPECTRUM	51
4.1	Peak Valley Difference (PVD)	52
4.1.1	Analysis of differences in average energy	52
4.1.2	VAD using average energy differences	54
4.1.3	Remarks on PVD algorithm	55
4.2	Peak-Neighbor Difference (PND)	56
4.2.1	VAD using formant frequencies	56
4.2.2	Band-limited computation for increased robustness against noises	58
4.2.3	Threshold calculation and post processing	60
CHAPTER 5	EXPERIMENTS	61
5.1	Experimental Conditions	61
5.1.1	Data preparation	61
5.1.2	Evaluation metrics	62
5.1.3	Noise mixing using FaNT	63
5.1.4	Baseline systems	64
5.1.5	Test sets	64
5.2	Aurora-2 Results	66
5.2.1	Averaged accuracy by noise type	66
5.2.2	Averaged accuracy by SNR level	67
5.3	NOISEX-92 Results	68
5.3.1	Averaged accuracy by noise type	68
5.3.2	Averaged accuracy by SNR level	69
5.4	Music Results	70
5.4.1	Averaged accuracy by noise type	70
5.4.2	Averaged accuracy by SNR level	71
5.5	Contours of VAD algorithms	72
5.6	Computational overheads	75
CHAPTER 6	CONCLUSION	77
