Wednesday, April 26, 2006

Some websites for learning English

Academic reading + vocabulary: http://www.uefap.com/reading/readfram.htm --> Classifies the several thousand most common words used in academic English

Academic writing: http://groups.yahoo.com/group/academic_writing/ (particularly for undergraduates) or http://www.phrasebank.man.ac.uk/index.htm --> a bank of phrases (phrasebank) used in academic writing

Listening: http://www.bbc.co.uk/worldservice/learningenglish/newsenglish/ or http://www.bbc.co.uk/videonation/.

Speaking: http://www.mylanguageexchange.com/Learn/English.asp or http://ec.hku.hk/tops/ --> this one helps with giving presentations

Thursday, March 02, 2006

Survey of the State of the Art in Human Language Technology

Foreword

Ron Cole, editor in chief

The field of human language technology covers a broad range of activities with the eventual goal of enabling people to communicate with machines using natural communication skills. Research and development activities include the coding, recognition, interpretation, translation, and generation of language.

The study of human language technology is a multidisciplinary enterprise, requiring expertise in areas of linguistics, psychology, engineering and computer science. Creating machines that will interact with people in a graceful and natural way using language requires a deep understanding of the acoustic and symbolic structure of language, (the domain of linguistics) and the mechanisms and strategies that people use to communicate with each other (the domain of psychology). Given the remarkable ability of people to converse under adverse conditions, such as noisy social gatherings or band-limited communication channels, advances in signal processing are essential to produce robust systems (the domain of electrical engineering). Advances in computer science are needed to create the architectures and platforms needed to represent and utilize all of this knowledge. Collaboration among researchers in each of these areas is needed to create multimodal and multimedia systems that combine speech, facial cues and gestures both to improve language understanding and to produce more natural and intelligible speech by animated characters.

Human language technologies play a key role in the age of information. Today, the benefits of information and services on computer networks are unavailable to those without access to computers or the skills to use them. As the importance of interactive networks increases in commerce and daily life, those who do not have access to computers or the skills to use them are further handicapped from becoming productive members of society.

Advances in human language technology offer the promise of nearly universal access to on-line information and services. Since almost everyone speaks and understands a language, the development of spoken language systems will allow the average person to interact with computers without special skills or training, using common devices such as the telephone. These systems will combine spoken language understanding and generation to allow people to interact with computers using speech to obtain information on virtually any topic, to conduct business and to communicate with each other more effectively.

Advances in the processing of speech, text and images are needed to make sense of the massive amounts of information now available via computer networks. A student's query: "Tell me about global warming," should set in motion a set of procedures that locate, organize and summarize all available information about global warming from books, periodicals, newscasts, satellite images and other sources. Translation of speech or text from one language to another is needed to access and interpret all available material and present it to the student in her native language.

This book surveys the state of the art of human language technology. The goal of the survey is to provide an interested reader with an overview of the field---the main areas of work, the capabilities and limitations of current technology, and the technical challenges that must be overcome to realize the vision of graceful human computer interaction using natural communication skills.

The book consists of thirteen chapters written by 97 different authors. In order to create a coherent and readable volume, a great deal of effort was expended to provide consistent structure and level of presentation within and across chapters. The editorial board met six times over a two year period. During the first two meetings, the structure of the survey was defined, including topics, authors, and guidelines to authors. During each of the final four meetings (in four different countries), each author's contribution was carefully reviewed and revisions were requested, with the aim of making the survey as inclusive, up-to-date and internally consistent as possible.

This book is due to the efforts of many people. The survey was the brainchild of Oscar Garcia (then program director at the National Science Foundation in the United States), and Antonio Zampolli, professor at the University of Pisa, Italy. Oscar Garcia and Mark Liberman helped organize the survey and participated in the selection of topics and authors; their insights and contributions to the survey are gratefully acknowledged. I thank all of my colleagues on the editorial board, who dedicated remarkable amounts of time and effort to the survey. I am particularly grateful to Joseph Mariani for his diligence and support during the past two years, and to Victor Zue for his help and guidance throughout this project. I thank Hans Uszkoreit and Antonio Zampolli for their help in finding publishers. The survey owes much to the efforts of Vince Weatherill, the production editor, who worked with the editorial board and the authors to put the survey together, and to Don Colton, who indexed the book and proofread it several times. Finally, on behalf of the editorial board, we thank the authors of this survey, whose talents and patience were responsible for the quality of this product.

The survey was supported by a grant from the National Science Foundation to Ron Cole, Victor Zue and Mark Liberman, and by the European Commission. Additional support was provided by the Center for Spoken Language Understanding at the Oregon Graduate Institute and the University of Pisa, Italy.

Ron Cole
Poipu Beach
Kauai, Hawaii, USA
January 31, 1996

Wednesday, March 01, 2006

Some photos from my lover's graduation ceremony




List of Universities in Korea

1 Ajou University www.ajou.ac.kr

2 Andong National University www.andong.ac.kr

3 Anyang University www.anyang.ac.kr

4 Asia United Theological University www.acts.ac.kr

5 Busan National University of Education www.bnue.ac.kr

6 Busan Presbyterian Theological College and Seminary www.bptcs.ac.kr

7 Calvin University www.calvin.ac.kr

8 Catholic University of Daegu www.cataegu.ac.kr

9 Catholic University of Pusan www.cup.ac.kr

10 Changwon National University www.changwon.ac.kr

11 Cheju National University www.cheju.ac.kr

12 Cheju National University of Education www.cheju-e.ac.kr

13 Chinju National University of Education www.cue.ac.kr

14 Chodang University www.chodang.ac.kr

15 Chonan University www.cheonan.ac.kr

16 Chonbuk National University www.cbnu.ac.kr

17 Chongju National University of Education www.chongju-e.ac.kr

18 Chongju University www.chongju.ac.kr

19 Chongshin University www.chongshin.ac.kr

20 Chonnam National University www.chonnam.ac.kr

21 Chosun University www.chosun.ac.kr

22 Chugye University for Arts www.chugye.ac.kr

23 Chuncheon National University of Education www.cnue.ac.kr

24 Chung-ang University www.cau.ac.kr

25 Chungbuk National University www.chungbuk.ac.kr

26 Chungju National University www.chungju.ac.kr

27 Chungnam National University www.chungnam.ac.kr

28 Chungwoon University www.chungwoon.ac.kr

29 College of Medicine, Pochon Cha University

30 Daebul University www.daebul.ac.kr

31 Daegu National University of Education www.dnue.ac.kr

32 Daegu University www.daegu.ac.kr

33 Daejeon University www.dju.ac.kr

34 Daejin University www.daejin.ac.kr

35 Dankook University www.doonkook.ac.kr

36 Dong-A University www.donga.ac.kr

37 Dongduk Women's University www.dongduk.ac.kr

38 Dong-Eui University www.dongeui.ac.kr

39 Dongguk University www.dongguk.edu

40 Donghae University www.donghae.ac.kr

41 Dongseo University www.dongseo.ac.kr

42 Dongshin University www.dongshinu.ac.kr

43 Dongyang University www.dyu.ac.kr

44 Duksung Women's University www.duksung.ac.kr

45 Eulji University School of Medicine www.eulji.ac.kr

46 Ewha Womans University www.ewha.ac.kr

47 Far East University www.kdu.ac.kr

48 Gachon Medical School www.gachon.ac.kr

49 Gongju National University of Education www.gjue.ac.kr

50 Gyeongsang National University www.gsnu.ac.kr

51 Halla University www.halla.ac.kr

52 Hallym University www.hallym.ac.kr

53 Hanbat National University www.hanbat.ac.kr

54 Handong University www.handong.edu

55 Hanil University & Presbyterian Theological Seminary www.hanil.ac.kr

56 Hankuk Aviation University www.hau.ac.kr

57 Hankuk University of Foreign Studies www.hufs.edu

58 Hankyong National University www.hankyong.ac.kr

59 Hanlyo University www.hanlyo.ac.kr

60 Hannam University www.hannam.ac.kr

61 Hansei University www.hansei.ac.kr

62 Hanseo University www.hanseo.ac.kr

63 Hansung University www.hansung.ac.kr

64 Hanyang University www.hanyang.ac.kr

65 Hanyoung Theological University www.hytu.ac.kr

66 Honam Theological University & Seminary www.htus.ac.kr

67 Honam University www.honam.ac.kr

68 Hongik University www.hongik.ac.kr

69 Hoseo University www.hoseo.ac.kr

70 Howon University www.howon.ac.kr

71 Hyupsung University www.hyupsung.ac.kr

72 Inchon Catholic University www.iccu.ac.kr

73 Inchon National University of Education www.inchon-e.ac.kr

74 Information and Communication University www.icu.ac.kr

75 Inha University www.inha.ac.kr

76 Inje University www.inje.ac.kr

77 Jeonju National University of Education www.jnue.ac.kr

78 Jeonju University www.jeonju.ac.kr

79 Jinju National University of Industry www.jinju.ac.kr

80 Joongbu University www.joongbu.ac.kr

81 Korea Advanced Institute of Science and Technology www.kaist.edu

82 Kangnam University www.kangnam.ac.kr

83 Kangnung National University www.kangnung.ac.kr

84 Kangwon National University www.kangwon.ac.kr

85 Kaya University www.kaya.ac.kr

86 Kkottongnae Hyundo University of Social-welfare www.kkot.ac.kr

87 Kongju National University www.kongju.ac.kr

88 Konkuk University www.konkuk.ac.kr

89 Konyang University www.konyang.ac.kr

90 Kookmin University www.kookmin.ac.kr

91 Korea Baptist Theological University www.kbtus.ac.kr

92 Korea Maritime University www.kmaritime.ac.kr

93 Korea National Open University www.knou.ac.kr

94 Korea National University of Education www.knue.ac.kr

95 Korea Nazarene University www.nazarene.ac.kr

96 Korea Polytechnic University www.kpu.ac.kr

97 Korea University www.korea.edu

98 Korea University of Technology and Education www.kut.ac.kr

99 Korea Bible University www.bible.ac.kr

100 Korea National University of Physical Education www.knupe.ac.kr

101 Kosin University www.kosin.ac.kr

102 Kumoh National University of Technology www.kumoh.ac.kr

103 Kunsan National University www.kunsan.ac.kr

104 Kwandong University www.kwandong.ac.kr

105 Kwangju Catholic University www.kjcatholic.ac.kr

106 Kwangju National University of Education www.gnue.ac.kr

107 Kwangju University www.kwangju.ac.kr

108 Kwangju Women's University www.kwu.ac.kr

109 Kwangshin University www.kwangshin.ac.kr

110 Kwangwoon University www.kwangwoon.ac.kr

111 Kyonggi University www.kyonggi.ac.kr

112 Kyongju University www.kyongju.ac.kr

113 Kyungdong University www.kyungdong.ac.kr

114 Kyunghee University www.Khu.ac.kr

115 Kyungil University www.kyungil.ac.kr

116 Kyungnam University www.kyungnam.ac.kr

117 Kyungpook National University www.knu.ac.kr

118 Kyungsan University www.kyungsan.ac.kr

119 Kyungsung University www.ks.ac.kr

120 Kyungwon University www.kyungwon.ac.kr

121 Kyungwoon University www.kyungwoon.ac.kr

122 Luther Theological University www.ltu.ac.kr

123 Methodist Theological Seminary www.mts.ac.kr

124 Miryang National University www.miryang.ac.kr

125 Mokpo Catholic University www.mcu.ac.kr

126 Mokpo National Maritime University www.mmu.ac.kr

127 Mokpo National University www.mokpo.ac.kr

128 Mokwon University www.mokwon.ac.kr

129 Myongji University www.mju.ac.kr

130 Myungshin University www.myungshin.ac.kr

131 Nambu University www.nambu.ac.kr

132 Namseoul University www.nsu.ac.kr

133 Paichai University www.pcu.ac.kr

134 Pohang University of Science and Technology www.postech.ac.kr

135 Presbyterian College & Theological Seminary www.pcts.ac.kr

136 Pukyong National University www.pknu.ac.kr

137 Pusan National University www.pnu.edu

138 Pusan University of Foreign Studies www.pufs.ac.kr

139 Pyongtaek University www.ptuniv.ac.kr

140 Sahmyook University www.syu.ac.kr

141 Samchok National University www.samchok.ac.kr

142 Sangji University www.sangji.ac.kr

143 Sangju National University www.sangju.ac.kr

144 Sangmyung University www.sangmyung.ac.kr

145 Sejong University www.sejong.ac.kr

146 Semyung University www.semyung.ac.kr

147 Seokyeong University www.seokyeong.ac.kr

148 Seonam University www.seonam.ac.kr

149 Seoul Christian University www.scu.ac.kr

150 Seoul Jangsin University and Theological Seminary www.seouljangsin.ac.kr

151 Seoul National University www.snu.ac.kr

152 Seoul National University of Education www.snue.ac.kr

153 Seoul National University of Technology www.snut.ac.kr

154 Seoul Theological University www.stu.ac.kr

155 Seoul Women's University www.swu.ac.kr

156 Seowon University www.seowon.ac.kr

157 Silla University www.silla.ac.kr

158 Sogang University www.sogang.ac.kr

159 Sookmyung Women's University www.sookmyung.ac.kr

160 Soonchunhyang University www.sch.ac.kr

161 Soongsil University www.soongsil.ac.kr

162 Sunchon National University www.sunchon.ac.kr

163 Sungkonghoe University www.skhu.ac.kr

164 Sungkyul University www.sungkyul.edu

165 Sungkyunkwan University www.skku.ac.kr

166 Sungshin Women's University www.sungshin.ac.kr

167 Sunmoon University www.sunmoon.ac.kr

168 Suwon Catholic University www.suwoncatholic.ac.kr

169 Taegu Arts University www.tau.ac.kr

170 Taejon Catholic University www.dcatholic.ac.kr

171 Taeshin Christian University www.taeshin.ac.kr

172 Tamna University www.tamna.ac.kr

173 The Catholic University of Korea www.catholic.ac.kr

174 The University of Seoul www.uos.ac.kr

175 The University of Suwon www.suwon.ac.kr

176 Tongmyong University of Information Technology www.tit.ac.kr

177 Uiduk University www.uiduk.ac.kr

178 University of Inchon www.incheon.ac.kr

179 University of Ulsan www.ulsan.ac.kr

180 Wonkwang University www.wonkwang.ac.kr

181 Woosong University www.woosong.ac.kr

182 Woosuk University www.woosuk.ac.kr

183 Yeungnam University www.yu.ac.kr

184 Yewon University www.yewon.ac.kr

185 Yongin University www.yongin.ac.kr

186 Yonsei University www.ysid.yonsei.ac.kr

187 Yosu National University www.yosu.ac.kr

188 Youngnam Theological College & Seminary www.yntcs.ac.kr

189 Youngsan University www.ysu.ac.kr

190 Youngsan Won-Buddhist University www.youngsan.ac.kr

Tuesday, February 28, 2006

Some resources for noise-robust and channel-robust speech processing

The original of this page can be accessed here. I copied it to my site for easy reference.

This page collects links to software and data resources related to research on automatic speech recognition (ASR) that is robust to background noise and convolutional distortions such as reverberation. Some of the links pointed to by this page are also relevant to research on enhancing speech for human listening. If you would like to suggest more links for this page, you are invited to contact the page's maintainer, David Gelbart at ICSI.

If you use software or other resources pointed to by this page, please respect the license terms (and, when applicable, patent rights). If the use contributes to an academic publication, the maintainer suggests mentioning this in the publication and referencing the original source (not this page of pointers) by giving a publication or URL reference that will allow others to obtain the resource. This serves four purposes: (1) it alerts readers of the publication to the availability of the resource, (2) it provides a precise specification of that aspect of the work described in the publication, (3) it assigns credit where credit is due, and (4) it shows readers of the publication that sharing resources with others leads to public recognition, encouraging future sharing.

Successful approaches to robust ASR may combine more than one robustness technique. Because of the simple data flow of much signal processing code, different tools can often be used together simply by running them in sequence, using pipes or intermediate files. Two convenient choices for intermediate file formats are HTK feature files and waveforms. Many of the tools listed here operate on HTK feature files, or can output HTK feature files. The HTK format is a useful intermediate file format for feature files because it is simple to read, write, and convert to other formats, and because of the popularity of HTK. Also, some algorithms can be used with other tools without any modification to those other tools by having the algorithms run speech-enhancement-style, outputting processed waveforms which the other tools treat as they would any other audio input file. Using processed waveforms as an intermediate format also allows listening, waveform plotting, and spectrogram plotting, which may lead to useful insights. (If using processed waveforms as an intermediate format, it may be worthwhile to store these processed waveforms in floating point, rather than the usual 16-bit integer storage format, to reduce roundoff error and eliminate the risk of overflow/saturation error. Since algorithms may change the scale of waveforms, there is a risk of overflow or underflow with a 16-bit integer format even if the original waveforms were well scaled for that format.)
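To show how simple the HTK feature file format is to read and write, here is a minimal Python sketch based on my understanding of the layout described in the HTK Book: a 12-byte big-endian header (number of frames, frame period in 100 ns units, bytes per frame, parameter kind code) followed by 32-bit floats. The function names `write_htk` and `read_htk` are made up for this illustration, and the sketch ignores compressed files and qualifier bits such as deltas or CRC checksums.

```python
import os
import struct
import tempfile

# Parameter kind code for user-defined features (USER = 9 in the HTK Book).
HTK_USER = 9


def write_htk(path, frames, frame_period_s=0.010, parm_kind=HTK_USER):
    """Write a list of equal-length float vectors as an uncompressed HTK file."""
    n_frames = len(frames)
    dim = len(frames[0]) if n_frames else 0
    samp_period = int(round(frame_period_s * 1e7))  # HTK counts in 100 ns units
    samp_size = 4 * dim                             # bytes per frame (float32)
    with open(path, "wb") as f:
        # Header fields: nSamples (int32), sampPeriod (int32),
        # sampSize (int16), parmKind (int16), all big-endian.
        f.write(struct.pack(">iihh", n_frames, samp_period, samp_size, parm_kind))
        for frame in frames:
            f.write(struct.pack(">%df" % dim, *frame))


def read_htk(path):
    """Return (frames, frame_period_in_seconds, parm_kind)."""
    with open(path, "rb") as f:
        n_frames, samp_period, samp_size, parm_kind = struct.unpack(
            ">iihh", f.read(12))
        dim = samp_size // 4
        frames = [list(struct.unpack(">%df" % dim, f.read(samp_size)))
                  for _ in range(n_frames)]
    return frames, samp_period * 1e-7, parm_kind


if __name__ == "__main__":
    # Round-trip two 2-dimensional frames through a temporary file.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.htk")
    write_htk(demo_path, [[0.5, -1.25], [2.0, 0.0]])
    print(read_htk(demo_path))
```

Because the header is fixed-size and the payload is a flat array of floats, converting to or from other feature formats is mostly a matter of changing the header.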

Enhancement/compensation software for ASR and human listening:

Software for ASR:

Software for signal quality measurement:

Software and data for reproducing or simulating acoustic conditions:

Other:


VOICEBOX

The VOICEBOX Matlab toolbox for audio processing includes a noise reduction routine (specsubm), routines to read and write audio files from Matlab, and many other things.

Beamforming Toolkit

The Karlsruhe beamforming toolkit: "btk is a toolkit that provides a basis for the implementation of powerful beamforming algorithms. btk uses Python as a scripting language for ease of control and modification. The capacity to efficiently perform advanced numerical computations is provided by Numeric Python (NumPy), the GNU Scientific Library (GSL), as well as a few extra algorithms we've implemented ourselves."

Qualcomm-ICSI-OGI front end, speech detection, and noise reduction

This archive contains source code and documentation for the Qualcomm-ICSI-OGI noise-robust front end described in the ICSLP 2002 paper by Adami et al. The archive also contains tools for using the speech detection, Wiener filter noise reduction, or nonspeech frame dropping features of the front end independently of other features. The noise reduction can be used independently of other components to produce noise-reduced waveforms.

Matlab noise reduction tools by Patrick Wolfe

Matlab source code for various noise reduction algorithms is available here.

Trausti Kristjansson

Trausti Kristjansson created this page (while at the University of Toronto) which provides Matlab source code for (1) spectral subtraction noise removal, (2) the Algonquin variational inference algorithm for removing noise and channel effects, and (3) the Recognition Analyzer diagnostic tool which displays features, HTK log likelihoods, and HTK state sequences and can create resynthesized audio from MFCC features.

Marc Ferras' code for multi-microphone speech enhancement

This page provides source code for several blind multi-microphone speech enhancement techniques. These were implemented by Marc Ferras while pursuing his master's thesis on multi-microphone signal processing for automatic speech recognition in meeting rooms.

The RESPITE CASA Toolkit

The RESPITE CASA Toolkit is a toolkit for Computational Auditory Scene Analysis (CASA). This includes a tutorial on using the toolkit for missing data speech recognition.

Seneff auditory model

This page has source code for an implementation of Stephanie Seneff's auditory model front end for ASR.

RASTA and MSG

C/C++ implementations of the RASTA and MSG (modulation-filtered spectrogram) algorithms for robust feature extraction are available as part of this ICSI speech software package. There is also this older page for RASTA at ICSI. There is a MATLAB implementation of RASTA at Dan Ellis' Matlab page.

MVA (Mean, Variance, ARMA)

This page provides source code for this technique proposed by Chia-Ping Chen and Jeff Bilmes which post-processes noisy cepstra by doing mean and variance normalization (M for mean, V for variance) and bandpass modulation filtering (A for ARMA).
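A rough Python sketch of the three steps may make the description concrete. This is not the authors' code. The ARMA recursion below, which averages the previous `order` filtered values with the current and next `order` normalized values, is my assumption about the filter form, and the default `order=2` is arbitrary. Check both against the original source before relying on them.

```python
import math


def mva(frames, order=2):
    """Per-utterance mean/variance normalization followed by ARMA smoothing.

    frames: list of equal-length cepstral vectors (lists of floats).
    order:  assumed ARMA order M; edge frames are left unsmoothed.
    """
    n = len(frames)
    dim = len(frames[0])
    out = [[0.0] * dim for _ in range(n)]
    for d in range(dim):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        std = math.sqrt(var) or 1.0  # avoid dividing by zero on constant input
        # M and V: zero mean, unit variance over the utterance.
        norm = [(v - mean) / std for v in column]
        # A: average past outputs with current and future inputs.
        smooth = list(norm)
        for t in range(order, n - order):
            past = sum(smooth[t - order:t])
            future = sum(norm[t:t + order + 1])
            smooth[t] = (past + future) / (2 * order + 1)
        for t in range(n):
            out[t][d] = smooth[t]
    return out
```

With `order=0` the function reduces to plain mean and variance normalization, which is a handy sanity check.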

Gabor filter analysis for speech recognition

This page provides articles, filter definitions, software tools, and discussion related to automatic speech recognition (ASR) with Gabor filters. A Matlab package for feature selection using the Feature Finding Neural Networks (FFNN) approach proposed by Tino Gramß (Gramss) is available as well. (This FFNN package was used to select Gabor filters for ASR.)

Objective Speech Quality Assessment

The CSLU Robust Speech Processing Laboratory software repository page hosts the Objective Speech Quality Assessment package (developed by Bryan Pellom, referred to in an ICSLP 98 paper by Hansen and Pellom) which calculates various metrics of speech quality based on comparing clean audio with noisy or noise-reduced audio.
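As an example of a metric that compares clean audio with processed audio, here is a sketch of segmental SNR, a classic measure of this kind. I am not claiming this is how the package computes it: the frame length and the clamping range of -10 dB to 35 dB are assumptions chosen for illustration, and the function name is made up.

```python
import math


def segmental_snr_db(clean, processed, frame_len=256,
                     floor_db=-10.0, ceil_db=35.0):
    """Average per-frame SNR in dB, treating (clean - processed) as the error."""
    n = min(len(clean), len(processed))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        error_energy = sum((x - y) ** 2 for x, y in zip(c, p))
        if signal_energy == 0.0:
            continue  # skip silent frames, where the ratio is undefined
        if error_energy == 0.0:
            snr = ceil_db  # identical frame: report the ceiling
        else:
            snr = 10.0 * math.log10(signal_energy / error_energy)
        # Clamp so that a few extreme frames do not dominate the average.
        frame_snrs.append(min(max(snr, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

The two signals must be time-aligned sample by sample, which is why metrics like this need the clean reference.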

NIST Speech Quality Assurance Package (SPQA)

The SPQA package includes SNR measurement tools which do not require a clean audio reference.

FaNT tool for adding noise or telephone characteristics to speech

The FaNT (Filtering and Noise-adding Tool) tool can be used to add noise to speech recordings at a desired SNR (signal-to-noise ratio). The SNR can be calculated using frequency weighting (G.712 or A-weighting), and the speech energy is calculated following ITU recommendation P.56. The tool can also be used to filter speech with certain frequency characteristics defined by the ITU for telephone equipment. This tool was used to create the noisy data for the popular AURORA 2 speech recognition corpus.
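The basic arithmetic of mixing at a target SNR can be sketched in a few lines of Python. This is not FaNT's interface, and it is much simpler than what FaNT does: it uses plain full-band average power, with no G.712 or A-weighting and no P.56 speech level measurement. The function name `mix_at_snr` is made up for this illustration.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power equals snr_db, then add."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(v * v for v in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise is silent")
    # Required noise power is speech_power / 10^(snr_db / 10).
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * v for s, v in zip(speech, noise)]
```

For example, `mix_at_snr([1.0, -1.0, 1.0, -1.0], [2.0, 2.0, -2.0, -2.0], 0.0)` scales the noise by 0.5 so that both signals have equal power before they are summed.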

Acoustic impulse responses

This page, created by James Hopgood, is a collection of acoustic impulse responses for simulating convolutional distortion. The focus is on hands-free / far-field acoustic conditions.
Some past speech recognition work (by Shire, Kingsbury, Avendano, Palomaki, Morgan, Chen, Gelbart, possibly others) has been done using a set of impulse responses collected using the varechoic chamber at Bell Labs. It is planned to make these available on Hopgood's web page. Until they are available there, a download link has been placed here.

More acoustic impulse responses are available as part of the Sound Scene Database in Real Acoustical Environments from the Real World Computing Partnership, here. The site noisevault.com has acoustic impulse responses as well as links to software and documents regarding impulse response measurement and acoustic simulation; it seems aimed at audio engineers and audio engineering hobbyists.
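Simulating convolutional distortion with one of these impulse responses amounts to convolving the clean waveform with it. Below is a minimal direct-form sketch in plain Python. It is only meant to show the operation: for real recordings an FFT-based convolution from a numerical library would be far faster, and the function name is made up for this illustration.

```python
def apply_impulse_response(signal, impulse_response):
    """Return the full linear convolution of signal with impulse_response."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample launches a scaled, delayed copy of the response.
            out[i + j] += s * h
    return out
```

The output is longer than the input by the length of the impulse response minus one sample, which is the reverberant tail. Keeping the result in floating point avoids the overflow issue mentioned earlier, since convolution can change the scale of the waveform.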

Room acoustics simulator

The AudioGroup at the University of Patras have placed public domain room acoustics simulators online here.

Additive Noise Sources

The CSLU Robust Speech Processing Laboratory software repository page hosts a package named Additive Noise Sources which contains noise files for use in simulating additive noise.

NOISEX noises

This page at Rice has a set of downloadable noises. I think these are from the NOISEX-92 collection, but I don't know if this is the complete collection. I am not trying to give a comprehensive list of corpora on this page, but this page in the comp.speech FAQ has some links.

ShATR multiple simultaneous speaker corpus

Here. "ShATR is a corpus of overlapped speech collected by the University of Sheffield Speech and Hearing Research Group in collaboration with ATR in order to support research into computational auditory scene analysis. The task involved four participants working in pairs to solve two crosswords. A fifth participant acted as a hint-giver. Eight channels of audio data were recorded from the following sensors: one close microphone per speaker, one omnidirectional microphone, and the two channels of a binaurally-wired mannekin. Around 41% of the corpus contains overlapped speakers. In addition, a variety of other audio data was collected from each participant. The entire corpus, which has a duration of around 37 minutes, has been segmented and transcribed at 5 levels, from subtasks down to phones. In addition, all nonspeech sounds have been marked."

A brief list of resources that are not specific to noise and channel robustness

WaveSurfer speech visualization tool (view waveforms, spectrograms, formant tracks, pitch tracks) and other KTH-hosted software, HTK recognizer, SPHINX recognizer, ICSI speech software package (link above), ISIP recognizer and ISIP Foundation Classes for speech processing, CSLR SONIC recognizer, CMU-Cambridge Statistical Language Modeling toolkit, SRILM - The SRI Language Modeling Toolkit, some more links to tools here.

A list of phonetics tutorials and speech processing tutorials and software