Description
I have been a research engineer at Clova Speech since March 2020, where I develop speech recognition systems for adverse conditions.
In 2020, I received a Ph.D. (Thesis: Generalization of neural network on unseen acoustic environment and sentence for spoken dialog system) from the School of Electrical Engineering at Korea Advanced Institute of Science and Technology (KAIST) under the supervision of Prof. Soo-Young Lee. In 2012, I received a B.S. from the Department of Electrical Engineering (major) and Mathematical Science (minor) at KAIST.
Research Interests
Dialog Systems, Speech Recognition, Neural Networks, and Machine Learning
Publications
(J: journal, C: conference, W: workshop, A: arXiv preprint, T: Ph.D. thesis)
(* authors contributed equally)
2020
[A] Semi-supervised Disentanglement with Independent Vector Variational Autoencoders
- Bo-Kyeong Kim, Seungjin Kim, Geonmin Kim, Soo-Young Lee
- Under review
- [paper]
[T] Generalization of Neural Network on Unseen Acoustic Environment and Sentence for Spoken Dialog System
- Ph.D. Thesis, Korea Advanced Institute of Science and Technology (advised by Prof. Daeshik Kim and Prof. Soo-Young Lee)
- [paper]
2019
[J4] Style-Controlled Synthesis of Clothing Segments for Fashion Image Manipulation
- Bo-Kyeong Kim, Geonmin Kim, Soo-Young Lee
- IEEE TMM 2019
- [paper]
[J3] Unpaired Speech Enhancement by Acoustic and Adversarial Supervision for Speech Recognition
2018
[A1] A Fully Time-domain Neural Model for Subband-based Speech Synthesizer
- Azam Rabiee, Geonmin Kim, Tae-Ho Kim, Soo-Young Lee
- arXiv 2018
- [paper]
[J2] Rescoring of N-best Hypotheses using Top-down Selective Attention for Automatic Speech Recognition
- Ho-Gyeong Kim, Hwaran Lee, Geonmin Kim, Sang-Hoon Oh, Soo-Young Lee
- IEEE SPL 2018
- [paper]
~2017
[W4] A Deep Chatbot for QA and Chitchat
- Geonmin Kim*, Hwaran Lee*, CheongAn Lee, Eunmi Hong, Byunggeun Kim, Soo-Young Lee (team kAIb)
- NIPS competition track workshop 2017
- [paper] [code] [poster]
[C3] Compositional Sentence Representation from Character within Large Context Text
- Geonmin Kim, Hwaran Lee, Bo-Kyeong Kim, Soo-Young Lee
- ICONIP 2017
- [paper]
[W3] Fusing Aligned and Non-Aligned Face Information for Automatic Affect Recognition in the Wild: A Deep Learning Approach
- Bo-Kyeong Kim, Suh-Yeon Dong, Jihyeon Roh, Geonmin Kim, Soo-Young Lee
- CVPR workshop, 2016
- [paper]
[J1] Deep CNNs Along the Time Axis With Intermap Pooling for Robustness to Spectral Variations
- Hwaran Lee, Geonmin Kim, Ho-Gyeong Kim, Sang-Hoon Oh, Soo-Young Lee
- IEEE SPL, 2016
- [paper]
[C2] Active Learning for Large-scale Object Classification: from Exploration to Exploitation
- Ho-Gyeong Kim, Jihyeon Roh, Hwaran Lee, Geonmin Kim, Soo-Young Lee
- HAI 2015
- [paper]
[W2] Spoken Sentence Embedding from Character by Jointly Learning Character-level Compositional Word Model and RNN Sentence Encoder
- Geonmin Kim, Hwaran Lee, Jaemyung Yu, Soo-Young Lee
- NBNI 2015 (only abstract)
- [paper]
[W1] Learning Tonotopically Organized Auditory Feature-map from Speech by an Intermap Pooling Layer in a Deep CNN
- Hwaran Lee, Geonmin Kim, Jihyeon Roh, Soo-Young Lee
- NBNI 2015 (only abstract)
- [paper]
[C1] Implement Real-time Polyphonic Pitch Detection and Feedback System for the Melodic Instrument Player
- Geonmin Kim, Chang-Hyun Kim, Soo-Young Lee
- ICONIP 2012
- [paper]
Honors and Awards
- Ranked 3rd, ConvAI challenge, NIPS 2017 Competition Track Workshop, 2017
- Best Paper Award, HAI, 2015
- Qualcomm Innovation Award, 2015
  title: Active Learning for Large-scale Object Classification: from Exploration to Exploitation
- BK21 Plus Financial Support for Graduates Long Term Training, July 2014
  Visiting student at BioPOETS lab, UC Berkeley, U.S., July - August 2014
- KAIST Graduate Scholarship, Mar. 2013 - Feb. 2020
Work Experience
(L: leader, M: member)
Computational NeuroSystem Laboratory, KAIST (Graduate student, Mar. 2013 - Dec. 2019)
Speech recognition
- Semi-supervised continuous speech recognition (2016, L)
- End-to-end continuous speech recognition (2015, L)
- Acoustic model for Korean Syllable (2013-2014, L)
for spontaneous spoken dialog system for language learning, Electronics and Telecommunications Research Institute (ETRI)
Speech enhancement
- Location-robust blind source extraction (2018, M)
for free-running embedded speech recognition technology for natural language dialogue with robots, Korea Evaluation Institute of Industrial Technology (KEIT)
- Unpaired speech enhancement (2017, L)
for spontaneous spoken dialog system for language learning, Electronics and Telecommunications Research Institute (ETRI)
Natural language generation
- Article based question-answering and chitchat bot (2017, L)
for emotional intelligence technology to infer human emotion and carry on dialogue accordingly, Institute for Information & Communication Technology Promotion (IITP)
Sony Interactive Entertainment America (Intern, Oct. 2012 - Feb. 2013)
- Multiple keyword spotting in audio