Abhinav Garg

Research Scientist
Language and Voice Team, Samsung Research, Seoul
Links: CV, LinkedIn, Bio, Google Scholar

About Me

I am a Research Scientist on the Language and Voice Team, part of Samsung Research, Seoul. I have been a member of the Automatic Speech Recognition (ASR) group there for the last three years, during which I have worked on various components of an online (streaming) ASR system, including pre-processing, model development, and post-processing.

In addition to co-authoring more than ten publications at top-tier conferences such as ASRU, ICASSP, and INTERSPEECH, I was involved in deploying our ASR solution to millions of Galaxy devices, including smartphones, TVs, and refrigerators.

My recent research focuses on self-supervision and semi-supervision in ASR, as well as universal ASR systems.

I hold a B.Tech. in Computer Science and Engineering from IIT Kanpur, where I served as a teaching assistant and mentor for courses on Machine Learning and Introduction to Programming. Alongside my CSE major, I completed a minor in Industrial and Management Engineering (IME).
Feel free to contact me for any queries/discussions.

Publications

Links: Google Scholar

1. Abhinav Garg, Ashutosh Gupta, Dhananjaya Gowda, Shatrughan Singh, and Chanwoo Kim. “Hierarchical Multi-Stage Word-to-Grapheme Named Entity Corrector for Automatic Speech Recognition.” In: INTERSPEECH. 2020, pp. 1793–1797. URL: http://www.interspeech2020.org/uploadfile/pdf/Tue-1-8-6.pdf

2. Abhinav Garg, Gowtham P Vadisetti, Dhananjaya Gowda, Sichen Jin, Aditya Jayasimha, Youngho Han, Jiyeon Kim, Junmo Park, Kwangyoun Kim, Sooyeon Kim, et al. “Streaming On-Device End-to-End ASR System for Privacy-Sensitive Voice-Typing.” In: INTERSPEECH. 2020, pp. 3371–3375. URL: http://www.interspeech2020.org/uploadfile/pdf/Wed-3-9-6.pdf

3. Dhananjaya Gowda, Abhinav Garg, Kwangyoun Kim, Mehul Kumar, and Chanwoo Kim. “Multi-Task Multi-Resolution Char-to-BPE Cross-Attention Decoder for End-to-End Speech Recognition.” In: INTERSPEECH. 2019, pp. 2783–2787

4. Chanwoo Kim, Minkyu Shin, Abhinav Garg, and Dhananjaya Gowda. “Improved Vocal Tract Length Perturbation for a State-of-the-Art End-to-End Speech Recognition System.” In: INTERSPEECH. 2019, pp. 739–743

5. Chanwoo Kim, Abhinav Garg, Dhananjaya Gowda, Seongkyu Mun, and Changwoo Han. “Streaming End-to-End Speech Recognition with Jointly Trained Neural Feature Enhancement.” In: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2021, pp. 6773–6777

6. Ankur Kumar*, Sachin Singh*, Dhananjaya Gowda*, Abhinav Garg, Shatrughan Singh, and Chanwoo Kim. “Utterance Confidence Measure for End-to-End Speech Recognition with Applications to Distributed Speech Recognition Scenarios.” In: INTERSPEECH. 2020, pp. 4357–4361

7. Dhananjaya Gowda*, Ankur Kumar*, Kwangyoun Kim, Hejung Yang, Abhinav Garg, Sachin Singh, Jiyeon Kim, Mehul Kumar, Sichen Jin, Shatrughan Singh, et al. “Utterance Invariant Training for Hybrid Two-Pass End-to-End Speech Recognition.” In: INTERSPEECH. 2020, pp. 2827–2831

8. Abhinav Garg, Dhananjaya Gowda, Ankur Kumar, Kwangyoun Kim, Mehul Kumar, and Chanwoo Kim. “Improved Multi-Stage Training of Online Attention-Based Encoder-Decoder Models.” In: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). 2019, pp. 70–77

9. Dhananjaya Gowda, Abhinav Garg, Jiyeon Kim, Mehul Kumar, Sachin Singh, Ashutosh Gupta, Ankur Kumar, Nauman Dawalatabad, Aman Maghan, Shatrughan Singh, et al. “HitNet: Byte-to-BPE Hierarchical Transcription Network for End-to-End Speech Recognition.” In: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). 2021

10. Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, et al. “End-to-End Training of a Large Vocabulary End-to-End Speech Recognition System.” In: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). 2019

11. Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, and Chanwoo Kim. “Semi-Supervised Transfer Learning for Language Expansion of End-to-End Speech Recognition Models to Low-Resource Languages.” In: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). 2021

12. Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, and Chanwoo Kim. “A Comparison of Streaming Models and Data Augmentation Methods for Robust Speech Recognition.” In: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). 2021

13. Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, and Changwoo Han. “A Review of On-Device Fully Neural End-to-End Automatic Speech Recognition Algorithms.” In: 2020 Asilomar Conference on Signals, Systems, and Computers (ACSSC). 2020

14. Chanwoo Kim, Dhananjaya N Gowda, Abhinav Garg, and Kyungmin Lee. System and method for modifying speech recognition result. US Patent App. 16/990,343. Feb. 2021

15. Dhananjaya N Gowda, Kwangyoun Kim, Abhinav Garg, and Chanwoo Kim. Method and device for speech recognition. US Patent App. 16/750,274. July 2020