Kaldi ASR Toolkit

Under this topic you can find information about the Kaldi ASR Toolkit, such as URLs and the paths where it can be found. Kaldi is a more recent ASR toolkit than SPRAAK. Like SPRAAK, it contains functionality to train different types of GMM-HMM acoustic models, but it can also train various types of Deep Neural Networks (DNNs), the current standard in ASR. This page provides links to Kaldi's own documentation pages and tips & tricks on how to use Kaldi in certain contexts. Feel free to add experiences that you think are useful to others, so that they do not have to reinvent the wheel.

Recommendation (DEPRECATED!): use your own LaMachine on Ponyland

To use Kaldi on Ponyland, it is recommended to use your own LaMachine instead of the shared one (note, however, that LaMachine is now deprecated). The instructions to install and prepare your own LaMachine are here: Your own Kaldi-CLAM-LaMachine (just Step 1: Prerequisites).

Please note that you should replace 'cristian' with a name of your own in every step.

Once you have your own LaMachine-CLAM-Kaldi, you can use these commands to activate your environment:

 $ ssh thunderlane
 $ lamachine-lacristianmachine-activate
 (lacristianmachine)$ cd "$KALDI_ROOT/egs"
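
Before changing into the recipes directory, it can help to confirm that activation worked. The following is a minimal sketch, not part of LaMachine or Kaldi: the function name check_kaldi_env is hypothetical, and it assumes that activating the environment exports KALDI_ROOT, as the cd command above implies.

```shell
# Sanity check to run after activating the environment.
# Assumption: activation exports KALDI_ROOT (implied by the cd command above).
# check_kaldi_env is a hypothetical helper name, not a LaMachine or Kaldi command.
check_kaldi_env() {
    # Fail if the variable is unset or empty.
    if [ -z "${KALDI_ROOT:-}" ]; then
        echo "KALDI_ROOT is not set; activate your environment first" >&2
        return 1
    fi
    # Fail if the recipes directory is missing.
    if [ ! -d "$KALDI_ROOT/egs" ]; then
        echo "No egs directory under $KALDI_ROOT" >&2
        return 1
    fi
    echo "Kaldi recipes found in $KALDI_ROOT/egs"
}
```

A typical use would be: check_kaldi_env && cd "$KALDI_ROOT/egs"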

Also, thunderlane and rarity are the servers best suited to working with Kaldi.

Details

Name: Kaldi ASR
Type: open source software
Developer info page: http://www.kaldi-asr.org
Documentation page: http://www.kaldi-asr.org/doc/
Compile and installation instructions: http://kaldi-asr.org/doc/install.html

Location of Kaldi sources: https://github.com/kaldi-asr/kaldi
Location of Linux x86_64 compiled binaries (usable on Ponyland): applejack.science.ru.nl:/vol/custom/opt/lamachine2/opt/kaldi. This is a central location for the Kaldi binaries and (for now) seems to always hold the most recent version. The binaries are compiled with NVIDIA CUDA 9.1.85 GPU support, so NVIDIA CUDA 9.1 must be installed on the target machine (it is already installed on Ponyland).
Note: accessing the applejack location above requires a Ponyland SSH account, which can be requested from Wessel Stoop.
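
If you want a recipe to pick up the shared binaries, a path.sh-style setup along the following lines may work. This is only a sketch modelled on the path.sh files found in Kaldi's egs recipes. It assumes that the shared directory is visible locally under /vol/custom/opt/lamachine2/opt/kaldi on the machine you are logged into and that it has the usual Kaldi source-tree layout (including tools/config/common_path.sh); neither has been verified here. The function name setup_kaldi_path is hypothetical.

```shell
# path.sh-style setup, modelled on the path.sh files in Kaldi's egs recipes.
# Assumptions: the shared install is visible locally under the default path
# below and has the usual Kaldi source-tree layout. Pass another root as the
# first argument if that is not the case.
# setup_kaldi_path is a hypothetical helper name.
setup_kaldi_path() {
    export KALDI_ROOT="${1:-/vol/custom/opt/lamachine2/opt/kaldi}"
    if [ ! -f "$KALDI_ROOT/tools/config/common_path.sh" ]; then
        echo "common_path.sh not found under $KALDI_ROOT" >&2
        return 1
    fi
    # Optional per-install environment settings, if present.
    if [ -f "$KALDI_ROOT/tools/env.sh" ]; then
        . "$KALDI_ROOT/tools/env.sh"
    fi
    # Recipe-local utils and OpenFst first, then the Kaldi binary directories
    # that common_path.sh adds to PATH.
    export PATH="$PWD/utils:$KALDI_ROOT/tools/openfst/bin:$PWD:$PATH"
    . "$KALDI_ROOT/tools/config/common_path.sh"
    # Kaldi scripts expect C-locale sorting.
    export LC_ALL=C
}
```

Called without arguments from inside a recipe directory, setup_kaldi_path uses the shared location; called with a directory, it uses that Kaldi root instead.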

AlexASR: a Kaldi-based incremental online decoder

AlexASR is an incremental online decoder based on Kaldi. It can be used if you'd like to use ASR in a time-sensitive context. It decodes the speech immediately as it comes in and only requires some finalization after the last audio packet has been received. For example, we used this decoder in the game developed in the CHASING project: to reduce the time players had to wait for ASR results, AlexASR decoded the speech while it was still being recorded from the player.
Location of sources and info: https://github.com/UFAL-DSG/alex-asr
See also Alex ASR.

Useful tutorial links

There are some useful tutorials for getting to know the Kaldi toolkit and how to use it for various tasks. Below is a list, sorted by topic.

Training acoustic models

A Kaldi introduction by former CLST employee Emre Yilmaz: Developing ASR Systems Using the KALDI toolkit
Kaldi's own relatively detailed tutorial providing an overview of the toolkit: http://kaldi-asr.org/doc/tutorial.html
Kaldi for Dummies, a tutorial for absolute beginners on creating a simple ASR system with Kaldi: http://kaldi-asr.org/doc/kaldi_for_dummies.html
Eleanor Chodroff's tutorial: http://pages.jh.edu/~echodro1/tutorial/kaldi/kaldi-intro.html
Josh Meyer's tutorial on training nnet2 models (very useful to get to know the basics of the existing training scripts in Kaldi): http://jrmeyer.github.io/kaldi/2016/12/15/DNN-AM-Kaldi.html
Johns Hopkins tutorial: https://www.dropbox.com/s/az474t7trsqkpri/2016-05-SLTU-Workshop.pdf?dl=0 or 2016-05-sltu-workshop.pdf

The official Kaldi lectures: http://www.danielpovey.com/kaldi-lectures.html

More practical tutorials:

https://github.com/SethiPawandeep/kaldi-for-dummies (with raw audio data)
https://github.com/keighrim/kaldi-yesno-tutorial
- You might find this repo useful for the keighrim tutorial: https://github.com/jmolina116/kaldi-yesno-tutorial (with raw audio data)
https://github.com/jmolina116/kaldi-digits (with raw audio data)
http://www.eleanorchodroff.com/tutorial/kaldi/training-acoustic-models.html
https://www.oxinabox.net/Kaldi-Notes/tidigits/
https://www.oxinabox.net/Kaldi-Notes/tidigits/eval
https://www.oxinabox.net/Kaldi-Notes/tidigits/lang_prep
http://jrmeyer.github.io/asr/2019/08/17/Kaldi-troubleshooting.html
http://codingandlearning.blogspot.com/search/label/KWS14

Forced alignment using existing acoustic models

Eleanor Chodroff's page: https://www.eleanorchodroff.com/tutorial/kaldi/forced-alignment.html
