Language model settings

In SPRAAK you can use Finite State Grammars (FSGs) and N-gram language models. Each type has its own set of settings in the recognition process. Below is an explanation of several of these settings as I used them:

The language model parameters cost_A and cost_C (described under N-gram below) can be set with the following command on the interactive SPRAAK command line (spr_cwr_main): search lmi cost_A cost_C
See also: http://www.spraak.org/documentation/doxygen/doc/html/spr__cwr__main_8c.html

FSGs

For FSGs I used the following settings in recognition experiments:

  • fail_cost (set inside the text file that defines the FSG): the threshold at which the recognition of a word fails, which results in <PARTIAL> tags in the output. Increasing or decreasing this value (it can also be negative) helps to remove the <PARTIAL> tags from the recognition results.

N-gram

For N-gram language models (e.g. bigram, trigram, etc.) I used the following settings in recognition experiments:

  • cost_C (in the .ini config file of the recognition experiment): word startup cost, i.e. a cost that is added whenever a new word is started.
  • cost_A (in the .ini config file of the recognition experiment): scaling factor of the language model (LM) relative to the acoustic model (AM).
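
To get a feeling for how these two settings interact, the following Python sketch may help. It is not SPRAAK code. It assumes the common log-linear combination in which every word contributes cost_A * log P_LM(word) + cost_C on top of the acoustic log score, and it assumes that higher scores are better. The function names, the hypotheses and all numbers are made up for illustration; the exact formula and sign convention used by SPRAAK are described in the documentation linked below.

```python
import math

def combined_score(am_logprob, lm_probs, cost_a, cost_c):
    """Score of one hypothesis under the ASSUMED log-linear combination.

    am_logprob: acoustic model log score of the whole hypothesis
    lm_probs:   language model probability of each word in the hypothesis
    cost_a:     scaling factor applied to the LM log probabilities
    cost_c:     constant added once for every word that is started
    """
    lm_term = sum(cost_a * math.log(p) + cost_c for p in lm_probs)
    return am_logprob + lm_term

def pick_best(hypotheses, cost_a, cost_c):
    """Return the name of the hypothesis with the highest combined score."""
    return max(
        hypotheses,
        key=lambda name: combined_score(*hypotheses[name], cost_a, cost_c),
    )

# Hypothetical competing hypotheses: (acoustic log score, per-word LM probabilities).
HYPOTHESES = {
    "two_words": (-100.0, [0.1, 0.1]),
    "three_words": (-98.0, [0.1, 0.1, 0.1]),
}

if __name__ == "__main__":
    for cost_a, cost_c in [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]:
        best = pick_best(HYPOTHESES, cost_a, cost_c)
        print(f"cost_A={cost_a} cost_C={cost_c} -> {best}")
```

In this toy setup, switching the LM off (cost_A = 0) lets the acoustically better three-word hypothesis win, giving the LM full weight favours the shorter hypothesis, and a per-word bonus through cost_C shifts the balance back towards more words. Under these assumptions, cost_A controls how much the LM is trusted and cost_C controls the tendency to insert or delete words.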

See also: http://www.spraak.org/documentation/doxygen/doc/html/spr__lm__scale.html for detailed information on cost_C and cost_A.

language_model_settings.txt · Last modified: 2015/04/30 10:13 by mganzeboom