Open-Source Large
Vocabulary CSR Engine Julius
 

How to write a recognition grammar for Julius

Warning: the recognition grammars below do not run on Julius
unless you have an acoustic model.
Currently we have no English acoustic model.
The descriptions below merely explain the grammar specification in Julius.
Sorry for the inconvenience.

Recognition Grammar

Julius can perform speech recognition based on a written grammar. The grammar describes the allowed syntax, or word patterns, for a specific task. Given a speech input, Julius searches for the most likely word sequence under the constraints of the given grammar.

The following is a brief description of how to write a recognition grammar for Julius.

Writing a recognition grammar for Julius

In Julius, the recognition grammar is given as two separate files: a ".grammar" file and a ".voca" file. The .grammar file defines the category-level syntax, i.e. which word categories may follow one another. The .voca file defines the candidate words in each category, together with their pronunciations.

An example grammar is used in the following section. The task name is "fruit order", which assumes an ordering task at a fruit shop, dealing with utterances like "four apples, please," "Well, I'll take an orange," and so on.

.grammar file

The allowed connections of words are defined in the .grammar file, using word category names as terminal symbols.

Below is an example grammar file for the fruit order task, "fruit.grammar". The notation resembles BNF. The initial sentence symbol must be "S". Rewrite rules are written one per line, using ":" as the delimiter. Symbol names may contain ASCII letters, digits, and underscores, and they are case sensitive.

S : NS_B HMM SENT NS_E
S : NS_B SENT NS_E
SENT: TAKE_V FRUIT PLEASE
SENT: TAKE_V FRUIT
SENT: FRUIT PLEASE
SENT: FRUIT
FRUIT: NUM FRUIT_N
FRUIT: FRUIT_N_1

Terminal symbols, i.e. symbols that do not appear on the left-hand side of any rule, are treated as "word category" names, and the words in each category should be defined in the .voca file.

In this example, NS_B, NS_E, HMM, TAKE_V, PLEASE, NUM, FRUIT_N, and FRUIT_N_1 are word categories, and their contents should be specified in the .voca file. NS_B and NS_E correspond to the head silence and tail silence of the input speech, and should be defined in every grammar for Julius.
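
The distinction between rule symbols and word categories can be checked mechanically. The following is a minimal Python sketch, not part of the Julius tools: the helper names `parse_rules` and `word_categories` are hypothetical, and the parser assumes exactly the rule format described above (one rule per line, ":" as delimiter).

```python
# Illustrative only: these helpers are not part of Julius or mkdfa.pl.
FRUIT_GRAMMAR = """\
S : NS_B HMM SENT NS_E
S : NS_B SENT NS_E
SENT: TAKE_V FRUIT PLEASE
SENT: TAKE_V FRUIT
SENT: FRUIT PLEASE
SENT: FRUIT
FRUIT: NUM FRUIT_N
FRUIT: FRUIT_N_1
"""


def parse_rules(text):
    """Return a list of (left_symbol, [right_symbols]) pairs, one per rule line."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        left, right = line.split(":", 1)
        rules.append((left.strip(), right.split()))
    return rules


def word_categories(rules):
    """Symbols that never appear on a left-hand side, in order of first use."""
    nonterminals = {left for left, _ in rules}
    categories = []
    for _, right in rules:
        for symbol in right:
            if symbol not in nonterminals and symbol not in categories:
                categories.append(symbol)
    return categories


rules = parse_rules(FRUIT_GRAMMAR)
print(len(rules), "rules")
print(word_categories(rules))
```

For the fruit grammar this reports 8 rules and the same eight categories listed above, each of which needs an entry in the .voca file.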

Note1: To allow unbounded repetition in part of your grammar, write a recursive rule like this (only left recursion is allowed):

S: NS_B WORD_LOOP NS_E
WORD_LOOP: WORD_LOOP WORD
WORD_LOOP: WORD

Note2: Although this grammar syntax can express grammars up to the context-free class, Julius can handle only the regular class, since it uses a DFA parser. If you write a grammar that goes beyond what a DFA can represent, the grammar compiler (explained below) will complain about it.
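
As a rough pre-check for Note1, the Python sketch below flags rules that refer to their own left symbol anywhere except the first position on the right-hand side. It is only an approximation under stated assumptions: it looks at direct recursion only, the helper names are hypothetical, and it is not the check that the grammar compiler itself performs.

```python
# Illustrative only: detects direct, non-left recursion in .grammar rules.
def parse_rules(text):
    """Return a list of (left_symbol, [right_symbols]) pairs, one per rule line."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        left, right = line.split(":", 1)
        rules.append((left.strip(), right.split()))
    return rules


def non_left_recursive_rules(rules):
    """Return rules whose left symbol reappears after the first right-hand position."""
    return [(left, right) for left, right in rules if left in right[1:]]


LEFT_LOOP = """\
S: NS_B WORD_LOOP NS_E
WORD_LOOP: WORD_LOOP WORD
WORD_LOOP: WORD
"""

# Same loop written with right recursion, which Note1 says is not allowed.
RIGHT_LOOP = """\
S: NS_B WORD_LOOP NS_E
WORD_LOOP: WORD WORD_LOOP
WORD_LOOP: WORD
"""

print(non_left_recursive_rules(parse_rules(LEFT_LOOP)))
print(non_left_recursive_rules(parse_rules(RIGHT_LOOP)))
```

The first call returns an empty list, while the second reports the `WORD_LOOP: WORD WORD_LOOP` rule. Indirect recursion through other symbols is not detected by this sketch.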

.voca file

The .voca file contains the word definitions for each word category defined in the .grammar file. Below is the corresponding "fruit.voca" file.

% NS_B
<s>		sil

% NS_E
</s>		sil

% HMM
FILLER		f m
FILLER		w eh l

% TAKE_V
I'lltake	ay l t ey k
I'llhave	ay l hh ar v

% PLEASE
please		p l iy z

% FRUIT_N_1
apple		ae p ax l
orange		ao r ax n jh
orange		ao r ix n jh
grape		g r ey p
banana		b ax n ae n ax
plum		p l ah m

% FRUIT_N
apples		ae p ax l z
oranges		ao r ax n jh ax z
oranges		ao r ix n jh ix z
grapes		g r ey p s
bananas		b ax n ae n ax z
plums		p l ah m s

% NUM
one		w ah n
two		t uw
three		th r iy
four		f ao r
five		f ay v
six		s ih k s
seven		s eh v ax n
eight		ey t
nine		n ay n
ten		t eh n
eleven		ix l eh v ax n
twelve		t w eh l v

After a word category is declared with "%", the words in that category are defined one per line. The first column is the string that will be output when the word is recognized, and the remaining columns are the pronunciation. Spaces and tabs are the field separators. In the example above, the NS_B and NS_E categories each have one word entry with the silence model, corresponding to the head and tail silence of the speech input.

The pronunciation should be defined as a sequence of HMM names from your acoustic model. If a word has several pronunciations, you can define each variation on its own line. See the word entry "orange" above.
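
To make the .voca layout concrete, here is a minimal Python sketch that reads the format described above into a dictionary. The function name `parse_voca` is hypothetical and not part of Julius; the sketch assumes only what is stated here: "%" starts a category, the first field is the output string, and the remaining fields are HMM names.

```python
# Illustrative only: a small reader for the .voca layout described above.
FRUIT_VOCA_EXCERPT = """\
% NS_B
<s>     sil

% NS_E
</s>    sil

% FRUIT_N_1
apple   ae p ax l
orange  ao r ax n jh
orange  ao r ix n jh
"""


def parse_voca(text):
    """Map each category name to a list of (output_string, [hmm_names]) entries."""
    categories = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines only separate categories visually
        if stripped.startswith("%"):
            current = stripped[1:].strip()
            categories.setdefault(current, [])
            continue
        if current is None:
            raise ValueError("word entry found before any '%' category line")
        fields = stripped.split()  # spaces and tabs both separate fields
        categories[current].append((fields[0], fields[1:]))
    return categories


voca = parse_voca(FRUIT_VOCA_EXCERPT)
for category, entries in voca.items():
    print(category, len(entries))
```

Pronunciation variants simply appear as repeated entries: in this excerpt "orange" is listed twice under FRUIT_N_1, once for each phone sequence.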

Converting to Julius format

The .grammar and .voca files should be converted to .dfa and .dict files using "mkdfa.pl", a grammar compiler. Pass the common prefix of the .grammar and .voca files to mkdfa.pl. An example of running mkdfa.pl on the example grammar is shown below.

% mkdfa.pl fruit
fruit.grammar has 8 rules
fruit.voca    has 8 categories and 31 words
---
Now parsing grammar file
Now modifying grammar to minimize states[0]
Now parsing vocabulary file
Now making nondeterministic finite automaton[10/10]
Now making deterministic finite automaton[10/10]
Now making triplet list[10/10]
---
-rw-r--r--    1 foo      users         182 May  9 16:03 fruit.dfa
-rw-r--r--    1 foo      users         626 May  9 16:03 fruit.dict
-rw-r--r--    1 foo      users          66 May  9 16:03 fruit.term

The generated "fruit.dfa" contains the finite automaton information, and "fruit.dict" contains the word dictionary, both in Julius format.
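
If the conversion is driven from a script, it can be wrapped as in the following Python sketch. The function names are hypothetical, and the sketch assumes that "mkdfa.pl" is on the PATH and is invoked with the prefix as its only argument, as in the run above. Since this document does not describe the exit status of mkdfa.pl, the wrapper also checks that the three output files exist afterwards.

```python
# Illustrative only: a thin wrapper around the "mkdfa.pl <prefix>" call shown above.
import subprocess
from pathlib import Path


def expected_outputs(prefix):
    """Files listed at the end of the example mkdfa.pl run for this prefix."""
    return [Path(f"{prefix}.{ext}") for ext in ("dfa", "dict", "term")]


def compile_grammar(prefix, mkdfa="mkdfa.pl"):
    """Run the grammar compiler on <prefix>.grammar and <prefix>.voca."""
    inputs = [Path(f"{prefix}.grammar"), Path(f"{prefix}.voca")]
    missing = [str(p) for p in inputs if not p.exists()]
    if missing:
        raise FileNotFoundError("missing input file(s): " + ", ".join(missing))
    # Assumption: mkdfa.pl is reachable on PATH under this name.
    subprocess.run([mkdfa, prefix], check=True)
    not_made = [str(p) for p in expected_outputs(prefix) if not p.exists()]
    if not_made:
        raise RuntimeError("compiler did not produce: " + ", ".join(not_made))
    return expected_outputs(prefix)
```

Calling `compile_grammar("fruit")` in the directory that holds fruit.grammar and fruit.voca would then return the paths of fruit.dfa, fruit.dict, and fruit.term.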

Checking the grammar

One easy way to check the coverage of your grammar is to generate word sequences from it. Looking at generated utterances makes it easier to see which utterance patterns the grammar covers than analyzing the grammatical rules themselves. You can check your grammar by looking for invalid utterances that are generated, or for required utterances that are never generated.

Julius provides a tool called "generate", which generates random sentences from a grammar. You can specify the number of sentences to generate with "-n", and output terminal (category) names instead of word instances with "-t" (the .term file is needed).

Below is an example of executing "generate" on the example grammar "fruit.*".

% generate fruit
Reading in dictionary...
31 words...done
Reading in DFA grammar...done
Mapping dict item <-> DFA terminal (category)...done
Reading in term file (optional)...done
8 categories, 31 words
DFA has 10 nodes and 18 arcs
-----
 <s> FILLER seven apples please </s>
 <s> banana </s>
 <s> FILLER I'llhave three oranges </s>
 <s> five bananas </s>
 <s> eight plums </s>
 <s> FILLER I'lltake seven plums </s>
 <s> FILLER banana </s>
 <s> ten oranges </s>
 <s> FILLER plum </s>
 <s> apple </s>
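
The idea behind this check can be sketched in a few lines of Python. This is not how the "generate" tool is implemented: as its log above shows, the real tool reads the compiled dictionary and DFA, whereas this sketch expands the rules of fruit.grammar directly. The names `RULES`, `WORDS`, `expand`, and `generate_sentences` are hypothetical.

```python
# Illustrative only: random top-down expansion of the fruit grammar rules.
import random

RULES = {
    "S": [["NS_B", "HMM", "SENT", "NS_E"], ["NS_B", "SENT", "NS_E"]],
    "SENT": [["TAKE_V", "FRUIT", "PLEASE"], ["TAKE_V", "FRUIT"],
             ["FRUIT", "PLEASE"], ["FRUIT"]],
    "FRUIT": [["NUM", "FRUIT_N"], ["FRUIT_N_1"]],
}

# Output strings per category, taken from fruit.voca (pronunciations omitted).
WORDS = {
    "NS_B": ["<s>"],
    "NS_E": ["</s>"],
    "HMM": ["FILLER"],
    "TAKE_V": ["I'lltake", "I'llhave"],
    "PLEASE": ["please"],
    "FRUIT_N_1": ["apple", "orange", "grape", "banana", "plum"],
    "FRUIT_N": ["apples", "oranges", "grapes", "bananas", "plums"],
    "NUM": ["one", "two", "three", "four", "five", "six",
            "seven", "eight", "nine", "ten", "eleven", "twelve"],
}


def expand(symbol, rng):
    """Expand a symbol into a list of words by picking rules at random."""
    if symbol in RULES:
        words = []
        for child in rng.choice(RULES[symbol]):
            words.extend(expand(child, rng))
        return words
    return [rng.choice(WORDS[symbol])]  # word category: pick one word


def generate_sentences(count, seed=None):
    """Return `count` random sentences allowed by the grammar."""
    rng = random.Random(seed)
    return [" ".join(expand("S", rng)) for _ in range(count)]


for sentence in generate_sentences(5):
    print(sentence)
```

Because the fruit grammar has no recursion, every expansion terminates. A grammar with a loop rule such as the one in Note1 would need a depth limit in a sketch like this.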

Executing Julius

Specify the .dfa file with "-dfa" and the .dict file with "-v". The remaining options are the usual Julius options. See the other Julius documents for details of all functions and options.
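
For scripted runs, the two grammar options can be assembled as in the Python sketch below. Only "-dfa" and "-v" come from this document. The executable name "julius" and the helper `julius_command` are assumptions, and the acoustic model options that a real run also needs are left to the caller via `extra_options`.

```python
# Illustrative only: builds an argument list; it does not start Julius.
import shlex


def julius_command(prefix, extra_options=(), executable="julius"):
    """Return the argument list for running Julius with a compiled grammar."""
    return [executable, "-dfa", f"{prefix}.dfa", "-v", f"{prefix}.dict", *extra_options]


# Print the command line for the fruit grammar in shell-quoted form.
print(shlex.join(julius_command("fruit")))
```

The returned list can be passed to `subprocess.run` once the acoustic model options for your setup have been added.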

Sample grammars

These are free examples of recognition grammars. They are provided only as examples of grammars for Julius; please note that AN ENGLISH ACOUSTIC MODEL IS NEEDED to run these grammars.

Other related tools

Several tools for developing recognition grammars are included in the Julius distribution. See "gramtool/00readme.txt" for their usage.
Copyright 2014 Julius development team