The Julius book

Akinobu LEE

The contents of this book are based on Julius rev. 4.1.5.

Revision History
Revision 1.0.3 2010/06/04
Update for 4.1.5: added MSVC and updated cygwin in chapter 2, added notes on MFCC feature settings in chapter 4.
Revision 1.0.1 2009/02/12
Minor update for 4.1.2 to add "-mapunk" option.
Revision 1.0.0 2008/09/30
Under construction, released with Julius rev. 4.1.0.

Table of Contents

Preface
1. Overview
System Requirements
Things needed to run speech recognition
Tools and libraries in the distribution
2. Installation
Install from binary package
Compile from source
Configuration options
libsent options
libjulius options
julius options
Building Julius on various platforms
Linux
Windows - cygwin
Windows - mingw
Windows - Microsoft Visual C++
3. Audio Input
Audio Format
Number of bits
Number of channels
Sampling Rates
File input
Supported format
Live microphone input
Preparing microphone input
Notes for supported OS / devices
About Input Delay
Network and Socket inputs
original
esd
standard
DATLINK/NetAudio
Feature vector file input
Audio I/O Extension by Plugin
A. Major Changes
Changes from 4.0 to 4.1
Changes from 3.5.3 to 4.0
Changes from 3.5 to 3.5.3
Changes from 3.4.2 to 3.5
B. Options
Julius application option
Global options
Audio input
Speech detection by level and zero-cross
Input rejection
Gaussian mixture model / GMM-VAD
Decoding switches
Misc. options
Instance declaration for multi decoding
Language model (-LM)
N-gram
Grammar
Isolated word
User-defined LM
Misc. LM options
Acoustic model and feature analysis (-AM) (-AM_GMM)
Acoustic HMM
Speech analysis
Normalization
Front-end processing
Misc. AM options
Recognition process and search (-SR)
1st pass parameters
2nd pass parameters
Short-pause segmentation / decoder-VAD
Word lattice / confusion network output
Multi-gram / multi-dic recognition
Forced alignment
Misc. search options
I. Reference Manuals
julius — open source multi-purpose LVCSR engine
jcontrol — a sample module client written in C
jclient.pl — sample client for module mode (perl version)
mkbingram — make binary N-gram from ARPA N-gram file
mkbinhmm — convert HMM definition file in HTK ascii format to Julius binary format
mkbinhmmlist — convert HMMList file into binary format
adinrec — record audio device and save one utterance to a file
adintool — a tool to record / split / send / receive audio streams
mkss — calculate average spectrum for spectral subtraction
mkgshmm — convert monophone HMM to GS HMM for Julius
generate-ngram — random sentence generator from N-gram
mkdfa.pl — grammar compiler
generate — random sentence generator from a grammar
nextword — display next predicted words (in reverse order)
accept_check — check whether a grammar accepts / rejects given word sequences
dfa_minimize — minimize a DFA grammar network
dfa_determinize — determinize an NFA grammar network
gram2sapixml.pl — convert Julius grammar to SAPI XML grammar format
C. License terms