Chapter 1. Overview

Table of Contents

System Requirement
Things needed to run speech recognition
Tools and libraries in the distribution

This chapter gives a general overview of Julius: its system requirements, the models it needs, and the contents of the distribution package.

System Requirement

Julius is developed on Linux and Windows. It also runs on many other Unix-like systems such as Solaris, FreeBSD and Mac OS X. Since Julius is written in pure C and has few dependencies on external libraries, it can be ported to other platforms. Developers have ported Julius to Windows Mobile, iPhone and other microprocessor environments.

Julius supports recognition of live speech input via an audio capture device on all the supported operating systems above. See the "Audio Input" chapter for the requirements for live input on each OS.

Things needed to run speech recognition

To perform speech recognition with Julius, you need to prepare "models" for the target language and task. The models define the linguistic properties of the target language: the recognition unit, the acoustic properties of each unit, and the linguistic constraints on how units connect. Typically the unit is a word, and you should give Julius the following models:

  • "Word dictionary", which defines the vocabulary. It defines the words to be recognized and their pronunciations as phoneme sequences.

  • "Language model", which defines syntax-level rules that constrain the connections between words. It specifies the acceptable or preferable sentence patterns. It can be either a rule-based grammar or a probabilistic model such as a word N-gram. The language model is not needed for isolated word recognition.

  • "Acoustic model", which is a stochastic model of input waveform patterns, typically per phoneme. Julius uses Hidden Markov Models (HMM) for acoustic modeling.
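
As a rough illustration of what a word dictionary holds, the sketch below parses a few entries that map each word to its phoneme sequence. The entry layout used here (word, an optional bracketed output string, then phonemes), the phoneme symbols and the function name `parse_dictionary` are assumptions made for this example; the exact format Julius accepts is described in its dictionary documentation.

```python
def parse_dictionary(text):
    """Parse simplified dictionary lines into {word: [pronunciation, ...]}.

    Each pronunciation is a list of phoneme symbols. The line layout is an
    assumption for illustration, not the authoritative Julius format.
    """
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, rest = fields[0], fields[1:]
        # Skip an optional bracketed output string such as "[hello]".
        if rest and rest[0].startswith("[") and rest[0].endswith("]"):
            rest = rest[1:]
        # A word may have several pronunciations, so keep a list of them.
        entries.setdefault(word, []).append(rest)
    return entries


# Made-up sample entries with made-up phoneme symbols.
SAMPLE = """
hello [hello] hh ax l ow
world [world] w er l d
"""

lexicon = parse_dictionary(SAMPLE)
print(lexicon["hello"])
```

Keeping a list of pronunciations per word lets the same word appear on several lines with different phoneme sequences.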

Since Julius itself is a language-independent decoding program, it can recognize a new language if given a dictionary, language model and acoustic model for that language.

Julius is purely a speech decoder that computes the most likely sentence for a given input, so recognition accuracy depends largely on the models.
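
The idea of "computing the most likely sentence" can be pictured as choosing the candidate with the highest combined model score. The sketch below is a toy illustration only and is not Julius's search algorithm: the candidate sentences, their log scores, and the `lm_weight` parameter are all made up for this example.

```python
def best_sentence(candidates, lm_weight=1.0):
    """Return the candidate with the highest acoustic + weighted LM log score."""
    return max(
        candidates,
        key=lambda c: c["acoustic_logp"] + lm_weight * c["lm_logp"],
    )


# Hypothetical hypotheses with invented log scores.
candidates = [
    {"words": ["recognize", "speech"],
     "acoustic_logp": -120.0, "lm_logp": -4.0},
    {"words": ["wreck", "a", "nice", "beach"],
     "acoustic_logp": -118.5, "lm_logp": -9.0},
]

print(best_sentence(candidates)["words"])
print(best_sentence(candidates, lm_weight=0.0)["words"])
```

With the language model score ignored (`lm_weight=0.0`), the second candidate wins on acoustic score alone, which is one way to see why the result depends so heavily on the models supplied.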

Julius accepts acoustic models in HTK ASCII format, pronunciation dictionaries in a format close to the HTK format, and word 3-gram language models in ARPA standard format (a forward 2-gram and a reverse 3-gram trained from the same corpus).
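
For readers unfamiliar with the ARPA format, it is a plain-text layout that lists N-grams by order, each with a log10 probability and an optional back-off weight. The following is a minimal sketch of a reader for that layout, assuming a well-formed file; the function name `parse_arpa` and the tiny sample model are invented for illustration and are not part of Julius.

```python
def parse_arpa(text):
    """Parse ARPA text into {order: {word_tuple: (log10_prob, backoff)}}.

    backoff is None when the entry has no back-off weight.
    """
    models = {}
    order = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks and the header that only lists n-gram counts.
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue
        if line == "\\end\\":
            break
        # Section marker such as "\2-grams:" starts a new n-gram order.
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:line.index("-")])
            models[order] = {}
            continue
        if order is None:
            continue
        fields = line.split()
        logp = float(fields[0])
        words = tuple(fields[1:1 + order])
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else None
        models[order][words] = (logp, backoff)
    return models


# Made-up 2-gram model in ARPA layout.
SAMPLE_ARPA = r"""
\data\
ngram 1=3
ngram 2=2

\1-grams:
-0.5 <s> -0.3
-0.7 hello -0.2
-0.9 </s>

\2-grams:
-0.3 <s> hello
-0.4 hello </s>

\end\
"""

models = parse_arpa(SAMPLE_ARPA)
print(models[2][("<s>", "hello")])
```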

You can get standard Japanese models for free from the Julius web site, and a wider variety of models is distributed by the Continuous Speech Recognition Consortium, Japan. For more details, please contact csrc@astem.or.jp.

For English, we currently have a sample English acoustic model trained from the WSJ database. Under the license of the database, this model CANNOT be used to develop or test products for commercialization, nor can it be used in any commercial product or for any commercial purpose. Also, its performance is not very good. Please contact us for further information.

More up-to-date information can be obtained on the Web page.

Tools and libraries in the distribution

Julius is distributed primarily as a source archive; binary packages for Linux and Windows are also available. The source archive contains the full program sources of Julius and related tools, release information, a sample configuration file, sample plugin source code and Unix online manuals. The binary packages are based on the source archive and contain pre-compiled executables and related files extracted from it. You can also get a development snapshot via CVS.

These tools are included:

  • julius --- main speech recognition software Julius

  • adinrec --- audio detection/recording check tool

  • adintool --- a tool to record / split / send / receive audio streams

  • jcontrol --- a sample module client written in C

  • jclient.pl --- a sample module client written in Perl

  • mkbingram --- convert ARPA N-gram file into binary format

  • mkbinhmm --- convert HTK ASCII hmmdefs file into binary format

  • mkbinhmmlist --- convert HMMList file into binary format

  • mkgshmm --- convert monophone HMM to GS HMM for Julius

  • mkss --- calculate average spectrum for spectral subtraction

  • Tools for language modeling --- mkdfa.pl, mkfa, dfa_determinize, dfa_minimize, accept_check, nextword, generate, generate-ngram, gram2sapixml.pl, yomi2voca.pl

On Linux, the following libraries, header files and scripts are also installed for development:

  • libsent.a --- Julius low-level library

  • libjulius.a --- Julius main library

  • include/sent/* --- headers for libsent

  • include/julius/* --- headers for libjulius

  • libsent-config, libjulius-config --- scripts to get required flags for compilation with libsent and libjulius
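
As a sketch of how the two config scripts might be used when building an application against the libraries, the helper below collects their output and assembles a compiler command line. It assumes the scripts accept the conventional `--cflags` and `--libs` options and are on the PATH; the helper names `config_flags` and `build_command`, the `cc` compiler name and the file names are hypothetical.

```python
import subprocess


def config_flags(script, option, run=subprocess.check_output):
    """Run e.g. `libjulius-config --cflags` and return the flags as a list.

    The --cflags / --libs options are assumed, not confirmed by this chapter.
    """
    output = run([script, option], text=True)
    return output.split()


def build_command(source, output, run=subprocess.check_output):
    """Assemble a compile-and-link command for a program using libjulius."""
    cflags, libs = [], []
    # libjulius comes first because it depends on libsent at link time.
    for script in ("libjulius-config", "libsent-config"):
        cflags += config_flags(script, "--cflags", run)
        libs += config_flags(script, "--libs", run)
    return ["cc", "-o", output, source] + cflags + libs
```

With the scripts installed, `build_command("myapp.c", "myapp")` would return an argument list that can be passed to `subprocess.run`. The `run` parameter exists only so the helper can be exercised without Julius installed.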