[ Japanese | English | Back to top ]
Last update: "2005/09/29 04:03:21"



mkbingram - make binary N-gram from two arpa LMs


mkbingram 2gram.arpa rev3gram.arpa bingram


mkbingram makes a binary N-gram file for Julius from word 2-gram and reverse word 3-gram LMs in ARPA standard for- mat. Using the binary file, the initial startup of Julius becomes much faster. Note that the word 2-gram and reverse word 3-gram should be trained in the same corpus, same parameters (i.e. cut- off thresholds) and have the same vocabulary. mkbingram can read gzipped ARPA file. mkbingram that comes with Julius version 3.5 and later can generate more size-optimized binary N-gram by using 24bit index instead of 32bit and 2-gram backoff data compres- sion. The byte order was also changed from 3.5 to use the system's native order by default. Although the old binary N-gram can be directly read by Julius, (in that case Julius performs on-line conversion), you can also update your binary N-gram using mkbingram using -d option. Please note that binary N-gram file converted by mkbingram of version 3.5 and later cannot be read by Julius/Julian 3.4.2 and earlier.


2gram.arpa input word 2-gram file in ARPA standard format. rev3gram.arpa input reverse word 3-gram in ARPA standard format. -d old_bingram input binary N-gram file (for conversion from old format). bingram output binary N-gram file.


Convert ARPA files to binary format: % mkbingram ARPA_2gram ARPA_rev_3gram outfile Convert old binary N-gram file to new format: % mkbingram -d old_bingram new_bingram


You can specify the generated binary N-gram file on Julius/Julian using option "-d".




Copyright (c) 1991-2004 Kyoto University, Japan Copyright (c) 2000-2004 Nara Institute of Science and Technology, Japan Copyright (c) 2005 Nagoya Institute of Technology, Japan


LEE Akinobu (Nagoya Institute of Technology, Japan) contact: julius@kuis.kyoto-u.ac.jp


Same as Julius. LOCAL MKBINGRAM(1)

$Id: mkbingram.html.en,v 2007/01/10 08:01:57 kudravka_ Exp $