Installation notes for Aligner 2.4beta
======================================

Dr. Stefan Rapp, rapp@conante.com, 2006-08-14


Supported Systems
=================

Aligner should install on (at least) Linux and Windows computers (on
Windows via Cygwin, which provides a Linux-like API layer). It has been
installed on IRIX and Solaris before; this version, however, has only
been tested to install properly on Ubuntu 8.04 LTS and on Cygwin under
Windows XP. No special features are required, so expect this version to
install successfully on other flavours of Unix as well, most probably
also on OS X. We are happy to hear about any successful installation and
about any modifications that were found necessary.

For the installation, you may need to install further packages in your
Linux or Cygwin installation, e.g. flex, tcsh and gcc.


Prerequisites
=============

1. You need to install HTK first.

   You can get HTK from

       http://htk.eng.cam.ac.uk

   You will need to register in order to receive the code. Registration
   and download are free. Current version is 3.4.1 (testing was done
   with this version). There is also a compiled version available for
   Windows, but it has not yet been tested whether it works with the
   Aligner. (I would expect it to work, though; please tell me if you
   try.)

   Follow the HTK installation instructions. You can set most
   environment variables for compilation/installation by sourcing a
   file matching your hardware in the directory env. Under Cygwin, the
   Linux files are also appropriate.

   If you have trouble compiling HTKLib/HGraf or HTKTools/HSLab, simply
   remove these targets from HTKLib/makefile and HTKTools/makefile. The
   Aligner doesn't need them, and there are other, more suitable tools
   available for viewing signal and label files.

   The Aligner assumes that the HTKTools HVite, HCopy, HParse etc. are
   available in your path.

2. When installing Aligner under Windows, you will also need to install
   Cygwin first.
   You can get Cygwin from

       http://www.cygwin.com

   Make sure that you select the tcsh and flex packages; they might not
   be installed by default. Aligner also relies on gawk, sed, perl and
   sh being available, but these should be included in a standard
   install. Please send me any hints on installing Cygwin.

3. You will need a large-vocabulary pronunciation lexicon for reasonable
   transcriptions, or you can gradually build up your own pronunciation
   lexicon. Currently, Aligner uses a rather simple rule-based
   grapheme-to-phoneme conversion that might not be up to your
   transcription standards. At IMS and Sony, Celex (deu, eng) and BDLEX
   (fra) were used. You can purchase a Celex licence at LDC for $150 (it
   is #2 in the top 10 most sold corpora of LDC):

       http://www.ldc.upenn.edu

   If you send me evidence that you have a valid licence, I can send you
   the file celex.dict already in the appropriate format. Otherwise,
   contact me for the file format specification of .dict files so that
   you can integrate your preferred pronunciation lexicon.

   Automatic conversion from Celex is planned for a future release, as
   is a corpus-based grapheme-to-phoneme (G2P) conversion. For eng,
   there is a file describing the necessary steps for generating a
   dictionary from Celex sources, and a preliminary version of a
   corpus-based G2P.


Installing Aligner
==================

[0.] Make sure that you have HTK installed and available in your path.

[1.] Unpack the Aligner to a suitable directory:

         tar xf Aligner-2-4-beta.tgz

[2.] Compile the source code:

         cd Aligner/src
         make clean
         make

[3.] Install (copy) the binaries into Aligner/bin:

     Under Linux say:

         make install

     Under Cygwin say:

         make cygwininstall

     (Cygwin binaries have a .exe extension, so make install will fail.)

[4.]
Generate index files for the dictionaries

     Set your locale to C:

     Under bash say:

         export LANG=C

     Under (t)csh say:

         setenv LANG C

     You can check whether your locale is C by saying

         locale

     You should have at least LC_COLLATE set to "C" by now.

     Now sort your lexica and create the indices:

         cd ../dict
         make

     (You do not need to install them anywhere else. For the English
     G2P you need the Java compiler in your path.)

[5.] Set up your environment and path

     Aligner needs the following environment variables. Using export or
     setenv, set them to these values:

         ALIGNERHOME   ...your_chosen_path.../Aligner
         ALANG         deu

     It is recommended to add $ALIGNERHOME/bin/$ALANG to your PATH
     variable:

     under bash:

         export PATH=$ALIGNERHOME/bin/$ALANG:$PATH

     or, under (t)csh:

         setenv PATH ${ALIGNERHOME}/bin/${ALANG}:${PATH}

     Say

         cd
         Alignphones

     to see whether the Aligner is found in your path. (Under (t)csh,
     you might need to say 'rehash' first.)

     Take the time to recheck that the HTK tools are in your path as
     well (say, e.g., 'HVite' to view its usage message).

[6.] Aligner is ready to use. Enjoy. See USAGE on what to do with it.


Suggestions on how to improve these installation instructions are
always welcome. Please report any errors, inconsistencies or problems
with the installation to me via e-mail: rapp@conante.com. You can also
reach me by phone: +49-7426-933-883
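
The path checks asked for in steps [0.] and [5.] can also be scripted.
The following is only a minimal sketch and not part of the Aligner
distribution: the function name missing_tools is hypothetical, and the
tool list simply repeats the programs named in these notes (flex is
needed for compiling only). Adjust it to your setup.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Aligner: print every command
# name given as an argument that cannot be found in the current PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Programs named in these notes: HTK tools, Aligner itself, helpers.
missing=$(missing_tools HVite HCopy HParse Alignphones tcsh flex gawk sed perl)

if [ -n "$missing" ]; then
    echo "Not found in PATH:"
    echo "$missing"
else
    echo "All required tools were found in PATH."
fi
```

Run it with sh after setting ALIGNERHOME, ALANG and PATH as described
in step [5.]; any name it prints still needs to be installed or added
to your path.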