We release some of our research results to the public as open-source software, in cooperation with other research institutions. Many researchers use these tools as research software. The following software is currently publicly available.
Large Vocabulary ASR Engine Julius
“Julius” is an open-source, high-performance, general-purpose large-vocabulary continuous speech recognition engine for the research and development of speech recognition systems. It can perform continuous speech recognition over a vocabulary of tens of thousands of words in real time on a small device. It is highly versatile and can be adapted to a wide range of applications by swapping modules such as phonetic dictionaries, language models, and acoustic models. Its functionality is provided as a library, so it can be embedded in applications.
Julius is a compact and versatile speech recognition engine that is used by many research institutes and companies worldwide. Since its release in 1996, it has had more than 410,000 downloads and nearly 6 million page views, making it the de facto standard for universities in Japan.
You can get the full source code, pre-compiled binaries, a dictation kit for Japanese, an auto-segmentation (alignment) kit, a grammar construction kit, and other related tools and documentation from its GitHub site.
MMDAgent-EX: an Open Dialogue Agent Platform
MMDAgent-EX is a mobile dialogue agent application. It can play any MMDAgent dialogue content on Windows, macOS, Linux, iOS, and Android devices. This app is upwardly compatible with MMDAgent and includes various enhancements such as log data collection, content protection, and support for PMX models, as well as a mechanism for distributing interactive content over the web. A beta version was created and released in 2019 for the study of voice interaction in cloud environments, data collection, and the creation of conversational agents.
MMDAgent: voice interaction building toolkit
MMDAgent is a toolkit for building speech interaction and speech dialogue systems. It is highly integrated with speech recognition and speech synthesis technologies, enabling fast, lightweight conversations with CG agents. The 3D-CG rendering module is compatible with existing CG software (MikuMikuDance) and can be combined with interactive scripts to create expressive interactions.
It is open-source, including all modules from audio to rendering, and runs on Windows, macOS, Linux, and Android. Modules are integrated through simple message passing, which keeps it easy to use, and it can be extended by adding plug-ins.
Since its presentation and release at CEATEC in 2011, it has been used by many researchers and general users, and has had a significant impact, including the launch of [CREST’s project](http://www.udialogue.org/).
HELEN: dialogue builder and debugger for MMDAgent
HELEN is an extension package for the Atom editor for editing MMDAgent’s dialogue scenario files (FST files). It provides the following three functions.
- Helping to edit FST files (displaying graphs of dialogue flows, automatic checking of recognition dictionaries, etc.)
- Real-time debugging of MMDAgent (real-time visualization of state transitions, sending arbitrary messages)
- Feedback from operation logs (MMDAgent operation log → analysis → feedback displayed in the dialogue scenario)
This software was developed and published in 2018 as a result of research by members of our laboratory. It is available from its GitHub site.
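To give a rough idea of what a dialogue-flow graph view works from, the sketch below parses FST-style transition lines into an adjacency map. It is not HELEN's implementation. It assumes a simplified scenario format in which each non-comment line has four whitespace-separated fields (current state, next state, input condition, output command) and `#` starts a comment. The names `parse_fst` and `reachable_states`, and the sample scenario with its message strings, are hypothetical and used only for illustration.

```python
from collections import defaultdict


def parse_fst(text):
    """Parse simplified FST-style lines into
    {state: [(next_state, condition, command), ...]}.

    Assumption: four whitespace-separated fields per line; the last
    field (the command) may itself contain spaces.
    """
    graph = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 3)  # at most 4 fields
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {raw!r}")
        state, next_state, condition, command = fields
        graph[state].append((next_state, condition, command))
    return dict(graph)


def reachable_states(graph, start):
    """Return the set of states reachable from `start` (depth-first walk)."""
    seen, stack = set(), [start]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        stack.extend(nxt for nxt, _, _ in graph.get(state, []))
    return seen


# Hypothetical scenario; the message strings are illustrative only.
SAMPLE = """\
# greet, wait for the utterance to finish, then respond to input
0 1 <eps> SYNTH_START|mei|mei_voice_normal|Hello
1 2 SYNTH_EVENT_STOP|mei <eps>
2 1 RECOG_EVENT_STOP|hello SYNTH_START|mei|mei_voice_normal|Hi
"""

if __name__ == "__main__":
    graph = parse_fst(SAMPLE)
    for state, edges in graph.items():
        for next_state, condition, command in edges:
            print(f"{state} -> {next_state}  [{condition}] / {command}")
```

A structure like this is enough to draw the transitions as a graph or to flag states that can never be reached from the initial state, which is the kind of check an editor extension could run while a scenario is being edited.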