Research Products
We release some of our research results to the public as open-source software, in cooperation with other research institutions. Many researchers use them as research software. The following software is currently publicly available.
Large Vocabulary ASR Engine Julius
“Julius” is an open-source, high-performance, general-purpose large-vocabulary continuous speech recognition engine for the research and development of speech recognition systems. Continuous speech recognition of tens of thousands of words can be performed in real time on a small device. It is highly versatile and can be applied to a wide range of applications by recombining modules such as phonetic dictionaries, language models, and acoustic models. The functionality is provided in a library and can be embedded in applications.
Julius is a compact and versatile speech recognition engine that is used by many research institutes and companies worldwide. Since its release in 1996, it has had more than 410,000 downloads and nearly 6 million page views, making it the de facto standard for universities in Japan.
You can get the full source code, pre-compiled binaries, a dictation kit for Japanese, an auto-segmentation (alignment) kit, a grammar construction kit, and other related tools and documentation from its GitHub site.
MMDAgent-EX: An Open-Source Research and Development Platform for Spoken Dialogue System, Multimodal Dialogue, and Avatar Communication Using CG Avatars
MMDAgent-EX is an open-source research and development platform for voice dialogue, multimodal dialogue, and avatar communication using CG avatars. It evolved from MMDAgent, a voice interaction construction toolkit that was in use from April 2011 through September 2022 in applications such as the bidirectional voice guide digital signage near the main gate of the university and the JST Strategic Creative Research Advancement Project (CREST). The platform has been enhanced and extended for research and development purposes. It has been improved in all aspects and offers high extensibility, including integration with ChatGPT and sensor data, external collaborations, and control from other applications. It supports multiple operating systems, including Windows, macOS, and Ubuntu. It was released as open-source software in December 2023.
The official website provides tutorials, usage explanations, and technical documentation.
MMDAgent: voice interaction building toolkit
MMDAgent is a toolkit for building speech interaction and speech dialogue systems. It is highly integrated with speech recognition and speech synthesis technologies, enabling fast, lightweight conversations with CG agents. The 3D-CG rendering module is compatible with existing CG software (MikuMikuDance) and can be combined with interactive scripts to create expressive interactions.
It is open source, including all modules from audio to rendering, and runs on Windows, macOS, Linux, and Android. Modules are integrated through simple messages, and the toolkit can be easily extended by adding plug-ins.
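As a rough illustration of what message-based module integration can look like, the following Python sketch implements a toy in-process message bus. It is not MMDAgent's actual API: the `MessageBus` class, the pipe-delimited `TYPE|arg1|arg2` layout, and the message names used here are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[List[str]], None]


class MessageBus:
    """Toy in-process bus (hypothetical): modules register handlers by message type."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, msg_type: str, handler: Handler) -> None:
        self._handlers[msg_type].append(handler)

    def publish(self, message: str) -> int:
        """Split 'TYPE|arg1|arg2' (assumed layout) and pass the args to each subscriber.

        Returns the number of handlers that received the message.
        """
        msg_type, *args = message.split("|")
        handlers = self._handlers.get(msg_type, [])
        for handler in handlers:
            handler(args)
        return len(handlers)


bus = MessageBus()
spoken: List[str] = []

# Hypothetical "synthesis" module: records the text it is asked to speak.
bus.subscribe("SYNTH_START", lambda args: spoken.append(args[-1]))


# Hypothetical "dialogue" module: answers a recognized word with a synthesis request.
def on_recognized(args: List[str]) -> None:
    if args and args[0] == "hello":
        bus.publish("SYNTH_START|agent|normal|Hello there")


bus.subscribe("RECOG_EVENT_STOP", on_recognized)

bus.publish("RECOG_EVENT_STOP|hello")
print(spoken)
```

Because modules only exchange strings, a new module in this sketch needs nothing more than a `subscribe` call, which is the property that makes a plug-in style of extension straightforward.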
Since its presentation and release at CEATEC in 2011, it has been used by many researchers and general users, and has had a significant impact, including the launch of [CREST’s project](http://www.udialogue.org/).
HELEN: dialogue builder and debugger for MMDAgent
HELEN is an extension package for the Atom editor to edit MMDAgent’s dialogue scenario files (fst files). It has the following three functions.
- Helping to edit FST files (displaying graphs of dialogue flows, automatic checking of recognition dictionaries, etc.)
- Real-time debugging of MMDAgent (real-time visualization of state transitions, sending arbitrary messages)
- Feedback by operation log (MMDAgent operation log → analysis → display feedback in dialogue scenario)
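The kind of analysis behind a dialogue-flow graph can be sketched in Python as follows. This is not HELEN's implementation. It assumes a simplified scenario format in which each non-comment line has four whitespace-separated fields (current state, next state, condition, command) and no field contains spaces. The function names, message names, and sample scenario are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, List, Tuple

# One outgoing transition: (next_state, condition, command)
Edge = Tuple[str, str, str]


def parse_transitions(text: str) -> Dict[str, List[Edge]]:
    """Build a state graph from lines of 'state next condition command' (assumed format)."""
    graph: Dict[str, List[Edge]] = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
        state, nxt, condition, command = fields
        graph[state].append((nxt, condition, command))
    return dict(graph)


def unreachable_states(graph: Dict[str, List[Edge]], start: str = "0") -> List[str]:
    """Return states that a breadth-first search from `start` never visits."""
    all_states = set(graph)
    for edges in graph.values():
        all_states.update(nxt for nxt, _, _ in edges)
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt, _, _ in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(all_states - seen)


# Hypothetical scenario: state 5 has an outgoing edge but nothing leads to it.
sample = """
# state next condition command
0 1 RECOG_EVENT_STOP|hello SYNTH_START|agent|normal|Hello
1 0 SYNTH_EVENT_STOP|agent <eps>
5 0 TIMER_EVENT_STOP|idle <eps>
"""

graph = parse_transitions(sample)
print(unreachable_states(graph))
```

The adjacency structure returned by `parse_transitions` is the same data a graph view would draw, and a check such as `unreachable_states` is one example of the automatic feedback an editor extension could surface while a scenario is being written.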
This software was developed and published in 2018 as a result of research by members of our laboratory. It is available from its GitHub site.