Patent full name: System and method for joining morphemes into lexical units and their transcription into hiragana syllables and Latin characters of a Japanese text [Bunparser].

Active from/to: 2016 – 2018

Inventors: Dr. Alessandro Mantelli, Prof. Marcella Mariotti

Owner: Ca’ Foscari University of Venice

What is this project about?

Bunparser (System and method for joining morphemes into lexical units and related transcription into hiragana syllables and Latin characters of a Japanese text) was devised within the JaLea project (Mariotti, Mantelli 2016 – 2018) and patented (No. IT201900002235, 15/2/2021).

Implementing the algorithm underlying this patent makes it possible to divide a Japanese text into long lexical units (LUW, Long Unit Words) and to assign each unit its correct transcription in hiragana and rōmaji.
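As a rough illustration of what joining morphemes into longer units can look like, the following Python sketch merges short morphemes into a longer unit and concatenates their hiragana readings. It is not the patented algorithm: the `Morpheme` class, the tag names and the single joining rule (suffixes and auxiliaries attach to the preceding morpheme) are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class Morpheme:
    surface: str  # text as written
    reading: str  # hiragana reading
    pos: str      # coarse part-of-speech tag (hypothetical tag set)


# Assumed rule for this sketch: these tags attach to the preceding morpheme.
ATTACHING_POS = {"suffix", "auxiliary"}


def join_morphemes(morphemes):
    """Merge attaching morphemes into the unit that precedes them."""
    units = []
    for m in morphemes:
        if units and m.pos in ATTACHING_POS:
            prev = units[-1]
            # Extend the previous unit; it keeps the tag of its head morpheme.
            units[-1] = Morpheme(prev.surface + m.surface,
                                 prev.reading + m.reading,
                                 prev.pos)
        else:
            units.append(Morpheme(m.surface, m.reading, m.pos))
    return units


# 学生たちが来ました, pre-segmented by hand into short morphemes.
sample = [
    Morpheme("学生", "がくせい", "noun"),
    Morpheme("たち", "たち", "suffix"),
    Morpheme("が", "が", "particle"),
    Morpheme("来", "き", "verb"),
    Morpheme("まし", "まし", "auxiliary"),
    Morpheme("た", "た", "auxiliary"),
]

units = join_morphemes(sample)
for u in units:
    print(u.surface, u.reading)
```

With this hand-segmented input the six morphemes collapse into three units: 学生たち, が and 来ました. A real system would need a much richer rule set and a morphological analyser to supply the initial segmentation.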

Bunparser also correctly transcribes into hiragana and rōmaji numerical values written in Japanese or Arabic numerals, together with their classifiers, a functionality not normally present in other morphological analysers such as MeCab, CaboCha, JUMAN and ChaSen.
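Numerals followed by a classifier are hard to transcribe because the reading of both parts can change depending on the number: with the classifier 本, 1 is read いっぽん, 2 is にほん and 3 is さんぼん. The Python sketch below shows the problem for that single classifier and the numbers 1 to 10, using a plain lookup table. The table-based approach and the function names are assumptions for illustration and say nothing about how Bunparser itself is implemented.

```python
# Readings of 1-10 with the classifier 本 (long, thin objects),
# stored as (hiragana, rōmaji) pairs.
HON_READINGS = {
    1: ("いっぽん", "ippon"),
    2: ("にほん", "nihon"),
    3: ("さんぼん", "sanbon"),
    4: ("よんほん", "yonhon"),
    5: ("ごほん", "gohon"),
    6: ("ろっぽん", "roppon"),
    7: ("ななほん", "nanahon"),
    8: ("はっぽん", "happon"),
    9: ("きゅうほん", "kyūhon"),
    10: ("じゅっぽん", "juppon"),
}

KANJI_NUMERALS = {
    "一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
    "六": 6, "七": 7, "八": 8, "九": 9, "十": 10,
}


def parse_number(text):
    """Parse a single kanji numeral (1-10) or an Arabic number."""
    if text in KANJI_NUMERALS:
        return KANJI_NUMERALS[text]
    # int() also accepts full-width digits such as "３".
    return int(text)


def transcribe_hon(token):
    """Return (hiragana, rōmaji) for a token such as "3本" or "三本"."""
    if not token.endswith("本"):
        raise ValueError("this sketch only handles the classifier 本")
    number = parse_number(token[:-1])
    if number not in HON_READINGS:
        raise ValueError("this sketch only covers the numbers 1 to 10")
    return HON_READINGS[number]


for token in ["1本", "三本", "６本"]:
    print(token, *transcribe_hon(token))
```

The same number therefore yields the same transcription whether it is written in kanji, ASCII digits or full-width digits. Covering further classifiers and larger numbers would require more tables or general sound-change rules.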