Multi-language text prediction with adaptive learning
Prediction by Partial Matching (PPM) is a character-based language modeling algorithm originally developed for data compression by Cleary and Witten (1984). PPM predicts the next character in a sequence by analyzing patterns in the preceding context, using variable-length contexts to balance specificity with robustness. Unlike large language models, it cannot generalize to text far outside its training data, but it is remarkably fast and memory-efficient on text that resembles what it was trained on.
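The variable-length backoff described above can be sketched in a few lines. This is a simplified illustration, not the library's implementation: it omits PPM's escape probabilities and blending, and simply backs off from the longest matching context to shorter ones.

```javascript
// Simplified PPM-style prediction (escape/blending omitted for brevity).
// Counts every context of length 0..maxOrder in the training text, then
// predicts by backing off from the longest matching context to shorter ones.
function trainCounts(text, maxOrder) {
  const counts = new Map(); // context string -> Map(nextChar -> count)
  for (let i = 0; i < text.length; i++) {
    for (let order = 0; order <= Math.min(maxOrder, i); order++) {
      const ctx = text.slice(i - order, i);
      if (!counts.has(ctx)) counts.set(ctx, new Map());
      const next = counts.get(ctx);
      next.set(text[i], (next.get(text[i]) || 0) + 1);
    }
  }
  return counts;
}

function predictNext(counts, history, maxOrder) {
  // Back off: try the longest context first, then progressively shorter ones.
  for (let order = Math.min(maxOrder, history.length); order >= 0; order--) {
    const ctx = history.slice(history.length - order);
    const next = counts.get(ctx);
    if (next && next.size > 0) {
      // Return the most frequent continuation seen after this context.
      return [...next.entries()].sort((a, b) => b[1] - a[1])[0][0];
    }
  }
  return null; // nothing trained yet
}
```

For example, after training on `'hello hello help'`, the context `'hel'` was followed by `'l'` twice and `'p'` once, so `predictNext(counts, 'hel', 3)` returns `'l'`.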
PPM gained prominence in assistive technology as the prediction engine powering Dasher, an innovative text-entry interface developed by the Inference Group at the University of Cambridge. Dasher uses PPM's probabilistic predictions to enable efficient text entry through continuous gestures, making it particularly valuable for users with motor impairments.
This implementation is based on the PPM language model from Google Research's JSLM project (ppm_language_model.js). This Node.js library adapts the Google Research implementation, adds word-level prediction and error-tolerance features, and is published as an npm package for easy integration into web and Node.js applications.
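One way word-level prediction can be layered on top of a character model is to rank lexicon entries that extend the typed prefix. The sketch below uses a plain frequency-ranked lexicon as a stand-in for the model's scores; the library's actual API and scoring may differ, so treat the function name and signature as illustrative.

```javascript
// Illustrative word completion over a frequency lexicon (not the package API).
// lexicon: Map(word -> frequency). Returns matching words, most frequent first.
function completeWord(prefix, lexicon) {
  return [...lexicon.entries()]
    .filter(([word]) => word.startsWith(prefix))
    .sort((a, b) => b[1] - a[1])
    .map(([word]) => word);
}
```

With a lexicon of `hello` (10), `help` (7), and `hat` (3), the prefix `'he'` yields `['hello', 'help']`.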
In this demo, we integrate PPMPredictor with WorldAlphabets, an npm package providing frequency lists and keyboard layouts for 100+ languages. This demonstrates PPM's ability to work seamlessly across different languages and writing systems, making it ideal for multilingual assistive technology applications.
Override the lexicon and training data for the current language, or create your own custom language (e.g., Klingon or Elvish).
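A custom language override might look like the fragment below. The field names here are hypothetical, chosen to illustrate the idea; check the package documentation for the actual option shape.

```javascript
// Hypothetical shape of a custom-language definition -- the real option
// names in this package may differ; consult its README.
const klingon = {
  name: 'tlh',                                // language code
  lexicon: ['tlhIngan', 'Hol', "Qapla'"],     // custom word list
  trainingText: "tlhIngan Hol vIjatlh",       // seed text for the PPM model
};
```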
When adaptive learning is enabled, new words you type are learned automatically.
When enabled, the predictor considers physical key proximity when matching typos. For example, typing "heklo" instead of "hello" is more likely than "hezlo" because 'k' and 'l' are adjacent on a QWERTY keyboard. Learn more →
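The proximity idea can be made concrete with approximate key coordinates: confusions between physically adjacent keys get a smaller distance (and thus a smaller typo penalty) than confusions across the keyboard. This is a minimal sketch using hard-coded QWERTY rows; the real implementation may instead use WorldAlphabets layout data.

```javascript
// Keyboard-aware typo scoring sketch: approximate QWERTY key coordinates,
// with each row offset by half a key to mimic the physical stagger.
const QWERTY = {};
['qwertyuiop', 'asdfghjkl', 'zxcvbnm'].forEach((row, y) => {
  [...row].forEach((key, x) => { QWERTY[key] = [x + y * 0.5, y]; });
});

// Euclidean distance between two keys' positions on the layout.
function keyDistance(a, b) {
  const [ax, ay] = QWERTY[a];
  const [bx, by] = QWERTY[b];
  return Math.hypot(ax - bx, ay - by);
}
```

For the example above, `keyDistance('k', 'l')` is much smaller than `keyDistance('z', 'l')`, so "heklo" is treated as a more plausible typo for "hello" than "hezlo".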