Multi-language text prediction with adaptive learning
Prediction by Partial Matching (PPM) is a character-based language modeling algorithm originally developed for data compression by Cleary and Witten (1984). PPM predicts the next character in a sequence by analyzing patterns in the preceding context, using variable-length contexts to balance specificity with robustness. Unlike large language models, it cannot generalize to text far outside its training data, but it is remarkably fast and memory-efficient on text that resembles what it was trained on.
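The variable-length backoff described above can be sketched in a few lines. This is a simplified illustration, not the library's implementation: it omits PPM's escape probabilities and blending, and simply backs off from the longest matching context to shorter ones.

```javascript
// Simplified PPM-style prediction (escape/blending omitted for brevity).
// Counts every context of length 0..maxOrder in the training text, then
// predicts by backing off from the longest matching context to shorter ones.
function trainCounts(text, maxOrder) {
  const counts = new Map(); // context string -> Map(nextChar -> count)
  for (let i = 0; i < text.length; i++) {
    for (let order = 0; order <= Math.min(maxOrder, i); order++) {
      const ctx = text.slice(i - order, i);
      if (!counts.has(ctx)) counts.set(ctx, new Map());
      const next = counts.get(ctx);
      next.set(text[i], (next.get(text[i]) || 0) + 1);
    }
  }
  return counts;
}

function predictNext(counts, history, maxOrder) {
  // Back off: try the longest context first, then progressively shorter ones.
  for (let order = Math.min(maxOrder, history.length); order >= 0; order--) {
    const ctx = history.slice(history.length - order);
    const next = counts.get(ctx);
    if (next && next.size > 0) {
      // Return the most frequent continuation seen after this context.
      return [...next.entries()].sort((a, b) => b[1] - a[1])[0][0];
    }
  }
  return null; // nothing trained yet
}
```

For example, after training on `'hello hello help'`, the context `'hel'` was followed by `'l'` twice and `'p'` once, so `predictNext(counts, 'hel', 3)` returns `'l'`.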
PPM gained prominence in assistive technology as the prediction engine powering Dasher, an innovative text-entry interface developed by the Inference Group at the University of Cambridge. Dasher uses PPM's probabilistic predictions to enable efficient text entry through continuous gestures, making it particularly valuable for users with motor impairments.
This implementation is based on the PPM language model from Google Research's JSLM project (ppm_language_model.js). This Node.js library adapts the Google Research implementation, adds word-level prediction and error-tolerance features, and is published as an npm package for easy integration into web and Node.js applications.
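One way word-level prediction can be layered on top of a character model is to rank lexicon entries that extend the typed prefix. The sketch below uses a plain frequency-ranked lexicon as a stand-in for the model's scores; the library's actual API and scoring may differ, so treat the function name and signature as illustrative.

```javascript
// Illustrative word completion over a frequency lexicon (not the package API).
// lexicon: Map(word -> frequency). Returns matching words, most frequent first.
function completeWord(prefix, lexicon) {
  return [...lexicon.entries()]
    .filter(([word]) => word.startsWith(prefix))
    .sort((a, b) => b[1] - a[1])
    .map(([word]) => word);
}
```

With a lexicon of `hello` (10), `help` (7), and `hat` (3), the prefix `'he'` yields `['hello', 'help']`.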
In this demo, we integrate PPMPredictor with WorldAlphabets, an npm package providing frequency lists and keyboard layouts for 100+ languages. This demonstrates PPM's ability to work seamlessly across different languages and writing systems, making it ideal for multilingual assistive technology applications.
Override the lexicon and training data for the current language, or create your own custom language (e.g., Klingon or Elvish).
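A custom language override might look like the fragment below. The field names here are hypothetical, chosen to illustrate the idea; check the package documentation for the actual option shape.

```javascript
// Hypothetical shape of a custom-language definition -- the real option
// names in this package may differ; consult its README.
const klingon = {
  name: 'tlh',                                // language code
  lexicon: ['tlhIngan', 'Hol', "Qapla'"],     // custom word list
  trainingText: "tlhIngan Hol vIjatlh",       // seed text for the PPM model
};
```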
When adaptive learning is enabled, new words you type are learned automatically.
When enabled, the predictor considers physical key proximity when matching typos. For example, typing "heklo" instead of "hello" is more likely than "hezlo" because 'k' and 'l' are adjacent on a QWERTY keyboard. Learn more →
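The proximity idea can be made concrete with approximate key coordinates: confusions between physically adjacent keys get a smaller distance (and thus a smaller typo penalty) than confusions across the keyboard. This is a minimal sketch using hard-coded QWERTY rows; the real implementation may instead use WorldAlphabets layout data.

```javascript
// Keyboard-aware typo scoring sketch: approximate QWERTY key coordinates,
// with each row offset by half a key to mimic the physical stagger.
const QWERTY = {};
['qwertyuiop', 'asdfghjkl', 'zxcvbnm'].forEach((row, y) => {
  [...row].forEach((key, x) => { QWERTY[key] = [x + y * 0.5, y]; });
});

// Euclidean distance between two keys' positions on the layout.
function keyDistance(a, b) {
  const [ax, ay] = QWERTY[a];
  const [bx, by] = QWERTY[b];
  return Math.hypot(ax - bx, ay - by);
}
```

For the example above, `keyDistance('k', 'l')` is much smaller than `keyDistance('z', 'l')`, so "heklo" is treated as a more plausible typo for "hello" than "hezlo".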