🔮 PPM Predictor

Multi-language text prediction with adaptive learning

About PPM (Prediction by Partial Matching)

Prediction by Partial Matching (PPM) is a character-based language modeling algorithm originally developed for data compression by Cleary and Witten (1984). PPM predicts the next character in a sequence by analyzing patterns in the preceding context, using variable-length contexts to balance specificity with robustness. Unlike Large Language Models, it can't generalize to text unlike anything it was trained on, but it is remarkably efficient when the input resembles its training data.
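
To make the idea of variable-length contexts concrete, here is a minimal sketch of a PPM-style character model. It is not the JSLM or PPMPredictor code: the class name `TinyPPM` and its methods are hypothetical, and it assumes method C escape estimates without symbol exclusion, which real implementations usually refine.

```javascript
// Minimal PPM-style character model (illustrative sketch, not the library's code).
// Assumption: "method C" escapes, no exclusion, uniform fallback over seen characters.
class TinyPPM {
  constructor(maxOrder = 2) {
    this.maxOrder = maxOrder;
    this.counts = new Map();   // context string -> Map(char -> count)
    this.alphabet = new Set(); // every character seen in training
  }

  train(text) {
    const chars = Array.from(text);
    for (let i = 0; i < chars.length; i++) {
      this.alphabet.add(chars[i]);
      // Record chars[i] under every context of length 0..maxOrder that precedes it.
      for (let k = 0; k <= this.maxOrder && k <= i; k++) {
        const ctx = chars.slice(i - k, i).join('');
        if (!this.counts.has(ctx)) this.counts.set(ctx, new Map());
        const seen = this.counts.get(ctx);
        seen.set(chars[i], (seen.get(chars[i]) || 0) + 1);
      }
    }
  }

  // Returns Map(char -> probability) for the character that follows `history`.
  predict(history) {
    const chars = Array.from(history);
    const probs = new Map();
    let weight = 1; // probability mass that has escaped down to the current order
    for (let k = Math.min(this.maxOrder, chars.length); k >= 0; k--) {
      const ctx = chars.slice(chars.length - k).join('');
      const seen = this.counts.get(ctx);
      if (!seen) continue; // unseen context: back off with the full weight
      let total = 0;
      for (const n of seen.values()) total += n;
      const denom = total + seen.size; // method C: one escape count per distinct symbol
      for (const [c, n] of seen) {
        probs.set(c, (probs.get(c) || 0) + (weight * n) / denom);
      }
      weight *= seen.size / denom; // escape probability passed to the shorter context
    }
    // Order -1: spread the remaining mass uniformly over the known alphabet.
    for (const c of this.alphabet) {
      probs.set(c, (probs.get(c) || 0) + weight / this.alphabet.size);
    }
    return probs;
  }
}

const ppm = new TinyPPM(2);
ppm.train('abracadabra');
const ranked = [...ppm.predict('ab').entries()].sort((a, b) => b[1] - a[1]);
console.log(ranked[0][0]); // most likely next character after "ab"
```

Longer contexts dominate when they have been seen, and the escape weight hands the remaining probability to shorter contexts, which is the trade-off between specificity and robustness described above.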

PPM gained prominence in assistive technology as the prediction engine powering Dasher, an innovative text-entry interface developed by the Inference Group at the University of Cambridge. Dasher uses PPM's probabilistic predictions to enable efficient text entry through continuous gestures, making it particularly valuable for users with motor impairments.

This implementation is based on the PPM language model from Google Research's JSLM project (ppm_language_model.js). This Node.js library adapts that implementation, adds word-level prediction and error tolerance features, and is published as an npm package for easy integration into web and Node.js applications.

In this demo, we integrate PPMPredictor with WorldAlphabets, an npm package providing frequency lists and keyboard layouts for 100+ languages. This demonstrates PPM's ability to work seamlessly across different languages and writing systems, making it ideal for multilingual assistive technology applications.


⚙️ Configuration

Lexicon Words: 0
Learned Words: 0
Training Chars: 0
Bigrams: 0

🔧 Advanced: Custom Language Data

Override the lexicon and training data for the current language, or create your own custom language (e.g., Klingon, Elvish, etc.)

✍️ Text Prediction

Predictions will appear here as you type...

📚 Learned Words (Adaptive Learning)

When adaptive learning is enabled, new words you type are learned automatically.

No learned words yet. Enable adaptive learning and start typing!
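
As a rough illustration of what learning new words as you type can mean, the sketch below keeps a base lexicon and records any unknown word it observes. The class `AdaptiveLexicon` and its methods are hypothetical names for this example and are not the PPMPredictor API; how the real library stores and weights learned words is not shown here.

```javascript
// Illustrative sketch of adaptive word learning (hypothetical names, not the library API).
class AdaptiveLexicon {
  constructor(baseWords) {
    this.base = new Set(baseWords.map((w) => w.toLowerCase()));
    this.learned = new Map(); // learned word -> number of times it was typed
  }

  // Feed in text the user has finished typing; unknown words are learned.
  observe(text) {
    // \p{L}+ matches runs of letters in any script (needs the "u" flag).
    const words = text.toLowerCase().match(/\p{L}+/gu) || [];
    for (const word of words) {
      if (this.base.has(word)) continue; // already in the base lexicon
      this.learned.set(word, (this.learned.get(word) || 0) + 1);
    }
  }

  has(word) {
    const w = word.toLowerCase();
    return this.base.has(w) || this.learned.has(w);
  }
}

const lexicon = new AdaptiveLexicon(['hello', 'world']);
lexicon.observe('Hello qapla world qapla');
console.log([...lexicon.learned.keys()]); // words learned from the typed text
```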

⌨️ Keyboard-Aware Error Tolerance

When enabled, the predictor considers physical key proximity when matching typos. For example, "heklo" is a more likely mistyping of "hello" than "hezlo" is, because 'k' and 'l' are adjacent on a QWERTY keyboard while 'z' and 'l' are not. Learn more →
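
One way to express key proximity is an edit distance in which substituting a neighbouring key costs less than substituting a distant one. The sketch below assumes a hand-built approximation of a QWERTY letter layout and a substitution cost of 0.5 for adjacent keys; the function names, the layout geometry, and the cost are assumptions for illustration, not the library's actual scoring.

```javascript
// Keyboard-aware edit distance (illustrative sketch; names and costs are assumptions).
const ROWS = ['qwertyuiop', 'asdfghjkl', 'zxcvbnm'];
const ROW_OFFSETS = [0, 0.25, 0.75]; // approximate horizontal stagger of each row

// Map each letter to an approximate (x, y) position on the keyboard.
const KEY_POS = new Map();
ROWS.forEach((row, r) => {
  [...row].forEach((ch, c) => KEY_POS.set(ch, { x: c + ROW_OFFSETS[r], y: r }));
});

// Two distinct keys count as adjacent when their centres are close together.
function areAdjacent(a, b) {
  const p = KEY_POS.get(a);
  const q = KEY_POS.get(b);
  if (!p || !q || a === b) return false;
  return Math.hypot(p.x - q.x, p.y - q.y) < 1.3;
}

// Levenshtein distance where substituting an adjacent key costs `adjacentCost`.
function keyboardDistance(typed, target, adjacentCost = 0.5) {
  let prev = Array.from({ length: target.length + 1 }, (_, j) => j);
  for (let i = 1; i <= typed.length; i++) {
    const cur = [i];
    for (let j = 1; j <= target.length; j++) {
      const a = typed[i - 1];
      const b = target[j - 1];
      const sub = a === b ? 0 : areAdjacent(a, b) ? adjacentCost : 1;
      cur[j] = Math.min(
        prev[j] + 1,       // delete a typed character
        cur[j - 1] + 1,    // insert a missing character
        prev[j - 1] + sub  // match or substitute
      );
    }
    prev = cur;
  }
  return prev[target.length];
}

console.log(keyboardDistance('heklo', 'hello')); // 0.5: 'k' is next to 'l'
console.log(keyboardDistance('hezlo', 'hello')); // 1: 'z' is far from 'l'
```

With this weighting, "heklo" ranks closer to "hello" than "hezlo" does, even though both differ by a single character.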

Select a keyboard layout to see the adjacency map...

💡 Features Demonstrated

  • Multi-language Support: 24+ languages with real training data and frequency-based lexicons from WorldAlphabets
  • Adaptive Learning: The model learns new words as you type (docs)
  • Next-Word Prediction: Bigram-based predictions suggest the next word after completing a word (docs)
  • Error Tolerance: Fuzzy matching finds words even with typos (docs)
  • Keyboard-Aware Typos: Considers physical key proximity (docs)
  • PPM Algorithm: Uses Prediction by Partial Matching for context-aware predictions
  • Real-time Configuration: All settings can be changed on-the-fly (API docs)
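
The bigram-based next-word prediction listed above can be sketched as a table of word-pair counts. The class `BigramModel` and its methods are hypothetical names for this example, and the whitespace tokenization is a simplifying assumption; the library's own training and ranking may differ.

```javascript
// Bigram next-word suggestion (illustrative sketch, not the library API).
class BigramModel {
  constructor() {
    this.next = new Map(); // word -> Map(following word -> count)
  }

  train(text) {
    // Assumption: lowercase, whitespace-separated tokens are good enough here.
    const words = text.toLowerCase().split(/\s+/).filter(Boolean);
    for (let i = 0; i + 1 < words.length; i++) {
      if (!this.next.has(words[i])) this.next.set(words[i], new Map());
      const followers = this.next.get(words[i]);
      followers.set(words[i + 1], (followers.get(words[i + 1]) || 0) + 1);
    }
  }

  // Return up to `k` words that most often followed `word` in training.
  suggest(word, k = 3) {
    const followers = this.next.get(word.toLowerCase());
    if (!followers) return [];
    return [...followers.entries()]
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
      .slice(0, k)
      .map(([w]) => w);
  }
}

const bigrams = new BigramModel();
bigrams.train('good morning good morning good night');
console.log(bigrams.suggest('good')); // [ 'morning', 'night' ]
```
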

💡 Try these experiments:

  • Type with intentional typos and see how fuzzy matching corrects them
  • Enable next-word prediction and type common phrases to see word-pair suggestions
  • Switch languages mid-sentence to see how the predictions adapt
  • Enable adaptive learning and type new words to see them added to the lexicon
  • Compare keyboard-aware vs. standard fuzzy matching with common typos
  • Try the linear ABC layout to simulate switch scanning for AAC users