OpenAI Privacy Filter: open-weight on-device PII redaction

Originally from openai.com

My notes

Summary

OpenAI released Privacy Filter, an open-weight (Apache 2.0) 1.5B-parameter token-classification model for context-aware PII detection and redaction in unstructured text. It supports 128k context, runs locally so raw data never has to leave the machine, and hits 96% F1 on the PII-Masking-300k benchmark. It’s positioned not as a compliance tool but as infrastructure for privacy-preserving pipelines (logging, indexing, training, review).
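To make the "detect, then redact" step concrete, here is a minimal sketch of the redaction half of such a pipeline. It assumes the detector has already produced character spans as `(start, end, label)` tuples; that span format, the `redact` helper, and the `[LABEL]` placeholder style are illustrative choices of mine, not the model's actual output format or API.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label), with end exclusive.
Span = Tuple[int, int, str]


def redact(text: str, spans: List[Span]) -> str:
    """Replace each [start, end) character span with a [LABEL] placeholder."""
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Overlaps a span that was already redacted; skip it.
            continue
        out.append(text[cursor:start])
        out.append(f"[{label.upper()}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


text = "Contact Jane Doe at jane@example.com."

# Spans are located by hand here; in a real pipeline they would come
# from the detector's output.
name, email = "Jane Doe", "jane@example.com"
spans = [
    (text.index(name), text.index(name) + len(name), "private_person"),
    (text.index(email), text.index(email) + len(email), "private_email"),
]

redacted = redact(text, spans)
print(redacted)
```

Because the replacement is plain string work, this step can run on the same machine as the model, so the raw text never has to leave it.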

Key Insight

  • Architecture is unusual: the model starts as an autoregressive checkpoint and is then converted into a bidirectional token classifier, with BIOES span decoding via constrained Viterbi. A single forward pass labels every token, with no autoregressive generation, which makes it very fast.
  • Model has 1.5B total / 50M active parameters: small enough to run on a laptop, yet frontier-level on PII benchmarks.
  • Eight detection categories: private_person, private_address, private_email, private_phone, private_url, private_date, account_number (covers cards/banking), secret (covers passwords/API keys). The secret class is rare in off-the-shelf PII tools and useful for log/code redaction.
  • Benchmarks: 96% F1 on PII-Masking-300k as-is and 97.43% F1 on a corrected version (OpenAI found annotation errors in the public set). Fine-tuning lifted F1 from 54% to 96% on a domain-adaptation task with a “small amount of data”.
  • Operating point is configurable: the precision/recall trade-off can be tuned per workflow without retraining.
  • Explicit disclaimers: it’s not anonymization, not compliance, and accuracy degrades on short sequences and domain-specific text (legal, medical, finance still need human review).
  • Available now on Hugging Face (openai/privacy-filter) and GitHub (openai/privacy-filter) under Apache 2.0, which allows commercial use.
  • Why this matters: traditional PII tools are regex/rule-based and miss context (e.g., a name in a quote vs. a name in a customer record). A small model that runs on-device closes the gap without sending data to a third-party API.
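The "BIOES span decoding via constrained Viterbi" step mentioned above can be sketched in a few lines. This is a generic illustration of the technique, not OpenAI's implementation: the per-token scores are made up, only one category is used, and the helper names (`build_labels`, `allowed`, `viterbi`) are hypothetical. The constraint is the usual BIOES one: after `B-x` or `I-x` only `I-x` or `E-x` may follow, and a sequence may not start or end inside an open span.

```python
import math
from typing import List


def build_labels(categories: List[str]) -> List[str]:
    """O plus B-/I-/E-/S- tags for every category."""
    labels = ["O"]
    for cat in categories:
        labels += [f"{prefix}-{cat}" for prefix in "BIES"]
    return labels


def allowed(prev: str, cur: str) -> bool:
    """BIOES transition rule between two adjacent labels."""
    if prev[0] in "BI":
        # A span is open: it must continue or end, in the same category.
        return cur[0] in "IE" and cur[2:] == prev[2:]
    # No span is open: stay outside or start a new span.
    return cur[0] in "OBS"


def viterbi(scores: List[List[float]], labels: List[str]) -> List[str]:
    """Highest-scoring label sequence that obeys the BIOES constraints."""
    if not scores:
        return []
    n = len(labels)
    # The first token cannot be I- or E-.
    best = [s if labels[j][0] in "OBS" else -math.inf
            for j, s in enumerate(scores[0])]
    back = []
    for row in scores[1:]:
        new_best, ptr = [], []
        for j in range(n):
            prev_score, prev_i = max(
                (best[i], i) for i in range(n) if allowed(labels[i], labels[j])
            )
            new_best.append(prev_score + row[j])
            ptr.append(prev_i)
        best = new_best
        back.append(ptr)
    # The last token cannot leave a span open (no B- or I-).
    final = [b if labels[j][0] in "OES" else -math.inf
             for j, b in enumerate(best)]
    j = max(range(n), key=final.__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [labels[j] for j in path]


labels = build_labels(["private_email"])  # O, B-, I-, E-, S-

# Made-up scores for four tokens, one column per label in `labels`.
scores = [
    [2.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.5, 0.0, 0.0],  # per-token argmax is I-, which is invalid after O
    [0.0, 0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0, 0.5],
]

greedy = [labels[max(range(len(labels)), key=row.__getitem__)] for row in scores]
decoded = viterbi(scores, labels)
print(greedy)
print(decoded)
```

In this toy input, picking the best label per token yields an `I-` tag with no preceding `B-`, which does not define a span. The constrained decode instead returns the best valid sequence, so every detected span has a clean start and end.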