OpenAI Privacy Filter: open-weight on-device PII redaction
Originally from openai.com
My notes
Summary
OpenAI released Privacy Filter, an open-weight (Apache 2.0) 1.5B-parameter token-classification model for context-aware PII detection and redaction in unstructured text. It supports 128k context, runs locally so raw data never has to leave the machine, and hits 96% F1 on the PII-Masking-300k benchmark. It’s positioned not as a compliance tool but as infrastructure for privacy-preserving pipelines (logging, indexing, training, review).
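To make the redaction step concrete, here is a minimal sketch of how detected spans could be turned into placeholders before text is logged or indexed. It does not call Privacy Filter itself: the `(start, end, label)` span format and the `redact` helper are assumptions made for illustration, and a real integration would take its spans from whatever the model's tooling actually returns.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label) with end exclusive,
# expressed as character offsets into the original text.
Span = Tuple[int, int, str]


def redact(text: str, spans: List[Span]) -> str:
    """Replace each detected span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip a span that overlaps one already redacted.
            continue
        pieces.append(text[cursor:start])   # untouched text before the span
        pieces.append(f"[{label.upper()}]")  # placeholder instead of raw PII
        cursor = end
    pieces.append(text[cursor:])             # trailing untouched text
    return "".join(pieces)
```

Replacing spans with category placeholders rather than deleting them keeps the redacted text readable for downstream review, which fits the logging and indexing use cases listed in the summary.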
Key Insight
- The architecture is unusual: the model starts as an autoregressive checkpoint and is then converted to a bidirectional token classifier with BIOES span decoding via constrained Viterbi. A single forward pass labels every token with no autoregressive generation, which makes it very fast.
- The model has 1.5B total parameters with 50M active, small enough to run on a laptop while reaching frontier-level results on PII benchmarks.
- Eight detection categories: private_person, private_address, private_email, private_phone, private_url, private_date, account_number (covers cards/banking), and secret (covers passwords/API keys). The secret class is rare in off-the-shelf PII tools and useful for log/code redaction.
- Benchmarks: 96% F1 on PII-Masking-300k as-is, 97.43% F1 on a corrected version (OpenAI found annotation errors in the public set). Fine-tuning lifted F1 from 54% to 96% on a domain-adaptation task with a “small amount of data”.
- The operating point is configurable: the precision/recall trade-off can be tuned per workflow without retraining.
- Explicit disclaimers: it’s not anonymization, not compliance, and accuracy degrades on short sequences and domain-specific text (legal, medical, finance still need human review).
- Available now on Hugging Face (openai/privacy-filter) and GitHub (openai/privacy-filter) under Apache 2.0; commercial use is allowed.
- Why this matters: traditional PII tools are regex/rule-based and miss context (e.g., a name in a quote vs. a name in a customer record). A small model that runs on-device closes the gap without sending data to a third-party API.
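The BIOES decoding mentioned in the notes can be sketched generically. The code below is a textbook-style constrained Viterbi over per-token label scores, not OpenAI's implementation: the label naming (`B-`, `I-`, `E-`, `S-` prefixes plus `O`), the score layout (one list of log-scores per token, in label order), and every function name are assumptions. The `o_bias` parameter is also hypothetical; it shows one plausible way an operating point could be shifted at decode time without retraining, by penalizing the "not PII" label to favor recall.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def make_labels(categories: List[str]) -> List[str]:
    """Build the BIOES label set: O plus B-/I-/E-/S- for each category."""
    labels = ["O"]
    for cat in categories:
        labels += [f"{prefix}-{cat}" for prefix in "BIES"]
    return labels


def allowed(prev: str, cur: str) -> bool:
    """BIOES transition rule: an open span (B/I) must continue (I/E) in the
    same category; otherwise the next label must be O or start a span (B/S)."""
    if prev[0] in "BI":
        return cur[0] in "IE" and cur[2:] == prev[2:]
    return cur[0] in "OBS"


def viterbi_bioes(scores: List[List[float]], labels: List[str],
                  o_bias: float = 0.0) -> List[str]:
    """Return the highest-scoring label sequence that obeys BIOES rules.

    o_bias (hypothetical knob) is subtracted from every O score, so a
    positive value trades precision for recall."""
    n, k = len(scores), len(labels)
    if n == 0:
        return []

    def emit(t: int, j: int) -> float:
        return scores[t][j] - (o_bias if labels[j] == "O" else 0.0)

    # A sequence may only start with O, B-*, or S-*.
    best = [emit(0, j) if labels[j][0] in "OBS" else NEG_INF for j in range(k)]
    backpointers = []
    for t in range(1, n):
        new_best, ptr = [NEG_INF] * k, [0] * k
        for j in range(k):
            for i in range(k):
                if best[i] == NEG_INF or not allowed(labels[i], labels[j]):
                    continue
                cand = best[i] + emit(t, j)
                if cand > new_best[j]:
                    new_best[j], ptr[j] = cand, i
        backpointers.append(ptr)
        best = new_best

    # A sequence may only end with O, E-*, or S-* (no span left open).
    last = max((j for j in range(k) if labels[j][0] in "OES"),
               key=lambda j: best[j])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return [labels[j] for j in path]


def to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert a valid BIOES tag sequence to (start, end, category) token spans."""
    spans, start = [], 0
    for i, tag in enumerate(tags):
        if tag[0] == "S":
            spans.append((i, i + 1, tag[2:]))
        elif tag[0] == "B":
            start = i
        elif tag[0] == "E":
            spans.append((start, i + 1, tag[2:]))
    return spans
```

The point of the constraint is that a per-token argmax can produce impossible sequences, such as an `I-` label directly after `O`. Decoding over only the valid transitions guarantees that every predicted span has a clean start and end, while still needing just one set of scores from a single forward pass.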