Benchmark Methodology

How FastPII detector accuracy and false-positive rates were evaluated.

The benchmark runs the detector on real text samples with known ground truth and compares detected values against expected values in normalized form.

Test corpus

The evaluation uses these reference files as ground truth inputs:

  • sample.txt
  • text1.txt
  • text2.txt
  • text3.txt
  • text4.txt
  • text5.txt
  • text6.txt
  • text7.txt

text7.txt is the false-positive control file: it contains no expected findings, so the detector should report zero findings for it.

Metrics

Each run measures:

  • Precision
  • Recall
  • F1 score

Detected values are compared in normalized form so formatting differences do not distort the result.
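
To illustrate what normalized comparison can look like, here is a minimal sketch. The function name normalize_value and the specific rules (Unicode folding, case folding, removal of whitespace, hyphens, and parentheses) are assumptions chosen for illustration; the rules actually used in the evaluation are not listed on this page.

```python
import re
import unicodedata


def normalize_value(value: str) -> str:
    """Hypothetical normalizer; not FastPII's actual rule set."""
    # Fold Unicode compatibility forms (for example, full-width digits).
    text = unicodedata.normalize("NFKC", value)
    # Make the comparison case-insensitive.
    text = text.casefold()
    # Drop whitespace, hyphens, and parentheses (assumed formatting noise).
    return re.sub(r"[\s\-()]", "", text)
```

Under these assumed rules, "(555) 123-4567" and "555 123 4567" normalize to the same string, so a formatting difference alone would not count as a miss.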

Validation approach

  1. Run the detector against each ground truth file.
  2. Normalize predicted and expected values.
  3. Match findings by detector type and normalized value.
  4. Compute precision, recall, and F1.
  5. Verify that text7.txt stays at zero findings.
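
Steps 3 to 5 can be sketched roughly as follows. This is not the code in eval_v024.py; the names Score, score_findings, and control_file_is_clean are hypothetical. The sketch assumes findings arrive as (detector type, normalized value) pairs, that each distinct pair is counted once, and that an undefined metric (zero denominator) is reported as 0.0.

```python
from typing import Iterable, NamedTuple, Tuple

Finding = Tuple[str, str]  # (detector type, normalized value)


class Score(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_findings(predicted: Iterable[Finding],
                   expected: Iterable[Finding]) -> Score:
    """Hypothetical scorer: exact match on (type, normalized value)."""
    pred, gold = set(predicted), set(expected)
    tp = len(pred & gold)   # found and expected
    fp = len(pred - gold)   # found but not expected
    fn = len(gold - pred)   # expected but not found
    # Assumed convention: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return Score(precision, recall, f1)


def control_file_is_clean(predicted: Iterable[Finding]) -> bool:
    """True when the control file produced no findings at all."""
    return len(list(predicted)) == 0
```

Because the control file has no expected values, precision and recall are not informative for it, which is why this sketch checks it with a plain count instead of a score.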

Reference implementation

The evaluation workflow is based on eval_v024.py, which provides the scoring logic and corpus-driven comparison used for these published results.

Why this methodology matters

This approach rewards detectors that find the right value, not just a superficially similar substring, and it makes false positives visible instead of hiding them in aggregate totals.
