FastPII Docs
Core Concepts

This page covers detection, validation, the privacy transformation modes, overlap resolution, and pattern extensibility in FastPII.

Detection

Use detect() to scan text for supported entities.

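from fastpii import PrivacyGuard

# Enable the built-in Czech ("cz") detectors, then scan a string.
guard = PrivacyGuard(regions=["cz"])
result = guard.detect("Email: jan.novak@example.cz, RČ: 8001011238")
for finding in result.findings:
    print(finding.type, finding.value, finding.start, finding.end)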

detect() returns a DetectionResult with a list[Finding] in result.findings.

Each Finding contains:

  • type
  • value
  • start
  • end
  • confidence
  • region
  • metadata

Validation

Use validate() when you already know the detector type and need to confirm whether a single value is structurally valid.

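from fastpii import PrivacyGuard

guard = PrivacyGuard(regions=["cz"])
# Arguments: the value to check, then the detector type to check it against.
result = guard.validate("8001011238", "rodne_cislo")
print(result.is_valid)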

validate() returns a ValidationResult with:

  • is_valid
  • metadata

For checksum-backed identifiers, validation uses official structural rules such as Mod 11 and weighted checksum calculations.
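
To make those rules concrete, here is a minimal sketch of two publicly documented Czech checks: the Mod 11 rule for ten-digit birth numbers (rodné číslo) and the weighted checksum for IČO. The function names are hypothetical and this is not FastPII's implementation. It checks the checksum only and skips the date-part checks and the nine-digit pre-1954 birth numbers that a full validator would also need to handle.

```python
def rodne_cislo_mod11_ok(value: str) -> bool:
    """Mod 11 check for a ten-digit rodné číslo (checksum only, no date check)."""
    digits = value.replace("/", "")
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        return False
    remainder = int(digits[:9]) % 11
    # Historical exception: a remainder of 10 was recorded as check digit 0.
    expected = 0 if remainder == 10 else remainder
    return int(digits[9]) == expected


def ico_checksum_ok(value: str) -> bool:
    """Weighted checksum for an eight-digit IČO (weights 8..2 on the first seven digits)."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    weighted = sum(int(d) * w for d, w in zip(value[:7], range(8, 1, -1)))
    expected = (11 - weighted % 11) % 10
    return int(value[7]) == expected


print(rodne_cislo_mod11_ok("8001011238"))  # the sample value used on this page
print(ico_checksum_ok("27082440"))
```

Both checks are purely arithmetic, which is why a single wrong digit is normally enough to make a value fail validation even when it still matches the expected shape.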

Anonymization

Use anonymize() to replace every detected value with the same placeholder, [REDACTED] by default.

guard.anonymize("Email: jan.novak@example.cz, RČ: 8001011238")

Output:

Email: [REDACTED], RČ: [REDACTED]

You can also pass a custom replacement string:

guard.anonymize("Email: jan.novak@example.cz", replacement="[PII]")

Redaction

Use redact() to replace each finding with a type-specific label.

guard.redact("Email: jan.novak@example.cz, RČ: 8001011238")

Output:

Email: [EMAIL], RČ: [RODNE_CISLO]

Masking

Use mask() to replace each finding with asterisks while preserving original length.

guard.mask("Email: jan.novak@example.cz, RČ: 8001011238")

Output:

Email: ********************, RČ: **********

Removal

Use remove() to delete detected values entirely.

guard.remove("Email: jan.novak@example.cz, RČ: 8001011238")

Output:

Email: , RČ: 
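
The four transformation modes differ only in what they substitute for each detected span. The sketch below reproduces the outputs shown above with a standalone helper. It does not call FastPII: transform() and span() are hypothetical names, the spans are built by hand rather than detected, and end offsets are assumed to be exclusive.

```python
def transform(text, findings, replacement_for):
    """Rewrite each (type, start, end) span using replacement_for(type, value)."""
    out = text
    # Work right to left so earlier offsets stay valid as lengths change.
    for kind, start, end in sorted(findings, key=lambda f: f[1], reverse=True):
        out = out[:start] + replacement_for(kind, out[start:end]) + out[end:]
    return out


text = "Email: jan.novak@example.cz, RČ: 8001011238"


def span(kind, value):
    """Hand-built stand-in for detector output: (type, start, exclusive end)."""
    start = text.index(value)
    return (kind, start, start + len(value))


findings = [span("email", "jan.novak@example.cz"), span("rodne_cislo", "8001011238")]

anonymized = transform(text, findings, lambda kind, value: "[REDACTED]")
redacted = transform(text, findings, lambda kind, value: f"[{kind.upper()}]")
masked = transform(text, findings, lambda kind, value: "*" * len(value))
removed = transform(text, findings, lambda kind, value: "")

for label, result in [("anonymize", anonymized), ("redact", redacted),
                      ("mask", masked), ("remove", removed)]:
    print(f"{label}: {result}")
```

Masking is the only mode here that keeps the text length unchanged, which matters if downstream code relies on character offsets into the original string.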

Overlap resolution

When multiple findings overlap, FastPII deduplicates them before returning the final DetectionResult.

Overlapping candidates are ranked by these criteria, in order:

  1. confidence
  2. type priority
  3. span length

The built-in priority logic favors checksum-backed identifiers over broader personal or contact entities. In the current implementation, the priority order includes:

  • checksum-backed identifiers first: rodne_cislo, ico, dic, bank_account
  • then broader entities such as address, email, date_of_birth, name
  • then lower-priority spans such as postal_code, vehicle_plate, phone

This reduces false positives when one detector captures a substring that belongs to a stronger match.
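
One way to picture the resolution step is a sort followed by a greedy pass. The sketch below is an illustration under stated assumptions, not FastPII's code: it assumes higher confidence wins, that the three tiers listed above are the only priority levels (types within a tier are treated as equal), that longer spans win the final tie-break, and that end offsets are exclusive. Candidate and resolve_overlaps are hypothetical names.

```python
from dataclasses import dataclass

# Priority tiers taken from the list above; a lower number wins (assumption).
TYPE_PRIORITY = {
    "rodne_cislo": 0, "ico": 0, "dic": 0, "bank_account": 0,
    "address": 1, "email": 1, "date_of_birth": 1, "name": 1,
    "postal_code": 2, "vehicle_plate": 2, "phone": 2,
}


@dataclass(frozen=True)
class Candidate:
    type: str
    start: int
    end: int  # exclusive (assumption)
    confidence: float


def resolve_overlaps(candidates):
    """Keep the best-ranked candidate from each group of overlapping spans."""
    ranked = sorted(
        candidates,
        key=lambda c: (-c.confidence, TYPE_PRIORITY.get(c.type, 99), -(c.end - c.start)),
    )
    kept = []
    for cand in ranked:
        # Accept a candidate only if it is disjoint from everything kept so far.
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda c: c.start)


candidates = [
    Candidate("phone", 33, 42, 0.9),        # substring of the birth number
    Candidate("rodne_cislo", 33, 43, 0.9),  # full checksum-backed match
    Candidate("email", 7, 27, 0.9),
]
resolved = resolve_overlaps(candidates)
print([c.type for c in resolved])
```

With equal confidence, the phone candidate loses to rodne_cislo on type priority, which is the substring case described above.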

Pattern registry

Internally, FastPII uses a pattern registry and a detector registry. PrivacyGuard registers the built-in detectors for the cz region, and custom detectors can also be registered programmatically.

This architecture is what makes regional extension possible without changing the PrivacyGuard public API.
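
The registry idea can be sketched with a toy class. This is a generic illustration of the pattern, not FastPII's registration API: DetectorRegistry, register(), and detect() are hypothetical names, the "sk" region is invented for the example, and both regular expressions are simplified placeholders rather than production-grade patterns.

```python
import re
from collections import defaultdict


class DetectorRegistry:
    """Hypothetical registry mapping a region to its (type, compiled pattern) detectors."""

    def __init__(self):
        self._detectors = defaultdict(list)

    def register(self, region, type_name, pattern):
        self._detectors[region].append((type_name, re.compile(pattern)))

    def detect(self, text, regions):
        hits = []
        for region in regions:
            for type_name, pattern in self._detectors[region]:
                for match in pattern.finditer(text):
                    hits.append(
                        (type_name, match.group(), match.start(), match.end(), region)
                    )
        return hits


registry = DetectorRegistry()
registry.register("cz", "postal_code", r"\b\d{3} ?\d{2}\b")
# Supporting another region is one more register() call; detect() is unchanged.
registry.register("sk", "vehicle_plate", r"\b[A-Z]{2}\d{3}[A-Z]{2}\b")

hits = registry.detect("PSČ 110 00, ŠPZ BA123XY", regions=["cz", "sk"])
for hit in hits:
    print(hit)
```

Because callers only ever go through detect(), new regions and detector types can be added without touching the calling code, which is the property the paragraph above describes.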
