Custom Detectors

Extend FastPII by subclassing the Detector base class and registering an instance of your detector with PrivacyGuard.register_detector().

Base class

Import the abstract base class from fastpii.detectors.base.

from fastpii.detectors.base import Detector

Every detector must provide:

  • name
  • region
  • description
  • detect(text) -> list[Finding]
  • validate(value) -> bool

Full working example

import re

from fastpii import PrivacyGuard
from fastpii.models import Finding
from fastpii.detectors.base import Detector


class EmployeeIdDetector(Detector):
    def __init__(self) -> None:
        super().__init__(
            name="employee_id",
            region="internal",
            description="Detect internal employee IDs in the format EMP-1234",
        )
        self.pattern = re.compile(r"\b(EMP-\d{4})\b")

    def detect(self, text: str) -> list[Finding]:
        findings: list[Finding] = []

        for match in self.pattern.finditer(text):
            value = match.group(1)
            if self.validate(value):
                findings.append(
                    Finding(
                        type=self.name,
                        value=value,
                        # Offsets of the captured value (group 1), matching `value` above.
                        start=match.start(1),
                        end=match.end(1),
                        confidence=1.0,
                        region=self.region,
                        metadata=self._extract_metadata(value),
                    )
                )

        return findings

    def validate(self, value: str) -> bool:
        return bool(re.fullmatch(r"EMP-\d{4}", value))

    def _extract_metadata(self, value: str) -> dict[str, object]:
        # Split once on the first hyphen: "EMP-1234" -> ("EMP", "1234").
        prefix, numeric_part = value.split("-", 1)
        return {
            "prefix": prefix,
            "numeric_part": numeric_part,
        }


guard = PrivacyGuard(regions=["cz"])
guard.register_detector(EmployeeIdDetector())

result = guard.detect("Assigned to EMP-1234 and jan.novak@email.cz")

for finding in result.findings:
    print(finding.type, finding.value, finding.metadata)

validation = guard.validate("EMP-1234", "employee_id")
print(validation.is_valid)
print(validation.metadata)
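
Before registering a detector, it can help to exercise the pattern on its own. The sketch below uses only the standard library and the same regular expression as the example above; the helper name find_ids and the sample strings are made up for illustration and are not part of FastPII.

```python
import re

# Same expression as EmployeeIdDetector.pattern in the example above.
PATTERN = re.compile(r"\b(EMP-\d{4})\b")


def find_ids(text: str) -> list[tuple[str, int, int]]:
    """Return (value, start, end) for each employee ID found in text.

    Hypothetical helper for illustration only; not a FastPII function.
    """
    return [(m.group(1), m.start(1), m.end(1)) for m in PATTERN.finditer(text)]


samples = [
    "Assigned to EMP-1234 and jan.novak@email.cz",  # exactly four digits
    "Ticket EMP-12345 was closed",                  # five digits
    "see XEMP-1234",                                # no word boundary before EMP
]

for sample in samples:
    print(sample, "->", find_ids(sample))
```

The \b anchors are what reject the second and third samples: a fifth digit or a leading letter means there is no word boundary at the edge of the candidate ID.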

_extract_metadata protocol

PrivacyGuard.validate() checks whether the detector defines an _extract_metadata method and, if it does, calls it to populate ValidationResult.metadata.

That method is optional, but it is the supported way to attach structured information to validation results.
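
An optional-method check of this kind can be sketched with getattr. The code below is a standalone illustration, not FastPII's actual implementation: ValidationResultSketch, validate_with_metadata, and the two toy detector classes are hypothetical names, and skipping metadata extraction for invalid values is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResultSketch:
    # Hypothetical stand-in; FastPII's real ValidationResult may differ.
    is_valid: bool
    metadata: dict[str, object] = field(default_factory=dict)


def validate_with_metadata(detector, value: str) -> ValidationResultSketch:
    """Validate a value and attach metadata only if the detector offers it."""
    is_valid = bool(detector.validate(value))
    # Look up the optional hook; detectors without it yield None here.
    extract = getattr(detector, "_extract_metadata", None)
    # Assumption: metadata is only extracted for values that passed validation.
    metadata = extract(value) if is_valid and callable(extract) else {}
    return ValidationResultSketch(is_valid=is_valid, metadata=metadata)


class WithMetadata:
    """Toy detector that implements the optional hook."""

    def validate(self, value: str) -> bool:
        return value.startswith("EMP-")

    def _extract_metadata(self, value: str) -> dict[str, object]:
        return {"prefix": value.split("-", 1)[0]}


class WithoutMetadata:
    """Toy detector that omits the optional hook."""

    def validate(self, value: str) -> bool:
        return value.startswith("EMP-")


print(validate_with_metadata(WithMetadata(), "EMP-1234"))
print(validate_with_metadata(WithoutMetadata(), "EMP-1234"))
```

Under this design a detector without the hook still validates normally and simply produces empty metadata.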

Access the detector later

detector = guard.get_detector("employee_id")
print(detector.description)

List registered detectors

for detector in guard.list_detectors():
    print(detector.name, detector.region)
