FastPII Docs
Detectors

Email Address

Detect email addresses with Czech domain awareness and support for markdown mailto links.

Purpose

Use the email detector for email addresses in Czech-language or mixed-language text.

The detector has a reported accuracy above 95% and boosts confidence for Czech-oriented domains such as .cz and .sk (the Slovak country-code domain).

It also handles markdown mailto links by detecting the visible email address and skipping the duplicate mailto: target.

Detector Name

email

Supported Formats

  • Standard email addresses
  • Addresses on Czech-oriented domains such as .cz and .sk
  • Markdown mailto links like [jan.novak@email.cz](mailto:jan.novak@email.cz)

Examples:

  • jan.novak@email.cz
  • kontakt@firma.sk

Validation Logic

The detector first matches candidate email strings using the shared email pattern.

It then applies lightweight format validation.

Validation Rules

  1. The value must contain exactly one @.
  2. The local part must be 1-64 characters.
  3. The domain must be 1-255 characters.
  4. The domain must contain at least one dot.
  5. The full address must not contain consecutive dots.
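The five rules above can be sketched as a small standalone function. This is a minimal illustration of the listed checks, not FastPII's internal implementation; the function name is_structurally_valid_email is hypothetical.

```python
def is_structurally_valid_email(value: str) -> bool:
    """Apply the five structural rules from the list above (illustrative only)."""
    # Rule 1: exactly one "@".
    if value.count("@") != 1:
        return False
    local_part, domain = value.split("@")
    # Rule 2: the local part must be 1-64 characters.
    if not 1 <= len(local_part) <= 64:
        return False
    # Rule 3: the domain must be 1-255 characters.
    if not 1 <= len(domain) <= 255:
        return False
    # Rule 4: the domain must contain at least one dot.
    if "." not in domain:
        return False
    # Rule 5: no consecutive dots anywhere in the address.
    if ".." in value:
        return False
    return True


print(is_structurally_valid_email("jan.novak@email.cz"))   # True
print(is_structurally_valid_email("jan..novak@email.cz"))  # False (rule 5)
```

The checks run in the same order as the list, so the first failing rule decides the result.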

Markdown Mailto Handling

When text contains a markdown mailto link, FastPII records only the visible email and skips the duplicated mailto: portion.

This detector has no checksum algorithm; validation relies only on the structural rules listed under Validation Rules.
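The mailto handling described in this section could look roughly like the sketch below. It assumes a simplified email regex (not FastPII's shared pattern) and a hypothetical helper name, find_emails_skipping_mailto. A match is dropped only when it sits directly inside the target of a markdown link, that is, right after "](mailto:".

```python
import re
from typing import List

# Simplified pattern for illustration only; FastPII's shared email pattern may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Text that directly precedes the duplicated address in a markdown mailto link.
MAILTO_PREFIX = "](mailto:"


def find_emails_skipping_mailto(text: str) -> List[str]:
    """Return visible email addresses, skipping markdown mailto: targets."""
    found = []
    for match in EMAIL_RE.finditer(text):
        start = match.start()
        preceding = text[max(0, start - len(MAILTO_PREFIX)):start]
        if preceding.lower() == MAILTO_PREFIX:
            # Duplicate of the visible address in [visible](mailto:target).
            continue
        found.append(match.group())
    return found


link = "[jan.novak@email.cz](mailto:jan.novak@email.cz)"
print(find_emails_skipping_mailto(link))  # ['jan.novak@email.cz']
```

Checking the text immediately before each match keeps the sketch to a single pass and leaves plain addresses outside markdown links untouched.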

Python Examples

Detect a Czech email address

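from fastpii import PrivacyGuard

# Configure the guard for the Czech region and run only the email detector.
guard = PrivacyGuard(regions=["cz"])
result = guard.detect("Contact: jan.novak@email.cz", detector_names=["email"])

# Print the matched value and the detector metadata for each finding.
for finding in result.findings:
    print(finding.value, finding.metadata)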

Validate an email directly

from fastpii import PrivacyGuard

guard = PrivacyGuard(regions=["cz"])
# Validate a single value with the email detector instead of scanning free text.
result = guard.validate("jan.novak@email.cz", "email")

print(result.is_valid)
print(result.metadata)

Expected Metadata

This detector can return:

  • local_part
  • domain
  • provider
  • is_czech_domain

Known provider values include gmail, seznam, centrum, email, microsoft, and yahoo when the domain matches those families.

Example Output

{
    "local_part": "jan.novak",
    "domain": "email.cz",
    "provider": "email",
    "is_czech_domain": True,
}
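To show how the metadata fields relate to the address, here is a rough sketch that rebuilds the example output above. The domain-to-provider table and the suffix list are assumptions made for illustration (only the email.cz entry is taken from the example output); FastPII's actual mapping is not documented on this page, and the helper name describe_email is hypothetical.

```python
# Hypothetical provider table; the real FastPII mapping may differ.
PROVIDER_FAMILIES = {
    "gmail.com": "gmail",
    "seznam.cz": "seznam",
    "centrum.cz": "centrum",
    "email.cz": "email",
    "outlook.com": "microsoft",
    "hotmail.com": "microsoft",
    "yahoo.com": "yahoo",
}

# Assumption: the flag follows the Czech-oriented suffixes named in this page.
CZECH_ORIENTED_SUFFIXES = (".cz", ".sk")


def describe_email(address: str) -> dict:
    """Build a metadata dict shaped like the detector's documented fields."""
    local_part, _, domain = address.rpartition("@")
    domain = domain.lower()
    return {
        "local_part": local_part,
        "domain": domain,
        # None when the domain is not in the illustrative table.
        "provider": PROVIDER_FAMILIES.get(domain),
        "is_czech_domain": domain.endswith(CZECH_ORIENTED_SUFFIXES),
    }


print(describe_email("jan.novak@email.cz"))
```

Because the lookup is a plain dictionary keyed on the full domain, this sketch is heuristic in the same sense as the provider detection described under Limitations.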

Limitations

  • Validation is structural, not mailbox-aware.
  • Provider detection is heuristic and domain-based.
  • Non-Czech domains are still supported, but with lower confidence than .cz or .sk.

Notes

Verified detection example from the codebase:

guard.detect("Contact: jan.novak@email.cz")
