Email Address
Detect email addresses with Czech domain awareness and support for markdown mailto links.
Purpose
Use the email detector for email addresses in Czech-language or mixed-language text.
This detector reports accuracy above 95% and boosts confidence for Czech-oriented domains such as .cz and .sk.
It also handles markdown mailto links by detecting the visible email address and skipping the duplicate mailto: target.
Detector Name
email
Supported Formats
- Standard email addresses
- Czech domain-aware emails such as .cz and .sk
- Markdown mailto links like [jan.novak@email.cz](mailto:jan.novak@email.cz)
Examples:
jan.novak@email.cz
kontakt@firma.sk
Validation Logic
The detector first matches candidate email strings using the shared email pattern.
It then applies lightweight format validation.
Validation Rules
- The value must contain exactly one @.
- The local part must be 1-64 characters.
- The domain must be 1-255 characters.
- The domain must contain at least one dot.
- The full address must not contain consecutive dots.
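The rules above can be sketched as a small standalone function. This is an illustrative approximation, not FastPII's actual implementation; the function name `is_structurally_valid_email` is hypothetical and does not come from the library.

```python
def is_structurally_valid_email(value: str) -> bool:
    """Apply the structural rules listed above (hypothetical helper).

    No mailbox or DNS lookup is performed; only the shape is checked.
    """
    # Rule: exactly one "@".
    if value.count("@") != 1:
        return False
    local_part, domain = value.split("@")

    # Rule: local part must be 1-64 characters.
    if not 1 <= len(local_part) <= 64:
        return False

    # Rule: domain must be 1-255 characters.
    if not 1 <= len(domain) <= 255:
        return False

    # Rule: domain must contain at least one dot.
    if "." not in domain:
        return False

    # Rule: no consecutive dots anywhere in the full address.
    if ".." in value:
        return False

    return True
```

Checking the cheap conditions first (the "@" count, then the lengths) lets the function reject most malformed candidates before it inspects the domain.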
Markdown Mailto Handling
When text contains a markdown mailto link, FastPII records only the visible email and skips the duplicated mailto: portion.
There is no checksum algorithm for this detector.
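One possible way to implement the mailto handling is to record the span of each mailto: target and skip any email match that falls inside it. The sketch below assumes a simplified email pattern (not FastPII's shared pattern), and the names `EMAIL`, `MAILTO_LINK`, and `find_emails` are hypothetical.

```python
import re

# Simplified email pattern for illustration only; FastPII's shared pattern may differ.
EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
EMAIL_RE = re.compile(EMAIL)

# Markdown link of the form [visible](mailto:target).
MAILTO_LINK = re.compile(
    r"\[(?P<visible>" + EMAIL + r")\]\(mailto:(?P<target>" + EMAIL + r")\)"
)


def find_emails(text: str) -> list[str]:
    """Return email addresses, skipping the duplicated mailto: target."""
    # Character spans of every mailto: target inside a markdown link.
    skip_spans = [m.span("target") for m in MAILTO_LINK.finditer(text)]

    results = []
    for match in EMAIL_RE.finditer(text):
        # Ignore matches that start inside a mailto: target span.
        if any(start <= match.start() < end for start, end in skip_spans):
            continue
        results.append(match.group())
    return results
```

With this approach, `find_emails("[jan.novak@email.cz](mailto:jan.novak@email.cz)")` yields the address once rather than twice.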
Python Examples
Detect a Czech email address
from fastpii import PrivacyGuard
guard = PrivacyGuard(regions=["cz"])
result = guard.detect("Contact: jan.novak@email.cz", detector_names=["email"])
for finding in result.findings:
    print(finding.value, finding.metadata)
Validate an email directly
from fastpii import PrivacyGuard
guard = PrivacyGuard(regions=["cz"])
result = guard.validate("jan.novak@email.cz", "email")
print(result.is_valid)
print(result.metadata)
Expected Metadata
This detector can return:
- local_part
- domain
- provider
- is_czech_domain
Known provider values include gmail, seznam, centrum, email, microsoft, and yahoo when the domain matches those families.
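To illustrate how domain-based metadata of this shape could be derived, here is a minimal sketch. Everything in it is an assumption rather than FastPII's real logic. The domain-to-provider table is hypothetical: this page shows only email.cz mapping to "email", and the other entries are guesses at what each family might contain. Returning `None` for unknown providers and treating both .cz and .sk as Czech-oriented are also assumptions.

```python
# Hypothetical lookup table; the real provider families may differ.
PROVIDER_BY_DOMAIN = {
    "gmail.com": "gmail",
    "seznam.cz": "seznam",
    "centrum.cz": "centrum",
    "email.cz": "email",
    "outlook.com": "microsoft",
    "hotmail.com": "microsoft",
    "yahoo.com": "yahoo",
}

# Assumption: both suffixes count as Czech-oriented.
CZECH_SUFFIXES = (".cz", ".sk")


def build_metadata(address: str) -> dict:
    """Split an already-validated address and derive heuristic metadata."""
    local_part, domain = address.rsplit("@", 1)
    domain = domain.lower()
    return {
        "local_part": local_part,
        "domain": domain,
        # Assumption: unknown domains map to None.
        "provider": PROVIDER_BY_DOMAIN.get(domain),
        "is_czech_domain": domain.endswith(CZECH_SUFFIXES),
    }
```

An exact-match table keeps the heuristic predictable. A real implementation would probably also need to cover regional variants of each provider's domains.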
Example Output
{
"local_part": "jan.novak",
"domain": "email.cz",
"provider": "email",
"is_czech_domain": True,
}
Limitations
- Validation is structural, not mailbox-aware.
- Provider detection is heuristic and domain-based.
- Non-Czech domains are still supported, but with lower confidence than .cz or .sk.
Notes
Verified detection example from the codebase:
guard.detect("Contact: jan.novak@email.cz")