Detectors
Personal Name
Detect Czech personal names with dictionary-backed matching and gender hints from surname patterns.
Purpose
Use the name detector for Czech personal names in running text.
This detector reports accuracy above 90% and combines a Czech name database with first-name plus surname pattern matching.
It also classifies likely gender and can infer marital status from -ová surname endings.
Detector Name
name
Supported Formats
- FirstName Surname
- Czech names with diacritics
- Female surnames ending in -ová
Examples:
- Jan Novák
- Jana Nováková
Validation Logic
The detector matches two capitalized Czech name parts and then filters the result through dictionary and confidence rules.
Detection Rules
- Match a capitalized first name plus surname pattern.
- Ignore heading-like words such as customer, report, or overview.
- Classify likely gender from surname and first-name data.
- Compute confidence from dictionary membership.
- Reject matches with confidence below 0.5.
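The rules above can be sketched in plain Python. This is a minimal illustration, not FastPII's implementation: the tiny name sets, the `confidence` scoring (half a point per dictionary hit), and the function names are all assumptions made for the example. Only the stop words and the 0.5 threshold come from the rules listed here.

```python
import re

# Hypothetical stand-ins for the built-in Czech name data.
FIRST_NAMES = {"Jan", "Jana", "Petr", "Eva"}
SURNAMES = {"Novák", "Nováková", "Svoboda", "Dvořák"}
# Heading-like words that should never be treated as name parts.
STOPWORDS = {"customer", "report", "overview"}

# \w is Unicode-aware in Python 3, so Czech diacritics are matched.
TOKEN = re.compile(r"\w+")


def is_capitalized(word):
    # One uppercase letter followed by at least one lowercase letter.
    return len(word) > 1 and word[0].isupper() and word[1:].islower()


def confidence(first, surname):
    # Assumed scoring: 0.5 for each part found in its dictionary.
    return 0.5 * (first in FIRST_NAMES) + 0.5 * (surname in SURNAMES)


def find_names(text, threshold=0.5):
    tokens = TOKEN.findall(text)
    found = []
    # Slide over adjacent word pairs: (first name, surname) candidates.
    for first, surname in zip(tokens, tokens[1:]):
        if not (is_capitalized(first) and is_capitalized(surname)):
            continue
        if first.lower() in STOPWORDS or surname.lower() in STOPWORDS:
            continue
        score = confidence(first, surname)
        if score >= threshold:  # reject matches below the threshold
            found.append((first, surname, score))
    return found


print(find_names("Jan Novák přijel domu"))
```

In this sketch a capitalized pair such as "Velká Praha" is dropped because neither word is in the dictionaries, while "Customer Report" is dropped earlier by the stop-word filter.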
Gender Logic
FastPII uses this priority order:
- Surname ending in ová or ova -> female
- Known surname database match
- Known first-name database match
- Fallback first-name ending heuristic
There is no checksum algorithm for this detector.
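The gender priority order can be sketched as a chain of early returns. The lookup tables, the `"f"` code, and the fallback rule (first names ending in -a or -e treated as female) are assumptions for illustration. The source specifies only the order of the checks and shows `"m"` as a gender value.

```python
# Hypothetical lookup tables standing in for the built-in databases.
SURNAME_GENDER = {"Novák": "m", "Svoboda": "m"}
FIRST_NAME_GENDER = {"Jan": "m", "Jana": "f"}


def guess_gender(first, surname):
    # 1. Surname ending in ová or ova is the strongest signal.
    if surname.lower().endswith(("ová", "ova")):
        return "f"
    # 2. Known surname database match.
    if surname in SURNAME_GENDER:
        return SURNAME_GENDER[surname]
    # 3. Known first-name database match.
    if first in FIRST_NAME_GENDER:
        return FIRST_NAME_GENDER[first]
    # 4. Assumed fallback heuristic on the first-name ending.
    return "f" if first.lower().endswith(("a", "e")) else "m"


print(guess_gender("Jana", "Nováková"))
print(guess_gender("Jan", "Novák"))
```

Because the checks run in order, a surname match takes precedence over first-name data, so a conflicting first name never overrides an -ová ending.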
Python Examples
Detect a Czech name
from fastpii import PrivacyGuard
guard = PrivacyGuard(regions=["cz"])
result = guard.detect("Jan Novák přijel domu", detector_names=["name"])
for finding in result.findings:
    print(finding.value, finding.metadata)
Validate a name string
from fastpii import PrivacyGuard
guard = PrivacyGuard(regions=["cz"])
result = guard.validate("Jan Novák", "name")
print(result.is_valid)
print(result.metadata)
Expected Metadata
This detector documents these metadata fields:
- firstname
- surname
- gender
- marital_status
The detector uses -ová as the strong married-female surname signal.
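As a rough sketch of how these fields might fit together, the helper below builds a metadata dictionary and adds `marital_status` only for surnames ending in ová or ova. The function name and the `"married"` value are hypothetical. The source names the field but does not document its values.

```python
def build_metadata(first, surname, gender):
    # Fields that are always present in this sketch.
    metadata = {"firstname": first, "surname": surname, "gender": gender}
    # marital_status is added only for the ová / ova surname endings.
    if surname.lower().endswith(("ová", "ova")):
        metadata["marital_status"] = "married"  # assumed value
    return metadata


print(build_metadata("Jan", "Novák", "m"))
print(build_metadata("Jana", "Nováková", "f"))
```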
Example Output
{
"firstname": "Jan",
"surname": "Novák",
"gender": "m"
}
Limitations
- Confidence depends on the built-in Czech name data.
- Ambiguous capitalized phrases may still be rejected if confidence is too low.
- marital_status is only added for female surnames ending in ová or ova.
Notes
Verified detection example from the codebase:
guard.detect("Jan Novák přijel domu")