Custom Regex Redaction Patterns
Learn how to write custom regex patterns for the Secret Redactor to detect organization-specific secrets, internal tokens, and proprietary credential formats in your text.
Detailed Explanation
While the Secret Redactor includes built-in detection for common secret types (API keys, tokens, passwords), every organization has unique credential formats that require custom patterns. Writing effective regex patterns for secret detection requires balancing sensitivity (catching all secrets) with specificity (avoiding false positives).
Writing Effective Detection Patterns
A good detection regex should:
- Anchor on unique prefixes — If your tokens start with a known string, use it
- Specify character classes — [A-Za-z0-9] is better than .*
- Define length bounds — {32,64} prevents matching short strings
- Use word boundaries — \b prevents partial matches within longer strings
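As a rough illustration of these four guidelines, the Python sketch below compares a loose pattern with a tighter one. The acme_ prefix and token format are hypothetical, chosen only for the demonstration, and Python's re module stands in for whatever regex engine the Secret Redactor uses.

```python
import re

# Hypothetical format for illustration: "acme_" plus 32 to 64 alphanumerics.
LOOSE = re.compile(r"acme_.*")                        # prefix only, no class or bounds
TIGHT = re.compile(r"\bacme_[A-Za-z0-9]{32,64}\b")    # prefix, class, length, boundaries

token = "acme_" + "A1b2" * 8          # 32 alphanumeric characters after the prefix
line = f"token={token} user=alice"

loose_hit = LOOSE.search(line).group(0)   # runs on to the end of the line
tight_hit = TIGHT.search(line).group(0)   # stops at the end of the token

print(loose_hit)
print(tight_hit)
print(TIGHT.search("acme_short"))         # None: too short to be a token
```

The loose pattern swallows the unrelated user=alice text, while the tight one matches only the token and ignores strings that are too short to be credentials.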
Example: Custom Internal Token
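Suppose your organization uses tokens formatted as myco_prod_[32 hex chars], with staging and dev variants that follow the same layout. The pattern below covers all three environments and expects lowercase hex: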
myco_(prod|staging|dev)_[0-9a-f]{32}
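One way to try this pattern outside the tool is a short Python sketch. The \b anchors, the non-capturing group, the redact_myco helper, and the placeholder text are assumptions added for illustration; they are not part of the Secret Redactor itself.

```python
import re

# The pattern from the text, with word boundaries added (an assumption) so that
# a longer hex run is not partially matched.
MYCO_TOKEN = re.compile(r"\bmyco_(?:prod|staging|dev)_[0-9a-f]{32}\b")

def redact_myco(text: str) -> str:
    """Replace every matching token with a fixed placeholder (hypothetical helper)."""
    return MYCO_TOKEN.sub("[REDACTED_MYCO_TOKEN]", text)

hex32 = "0123456789abcdef" * 2
sample = f"export TOKEN=myco_prod_{hex32}"
uppercase = f"export TOKEN=myco_prod_{hex32.upper()}"

print(redact_myco(sample))      # token replaced
print(redact_myco(uppercase))   # unchanged: the class only allows lowercase hex
```

If your tokens may contain uppercase hex, widen the character class to [0-9a-fA-F] rather than relying on a case-insensitive flag, so the prefix stays case-sensitive.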
Example: Internal Service Account ID
For a service that generates IDs like svc-12345678-abcd-efgh:
svc-[0-9]{8}-[a-z]{4}-[a-z]{4}
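When reviewing where a pattern fires, it can help to report match positions instead of echoing the secret. The sketch below does that for the service account pattern; the find_service_ids helper, the \b anchors, and the sample log line are illustrative assumptions.

```python
import re

SVC_ID = re.compile(r"\bsvc-[0-9]{8}-[a-z]{4}-[a-z]{4}\b")

def find_service_ids(text: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of each match without exposing the value."""
    return [(m.start(), m.end()) for m in SVC_ID.finditer(text)]

# The second ID has only four digits, so it should not match.
log = "auth ok for svc-12345678-abcd-efgh; ignored svc-1234-abcd-efgh"
spans = find_service_ids(log)

for start, end in spans:
    print(f"match at {start}-{end}, length {end - start}")
```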
Example: Encoded Credentials
Some systems use Base64-encoded credentials in headers:
Basic\s+[A-Za-z0-9+/]+=*
This matches HTTP Basic Authentication headers where the credentials are Base64-encoded.
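A hedged sketch of how this pattern might be used: the capture group keeps the Basic scheme name and redacts only the credentials. The ={0,2} padding bound (Base64 uses at most two = characters), the redact_basic helper, and the sample credentials are assumptions for illustration.

```python
import base64
import re

# Group 1 keeps "Basic " so the redacted header still shows the auth scheme.
BASIC_AUTH = re.compile(r"(Basic\s+)[A-Za-z0-9+/]+={0,2}")

def redact_basic(text: str) -> str:
    return BASIC_AUTH.sub(r"\1[REDACTED]", text)

creds = base64.b64encode(b"alice:s3cret").decode("ascii")   # made-up credentials
header = f"Authorization: Basic {creds}"
redacted = redact_basic(header)
print(redacted)

# Caveat: the pattern has no length bound, so ordinary text after "Basic" matches too.
challenge = 'WWW-Authenticate: Basic realm="api"'
false_positive = BASIC_AUTH.search(challenge)
print(false_positive.group(0))
```

The second search shows that a server challenge header is also flagged. Anchoring on the Authorization header name or adding a minimum length are possible ways to reduce such hits.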
Pattern Testing Methodology
Before deploying a custom pattern, test it against:
- Known secrets — Ensure it matches all variations of your credential format
- Similar non-secrets — Verify it does not flag commit hashes, UUIDs, or other legitimate strings
- Edge cases — Test with credentials at the start and end of lines, inside JSON strings, and in URL-encoded form
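The three categories above can be turned into a small test harness. The following is a minimal sketch, assuming the myco token pattern with \b anchors; the check helper and all sample strings are hypothetical.

```python
import re
from urllib.parse import quote

PATTERN = re.compile(r"\bmyco_(?:prod|staging|dev)_[0-9a-f]{32}\b")
HEX32 = "0123456789abcdef" * 2

should_match = [
    f"myco_prod_{HEX32}",                      # known secrets, one per environment
    f"myco_staging_{HEX32}",
    f"myco_dev_{HEX32}",
    f"myco_prod_{HEX32} trailing text",        # start of line
    f"leading text myco_prod_{HEX32}",         # end of line
    f'{{"token": "myco_dev_{HEX32}"}}',        # inside a JSON string
]
should_not_match = [
    "9fceb02d0ae598e95dc970b74767f19372d61af8",    # 40-hex commit hash
    "123e4567-e89b-12d3-a456-426614174000",        # UUID
    f"myco_test_{HEX32}",                          # unknown environment
    f"myco_prod_{HEX32[:31]}",                     # one character too short
    f"myco_prod_{HEX32}ab",                        # too long
]

def check(pattern, positives, negatives):
    """Return (kind, text) failures; an empty list means every case passed."""
    failures = [("missed", t) for t in positives if not pattern.search(t)]
    failures += [("false positive", t) for t in negatives if pattern.search(t)]
    return failures

print(check(PATTERN, should_match, should_not_match))

# URL-encoded context: "=" becomes "%3D", and "D" is a word character,
# so the leading \b no longer sees a boundary before "myco_".
url_encoded = "next=" + quote(f"/cb?t=myco_prod_{HEX32}", safe="")
print(check(PATTERN, [url_encoded], []))
```

The URL-encoded case is reported as missed, which is the kind of gap this testing is meant to surface. Whether to drop the leading \b or add a dedicated pattern for encoded text is a judgment call for your data.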
Common Pitfalls
- Greedy matching — Using .+ instead of .+? can consume too much text
- Missing anchors — Without \b or specific context, patterns match inside unrelated words
- Case sensitivity — Some credentials are case-sensitive, others are not
- Overlapping patterns — Multiple patterns matching the same text can cause double-redaction
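The greedy matching and case sensitivity pitfalls are easy to reproduce. The sketch below uses Python's re module with made-up sample text; the placeholder format is an assumption.

```python
import re

text = 'password="hunter2" note="keep this"'
replacement = 'password="[REDACTED]"'

# Greedy .+ runs to the last quote on the line and destroys the note field.
greedy = re.sub(r'password=".+"', replacement, text)
# Lazy .+? stops at the first closing quote.
lazy = re.sub(r'password=".+?"', replacement, text)
# A negated character class gives the same result without relying on laziness.
negated = re.sub(r'password="[^"]+"', replacement, text)

print(greedy)
print(lazy)
print(negated)

# Case sensitivity: decide explicitly whether uppercase variants should match.
strict = re.compile(r"\bmyco_prod_[0-9a-f]{32}\b")
relaxed = re.compile(r"\bmyco_prod_[0-9a-f]{32}\b", re.IGNORECASE)
upper = "MYCO_PROD_" + "0123456789ABCDEF" * 2

print(strict.search(upper) is None)
print(relaxed.search(upper) is not None)
```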
Combining Built-in and Custom Patterns
The Secret Redactor applies built-in patterns first, then custom patterns. This layered approach ensures comprehensive coverage: known formats are caught by optimized rules, while organization-specific formats are caught by custom regex.
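To make the layering concrete, here is a minimal sketch of one way ordered rule groups could be applied so that overlapping matches are redacted only once. It is not the Secret Redactor's actual implementation: the redact function, the rule names, and the stand-in built-in rule (the widely documented AKIA prefix of AWS access key IDs) are all assumptions.

```python
import re

# Hypothetical stand-ins for built-in and custom rules.
BUILTIN = [("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b"))]
CUSTOM = [
    ("myco_token", re.compile(r"\bmyco_(?:prod|staging|dev)_[0-9a-f]{32}\b")),
    ("hex32", re.compile(r"[0-9a-f]{32}")),   # deliberately overlaps myco_token
]

def redact(text, rule_groups):
    """Apply rule groups in order; earlier matches win over overlapping later ones."""
    claimed = []   # (start, end, name) spans that never overlap
    for rules in rule_groups:
        for name, pattern in rules:
            for m in pattern.finditer(text):
                if all(m.end() <= s or m.start() >= e for s, e, _ in claimed):
                    claimed.append((m.start(), m.end(), name))
    # Replace from right to left so earlier offsets stay valid.
    for start, end, name in sorted(claimed, reverse=True):
        text = text[:start] + f"[REDACTED:{name}]" + text[end:]
    return text

hex32 = "0123456789abcdef" * 2
sample = f"key=AKIAIOSFODNN7EXAMPLE token=myco_prod_{hex32}"
result = redact(sample, [BUILTIN, CUSTOM])
print(result)
```

Because the myco_token span is claimed first, the generic hex32 rule is skipped for the same text, which avoids the double-redaction problem described under Common Pitfalls.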
Use Case
A security team at a fintech company uses internally generated tokens with a proprietary format (company prefix + environment + random string). By adding a custom regex pattern to the Secret Redactor, they ensure these internal tokens are automatically detected alongside standard API keys and passwords whenever any team member redacts text.