Security
Pipeline Overview
Every knowledge entry passes through 3 stages before storage:
Stage 1: Regex Patterns (20+ built-in)
| Pattern | Example Detected |
|---|---|
| AWS Access Key | AKIAIOSFODNN7EXAMPLE |
| GitHub Token | ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh |
| OpenAI Key | sk-proj-... (80+ chars) |
| Stripe Secret | sk_live_... |
| Private Key | -----BEGIN RSA PRIVATE KEY----- |
| JWT | eyJhbGci... (3-part structure) |
| DB Connection | postgres://user:pass@host/db |
| Password Assignment | password="secret123" |
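As a rough illustration of how a Stage 1 pass might work, the sketch below masks a few of the pattern types from the table. The regular expressions, the `PATTERNS` list, the `mask_secrets` function, and the `aws_access_key` and `github_token` labels are assumptions made for this example and are not the built-in rules. Only the `[MASKED:...]` placeholder format and the `openai_key` and `connection_string` labels follow the Detection Examples section.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# The built-in rules are not shown in this document and may differ.
PATTERNS = [
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("github_token", re.compile(r"\bghp_[A-Za-z0-9]{30,}\b")),
    ("openai_key", re.compile(r"\bsk-proj-[A-Za-z0-9_-]{20,}")),
    # scheme://user:password@host... (requires embedded credentials)
    ("connection_string", re.compile(r"\b[a-z][a-z0-9+]*://[^\s:/@]+:[^\s@]+@\S+")),
]


def mask_secrets(text: str) -> str:
    """Replace every match of each pattern with a [MASKED:<label>] placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[MASKED:{label}]", text)
    return text


print(mask_secrets("My API key is sk-proj-abc123def456ghi789jkl012mno345pqr678stu901vwx234yz"))
print(mask_secrets("Connect to postgres://admin:s3cret_p4ss@db.prod.example.com:5432/myapp"))
print(mask_secrets("See https://docs.example.com/api/v1"))
```

In this sketch the connection-string pattern requires a `user:password@` part, which is why a plain URL is left untouched.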
Stage 2: Entropy Detection
Catches secrets that don't match known patterns. Uses Shannon entropy to find high-randomness strings that appear near keywords such as key, token, secret, or auth.
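The idea can be sketched as follows. This is a minimal illustration, not the product's implementation: the entropy threshold, the 16-character minimum, the 40-character keyword window, and the names `shannon_entropy` and `find_high_entropy_secrets` are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Keywords named in the text above; matching is case-insensitive.
KEYWORDS = re.compile(r"key|token|secret|auth", re.IGNORECASE)
# Assumed candidate shape: runs of 16+ token-like characters.
CANDIDATE = re.compile(r"[A-Za-z0-9+/_\-]{16,}")
ENTROPY_THRESHOLD = 4.0  # assumed value, in bits per character
WINDOW = 40  # assumed: characters before the candidate searched for a keyword


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def find_high_entropy_secrets(text: str) -> list[str]:
    """Return candidate strings that are both near a keyword and high-entropy."""
    hits = []
    for m in CANDIDATE.finditer(text):
        context = text[max(0, m.start() - WINDOW):m.start()]
        if KEYWORDS.search(context) and shannon_entropy(m.group()) >= ENTROPY_THRESHOLD:
            hits.append(m.group())
    return hits


print(find_high_entropy_secrets("Set SECRET_TOKEN=aF3kR9mN2xP7qW4sD8jL1bV6yT0uE5hZ"))
```

In the sample string all 32 characters of the value are distinct, so its entropy is log2(32) = 5 bits per character, which is above the assumed threshold. Requiring a nearby keyword is what keeps random-looking but harmless strings, such as commit hashes in ordinary prose, from being flagged in this sketch.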
Stage 3: AI Classification
GPT-5-mini serves as a fallback for ambiguous content. It runs only when Stages 1 and 2 find nothing and the text is longer than 100 characters.
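The gating rule for the fallback can be sketched like this. The `scan` function and its callable parameters are hypothetical stand-ins for the three stages; no real model API is called, and the sketch assumes the rule means "Stages 1 and 2 found nothing, and the text is longer than 100 characters".

```python
from typing import Callable, List

MIN_AI_LENGTH = 100  # from the text: Stage 3 applies only to text longer than this

Stage = Callable[[str], List[str]]  # hypothetical: a stage returns a list of findings


def scan(text: str, regex_stage: Stage, entropy_stage: Stage, ai_stage: Stage) -> List[str]:
    """Run Stages 1 and 2, and call the AI stage only as a fallback."""
    findings = regex_stage(text) + entropy_stage(text)
    if findings:
        return findings  # earlier stages found something, so Stage 3 is skipped
    if len(text) > MIN_AI_LENGTH:
        return ai_stage(text)
    return []  # short text with no findings is never sent to the model


# Usage with stub stages (assumed behavior, for illustration only).
nothing: Stage = lambda text: []
fake_ai: Stage = lambda text: ["ai_flagged"]
print(scan("short note", nothing, nothing, fake_ai))
print(scan("x" * 101, nothing, nothing, fake_ai))
```

Ordering the stages this way keeps the model call off the common path: short entries and entries already flagged by the cheaper stages never reach it.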
Detection Examples
Input:
My API key is sk-proj-abc123def456ghi789jkl012mno345pqr678stu901vwx234yz
After protection:
My API key is [MASKED:openai_key]

Input:
Connect to postgres://admin:s3cret_p4ss@db.prod.example.com:5432/myapp
After protection:
Connect to [MASKED:connection_string]

Input:
Set SECRET_TOKEN=aF3kR9mN2xP7qW4sD8jL1bV6yT0uE5hZ
After protection:
Set [MASKED:env_secret_assignment]

What Passes Safely
These are not detected as secrets:
- UUIDs: 550e8400-e29b-41d4-a716-446655440000
- URLs without credentials: https://docs.example.com/api/v1
- Short hashes: abc123def (under 16 characters)
- Package versions: next@16.1.6
- Normal code: function names, variable names, imports
Custom Rules
Projects can add custom regex patterns via Settings > Security Rules:
- Add a regex pattern (e.g., INTERNAL_.*_KEY)
- Choose action: Mask (replace with placeholder) or Block (prevent saving)
- Test with sample input before saving
- Enable/disable rules at any time
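To make the Mask and Block actions concrete, here is a minimal sketch of how such rules might be applied to an entry. Rules are configured through the Settings UI, so everything here is hypothetical: the `CustomRule` dataclass, the `apply_rules` function, the `BlockedError` exception, and the `[MASKED:custom]` placeholder are names invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


class BlockedError(Exception):
    """Raised when a Block rule matches and the entry must not be saved."""


@dataclass
class CustomRule:
    pattern: str
    action: str  # "mask" or "block"
    enabled: bool = True


def apply_rules(text: str, rules: List[CustomRule]) -> str:
    """Apply enabled rules in order: mask matches, or refuse the entry on a block match."""
    for rule in rules:
        if not rule.enabled:
            continue
        regex = re.compile(rule.pattern)
        if rule.action == "block" and regex.search(text):
            raise BlockedError(f"matched rule {rule.pattern!r}")
        if rule.action == "mask":
            text = regex.sub("[MASKED:custom]", text)
    return text


rules = [CustomRule(pattern=r"INTERNAL_.*_KEY", action="mask")]
print(apply_rules("export INTERNAL_BILLING_KEY here", rules))
```

One thing worth checking when testing a rule with sample input: `.*` is greedy, so a pattern like `INTERNAL_.*_KEY` matches from the first `INTERNAL_` to the last `_KEY` on a line and can swallow the text in between. A narrower pattern such as `INTERNAL_\w*_KEY` limits the match to a single identifier.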