SCAN Plugin Pattern Reference
This document provides a comprehensive reference for all built-in patterns used by the SCAN plugin and guidance on creating custom patterns.
Pattern Overview
SCAN uses regular expressions (regex) to identify potential secrets in your codebase. The plugin includes over 50 built-in patterns covering common secret types, and you can add custom patterns for organization-specific secrets.
How Patterns Work
- Pattern Matching: SCAN scans each file line by line, applying regex patterns
- Context Analysis: The surrounding code context is analyzed to reduce false positives
- Entropy Analysis: High-entropy strings are flagged even if they don't match specific patterns
- Confidence Scoring: Each finding is assigned a confidence level based on multiple factors
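The steps above can be sketched in Kotlin as follows. This is a minimal illustration, not SCAN's actual implementation: the names Finding, shannonEntropy, and scanLines, the 4.0-bit entropy threshold, and the 20-character minimum token length are all assumptions chosen for the example. Context analysis and confidence scoring are left out.

```kotlin
import kotlin.math.ln

// Hypothetical result type; SCAN's real finding model is not described in this document.
data class Finding(val lineNumber: Int, val rule: String, val match: String)

// Shannon entropy of a string, in bits per character.
fun shannonEntropy(s: String): Double {
    if (s.isEmpty()) return 0.0
    return s.groupingBy { it }.eachCount().values.sumOf { count ->
        val p = count.toDouble() / s.length
        -p * (ln(p) / ln(2.0))
    }
}

// Apply every named pattern to every line, then flag long high-entropy tokens.
fun scanLines(
    lines: List<String>,
    patterns: Map<String, Regex>,
    entropyThreshold: Double = 4.0, // assumed value, not SCAN's default
): List<Finding> {
    val token = Regex("[A-Za-z0-9/+_=-]{20,}") // assumed shape of a candidate token
    val findings = mutableListOf<Finding>()
    lines.forEachIndexed { index, line ->
        for ((name, regex) in patterns) {
            regex.findAll(line).forEach { findings += Finding(index + 1, name, it.value) }
        }
        token.findAll(line)
            .filter { shannonEntropy(it.value) >= entropyThreshold }
            .forEach { findings += Finding(index + 1, "high-entropy", it.value) }
    }
    return findings
}

fun main() {
    val patterns = mapOf("aws-access-key-id" to Regex("AKIA[0-9A-Z]{16}"))
    val lines = listOf(
        "val region = \"us-east-1\"",
        "val key = \"AKIAIOSFODNN7EXAMPLE\"",
    )
    scanLines(lines, patterns).forEach(::println)
}
```

Keeping pattern matching and the entropy check as separate passes over each line means a string can be reported by either mechanism independently, which mirrors the description above.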
Built-in Patterns
AWS Patterns
AWS Access Key ID
Pattern: AKIA[0-9A-Z]{16}
Example: AKIAIOSFODNN7EXAMPLE
Confidence: High
AWS Secret Access Key
Pattern: aws(.{0,20})?['"][0-9a-zA-Z/+]{40}['"]
Example: aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
Confidence: High
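As a quick check of the two AWS patterns, the Kotlin sketch below runs them against the documented examples. It only exercises the regular expressions with the standard Regex class; it does not call SCAN. The third sample is an invented line that shows the secret-key pattern depends on the text aws appearing shortly before the quoted value.

```kotlin
// Patterns copied from the two AWS entries above. Raw strings avoid double escaping.
val awsAccessKeyId = Regex("""AKIA[0-9A-Z]{16}""")
val awsSecretAccessKey = Regex("""aws(.{0,20})?['"][0-9a-zA-Z/+]{40}['"]""")

fun main() {
    val secret = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    val samples = listOf(
        "AKIAIOSFODNN7EXAMPLE",
        "aws_secret_access_key=\"$secret\"",
        "secret_key=\"$secret\"", // same value, but no "aws" in front of it
    )
    for (sample in samples) {
        val id = awsAccessKeyId.containsMatchIn(sample)
        val key = awsSecretAccessKey.containsMatchIn(sample)
        println("$sample -> accessKeyId=$id, secretAccessKey=$key")
    }
}
```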
Google Cloud Platform (GCP)
GCP API Key
Pattern: AIza[0-9A-Za-z_-]{35}
Example: AIzaSyDaGmWKa4JsXZ-HjGw7ISLn_3namBGewQe
Confidence: High
Azure
Azure Storage Account Key
Pattern: DefaultEndpointsProtocol=https;AccountName=.*;AccountKey=.*
Description: Azure storage connection strings
Confidence: High
Cloud Services
GitHub
GitHub Personal Access Token
Pattern: ghp_[A-Za-z0-9]{36}
Example: ghp_1234567890abcdef1234567890abcdef1234
Confidence: High
GitHub OAuth Token
Pattern: gho_[A-Za-z0-9]{36}
Description: GitHub OAuth access tokens
Confidence: High
GitLab
GitLab Personal Access Token
Pattern: glpat-[A-Za-z0-9_-]{20}
Description: GitLab personal access tokens
Confidence: High
Database Credentials
Generic Database URLs
Database Connection String
Pattern: (mysql|postgresql|mongodb|redis)://[^\s]+:[^\s]+@[^\s]+
Example: mysql://user:password@localhost:3306/database
Confidence: High
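The following Kotlin sketch shows how a connection-string pattern of this shape behaves, using \s (whitespace) inside the negated character classes. The two configuration lines are invented for illustration. The match ends at the first whitespace character, and a URL without embedded credentials is not reported because the pattern requires user:password@.

```kotlin
// Connection-string pattern with \s (whitespace) inside the negated classes.
val databaseUrl = Regex("""(mysql|postgresql|mongodb|redis)://[^\s]+:[^\s]+@[^\s]+""")

fun main() {
    val lines = listOf(
        "db.url = mysql://user:password@localhost:3306/database  # primary",
        "db.url = postgresql://localhost:5432/app", // no credentials in the URL
    )
    for (line in lines) {
        println(databaseUrl.find(line)?.value ?: "no match")
    }
}
```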
API Keys and Tokens
Service-Specific APIs
Slack Token
Pattern: xox[baprs]-[A-Za-z0-9]{10,48}
Example: xoxb-1234567890-1234567890-abcdefghijklmnopqrstuvwx
Confidence: High
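Note that the character class in this pattern contains no hyphen, so on a multi-segment token the matched text ends at the second hyphen. The line is still flagged, but the reported match may be shorter than the full token. A small Kotlin sketch of that behavior, using only the standard Regex class:

```kotlin
// Pattern copied from the Slack entry above.
val slackToken = Regex("xox[baprs]-[A-Za-z0-9]{10,48}")

fun main() {
    val example = "xoxb-1234567890-1234567890-abcdefghijklmnopqrstuvwx"
    // Prints the matched portion, which covers only the first segment after the prefix.
    println(slackToken.find(example)?.value)
}
```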
Stripe API Key
Pattern: sk_live_[A-Za-z0-9]{24}
Description: Stripe live API keys
Confidence: High
Cryptographic Keys
Private Key
Pattern: -----BEGIN [A-Z]+ PRIVATE KEY-----
Description: PEM-formatted private keys
Confidence: High
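The Kotlin sketch below checks which PEM headers this pattern matches. Because the pattern requires an algorithm label between BEGIN and PRIVATE KEY, labeled headers such as RSA or OPENSSH match, while the unlabeled PKCS#8 header does not. The second regex, anyPrivateKeyHeader, is a hypothetical custom pattern that makes the label optional; it is not one of SCAN's built-ins.

```kotlin
// Pattern copied from the Private Key entry above.
val privateKeyHeader = Regex("-----BEGIN [A-Z]+ PRIVATE KEY-----")

// Hypothetical custom variant that makes the algorithm label optional.
val anyPrivateKeyHeader = Regex("-----BEGIN ([A-Z]+ )?PRIVATE KEY-----")

fun main() {
    val headers = listOf(
        "-----BEGIN RSA PRIVATE KEY-----",
        "-----BEGIN OPENSSH PRIVATE KEY-----",
        "-----BEGIN PRIVATE KEY-----", // unlabeled PKCS#8 header
        "-----BEGIN PUBLIC KEY-----",
    )
    for (header in headers) {
        val builtIn = privateKeyHeader.containsMatchIn(header)
        val custom = anyPrivateKeyHeader.containsMatchIn(header)
        println("$header -> builtIn=$builtIn, custom=$custom")
    }
}
```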
Custom Pattern Creation
Basic Custom Patterns
Add custom patterns to catch organization-specific secrets:
scan {
    customPatterns = listOf(
        "MYCOMPANY_API_[A-Z0-9]{32}",
        "INTERNAL_SECRET_[a-f0-9]{64}",
        "PROD_KEY_[A-Za-z0-9\\-_]{40}"
    )
}
Advanced Custom Patterns
scan {
    customPatterns = listOf(
        // Match quoted password assignments, case-insensitively
        "(?i)password\\s*[=:]\\s*[\"'][^\"']{8,}[\"']",
        // Match API key assignments (api_key, api-key, apikey); case-sensitive
        "(?:api[_-]?key|apikey)\\s*[=:]\\s*[\"']?([A-Za-z0-9]{20,})[\"']?",
        // Organization-specific format
        "ACME_[A-Z]{2}_[0-9]{8}_[A-Za-z0-9]{16}"
    )
}
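Before adding patterns like these to the build, it can help to exercise them in isolation. The Kotlin sketch below runs the three advanced patterns against invented sample lines with the standard Regex class; it does not invoke SCAN. It also highlights one difference between them: only the first pattern carries the (?i) flag, so the API key pattern does not match an upper-case variable name.

```kotlin
// The three advanced patterns from the block above, as Kotlin string literals.
val passwordAssignment = Regex("(?i)password\\s*[=:]\\s*[\"'][^\"']{8,}[\"']")
val apiKeyAssignment = Regex("(?:api[_-]?key|apikey)\\s*[=:]\\s*[\"']?([A-Za-z0-9]{20,})[\"']?")
val acmeToken = Regex("ACME_[A-Z]{2}_[0-9]{8}_[A-Za-z0-9]{16}")

fun main() {
    // (?i) makes the first pattern match PASSWORD as well as password.
    println(passwordAssignment.containsMatchIn("DB_PASSWORD: 'correct-horse-battery'"))

    // The capture group isolates the key value from the rest of the assignment.
    println(apiKeyAssignment.find("api_key = \"abcdefghij0123456789\"")?.groupValues?.get(1))

    // No (?i) on the second pattern, so an upper-case name is not matched.
    println(apiKeyAssignment.containsMatchIn("API_KEY = \"abcdefghij0123456789\""))

    println(acmeToken.containsMatchIn("ACME_US_20240101_abcdEFGH12345678"))
}
```

If upper-case names such as API_KEY should also be caught, prefixing the second pattern with (?i) is one option, at the cost of more potential false positives.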
Pattern Files
You can organize custom patterns in separate YAML files for better maintainability:
# patterns/custom-patterns.yml
patterns:
  - name: "Company API Key"
    regex: "COMPANY_API_[A-Z0-9]{32}"
    confidence: "high"
    description: "Internal company API keys"
  - name: "Internal Secret"
    regex: "INTERNAL_SECRET_[A-Za-z0-9]{16,}"
    confidence: "medium"
    description: "Internal application secrets"
Loading Pattern Files
scan {
    customPatternsFile = "patterns/custom-patterns.yml"
}
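A pattern file is easier to maintain if each entry is checked before it is committed. The Kotlin sketch below validates entries that have already been read into memory; parsing the YAML itself is out of scope here. PatternEntry, knownConfidenceLevels, and validate are hypothetical names, and the "low" confidence level is an assumption, since only "high" and "medium" appear in the example file. It relies on Kotlin/JVM, where an invalid regex raises PatternSyntaxException.

```kotlin
import java.util.regex.PatternSyntaxException

// Hypothetical in-memory form of one entry in the pattern file.
data class PatternEntry(val name: String, val regex: String, val confidence: String)

// "high" and "medium" appear in the example file; "low" is an assumed third level.
val knownConfidenceLevels = setOf("high", "medium", "low")

// Returns human-readable problems; an empty list means every entry passed.
fun validate(entries: List<PatternEntry>): List<String> = entries.flatMap { entry ->
    val problems = mutableListOf<String>()
    if (entry.confidence !in knownConfidenceLevels) {
        problems += "${entry.name}: unknown confidence '${entry.confidence}'"
    }
    try {
        Regex(entry.regex) // compiles the pattern; throws if the syntax is invalid
    } catch (e: PatternSyntaxException) {
        problems += "${entry.name}: invalid regex (${e.description})"
    }
    problems
}

fun main() {
    val entries = listOf(
        PatternEntry("Company API Key", "COMPANY_API_[A-Z0-9]{32}", "high"),
        PatternEntry("Internal Secret", "INTERNAL_SECRET_[A-Za-z0-9]{16,}", "medium"),
        PatternEntry("Broken", "INTERNAL_[A-Z", "critical"), // deliberately invalid
    )
    validate(entries).forEach(::println)
}
```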
Best Practices
Pattern Design
- Be Specific: Avoid overly broad patterns that cause false positives
- Use Anchors: Use ^ and $ when matching entire lines
- Consider Context: Think about where the pattern might appear
- Test Thoroughly: Validate against real codebases
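One way to apply the points above is to keep a small set of lines that a pattern must flag and lines it must leave alone, and check both before the pattern is deployed. The Kotlin sketch below is a minimal harness along those lines; PatternSpec and failures are hypothetical names, and the sample lines are invented. It tests the regular expressions directly rather than running SCAN.

```kotlin
// Hypothetical test case: a pattern plus lines it must and must not flag.
data class PatternSpec(
    val pattern: String,
    val shouldMatch: List<String>,
    val shouldNotMatch: List<String>,
)

// Returns a description of every sample the pattern handles incorrectly.
fun failures(spec: PatternSpec): List<String> {
    val regex = Regex(spec.pattern)
    val missed = spec.shouldMatch
        .filterNot { regex.containsMatchIn(it) }
        .map { "missed: $it" }
    val falsePositives = spec.shouldNotMatch
        .filter { regex.containsMatchIn(it) }
        .map { "false positive: $it" }
    return missed + falsePositives
}

fun main() {
    val specific = PatternSpec(
        pattern = "ACME_[A-Z]{2}_[0-9]{8}_[A-Za-z0-9]{16}",
        shouldMatch = listOf("token = ACME_US_20240101_abcdEFGH12345678"),
        shouldNotMatch = listOf(
            "ACME_us_20240101_abcdEFGH12345678", // lower-case region code
            "ACME_US_2024_abcdEFGH12345678",     // date segment too short
        ),
    )
    val broad = PatternSpec(
        pattern = ".*password.*",
        shouldMatch = listOf("password = \"hunter2hunter2\""),
        shouldNotMatch = listOf("// ask the user to reset their password"),
    )
    println(failures(specific)) // no failures expected for the specific pattern
    println(failures(broad))    // the broad pattern flags the harmless comment
}
```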
Common Pitfalls
❌ Overly Broad Patterns
// Too broad - matches any line that contains "password"
".*password.*"
// Better - more specific
"password\\s*=\\s*[\"'][^\"']{8,}[\"']"
✅ Case Insensitive Patterns
// Case sensitive - might miss variations
"Password"
// Case insensitive
"(?i)password"
Related Documentation
- Configuration Reference: How to configure custom patterns
- User Guide: Complete walkthrough with examples
- CI/CD Examples: Pattern usage in automation