DefaultRiskScorer evaluates the action across 5 independent factors and produces a weighted composite score between 0.0 and 1.0. This score determines which challenge the human operator must complete before execution proceeds.
The 5 Scoring Factors
| Factor | Weight | What It Analyzes |
|---|---|---|
| function_name | 0.30 | Verb classification: destructive, mutating, or read-only |
| arguments | 0.25 | Sensitive patterns in argument values (credentials, SQL, shell commands) |
| docstring | 0.20 | High-risk or caution keywords in the function’s docstring |
| hints | 0.15 | Caller-supplied risk hints (boolean flags, numeric values) |
| novelty | 0.10 | How many times this function has been called before |
All factor scores are clamped to the [0.0, 1.0] range before weighting. The final composite score is also clamped, ensuring it never exceeds 1.0.
Factor 1: Function Name (weight: 0.30)
The scorer extracts the verb from the function name and classifies it into one of three tiers.
Destructive Verbs (score: 0.95)
Verbs in this tier, such as delete, indicate irreversible or highly impactful operations.
Mutating Verbs (score: 0.55)
Verbs in this tier modify state, but the change is typically reversible.
Read Verbs (score: 0.10)
Verbs in this tier indicate read-only operations with minimal risk.
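The three-tier classification can be sketched as follows. This is a minimal illustration, not the library's implementation: the verb sets are hypothetical (only "delete" is confirmed as destructive by the worked example on this page), the verb is assumed to be the first underscore-separated token, and the fallback score for unrecognized verbs is an assumption.

```python
# Hypothetical verb sets for illustration; only "delete" is confirmed by this page.
DESTRUCTIVE_VERBS = {"delete", "drop", "destroy"}
MUTATING_VERBS = {"update", "set", "create"}
READ_VERBS = {"get", "list", "read"}


def function_name_score(function_name: str, unknown_score: float = 0.55) -> float:
    """Score a function name by its leading verb (assumes snake_case names)."""
    verb = function_name.lower().split("_", 1)[0]
    if verb in DESTRUCTIVE_VERBS:
        return 0.95
    if verb in MUTATING_VERBS:
        return 0.55
    if verb in READ_VERBS:
        return 0.10
    # Assumption: the page does not say how unrecognized verbs are scored.
    return unknown_score


print(function_name_score("delete_user"))
```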
Factor 2: Arguments (weight: 0.25)
The scorer scans all argument values (converted to strings) for patterns that indicate sensitive or dangerous content.
Sensitive Patterns
| Category | Patterns |
|---|---|
| Credentials | production, .env, secret, password, token, key, credential |
| Dangerous SQL | DROP, DELETE, TRUNCATE, ALTER |
| Shell dangers | rm -rf, sudo, chmod 777 |
| Network | URLs, email addresses, IP addresses |
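A rough sketch of such a scan is shown below. It only reports which categories matched. It does not assign a score, because this page does not give the per-pattern scores. The function name, the case-insensitive substring matching, and the network regular expressions are all assumptions made for illustration.

```python
import re

# Substring patterns taken from the table above (matched case-insensitively here,
# which is an assumption).
SUBSTRING_PATTERNS = {
    "Credentials": ["production", ".env", "secret", "password", "token", "key", "credential"],
    "Dangerous SQL": ["drop", "delete", "truncate", "alter"],
    "Shell dangers": ["rm -rf", "sudo", "chmod 777"],
}

# Simplified, hypothetical regexes for URLs, email addresses, and IPv4 addresses.
NETWORK_REGEXES = [
    re.compile(r"https?://\S+"),
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
]


def find_sensitive_patterns(*args, **kwargs) -> dict:
    """Return {category: set of matches} for all argument values, as strings."""
    hits = {}
    for value in [str(v) for v in list(args) + list(kwargs.values())]:
        lowered = value.lower()
        for category, needles in SUBSTRING_PATTERNS.items():
            for needle in needles:
                if needle in lowered:
                    hits.setdefault(category, set()).add(needle)
        for regex in NETWORK_REGEXES:
            match = regex.search(value)
            if match:
                hits.setdefault("Network", set()).add(match.group(0))
    return hits


print(find_sensitive_patterns("usr_123", env="production"))
```

Plain substring matching is deliberately naive: "key" would also match "monkey", so a real scanner would likely need word boundaries or stricter rules.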
Factor 3: Docstring (weight: 0.20)
The scorer inspects the function’s docstring for keywords that signal risk.
High-Risk Keywords (score: 0.85)
Caution Keywords (score: 0.50)
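The keyword check might look like the sketch below. The keyword sets are hypothetical apart from "permanently", which the worked example on this page treats as high-risk. Giving high-risk keywords precedence over caution keywords, and scoring a docstring with no matches as 0.0, are both assumptions.

```python
# Hypothetical keyword sets; only "permanently" is confirmed by this page.
HIGH_RISK_KEYWORDS = {"permanently", "irreversible"}
CAUTION_KEYWORDS = {"modifies", "overwrites"}


def docstring_score(docstring) -> float:
    """Score a docstring by the riskiest keyword tier it contains."""
    text = (docstring or "").lower()
    if any(keyword in text for keyword in HIGH_RISK_KEYWORDS):
        return 0.85
    if any(keyword in text for keyword in CAUTION_KEYWORDS):
        return 0.50
    # Assumption: no keywords (or no docstring) contributes nothing.
    return 0.0


print(docstring_score("Permanently remove a user account."))
```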
Factor 4: Hints (weight: 0.15)
Callers can supply explicit risk hints to influence the score. Hints come in two forms: boolean and numeric.
Boolean Hints
Each boolean hint set to True adds 0.30 to the hint score (cumulative, clamped to 1.0).
Numeric Hints
Numeric hints are scaled as min(value / 10000, 1.0) * 0.8, so a value of 5000 contributes 0.4 and any value of 10000 or more contributes the maximum of 0.8.
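The two hint rules can be sketched as below. The per-hint formulas come from this page. The function names are hypothetical, and summing boolean and numeric contributions before clamping is an assumption, since the page does not say how the two forms are combined.

```python
def clamp(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    return max(lo, min(hi, x))


def boolean_hint_score(hints: dict) -> float:
    """Each hint that is exactly True adds 0.30, clamped to 1.0."""
    true_count = sum(1 for value in hints.values() if value is True)
    return clamp(0.30 * true_count)


def numeric_hint_score(value: float) -> float:
    """Scale a numeric hint as min(value / 10000, 1.0) * 0.8."""
    return min(value / 10000, 1.0) * 0.8


def hint_score(hints: dict) -> float:
    """Assumed combination: sum both kinds of hint, then clamp to [0.0, 1.0]."""
    numeric_values = [
        value for value in hints.values()
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    ]
    total = boolean_hint_score(hints) + sum(numeric_hint_score(v) for v in numeric_values)
    return clamp(total)


print(hint_score({"irreversible": True, "amount": 5000}))
```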
Factor 5: Novelty (weight: 0.10)
The novelty factor captures how familiar a particular function call is. The first invocation of any gated function is treated as highest novelty; the score decreases linearly with subsequent calls.
| Call Number | Novelty Score |
|---|---|
| 1st call | 0.90 |
| 2nd call | 0.81 |
| 3rd call | 0.72 |
| 5th call | 0.54 |
| 10th call | 0.10 |
| 11th+ call | 0.10 (floor) |
The score starts at 0.9 for the first call, decreases linearly to 0.1 by the 10th call, and remains at 0.1 thereafter.
Novelty tracking is per-function, per-session. A new session resets all novelty counters.
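One way to reproduce the table is a per-function call counter with a fixed step and a floor. The class name and the step of 0.09 per call are assumptions chosen because they match the listed values; the actual formula may differ. Creating a new tracker stands in for starting a new session.

```python
from collections import Counter


class NoveltyTracker:
    """Hypothetical per-session tracker: one instance per session."""

    def __init__(self) -> None:
        self._calls = Counter()

    def score(self, function_name: str) -> float:
        """Record a call and return its novelty score."""
        self._calls[function_name] += 1
        call_number = self._calls[function_name]
        # Assumed formula: start at 0.9, drop 0.09 per call, floor at 0.1.
        return max(0.1, 0.9 - 0.09 * (call_number - 1))


tracker = NoveltyTracker()
scores = [round(tracker.score("delete_user"), 2) for _ in range(11)]
print(scores)
```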
Worked Example
Consider a call to delete_user("usr_123", env="production"), whose docstring is "Permanently remove a user account.", made for the first time with no hints:
| Factor | Raw Score | Weight | Contribution |
|---|---|---|---|
| function_name: “delete” is destructive | 0.95 | 0.30 | 0.285 |
| arguments: “production” detected | ~0.70 | 0.25 | 0.175 |
| docstring: “permanently” detected | 0.85 | 0.20 | 0.170 |
| hints: none provided | 0.00 | 0.15 | 0.000 |
| novelty: first call | 0.90 | 0.10 | 0.090 |
| Total | | | 0.720 |
A composite score of 0.72 maps to risk level HIGH, which triggers a quiz challenge.
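The arithmetic in the table can be checked with a short sketch. The weights and raw scores are taken from this page. The function and variable names are hypothetical, and the argument score of 0.70 is the approximate value quoted in the table.

```python
WEIGHTS = {
    "function_name": 0.30,
    "arguments": 0.25,
    "docstring": 0.20,
    "hints": 0.15,
    "novelty": 0.10,
}


def clamp(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    return max(lo, min(hi, x))


def composite_score(factor_scores: dict) -> float:
    """Clamp each factor score, apply its weight, then clamp the sum."""
    total = sum(
        weight * clamp(factor_scores.get(name, 0.0))
        for name, weight in WEIGHTS.items()
    )
    return clamp(total)


example = {
    "function_name": 0.95,  # "delete" is destructive
    "arguments": 0.70,      # "production" detected (approximate)
    "docstring": 0.85,      # "permanently" detected
    "hints": 0.00,          # none provided
    "novelty": 0.90,        # first call
}
print(round(composite_score(example), 3))  # 0.72
```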
Custom Scorers
DefaultRiskScorer covers general-purpose use cases. For domain-specific scoring, you can use a CompositeRiskScorer to blend the default scorer with custom logic, or replace it entirely with your own implementation.
Risk Levels
See how scores map to LOW, MEDIUM, HIGH, and CRITICAL
Risk Scorers
Combine or replace scorers with Composite, Max, and Fixed