How we benchmark domains: scoring methodology
A "score / 100" on a security report only means something if you know how it was computed. Otherwise it's astrology. Here is the entire algorithm UnveilScan uses, end-to-end, with no hand-waving.
The four-category model
Every checker we run belongs to exactly one of four categories: DNS, TLS, WEB, or EMAIL. Each category gets its own 0-100 subscore (computed below). The global score is a weighted average:
global = (DNS×20 + TLS×30 + WEB×30 + EMAIL×20) / 100
The weights:
| Category | Weight | Why |
|---|---|---|
| DNS | 20% | Foundation but rarely actively exploited; matters for hijacking and reputation. |
| TLS | 30% | Direct attack surface; misconfigured TLS = MITM possibility. |
| WEB | 30% | The biggest day-to-day surface (XSS, headers, CSP, leaks, CVEs). |
| EMAIL | 20% | Mail spoofing has direct phishing and brand-damage cost. |
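The weighted average is a one-liner. Here's a minimal sketch of it in Go (the identifiers are ours for illustration, not the actual scorer's):

```go
package main

import "fmt"

// Category weights from the table above, expressed as percentages.
var weights = map[string]int{"DNS": 20, "TLS": 30, "WEB": 30, "EMAIL": 20}

// globalScore implements global = (DNS×20 + TLS×30 + WEB×30 + EMAIL×20) / 100.
// Each subscore is assumed to already be clamped to [0, 100].
func globalScore(sub map[string]int) float64 {
	total := 0
	for cat, w := range weights {
		total += sub[cat] * w
	}
	return float64(total) / 100
}

func main() {
	sub := map[string]int{"DNS": 100, "TLS": 40, "WEB": 70, "EMAIL": 85}
	fmt.Println(globalScore(sub)) // (2000 + 1200 + 2100 + 1700) / 100 = 70
}
```

Integer weights summing to 100 keep the arithmetic exact; the division by 100 lands the result back on the 0-100 scale.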
How a category subscore is computed
Each category starts at 100. Every finding (an issue raised by a checker) subtracts a fixed penalty based on its severity:
| Severity | Penalty | Examples |
|---|---|---|
| info | 0 | "X-Powered-By header leaks tech stack" — informational, no score impact. |
| low | 5 | "X-Content-Type-Options missing", "Permissions-Policy missing". |
| medium | 15 | "HSTS missing", "CSP missing", "DMARC pct<100". |
| high | 30 | "Subdomain takeover vulnerable", "DKIM key <1024 bits". |
| critical | 60 | "TLS 1.0 enabled", "Apache CVE-2021-41773 confirmed", "actuator/env exposed". |
Two special cases:
- Checker error / skip = -10 points to the category. The signal is "we couldn't evaluate this part of your posture", which is as much a configuration concern as a finding.
- Transient error (network hiccup on our side, upstream API down) = ignored. It's logged, but it doesn't count against the score.
Each category subscore is then clamped to [0, 100]. So a TLS subscore can't go negative — once you've accumulated enough penalty to zero it, additional TLS findings don't make the total worse. This is a deliberate choice, not a bug: a server that's catastrophically broken on TLS is broken regardless of whether the count is 5 or 50.
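Putting the severity table, the error penalty, and the clamp together, a category subscore can be sketched like this (a hypothetical rendering of the rules above; function and variable names are ours):

```go
package main

import "fmt"

// Per-severity penalties from the table above.
var penalty = map[string]int{"info": 0, "low": 5, "medium": 15, "high": 30, "critical": 60}

const errorPenalty = 10 // checker error or skip

// subscore starts at 100, subtracts a fixed penalty per finding and 10
// per checker error, then clamps the result to [0, 100].
func subscore(severities []string, checkerErrors int) int {
	s := 100
	for _, sev := range severities {
		s -= penalty[sev]
	}
	s -= checkerErrors * errorPenalty
	if s < 0 {
		s = 0 // the clamp: extra findings past zero change nothing
	}
	return s
}

func main() {
	// Two criticals already floor the category; a third changes nothing.
	fmt.Println(subscore([]string{"critical", "critical"}, 0))             // 0
	fmt.Println(subscore([]string{"critical", "critical", "critical"}, 0)) // 0
	// One medium, one low, plus one checker error: 100 - 15 - 5 - 10.
	fmt.Println(subscore([]string{"medium", "low"}, 1)) // 70
}
```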
The "fix-everything ceiling"
Alongside your current score we expose a ceiling: the score you would reach if every active finding were fixed right now. Two cases:
- Ceiling = 100 → your domain has only fixable issues. Clean slate possible.
- Ceiling < 100 → some checkers errored or skipped. Even with perfect findings, the -10 per error keeps you below 100. The /scan/[id] page tells you exactly how many.
This makes the score honest. The owner of a domain at "ceiling 92, current 67" knows there are 25 points of recoverable headroom plus 8 points held by infrastructure issues outside their direct control. That's actionable in a way "67/100" alone isn't.
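Conceptually, the ceiling is just the global score with every finding removed and only the per-error penalties left in place. A minimal sketch, assuming the weights and penalties described above (identifiers are illustrative):

```go
package main

import "fmt"

// Category weights from the weights table, as percentages.
var weights = map[string]int{"DNS": 20, "TLS": 30, "WEB": 30, "EMAIL": 20}

// ceiling computes the score you would reach if every active finding
// were fixed: each category keeps only its -10-per-error penalty,
// clamped to [0, 100], then the usual weighted average applies.
func ceiling(errorsByCategory map[string]int) float64 {
	total := 0
	for cat, w := range weights {
		sub := 100 - 10*errorsByCategory[cat]
		if sub < 0 {
			sub = 0
		}
		total += sub * w
	}
	return float64(total) / 100
}

func main() {
	// Two TLS checkers errored and one EMAIL checker skipped:
	// TLS sub = 80, EMAIL sub = 90, DNS and WEB untouched.
	fmt.Println(ceiling(map[string]int{"TLS": 2, "EMAIL": 1})) // 92
}
```

A domain with no checker errors gets `ceiling(...) == 100`, matching the "clean slate possible" case above.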
Per-finding leverage
On every finding card we also show the marginal score impact: how many points fixing just this one finding gets back. Because of category caps + weighting, a critical in EMAIL is worth less globally than a critical in TLS:
- Critical in TLS (30% weight): -60 in TLS, -18 globally.
- Critical in EMAIL (20% weight): -60 in EMAIL, -12 globally.
The displayed leverage is computed by re-running the scorer with that one finding stripped and reading the delta. It's exact, not approximate. If the math doesn't add up, please tell us — we ship the algorithm for inspection.
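The strip-and-rescore approach can be sketched as follows. This is an illustrative model, not the actual `Compute()` code; all names are ours:

```go
package main

import "fmt"

var weights = map[string]int{"DNS": 20, "TLS": 30, "WEB": 30, "EMAIL": 20}
var penalty = map[string]int{"info": 0, "low": 5, "medium": 15, "high": 30, "critical": 60}

type finding struct {
	category, severity string
}

// global scores a set of findings: per-category penalties, clamp, weighted average.
func global(findings []finding) float64 {
	sub := map[string]int{"DNS": 100, "TLS": 100, "WEB": 100, "EMAIL": 100}
	for _, f := range findings {
		sub[f.category] -= penalty[f.severity]
	}
	total := 0
	for cat, w := range weights {
		s := sub[cat]
		if s < 0 {
			s = 0 // per-category clamp
		}
		total += s * w
	}
	return float64(total) / 100
}

// leverage: global points recovered by fixing just findings[i],
// computed by re-scoring with that one finding stripped.
func leverage(findings []finding, i int) float64 {
	rest := append(append([]finding{}, findings[:i]...), findings[i+1:]...)
	return global(rest) - global(findings)
}

func main() {
	fs := []finding{{"TLS", "critical"}, {"EMAIL", "critical"}}
	fmt.Println(leverage(fs, 0)) // TLS critical: 60 × 30% = 18
	fmt.Println(leverage(fs, 1)) // EMAIL critical: 60 × 20% = 12
}
```

Note that the delta method automatically accounts for the category clamp: in a category already pinned at 0 by other findings, fixing one finding may recover fewer points than its raw penalty suggests, and the displayed leverage reflects that.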
What we deliberately don't score
To keep the methodology honest we exclude a few things from the score:
- Resolved findings — once you fix a finding via re-check, the row stays in the DB for audit but the scorer ignores it. The score reflects current posture, not history.
- Suppressed findings — accepted-risk findings (the user explicitly suppressed via the API) don't count.
- Transient errors — described above. A crt.sh 502 is our problem, not yours.
- Quality findings (HTML, SEO, a11y, perf hints) — bundled in WEB on Extended, but never elevated above low severity. They affect the score lightly because they're real signals, but they aren't security per se.
- The `info` severity — by definition, no score impact.
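The exclusions above amount to a filter applied before scoring. A small sketch, with hypothetical field names (the actual schema may differ):

```go
package main

import "fmt"

// finding models just the fields the exclusion rules need.
type finding struct {
	severity   string
	resolved   bool // fixed and confirmed via re-check; kept in the DB for audit
	suppressed bool // accepted risk, explicitly suppressed via the API
}

// scoreable reports whether a finding counts toward the score.
// Info findings carry a zero penalty anyway, but filtering them early
// keeps the scorer's input equal to what actually moves the number.
func scoreable(f finding) bool {
	return !f.resolved && !f.suppressed && f.severity != "info"
}

func main() {
	fmt.Println(scoreable(finding{severity: "high"}))                   // true
	fmt.Println(scoreable(finding{severity: "high", resolved: true}))   // false
	fmt.Println(scoreable(finding{severity: "low", suppressed: true}))  // false
	fmt.Println(scoreable(finding{severity: "info"}))                   // false
}
```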
Letter grades
A letter grade is mapped from the global score using these bins, deliberately tuned to be less generous than typical "Mozilla-A by default" maps:
| Score | Grade |
|---|---|
| 95-100 | A+ |
| 85-94 | A |
| 75-84 | B |
| 65-74 | C |
| 50-64 | D |
| <50 | F |
Comparable scanners frequently start the A bin at 80 or 75; we start it at 85, so the same domain typically grades about half a letter lower with us than with the typical alternative. This is a feature.
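The bins translate directly into a threshold cascade. A sketch of the mapping above (the function name is ours, not the scorer's):

```go
package main

import "fmt"

// grade maps a 0-100 global score to a letter per the bins table.
func grade(score float64) string {
	switch {
	case score >= 95:
		return "A+"
	case score >= 85:
		return "A"
	case score >= 75:
		return "B"
	case score >= 65:
		return "C"
	case score >= 50:
		return "D"
	default:
		return "F"
	}
}

func main() {
	fmt.Println(grade(96), grade(85), grade(84), grade(49)) // A+ A B F
}
```

The `grade(85)` / `grade(84)` boundary is exactly the "we start the A bin at 85" choice discussed above.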
Inspecting the algorithm
The scorer is published for inspection. It lives at internal/scorer/scorer.go in our codebase; the project itself is closed-source, but we share the scorer file as reference material on request (write to support and we'll send it). The Compute() function is ~50 lines of Go, unit-tested against ~12 cases that pin the formula in place.
We don't change weights silently. Any change to the weighting or the severity table goes into the changelog with a date and a rationale, and we document the score impact for a representative sample of domains.
See the scoring on a real domain
Run a Basic scan and click any finding to see its leverage. Free, no signup needed.
Scan a domain