How we benchmark domains: scoring methodology
A "score / 100" on a security report only means something if you know how it was computed. Otherwise it's astrology. Here is the entire algorithm UnveilScan uses, end-to-end, with no hand-waving.
The four-category model
Every checker we run belongs to exactly one of four categories: DNS, TLS, WEB, or EMAIL. Each category gets its own 0-100 subscore (computed below). The global score is a weighted average:
global = (DNS×20 + TLS×30 + WEB×30 + EMAIL×20) / 100
The weights:
| Category | Weight | Why |
|---|---|---|
| DNS | 20% | Foundation but rarely actively exploited; matters for hijacking and reputation. |
| TLS | 30% | Direct attack surface; misconfigured TLS = MITM possibility. |
| WEB | 30% | The biggest day-to-day surface (XSS, headers, CSP, leaks, CVEs). |
| EMAIL | 20% | Mail spoofing has direct phishing and brand-damage cost. |
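The weighted average is a one-liner. Here's a minimal sketch of it in Go (the identifiers are ours for illustration, not the actual scorer's):

```go
package main

import "fmt"

// Category weights from the table above, expressed as percentages.
var weights = map[string]int{"DNS": 20, "TLS": 30, "WEB": 30, "EMAIL": 20}

// globalScore implements global = (DNS×20 + TLS×30 + WEB×30 + EMAIL×20) / 100.
// Each subscore is assumed to already be clamped to [0, 100].
func globalScore(sub map[string]int) float64 {
	total := 0
	for cat, w := range weights {
		total += sub[cat] * w
	}
	return float64(total) / 100
}

func main() {
	sub := map[string]int{"DNS": 100, "TLS": 40, "WEB": 70, "EMAIL": 85}
	fmt.Println(globalScore(sub)) // (2000 + 1200 + 2100 + 1700) / 100 = 70
}
```

Integer weights summing to 100 keep the arithmetic exact; the division by 100 lands the result back on the 0-100 scale.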
How a category subscore is computed
Each category starts at 100. Every finding (an issue raised by a checker) subtracts a fixed penalty based on its severity:
| Severity | Penalty | Examples |
|---|---|---|
| info | 0 | "X-Powered-By header leaks tech stack" — informational, no score impact. |
| low | 5 | "X-Content-Type-Options missing", "Permissions-Policy missing". |
| medium | 15 | "HSTS missing", "CSP missing", "DMARC pct<100". |
| high | 30 | "Subdomain takeover vulnerable", "DKIM key <1024 bits". |
| critical | 60 | "TLS 1.0 enabled", "Apache CVE-2021-41773 confirmed", "actuator/env exposed". |
Two special cases:
- Checker error / skip = -10 points to the category. The signal is "we couldn't evaluate this part of your posture", which is as much a configuration concern as a finding.
- Transient error (network hiccup on our side, upstream API down) = ignored. It's logged, but it doesn't count against the score.
Each category subscore is then clamped to [0, 100]. So a TLS subscore can't go negative — once you've accumulated enough penalty to zero it, additional TLS findings don't make the total worse. This is a deliberate choice, not a bug: a server that's catastrophically broken on TLS is broken regardless of whether the count is 5 or 50.
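Putting the severity table, the error penalty, and the clamp together, a category subscore can be sketched like this (a hypothetical rendering of the rules above; function and variable names are ours):

```go
package main

import "fmt"

// Per-severity penalties from the table above.
var penalty = map[string]int{"info": 0, "low": 5, "medium": 15, "high": 30, "critical": 60}

const errorPenalty = 10 // checker error or skip

// subscore starts at 100, subtracts a fixed penalty per finding and 10
// per checker error, then clamps the result to [0, 100].
func subscore(severities []string, checkerErrors int) int {
	s := 100
	for _, sev := range severities {
		s -= penalty[sev]
	}
	s -= checkerErrors * errorPenalty
	if s < 0 {
		s = 0 // the clamp: extra findings past zero change nothing
	}
	return s
}

func main() {
	// Two criticals already floor the category; a third changes nothing.
	fmt.Println(subscore([]string{"critical", "critical"}, 0))             // 0
	fmt.Println(subscore([]string{"critical", "critical", "critical"}, 0)) // 0
	// One medium, one low, plus one checker error: 100 - 15 - 5 - 10.
	fmt.Println(subscore([]string{"medium", "low"}, 1)) // 70
}
```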
The "fix-everything ceiling"
Alongside your current score we expose a ceiling: the score you would reach if every active finding were fixed right now. Two cases:
- Ceiling = 100 → your domain has only fixable issues. Clean slate possible.
- Ceiling < 100 → some checkers errored or skipped. Even with perfect findings, the -10 per error keeps you below 100. The /scan/[id] page tells you exactly how many.
This makes the score honest. The owner of a domain at "ceiling 92, current 67" knows there are 25 points of recoverable headroom plus 8 points held by infrastructure issues outside their direct control. That's actionable in a way "67/100" alone isn't.
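Conceptually, the ceiling is just the global score with every finding removed and only the per-error penalties left in place. A minimal sketch, assuming the weights and penalties described above (identifiers are illustrative):

```go
package main

import "fmt"

// Category weights from the weights table, as percentages.
var weights = map[string]int{"DNS": 20, "TLS": 30, "WEB": 30, "EMAIL": 20}

// ceiling computes the score you would reach if every active finding
// were fixed: each category keeps only its -10-per-error penalty,
// clamped to [0, 100], then the usual weighted average applies.
func ceiling(errorsByCategory map[string]int) float64 {
	total := 0
	for cat, w := range weights {
		sub := 100 - 10*errorsByCategory[cat]
		if sub < 0 {
			sub = 0
		}
		total += sub * w
	}
	return float64(total) / 100
}

func main() {
	// Two TLS checkers errored and one EMAIL checker skipped:
	// TLS sub = 80, EMAIL sub = 90, DNS and WEB untouched.
	fmt.Println(ceiling(map[string]int{"TLS": 2, "EMAIL": 1})) // 92
}
```

A domain with no checker errors gets `ceiling(...) == 100`, matching the "clean slate possible" case above.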
Per-finding leverage
On every finding card we also show the marginal score impact: how many points fixing just this one finding gets back. Because of category caps + weighting, a critical in EMAIL is worth less globally than a critical in TLS:
- Critical in TLS (30% weight): -60 in TLS, -18 globally.
- Critical in EMAIL (20% weight): -60 in EMAIL, -12 globally.
The displayed leverage is computed by re-running the scorer with that one finding stripped and reading the delta. It's exact, not approximate. If the math doesn't add up, please tell us — we ship the algorithm for inspection.
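The strip-and-rescore approach can be sketched as follows. This is an illustrative model, not the actual `Compute()` code; all names are ours:

```go
package main

import "fmt"

var weights = map[string]int{"DNS": 20, "TLS": 30, "WEB": 30, "EMAIL": 20}
var penalty = map[string]int{"info": 0, "low": 5, "medium": 15, "high": 30, "critical": 60}

type finding struct {
	category, severity string
}

// global scores a set of findings: per-category penalties, clamp, weighted average.
func global(findings []finding) float64 {
	sub := map[string]int{"DNS": 100, "TLS": 100, "WEB": 100, "EMAIL": 100}
	for _, f := range findings {
		sub[f.category] -= penalty[f.severity]
	}
	total := 0
	for cat, w := range weights {
		s := sub[cat]
		if s < 0 {
			s = 0 // per-category clamp
		}
		total += s * w
	}
	return float64(total) / 100
}

// leverage: global points recovered by fixing just findings[i],
// computed by re-scoring with that one finding stripped.
func leverage(findings []finding, i int) float64 {
	rest := append(append([]finding{}, findings[:i]...), findings[i+1:]...)
	return global(rest) - global(findings)
}

func main() {
	fs := []finding{{"TLS", "critical"}, {"EMAIL", "critical"}}
	fmt.Println(leverage(fs, 0)) // TLS critical: 60 × 30% = 18
	fmt.Println(leverage(fs, 1)) // EMAIL critical: 60 × 20% = 12
}
```

Note that the delta method automatically accounts for the category clamp: in a category already pinned at 0 by other findings, fixing one finding may recover fewer points than its raw penalty suggests, and the displayed leverage reflects that.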
What we deliberately don't score
To keep the methodology honest we exclude a few things from the score:
- Resolved findings — once you fix a finding via re-check, the row stays in the DB for audit but the scorer ignores it. The score reflects current posture, not history.
- Suppressed findings — accepted-risk findings (the user explicitly suppressed via the API) don't count.
- Transient errors — described above. A crt.sh 502 is our problem, not yours.
- Quality findings (HTML, SEO, a11y, perf hints) — bundled in WEB on Extended, but never elevated above low severity. They affect the score lightly because they're real signals, but they aren't security per se.
- The `info` severity — by definition, no score impact.
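The exclusions above amount to a filter applied before scoring. A small sketch, with hypothetical field names (the actual schema may differ):

```go
package main

import "fmt"

// finding models just the fields the exclusion rules need.
type finding struct {
	severity   string
	resolved   bool // fixed and confirmed via re-check; kept in the DB for audit
	suppressed bool // accepted risk, explicitly suppressed via the API
}

// scoreable reports whether a finding counts toward the score.
// Info findings carry a zero penalty anyway, but filtering them early
// keeps the scorer's input equal to what actually moves the number.
func scoreable(f finding) bool {
	return !f.resolved && !f.suppressed && f.severity != "info"
}

func main() {
	fmt.Println(scoreable(finding{severity: "high"}))                   // true
	fmt.Println(scoreable(finding{severity: "high", resolved: true}))   // false
	fmt.Println(scoreable(finding{severity: "low", suppressed: true}))  // false
	fmt.Println(scoreable(finding{severity: "info"}))                   // false
}
```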
Letter grades
A letter grade is mapped from the global score using these bins, deliberately tuned to be less generous than typical "Mozilla-A by default" maps:
| Score | Grade |
|---|---|
| 95-100 | A+ |
| 85-94 | A |
| 75-84 | B |
| 65-74 | C |
| 50-64 | D |
| <50 | F |
Comparable scanners frequently start the A bin at 80 or 75; we start it at 85, so the same domain typically grades about half a letter lower with us than with the typical alternative. This is a feature.
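The bins translate directly into a threshold cascade. A sketch of the mapping above (the function name is ours, not the scorer's):

```go
package main

import "fmt"

// grade maps a 0-100 global score to a letter per the bins table.
func grade(score float64) string {
	switch {
	case score >= 95:
		return "A+"
	case score >= 85:
		return "A"
	case score >= 75:
		return "B"
	case score >= 65:
		return "C"
	case score >= 50:
		return "D"
	default:
		return "F"
	}
}

func main() {
	fmt.Println(grade(96), grade(85), grade(84), grade(49)) // A+ A B F
}
```

The `grade(85)` / `grade(84)` boundary is exactly the "we start the A bin at 85" choice discussed above.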
Inspecting the algorithm
The scorer is published for inspection. It lives at internal/scorer/scorer.go in our codebase; the project itself is closed-source, but we share the scorer file as reference material on request (write to support and we'll send it). The Compute() function is ~50 lines of Go, unit-tested against ~12 cases that pin the formula in place.
We don't change weights silently. Any change to the weighting or the severity table goes into the changelog with a date and a rationale, and we document the score impact for a representative sample of domains.
See the scoring on a real domain
Run a Basic scan and click any finding to see its leverage. Free, no signup needed.
Scan a domain