
NRS regulators
live knowledge-graph status

ON1 maintains a multi-regulator nuclear-safety knowledge graph spanning the regulatory bodies of 17 jurisdictions. Each entry below shows what the crawler has pulled from that regulator's own publication surface, indexed for hybrid (BM25 + dense) retrieval. The corpus is updated daily by a polite breadth-first crawl (BFS) that respects robots.txt.

Corpus totals

Regulators

code | regulator | pages | status

How the crawl works

Each regulator gets one or more seed URLs, an allow-prefix filter (so we stay in the publications/guidance area and skip press/news/HR), and a polite crawl budget (by default ~400 pages, a 2-second delay between requests, and a maximum depth of 4, always respecting robots.txt). Content is chunked, embedded with nomic-embed-text, and indexed for hybrid retrieval, where the BM25 and dense rankings are combined with reciprocal rank fusion (RRF). Operators query via Nucolai; SSM-tenant users get an @ssm.se-gated view via ssmailab.
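As a rough illustration of the two mechanisms described above, here is a minimal Python sketch. It is not the production crawler or retriever. The function names crawl_order and rrf_fuse are hypothetical, the links mapping stands in for actually fetching and parsing pages, seeds are assumed to sit at depth 0 and to be enqueued without filtering, and the RRF constant k=60 is a commonly used default rather than a value stated on this page. The 2-second delay and the robots.txt check are only marked in a comment.

```python
from collections import deque


def crawl_order(seeds, links, allow_prefixes, max_pages=400, max_depth=4):
    """Return the URLs a budgeted breadth-first crawl would visit, in order.

    `links` is a hypothetical in-memory map of url -> outgoing urls; a real
    crawler would fetch each page and extract its links instead.
    """
    def allowed(url):
        # Allow-prefix filter: keep only URLs under an approved prefix.
        return any(url.startswith(prefix) for prefix in allow_prefixes)

    unique_seeds = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    queue = deque((seed, 0) for seed in unique_seeds)
    seen = set(unique_seeds)
    visited = []

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        # A real crawler would check robots.txt, fetch the page here,
        # and then wait about 2 seconds before the next request.
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt not in seen and allowed(nxt):
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document id.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, calling rrf_fuse with a BM25 ranking and a dense ranking of the same chunk ids returns a single list in which chunks ranked highly by both retrievers come first, without needing the two retrievers' raw scores to be on a comparable scale.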

This page reads a daily snapshot (regulators-corpus-latest.json), so counts may lag the live index by up to 24 hours. A zero count for a regulator indicates either that a seed URL returned a 404 or that no content passed the polite-crawl filters; see the Q14 receipt for the full trace.