This document describes how the dataset is structured, how source quality is evaluated, how entity relationships are classified, and where the known gaps are.

Schema — Data Dictionary

Canonical enum values live in scripts/utils/schema.py and are enforced by scripts/99_validate.py. When updating this file, update schema.py in the same commit.

Table overview

| Table | Role |
| --- | --- |
| actions.csv | Spine: one row per freeze, restrict, or non-action event |
| triggers.csv | External catalysts (OFAC designations, court orders, statutes) |
| incidents.csv | Real-world events that may precipitate requests |
| requests.csv | Explicit or plausible requests for Circle action (nullable resulting_action_id captures refusals) |
| policies.csv | Circle policies, versioned by effective date |
| implementations.csv | Mechanical execution records (on-chain tx or off-chain operational step) |
| entities.csv | Every relevant party, with YES/MAYBE/NO Circle relationship classification |
| sources.csv | Every citation, audit-trailed with SHA256 |
| action_sources.csv | Many-to-many join between actions and sources |
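
The many-to-many join through action_sources.csv can be sketched as follows. This is a minimal illustration, not the project's loader: it assumes the join file has exactly two columns named action_id and source_id, and the SRC-0001 style source IDs are hypothetical placeholders.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory stand-in for action_sources.csv. The column names
# (action_id, source_id) and the SRC-XXXX ID format are assumptions.
ACTION_SOURCES_CSV = """action_id,source_id
CU-ACT-0001,SRC-0001
CU-ACT-0001,SRC-0002
CU-ACT-0002,SRC-0002
"""


def sources_by_action(csv_text):
    """Group source IDs under each action ID, preserving file order."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["action_id"]].append(row["source_id"])
    return dict(grouped)


links = sources_by_action(ACTION_SOURCES_CSV)
```

Here one action cites two sources and one source backs two actions, which is the case a join table exists to handle.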

actions.csv — spine

One row per discrete freeze / restrict / non-action event.

| Field | Type | Notes |
| --- | --- | --- |
| action_id | string, PK | Stable ID, e.g. CU-ACT-0001 |
| action_date | ISO 8601 UTC | Date of the action itself, not the date of disclosure |
| mechanism_type | enum | BLACKLIST, UNBLACKLIST, PAUSE, UNPAUSE, REDEMPTION_REFUSAL, ACCOUNT_CLOSURE, JURISDICTIONAL, LAW_ENFORCEMENT_RESPONSE, NON_ACTION, POLICY_COMMITMENT |
| target_identifier | string | Address, account ID, jurisdiction code, or counterparty name |
| target_type | enum | ADDRESS, ACCOUNT, JURISDICTION, COUNTERPARTY, CATEGORY, NA |
| target_category | string, nullable | Free-tag, e.g. sanctioned_entity, stolen_funds |
| status | enum | ACTIVE, REVERSED, UNCLEAR |
| trigger_id | FK → triggers, nullable | Legal / regulatory trigger |
| implementation_id | FK → implementations, nullable | Mechanical execution record |
| confidence | enum | HIGH, MEDIUM, LOW |
| issuer | string | Stablecoin issuer, e.g. Circle |
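
The enum constraints in the table above can be checked with a few lines of Python. This is a minimal sketch, not the contents of scripts/utils/schema.py or scripts/99_validate.py: the enum values are copied from the table, while the names ENUM_FIELDS and enum_errors are hypothetical.

```python
# Enum values copied from the actions.csv field table. The container and
# function names are illustrative assumptions, not the real schema.py API.
ENUM_FIELDS = {
    "mechanism_type": {
        "BLACKLIST", "UNBLACKLIST", "PAUSE", "UNPAUSE", "REDEMPTION_REFUSAL",
        "ACCOUNT_CLOSURE", "JURISDICTIONAL", "LAW_ENFORCEMENT_RESPONSE",
        "NON_ACTION", "POLICY_COMMITMENT",
    },
    "target_type": {
        "ADDRESS", "ACCOUNT", "JURISDICTION", "COUNTERPARTY", "CATEGORY", "NA",
    },
    "status": {"ACTIVE", "REVERSED", "UNCLEAR"},
    "confidence": {"HIGH", "MEDIUM", "LOW"},
}


def enum_errors(row):
    """Return one message per enum field whose value is not allowed."""
    errors = []
    for field, allowed in ENUM_FIELDS.items():
        value = row.get(field)
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


good_row = {
    "action_id": "CU-ACT-0001",
    "mechanism_type": "BLACKLIST",
    "target_type": "ADDRESS",
    "status": "ACTIVE",
    "confidence": "HIGH",
}
bad_row = dict(good_row, status="FROZEN")  # FROZEN is not a status value
```

Calling enum_errors(good_row) yields an empty list, while bad_row produces a single message naming the status field.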

Null trigger_id, policy_id, or implementation_id values are themselves findings — they signal a policy-versus-practice gap or opaque implementation and should be coded deliberately, not treated as missing data.
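
Because null foreign keys are findings rather than missing data, it can help to list them explicitly instead of dropping or imputing them. The sketch below is one way to do that. It assumes nulls are stored as empty strings (or absent keys) once the CSV is read, and the function name null_fk_findings is hypothetical.

```python
# Foreign keys named in the paragraph above as meaningful when null.
NULLABLE_FKS = ("trigger_id", "policy_id", "implementation_id")


def null_fk_findings(rows):
    """Map each action_id to the foreign keys that are null on that row.

    Assumption: a null is an empty string or a missing key.
    """
    findings = {}
    for row in rows:
        missing = [fk for fk in NULLABLE_FKS if not row.get(fk)]
        if missing:
            findings[row["action_id"]] = missing
    return findings


# Illustrative rows; the TRG/POL/IMP ID formats are made up for the example.
rows = [
    {"action_id": "CU-ACT-0001", "trigger_id": "TRG-0001",
     "policy_id": "POL-0001", "implementation_id": "IMP-0001"},
    {"action_id": "CU-ACT-0002", "trigger_id": "TRG-0002",
     "policy_id": "", "implementation_id": ""},
]
findings = null_fk_findings(rows)
```

In this example only CU-ACT-0002 is reported, with policy_id and implementation_id flagged as gaps to be coded deliberately.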


Sourcing Standards

Source tiers

Every source_id carries a source_tier. The tiers referenced in the confidence mapping below are PRIMARY, SECONDARY, and TERTIARY.

Confidence mapping

| Evidence available | confidence |
| --- | --- |
| On-chain event recovered via Dune / Etherscan | HIGH |
| PRIMARY source explicitly describing the action | HIGH |
| ≥2 SECONDARY sources corroborating | HIGH or MEDIUM |
| 1 SECONDARY source, no corroboration | MEDIUM |
| TERTIARY sources only | LOW |
| Inferred from context, no direct statement | LOW |
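
The mapping above can be expressed as a small lookup function. This is a sketch under stated assumptions: it evaluates the rows top to bottom and returns the first match (the source does not say how overlapping evidence is resolved), and it returns a tuple because the "≥2 SECONDARY" row permits two values. The function and parameter names are hypothetical.

```python
def allowed_confidence(on_chain=False, primary_explicit=False,
                       secondary_count=0):
    """Return the confidence values the mapping permits for this evidence.

    Rows are checked in table order; that precedence is an assumption.
    """
    if on_chain or primary_explicit:
        return ("HIGH",)
    if secondary_count >= 2:
        # The table allows either value; the coder chooses between them.
        return ("HIGH", "MEDIUM")
    if secondary_count == 1:
        return ("MEDIUM",)
    # TERTIARY-only evidence and pure inference both map to LOW.
    return ("LOW",)


corroborated = allowed_confidence(secondary_count=2)
single = allowed_confidence(secondary_count=1)
inferred = allowed_confidence()
```

A validator could then check that each row's confidence is one of the permitted values rather than computing a single answer.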

Entity → Circle Relationship Classification

Every entities.csv row carries circle_relationship ∈ {YES, MAYBE, NO}.

YES

Requires at least one documented source showing: equity investment, Centre Consortium membership, revenue-share agreement, board/advisor interlock, banking counterparty named in Transparency Reports, or disclosed commercial partnership.

NO

Requires affirmative evidence of absence or adversarial distance (entity on a sanctions list Circle has complied with; Circle has publicly severed ties; entity is in active litigation against Circle).

MAYBE

Default residual. Used whenever we lack documented evidence in either direction. Expected to be the largest bucket.
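
The three-way rule can be sketched as a function over evidence tags. The tag strings below are hypothetical labels for the evidence types listed under YES and NO, not values from the dataset. The source also does not say what happens when an entity has both kinds of evidence; checking YES first is an assumption made only for this illustration.

```python
# Hypothetical tags for the evidence types listed under YES.
YES_EVIDENCE = {
    "equity_investment",
    "centre_consortium_membership",
    "revenue_share_agreement",
    "board_advisor_interlock",
    "banking_counterparty_in_transparency_report",
    "disclosed_commercial_partnership",
}

# Hypothetical tags for the evidence types listed under NO.
NO_EVIDENCE = {
    "sanctions_list_complied_with",
    "publicly_severed_ties",
    "active_litigation_against_circle",
}


def classify_relationship(evidence_tags):
    """Return YES, NO, or MAYBE for a collection of documented evidence tags."""
    tags = set(evidence_tags)
    if tags & YES_EVIDENCE:
        return "YES"  # Assumption: YES evidence is checked before NO evidence.
    if tags & NO_EVIDENCE:
        return "NO"
    # Default residual: no documented evidence in either direction.
    return "MAYBE"


investor = classify_relationship(["equity_investment"])
sanctioned = classify_relationship(["sanctions_list_complied_with"])
unknown = classify_relationship([])
```

An entity with no tags, or only tags outside both sets, falls through to MAYBE, which matches its role as the default bucket.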


Completeness Map

Summary

Known Gaps

Non-EVM chains (Solana, NEAR, Stellar, Algorand, Hedera) are out of scope for Pass A. Internal Circle policies not published externally are unobservable. Requests Circle received but did not disclose and which produced no action are partially unobservable.