Methodology
Shadow Network Intelligence is built on transparency and trust. Our reports are rooted in official data, processed through a rigorous, automated pipeline, and reviewed to ensure accuracy and clarity. This page explains how we generate our reports, what data we rely on, and how we handle limitations.
Data Sources
All reports are based on publicly available data from the Federal Election Commission (FEC), accessed through the official FEC API. We currently pull data from the following endpoints:
- Candidate filings – to identify who’s running, for what office, and with which committees
- Committee filings – to track fundraising entities and their designations
- Schedule A (contributions received) – to analyze donor patterns and fundraising networks
- Schedule B (disbursements) – to follow where campaign money is spent
- Schedule E (independent expenditures) – to surface outside spending by PACs and super PACs
This data is refreshed daily to reflect the latest available information as campaigns report new activity.
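As a rough illustration of how the data sets above might map onto API requests, here is a minimal sketch. The paths follow the public OpenFEC API, but the `ENDPOINTS` mapping, the `build_url` helper, and the choice of query parameters are assumptions made for this example, not a description of our production code.

```python
from urllib.parse import urlencode

# Base URL of the public OpenFEC API.
BASE_URL = "https://api.open.fec.gov/v1"

# Hypothetical mapping from the data sets listed above to API paths.
ENDPOINTS = {
    "candidates": "/candidates/",
    "committees": "/committees/",
    "schedule_a": "/schedules/schedule_a/",
    "schedule_b": "/schedules/schedule_b/",
    "schedule_e": "/schedules/schedule_e/",
}


def build_url(dataset, api_key, **params):
    """Return the request URL for one data set (illustrative helper)."""
    query = {"api_key": api_key, **params}
    # Sort the parameters so the generated URL is deterministic.
    return f"{BASE_URL}{ENDPOINTS[dataset]}?{urlencode(sorted(query.items()))}"


# "DEMO_KEY" is a placeholder; a real key would be supplied at runtime.
url = build_url("schedule_a", "DEMO_KEY", per_page=100)
```

Keeping the endpoint paths in one table means each ingester can share the same URL-building logic and differ only in how it parses the response.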
Ingestion Process
Our data collection began in April 2025 and includes records dated January 1, 2025, or later. The backend pipeline is written in Python and automatically pulls, transforms, and stores campaign finance data in a PostgreSQL database. Each ingester is tailored to a specific FEC endpoint and includes:
- Automated data collection with pagination, error handling, and retries
- Deduplication and upserting into normalized database tables
- Ingestion and update dates appended to each record
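The collection step in the list above (pagination, error handling, retries) can be sketched roughly as follows. This is a simplified illustration, not our actual ingester: `fetch_page` stands in for whatever function performs the real API request, the cursor format is an assumption, and the example uses a fake in-memory source so it runs without network access.

```python
import time


def fetch_with_retries(fetch_page, cursor, max_attempts=3, base_delay=1.0,
                       sleep=time.sleep):
    """Call fetch_page(cursor), retrying with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_page(cursor)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # Give up after the final attempt.
            sleep(base_delay * 2 ** (attempt - 1))


def iter_records(fetch_page, **retry_options):
    """Yield every record, following cursors until none is returned."""
    cursor = None
    while True:
        records, cursor = fetch_with_retries(fetch_page, cursor, **retry_options)
        yield from records
        if cursor is None:
            break


# Fake two-page source that fails once before succeeding (illustrative only).
PAGES = {None: (["a", "b"], "p2"), "p2": (["c"], None)}
failures = {"left": 1}


def flaky_fetch(cursor):
    if failures["left"] > 0:
        failures["left"] -= 1
        raise ConnectionError("simulated transient failure")
    return PAGES[cursor]


# Pass a no-op sleep so the example does not actually wait.
records = list(iter_records(flaky_fetch, sleep=lambda seconds: None))
```

Injecting `fetch_page` and `sleep` as arguments keeps the retry logic testable without touching the network or the clock.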
All data is linked by candidate and committee IDs, allowing for accurate cross-referencing across schedules and reports.
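To make the upsert and timestamping steps concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs anywhere; the `INSERT ... ON CONFLICT ... DO UPDATE` form shown is supported by both databases. The table layout, the column names `ingested_at` and `updated_at`, and the committee ID are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table keyed on the committee ID.
SCHEMA = """
CREATE TABLE committees (
    committee_id TEXT PRIMARY KEY,
    name         TEXT NOT NULL,
    ingested_at  TEXT NOT NULL,
    updated_at   TEXT NOT NULL
)
"""

# Insert a new row, or update the existing row that has the same ID.
# ingested_at is set only on first insert; updated_at changes every time.
UPSERT = """
INSERT INTO committees (committee_id, name, ingested_at, updated_at)
VALUES (:committee_id, :name, :now, :now)
ON CONFLICT (committee_id) DO UPDATE SET
    name = excluded.name,
    updated_at = excluded.updated_at
"""


def upsert_committee(conn, record, now=None):
    """Upsert one committee record, stamping it with the current UTC time."""
    now = now or datetime.now(timezone.utc).isoformat()
    conn.execute(UPSERT, {**record, "now": now})


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# The same (made-up) committee arrives twice, the second time amended.
upsert_committee(conn, {"committee_id": "C00000001", "name": "Example Cmte"},
                 now="2025-04-01T00:00:00+00:00")
upsert_committee(conn, {"committee_id": "C00000001", "name": "Example Committee"},
                 now="2025-04-02T00:00:00+00:00")
rows = conn.execute(
    "SELECT committee_id, name, ingested_at, updated_at FROM committees"
).fetchall()
```

After both calls the table holds a single row with the amended name, the original ingestion date, and the later update date, which is the behavior deduplication by ID is meant to provide.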
Report Generation
Once the data is processed, we generate reports using a modular notebook-based pipeline. Each report:
- Runs a structured set of SQL queries to extract key insights
- Produces visualizations and summary tables to highlight trends
- Uses AI-assisted narration to generate plain-language summaries and captions, with human review and final edits
- Is rendered as both HTML and PDF, using a consistent layout and style template for clarity and professionalism
This pipeline allows us to produce reports quickly while maintaining consistency, traceability, and transparency across every section.
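One way to picture a single report section is as a query, a result table, and a rendered fragment. The sketch below assumes that shape; the function names, the `contributions` table, and the sample rows are invented for illustration, and it uses `sqlite3` in place of PostgreSQL so it can run on its own. Visualization, narration, and PDF output are left out.

```python
import sqlite3
from html import escape


def run_section(conn, title, sql):
    """Run one query and return its title, column names, and rows."""
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    return {"title": title, "columns": columns, "rows": cursor.fetchall()}


def render_html(section):
    """Render a section as a simple HTML table, escaping all values."""
    head = "".join(f"<th>{escape(c)}</th>" for c in section["columns"])
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in section["rows"]
    )
    title = escape(section["title"])
    return f"<section><h2>{title}</h2><table><tr>{head}</tr>{body}</table></section>"


# Made-up sample data for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contributions (committee_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO contributions VALUES (?, ?)",
    [("C1", 100.0), ("C1", 250.0), ("C2", 50.0)],
)

section = run_section(
    conn,
    "Total receipts by committee",
    "SELECT committee_id, SUM(amount) AS total FROM contributions "
    "GROUP BY committee_id ORDER BY total DESC",
)
page = render_html(section)
```

Because each section carries its own query and title, the figures in a rendered table can be traced back to the SQL that produced them.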
Limitations and Caveats
While we strive for accuracy and thoroughness, we acknowledge the inherent limitations of public campaign finance data:
- Employer and occupation fields are self-reported by donors and may be inconsistent
- Contributions totaling $200 or less from a single donor generally do not have to be itemized, so they often lack identifying details
- Real-world bundling, coordinated giving, and PAC networks require interpretation based on patterns
- Data may be updated or amended after initial filing; our pipeline reflects the most recent data available at the time of processing
We apply clustering techniques (e.g., employer normalization, ZIP code grouping) to help surface possible coordination, but our findings should be understood as patterns of interest, not proof of illegality.
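As a simplified illustration of employer normalization and ZIP code grouping, consider the sketch below. The suffix list, the field names, and the sample donations are assumptions made for this example, and real normalization needs many more rules. A group produced this way is only a pattern worth a closer look, not evidence of coordination.

```python
import re
from collections import defaultdict

# Hypothetical list of corporate suffixes to strip from employer names.
SUFFIXES = {"inc", "llc", "llp", "corp", "co", "ltd"}


def normalize_employer(raw):
    """Lowercase, drop punctuation, and strip trailing corporate suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


def group_donations(donations):
    """Group donations by (normalized employer, 5-digit ZIP code)."""
    groups = defaultdict(list)
    for donation in donations:
        # ZIP codes may arrive as 5 or 9 digits; keep the first 5.
        key = (normalize_employer(donation["employer"]), donation["zip"][:5])
        groups[key].append(donation)
    return dict(groups)


# Made-up records for the example.
donations = [
    {"donor": "A", "employer": "Acme Holdings, Inc.", "zip": "200011234"},
    {"donor": "B", "employer": "ACME HOLDINGS INC", "zip": "20001"},
    {"donor": "C", "employer": "Other LLC", "zip": "94105"},
]
groups = group_donations(donations)
```

Here the two differently spelled "Acme Holdings" entries fall into the same group because they share a normalized employer name and a 5-digit ZIP code, while the third donation stays on its own.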
Ethical Commitments
We are committed to:
- Nonpartisan, fact-based reporting with no ideological agenda
- Transparent methods and documentation of data sources
- Protecting the safety and integrity of our research team, while ensuring that our outputs are fully auditable
- Respecting the public’s right to know how campaigns are funded and influenced
Collaboration and Transparency
We welcome inquiries from journalists, watchdog organizations, researchers, and others who want to:
- Validate or replicate our findings
- Request deeper analysis or custom reports
- Collaborate on investigations or data tools
To discuss potential collaboration or request access to documentation or schema details, please reach out to us.
We believe that democracy thrives in sunlight. Our methodology is designed to make that light as clear and focused as possible.