Scraper program
Focused automation, trusted data
This guide explains how to apply for scraper access, submit verifiable payloads, and graduate from manual review to verified status. Keep your scope narrow, include evidence, and respect the canonical IDs surfaced across the SportsDatabase portal.
Default allocation: 1 req/sec · Manual review required
Application requirements
- Editor access. Scraper tooling is limited to Editors and above. Purchase a tier or earn 5,500 points to unlock the application form embedded in the portal.
- Clearly defined intent. Describe the competition, timeframe, and entity scope (Events, Players, Teams, or Leagues). Broad “all sports” scrapers will be rejected.
- Cadence and accountability. Share how often the scraper runs, where data originates, and what quality checks you run before submitting.
- Evidence. Every payload must include verifiable sources that moderators can audit quickly.
Submission schema
- Reuse the shared contribution schemas (sports/leagues/teams/players/events).
- Populate catalogId + slug fields using the canonical IDs exposed in the portal.
- Reject payloads where parent entities do not exist—scrapers must never create parent references on the fly.
- Include submittedAt and via metadata for traceability.
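The schema rules above can be sketched as a pre-submission check. This is only a sketch under stated assumptions: the `buildSubmission` helper, the `knownParents` set, and the placement of `submittedAt` and `via` at the top level of the request body are hypothetical. The shared contribution schemas in the portal remain the authoritative definition.

```typescript
// Hypothetical shape for illustration; the real contribution schemas may differ.
interface Submission {
  entityType: string;               // e.g. "LEAGUE", as in the sample payload
  payload: Record<string, unknown>;
  submittedAt: string;              // assumed: ISO 8601 timestamp
  via: string;                      // assumed: free-form identifier for the scraper
}

// knownParents is assumed to hold parent slugs already confirmed to exist in the portal.
function buildSubmission(
  entityType: string,
  payload: Record<string, unknown>,
  parentSlugs: string[],
  knownParents: Set<string>,
  via: string,
  now: Date = new Date(),
): Submission {
  // Refuse to build a payload whose parents are missing; never create them on the fly.
  const missing = parentSlugs.filter((slug) => !knownParents.has(slug));
  if (missing.length > 0) {
    throw new Error(`Missing parent entities: ${missing.join(", ")}`);
  }
  // catalogId and slug must be populated from the canonical IDs shown in the portal.
  if (typeof payload.catalogId !== "string" || typeof payload.slug !== "string") {
    throw new Error("catalogId and slug must both be set");
  }
  // Attach traceability metadata.
  return { entityType, payload, submittedAt: now.toISOString(), via };
}
```

Failing locally on a missing parent keeps an obviously invalid payload out of the manual review queue.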
Moderation checklist
- Initial submissions stay in MANUAL_REVIEW until moderators approve at least 20 clean payloads.
- Verified scrapers may receive higher rate limits and lighter review, but bad data resets your status.
- Points awarded for scraper submissions are reduced by 50% and labeled SCRAPER_IMPORT in contributor history.
- Moderators can suspend or revoke keys at any time if evidence is missing or abuse is detected.
API endpoint & headers
Endpoint
POST /v1/scraper/ingest
Send JSON bodies that match the contribution schemas. Requests must be made over HTTPS; sandbox traffic uses your local API server when running pnpm dev.
Required headers
- X-SportsDB-Scraper-Key — your one-time-revealed scraper key secret.
- Content-Type: application/json.
- Optional: Idempotency-Key to avoid duplicate submissions.
The API responds with 202 Accepted plus a submission ID when validation succeeds. Most errors return 400 with a detailed path describing which field failed.
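As a rough illustration of the endpoint and headers described above, the sketch below assembles a request and sends it with the standard `fetch` API. The helper names (`buildIngestRequest`, `submit`) and the base URL passed in are assumptions. The response body is returned as-is because this guide does not name the field that carries the submission ID.

```typescript
const INGEST_PATH = "/v1/scraper/ingest";

interface IngestRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the POST request; idempotencyKey is optional, matching the header list above.
function buildIngestRequest(
  baseUrl: string,
  scraperKey: string,
  body: unknown,
  idempotencyKey?: string,
): IngestRequest {
  const headers: Record<string, string> = {
    "X-SportsDB-Scraper-Key": scraperKey,
    "Content-Type": "application/json",
  };
  if (idempotencyKey) headers["Idempotency-Key"] = idempotencyKey;
  return {
    url: baseUrl.replace(/\/+$/, "") + INGEST_PATH, // strip trailing slashes before appending the path
    init: { method: "POST", headers, body: JSON.stringify(body) },
  };
}

// Send the request; 202 means validation succeeded, anything else is surfaced as an error.
async function submit(req: IngestRequest): Promise<unknown> {
  const res = await fetch(req.url, req.init);
  if (res.status === 202) return res.json();
  // Most failures are 400s whose body describes the failing field path.
  throw new Error(`Ingest failed with ${res.status}: ${await res.text()}`);
}
```

Reusing the same Idempotency-Key when retrying after a timeout should keep a retried request from being recorded twice.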
Sample payload
{
"entityType": "LEAGUE",
"payload": {
"catalogId": "201",
"sportSlug": "soccer",
"name": "Demo League",
"slug": "demo-league",
"country": "USA",
"summary": "Short blurb (20+ chars)",
"sources": ["https://example.com"]
}
}
Evidence URLs are optional today but strongly encouraged. Parent slugs must already exist before you submit.
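A local pre-check can catch simple mistakes before the API does. The sketch below mirrors the sample league payload. Its rules are inferred from that sample and are assumptions, not the server's actual validation: a 20-character minimum for `summary`, a lowercase kebab-case `slug`, and parseable URLs in `sources`. The error strings imitate the field paths the API is described as returning, but their exact format is also assumed.

```typescript
// Field names taken from the sample payload; the validation rules are assumptions.
interface LeaguePayload {
  catalogId: string;
  sportSlug: string;
  name: string;
  slug: string;
  country: string;
  summary: string;
  sources: string[];
}

// Returns the paths of fields that fail the assumed rules; an empty array means no problems found.
function validateLeaguePayload(p: LeaguePayload): string[] {
  const errors: string[] = [];
  if (p.catalogId.trim() === "") errors.push("payload.catalogId");
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(p.slug)) errors.push("payload.slug");
  if (p.summary.length < 20) errors.push("payload.summary");
  p.sources.forEach((src, i) => {
    try {
      new URL(src); // throws a TypeError when the string is not a valid absolute URL
    } catch (e) {
      errors.push(`payload.sources[${i}]`);
    }
  });
  return errors;
}
```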
Rate limits & verification tiers
Level 1 · Events
Fixtures, schedules, box scores
Default: 1 req/sec
Manual review required for each payload
Level 2 · Players
Player bios & roster updates
Default: 1 req/sec
Requires team slug validation
Level 3 · Teams
Team metadata, venues, leagues
Default: 0.5 req/sec
Unlocks after 30 verified event/player submissions
Level 4 (Leagues) is invite-only and requires ongoing reviewer sponsorship. Rate limit bumps beyond the defaults are handled through the Scraper Admin dashboard → Lifecycle panel.
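One way to stay under the default limits is to space requests on the client side. The following is a minimal sketch, assuming a single-process scraper that sends requests one at a time. The names `DEFAULT_REQ_PER_SEC`, `minIntervalMs`, and `runThrottled` are hypothetical, and the server may measure its limits differently.

```typescript
// Default rates from the tiers above, keyed by illustrative (assumed) level names.
const DEFAULT_REQ_PER_SEC: Record<string, number> = { events: 1, players: 1, teams: 0.5 };

// Minimum spacing between request starts for a given rate, in milliseconds.
function minIntervalMs(reqPerSec: number): number {
  if (!(reqPerSec > 0)) throw new Error("rate must be positive");
  return Math.ceil(1000 / reqPerSec);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run tasks sequentially, waiting so that consecutive starts are at least one interval apart.
async function runThrottled<T>(tasks: Array<() => Promise<T>>, reqPerSec: number): Promise<T[]> {
  const interval = minIntervalMs(reqPerSec);
  const results: T[] = [];
  let lastStart = -Infinity; // the first task starts immediately
  for (const task of tasks) {
    const wait = lastStart + interval - Date.now();
    if (wait > 0) await sleep(wait);
    lastStart = Date.now();
    results.push(await task());
  }
  return results;
}
```

At the Level 3 default of 0.5 req/sec this spaces requests at least 2000 ms apart; at 1 req/sec the spacing is 1000 ms.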
Node ingestion sample
Use scripts/scraper-league-demo.ts to pull a league row from SportsData.db and submit it to /v1/scraper/ingest. The payload is intentionally redundant so moderators can review and reject it while verifying the pipeline.
Environment variables
SPORTSDB_SCRAPER_KEY="scr_xxx" # required
SPORTSDB_API_KEY="api_xxx" # optional; needed only if global API auth is enabled
SPORTSDB_API_URL="http://localhost:5555/v1/scraper/ingest" # optional
SPORTSDB_DB_PATH="c:/path/to/SportsData.db" # optional
SPORTSDB_LEAGUE_ID="4328" # optional, defaults to first row
SPORTSDB_SPORT_SLUG="soccer" # optional
Only the scraper key must be set; other overrides help target specific leagues or environments.
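For readers writing their own script, the variables above might be read along these lines. This is a sketch and not the contents of scripts/scraper-league-demo.ts: the `loadConfig` helper and `DemoConfig` type are hypothetical, and the localhost fallback for the API URL is an assumption based on the example value shown.

```typescript
interface DemoConfig {
  scraperKey: string;
  apiKey?: string;
  apiUrl: string;
  dbPath?: string;
  leagueId?: string;
  sportSlug?: string;
}

// Takes the environment as a plain object so the function is easy to test.
function loadConfig(env: Record<string, string | undefined>): DemoConfig {
  const scraperKey = env.SPORTSDB_SCRAPER_KEY;
  if (!scraperKey) {
    // The scraper key is the only required variable.
    throw new Error("SPORTSDB_SCRAPER_KEY is required");
  }
  return {
    scraperKey,
    apiKey: env.SPORTSDB_API_KEY,
    // Assumed fallback: the local URL given as the example value above.
    apiUrl: env.SPORTSDB_API_URL ?? "http://localhost:5555/v1/scraper/ingest",
    dbPath: env.SPORTSDB_DB_PATH,
    leagueId: env.SPORTSDB_LEAGUE_ID,
    sportSlug: env.SPORTSDB_SPORT_SLUG,
  };
}
```

In a Node script this would typically be called as `loadConfig(process.env)`.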
Run locally
pnpm tsx scripts/scraper-league-demo.ts
Successful runs print the submission ID. Visit /portal/moderation?source=scraper to see the redundant entry and reject it.
Internal QA Scraper
Owner: Acid Alchamy
Level: EVENTS
Verification: VERIFIED · APPROVED
Rate limit: 5 req/sec · 12/5/25, 6:52 AM
Testing sandbox checklist
- Submit 10 payloads covering your declared entity scope with complete evidence links.
- Ensure parent references exist (sportSlug, leagueSlug, teamSlug) before submission.
- Respond to moderator feedback within 24 hours—silence will pause your application.
- Once approved, monitor `/portal/scrapers` for verification updates and rate-limit changes.