Scraper program

Focused automation, trusted data

This guide explains how to apply for scraper access, submit verifiable payloads, and graduate from manual review to verified status. Keep your scope narrow, include evidence, and respect the canonical IDs surfaced across the SportsDatabase portal.

Default allocation: 1 req/sec · Manual review required

Application requirements

  1. Editor access. Scraper tooling is limited to Editors and above. Purchase a tier or earn 5,500 points to unlock the application form embedded in the portal.
  2. Clearly defined intent. Describe the competition, timeframe, and entity scope (Events, Players, Teams, or Leagues). Broad “all sports” scrapers will be rejected.
  3. Cadence and accountability. Share how often the scraper runs, where data originates, and what quality checks you run before submitting.
  4. Evidence. Every payload must include verifiable sources that moderators can audit quickly.

Submission schema

  • Reuse the shared contribution schemas (sports/leagues/teams/players/events).
  • Populate catalogId + slug fields using the canonical IDs exposed in the portal.
  • Reject payloads where parent entities do not exist—scrapers must never create parent references on the fly.
  • Include submittedAt and via metadata for traceability.
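The rules above can be sketched as a pre-submission guard. This is only an illustration under stated assumptions: the Envelope shape, the prepare and missingParents helpers, and the placement of submittedAt and via at the top level of the envelope are hypothetical, and the set of known slugs stands in for whatever lookup you run against the portal's canonical IDs.

```typescript
// Hypothetical envelope shape; only entityType and payload appear in this guide's sample.
interface Envelope {
  entityType: string;
  payload: Record<string, unknown>;
  submittedAt?: string; // assumed location for the traceability metadata
  via?: string;         // assumed location for the traceability metadata
}

// Parent reference fields named in this guide's sandbox checklist.
const PARENT_FIELDS = ["sportSlug", "leagueSlug", "teamSlug"] as const;

// Returns every parent reference in the payload that is not in the known set.
function missingParents(
  payload: Record<string, unknown>,
  knownSlugs: ReadonlySet<string>,
): string[] {
  const missing: string[] = [];
  for (const field of PARENT_FIELDS) {
    const value = payload[field];
    if (typeof value === "string" && !knownSlugs.has(value)) {
      missing.push(`${field}=${value}`);
    }
  }
  return missing;
}

// Rejects payloads with unknown parents, then stamps submittedAt and via.
function prepare(
  envelope: Envelope,
  knownSlugs: ReadonlySet<string>,
  via: string,
  now: Date = new Date(),
): Envelope {
  const missing = missingParents(envelope.payload, knownSlugs);
  if (missing.length > 0) {
    throw new Error(`Unknown parent reference(s): ${missing.join(", ")}`);
  }
  return { ...envelope, submittedAt: now.toISOString(), via };
}
```

The guard throws rather than creating the missing parent, so a scraper fails locally before it ever sends a payload that moderators would have to reject.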

Moderation checklist

  • Initial submissions stay in MANUAL_REVIEW until moderators approve at least 20 clean payloads.
  • Verified scrapers may receive higher rate limits and lighter review, but bad data resets your status.
  • Points awarded for scraper submissions are reduced by 50% and labeled SCRAPER_IMPORT in contributor history.
  • Moderators can suspend or revoke keys at any time if evidence is missing or abuse is detected.
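For local bookkeeping, the thresholds in this checklist can be modeled roughly as follows. This is a sketch, not the portal's logic: it assumes that a rejected payload resets the clean-approval count to zero, and that reaching 20 only makes a scraper eligible, since moderators still make the decision. All names here are hypothetical.

```typescript
// Threshold taken from the checklist above.
const CLEAN_PAYLOADS_REQUIRED = 20;

interface ReviewProgress {
  cleanApprovals: number;
}

// Assumption: any rejected payload resets the running count to zero.
function recordDecision(progress: ReviewProgress, approved: boolean): ReviewProgress {
  return { cleanApprovals: approved ? progress.cleanApprovals + 1 : 0 };
}

// True once the minimum number of clean payloads has been approved.
function meetsVerificationThreshold(progress: ReviewProgress): boolean {
  return progress.cleanApprovals >= CLEAN_PAYLOADS_REQUIRED;
}

// Scraper submissions earn half of the normal points.
function scraperPoints(basePoints: number): number {
  return basePoints * 0.5;
}
```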

API endpoint & headers

Endpoint

POST /v1/scraper/ingest

Send JSON bodies that match the contribution schemas. Requests must be made over HTTPS; sandbox traffic uses your local API server when running pnpm dev.

Required headers

  • X-SportsDB-Scraper-Key — your one-time-revealed scraper key secret.
  • Content-Type: application/json.
  • Optional: Idempotency-Key to avoid duplicate submissions.

The API responds with 202 Accepted plus a submission ID when validation succeeds. Most errors return 400 with a detailed path describing which field failed.
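A request along these lines could be assembled as in the sketch below. The header names and status codes come from this guide; the helper names, the FetchLike type, and the decision to return the raw response body are assumptions, because the exact shape of the success and error bodies is not documented here.

```typescript
interface IngestRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request with the required headers and an optional Idempotency-Key.
function buildIngestRequest(
  scraperKey: string,
  envelope: unknown,
  idempotencyKey?: string,
): IngestRequest {
  if (!scraperKey) throw new Error("scraper key is required");
  const headers: Record<string, string> = {
    "X-SportsDB-Scraper-Key": scraperKey,
    "Content-Type": "application/json",
  };
  if (idempotencyKey) headers["Idempotency-Key"] = idempotencyKey;
  return { method: "POST", headers, body: JSON.stringify(envelope) };
}

// Minimal fetch-compatible signature so the caller can inject the transport.
type FetchLike = (
  url: string,
  init: IngestRequest,
) => Promise<{ status: number; text(): Promise<string> }>;

// Returns the raw body on 202 Accepted; throws with the server's detail otherwise.
async function submit(fetchFn: FetchLike, url: string, request: IngestRequest): Promise<string> {
  const response = await fetchFn(url, request);
  const raw = await response.text();
  if (response.status === 202) return raw; // carries the submission ID
  throw new Error(`Ingest failed with HTTP ${response.status}: ${raw}`);
}
```

On Node 18 or later, the global fetch should satisfy FetchLike, so a call might look like submit(fetch, url, buildIngestRequest(key, envelope)). Reusing the same Idempotency-Key when retrying the same payload is the usual way to keep a retry from registering as a duplicate.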

Sample payload

{
  "entityType": "LEAGUE",
  "payload": {
    "catalogId": "201",
    "sportSlug": "soccer",
    "name": "Demo League",
    "slug": "demo-league",
    "country": "USA",
    "summary": "Short blurb (20+ chars)",
    "sources": ["https://example.com"]
  }
}

Evidence URLs are optional today but strongly encouraged. Parent slugs must already exist before you submit.
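A local check of the league payload before submitting might look like the sketch below. The field list mirrors the sample payload; the 20-character minimum is inferred from the sample's "(20+ chars)" hint, and the split between errors and warnings is an assumption rather than the API's actual validation.

```typescript
interface LeaguePayload {
  catalogId: string;
  sportSlug: string;
  name: string;
  slug: string;
  country: string;
  summary: string;
  sources?: string[];
}

// True when the value parses as an http or https URL.
function isHttpUrl(value: string): boolean {
  try {
    const url = new URL(value);
    return url.protocol === "https:" || url.protocol === "http:";
  } catch {
    return false;
  }
}

function checkLeaguePayload(p: LeaguePayload): { errors: string[]; warnings: string[] } {
  const errors: string[] = [];
  const warnings: string[] = [];
  const required: Array<[string, string]> = [
    ["catalogId", p.catalogId],
    ["sportSlug", p.sportSlug],
    ["name", p.name],
    ["slug", p.slug],
    ["country", p.country],
  ];
  for (const [field, value] of required) {
    if (value.trim() === "") errors.push(`${field}: must not be empty`);
  }
  // Assumed minimum, inferred from the sample payload's hint.
  if (p.summary.trim().length < 20) errors.push("summary: expected at least 20 characters");
  const sources = p.sources ?? [];
  // Evidence is optional today, so a missing list is a warning rather than an error.
  if (sources.length === 0) warnings.push("sources: no evidence URLs supplied");
  sources.forEach((source, i) => {
    if (!isHttpUrl(source)) errors.push(`sources[${i}]: not an http(s) URL`);
  });
  return { errors, warnings };
}
```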

Rate limits & verification tiers

Level 1 · Events

Fixtures, schedules, box scores

Default: 1 req/sec

Manual review required for each payload

Level 2 · Players

Player bios & roster updates

Default: 1 req/sec

Requires team slug validation

Level 3 · Teams

Team metadata, venues, leagues

Default: 0.5 req/sec

Unlocks after 30 verified event/player submissions

Level 4 (Leagues) is invite-only and requires ongoing reviewer sponsorship. Rate limit bumps beyond the defaults are handled through the Scraper Admin dashboard → Lifecycle panel.
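To stay inside the default limits, a scraper can space its requests on the client side. The sketch below is one way to do that, assuming the level keys EVENTS, PLAYERS, and TEAMS (only EVENTS appears verbatim in the portal) and a hypothetical Throttle helper; the server-side enforcement may differ.

```typescript
// Default rates from the tiers above, in requests per second.
const DEFAULT_REQ_PER_SEC = { EVENTS: 1, PLAYERS: 1, TEAMS: 0.5 } as const;
type Level = keyof typeof DEFAULT_REQ_PER_SEC;

// Minimum spacing between requests, e.g. 0.5 req/sec means one request every 2000 ms.
function minIntervalMs(level: Level): number {
  return 1000 / DEFAULT_REQ_PER_SEC[level];
}

class Throttle {
  private readonly intervalMs: number;
  private nextSlotMs = 0;

  constructor(intervalMs: number) {
    this.intervalMs = intervalMs;
  }

  // Reserves the next send slot and returns how long the caller should wait for it.
  reserve(nowMs: number): number {
    const start = Math.max(nowMs, this.nextSlotMs);
    this.nextSlotMs = start + this.intervalMs;
    return start - nowMs;
  }
}

// Waits for the reserved slot, then runs the send function.
async function sendThrottled<T>(throttle: Throttle, send: () => Promise<T>): Promise<T> {
  const waitMs = throttle.reserve(Date.now());
  if (waitMs > 0) await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
  return send();
}
```

Reserving slots up front, instead of sleeping a fixed interval after each call, keeps the spacing correct even when several submissions are queued at once.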

Node ingestion sample

Use scripts/scraper-league-demo.ts to pull a league row from SportsData.db and submit it to /v1/scraper/ingest. The payload is intentionally redundant so moderators can review and reject it while verifying the pipeline.

Environment variables

SPORTSDB_SCRAPER_KEY="scr_xxx"  # required
SPORTSDB_API_KEY="api_xxx"  # optional; only needed if global API auth is enabled
SPORTSDB_API_URL="http://localhost:5555/v1/scraper/ingest"  # optional
SPORTSDB_DB_PATH="c:/path/to/SportsData.db"  # optional
SPORTSDB_LEAGUE_ID="4328"  # optional, defaults to first row
SPORTSDB_SPORT_SLUG="soccer"  # optional

Only the scraper key must be set; other overrides help target specific leagues or environments.
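Reading these variables could look like the sketch below. It is not the contents of scripts/scraper-league-demo.ts: the DemoConfig shape and loadConfig helper are hypothetical, and using the localhost URL shown above as the fallback is an assumption.

```typescript
interface DemoConfig {
  scraperKey: string;
  apiKey?: string;
  apiUrl: string;
  dbPath?: string;
  leagueId?: string;
  sportSlug?: string;
}

// Assumed fallback: the example URL from the environment variable list.
const ASSUMED_DEFAULT_URL = "http://localhost:5555/v1/scraper/ingest";

// Takes the environment as a plain object so it can be exercised without touching process.env.
function loadConfig(env: Record<string, string | undefined>): DemoConfig {
  const scraperKey = env.SPORTSDB_SCRAPER_KEY?.trim();
  if (!scraperKey) {
    throw new Error("SPORTSDB_SCRAPER_KEY is required");
  }
  return {
    scraperKey,
    apiKey: env.SPORTSDB_API_KEY,
    apiUrl: env.SPORTSDB_API_URL ?? ASSUMED_DEFAULT_URL,
    dbPath: env.SPORTSDB_DB_PATH,
    leagueId: env.SPORTSDB_LEAGUE_ID,
    sportSlug: env.SPORTSDB_SPORT_SLUG,
  };
}
```

In a Node script this would typically be called as loadConfig(process.env), failing fast when the key is missing instead of sending an unauthenticated request.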

Run locally

pnpm tsx scripts/scraper-league-demo.ts

Successful runs print the submission ID. Visit /portal/moderation?source=scraper to see the redundant entry and reject it.

Recently approved scrapers

Internal QA Scraper

Owner: Acid Alchamy

Level: EVENTS
Verification: VERIFIED · APPROVED
Rate limit: 5 req/sec · 12/5/25, 6:52 AM

Testing sandbox checklist

  1. Submit 10 payloads covering your declared entity scope with complete evidence links.
  2. Ensure parent references exist (sportSlug, leagueSlug, teamSlug) before submission.
  3. Respond to moderator feedback within 24 hours—silence will pause your application.
  4. Once approved, monitor /portal/scrapers for verification updates and rate-limit changes.