Mapping Heuristics v1

Goal

Link each venue market to a canonical global_markets row, and record a confidence score for the link.

Terminology (Canon)

  • Exchange: the platform/company (Polymarket, Kalshi).
  • Venue: Cornice-internal alias for Exchange (used in logs and fields).
  • Market: the event definition + resolution rules (cross-venue concept).
  • Contract: the tradable instrument for a specific Outcome (venue-specific).

Normalization

  • Lowercase, strip punctuation, collapse whitespace.
  • Remove stopwords: "will", "the", "a", "an", "by", "at", "in".
  • Normalize currency/units: $100k -> 100000, USD -> usd.
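
The normalization steps above can be sketched as a single function. This is a minimal illustration, not the project's implementation: the name `normalize_title`, the support for an `m` suffix, and the choice to expand amounts before stripping punctuation (so that a decimal such as `$1.5m` survives) are all assumptions.

```python
import re

# Stopword list taken from the rules above.
STOPWORDS = {"will", "the", "a", "an", "by", "at", "in"}

# Assumption: only "k" appears in the source; "m" is added for illustration.
_MULTIPLIERS = {"k": 1_000, "m": 1_000_000}

# Matches amounts such as "$100k", "100k", or "$1.5m" in lowercased text.
_AMOUNT = re.compile(r"\$?(\d+(?:\.\d+)?)([km])\b")


def _expand_amount(match: re.Match) -> str:
    """Turn a shorthand amount like '100k' into its plain integer form."""
    value = float(match.group(1)) * _MULTIPLIERS[match.group(2)]
    return str(int(value))


def normalize_title(title: str) -> list[str]:
    """Return the normalized tokens of a market title."""
    text = title.lower()                      # also maps "USD" to "usd"
    text = _AMOUNT.sub(_expand_amount, text)  # "$100k" becomes "100000"
    text = re.sub(r"[^\w\s]", " ", text)      # strip punctuation
    tokens = text.split()                     # split() collapses whitespace
    return [t for t in tokens if t not in STOPWORDS]


print(normalize_title("Will BTC hit $100k by Dec 31?"))
# → ['btc', 'hit', '100000', 'dec', '31']
```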

Matching Signals

  • Title similarity: Jaccard or cosine similarity on tokenized titles.
  • End time proximity: the two end times fall within a configurable window of each other (e.g., 24h).
  • Category overlap: shared tags/labels if present.
  • Outcome alignment: both markets are binary YES/NO. This is a hard gate: a pair that fails it is never linked, whatever its score.
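
One way to compute these signals is sketched below. The function names are hypothetical. The linear decay inside the end-time window and the reuse of Jaccard overlap for categories are assumptions, since the text only names the signals without fixing their formulas.

```python
from datetime import datetime, timedelta


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared items over all items; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def end_time_score(end_a: datetime, end_b: datetime,
                   window: timedelta = timedelta(hours=24)) -> float:
    """1.0 for identical end times, falling linearly to 0.0 at the window edge.

    Assumption: linear decay. A strict in-window/out-of-window flag is an
    equally valid reading of the rule.
    """
    gap = abs(end_a - end_b)
    return max(0.0, 1.0 - gap / window)


def category_score(tags_a: set, tags_b: set) -> float:
    """Tag overlap; 0.0 when either venue supplies no tags (assumption)."""
    return jaccard(tags_a, tags_b)


def outcomes_align(outcomes_a, outcomes_b) -> bool:
    """Hard gate: both markets must expose exactly the outcomes YES and NO."""
    binary = {"YES", "NO"}
    return ({o.upper() for o in outcomes_a} == binary
            and {o.upper() for o in outcomes_b} == binary)


print(jaccard({"btc", "hit", "100000"}, {"btc", "reach", "100000"}))
# → 0.5
print(end_time_score(datetime(2025, 1, 1, 0), datetime(2025, 1, 1, 6)))
# → 0.75
```

The gate is meant to be checked first, so that a pair with mismatched outcome sets is discarded before any similarity is computed.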

Confidence Score (Draft)

score = 0.6 * title_similarity
      + 0.3 * end_time_score
      + 0.1 * category_score
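
As a worked example, the draft formula translates directly into a small function. The name `confidence_score` and the range check are illustrative additions; the weights are the ones given above.

```python
# Weights from the draft formula; they sum to 1.0.
W_TITLE = 0.6
W_END_TIME = 0.3
W_CATEGORY = 0.1


def confidence_score(title_similarity: float,
                     end_time_score: float,
                     category_score: float) -> float:
    """Weighted sum of the three signals, each expected to lie in [0, 1]."""
    for value in (title_similarity, end_time_score, category_score):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"signal out of range: {value}")
    return (W_TITLE * title_similarity
            + W_END_TIME * end_time_score
            + W_CATEGORY * category_score)


# 0.6 * 0.9 + 0.3 * 1.0 + 0.1 * 0.5 = 0.54 + 0.30 + 0.05
print(round(confidence_score(0.9, 1.0, 0.5), 2))
# → 0.89
```

Because the weights sum to 1.0, the score stays in [0, 1] whenever the inputs do. If a missing category is scored as 0 (an assumption, since the text does not say how absent tags are handled), the best possible score is 0.6 + 0.3 = 0.90.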

Status Rules

  • score >= 0.85 -> auto_confirmed
  • 0.70 <= score < 0.85 -> pending
  • score < 0.70 -> no link created
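
The thresholds map onto a short lookup, sketched here. The function name `link_status` and the use of `None` to mean "no link created" are assumptions made for illustration.

```python
from typing import Optional

AUTO_CONFIRM_THRESHOLD = 0.85
PENDING_THRESHOLD = 0.70


def link_status(score: float) -> Optional[str]:
    """Map a confidence score to a link status, or None if no link is created."""
    if score >= AUTO_CONFIRM_THRESHOLD:
        return "auto_confirmed"
    if score >= PENDING_THRESHOLD:
        return "pending"
    return None


for s in (0.89, 0.85, 0.72, 0.69):
    print(s, link_status(s))
# → 0.89 auto_confirmed
# → 0.85 auto_confirmed
# → 0.72 pending
# → 0.69 None
```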

UI Workflow

  • Show each pending pair with its titles and end times.
  • The reviewer confirms or rejects the pair, which locks the mapping.