name: matching-engine
description: Core matching algorithm using pgvector semantic similarity. Finds "I have, they need" and "I need, they have" connections between users.
Matching goal
Connect users in two independent directions:
- "I have, they need" (forward): My HAVE resource is semantically similar to someone's WANT
- "I need, they have" (reverse): Someone's HAVE resource is semantically similar to my WANT
Each direction is scored and displayed independently. There is no combined "bidirectional" score — the two directions are separate sections on the matches page.
Threshold configuration
All scoring thresholds live in src/lib/constants.ts under MATCH_THRESHOLDS. Never hardcode threshold values elsewhere — always import from constants.
import { MATCH_THRESHOLDS } from '@/lib/constants';
// MATCH_THRESHOLDS.MIN_SCORE — minimum to store a match
// MATCH_THRESHOLDS.STRONG — "strong match" badge
// MATCH_THRESHOLDS.GOOD — "good match" badge
// MATCH_THRESHOLDS.MAX_PER_USER — max matches stored per user
// MATCH_THRESHOLDS.CANDIDATE_POOL — nearest-neighbor candidates per query
These values depend on the embedding model's score distribution. When switching embedding models, recalibrate by:
- Running scripts/recalculate-all-matches.ts
- Checking the actual score distribution with SQL
- Adjusting thresholds in constants.ts so badge tiers produce meaningful separation
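One way to turn a score distribution into candidate thresholds is to read off percentiles from a sample of stored scores. The sketch below is illustrative only: `percentile` and `suggestThresholds` are hypothetical helpers, and the 50th/75th/90th percentile cut-offs are placeholder assumptions, not values used by this project.

```typescript
// Hypothetical helper: linearly interpolated percentile of an ascending-sorted sample.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) throw new Error('empty sample');
  const rank = (p / 100) * (sortedAsc.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  const a = sortedAsc[lo] as number;
  const b = sortedAsc[hi] as number;
  return a + (b - a) * (rank - lo);
}

// Hypothetical helper: propose thresholds from observed scores.
// The percentile choices are assumptions; tune them until badge tiers separate well.
function suggestThresholds(scores: number[]): { MIN_SCORE: number; GOOD: number; STRONG: number } {
  const sorted = scores.slice().sort((x, y) => x - y);
  return {
    MIN_SCORE: percentile(sorted, 50),
    GOOD: percentile(sorted, 75),
    STRONG: percentile(sorted, 90),
  };
}
```

The output is only a starting point; any values chosen this way still belong in MATCH_THRESHOLDS in constants.ts rather than in calling code.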
How matching works
Step 1: Generate embeddings on resource creation
See src/server/services/embedding.ts. The embedding provider is configurable (Gemini, OpenAI, mock).
Step 2: Find matches with pgvector
src/server/services/matching.ts uses cosine similarity (1 - (a <=> b)) to find nearest-neighbor resources across users.
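For reference, pgvector's `<=>` operator returns cosine distance, so `1 - (a <=> b)` recovers cosine similarity. The in-memory sketch below shows the same arithmetic; the function names are hypothetical and this is not the code in matching.ts, which runs the comparison inside Postgres.

```typescript
// Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] as number;
    const y = b[i] as number;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) throw new Error('zero vector has no direction');
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance, the quantity pgvector's <=> operator computes.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}
```

Identical directions score 1, orthogonal vectors score 0, and opposite directions score -1, which is why thresholds must be calibrated against the embedding model's actual score range.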
Step 3: Per-direction scoring
For each candidate user, we track the best forward score and best reverse score independently. A match is stored if either direction exceeds MATCH_THRESHOLDS.MIN_SCORE.
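The per-direction bookkeeping can be sketched as a small aggregation. Everything here is an assumption for illustration: the `Candidate` shape, the `bestScoresPerUser` name, and the use of `>=` for the cut-off (chosen to agree with the API filters described later).

```typescript
type Direction = 'forward' | 'reverse';

// Hypothetical shape of one nearest-neighbor hit against another user's resource.
interface Candidate {
  otherUserId: string;
  direction: Direction;
  score: number;
}

interface DirectionScores {
  forward: number | null;
  reverse: number | null;
}

// Track the best score per direction for each candidate user, then keep
// only users where at least one direction reaches minScore.
function bestScoresPerUser(candidates: Candidate[], minScore: number): Map<string, DirectionScores> {
  const best = new Map<string, DirectionScores>();
  for (const c of candidates) {
    const entry = best.get(c.otherUserId) ?? { forward: null, reverse: null };
    const current = entry[c.direction];
    if (current === null || c.score > current) entry[c.direction] = c.score;
    best.set(c.otherUserId, entry);
  }

  const kept = new Map<string, DirectionScores>();
  best.forEach((scores, userId) => {
    const forwardOk = scores.forward !== null && scores.forward >= minScore;
    const reverseOk = scores.reverse !== null && scores.reverse >= minScore;
    if (forwardOk || reverseOk) kept.set(userId, scores);
  });
  return kept;
}
```

Note that the two directions are never summed or averaged; a user with a strong forward score and no reverse score is still kept.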
Step 4: Dual-row insert
Each match inserts TWO rows in a single transaction:
- Primary row (A → B): A's perspective
- Mirror row (B → A): Scores and resource IDs swapped, so B immediately sees the match
Mirror rows use ON CONFLICT DO UPDATE to handle the case where B already has a row for that pair.
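The swap that produces a mirror row can be sketched as a pure function. The `MatchRow` shape below is an assumption: the score and resource field names follow the ones mentioned in this document with an `Id` suffix added, while `userId` and `matchedUserId` are hypothetical column names. The real insert and its ON CONFLICT handling live in the database layer and are not shown.

```typescript
// Hypothetical row shape; the real Prisma model may differ.
interface MatchRow {
  userId: string;
  matchedUserId: string;
  forwardScore: number | null;
  reverseScore: number | null;
  forwardHaveId: string | null; // owner's HAVE
  forwardWantId: string | null; // other user's WANT
  reverseHaveId: string | null; // other user's HAVE
  reverseWantId: string | null; // owner's WANT
}

// Build B's view of a match from A's view: swap the users and swap the
// forward and reverse groups. A's "I have, they need" is B's "I need, they have".
function mirrorRow(primary: MatchRow): MatchRow {
  return {
    userId: primary.matchedUserId,
    matchedUserId: primary.userId,
    forwardScore: primary.reverseScore,
    reverseScore: primary.forwardScore,
    forwardHaveId: primary.reverseHaveId,
    forwardWantId: primary.reverseWantId,
    reverseHaveId: primary.forwardHaveId,
    reverseWantId: primary.forwardWantId,
  };
}
```

Mirroring twice returns the original row, which is a handy property to assert in a unit test.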
Match table: four resource references
The Prisma schema stores both directions per row:
- forwardHave / forwardWant: my HAVE matched their WANT
- reverseHave / reverseWant: their HAVE matched my WANT
A row may have only forward, only reverse, or both populated.
API: querying matches by direction
src/server/routers/match.ts — the myMatches endpoint accepts a direction param:
- direction: 'forward' → filter by forwardScore >= minScore
- direction: 'reverse' → filter by reverseScore >= minScore
- direction: undefined → filter by score >= minScore
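The three filter rules can be expressed as an in-memory predicate. This is a sketch under assumptions, not the router's code: the real endpoint presumably builds a database query, and `MatchScores` and `filterByDirection` are hypothetical names.

```typescript
type Direction = 'forward' | 'reverse';

// Hypothetical minimal view of a match row: only the score columns matter here.
interface MatchScores {
  score: number;
  forwardScore: number | null;
  reverseScore: number | null;
}

// Apply the direction rule: forward and reverse filter on their own score
// column, and an omitted direction filters on the row's overall score.
function filterByDirection<T extends MatchScores>(rows: T[], minScore: number, direction?: Direction): T[] {
  return rows.filter((row) => {
    if (direction === 'forward') return row.forwardScore !== null && row.forwardScore >= minScore;
    if (direction === 'reverse') return row.reverseScore !== null && row.reverseScore >= minScore;
    return row.score >= minScore;
  });
}
```

A row with a null score in the requested direction is excluded, which matches the rule that a match may be populated in only one direction.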
UI: two-section matches page
/matches displays two stacked sections (not tabs):
- "I have, they need" — forward matches. Card shows: who needs + resource title
- "I need, they have" — reverse matches. Card shows: who has + resource title
Score badge is displayed on its own row below the resource title.
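Choosing which badge to render is a simple tier lookup against the thresholds. The sketch below assumes that scores below GOOD get no badge, which this document does not state; `badgeFor` and `BadgeThresholds` are hypothetical names, and the numbers in the usage note are placeholders.

```typescript
// Subset of MATCH_THRESHOLDS needed for badges; passed in so the helper
// never hardcodes values.
interface BadgeThresholds {
  STRONG: number;
  GOOD: number;
}

type Badge = 'strong' | 'good' | null;

// Map a per-direction score to a badge tier, checking the higher tier first.
function badgeFor(score: number, thresholds: BadgeThresholds): Badge {
  if (score >= thresholds.STRONG) return 'strong';
  if (score >= thresholds.GOOD) return 'good';
  return null; // assumption: no badge below the GOOD tier
}
```

In the app this would be called with the imported constants, for example `badgeFor(match.forwardScore, MATCH_THRESHOLDS)`; a literal such as `{ STRONG: 0.8, GOOD: 0.65 }` is suitable only for tests.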
When matching runs
- On resource create/update — recalculate for the triggering user
- On resource close/pause — recalculate (closed resources excluded)
- Vercel Cron (every 4 hours) — reconcile stale users + clean up matches referencing non-ACTIVE resources
- Never on page load — always serve from cached Match table
Performance notes
- pgvector with IVFFlat index: good enough for 100k resources
- Create index: CREATE INDEX ON resources USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
- Cron processes at most 20 users per invocation (60s timeout)
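The per-invocation cap can be sketched as a bounded loop with a time budget. This is an illustration under assumptions: `processStaleUsers`, `BatchLimits`, and the `recalc` callback are hypothetical, and the 50-second default budget is a guess at reasonable headroom under the 60s timeout, not a documented value.

```typescript
interface BatchLimits {
  maxUsers: number; // cap per invocation (20 per this document)
  budgetMs: number; // assumed safety margin below the 60s timeout
}

interface BatchResult {
  processed: string[];
  remaining: string[];
}

// Recalculate matches for stale users one at a time, stopping at the user
// cap or when the time budget is spent, whichever comes first.
async function processStaleUsers(
  staleUserIds: string[],
  recalc: (userId: string) => Promise<void>,
  limits: BatchLimits = { maxUsers: 20, budgetMs: 50000 },
  now: () => number = Date.now,
): Promise<BatchResult> {
  const start = now();
  const processed: string[] = [];
  for (const userId of staleUserIds.slice(0, limits.maxUsers)) {
    if (now() - start >= limits.budgetMs) break;
    await recalc(userId);
    processed.push(userId);
  }
  // Users not reached stay stale and are picked up by a later cron run.
  return { processed, remaining: staleUserIds.slice(processed.length) };
}
```

Injecting the clock keeps the function testable without real delays. Processing users sequentially is a deliberate simplification; running recalculations in parallel would need its own limit on database load.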
What NOT to do
- Don't hardcode threshold values; always use MATCH_THRESHOLDS from constants
- Don't treat forward and reverse as a single combined score
- Don't require both directions for a match to be valid
- Don't run matching on every page load — serve from cached Match table
- Don't try chain matching yet — that's v2