matching-engine

name: matching-engine
description: Core matching algorithm using pgvector semantic similarity. Finds "I have, they need" and "I need, they have" connections between users.

Matching goal

Connect users in two independent directions:

"I have, they need" (forward): My HAVE resource is semantically similar to someone's WANT
"I need, they have" (reverse): Someone's HAVE resource is semantically similar to my WANT

Each direction is scored and displayed independently. There is no combined "bidirectional" score — the two directions are separate sections on the matches page.

Threshold configuration

All scoring thresholds live in src/lib/constants.ts under MATCH_THRESHOLDS. Never hardcode threshold values elsewhere — always import from constants.

import { MATCH_THRESHOLDS } from '@/lib/constants';
// MATCH_THRESHOLDS.MIN_SCORE     — minimum to store a match
// MATCH_THRESHOLDS.STRONG        — "strong match" badge
// MATCH_THRESHOLDS.GOOD          — "good match" badge
// MATCH_THRESHOLDS.MAX_PER_USER  — max matches stored per user
// MATCH_THRESHOLDS.CANDIDATE_POOL — nearest-neighbor candidates per query
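
As a rough illustration of how the badge thresholds might be consumed, the sketch below maps one direction's score to a badge tier. The BadgeThresholds interface, the badgeFor helper, the use of >= for comparisons, and the numbers in exampleThresholds are all assumptions made for this example; the real values and shape are whatever src/lib/constants.ts defines.

```typescript
// Hypothetical subset of MATCH_THRESHOLDS; the real object lives in src/lib/constants.ts.
interface BadgeThresholds {
  MIN_SCORE: number;
  GOOD: number;
  STRONG: number;
}

type Badge = 'strong' | 'good' | 'none';

// Map a single direction's score to a badge tier (>= is an assumption).
function badgeFor(score: number, t: BadgeThresholds): Badge {
  if (score >= t.STRONG) return 'strong';
  if (score >= t.GOOD) return 'good';
  return 'none';
}

// Placeholder numbers for illustration only, not the project's calibrated values.
const exampleThresholds: BadgeThresholds = { MIN_SCORE: 0.5, GOOD: 0.65, STRONG: 0.8 };
```

In application code the thresholds would be imported from constants rather than declared locally, in line with the rule above.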

These values depend on the embedding model's score distribution. When switching embedding models, recalibrate by:

Running scripts/recalculate-all-matches.ts
Checking the actual score distribution with SQL
Adjusting thresholds in constants.ts so badge tiers produce meaningful separation
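
One possible way to turn an observed score distribution into candidate thresholds is to read off quantiles. The sketch below is only an assumption about how that could look: quantile and suggestTiers are hypothetical helpers, and the 75th/90th percentile cut points are arbitrary starting values, not the project's calibration rule.

```typescript
// Nearest-rank quantile of a list of scores, with p in [0, 1].
function quantile(scores: number[], p: number): number {
  if (scores.length === 0) throw new Error('no scores');
  const sorted = [...scores].sort((a, b) => a - b);
  const rank = Math.ceil(p * sorted.length);
  const idx = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[idx];
}

// Suggest badge tiers from the distribution (cut points are illustrative assumptions).
function suggestTiers(scores: number[]): { GOOD: number; STRONG: number } {
  return { GOOD: quantile(scores, 0.75), STRONG: quantile(scores, 0.9) };
}
```

The suggested numbers would still need a manual check that the tiers separate matches in a meaningful way before they go into constants.ts.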

How matching works

Step 1: Generate embeddings on resource creation

See src/server/services/embedding.ts. The embedding provider is configurable (Gemini, OpenAI, mock).

Step 2: Find matches with pgvector

src/server/services/matching.ts uses cosine similarity (1 - (a <=> b)) to find nearest-neighbor resources across users.
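
pgvector's <=> operator returns cosine distance, which is why the score is computed as 1 - (a <=> b). The in-memory sketch below shows the same relationship in plain TypeScript; cosineSimilarity and cosineDistance are illustrative helpers, not functions from matching.ts.

```typescript
// Cosine similarity of two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) throw new Error('zero vector');
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance, the quantity pgvector's <=> operator computes.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}
```

Identical directions give a similarity of 1 (distance 0), orthogonal vectors give 0 (distance 1), and opposite vectors give -1 (distance 2).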

Step 3: Per-direction scoring

For each candidate user, we track the best forward score and best reverse score independently. A match is stored if either direction exceeds MATCH_THRESHOLDS.MIN_SCORE.
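
The per-direction bookkeeping can be sketched as a reduction over candidate hits. CandidateHit, DirectionalScores, and bestScoresByUser are hypothetical names for this example, and the >= comparison against the minimum score is an assumption chosen to mirror the API filter; the actual logic is in src/server/services/matching.ts.

```typescript
// One nearest-neighbor hit against another user's resource (hypothetical shape).
interface CandidateHit {
  userId: string;
  direction: 'forward' | 'reverse';
  score: number;
}

interface DirectionalScores {
  forward: number | null;
  reverse: number | null;
}

// Track the best forward and best reverse score per candidate user,
// then keep users where either direction reaches minScore.
function bestScoresByUser(hits: CandidateHit[], minScore: number): Map<string, DirectionalScores> {
  const best = new Map<string, DirectionalScores>();
  for (const hit of hits) {
    const entry = best.get(hit.userId) ?? { forward: null, reverse: null };
    const current = entry[hit.direction];
    if (current === null || hit.score > current) entry[hit.direction] = hit.score;
    best.set(hit.userId, entry);
  }
  const kept = new Map<string, DirectionalScores>();
  best.forEach((s, userId) => {
    const forwardOk = s.forward !== null && s.forward >= minScore;
    const reverseOk = s.reverse !== null && s.reverse >= minScore;
    if (forwardOk || reverseOk) kept.set(userId, s);
  });
  return kept;
}
```

The two scores are never merged: a user qualifies on the strength of one direction alone, and the other direction may stay null or below the threshold.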

Step 4: Dual-row insert

Each match inserts TWO rows in a single transaction:

Primary row (A → B): A's perspective
Mirror row (B → A): Scores and resource IDs swapped, so B immediately sees the match

Mirror rows use ON CONFLICT DO UPDATE to handle the case where B already has a row for that pair.
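
The swap that produces the mirror row can be sketched as a pure function. The MatchRow shape below is an assumption for illustration (userId, matchedUserId, and the Id-suffixed resource fields are hypothetical names), and the sketch covers only the swap, not the transaction or the ON CONFLICT handling.

```typescript
// Hypothetical row shape; the real columns are defined in the Prisma schema.
interface MatchRow {
  userId: string;
  matchedUserId: string;
  forwardScore: number | null;
  reverseScore: number | null;
  forwardHaveId: string | null;
  forwardWantId: string | null;
  reverseHaveId: string | null;
  reverseWantId: string | null;
}

// Build B's view of a match from A's row: A's forward direction
// (A's HAVE, B's WANT) is B's reverse direction, and vice versa.
function mirrorOf(row: MatchRow): MatchRow {
  return {
    userId: row.matchedUserId,
    matchedUserId: row.userId,
    forwardScore: row.reverseScore,
    reverseScore: row.forwardScore,
    forwardHaveId: row.reverseHaveId,
    forwardWantId: row.reverseWantId,
    reverseHaveId: row.forwardHaveId,
    reverseWantId: row.forwardWantId,
  };
}
```

Mirroring twice returns the original row, which is a cheap sanity check for the swap.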

Match table: four resource references

The Prisma schema stores both directions per row:

forwardHave / forwardWant — my HAVE matched their WANT
reverseHave / reverseWant — their HAVE matched my WANT

A row may have only forward, only reverse, or both populated.

API: querying matches by direction

src/server/routers/match.ts — the myMatches endpoint accepts a direction param:

direction: 'forward' → filter by forwardScore >= minScore
direction: 'reverse' → filter by reverseScore >= minScore
direction: undefined → filter by score >= minScore
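
The three filter cases can be written as an in-memory predicate. This is only a sketch of the rule: ScoredMatch and filterByDirection are hypothetical, and the real endpoint presumably applies the filter in the database query rather than on an array.

```typescript
type Direction = 'forward' | 'reverse' | undefined;

// Minimal fields the filter needs (hypothetical shape).
interface ScoredMatch {
  score: number;
  forwardScore: number | null;
  reverseScore: number | null;
}

// Apply the direction-specific minimum-score filter.
function filterByDirection<T extends ScoredMatch>(
  matches: T[],
  direction: Direction,
  minScore: number,
): T[] {
  return matches.filter((m) => {
    if (direction === 'forward') return m.forwardScore !== null && m.forwardScore >= minScore;
    if (direction === 'reverse') return m.reverseScore !== null && m.reverseScore >= minScore;
    return m.score >= minScore;
  });
}
```

A row with only one direction populated drops out of the other direction's list, because a null score never passes the comparison.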

UI: two-section matches page

/matches displays two stacked sections (not tabs):

"I have, they need" — forward matches. Card shows: who needs + resource title
"I need, they have" — reverse matches. Card shows: who has + resource title

Score badge is displayed on its own row below the resource title.

When matching runs

On resource create/update — recalculate for the triggering user
On resource close/pause — recalculate (closed resources excluded)
Vercel Cron (every 4 hours) — reconcile stale users + clean up matches referencing non-ACTIVE resources
Never on page load — always serve from cached Match table

Performance notes

pgvector with an IVFFlat index: adequate for roughly 100k resources
Create index: CREATE INDEX ON resources USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100); (lists = 100 follows pgvector's suggested starting point of rows / 1000)
Cron processes at most 20 users per invocation (60s timeout)
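
The per-invocation cap can be sketched as a batch selection step. StaleUser, its lastMatchedAt field, and the oldest-first ordering are assumptions for this example; the source only states that at most 20 users are processed per run.

```typescript
// Hypothetical record of a user whose matches need reconciling.
interface StaleUser {
  id: string;
  lastMatchedAt: number; // epoch milliseconds
}

// Pick at most `limit` users, oldest first, without mutating the input.
function pickCronBatch(users: StaleUser[], limit: number): StaleUser[] {
  return [...users].sort((a, b) => a.lastMatchedAt - b.lastMatchedAt).slice(0, limit);
}
```

Users left out of one batch remain stale and become candidates for the next cron run, so the backlog drains over successive invocations.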

What NOT to do

Don't hardcode threshold values — always use MATCH_THRESHOLDS from constants
Don't treat forward and reverse as a single combined score
Don't require both directions for a match to be valid
Don't run matching on every page load — serve from cached Match table
Don't try chain matching yet — that's v2
