Model-swap detection for inference marketplaces

Your buyers pay for Opus.
Prove they're getting it.

Some of your sellers advertise Opus and quietly serve something cheaper — and when quality drops, the buyer blames your marketplace, not the seller. Litmus is the neutral layer that catches them: independent attestors test what each seller is actually serving and post the verdict on-chain.

LIVE — 0% of premium-model offers we've tested serve a cheaper model than advertised.
  Litmus report · marketplace #SI-2291 · 2 offers
Advertised: claude-opus-4.6
Verified
↳ serving claude-opus-4.6 · 20 probes matched
0/100 authenticity · 3 attestors · on-chain
Advertised: claude-opus-4.6
Downgrade
↳ actually serving llama-3.3-70b · 17 / 20 probes failed
0/100 authenticity · slashed · tx 0x8f…c402

The trust hole

You can't verify your own sellers. And buyers know it.

Sellers win the cheapest-first race by quietly swapping the model. A request for Opus comes back from Llama. Your buyer's product quality drops — and they blame your marketplace, not the seller. You profit from the cheapest offer, so self-verification isn't trust. It's marketing.

Buyer pays for: claude-opus-4.6
Seller serves: llama-3.3-70b
Buyer churns: blames you

The evidence

We can already show you who's lying.

Our probes run as ordinary buyer traffic — indistinguishable from a real request, so a cheating seller can't dodge them. Point us at your marketplace and we'll hand you a report: every offer, the model it claims, the model it's actually serving, with on-chain evidence anyone can replay.

No integration required to get the report. You'll know within a day.

The fix

Catch the swap before your buyers do.

Litmus is a verification protocol, not a service you have to trust. Independent attestors stake real money, probe your sellers, and publish a score on-chain. You just read it — and the cheaters get caught while the buyer's still happy.

01 / NEUTRAL

Not your job to be trusted.

You're conflicted — you earn on the cheapest seller. Litmus is the disinterested third party. Buyers believe a score you didn't produce.

02 / ON-CHAIN

Every score is auditable.

Scores, evidence hashes, and slashing events settle on Base. A buyer — or you — can replay any verdict. It's proof, not a press release.

03 / ZERO OPS

You run nothing.

No probes, no attestors, no settlement to operate. Litmus handles the testing economy. You add one field to your router.
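
Replaying a verdict, as described under ON-CHAIN, could come down to re-hashing the evidence you hold and comparing it with the hash published on-chain. The sketch below is illustrative only: the `ProbeEvidence` shape, the use of SHA-256 over JSON, and the function names are assumptions, not the Litmus schema or API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical evidence record; the real Litmus schema may differ.
interface ProbeEvidence {
  offerId: string;
  probeId: string;
  response: string;
}

// Assumed scheme: SHA-256 over the JSON-serialized record list.
function evidenceHash(records: ProbeEvidence[]): string {
  return createHash("sha256").update(JSON.stringify(records)).digest("hex");
}

// True when the locally held evidence matches the published hash.
function matchesPublishedHash(
  records: ProbeEvidence[],
  publishedHash: string,
): boolean {
  const normalized = publishedHash.toLowerCase().replace(/^0x/, "");
  return evidenceHash(records) === normalized;
}

const records: ProbeEvidence[] = [
  { offerId: "offer-1", probeId: "probe-01", response: "sample output" },
];

// Stand-in for a hash read from the chain.
const published = "0x" + evidenceHash(records);
console.log(matchesPublishedHash(records, published));
```

Any change to the evidence changes the hash, so a mismatch means the evidence in hand is not what the verdict was based on.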

Make vs. buy

Building it yourself doesn't solve the problem.

In-house verification

  • You're the conflicted party — buyers discount your own "verified" stamp
  • Months of probe engineering, on-chain settlement, and attestor coordination
  • You still have to be trusted not to protect your top-earning sellers
  • No shared standard — every marketplace reinvents it differently

Litmus

  • Neutral by construction — the protocol, not you, issues the verdict
  • Auditable on-chain — buyers replay the evidence themselves
  • Live in an afternoon — one read on your routing path
  • A shared badge buyers learn to look for across the market

Integration

One read on your routing path.

Verification runs async, entirely off your hot path, so it never touches your latency or your margins. Read a seller's score from an edge KV cache in under a millisecond and fold it into the sort you already run. Feature-flag it. Ship it behind a toggle. No crypto knowledge required to read a number.

router/rank.ts
// litmus score: 0–10000 bps, edge-cached
const score = await litmus.get(offer.id)   // KV, <1ms

// fold into your existing cheapest-first sort
const effectiveCost =
  estimatedCost / (1 + score / 10000 * WEIGHT)

// WEIGHT = 0  → behavior unchanged
// WEIGHT = 1  → prefer a verified (100%-score) seller
//              over a 50%-score seller at up to ~1.33x the price
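
To see what the weight does in practice, here is a small worked sketch using the same formula as `router/rank.ts`. The `Offer` shape and `rank` helper are hypothetical, and the two sample offers assume the badge scores of 98 and 11 correspond to 9800 and 1100 bps.

```typescript
// Hypothetical offer shape; field names are illustrative, not a Litmus API.
interface Offer {
  id: string;
  pricePerMTok: number; // USD per 1M tokens
  scoreBps: number; // score in basis points, 0 to 10000
}

// Same formula as router/rank.ts: a higher score shrinks the effective cost.
function effectiveCost(offer: Offer, weight: number): number {
  return offer.pricePerMTok / (1 + (offer.scoreBps / 10000) * weight);
}

// Sort ascending by effective cost without mutating the input.
function rank(offers: Offer[], weight: number): Offer[] {
  return [...offers].sort(
    (a, b) => effectiveCost(a, weight) - effectiveCost(b, weight),
  );
}

const offers: Offer[] = [
  { id: "seller_a1", pricePerMTok: 11.4, scoreBps: 9800 },
  { id: "seller_x7", pricePerMTok: 9.1, scoreBps: 1100 },
];

// weight 0 is a plain price sort; weight 1 applies the score adjustment.
const cheapestFirst = rank(offers, 0).map((o) => o.id);
const scoreAware = rank(offers, 1).map((o) => o.id);
console.log(cheapestFirst, scoreAware);
```

With a weight of 0 the cheaper flagged offer still sorts first. With a weight of 1 the effective costs are roughly 5.76 for seller_a1 and 8.20 for seller_x7, so the verified offer moves to the top despite its higher list price.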

Buyer-facing

The badge buyers start filtering for.

Surface the score next to price. Honest sellers earn a premium; cheaters get routed last. You become the marketplace buyers recommend — because it's the one that proves what it sells.

98 verified claude-opus-4.6 · seller_a1
$11.40 / 1M
11 flagged claude-opus-4.6 · seller_x7
$9.10 / 1M
Show verified sellers only
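
A "verified only" toggle could be a one-line filter over the offers you already list. This is a minimal sketch: the `ScoredOffer` shape and the 9000 bps cutoff are assumptions chosen for illustration, not values defined by Litmus.

```typescript
// Hypothetical listing shape; not a Litmus type.
interface ScoredOffer {
  id: string;
  scoreBps: number; // score in basis points, 0 to 10000
}

// Assumed cutoff: treat offers at or above 9000 bps as verified.
const VERIFIED_MIN_BPS = 9000;

// Keep only offers whose score meets the cutoff.
function verifiedOnly(
  offers: ScoredOffer[],
  minBps: number = VERIFIED_MIN_BPS,
): ScoredOffer[] {
  return offers.filter((o) => o.scoreBps >= minBps);
}

const listed: ScoredOffer[] = [
  { id: "seller_a1", scoreBps: 9800 },
  { id: "seller_x7", scoreBps: 1100 },
];

const visible = verifiedOnly(listed).map((o) => o.id);
console.log(visible);
```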

Under the hood

Why the score can't be gamed.

▦ STAKE

Attestors put up USDC

Verifiers stake real money to participate. A dishonest verdict costs them their stake.

◈ BLEND

Probes look like traffic

Tests route through your normal buyer path. A seller can't serve the real model only when watched.

⚡ SLASH

Liars get slashed

Caught downgrading? The seller's stake is slashed on-chain and the offer drops in routing.

⟳ CHALLENGE

Anyone can dispute

Every result has a challenge window. Re-run the probes, post contradicting evidence, win the slash.

Built to plug into the marketplaces buyers already use
Surplus Intelligence · OpenRouter · Together · Requesty · Helicone · Glama · Portkey

OpenAI-compatible from day one. If you route inference, reading a Litmus score takes one call.

Free · no integration required

See who's lying on your marketplace.


Prefer to talk first? Book a 15-min call →