The product

One synthetic population. Every domain renders from it.

We don't sell loose tables of fake rows. We generate synthetic patient archetypes, then render their entire data trail — eligibility, claims, Rx, revenue, and more — across every line of business, all from the same coherent people. Medicare Advantage is available now; everything else is the same engine pointed at a new render target.

Download the free sample → · See pricing

Synthetic population

every record renders from it

4 live

Data domains

+ encounters · labs · quality

Available now

every line of business next

36 mo

Longitudinal

member-month grain

One engine, every render target

The archetype is the source. The data is what it renders.

A synthetic member isn't a row — it's a clinical life. Once we've generated the person, we can emit whatever data domain you need from that single coherent source, all joining on member_id. Four domains ship today; the rest render from the same people as we expand.

How the engine works →

Eligibility & enrollment

Available now

Member-month enrollment, demographics, plan/program, and benefit status — the spine every other domain links to.

Medical claims

Available now

Line-level institutional + professional claims: diagnoses, procedures, settings, allowed/paid, and realistic adjustment chains.

Pharmacy

Available now

NCPDP-grade drug fills with refill chains, benefit phases, formulary tiers, and net-of-rebate economics.

Revenue & payment

Available now

Payer-side revenue the way a plan receives it — capitation, risk scores, and the factors behind every dollar.

Encounters

Expanding

Encounter-level utilization independent of billing — the visit-and-service record health systems and value-based programs run on.

Labs & results

Expanding

Ordered tests with realistic result values trended to each member's conditions — A1c that tracks the diabetic, eGFR that tracks the CKD.

Quality measures

Expanding

Measure-ready numerators, denominators, and gaps (HEDIS-style / Stars) rendered from each member's actual care.

Lines of business

Medicare Advantage today · the rest expanding

Medicare Advantage (live) · Medicare FFS (soon) · Commercial / employer (soon) · Medicaid (soon) · ACA / exchange (soon)

The product below — the files, sizes, releases, and pricing — is the Medicare Advantage line, available to download today. Every other line of business renders from the same archetype engine.

Medicare Advantage — available now

Available now

One population, rendered four ways.

Today the Medicare Advantage line renders four linked domains — eligibility, medical, Rx, and revenue. Each file is a faithful representation of how a plan actually receives that data stream. They are not independent samples — they are the same synthetic members seen from four angles, and they all join on member_id.

Eligibility

one row per member-month

The enrollment spine. Demographics, plan/contract, dual & LIS & ESRD status, and the HCC condition flags that drive everything downstream.

Member-month grain across 36 months
Realistic age-ins, disenrollment, mortality
Dual / LIS / ESRD flags that flip mid-year
Links to every other file via member_id

Revenue (MMR)

one row per member-month

CMS payment the way a plan actually receives it. Part C & Part D capitation, V24/V28 blended risk scores, and the demographic + dual factors behind each dollar.

V24, V28, and blended risk scores
Part C + Part D capitation lines
Coding-intensity + normalization applied
Reconciles to eligibility member-months
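As a rough illustration of what a "blended" risk score means, the sketch below mixes a V24 and a V28 score with a weight. The function name and the example scores are hypothetical. The 33% / 67% / 100% V28 weights in the usage lines follow CMS's published phase-in for payment years 2024 to 2026 as we understand it; whether a given release applies exactly those weights is an assumption to check against its report bundle. Normalization and coding-intensity adjustments are deliberately left out.

```python
def blended_risk_score(v24: float, v28: float, v28_weight: float) -> float:
    """Weighted blend of two model scores; v28_weight is the share given to V28."""
    if not 0.0 <= v28_weight <= 1.0:
        raise ValueError("v28_weight must be between 0 and 1")
    return (1.0 - v28_weight) * v24 + v28_weight * v28


# Hypothetical member whose score is lower under V28 than under V24.
v24_score, v28_score = 1.10, 1.00

for year, weight in [(2024, 0.33), (2025, 0.67), (2026, 1.00)]:
    score = blended_risk_score(v24_score, v28_score, weight)
    print(year, round(score, 3))
```

Keeping the weight as a parameter means the same helper works for any blend a release documents.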

Medical claims

one row per claim line

Line-level institutional + professional claims. DRGs, CPT/HCPCS, revenue codes, place of service, 25 diagnosis slots, allowed/paid, and realistic adjustment chains.

Institutional + professional, multi-line claims
Up to 25 ICD-10 diagnoses per claim
Acute episode bundles (CHF, AMI, sepsis…)
Adjustments, denials, reversals, paid-date lag

Rx claims (Part D)

one row per fill line

NCPDP-grade pharmacy claims. NDC-level fills with refill chains, benefit phases (deductible → coverage gap → catastrophic), formulary tiers, and net-of-rebate economics.

Refill chains with realistic adherence
Benefit phases + IRA Part D cap effects
Formulary tiers, DAW codes, pharmacy NPIs
Gross, member, plan, and net-of-rebate paid

Referential integrity is guaranteed: every claim line, every fill, and every MMR row points back to a member that exists in eligibility for that month. You can join the full picture — diagnosis to spend to risk score to payment — without a single orphaned key.
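If you want to verify the join yourself, a check along these lines would do it. This is a minimal sketch using plain Python dictionaries: `member_id` comes from the page, while the `month` column name, the sample rows, and the `orphaned_keys` helper are hypothetical stand-ins for whatever the shipped schema uses.

```python
def orphaned_keys(eligibility_rows, child_rows):
    """Return (member_id, month) keys in child_rows with no eligibility row."""
    enrolled = {(r["member_id"], r["month"]) for r in eligibility_rows}
    child = {(r["member_id"], r["month"]) for r in child_rows}
    return sorted(child - enrolled)


# Tiny hand-made example; real files would be loaded from the download.
eligibility = [
    {"member_id": "M001", "month": "2024-01"},
    {"member_id": "M001", "month": "2024-02"},
    {"member_id": "M002", "month": "2024-01"},
]
claims = [
    {"member_id": "M001", "month": "2024-02", "paid": 120.0},
    {"member_id": "M002", "month": "2024-01", "paid": 75.5},
]
# A deliberately broken copy: M003 never appears in eligibility.
bad_claims = claims + [{"member_id": "M003", "month": "2024-01", "paid": 10.0}]

print(orphaned_keys(eligibility, claims))      # []
print(orphaned_keys(eligibility, bad_claims))  # [('M003', '2024-01')]
```

The same function can be pointed at fills or MMR rows, since every file is meant to key back to eligibility in the same way.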

Packaging

Buy one file, or take the bundle and save.

Pricing is à la carte. Need only revenue to backtest a risk model, or only Rx for a Part D study? Buy that one file. Need the whole joined population? The full bundle of all four costs less than the sum of its parts.

Starter panel

100,000 members · per release

100k

À la carte

Eligibility: $1,000

Revenue (MMR): $1,000

Rx claims: $1,000

Medical claims: $1,000

All four, separately: $4,000

Full bundle

All four linked files

$3,000

Save 25%

Production panel

500,000 members · per release

500k

À la carte

Eligibility: $2,500

Revenue (MMR): $2,500

Rx claims: $2,500

Medical claims: $2,500

All four, separately: $10,000

Full bundle

All four linked files

$7,500

Save 25%

Prices are per release, in USD. The same à-la-carte and bundle structure applies on every available version — you choose the fidelity separately from the files and the size. See full pricing →
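The bundle arithmetic is easy to check. The sketch below uses only the prices listed on this page; the dictionary and function names are illustrative, not part of any ordering API.

```python
# Prices copied from the page (USD, per release).
PER_FILE = {"starter": 1_000, "production": 2_500}
BUNDLE = {"starter": 3_000, "production": 7_500}
FILE_COUNT = 4  # eligibility, revenue (MMR), Rx claims, medical claims


def bundle_saving(panel: str) -> float:
    """Fraction saved by the bundle versus buying all four files separately."""
    separate = FILE_COUNT * PER_FILE[panel]
    return (separate - BUNDLE[panel]) / separate


def break_even_files(panel: str) -> int:
    """Smallest number of single files whose total reaches the bundle price."""
    return -(-BUNDLE[panel] // PER_FILE[panel])  # ceiling division


for panel in PER_FILE:
    print(panel, f"{bundle_saving(panel):.0%}", break_even_files(panel))
# starter 25% 3
# production 25% 3
```

At these prices, three files bought separately already cost as much as the bundle, so the fourth file is effectively free.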

Size tiers

Prototype on a starter panel. Ship on a production one.

Both tiers are the same generator, the same schema, and the same fidelity — they differ only in how many synthetic members you get. Pick the size by the job, not the quality.

Starter panel

100,000 members

A 100,000-member population, large enough to be statistically meaningful yet small enough to download, query on a laptop, and iterate on fast. Built for prototyping: schema validation, pipeline development, demos, methodology checks, and cost-sensitive experiments where you need realistic structure more than population scale.

Develop and unit-test ingestion + transforms
Demo a product on realistic, PHI-free data
Sanity-check a methodology before you scale it
From $1,000 per file

Production panel

500,000 members

A 500,000-member population — enough density to model rarer conditions, stabilize HCC prevalence, and trust tail behavior in the cost curve. Built for production: training and validating risk-adjustment models, benchmarking a real book of business, and any analysis where small-cell noise would otherwise dominate.

Train and validate risk-adjustment models at scale
Benchmark a population with stable rates and tails
Study rare cohorts without tiny-sample noise
From $2,500 per file

Versions = model family

Choose a release the way you'd choose a model.

Each version is a distinct model of reality. Newer releases capture more of the messy truth and carry a higher fidelity score; older ones are lighter and cheaper and still hit national control totals. A third release, v3, is on the roadmap. Pick the fidelity your use case actually needs.

The releases below (v1, v2) are the Medicare Advantage line — more lines of business will follow on the same versioned engine.

v2 · Meridian · Current

Persistence, seasonality, and the social-determinants MLR fix.

fidelity score

AI archetypes: 1,000

quality ver.: 2.0

Best for: Benchmarking, model training, and anything sensitive to member-level persistence or seasonality.

Changelog · 2026-06

1,000 AI-generated patient archetypes (up from 50)
Member-level utilization persistence (sticky high-utilizers)
Monthly seasonality on ER / IP / surgery / office visits
Dual MLR now correctly exceeds non-dual (was inverted in v1)
Explicit well-cohort carve-out for a realistic 5/50 spend curve

v1 · Baseline · Available

The first calibrated release. Solid control totals, simpler dynamics.

fidelity score

AI archetypes: 50

quality ver.: 1.0

Best for: Schema validation, pipeline development, and cost-sensitive prototyping.

Changelog · 2026-05

Four linked parquet files, full schema
Calibrated to MA national control totals (risk, MLR, PMPM, util/1000)
HCC-driven member journeys + acute episode bundles
Automated credibility audit shipped with every batch

v3 · next · Roadmap

Real-claims pattern learning, SNP cohorts, provider continuity.

fidelity score: n/a

AI archetypes: 5,000

quality ver.: 3.0

Best for: Coming next — the highest-fidelity release.

Planned

Pattern-learning pipeline trained on licensed real claims
SNP cohort modeling (D-SNP / C-SNP / I-SNP)
Provider continuity + referral networks
Readmission + post-acute pathways to clinical targets

Fidelity scores are our composite credibility metric — the same eval, run release over release, so the trend is honest. v3 is calibrated to published benchmarks today; the real-claims pattern-learning that earns its score is the next thing we're building. How we measure fidelity →

Every dataset is auditable

Each purchase ships with its own credibility audit.

You should never have to take our word for it. Every dataset you buy — any file, any size, any release — arrives with an actuarial summary and a credibility audit generated against that exact batch. The metrics below are from the current Medicare Advantage release.

The report bundle covers the metrics an actuary or data scientist would check first: PMPM, medical loss ratio, average risk score, utilization per 1,000, HCC prevalence, and cost concentration — each compared to a published CMS, MedPAC, or public-insurer benchmark, with citations. If a number drifts outside its expected band, the audit flags it rather than hiding it.
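For readers who want to recompute a few of these metrics from the raw files, the definitions can be sketched as below. These are simplified textbook forms, not the audit's actual code: the function names are hypothetical, the loss ratio ignores the regulatory adjustments a filed MLR includes, and utilization is annualized from member-months.

```python
def pmpm(total_paid: float, member_months: int) -> float:
    """Per-member-per-month cost."""
    return total_paid / member_months


def per_1000(events: int, member_months: int) -> float:
    """Annualized events per 1,000 members."""
    return events / member_months * 12 * 1000


def loss_ratio(claims_paid: float, revenue: float) -> float:
    """Simplified medical loss ratio: claims divided by revenue."""
    return claims_paid / revenue


def top_share(member_costs, top_fraction: float = 0.05) -> float:
    """Share of total cost incurred by the costliest top_fraction of members."""
    costs = sorted(member_costs, reverse=True)
    k = max(1, round(len(costs) * top_fraction))
    return sum(costs[:k]) / sum(costs)


# Hand-made toy population of 20 members: one very costly, nine with no spend.
toy_costs = [1000] + [100] * 10 + [0] * 9

print(pmpm(1_042_000, 1_000))        # 1042.0
print(round(per_1000(274, 12_000)))  # 274
print(top_share(toy_costs))          # 0.5
```

In the toy population the single costliest member (the top 5% of 20) accounts for half of all spend, which is the shape a "5/50" cost curve describes.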

How we measure fidelity

Actuarial summary · MA release · illustrative

Metric | Dataset | Benchmark
Avg risk score | 1.01 | 1.00–1.10
Medical loss ratio | 88% | 85–92%
PMPM (total) | $1,042 | $980–1,100
IP admits / 1,000 | 274 | 250–300
HCC prevalence | 79% | 75–82%
Top-5% cost share | 49% | ~50%

Figures are illustrative. Each dataset's real audit — computed on the batch you purchase — ships in its report bundle alongside the data.
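The "flag it rather than hide it" behavior amounts to a band check. Here is a minimal sketch using the illustrative figures above; the metric keys and the `audit_flags` helper are hypothetical, and the top-5% cost share is omitted because the page gives it an approximate target rather than a band.

```python
# Benchmark bands taken from the illustrative table (ratios as fractions).
BANDS = {
    "avg_risk_score": (1.00, 1.10),
    "medical_loss_ratio": (0.85, 0.92),
    "pmpm_total": (980, 1100),
    "ip_admits_per_1000": (250, 300),
    "hcc_prevalence": (0.75, 0.82),
}


def audit_flags(metrics: dict, bands: dict = BANDS) -> dict:
    """Mark each metric 'ok' if it sits inside its band, else 'flag'."""
    flags = {}
    for name, (low, high) in bands.items():
        flags[name] = "ok" if low <= metrics[name] <= high else "flag"
    return flags


illustrative = {
    "avg_risk_score": 1.01,
    "medical_loss_ratio": 0.88,
    "pmpm_total": 1042,
    "ip_admits_per_1000": 274,
    "hcc_prevalence": 0.79,
}
# A made-up batch whose loss ratio has drifted above the band.
drifted = dict(illustrative, medical_loss_ratio=0.95)

print(audit_flags(illustrative))
print(audit_flags(drifted)["medical_loss_ratio"])  # flag
```

Every illustrative figure lands inside its band, while the drifted batch is flagged on the one metric that moved.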

Judge the fidelity yourself.

Start with the free 1,000-member sample — full schema, full report bundle, no signup. Then browse what's ready to download and pick your files, size, and release.

Get the free sample · Browse available datasets