Glossary

From Data to Bedside · every term in the pathway, defined and linked

This glossary indexes every concept in the pathway. Each entry gives a one-line definition and links to the pathway node where the term is taught in full; on the pathway, the term links back here. Terms are listed alphabetically.

1-inpatient / 2-outpatient rule

Counting a patient as a case only after one inpatient diagnosis or two outpatient diagnoses on separate dates, which filters out rule-out codes.

On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation

Is your outcome a validated algorithm or an unchecked code rule?

the overall idea · Outcome phenotyping and validation
the rule itself · Claims/EHR phenotype algorithm
common coding rule · 1-inpatient / 2-outpatient rule
tuning the rule · Algorithm validation (PPV and sensitivity tradeoff)
reference standard · Endpoint adjudication and chart review
bundled outcomes · Composite endpoint construction
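
As an illustration of the 1-inpatient / 2-outpatient rule, here is a minimal Python sketch. The claims rows, their field layout, and the function name `is_case` are hypothetical; published algorithms often also require a minimum or maximum gap between the two outpatient dates, which is not assumed here.

```python
from datetime import date

# Hypothetical claims rows: (patient_id, care setting, diagnosis date).
claims = [
    ("p1", "inpatient", date(2021, 3, 1)),
    ("p2", "outpatient", date(2021, 1, 5)),
    ("p2", "outpatient", date(2021, 1, 5)),  # same date, so it counts once
    ("p3", "outpatient", date(2021, 2, 1)),
    ("p3", "outpatient", date(2021, 4, 9)),
]


def is_case(rows):
    """Apply the rule to one patient's (setting, date) rows."""
    if any(setting == "inpatient" for setting, _ in rows):
        return True
    outpatient_dates = {d for setting, d in rows if setting == "outpatient"}
    return len(outpatient_dates) >= 2


# Group rows by patient, then keep the patients who satisfy the rule.
by_patient = {}
for patient_id, setting, diagnosis_date in claims:
    by_patient.setdefault(patient_id, []).append((setting, diagnosis_date))

cases = sorted(pid for pid, rows in by_patient.items() if is_case(rows))
print(cases)  # ['p1', 'p3']
```

Patient p2 is excluded because both outpatient codes fall on one date, which is the pattern a rule-out visit tends to produce.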

3+3 design

A classic rule-based phase I design that escalates the dose in cohorts of three patients until dose-limiting toxicity appears.

On the pathway · 00 · Framing · Dose-finding and early-phase designs

Which early-phase design question are you facing?

the overall design family · Dose-finding and early-phase designs
dose escalation by fixed cohort rule · 3+3 design
model-based dose escalation · Continual reassessment method
the highest acceptably safe dose · Maximum tolerated dose
phase II screening for efficacy · Simon’s two-stage design

A

Absolute risk reduction

The absolute difference in risk between groups; its reciprocal is the number needed to treat.

On the pathway · 03 · Estimate · Relative versus absolute

Which scale frames the effect?

the overall idea · Relative versus absolute
effect on the absolute scale · Absolute risk reduction
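
The relationship between absolute risk reduction and number needed to treat can be sketched in a few lines of Python. The two risks below are made-up values chosen for easy arithmetic, and the function names are illustrative.

```python
def absolute_risk_reduction(risk_control, risk_treated):
    """Risk in the control group minus risk in the treated group."""
    return risk_control - risk_treated


def number_needed_to_treat(arr):
    """Reciprocal of the absolute risk reduction."""
    if arr == 0:
        raise ValueError("NNT is undefined when the risks are equal")
    return 1.0 / arr


# Hypothetical risks: 10% of controls and 6% of treated patients have the event.
arr = absolute_risk_reduction(0.10, 0.06)
nnt = number_needed_to_treat(arr)
print(round(arr, 3), round(nnt, 1))  # 0.04 25.0
```

The same data give a relative risk of 0.6, which sounds larger than a four-percentage-point reduction; that contrast is what the choice of scale controls.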

Accelerated failure time (AFT) models

Parametric models of log survival time that yield a time ratio, a common alternative when proportional hazards fails.

On the pathway · 03 · Estimate · Competing risks and parametric survival

Does a competing event block the outcome?

modeling time to event · Competing risks and survival models
a competing event exists · Competing risks
you want etiology · Cause-specific hazard
you want absolute risk · Cumulative incidence function (CIF)
modeling absolute risk directly · Fine-Gray subdistribution hazard
proportional hazards fails · Accelerated failure time (AFT) models
some patients are cured · Cure models

Active-comparator new-user design

Restricts to initiators of a treatment versus an active alternative, curbing confounding by indication and prevalent-user and immortal-time distortions.

On the pathway · 00 · Framing · Observational study designs

Which observational design fits the question and dominant bias?

the overall family · Observational study designs
exposure known, follow forward · Cohort study
snapshot at one time · Cross-sectional study
rare outcome, look back · Case-control study
controls sampled within cohort · Nested case-control
random subcohort, multiple outcomes · Case-cohort design
transient trigger, acute event · Case-crossover design
within-person rate comparison · Self-controlled case series (SCCS)
corrects exposure time trends · Case-time-control design
initiators, active comparator · Active-comparator new-user design
standing source population · Disease registry

ADaM

A CDISC standard for analysis-ready clinical trial datasets derived from SDTM.

On the pathway · 01 · Measurement · Data standards and provenance

Which data standard or provenance layer?

the overall idea of standards and provenance · Data standards and provenance
clinical coding terminology for findings · SNOMED
regulatory model for collected trial data · CDISC SDTM
analysis-ready dataset standard · ADaM

ADSL

A subject-level ADaM dataset with one row per trial participant.

On the pathway · 01 · Measurement · Assembling a clinical trial dataset

What are you building or tracing?

the overall idea · Assembling a clinical trial dataset
one row per subject · ADSL
link result back to source · Traceability

Age-standardization

Adjusting rates to a standard population so comparisons are not confounded by differing age structures, done directly or indirectly.

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall idea · Measures of disease frequency
existing cases at a time point · Prevalence
new cases over follow-up · Incidence
new cases as a proportion at risk · Cumulative incidence
new cases per unit follow-up time · Incidence rate
denominator of summed follow-up · Person-time
unadjusted rate in a population · Crude rate
comparing rates across populations · Age-standardization
observed versus expected deaths · Standardized mortality ratio
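
To make the difference between a crude rate and a directly age-standardized rate concrete, here is a small Python sketch. The age bands, event counts, person-years, and standard-population weights are all invented for illustration.

```python
# Hypothetical strata: (age band, events, person-years, standard population size).
strata = [
    ("0-39", 10, 10_000, 50_000),
    ("40-64", 40, 8_000, 30_000),
    ("65+", 90, 3_000, 20_000),
]

# Crude rate: all events over all person-time, ignoring age structure.
total_events = sum(events for _, events, _, _ in strata)
total_person_years = sum(py for _, _, py, _ in strata)
crude_rate = total_events / total_person_years

# Direct standardization: weight each stratum-specific rate by the
# standard population's share of that age band.
standard_total = sum(weight for _, _, _, weight in strata)
standardized_rate = (
    sum((events / py) * weight for _, events, py, weight in strata)
    / standard_total
)

print(round(crude_rate * 1000, 2), round(standardized_rate * 1000, 2))
```

Here the standardized rate exceeds the crude rate because the standard population is older than the study population, so the high-rate oldest band receives more weight.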

AIC

An information criterion trading goodness of fit against the number of parameters to compare non-nested models, where lower is better.

On the pathway · 03 · Estimate · Model fit, comparison, and prediction error

Which fit or error measure do you need?

the overall idea · Model fit, comparison, and prediction error
variance explained, linear model · R-squared
pseudo-fit for logistic model · McFadden’s pseudo-R-squared
compare models, penalize parameters · AIC
compare models, penalize more heavily · BIC
prediction error, average magnitude · Mean absolute error
prediction error, penalize large misses · RMSE
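
The two information criteria can disagree, and a short Python sketch shows how. It uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; the log-likelihoods, parameter counts, and sample size are hypothetical.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


# Two hypothetical models fitted to the same n = 200 observations.
n = 200
small = {"log_lik": -310.0, "k": 3}
large = {"log_lik": -306.5, "k": 6}

aic_small, aic_large = aic(**small), aic(**large)
bic_small, bic_large = bic(n=n, **small), bic(n=n, **large)

print(aic_small, aic_large)  # 626.0 625.0
print(bic_small < bic_large)  # True
```

AIC narrowly prefers the larger model, while BIC, whose per-parameter penalty is ln(200) rather than 2, prefers the smaller one.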

Algorithm validation (PPV and sensitivity tradeoff)

Tightening a rule raises positive predictive value but lowers sensitivity, and vice versa.

On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation


Allocation concealment

A safeguard ensuring the next treatment assignment cannot be foreseen and gamed.

On the pathway · 00 · Framing · Randomization and blinding

What allocation or masking concern?

the overall idea · Randomization and blinding
hide the upcoming assignment · Allocation concealment
balance arms in small chunks · Block randomization
balance within prognostic strata · Stratified randomization
dynamically balance many factors · Minimization
mask treatment after allocation · Blinding

Analysis populations

Who counts in the analysis, contrasting intention-to-treat, per-protocol, and as-treated, itself a choice of estimand.

On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)

Which set of subjects do you analyze?

the overall idea · Analysis populations
as randomized, regardless of adherence · Intention-to-treat
only those who followed protocol · Per-protocol
grouped by treatment actually received · As-treated

ANOVA

Analysis of variance, extending the t-test to compare a continuous outcome across more than two groups.

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of tests · Bivariate tests
mean across two groups · T-test
means across three or more groups · ANOVA
ranks across two groups · Mann-Whitney test
ranks across three or more groups · Kruskal-Wallis test
two categorical variables, large counts · Chi-square test
two categorical variables, small counts · Fisher’s exact test
linear correlation of two continuous · Pearson correlation
monotonic correlation, ranked · Spearman correlation
concordance-based rank correlation · Kendall’s tau
effect size for two means · Cohen’s d
effect size for ANOVA · Eta-squared
effect size for categorical association · Cramer’s V

As-treated

Analyzing patients by the treatment they actually received.

On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)


Assay sensitivity

The assumption that a trial could have detected a real difference had one existed, since a sloppy trial looks non-inferior.

On the pathway · 00 · Framing · Non-inferiority and equivalence

What are you trying to show?

the overall idea · Non-inferiority and equivalence
new is not meaningfully worse · Non-inferiority trial
new is neither worse nor better · Equivalence trial
how much worse is tolerable · Non-inferiority margin
trial can detect a real difference · Assay sensitivity

Assembling a clinical trial dataset

Standardizing trial data from case-report forms through CDISC SDTM into ADaM analysis datasets, governed by traceability.

On the pathway · 01 · Measurement · Assembling a clinical trial dataset


Assembling the analytic cohort

Turning a research database into one analysis-ready table via extract-transform-load, fixing its grain and time structure.

On the pathway · 01 · Measurement · Assembling the analytic cohort

Which cohort-construction step?

the overall idea · Assembling the analytic cohort
pull and reshape raw source data · Extract-transform-load
set the time-zero anchor · Index date
define pre-index covariate history · Lookback window
restrict to treatment initiators · New-user design

ATC and defined daily dose (DDD)

WHO classification grouping drugs by therapeutic class, paired with a standard daily dose unit for comparable utilization.

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall idea · Claims and coding standards
diagnoses · ICD-10-CM diagnosis codes
inpatient procedures · ICD-10-PCS procedure codes
professional services · CPT/HCPCS codes
dispensed drugs · NDC (National Drug Code)
labs and observations · LOINC lab codes
provider · NPI (provider identifier)
cross-database mapping · OMOP standardized vocabularies (OHDSI)
translating codes · Code crosswalks and mappings
drug utilization · ATC and defined daily dose (DDD)

ATE

The average treatment effect, the contrast of potential outcomes over everyone. \[\text{ATE} = E[Y(1) - Y(0)]\] where \(\text{ATE}\) is the average treatment effect over the whole population; \(Y(1)\) is the outcome a unit would have under treatment; \(Y(0)\) is the outcome the same unit would have under no treatment.

On the pathway · 02 · Model · Choosing the estimand

Whose causal effect do you target?

the overall idea · Choosing the estimand
the formal target quantity · Estimand
effect across the whole population · ATE
effect among the treated · ATT
effect among compliers only · LATE
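
The difference between ATE and ATT is easiest to see on a toy table where both potential outcomes are written down. The Python sketch below does that; the four units and their values are invented, and in real data only one of Y(1) and Y(0) is ever observed per unit.

```python
# Hypothetical units: (treated flag, Y(1), Y(0)).
# Both potential outcomes are listed only for illustration.
units = [
    (1, 5.0, 3.0),
    (1, 6.0, 3.0),
    (0, 4.0, 4.0),
    (0, 5.0, 4.0),
]

# ATE: average individual effect over everyone.
effects = [y1 - y0 for _, y1, y0 in units]
ate = sum(effects) / len(effects)

# ATT: the same average, restricted to units that were treated.
treated_effects = [y1 - y0 for treated, y1, y0 in units if treated == 1]
att = sum(treated_effects) / len(treated_effects)

print(ate, att)  # 1.5 2.5
```

The two estimands differ here because the units that received treatment are the ones who benefit most from it.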

ATT

The average treatment effect on the treated, the contrast of potential outcomes among treated units.

On the pathway · 02 · Model · Choosing the estimand


Attributable risk and population attributable fraction (PAF)

Attributable risk is the excess risk among the exposed; PAF is the share of population cases removable by eliminating the exposure.

On the pathway · 02 · Model · Sparse data and resampling

Are cells sparse or analytic standard errors doubtful?

the overall family · Sparse data and resampling
separation or small samples · Firth penalized regression
very sparse, exact inference · Exact logistic regression
no clean closed-form variance · Bootstrap and resampling methods
public-health impact measures · Attributable risk and population attributable fraction (PAF)
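
A short Python sketch of attributable risk and PAF follows. The risks and the exposure prevalence are hypothetical, the function names are illustrative, and the calculation assumes the exposure is causal and unconfounded. The last lines check the result against Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)).

```python
def attributable_risk(risk_exposed, risk_unexposed):
    """Excess risk among the exposed."""
    return risk_exposed - risk_unexposed


def population_attributable_fraction(risk_exposed, risk_unexposed, prevalence):
    """Share of all cases that would vanish if everyone were unexposed."""
    overall_risk = prevalence * risk_exposed + (1 - prevalence) * risk_unexposed
    return (overall_risk - risk_unexposed) / overall_risk


# Hypothetical inputs: 15% risk if exposed, 5% if not, 20% of people exposed.
ar = attributable_risk(0.15, 0.05)
paf = population_attributable_fraction(0.15, 0.05, 0.20)

# Cross-check with Levin's formula using the risk ratio.
rr = 0.15 / 0.05
levin = 0.20 * (rr - 1) / (1 + 0.20 * (rr - 1))

print(round(ar, 3), round(paf, 3), round(levin, 3))  # 0.1 0.286 0.286
```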

Attrition bias

Bias from differential loss to follow-up over time between groups.

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biases · Study biases, by rung
who got sampled or enrolled · Sampling bias
distorted entry into the study · Selection bias
selection among hospitalized patients · Berkson’s bias
conditioning on a common effect · Over-adjustment
employed groups appear healthier · Healthy-worker effect
only survivors are observed · Survivorship bias
differential loss to follow-up · Attrition bias
nonresponders differ systematically · Nonresponse bias
a common cause of exposure and outcome · Confounding
treatment chosen by prognosis · Confounding by indication
unequal outcome ascertainment · Detection bias
inaccurate recall of exposure · Recall bias
interviewer shapes responses · Interviewer bias
earlier detection inflates survival · Lead-time bias
slow cases preferentially detected · Length-time bias
detecting harmless disease · Overdiagnosis
selective reporting of results · Publication bias

AUC

The probability that a randomly chosen case gets a higher predicted risk than a randomly chosen non-case, where 0.5 is chance and 1 is perfect ranking.

On the pathway · 03 · Estimate · Calibration versus discrimination

Which aspect of predictive performance?

the overall idea · Calibration versus discrimination
ranking cases above non-cases · Discrimination
summarizing ranking across thresholds · AUC
predicted risks match observed · Calibration
testing calibration formally · Hosmer-Lemeshow statistic
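
The probabilistic definition of AUC can be computed directly by comparing every case with every non-case. This Python sketch does so on invented predicted risks; counting a tie as half a win is a common convention and is an assumption here.

```python
def auc(case_scores, noncase_scores):
    """Fraction of case/non-case pairs in which the case scores higher."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5  # ties count as half, by convention
    return wins / (len(case_scores) * len(noncase_scores))


# Hypothetical predicted risks.
case_risks = [0.9, 0.7, 0.4]
noncase_risks = [0.6, 0.3, 0.2, 0.4]

print(auc(case_risks, noncase_risks))  # 0.875
```

Of the 12 pairs, the case ranks higher in 10, ties in 1, and loses in 1, giving 10.5 / 12. The pairwise loop is quadratic in sample size, so it is suited to explanation rather than large datasets.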

B

Back-door criterion

The rule that reads the adjustment set straight off a causal diagram.

On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworks

Which causal-diagram concept?

the overall framework · Causal diagrams
the graph notation itself · DAG
common cause of exposure and outcome · Confounder
variable on the causal path · Mediator
common effect, conditioning opens bias · Collider
rule for sufficient adjustment sets · Back-door criterion

Bagging

Training many trees on bootstrap resamples and averaging them to lower variance, the basis of the random forest.

On the pathway · 02 · Model · Learning algorithms and ensembles

Which learner or ensemble fits?

the overall idea · Learning algorithms and ensembles
single interpretable splits · Decision tree (machine learning)
classify by closest neighbors · K-nearest neighbours
maximum-margin separating boundary · Support vector machine
average parallel bootstrapped models · Bagging
many decorrelated bagged trees · Random forest
sequentially correct prior errors · Boosting

Bayes’ theorem

The rule that the posterior is proportional to the likelihood times the prior. \[\text{posterior} \propto \text{likelihood} \times \text{prior}\] where \(\text{posterior}\) is the updated distribution of the parameter after seeing the data; \(\text{likelihood}\) is what the data say about the parameter; \(\text{prior}\) is what you believed about the parameter before seeing the data.

On the pathway · 02 · Model · Bayesian inference

Which Bayesian concept is in play?

the overall framework · Bayesian inference
the updating rule itself · Bayes’ theorem
beliefs after seeing data · Posterior distribution
interval summary of the posterior · Credible interval
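
The updating rule can be carried out numerically on a grid of candidate parameter values. The Python sketch below does this for a response probability; the Beta(2, 2) prior and the data (7 responders out of 10) are hypothetical choices. The grid result is compared with the known conjugate answer for a Beta prior and binomial data, a posterior mean of (a + k) / (a + b + n).

```python
# Candidate values of the response probability, excluding 0 and 1.
grid = [i / 1000 for i in range(1, 1000)]

# Hypothetical prior Beta(a, b) and data: k responders out of n patients.
a, b = 2, 2
k, n = 7, 10

prior = [p ** (a - 1) * (1 - p) ** (b - 1) for p in grid]
likelihood = [p ** k * (1 - p) ** (n - k) for p in grid]

# Bayes' theorem: posterior is proportional to likelihood times prior.
unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

posterior_mean = sum(p * w for p, w in zip(grid, posterior))
conjugate_mean = (a + k) / (a + b + n)

print(round(posterior_mean, 3), round(conjugate_mean, 3))  # 0.643 0.643
```

Normalizing by the sum plays the role of the constant that the proportionality sign hides.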

Bayesian computation

Exploring posteriors with no closed form by simulation, principally Markov chain Monte Carlo.

On the pathway · 02 · Model · Bayesian computation (MCMC)

Which sampling or diagnostic tool?

the overall idea · Bayesian computation
sampling the posterior generally · MCMC
proposal-and-accept sampler · Metropolis-Hastings
sample each parameter conditionally · Gibbs sampling
gradient-guided efficient sampler · Hamiltonian Monte Carlo
check chains have converged · R-hat
check model reproduces the data · Posterior predictive check

Bayesian inference

Treating the parameter as a random quantity with a distribution the data update, yielding a posterior summarized by a credible interval.

On the pathway · 02 · Model · Bayesian inference


Belmont principles

The three principles underpinning research ethics: respect for persons, beneficence, and justice.

On the pathway · § · Conduct it · Research ethics and the IRB

Which ethics concept or body?

the overall idea · Research ethics and the IRB
foundational ethical principles · Belmont principles
genuine uncertainty justifying a trial · Clinical equipoise
participant’s voluntary agreement · Informed consent
body that reviews and approves studies · Institutional review board

Benjamini-Hochberg

A procedure controlling the false-discovery rate among rejected hypotheses.

On the pathway · 02 · Model · Multiplicity control

How do you control multiple testing?

the overall idea · Multiplicity control
bound any false positive · Family-wise error rate
bound false positives among rejections · False-discovery rate
simple conservative FWER divisor · Bonferroni correction
stepwise FWER control · Holm’s procedure
step-up FDR control · Benjamini-Hochberg
test hypotheses in ordered families · Gatekeeping procedure
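
The step-up logic of Benjamini-Hochberg is easier to follow in code than in words. The Python sketch below is a minimal version written for this glossary, not a library routine; the p-values are invented, and a Bonferroni function is included only for contrast.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a reject flag per hypothesis at false-discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up: find the largest rank whose p-value is at most rank * q / m.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Reject when p is at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values from five tests.
p_values = [0.001, 0.039, 0.008, 0.20, 0.035]
bh_flags = benjamini_hochberg(p_values)
bonferroni_flags = bonferroni(p_values)

print(bh_flags)          # [True, True, True, False, True]
print(bonferroni_flags)  # [True, False, True, False, False]
```

The p-value 0.035 misses its own rank-3 threshold of 0.03 but is still rejected, because the rank-4 value 0.039 falls under 0.04 and the procedure rejects everything up to the largest passing rank.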

Berkson’s bias

The spurious association produced by conditioning on hospital admission.

On the pathway · ∗ · Defend it · Study biases, by rung


Bias quantification

Putting a number on how much unmeasured confounding it would take to overturn a result, as one pre-specified sensitivity analysis.

On the pathway · ∗ · Defend it · Bias quantification

How do you quantify unmeasured bias?

the overall idea · Bias quantification
strength needed to explain away · E-value
hidden bias in matched designs · Rosenbaum bounds
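
For a risk ratio, the E-value has a commonly cited closed form, RR + sqrt(RR × (RR - 1)), with protective estimates inverted first. The Python sketch below applies it to a hypothetical observed risk ratio of 2; the function name is illustrative, and other effect measures need a conversion that is not shown here.

```python
import math


def e_value(risk_ratio):
    """E-value for an observed risk ratio (point estimate only)."""
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    if risk_ratio < 1:
        risk_ratio = 1 / risk_ratio  # put protective effects on the same scale
    return risk_ratio + math.sqrt(risk_ratio * (risk_ratio - 1))


print(round(e_value(2.0), 2))  # 3.41
```

Read this as: an unmeasured confounder would need risk-ratio associations of about 3.4 with both exposure and outcome to fully explain away an observed risk ratio of 2.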

Bias-variance and regularization

The tradeoff between a model too simple to capture signal and one flexible enough to chase noise, managed by regularization.

On the pathway · 02 · Model · Bias-variance and regularization

Which concept or penalty?

the overall idea · Bias-variance and regularization
the underlying error tradeoff · Bias-variance tradeoff
fitting noise, poor generalization · Overfitting
penalizing complexity broadly · Regularization
estimating out-of-sample error · Cross-validation
shrink coefficients, keep all · Ridge regression
shrink and select variables · Lasso
blend selection and shrinkage · Elastic net

Bias-variance tradeoff

The balance between a too-simple model that underfits (bias) and a too-flexible model that overfits (variance), with expected prediction error equal to squared bias plus variance plus irreducible noise. \[\text{expected prediction error} = \text{bias}^2 + \text{variance} + \text{irreducible noise}\] where \(\text{expected prediction error}\) is the average error on new data; \(\text{bias}^2\) is the squared error from a model too simple to capture the signal; \(\text{variance}\) is the error from a model flexible enough to chase noise; \(\text{irreducible noise}\) is the variation no model can remove.

On the pathway · 02 · Model · Bias-variance and regularization


BIC

An information criterion like AIC but penalizing each extra parameter more heavily, so it favors smaller models, where lower is better.

On the pathway · 03 · Estimate · Model fit, comparison, and prediction error


Binomial distribution

The distribution of the number of successes in a fixed number of independent trials that share one success probability.

On the pathway · 01 · Measurement · Probability distributions and the CLT

Which distribution or sampling result?

the overall idea · Probability distributions
continuous bell-shaped variable · Normal distribution
fixed-trial success counts · Binomial distribution
counts of rare events · Poisson distribution
overdispersed count data · Negative binomial distribution
why sample means turn normal · Central limit theorem
spread of a sample estimate · Standard error

Bivariate tests

Classical tests of whether two variables are associated, each a special case of a regression model.

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)


Bland-Altman plot

A plot of between-method differences against between-method means that reveals the systematic disagreement two methods can have despite high correlation.

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall idea · Reliability and validity
consistency of measurement · Reliability
measuring the intended construct · Validity
internal consistency of scale items · Cronbach’s alpha
agreement on continuous measures · Intraclass correlation
plot method agreement and bias · Bland-Altman plot
categorical agreement, two raters · Cohen’s kappa
ordered-category agreement, two raters · Weighted kappa
categorical agreement, many raters · Fleiss’ kappa
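
The numbers behind a Bland-Altman plot are the mean difference (bias) and the limits of agreement around it. The Python sketch below computes them for two invented series of paired measurements; the limits use the conventional mean difference plus or minus 1.96 standard deviations, and the plotting step itself is left out.

```python
from statistics import mean, stdev

# Hypothetical paired measurements of the same five patients by two methods.
method_a = [10.0, 12.0, 14.0, 16.0, 18.0]
method_b = [11.0, 13.5, 15.0, 17.5, 19.0]

# Per-patient difference (y-axis) and mean (x-axis) of the two methods.
differences = [a - b for a, b in zip(method_a, method_b)]
pair_means = [(a + b) / 2 for a, b in zip(method_a, method_b)]

bias = mean(differences)     # systematic offset between methods
spread = stdev(differences)  # sample standard deviation of the differences
lower_limit = bias - 1.96 * spread
upper_limit = bias + 1.96 * spread

print(round(bias, 2), round(lower_limit, 2), round(upper_limit, 2))
```

The two series rise together almost perfectly, so their correlation is very high, yet method B reads about 1.2 units higher throughout; the correlation hides that offset and the bias exposes it.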

Blinding

Keeping patients, clinicians, and outcome assessors unaware of the assigned arm to prevent the bias that knowing it introduces.

On the pathway · 00 · Framing · Randomization and blinding


Block randomization

Permuted-block randomization that keeps trial arms close to equal in size as enrollment proceeds.

On the pathway · 00 · Framing · Randomization and blinding

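
A permuted-block schedule can be generated in a few lines. The Python sketch below is illustrative only: the arm labels, block size of four, and fixed seed are assumptions, and a real trial would generate and hold the list in a concealed system.

```python
import random


def permuted_block_schedule(n_blocks, block_size=4, arms=("A", "B"), seed=0):
    """Build an allocation list from shuffled blocks with equal arms."""
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    schedule = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule


schedule = permuted_block_schedule(n_blocks=3)
print(schedule)
```

After every completed block the arms are exactly balanced, so the imbalance at any point in enrollment can never exceed half a block.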

Bonferroni correction

Dividing alpha by the number of tests to control the family-wise error rate.

On the pathway · 02 · Model · Multiplicity control


Boosting

Fitting trees in sequence, each correcting the last’s residuals, to lower bias, as in gradient boosting and XGBoost.

On the pathway · 02 · Model · Learning algorithms and ensembles


Bootstrap and resampling methods

Repeatedly resampling the observed data with replacement and recomputing the estimate to build an empirical sampling distribution, which gives intervals when analytic standard errors are awkward.

On the pathway · 02 · Model · Sparse data and resampling

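
The resample-and-recompute loop can be sketched as a percentile bootstrap for a median, a statistic with no convenient closed-form standard error. In the Python sketch below the sample values, the 2,000 replicates, and the fixed seed are arbitrary choices; the percentile interval is one of several bootstrap intervals and not necessarily the best for small samples.

```python
import random
from statistics import median


def bootstrap_percentile_ci(data, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Approximate percentile interval from resampling with replacement."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(data)
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical sample of ten measurements.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 6.1, 3.9, 4.4]
low, high = bootstrap_percentile_ci(sample, median)
print(median(sample), low, high)
```

Each replicate draws ten values with replacement from the sample, so the interval reflects how much the median moves when the data are perturbed in a way that mimics sampling.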

Budget impact analysis

Projecting the total cost to a specific budget holder of adopting an intervention across the eligible population over a near-term horizon under realistic uptake.

On the pathway · 05 · Decision rule · Budget impact analysis

What affordability question are you in?

estimating the cost of adoption · Budget impact analysis

C

Calibration

Whether a model’s predicted risks match observed event rates, read off a calibration plot or tested with goodness-of-fit.

On the pathway · 03 · Estimate · Calibration versus discrimination


Calibration (modeling)

Tuning an unobservable parameter until a model’s outputs match observed targets, with the resulting uncertainty carried forward.

On the pathway · 05 · Decision rule · Model validation and calibration

What situation?

the overall idea · Model validation and calibration
tuning model outputs to reality · Calibration (modeling)
confirming the model runs correctly · Verification

Calibration versus discrimination

Discrimination asks whether a model ranks higher-risk patients above lower-risk ones, while calibration asks whether predicted risks match observed rates.

On the pathway · 03 · Estimate · Calibration versus discrimination


Case-cohort design

Samples a random subcohort plus all cases, letting one comparison group serve several outcomes from the same source population.

On the pathway · 00 · Framing · Observational study designs


Case-control study

Starts from outcome status, comparing prior exposure in cases versus controls; efficient for rare outcomes and long latencies.

On the pathway · 00 · Framing · Observational study designs


Case-crossover design

Self-controlled design comparing a person’s exposure shortly before an event to their own earlier reference periods, suited to transient triggers.

On the pathway · 00 · Framing · Observational study designs


Case-time-control design

Adds a control group to the case-crossover design to adjust for exposure trends over calendar time.

On the pathway · 00 · Framing · Observational study designs


Causal designs without randomization

A set of designs, each neutralizing a specific dominant threat to causal inference, matched to the threat endangering the question.

On the pathway · 02 · Model · Causal designs without randomization

Which quasi-experimental design fits?

the overall idea · Causal designs without randomization
before-after across exposed and control · Difference-in-differences
a haphazard nudge to exposure · Instrumental variables
assignment by a cutoff threshold · Regression discontinuity
weighted donors build a counterfactual · Synthetic control

Causal diagrams

A directed acyclic graph of assumed causal effects that sorts each covariate into a confounder, mediator, or collider.

On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworks


Causal estimators

Methods that turn a fixed design and adjustment set into a number, including propensity scores and g-methods. in the pathway →

On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)

How do you estimate the causal effect?

the overall ideaCausal estimators
model treatment assignment probabilityPropensity score
reweight by inverse treatment probabilityIPTW
model and average outcomesG-formula
combine outcome and treatment modelsDoubly-robust estimators
targeted machine-learning estimationTMLE

Cause-specific hazard

Instantaneous event rate among patients still at risk, used to study etiology and biological mechanism. in the pathway →

On the pathway · 03 · Estimate · Competing risks and parametric survival

Does a competing event block the outcome?

modeling time to eventCompeting risks and survival models
a competing event existsCompeting risks
you want etiologyCause-specific hazard
you want absolute riskCumulative incidence function (CIF)
modeling absolute risk directlyFine-Gray subdistribution hazard
proportional hazards failsAccelerated failure time (AFT) models
some patients are curedCure models

CDISC SDTM

A CDISC standard for structuring clinical trial tabulation data. in the pathway →

On the pathway · 01 · Measurement · Data standards and provenance

Which data standard or provenance layer?

the overall idea of standards and provenanceData standards and provenance
clinical coding terminology for findingsSNOMED
regulatory model for collected trial dataCDISC SDTM
analysis-ready dataset standardADaM

Central limit theorem

The result that the mean of a large enough sample is approximately normal whatever the underlying shape, provided the variance is finite, enabling z- and t-based inference. in the pathway →
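
As a rough illustration of this result, the sketch below repeatedly samples from a strongly skewed distribution and checks that the sample means cluster around the true mean with spread close to sigma divided by the square root of n. The exponential distribution, the sample size of 50, the 2,000 replicates, and the seed are all assumptions chosen for the demonstration, not values from the pathway.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

SAMPLE_SIZE = 50   # observations per sample (assumed)
N_SAMPLES = 2000   # number of repeated samples (assumed)

# Exponential(1) is strongly right-skewed, with mean 1 and SD 1.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = 1.0 / SAMPLE_SIZE ** 0.5  # sigma / sqrt(n), the standard error

print(f"mean of sample means: {mean_of_means:.3f} (theory: 1.000)")
print(f"SD of sample means:   {sd_of_means:.3f} (theory: {expected_se:.3f})")
```

A histogram of `sample_means` would look close to a bell curve even though the individual draws do not.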

On the pathway · 01 · Measurement · Probability distributions and the CLT

Which distribution or sampling result?

the overall ideaProbability distributions
continuous bell-shaped variableNormal distribution
fixed-trial success countsBinomial distribution
counts of rare eventsPoisson distribution
overdispersed count dataNegative binomial distribution
why sample means turn normalCentral limit theorem
spread of a sample estimateStandard error

Certainty of evidence (GRADE)

Rating how much confidence a body of evidence warrants, separately from effect size, downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias. in the pathway →

On the pathway · 04 · Synthesis · Certainty of evidence (GRADE)

Which certainty-of-evidence concept is in play?

rating confidence in estimatesCertainty of evidence (GRADE)
the named frameworkGRADE

Characterizing the distribution

Examining what you measured, its shape, spread, and relationships, before assuming a model or summary is honest. in the pathway →

On the pathway · 01 · Measurement · Characterizing the distribution

What shape feature?

the overall ideaCharacterizing the distribution
asymmetry of the distributionSkewness
heaviness of the tailsKurtosis
smoothing a nonlinear trendLOESS smoother

Charlson comorbidity index (CCI)

A weighted count of selected serious conditions, originally calibrated to predict one-year mortality, used as a single comorbidity summary. in the pathway →

On the pathway · 01 · Measurement · Comorbidity and frailty adjustment

How do I adjust for how sick patients already were?

the overall ideaComorbidity and frailty adjustment
you need a mortality-weighted scoreCharlson comorbidity index (CCI)
you want broad comorbidity coverageElixhauser comorbidity measures
patients are older or frailClaims-based frailty index

Checking model assumptions

The diagnostics for the checkable statistical assumptions of a regression, distinct from a causal identifying assumption. in the pathway →

On the pathway · 02 · Model · Checking model assumptions

Which model assumption to check?

the overall ideaChecking model assumptions
non-constant residual varianceHeteroscedasticity
predictors too collinearVariance inflation factor
single points driving the fitCook’s distance
hazard ratio constant over timeProportional hazards
test that proportionality formallySchoenfeld residuals
fix variance without refittingRobust standard errors

CHEERS

The reporting checklist for economic evaluations, the economic-evaluation member of the reporting-standards family. in the pathway →

On the pathway · 05 · Decision rule · Perspective and the reference case

Whose costs and benefits count?

the overall ideaPerspective and the reference case
standardized analysis conventionsReference case
reporting checklist for economicsCHEERS
count all costs to societySocietal perspective
value of foregone alternativesOpportunity cost

Chi-square test

A test of independence between two categorical variables, with Fisher’s exact test used when cell counts are small. in the pathway →
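
A minimal sketch of the statistic behind this test, computed by hand for a 2x2 table: each cell's expected count under independence is row total times column total divided by the grand total. The counts are made up for illustration, and the check for expected counts below 5 is a common rule of thumb for switching to Fisher's exact test rather than a rule stated in the pathway.

```python
# Hypothetical 2x2 table: rows = exposure groups, columns = outcome yes/no.
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
min_expected = float("inf")
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the two variables were independent.
        expected = row_totals[i] * col_totals[j] / n
        min_expected = min(min_expected, expected)
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi_square:.2f} on {df} df")

# Rule-of-thumb guard (an assumption here): small expected counts
# suggest using Fisher's exact test instead.
if min_expected < 5:
    print("Small expected counts: consider Fisher's exact test.")
```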

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V

Choosing a prior

Selecting the distribution encoding belief before the data, the most attacked part of a Bayesian analysis. in the pathway →

On the pathway · 02 · Model · Choosing a prior

What kind of prior do you need?

the overall choiceChoosing a prior
prior matched to the likelihoodConjugate prior
strong external informationInformative prior
light regularizing informationWeakly-informative prior

Choosing the estimand

Naming the exact quantity to be estimated, which effect and in whom, before choosing the method. in the pathway →

On the pathway · 02 · Model · Choosing the estimand

Whose causal effect do you target?

the overall ideaChoosing the estimand
the formal target quantityEstimand
effect across the whole populationATE
effect among the treatedATT
effect among compliers onlyLATE

Claims and coding standards

The coded vocabularies behind each claim field, where analysis depends on knowing what each captures and how they map to one another. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

Claims data

Billing-driven encounter and prescription data covering a payer’s population broadly, where a code is a bill not a diagnosis and clinical detail is thin. in the pathway →

On the pathway · 01 · Measurement · Data sources and their tradeoffs

Which data source or pitfall?

the overall ideaData sources and their tradeoffs
clinical detail from care recordsElectronic health record data
billing records across encountersClaims data
enrolled cohorts for a conditionRegistries
sampled population questionnairesSurvey data
group-level inference pitfallEcological fallacy

Claims-based frailty index

A frailty proxy built from diagnosis and service codes, approximating functional decline when direct frailty assessment is unavailable in data. in the pathway →

On the pathway · 01 · Measurement · Comorbidity and frailty adjustment

How do I adjust for how sick patients already were?

the overall ideaComorbidity and frailty adjustment
you need a mortality-weighted scoreCharlson comorbidity index (CCI)
you want broad comorbidity coverageElixhauser comorbidity measures
patients are older or frailClaims-based frailty index

Claims/EHR phenotype algorithm

A rule mapping recorded codes and encounters to a presumed clinical event or condition. in the pathway →

On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation

Is your outcome a validated algorithm or an unchecked code rule?

the overall ideaOutcome phenotyping and validation
the rule itselfClaims/EHR phenotype algorithm
common coding rule1-inpatient / 2-outpatient rule
tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
reference standardEndpoint adjudication and chart review
bundled outcomesComposite endpoint construction

Classification performance metrics

Measures read off the confusion matrix of predicted versus actual, including precision, recall, and F1. in the pathway →

On the pathway · 02 · Model · Classification performance metrics

Which classification metric?

the overall ideaClassification performance metrics
share of predicted positives correctPrecision
share of true positives caughtRecall
balance precision and recallF1 score
tradeoff across all thresholdsPrecision-recall curve

Clinical equipoise

Genuine uncertainty in the expert community about which trial arm is better, the ethical license to randomize patients. in the pathway →

On the pathway · § · Conduct it · Research ethics and the IRB

Which ethics concept or body?

the overall ideaResearch ethics and the IRB
foundational ethical principlesBelmont principles
genuine uncertainty justifying a trialClinical equipoise
participant’s voluntary agreementInformed consent
body that reviews and approves studiesInstitutional review board

Clone-censor-weight

A per-protocol target-trial method that clones patients into each strategy, censors deviators, and reweights to avoid immortal time bias. in the pathway →

On the pathway · 02 · Model · Real-world causal-inference extensions

How do causal methods scale to claims and time?

the overall ideaReal-world causal-inference extensions
you have many candidate covariatesHigh-dimensional propensity score (hdPS)
outcome modeling suits the problemDisease risk score
treatment strategy unfolds over timeClone-censor-weight
censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
you need a simple guardLandmark analysis

Cluster sampling

Drawing whole groups such as schools or blocks to cut field cost when no list of individuals exists. in the pathway →

On the pathway · 01 · Measurement · Survey sampling design

How do you draw the sample?

the overall ideaSurvey sampling design
every unit known nonzero chanceProbability sample
equal-chance draw from frameSimple random sampling
sample within population strataStratified sampling
sample whole groups togetherCluster sampling
sample in successive nested stagesMultistage sampling
variance inflation from clusteringDesign effect

Clustering

Grouping similar observations, used for phenotyping disease subtypes from a panel of measurements. in the pathway →

On the pathway · 02 · Model · Unsupervised learning

What unlabeled-data structure are you finding?

the overall familyUnsupervised learning
grouping similar observationsClustering
nested grouping by linkageHierarchical clustering
partitioning into k groupsK-means
reducing the number of featuresDimensionality reduction
orthogonal variance componentsPrincipal component analysis

Cochran’s Q

A statistical test for heterogeneity across studies in a meta-analysis. in the pathway →
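
As a sketch of how the statistic is usually computed, Q is the inverse-variance-weighted sum of squared deviations of each study's effect from the pooled fixed-effect estimate, compared against a chi-square distribution on k − 1 degrees of freedom. The three study effects and variances below are invented for illustration; the I-squared line uses the standard formula (Q − df) / Q, floored at zero.

```python
# Hypothetical study-level effects (e.g. log odds ratios) and their variances.
effects = [0.10, 0.30, 0.50]
variances = [0.01, 0.01, 0.01]

weights = [1.0 / v for v in variances]  # inverse-variance weights
pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations from the pooled estimate.
q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
df = len(effects) - 1

# I-squared: share of total variation attributed to heterogeneity.
i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0

print(f"pooled = {pooled:.2f}, Q = {q:.2f} on {df} df, I^2 = {i_squared:.0%}")
```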

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test

Code crosswalks and mappings

Lookup tables translating one vocabulary into another, each translation lossy in known ways. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

Cohen’s d

The effect-size measure accompanying a t-test: the difference between two group means expressed in pooled standard deviation units. in the pathway →
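
For reference, the usual pooled-standard-deviation form of the measure can be written as follows; the symbols are introduced here for illustration and are not notation from the pathway. \[d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\] where \(\bar{x}_1\) and \(\bar{x}_2\) are the two group means; \(s_1^2\) and \(s_2^2\) are the group variances; \(n_1\) and \(n_2\) are the group sizes; \(s_p\) is the pooled standard deviation.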

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V

Cohen’s kappa

A measure of two raters’ categorical agreement corrected for what chance alone would produce. in the pathway → \[\kappa = \frac{p_o - p_e}{1 - p_e}\] where \(\kappa\) is Cohen’s kappa, the chance-corrected agreement; \(p_o\) is the observed agreement between the two raters; \(p_e\) is the agreement expected if the raters labelled independently.
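
The formula above can be sketched in a few lines. The rating pairs below are invented for illustration: \(p_o\) is the share of items on which the raters agree, and \(p_e\) is built from each rater's marginal label frequencies.

```python
from collections import Counter

# Hypothetical paired ratings (rater A, rater B) for 100 items.
pairs = (
    [("yes", "yes")] * 40
    + [("no", "no")] * 30
    + [("yes", "no")] * 20
    + [("no", "yes")] * 10
)
n = len(pairs)

# Observed agreement: share of items with identical labels.
p_o = sum(a == b for a, b in pairs) / n

# Expected agreement if the raters labelled independently:
# product of marginal proportions, summed over categories.
count_a = Counter(a for a, _ in pairs)
count_b = Counter(b for _, b in pairs)
categories = count_a.keys() | count_b.keys()
p_e = sum(count_a[c] * count_b[c] for c in categories) / n ** 2

kappa = (p_o - p_e) / (1 - p_e)
print(f"p_o = {p_o:.2f}, p_e = {p_e:.2f}, kappa = {kappa:.2f}")
```

Here the raters agree on 70% of items, but half of that agreement would be expected by chance alone, so kappa is well below the raw agreement.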

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa

Cohort study

Follows defined people forward from exposure to outcome; prospective when assembled before outcomes occur, retrospective when reconstructed from existing records. in the pathway →

On the pathway · 00 · Framing · Observational study designs

Which observational design fits the question and dominant bias?

the overall familyObservational study designs
exposure known, follow forwardCohort study
snapshot at one timeCross-sectional study
rare outcome, look backCase-control study
controls sampled within cohortNested case-control
random subcohort, multiple outcomesCase-cohort design
transient trigger, acute eventCase-crossover design
within-person rate comparisonSelf-controlled case series (SCCS)
corrects exposure time trendsCase-time-control design
initiators, active comparatorActive-comparator new-user design
standing source populationDisease registry

Collider

A common effect of two variables, where adjusting actively opens bias rather than removing it. in the pathway →

On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworks

Which causal-diagram concept?

the overall frameworkCausal diagrams
the graph notation itselfDAG
common cause of exposure and outcomeConfounder
variable on the causal pathMediator
common effect, conditioning opens biasCollider
rule for sufficient adjustment setsBack-door criterion

Comorbidity and frailty adjustment

Summarizing a patient’s baseline illness burden from claims into a validated score used to adjust for confounding by underlying health. in the pathway →

On the pathway · 01 · Measurement · Comorbidity and frailty adjustment

How do I adjust for how sick patients already were?

the overall ideaComorbidity and frailty adjustment
you need a mortality-weighted scoreCharlson comorbidity index (CCI)
you want broad comorbidity coverageElixhauser comorbidity measures
patients are older or frailClaims-based frailty index

Competing risks

A setting where one event, such as death, prevents the event of interest from ever occurring. in the pathway →

On the pathway · 03 · Estimate · Competing risks and parametric survival

Does a competing event block the outcome?

modeling time to eventCompeting risks and survival models
a competing event existsCompeting risks
you want etiologyCause-specific hazard
you want absolute riskCumulative incidence function (CIF)
modeling absolute risk directlyFine-Gray subdistribution hazard
proportional hazards failsAccelerated failure time (AFT) models
some patients are curedCure models

Competing risks and survival models

Methods for time-to-event data where competing events block the outcome or where parametric forms replace the proportional hazards assumption. in the pathway →

On the pathway · 03 · Estimate · Competing risks and parametric survival

Does a competing event block the outcome?

modeling time to eventCompeting risks and survival models
a competing event existsCompeting risks
you want etiologyCause-specific hazard
you want absolute riskCumulative incidence function (CIF)
modeling absolute risk directlyFine-Gray subdistribution hazard
proportional hazards failsAccelerated failure time (AFT) models
some patients are curedCure models

Complex-sample design and survey weighting

Design-aware analysis using survey weights, strata, and primary sampling units so an oversampled, clustered sample speaks for its population. in the pathway →

On the pathway · 01 · Measurement · Complex-sample design and survey weighting

Which weighting or design adjustment?

the overall ideaComplex-sample design and survey weighting
scale respondents to the populationSurvey weight
precision lost to the designEffective sample size

Composite endpoint construction

Combining several outcome phenotypes into one variable, where the weakest component dominates overall measurement error. in the pathway →

On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation

Is your outcome a validated algorithm or an unchecked code rule?

the overall ideaOutcome phenotyping and validation
the rule itselfClaims/EHR phenotype algorithm
common coding rule1-inpatient / 2-outpatient rule
tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
reference standardEndpoint adjudication and chart review
bundled outcomesComposite endpoint construction

Composite strategy

An intercurrent-event strategy that folds the event into the endpoint. in the pathway →

On the pathway · 02 · Model · Trial estimands and intercurrent events

How do you handle intercurrent events?

the overall ideaTrial estimands and intercurrent events
events disrupting outcome interpretationIntercurrent events
ignore them, use assigned treatmentTreatment-policy strategy
imagine they did not occurHypothetical strategy
fold event into the outcomeComposite strategy
restrict to a defined subpopulationPrincipal-stratum strategy

Conditional independence

The unverifiable assumption underlying propensity-score methods: given the measured covariates, treatment assignment is independent of the potential outcomes. in the pathway →

On the pathway · 02 · Model · Identifying assumptions

Which identifying assumption do you need?

the overall ideaIdentifying assumptions
treatment independent of confoundersConditional independence
instrument affects outcome only via exposureExclusion restriction
groups would have tracked togetherParallel trends

Conducting a systematic review

A protocol-driven, pre-registered search with reproducible strings, dual independent screening, structured extraction, and a PRISMA flow diagram accounting for every record. in the pathway →

On the pathway · 04 · Synthesis · Conducting a systematic review

What situation?

the overall ideaConducting a systematic review
registering the review protocolPROSPERO

Confidence interval

The range of values compatible with the data around a point estimate, frequently misread as a direct probability statement about the true value. in the pathway →

On the pathway · 03 · Estimate · Uncertainty and inference

How to express estimate uncertainty?

the overall ideaUncertainty and inference
a plausible range for the estimateConfidence interval

Confounder

A common cause of exposure and outcome, which you adjust for. in the pathway →

On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworks

Which causal-diagram concept?

the overall frameworkCausal diagrams
the graph notation itselfDAG
common cause of exposure and outcomeConfounder
variable on the causal pathMediator
common effect, conditioning opens biasCollider
rule for sufficient adjustment setsBack-door criterion

Confounding

Distortion of the estimate by a common cause of exposure and outcome, with confounding by indication the clinical archetype. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Confounding by indication

The clinical archetype of confounding, also called channeling, where the reason for treatment also predicts the outcome. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Conjugate prior

A prior chosen so the posterior shares its form and the update is closed-form, such as a beta prior with a binomial likelihood. in the pathway →
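
The beta-binomial case makes the closed-form update concrete: a Beta(a, b) prior combined with s successes and f failures gives a Beta(a + s, b + f) posterior. The prior parameters and trial counts below are assumptions for illustration only.

```python
# Hypothetical prior belief about a response rate: Beta(2, 2).
prior_a, prior_b = 2.0, 2.0

# Hypothetical data: 7 responders out of 10 patients.
successes, failures = 7, 3

# Conjugate update: add successes to a and failures to b.
post_a = prior_a + successes
post_b = prior_b + failures

prior_mean = prior_a / (prior_a + prior_b)
posterior_mean = post_a / (post_a + post_b)

print(f"prior:     Beta({prior_a:g}, {prior_b:g}), mean {prior_mean:.3f}")
print(f"posterior: Beta({post_a:g}, {post_b:g}), mean {posterior_mean:.3f}")
```

The posterior mean sits between the prior mean (0.5) and the observed proportion (0.7), pulled toward the data as the sample grows.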

On the pathway · 02 · Model · Choosing a prior

What kind of prior do you need?

the overall choiceChoosing a prior
prior matched to the likelihoodConjugate prior
strong external informationInformative prior
light regularizing informationWeakly-informative prior

Consensus methods (Delphi, nominal group)

Formal methods for a panel to converge on a recommendation when evidence underdetermines it, including the Delphi method, nominal group technique, and RAND/UCLA method. in the pathway →

On the pathway · 06 · Recommendation · Consensus methods (Delphi, nominal group)

How do experts reach consensus?

the overall ideaConsensus methods (Delphi, nominal group)
anonymous iterative roundsDelphi method
structured in-person rankingNominal group technique

Consistency

The identifiability condition that the treatment is a well-defined intervention so a potential outcome means something specific. in the pathway →

On the pathway · 02 · Model · Potential outcomes and identifiability

Which identifiability condition is at stake?

the overall frameworkPotential outcomes and identifiability
the counterfactual setupPotential-outcomes framework
only one outcome is observedFundamental problem of causal inference
treated and untreated comparableExchangeability
every covariate stratum has bothPositivity
observed equals counterfactual under treatmentConsistency
no interference, single versionSUTVA

CONSORT

The reporting checklist for randomized trials. in the pathway →

On the pathway · 06 · Recommendation · Reporting standards

Which study type are you reporting?

the overall ideaReporting standards
randomized controlled trialCONSORT
observational studySTROBE
systematic reviewPRISMA
prediction model studyTRIPOD

Continual reassessment method

A model-based phase I design estimating the maximum tolerated dose more efficiently with fewer patients overdosed. in the pathway →

On the pathway · 00 · Framing · Dose-finding and early-phase designs

Which early-phase design question are you facing?

the overall design familyDose-finding and early-phase designs
dose escalation by fixed cohort rule3+3 design
model-based dose escalationContinual reassessment method
the highest acceptably safe doseMaximum tolerated dose
phase II screening for efficacySimon’s two-stage design

Continuous enrollment and observable time

Requiring uninterrupted coverage so that a patient’s care is captured, letting absence of a code mean absence of care. in the pathway →

On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkage

Can this data actually answer my question?

the overall ideaData feasibility, enrollment, and linkage
you need observable follow-upContinuous enrollment and observable time
you must size the populationDatabase feasibility and the attrition funnel
you join multiple datasetsPrivacy-preserving record linkage (tokenization)

Contrast

A weighted sum of coefficients estimating a quantity such as a subgroup effect when the model carries an interaction. in the pathway →
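
As an illustrative case (the model and symbols are assumptions, not taken from the pathway), suppose the outcome model is \(\beta_0 + \beta_T T + \beta_S S + \beta_{TS} T S\) with treatment \(T\) and subgroup indicator \(S\). The treatment effect in the subgroup \(S = 1\) is then the contrast \[\hat{\theta} = c^\top \hat{\beta} = \hat{\beta}_T + \hat{\beta}_{TS}, \qquad \widehat{\mathrm{Var}}(\hat{\theta}) = c^\top \hat{\Sigma}\, c = \widehat{\mathrm{Var}}(\hat{\beta}_T) + \widehat{\mathrm{Var}}(\hat{\beta}_{TS}) + 2\,\widehat{\mathrm{Cov}}(\hat{\beta}_T, \hat{\beta}_{TS})\] where \(c = (0, 1, 0, 1)^\top\) is the vector of contrast weights; \(\hat{\beta}\) is the vector of estimated coefficients; \(\hat{\Sigma}\) is their estimated covariance matrix. The covariance term is why the subgroup effect's standard error cannot be read off the two coefficients' standard errors alone.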

On the pathway · 02 · Model · Linear combinations and contrasts

Which comparison of model terms?

the overall ideaLinear combinations and contrasts
a specific weighted group comparisonContrast

Cook’s distance

A diagnostic for influential points in a regression. in the pathway →

On the pathway · 02 · Model · Checking model assumptions

Which model assumption to check?

the overall ideaChecking model assumptions
non-constant residual varianceHeteroscedasticity
predictors too collinearVariance inflation factor
single points driving the fitCook’s distance
hazard ratio constant over timeProportional hazards
test that proportionality formallySchoenfeld residuals
fix variance without refittingRobust standard errors

Cost-benefit analysis

Economic evaluation that monetizes the health benefit so it can be compared directly with cost. in the pathway →

On the pathway · 05 · Decision rule · Cost-effectiveness and the ICER

Which economic-evaluation framing fits?

the overall ideaCost-effectiveness and the ICER
extra cost per extra effectICER
value costs and benefits in moneyCost-benefit analysis
effects identical, compare costs onlyCost-minimization analysis
effects in quality-adjusted life yearsCost-utility analysis
value at a willingness thresholdNet monetary benefit
maximum payable per unit benefitWillingness-to-pay threshold

Cost-effectiveness acceptability curve

A curve reading off the probability that each option is the best buy at each willingness-to-pay threshold. in the pathway →
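
For two options, one point on the curve can be sketched as the share of probabilistic-analysis draws in which the new option has positive incremental net monetary benefit at a given threshold. The four draws and the thresholds below are invented for illustration; a real analysis would use thousands of draws.

```python
# Hypothetical PSA draws: (incremental cost, incremental QALYs), new vs old.
draws = [
    (1200.0, 0.10),
    (2000.0, 0.05),
    (500.0, -0.02),
    (1500.0, 0.20),
]

def prob_cost_effective(threshold: float) -> float:
    """Share of draws where threshold * delta_effect - delta_cost > 0."""
    favourable = sum(
        1 for d_cost, d_effect in draws
        if threshold * d_effect - d_cost > 0
    )
    return favourable / len(draws)

# Evaluate the curve at a few willingness-to-pay thresholds (assumed values).
thresholds = [10_000.0, 50_000.0]
curve = {t: prob_cost_effective(t) for t in thresholds}
for t, p in curve.items():
    print(f"threshold {t:>8,.0f}: P(cost-effective) = {p:.2f}")
```

Plotting `curve` over a fine grid of thresholds gives the acceptability curve for the new option.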

On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)

How are you handling cost-effectiveness uncertainty?

the overall ideaUncertainty in cost-effectiveness (PSA)
propagating parameter uncertaintyProbabilistic sensitivity analysis
plotting cost and effect differencesCost-effectiveness plane
probability of being cost-effectiveCost-effectiveness acceptability curve

Cost-effectiveness alongside a trial

Estimating cost-effectiveness directly from a trial’s patient-level cost and outcome data, often via net-benefit regression, with high internal validity but a short horizon. in the pathway →

On the pathway · 05 · Decision rule · Cost-effectiveness alongside a trial

What situation?

the overall ideaCost-effectiveness alongside a trial
regressing net benefit on covariatesNet-benefit regression

Cost-effectiveness and the ICER

Economic evaluation putting cost and benefit on the same page, with the incremental cost-effectiveness ratio judged against a willingness-to-pay threshold. in the pathway →
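
A minimal worked example of the ratio and its comparison with a threshold, using invented costs, QALYs, and threshold. Net monetary benefit (threshold times incremental effect, minus incremental cost) is included because it gives the same verdict without dividing by a possibly tiny effect difference.

```python
# Hypothetical per-patient mean costs and QALYs for two options.
cost_new, cost_old = 12_000.0, 8_000.0
qaly_new, qaly_old = 6.5, 6.0
threshold = 20_000.0  # assumed willingness to pay per QALY

delta_cost = cost_new - cost_old
delta_effect = qaly_new - qaly_old

# The ratio is only interpretable this way when the new option
# is both more costly and more effective.
icer = delta_cost / delta_effect

# Net monetary benefit: positive means cost-effective at this threshold.
nmb = threshold * delta_effect - delta_cost

print(f"ICER = {icer:,.0f} per QALY gained")
print(f"NMB at threshold {threshold:,.0f} = {nmb:,.0f}")
print("cost-effective" if icer <= threshold else "not cost-effective")
```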

On the pathway · 05 · Decision rule · Cost-effectiveness and the ICER

Which economic-evaluation framing fits?

the overall ideaCost-effectiveness and the ICER
extra cost per extra effectICER
value costs and benefits in moneyCost-benefit analysis
effects identical, compare costs onlyCost-minimization analysis
effects in quality-adjusted life yearsCost-utility analysis
value at a willingness thresholdNet monetary benefit
maximum payable per unit benefitWillingness-to-pay threshold

Cost-effectiveness plane

The plane on which a probabilistic analysis plots its cloud of incremental cost-and-effect pairs. in the pathway →

On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)

How are you handling cost-effectiveness uncertainty?

the overall ideaUncertainty in cost-effectiveness (PSA)
propagating parameter uncertaintyProbabilistic sensitivity analysis
plotting cost and effect differencesCost-effectiveness plane
probability of being cost-effectiveCost-effectiveness acceptability curve

Cost-minimization analysis

Economic evaluation that compares only costs, applicable only when the outcomes of the options are genuinely equal. in the pathway →

On the pathway · 05 · Decision rule · Cost-effectiveness and the ICER

Which economic-evaluation framing fits?

the overall ideaCost-effectiveness and the ICER
extra cost per extra effectICER
value costs and benefits in moneyCost-benefit analysis
effects identical, compare costs onlyCost-minimization analysis
effects in quality-adjusted life yearsCost-utility analysis
value at a willingness thresholdNet monetary benefit
maximum payable per unit benefitWillingness-to-pay threshold

Cost-utility analysis

Economic evaluation measuring benefit in quality-adjusted life years so different conditions become comparable. in the pathway →

On the pathway · 05 · Decision rule · Cost-effectiveness and the ICER

Which economic-evaluation framing fits?

the overall ideaCost-effectiveness and the ICER
extra cost per extra effectICER
value costs and benefits in moneyCost-benefit analysis
effects identical, compare costs onlyCost-minimization analysis
effects in quality-adjusted life yearsCost-utility analysis
value at a willingness thresholdNet monetary benefit
maximum payable per unit benefitWillingness-to-pay threshold

Costing methods

How the cost in cost-effectiveness is estimated, from micro-costing each resource to gross costing a whole episode, sorted into direct medical, direct non-medical, and indirect costs. in the pathway →

On the pathway · 05 · Decision rule · Costing methods

How to value the resources used?

the overall ideaCosting methods
aggregate top-down unit costsGross costing
itemized bottom-up resource countsMicro-costing
value lost productivity over a lifetimeHuman-capital approach
value productivity loss until replacedFriction-cost approach

CPT/HCPCS codes

Codes for professional services, procedures, and supplies in outpatient and physician billing. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

Cramer’s V

An effect-size measure for a chi-square table. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V
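
As a rough illustration of the entry above, Cramer's V can be computed from the chi-square statistic of a contingency table as the square root of chi-square divided by n times the smaller of (rows − 1) and (columns − 1). The function name and the example counts below are hypothetical, and the sketch assumes no continuity correction and no empty rows or columns.

```python
import math

def cramers_v(table):
    """Cramer's V for an r x c table of counts, given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))

# Hypothetical 2 x 2 table: exposure (rows) by outcome (columns).
example = [[30, 10], [10, 30]]
v = cramers_v(example)
```

For this made-up table every expected count is 20, the chi-square statistic is 20, and V works out to 0.5.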

Credible interval

A range the parameter lies in with stated probability, a direct probability statement the frequentist interval cannot make. in the pathway →

On the pathway · 02 · Model · Bayesian inference

Which Bayesian concept is in play?

the overall frameworkBayesian inference
the updating rule itselfBayes’ theorem
beliefs after seeing dataPosterior distribution
interval summary of the posteriorCredible interval

Cronbach’s alpha

A gauge of the internal consistency of a multi-item scale. in the pathway →

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa
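
A minimal sketch of how Cronbach's alpha is usually computed, assuming complete data and the standard formula: k/(k − 1) times one minus the ratio of summed item variances to the variance of the total score. The function name and the scores are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per scale item, aligned by respondent."""
    k = len(items)
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical three-item scale answered by four respondents.
scores = [
    [2, 4, 3, 5],
    [3, 4, 3, 5],
    [1, 5, 2, 4],
]
alpha = cronbach_alpha(scores)
```

Items that move together push the total-score variance well above the sum of the item variances, which is what drives alpha toward 1.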

Cross-sectional study

Measures exposure and outcome at a single point in time, giving prevalence cheaply but rarely establishing temporal order. in the pathway →

On the pathway · 00 · Framing · Observational study designs

Which observational design fits the question and dominant bias?

the overall familyObservational study designs
exposure known, follow forwardCohort study
snapshot at one timeCross-sectional study
rare outcome, look backCase-control study
controls sampled within cohortNested case-control
random subcohort, multiple outcomesCase-cohort design
transient trigger, acute eventCase-crossover design
within-person rate comparisonSelf-controlled case series (SCCS)
corrects exposure time trendsCase-time-control design
initiators, active comparatorActive-comparator new-user design
standing source populationDisease registry

Cross-validation

Estimating out-of-sample error on held-out folds to choose the right model flexibility. in the pathway →

On the pathway · 02 · Model · Bias-variance and regularization

Which concept or penalty?

the overall ideaBias-variance and regularization
the underlying error tradeoffBias-variance tradeoff
fitting noise, poor generalizationOverfitting
penalizing complexity broadlyRegularization
estimating out-of-sample errorCross-validation
shrink coefficients, keep allRidge regression
shrink and select variablesLasso
blend selection and shrinkageElastic net
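
The idea of held-out folds can be sketched in a few lines. This is an illustrative toy, not a recommended pipeline: the folds are interleaved rather than shuffled, the "model" simply predicts the training mean, and all names are hypothetical.

```python
from statistics import mean

def k_fold_indices(n, k):
    """Yield (train, test) index lists for k interleaved folds over n rows."""
    folds = [list(range(start, n, k)) for start in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def cv_mse_mean_model(y, k):
    """Cross-validated squared error of a model that predicts the training mean."""
    errors = []
    for train, test in k_fold_indices(len(y), k):
        prediction = mean(y[j] for j in train)
        # Score only on rows the model never saw.
        errors.extend((y[j] - prediction) ** 2 for j in test)
    return mean(errors)

y = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0]
cv_error = cv_mse_mean_model(y, k=3)
```

Repeating the same loop for models of differing flexibility and picking the one with the lowest held-out error is the model-selection use the entry describes.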

Crude rate

The unadjusted whole-population frequency, which confounds comparisons across populations with different age structures. in the pathway →

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio

Cumulative incidence

The risk of disease: new cases over a fixed period divided by the population at risk. in the pathway →

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio
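
To make the contrast between a risk and a rate concrete, the sketch below computes cumulative incidence (a proportion of people at risk) next to an incidence rate (cases per unit of person-time). The counts are invented and the function names are hypothetical.

```python
def cumulative_incidence(new_cases, population_at_risk):
    """Proportion of the at-risk population developing disease over the period."""
    return new_cases / population_at_risk

def incidence_rate(new_cases, person_time):
    """New cases per unit of summed follow-up time."""
    return new_cases / person_time

# Hypothetical cohort: 50 new cases among 1,000 people at risk,
# who together contribute 4,800 person-years of follow-up.
risk = cumulative_incidence(50, 1000)
rate = incidence_rate(50, 4800.0)
```

The risk is unitless and tied to the stated period, while the rate carries units of inverse time (here, cases per person-year).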

Cumulative incidence function (CIF)

Probability of experiencing the event by a given time, accounting for competing events that remove patients. in the pathway →

On the pathway · 03 · Estimate · Competing risks and parametric survival

Does a competing event block the outcome?

modeling time to eventCompeting risks and survival models
a competing event existsCompeting risks
you want etiologyCause-specific hazard
you want absolute riskCumulative incidence function (CIF)
modeling absolute risk directlyFine-Gray subdistribution hazard
proportional hazards failsAccelerated failure time (AFT) models
some patients are curedCure models

Cure models

Survival models that split the population into a cured fraction and a susceptible fraction with its own distribution. in the pathway →

On the pathway · 03 · Estimate · Competing risks and parametric survival

Does a competing event block the outcome?

modeling time to eventCompeting risks and survival models
a competing event existsCompeting risks
you want etiologyCause-specific hazard
you want absolute riskCumulative incidence function (CIF)
modeling absolute risk directlyFine-Gray subdistribution hazard
proportional hazards failsAccelerated failure time (AFT) models
some patients are curedCure models

Cycle length and the half-cycle correction

Two timing choices in a state-transition model: a cycle short enough to miss no important event, and a correction for transitions occurring partway through a cycle. in the pathway →

On the pathway · 05 · Decision rule · Cycle length and the half-cycle correction

Which cycle-timing issue applies?

the overall ideaCycle length and the half-cycle correction
adjust for mid-cycle transitionsHalf-cycle correction
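
One common way to apply a half-cycle correction is to average state membership at the start and end of each cycle, which amounts to assuming transitions happen mid-cycle on average. The sketch below takes that trapezoidal approach as an assumption; other implementations exist, and the trace values are invented.

```python
def cycles_in_state(trace, half_cycle=True):
    """trace: share of the cohort in a state at the start of cycles 0..T.

    Returns the expected number of cycles spent in the state over T cycles.
    """
    if half_cycle:
        # Average start-of-cycle and end-of-cycle membership for each cycle.
        return sum((a + b) / 2 for a, b in zip(trace, trace[1:]))
    # Uncorrected: credit a full cycle to everyone present at its start.
    return sum(trace[:-1])

alive = [1.0, 0.8, 0.6, 0.5]  # hypothetical Markov trace for the "alive" state
uncorrected = cycles_in_state(alive, half_cycle=False)
corrected = cycles_in_state(alive)
```

With a declining trace the uncorrected count overstates time in state, and the gap shrinks as the cycle length gets shorter.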

D

DAG

A directed acyclic graph: variables as nodes and assumed causal effects as arrows, with no cycles. in the pathway →

On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworks

Which causal-diagram concept?

the overall frameworkCausal diagrams
the graph notation itselfDAG
common cause of exposure and outcomeConfounder
variable on the causal pathMediator
common effect, conditioning opens biasCollider
rule for sufficient adjustment setsBack-door criterion

Data feasibility, enrollment, and linkage

Confirming a database can answer the question, that follow-up is observable, and that datasets are joined without exposing patient identities. in the pathway →

On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkage

Can this data actually answer my question?

the overall ideaData feasibility, enrollment, and linkage
you need observable follow-upContinuous enrollment and observable time
you must size the populationDatabase feasibility and the attrition funnel
you join multiple datasetsPrivacy-preserving record linkage (tokenization)

Data management and reproducibility

The discipline between collection and analysis, from clean data capture and a database lock to a scripted, version-controlled pipeline that regenerates the numbers. in the pathway →

On the pathway · § · Conduct it · Data management and reproducibility

Which data-management step are you at?

the overall practiceData management and reproducibility
freezing the dataset before analysisDatabase lock

Data privacy and security

The duty owed to people in health data, governed by HIPAA in the US and GDPR in Europe, with de-identification or synthetic data enabling research sharing. in the pathway →

On the pathway · § · Conduct it · Data privacy and security

Which rule or method?

the overall ideaData privacy and security
US health privacy lawHIPAA
EU data protection lawGDPR
de-identify by removing identifiersSafe Harbor
de-identify by statistical opinionExpert determination
generating artificial substitute recordsSynthetic data

Data safety monitoring board

An independent board, not the sponsor, that reviews accruing data and recommends whether to stop a trial early for efficacy, futility, or harm. in the pathway →

On the pathway · 00 · Framing · Interim analyses and group-sequential design

Which interim-monitoring element?

the overall ideaInterim analyses and group-sequential design
committee reviewing accruing dataData safety monitoring board
planned looks with stopping rulesGroup-sequential design
false-positive risk to spendType-I error
stringent early-look boundaryO’Brien-Fleming boundary
constant nominal-level boundaryPocock boundary

Data sources and their tradeoffs

Each data source carries a characteristic strength and bias that bounds every question it can answer. in the pathway →

On the pathway · 01 · Measurement · Data sources and their tradeoffs

Which data source or pitfall?

the overall ideaData sources and their tradeoffs
clinical detail from care recordsElectronic health record data
billing records across encountersClaims data
enrolled cohorts for a conditionRegistries
sampled population questionnairesSurvey data
group-level inference pitfallEcological fallacy

Data standards and provenance

The structure a datapoint inherits from how it was recorded, through CDISC standards or coding ontologies. in the pathway →

On the pathway · 01 · Measurement · Data standards and provenance

Which data standard or provenance layer?

the overall idea of standards and provenanceData standards and provenance
clinical coding terminology for findingsSNOMED
regulatory model for collected trial dataCDISC SDTM
analysis-ready dataset standardADaM

Database feasibility and the attrition funnel

Counting how many patients survive each eligibility criterion to judge whether a source supports the planned study. in the pathway →

On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkage

Can this data actually answer my question?

the overall ideaData feasibility, enrollment, and linkage
you need observable follow-upContinuous enrollment and observable time
you must size the populationDatabase feasibility and the attrition funnel
you join multiple datasetsPrivacy-preserving record linkage (tokenization)

Database lock

A dated point after which no value in a study database changes silently, marking the clean source for analysis. in the pathway →

On the pathway · § · Conduct it · Data management and reproducibility

Which data-management step are you at?

the overall practiceData management and reproducibility
freezing the dataset before analysisDatabase lock

Decision tree (decision analysis)

A model mapping a one-off choice and its probabilistic consequences, clean for an acute decision but clumsy once events repeat. in the pathway →

On the pathway · 05 · Decision rule · Decision-analytic models

Which model structure fits the problem?

the overall ideaDecision-analytic models
branching one-time event sequenceDecision tree (decision analysis)
recurring health states over cyclesMarkov model
survival curves partition statesPartitioned survival model
infection spread depends on prevalenceDynamic transmission model
weight future values lowerDiscounting

Decision tree (machine learning)

A predictor that splits the predictor space into regions through successive rules, interpretable but unstable on its own. in the pathway →

On the pathway · 02 · Model · Learning algorithms and ensembles

Which learner or ensemble fits?

the overall ideaLearning algorithms and ensembles
single interpretable splitsDecision tree (machine learning)
classify by closest neighborsK-nearest neighbours
maximum-margin separating boundarySupport vector machine
average parallel bootstrapped modelsBagging
many decorrelated bagged treesRandom forest
sequentially correct prior errorsBoosting

Decision-analytic models

Models estimating lifetime costs and QALYs that are rarely observed directly, from decision trees and Markov models to microsimulation and transmission models. in the pathway →

On the pathway · 05 · Decision rule · Decision-analytic models

Which model structure fits the problem?

the overall ideaDecision-analytic models
branching one-time event sequenceDecision tree (decision analysis)
recurring health states over cyclesMarkov model
survival curves partition statesPartitioned survival model
infection spread depends on prevalenceDynamic transmission model
weight future values lowerDiscounting

Decision-curve analysis

Weighing the trade-offs of acting on a test or model directly in terms of net benefit across the range of thresholds a clinician might hold. in the pathway →

On the pathway · 05 · Decision rule · Decision-curve analysis

Which clinical-utility concept is in play?

the overall methodDecision-curve analysis
utility weighted by thresholdNet benefit
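
As a sketch of the quantity a decision curve plots, net benefit at a threshold probability is commonly written as true positives per patient minus false positives per patient weighted by the odds of the threshold. The function name and the counts below are hypothetical.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit of acting on a model at a given threshold probability."""
    # The threshold odds express how many false positives one true positive is worth.
    odds = threshold / (1 - threshold)
    return true_pos / n - (false_pos / n) * odds

# Hypothetical: 200 patients, 30 true positives and 20 false positives at a 20% threshold.
nb = net_benefit(true_pos=30, false_pos=20, n=200, threshold=0.2)
```

Evaluating this over a range of thresholds, and comparing against treating everyone and treating no one, gives the curve.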

Delphi method

A consensus method where an expert panel answers in iterative anonymous rounds, revising after seeing a statistical summary, so opinion converges without face-to-face pressure. in the pathway →

On the pathway · 06 · Recommendation · Consensus methods (Delphi, nominal group)

How do experts reach consensus?

the overall ideaConsensus methods (Delphi, nominal group)
anonymous iterative roundsDelphi method
structured in-person rankingNominal group technique

Descriptive epidemiology

Describing a health event by person, place, and time to generate hypotheses and settle which frequency measure is reported. in the pathway →

On the pathway · 00 · Framing · Descriptive epidemiology: person, place, time

What situation?

the overall ideaDescriptive epidemiology

Design effect

The factor by which a complex design such as clustering inflates variance relative to a simple random sample, used to scale up the target sample size so the effective sample size is preserved. in the pathway → \[\text{DEFF} = \frac{\text{Var}_{\text{complex}}}{\text{Var}_{\text{SRS}}}\] where \(\text{DEFF}\) is the design effect, the variance penalty from the complex design; \(\text{Var}_{\text{complex}}\) is the variance under the actual complex sampling design; \(\text{Var}_{\text{SRS}}\) is the variance a simple random sample of the same size would give.

On the pathway · 01 · Measurement · Survey sampling design

How do you draw the sample?

the overall ideaSurvey sampling design
every unit known nonzero chanceProbability sample
equal-chance draw from frameSimple random sampling
sample within population strataStratified sampling
sample whole groups togetherCluster sampling
sample in successive nested stagesMultistage sampling
variance inflation from clusteringDesign effect
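
The design-effect ratio and its use in sample-size planning can be sketched as follows. The first function is the ratio defined in this entry; the second uses the Kish approximation for equal-sized clusters, 1 + (m − 1) × ICC, which is an assumption brought in for illustration and not part of the glossary definition. All names and numbers are hypothetical.

```python
import math

def design_effect(var_complex, var_srs):
    """DEFF as the ratio of complex-design variance to simple-random-sample variance."""
    return var_complex / var_srs

def deff_from_icc(cluster_size, icc):
    """Kish approximation for equal-sized clusters (an added assumption)."""
    return 1 + (cluster_size - 1) * icc

def inflate_sample_size(n_srs, deff):
    """Sample size needed under the complex design to match an SRS of n_srs."""
    return math.ceil(n_srs * deff)

deff = deff_from_icc(cluster_size=5, icc=0.25)   # 1 + 4 * 0.25
n_needed = inflate_sample_size(n_srs=400, deff=deff)
```

Here a design effect of 2 doubles the required sample from 400 to 800.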

Detection bias

Differential ascertainment of the outcome by exposure group, also called observer bias. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Diagnostic-accuracy studies

Study design measuring how well an index test discriminates disease against a reference standard, prone to spectrum, verification, and incorporation bias. in the pathway →

On the pathway · 05 · Decision rule · Diagnostic-accuracy studies

Which accuracy measure or pitfall?

the overall ideaDiagnostic-accuracy studies
how results shift disease oddsLikelihood ratios
index test informs reference standardIncorporation bias
only some get the reference standardVerification bias
unrepresentative case mixSpectrum bias

Difference-in-differences

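A causal design that compares the before-after change in an exposed group with the change in a control group, removing time-invariant confounding and shared time trends, and resting on a parallel-trends assumption. in the pathway →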

On the pathway · 02 · Model · Causal designs without randomization

Which quasi-experimental design fits?

the overall ideaCausal designs without randomization
before-after across exposed and controlDifference-in-differences
a haphazard nudge to exposureInstrumental variables
assignment by a cutoff thresholdRegression discontinuity
weighted donors build a counterfactualSynthetic control
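
For the first design in the list, the basic two-group, two-period difference-in-differences estimate is simply the change in the exposed group minus the change in the control group. The sketch below uses invented group means and a hypothetical function name, and it says nothing about standard errors or whether parallel trends holds.

```python
def did_estimate(exposed_pre, exposed_post, control_pre, control_post):
    """Two-by-two difference-in-differences on group means."""
    change_exposed = exposed_post - exposed_pre
    change_control = control_post - control_pre
    # The control change stands in for what the exposed group would have done anyway.
    return change_exposed - change_control

# Hypothetical mean outcomes before and after a policy.
effect = did_estimate(exposed_pre=10.0, exposed_post=14.0,
                      control_pre=9.0, control_post=11.0)
```

The exposed group rose by 4 and the control group by 2, so the estimate attributes 2 to the policy.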

Differential misclassification

Measurement error in one variable that differs by the level of another, such as exposure error that depends on outcome status, which can bias an effect in either direction and is harder to reason about. in the pathway →

On the pathway · 01 · Measurement · Measurement error and misclassification

What kind of measurement error?

the overall ideaMeasurement error and misclassification
error unrelated to other variablesNon-differential misclassification
error differing by groupDifferential misclassification
true variance over observed varianceReliability ratio

Dimensionality reduction

Compressing many correlated variables into a few, through PCA or nonlinear methods. in the pathway →

On the pathway · 02 · Model · Unsupervised learning

What unlabeled-data structure are you finding?

the overall familyUnsupervised learning
grouping similar observationsClustering
nested grouping by linkageHierarchical clustering
partitioning into k groupsK-means
reducing the number of featuresDimensionality reduction
orthogonal variance componentsPrincipal component analysis

Discounting

Converting future costs and effects to present value over a model’s time horizon. in the pathway →

On the pathway · 05 · Decision rule · Decision-analytic models

Which model structure fits the problem?

the overall ideaDecision-analytic models
branching one-time event sequenceDecision tree (decision analysis)
recurring health states over cyclesMarkov model
survival curves partition statesPartitioned survival model
infection spread depends on prevalenceDynamic transmission model
weight future values lowerDiscounting
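
Discounting itself is a short calculation: each future value is divided by (1 + r) raised to the number of years ahead. The sketch assumes a constant annual rate, yearly values, and an undiscounted year 0; the 3% rate is only an example, and the function name is hypothetical.

```python
def present_value(yearly_values, rate):
    """Sum of yearly values discounted to year 0 at a constant annual rate."""
    return sum(value / (1 + rate) ** year
               for year, value in enumerate(yearly_values))

# Hypothetical stream: a cost of 100 in each of years 0, 1, and 2.
pv_costs = present_value([100.0, 100.0, 100.0], rate=0.03)
```

The same function applies to effects such as QALYs, which is why models report both discounted costs and discounted outcomes.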

Discrimination

Whether a model ranks higher-risk patients above lower-risk ones, measured by the AUC. in the pathway →

On the pathway · 03 · Estimate · Calibration versus discrimination

Which aspect of predictive performance?

the overall ideaCalibration versus discrimination
ranking cases above non-casesDiscrimination
summarizing ranking across thresholdsAUC
predicted risks match observedCalibration
testing calibration formallyHosmer-Lemeshow statistic
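
The AUC has a direct ranking interpretation: it is the share of case and non-case pairs in which the case receives the higher predicted risk, with ties counted as half. A brute-force sketch with invented predictions and a hypothetical function name:

```python
def auc(case_scores, noncase_scores):
    """Proportion of case/non-case pairs ranked correctly (ties count half)."""
    concordant = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                concordant += 1.0
            elif case == noncase:
                concordant += 0.5
    return concordant / (len(case_scores) * len(noncase_scores))

# Hypothetical predicted risks.
cases = [0.9, 0.7, 0.4]
noncases = [0.6, 0.3, 0.2]
c_statistic = auc(cases, noncases)
```

Eight of the nine pairs are ordered correctly here. Because only ranks matter, rescaling every prediction leaves the AUC unchanged, which is why discrimination says nothing about calibration.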

Disease registry

A systematically maintained roster of people with a condition or exposure that supplies a standing population for many designs. in the pathway →

On the pathway · 00 · Framing · Observational study designs

Which observational design fits the question and dominant bias?

the overall familyObservational study designs
exposure known, follow forwardCohort study
snapshot at one timeCross-sectional study
rare outcome, look backCase-control study
controls sampled within cohortNested case-control
random subcohort, multiple outcomesCase-cohort design
transient trigger, acute eventCase-crossover design
within-person rate comparisonSelf-controlled case series (SCCS)
corrects exposure time trendsCase-time-control design
initiators, active comparatorActive-comparator new-user design
standing source populationDisease registry

Disease risk score

A summary that models outcome risk from covariates instead of treatment probability, offering an alternative to the propensity score. in the pathway →

On the pathway · 02 · Model · Real-world causal-inference extensions

How do causal methods scale to claims and time?

the overall ideaReal-world causal-inference extensions
you have many candidate covariatesHigh-dimensional propensity score (hdPS)
outcome modeling suits the problemDisease risk score
treatment strategy unfolds over timeClone-censor-weight
censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
you need a simple guardLandmark analysis

Dose-finding and early-phase designs

Early studies that find the tolerable dose and the efficacy signal before a confirmatory trial. in the pathway →

On the pathway · 00 · Framing · Dose-finding and early-phase designs

Which early-phase design question are you facing?

the overall design familyDose-finding and early-phase designs
dose escalation by fixed cohort rule3+3 design
model-based dose escalationContinual reassessment method
the highest acceptably safe doseMaximum tolerated dose
phase II screening for efficacySimon’s two-stage design

Double-barreled question

A survey item that asks two things at once. in the pathway →

On the pathway · 01 · Measurement · Questionnaire and instrument design

Which questionnaire flaw is in play?

the overall craftQuestionnaire and instrument design
asking two things at onceDouble-barreled question
wording that steers the answerLeading question

Double-programming

Independent re-derivation of a dataset or output by a second programmer who has not seen the first programmer's code, with the two results reconciled value by value as the sign-off. in the pathway →

On the pathway · § · Conduct it · Statistical programming: TFLs and double-programming QC

What situation?

the overall ideaStatistical programming and TFLs
the reported tables and figuresTFLs
independent reproduction for QCDouble-programming

Doubly-robust estimators

Estimators such as augmented IPW and TMLE that combine a propensity and an outcome model and stay consistent if either is right. in the pathway →

On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)

How do you estimate the causal effect?

the overall ideaCausal estimators
model treatment assignment probabilityPropensity score
reweight by inverse treatment probabilityIPTW
model and average outcomesG-formula
combine outcome and treatment modelsDoubly-robust estimators
targeted machine-learning estimationTMLE

Drug era (OMOP)

A derived continuous exposure span in the OMOP model built from raw drug records using an explicit persistence gap. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)

Dynamic transmission model

An infectious-disease model capturing how treating one person changes others’ risk through herd immunity, which a fixed-risk cohort model cannot. in the pathway →

On the pathway · 05 · Decision rule · Decision-analytic models

Which model structure fits the problem?

the overall ideaDecision-analytic models
branching one-time event sequenceDecision tree (decision analysis)
recurring health states over cyclesMarkov model
survival curves partition statesPartitioned survival model
infection spread depends on prevalenceDynamic transmission model
weight future values lowerDiscounting

E

E-value

A measure of how strong a hidden confounder would have to be, in association with both treatment and outcome, to explain away an observed result. in the pathway → \[E = \text{RR} + \sqrt{\text{RR}(\text{RR} - 1)}\] where \(E\) is the E-value, the smallest association a hidden confounder would need with both treatment and outcome to explain the estimate away; \(\text{RR}\) is the observed risk ratio, taken above 1 (for a protective effect, apply the formula to its reciprocal).

On the pathway · ∗ · Defend it · Bias quantification

How do you quantify unmeasured bias?

the overall ideaBias quantification
strength needed to explain awayE-value
hidden bias in matched designsRosenbaum bounds
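
The E-value formula given in this entry translates directly into code. The sketch below follows the entry's own convention of taking the reciprocal for protective effects; the function name is hypothetical.

```python
import math

def e_value(rr):
    """E-value for an observed risk ratio, per E = RR + sqrt(RR * (RR - 1))."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        # Protective effect: apply the formula to the reciprocal.
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))

e_harmful = e_value(2.0)
e_protective = e_value(0.5)
```

A risk ratio of 2 and a risk ratio of 0.5 give the same E-value, about 3.41, and a null risk ratio of 1 gives an E-value of 1.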

Ecological fallacy

Reading a group-level association as if it held for individuals, a trap of aggregated data. in the pathway →

On the pathway · 01 · Measurement · Data sources and their tradeoffs

Which data source or pitfall?

the overall ideaData sources and their tradeoffs
clinical detail from care recordsElectronic health record data
billing records across encountersClaims data
enrolled cohorts for a conditionRegistries
sampled population questionnairesSurvey data
group-level inference pitfallEcological fallacy

Effect measures

The scales for reporting a result, including relative measures like risk and odds ratios and absolute measures like risk difference and number needed to treat. in the pathway → \[\text{RR} = \frac{\text{risk}_{\text{exposed}}}{\text{risk}_{\text{unexposed}}}, \quad \text{RD} = \text{risk}_{\text{exposed}} - \text{risk}_{\text{unexposed}}\] where \(\text{RR}\) is the risk ratio, a relative measure; \(\text{RD}\) is the risk difference, an absolute measure; \(\text{risk}_{\text{exposed}}\) is the outcome risk in the exposed group; \(\text{risk}_{\text{unexposed}}\) is the outcome risk in the unexposed group.

On the pathway · 03 · Estimate · Effect measures

Which effect measure to report?

the overall ideaEffect measures
ratio of risks between groupsRisk ratio
ratio of odds between groupsOdds ratio
absolute difference in riskRisk difference
patients treated per outcome preventedNumber needed to treat
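
The four measures in the list can all be derived from the same two-group counts. The sketch below uses invented trial counts and hypothetical names, and it takes the number needed to treat as the reciprocal of the absolute risk difference.

```python
def effect_measures(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk ratio, odds ratio, risk difference, and NNT from two-group counts."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    rd = risk_exposed - risk_unexposed
    return {
        "RR": risk_exposed / risk_unexposed,
        "OR": odds_exposed / odds_unexposed,
        "RD": rd,
        # Undefined when the risks are equal, so guard against division by zero.
        "NNT": 1 / abs(rd) if rd != 0 else float("inf"),
    }

# Hypothetical trial: 10 of 100 treated patients and 20 of 100 controls have the event.
measures = effect_measures(10, 100, 20, 100)
```

The same data halve the risk on the relative scale (RR 0.5) yet prevent one event per 10 patients treated, which is the relative-versus-absolute contrast the entry points to.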

Effective sample size

The sample size discounted by the design effect, so a design effect of 2 leaves the precision of half the respondents. in the pathway → \[n_{\text{eff}} = \frac{n}{\text{DEFF}}\] where \(n_{\text{eff}}\) is the effective sample size, the precision the design actually delivers; \(n\) is the achieved sample size; \(\text{DEFF}\) is the design effect, the variance penalty from clustering and unequal weighting.

On the pathway · 01 · Measurement · Complex-sample design and survey weighting

Which weighting or design adjustment?

the overall ideaComplex-sample design and survey weighting
scale respondents to the populationSurvey weight
precision lost to the designEffective sample size
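
The formula in this entry is a one-line calculation; the sketch below adds only a guard against an invalid design effect. The function name and the survey figures are hypothetical.

```python
def effective_sample_size(n, deff):
    """Effective sample size n_eff = n / DEFF."""
    if deff <= 0:
        raise ValueError("design effect must be positive")
    return n / deff

# Hypothetical survey: 1,000 respondents and a design effect of 2.
n_eff = effective_sample_size(1000, 2.0)
```

As the entry says, a design effect of 2 leaves the precision of half the respondents, here 500.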

Egger’s test

A statistical test for funnel-plot asymmetry, used to check for publication bias. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test

Elastic net

A regularization that blends ridge and lasso penalties. in the pathway →

On the pathway · 02 · Model · Bias-variance and regularization

Which concept or penalty?

the overall ideaBias-variance and regularization
the underlying error tradeoffBias-variance tradeoff
fitting noise, poor generalizationOverfitting
penalizing complexity broadlyRegularization
estimating out-of-sample errorCross-validation
shrink coefficients, keep allRidge regression
shrink and select variablesLasso
blend selection and shrinkageElastic net

Electronic health record data

Clinically rich data recorded for care, so messy, single-system, and informatively missing rather than research-ready. in the pathway →

On the pathway · 01 · Measurement · Data sources and their tradeoffs

Which data source or pitfall?

the overall ideaData sources and their tradeoffs
clinical detail from care recordsElectronic health record data
billing records across encountersClaims data
enrolled cohorts for a conditionRegistries
sampled population questionnairesSurvey data
group-level inference pitfallEcological fallacy

Elixhauser comorbidity measures

A broader set of comorbidity categories, often kept as separate indicators rather than one number, to adjust for diverse baseline conditions. in the pathway →

On the pathway · 01 · Measurement · Comorbidity and frailty adjustment

How do I adjust for how sick patients already were?

the overall ideaComorbidity and frailty adjustment
you need a mortality-weighted scoreCharlson comorbidity index (CCI)
you want broad comorbidity coverageElixhauser comorbidity measures
patients are older or frailClaims-based frailty index

Empirical calibration

Fitting the distribution of estimates from many negative controls, where the true effect is assumed null, to recalibrate p-values and intervals for the systematic error observed. in the pathway →

On the pathway · ∗ · Defend it · Negative controls and empirical calibration

Need to detect hidden residual confounding?

probing residual confoundingNegative controls and calibration
checking confounding on exposureNegative control exposure
checking confounding on outcomeNegative control outcome
many negative controls availableEmpirical calibration
quantifying systematic errorQuantitative bias analysis

Endpoint adjudication and chart review

Clinician review of source records, blinded to exposure, serving as the reference standard for validation. in the pathway →

On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation

Is your outcome a validated algorithm or an unchecked code rule?

the overall ideaOutcome phenotyping and validation
the rule itselfClaims/EHR phenotype algorithm
common coding rule1-inpatient / 2-outpatient rule
tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
reference standardEndpoint adjudication and chart review
bundled outcomesComposite endpoint construction

Endpoint logic and pre-registration

Fixing the primary endpoint the sample size rests on and publicly committing to it before unblinding, keeping confirmatory analyses confirmatory. in the pathway →

On the pathway · 00 · Framing · Endpoint logic and pre-registration

Which endpoint or pre-specification concern?

the overall ideaEndpoint logic and pre-registration
the main pre-specified outcomePrimary endpoint
a stand-in for the outcomeSurrogate endpoint
criteria validating a surrogatePrentice’s criteria
lock analysis plan in advancePre-registration

EQ-5D

A preference-based instrument used to derive the utility weights that anchor quality-adjusted life years. in the pathway →

On the pathway · 05 · Decision rule · QALYs and health-state utilities

Which utility concept do you need?

the overall ideaQALYs and health-state utilities
quality-adjusted life yearsQALY
preference weight for a health stateHealth-state utility
a standardized utility instrumentEQ-5D

Equivalence trial

A trial that bounds the difference between treatments on both sides. in the pathway →

On the pathway · 00 · Framing · Non-inferiority and equivalence

What are you trying to show?

the overall ideaNon-inferiority and equivalence
new is not meaningfully worseNon-inferiority trial
new is neither worse nor betterEquivalence trial
how much worse is tolerableNon-inferiority margin
trial can detect a real differenceAssay sensitivity

Estimand

The exact quantity to be estimated: which effect, in whom. in the pathway →

On the pathway · 02 · Model · Choosing the estimand

Whose causal effect do you target?

the overall ideaChoosing the estimand
the formal target quantityEstimand
effect across the whole populationATE
effect among the treatedATT
effect among compliers onlyLATE

Eta-squared

An ANOVA effect-size measure, the share of variance the groups explain. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V
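
As a minimal sketch of the definition above, eta-squared can be computed as the between-group sum of squares divided by the total sum of squares. The function name `eta_squared` and the example values are illustrative assumptions, not taken from the pathway.

```python
def eta_squared(groups):
    """Share of total variance explained by group membership (one-way ANOVA)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total variation of every observation around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Hypothetical outcome values for three groups.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
eta2 = eta_squared(example_groups)
```

Here the between-group sum of squares is 42 and the total is 48, so the groups explain 0.875 of the variance.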

Evidence-to-decision

Frameworks making the move from evidence to a recommendation explicit, weighing benefits and harms alongside values, feasibility, equity, and cost. in the pathway →

On the pathway · 06 · Recommendation · Evidence-to-decision

What situation?

the overall ideaEvidence-to-decision

EVPI

Expected value of perfect information: an upper bound on what further research could be worth, equal to the expected loss from deciding under current uncertainty. in the pathway →

On the pathway · 05 · Decision rule · Value of information (EVPI)

What is more evidence worth?

the overall ideaValue of information (EVPI)
value of removing all uncertaintyEVPI
value of a specific future studyExpected value of sample information
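
One common way to compute EVPI from a probabilistic sensitivity analysis is sketched below: the average of the best net benefit within each simulated draw, minus the net benefit of the strategy that is best on average. The function name `evpi`, the data layout, and the numbers are hypothetical.

```python
def evpi(net_benefit_draws):
    """EVPI per decision from simulated net benefits.

    net_benefit_draws: one row per simulated parameter draw; each row holds
    the net benefit of every strategy under that draw.
    """
    n_draws = len(net_benefit_draws)
    n_strategies = len(net_benefit_draws[0])
    # With perfect information, the best strategy is chosen within each draw.
    value_perfect = sum(max(row) for row in net_benefit_draws) / n_draws
    # With current information, one strategy is chosen for all draws.
    expected_nb = [
        sum(row[j] for row in net_benefit_draws) / n_draws
        for j in range(n_strategies)
    ]
    value_current = max(expected_nb)
    return value_perfect - value_current


# Hypothetical draws: strategy A is certain, strategy B is uncertain.
draws = [[10.0, 12.0], [10.0, 6.0], [10.0, 14.0], [10.0, 9.0]]
result = evpi(draws)
```

In this example the strategy that is best on average (B, at 10.25) is the wrong choice in two of four draws, and the EVPI of 1.25 is the expected loss from that.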

Exact logistic regression

Conditions on sufficient statistics and enumerates the permutation distribution, giving valid inference without asymptotic approximations when data are very sparse. in the pathway →

On the pathway · 02 · Model · Sparse data and resampling

Are cells sparse or analytic standard errors doubtful?

the overall familySparse data and resampling
separation or small samplesFirth penalized regression
very sparse, exact inferenceExact logistic regression
no clean closed-form varianceBootstrap and resampling methods
public-health impact measuresAttributable risk and population attributable fraction (PAF)

Exchangeability

The identifiability condition that treated and untreated are comparable once confounders are controlled, meaning no unmeasured confounding. in the pathway →

On the pathway · 02 · Model · Potential outcomes and identifiability

Which identifiability condition is at stake?

the overall frameworkPotential outcomes and identifiability
the counterfactual setupPotential-outcomes framework
only one outcome is observedFundamental problem of causal inference
treated and untreated comparableExchangeability
every covariate stratum has bothPositivity
observed equals counterfactual under treatmentConsistency
no interference, single versionSUTVA

Exclusion restriction

The unverifiable assumption underlying instrumental-variable designs: the instrument affects the outcome only through the exposure. in the pathway →

On the pathway · 02 · Model · Identifying assumptions

Which identifying assumption do you need?

the overall ideaIdentifying assumptions
treatment independent of confoundersConditional independence
instrument affects outcome only via exposureExclusion restriction
groups would have tracked togetherParallel trends

Expected value of partial perfect information (EVPPI)

Prices resolving specific uncertain parameters, identifying which uncertainty is worth further research. in the pathway →

On the pathway · 05 · Decision rule · Real-world cost and HTA methods

Modeling skewed real-world costs for HTA?

valuing real-world costsReal-world cost and HTA methods
costs are zero-inflatedTwo-part and other cost models
extreme cost outliers existWinsorization and trimming of cost outliers
summarizing population spendPer-member-per-month costing (PMPM/PPPM)
trial ends before lifetimeSurvival extrapolation for HTA
value has multiple dimensionsMulti-criteria decision analysis (MCDA)
prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)

Expected value of sample information

A measure valuing a study of a given design and size, going beyond perfect information to price real research. in the pathway →

On the pathway · 05 · Decision rule · Value of information (EVPI)

What is more evidence worth?

the overall ideaValue of information (EVPI)
value of removing all uncertaintyEVPI
value of a specific future studyExpected value of sample information

Expert determination

A HIPAA de-identification route where a statistician certifies the re-identification risk is very small. in the pathway →

On the pathway · § · Conduct it · Data privacy and security

Which rule or method?

the overall ideaData privacy and security
US health privacy lawHIPAA
EU data protection lawGDPR
de-identify by removing identifiersSafe Harbor
de-identify by statistical opinionExpert determination
generating artificial substitute recordsSynthetic data

Exposure definition in RWD

Turning prescription or claim records into an exposure variable with a defined start, window, and end so it is clear who is treated and when. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)

Exposure episode construction

Stitching consecutive fills into a continuous treatment span using rules for combining overlapping or sequential supplies. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)
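
The stitching rule can be sketched as follows, under simplifying assumptions that are not from the pathway: each fill is a `(fill_date, days_supply)` pair, overlapping supply is not carried forward (no stockpiling), no grace period is appended to the final fill, and a hypothetical `permissible_gap_days` parameter sets how many uncovered days are tolerated before a new episode starts.

```python
from datetime import date, timedelta


def build_episodes(fills, permissible_gap_days=30):
    """Stitch fills into (start, end) episodes; end is the last covered day."""
    episodes = []
    for fill_date, days_supply in sorted(fills):
        covered_until = fill_date + timedelta(days=days_supply - 1)
        if episodes:
            start, end = episodes[-1]
            # Days with no supply between the previous coverage and this fill.
            uncovered_days = (fill_date - end).days - 1
            if uncovered_days <= permissible_gap_days:
                # Close enough: extend the current episode.
                episodes[-1] = (start, max(end, covered_until))
                continue
        # First fill, or gap too long: start a new episode.
        episodes.append((fill_date, covered_until))
    return episodes


# Hypothetical fills: two close together, one after a long break.
fills = [
    (date(2024, 1, 1), 30),
    (date(2024, 2, 5), 30),
    (date(2024, 6, 1), 30),
]
episodes = build_episodes(fills)
```

The first two fills are separated by five uncovered days and merge into one episode; the third starts a new one. Real studies usually also decide how to handle early refills and whether to add a grace period at the end of the last fill.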

Extract-transform-load

Pulling from source tables, deriving study variables from operational definitions, and assembling one analysis-ready table. in the pathway →

On the pathway · 01 · Measurement · Assembling the analytic cohort

Which cohort-construction step?

the overall ideaAssembling the analytic cohort
pull and reshape raw source dataExtract-transform-load
set the time-zero anchorIndex date
define pre-index covariate historyLookback window
restrict to treatment initiatorsNew-user design

F

F1 score

The harmonic mean of precision and recall, high only when both are. in the pathway → \[F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}\] where \(F_1\) is the F1 score, the harmonic mean of precision and recall; \(\text{precision}\) is the share of positive predictions that are correct; \(\text{recall}\) is the share of true positives caught.

On the pathway · 02 · Model · Classification performance metrics

Which classification metric?

the overall ideaClassification performance metrics
share of predicted positives correctPrecision
share of true positives caughtRecall
balance precision and recallF1 score
tradeoff across all thresholdsPrecision-recall curve
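
A short sketch of the formula above, computed from confusion-matrix counts. The function name and the counts are illustrative, and the sketch assumes at least one predicted positive and one true positive so that no denominator is zero.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts (assumes nonzero denominators)."""
    precision = tp / (tp + fp)  # share of positive predictions that are correct
    recall = tp / (tp + fn)     # share of true positives caught
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: precise but insensitive classifier.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=120)
```

With precision 0.8 and recall 0.4 the F1 score is about 0.53, below the arithmetic mean of 0.6, which shows how the harmonic mean is pulled toward the weaker of the two.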

False-discovery rate

The expected share of false positives among rejections, controlled by Benjamini-Hochberg, better for screening. in the pathway →

On the pathway · 02 · Model · Multiplicity control

How do you control multiple testing?

the overall ideaMultiplicity control
bound any false positiveFamily-wise error rate
bound false positives among rejectionsFalse-discovery rate
simple conservative FWER divisorBonferroni correction
stepwise FWER controlHolm’s procedure
step-up FDR controlBenjamini-Hochberg
test hypotheses in ordered familiesGatekeeping procedure
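
The Benjamini-Hochberg step-up rule mentioned above can be sketched in a few lines: sort the p-values, find the largest rank k whose p-value is at most (k/m)·q, and reject every hypothesis up to that rank. The function name, the target level `q`, and the p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank whose p-value falls under its step-up threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    # Reject everything ranked at or below the cutoff, even if an earlier
    # p-value missed its own threshold.
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from five screening tests.
p_values = [0.035, 0.60, 0.012, 0.020, 0.015]
bh_rejected = benjamini_hochberg(p_values, q=0.05)
```

In this example the smallest p-value (0.012) misses its own threshold of 0.01 but is still rejected, because the fourth-ranked p-value (0.035) clears its threshold of 0.04. That is the step-up behavior.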

Family-wise error rate

The chance of even one false positive, held down by Bonferroni or Holm’s step-down procedure. in the pathway →

On the pathway · 02 · Model · Multiplicity control

How do you control multiple testing?

the overall ideaMultiplicity control
bound any false positiveFamily-wise error rate
bound false positives among rejectionsFalse-discovery rate
simple conservative FWER divisorBonferroni correction
stepwise FWER controlHolm’s procedure
step-up FDR controlBenjamini-Hochberg
test hypotheses in ordered familiesGatekeeping procedure

Fine-Gray subdistribution hazard

A hazard that models the cumulative incidence function directly, giving covariate effects on absolute risk. in the pathway →

On the pathway · 03 · Estimate · Competing risks and parametric survival

Does a competing event block the outcome?

modeling time to eventCompeting risks and survival models
a competing event existsCompeting risks
you want etiologyCause-specific hazard
you want absolute riskCumulative incidence function (CIF)
modeling absolute risk directlyFine-Gray subdistribution hazard
proportional hazards failsAccelerated failure time (AFT) models
some patients are curedCure models

Firth penalized regression

Adds a bias-reducing penalty to the likelihood, keeping coefficient estimates finite and less biased even under separation in small or sparse data. in the pathway →

On the pathway · 02 · Model · Sparse data and resampling

Are cells sparse or analytic standard errors doubtful?

the overall familySparse data and resampling
separation or small samplesFirth penalized regression
very sparse, exact inferenceExact logistic regression
no clean closed-form varianceBootstrap and resampling methods
public-health impact measuresAttributable risk and population attributable fraction (PAF)

Fisher’s exact test

A test of association between categorical variables used when cell counts are small. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V
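
For a 2x2 table, the exact test can be sketched with the hypergeometric distribution: fix the margins, enumerate every table consistent with them, and sum the probabilities of tables no more likely than the one observed. This is one common two-sided convention; the function name and the example table are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that ties with the observed table count.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical small-count table.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

For this table the p-value is 34/70, about 0.49, so counts this small give little evidence of association.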

Fixed-effect meta-analysis

A pooling model assuming every study estimates one common effect, weighting each only by the inverse of its variance. in the pathway → \[w = \frac{1}{\text{variance}}\] where \(w\) is the weight a study receives in the pooled estimate; \(\text{variance}\) is the variance of that study’s effect estimate.

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test
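
A minimal sketch of inverse-variance pooling under the fixed-effect assumption, using the weight formula above. The function name and the study estimates (imagined as log risk ratios) are hypothetical.

```python
from math import sqrt


def fixed_effect_pool(estimates, variances):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]  # w = 1 / variance
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    standard_error = sqrt(1.0 / sum(weights))
    return pooled, standard_error


# Hypothetical log risk ratios and their variances from three studies.
log_rr = [-0.30, -0.10, -0.20]
variances = [0.04, 0.01, 0.02]
pooled, pooled_se = fixed_effect_pool(log_rr, variances)
```

The weights here are 25, 100, and 50, so the most precise study dominates and the pooled log risk ratio is about -0.157.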

Fleiss’ kappa

A kappa extending chance-corrected agreement past two raters. in the pathway →

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa

Friction-cost approach

Valuing lost productivity by counting only earnings lost until a worker is replaced. in the pathway →

On the pathway · 05 · Decision rule · Costing methods

How to value the resources used?

the overall ideaCosting methods
aggregate top-down unit costsGross costing
itemized bottom-up resource countsMicro-costing
value lost productivity over a lifetimeHuman-capital approach
value productivity loss until replacedFriction-cost approach

Fundamental problem of causal inference

That only one of a unit’s potential outcomes is ever observed. in the pathway →

On the pathway · 02 · Model · Potential outcomes and identifiability

Which identifiability condition is at stake?

the overall frameworkPotential outcomes and identifiability
the counterfactual setupPotential-outcomes framework
only one outcome is observedFundamental problem of causal inference
treated and untreated comparableExchangeability
every covariate stratum has bothPositivity
observed equals counterfactual under treatmentConsistency
no interference, single versionSUTVA

Funnel plot

A plot used to check for publication bias in a meta-analysis, where asymmetry suggests missing null studies. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test

G

G-estimation

A g-method estimating a structural nested model for time-varying confounding. in the pathway →

On the pathway · 02 · Model · Time-varying confounding and g-methods

How do you handle time-varying confounding?

the overall ideaTime-varying confounding
confounder both affects and respondsTreatment-confounder feedback
weight to remove time-varying confoundingMarginal structural model
model effect directly through timeG-estimation

G-formula

G-computation: modeling the outcome under each treatment and averaging over the covariate distribution. in the pathway →

On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)

How do you estimate the causal effect?

the overall ideaCausal estimators
model treatment assignment probabilityPropensity score
reweight by inverse treatment probabilityIPTW
model and average outcomesG-formula
combine outcome and treatment modelsDoubly-robust estimators
targeted machine-learning estimationTMLE
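
With a single categorical confounder, the g-formula can be sketched without a regression model: take the observed mean outcome in each confounder-by-treatment cell, then average those means over the confounder distribution of the whole sample. The function name and the records are hypothetical, and the sketch assumes every confounder stratum contains both treated and untreated subjects (positivity).

```python
from collections import defaultdict


def g_formula_risk(records, treatment):
    """Standardized risk if everyone had received `treatment`.

    records: list of (confounder, treated, outcome) tuples.
    """
    cells = defaultdict(lambda: [0.0, 0])  # (confounder, treated) -> [sum, n]
    for confounder, treated, outcome in records:
        cell = cells[(confounder, treated)]
        cell[0] += outcome
        cell[1] += 1
    total = 0.0
    # Average the stratum-specific risk over every subject's confounder value.
    for confounder, _, _ in records:
        outcome_sum, n = cells[(confounder, treatment)]
        total += outcome_sum / n  # fails if positivity is violated
    return total / len(records)


# Hypothetical data: treatment is more common in the high-risk stratum.
records = (
    [(0, 1, 1)] * 1 + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 2 + [(0, 0, 0)] * 6
    + [(1, 1, 1)] * 6 + [(1, 1, 0)] * 2 + [(1, 0, 1)] * 1 + [(1, 0, 0)] * 1
)
risk_treated = g_formula_risk(records, treatment=1)
risk_untreated = g_formula_risk(records, treatment=0)
```

The crude risk difference in these records is 0.4 (0.7 versus 0.3), while the standardized difference is 0.25 (0.625 versus 0.375); the gap is the confounding removed by averaging over the covariate distribution.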

Gate question

A question that routes a respondent past items that do not apply, creating by-design blanks. in the pathway →

On the pathway · 01 · Measurement · Survey instruments: skip patterns and branching

Which skip-logic element is in play?

the overall ideaSurvey skip patterns
item routing later questionsGate question

Gatekeeping procedure

A hierarchical procedure ordering trial hypotheses and spending alpha down the sequence, testing a secondary endpoint only if the primary won. in the pathway →

On the pathway · 02 · Model · Multiplicity control

How do you control multiple testing?

the overall ideaMultiplicity control
bound any false positiveFamily-wise error rate
bound false positives among rejectionsFalse-discovery rate
simple conservative FWER divisorBonferroni correction
stepwise FWER controlHolm’s procedure
step-up FDR controlBenjamini-Hochberg
test hypotheses in ordered familiesGatekeeping procedure

GDPR

The European regulation imposing a stricter consent-and-purpose regime on personal data than US rules. in the pathway →

On the pathway · § · Conduct it · Data privacy and security

Which rule or method?

the overall ideaData privacy and security
US health privacy lawHIPAA
EU data protection lawGDPR
de-identify by removing identifiersSafe Harbor
de-identify by statistical opinionExpert determination
generating artificial substitute recordsSynthetic data

GEE

Generalized estimating equations, used for clustered or repeated measures. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

Generalizability and transportability

Generalizability asks whether the study sample represents the target population; transportability formalizes when an estimate can be carried to a different population. in the pathway →

On the pathway · 04 · Synthesis · Generalizability and transportability

Do findings carry to other populations?

the overall ideaGeneralizability and transportability

Generalized additive models

Models that extend splines to fit smooth nonlinear predictor effects. in the pathway →

On the pathway · 02 · Model · Model modifications (splines, interactions)

How do you flex the model?

the overall ideaModel modifications
effect depends on another variableInteraction term
fixed exposure term in count modelsOffset
smooth nonlinear flexible curvesSplines
additive smooth function componentsGeneralized additive models

Gibbs sampling

A classic MCMC algorithm that draws posterior samples by updating each parameter in turn from its conditional distribution given the others. in the pathway →

On the pathway · 02 · Model · Bayesian computation (MCMC)

Which sampling or diagnostic tool?

the overall ideaBayesian computation
sampling the posterior generallyMCMC
proposal-and-accept samplerMetropolis-Hastings
sample each parameter conditionallyGibbs sampling
gradient-guided efficient samplerHamiltonian Monte Carlo
check chains have convergedR-hat
check model reproduces the dataPosterior predictive check
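
To illustrate "sample each parameter conditionally", here is a toy Gibbs sampler for a standard bivariate normal with correlation `rho`, where each coordinate's conditional distribution is normal with mean `rho` times the other coordinate and variance `1 - rho**2`. The target distribution, function name, burn-in length, and seed are assumptions chosen for illustration, not a posterior from the pathway.

```python
import random
from math import sqrt


def gibbs_bivariate_normal(rho, n_draws, burn_in=500, seed=0):
    """Draw (x, y) pairs from a standard bivariate normal by Gibbs sampling."""
    rng = random.Random(seed)
    conditional_sd = sqrt(1 - rho ** 2)
    x, y = 0.0, 0.0
    draws = []
    for i in range(n_draws + burn_in):
        # Update x from its conditional given the current y, then y given x.
        x = rng.gauss(rho * y, conditional_sd)
        y = rng.gauss(rho * x, conditional_sd)
        if i >= burn_in:  # discard early draws that depend on the start value
            draws.append((x, y))
    return draws


samples = gibbs_bivariate_normal(rho=0.8, n_draws=20000)
```

Successive draws are autocorrelated, more so as `rho` approaches 1, which is one reason convergence diagnostics such as R-hat are checked before the samples are used.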

GLM

A generalized linear model: a choice of outcome distribution plus a link function. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

Good clinical practice

The operational standard (ICH E6) making a trial’s data trustworthy through defined responsibilities, a followed protocol, source-data verification, and an audit trail. in the pathway →

On the pathway · § · Conduct it · Good clinical practice

What conduct standard governs the trial?

the ethical and quality standardGood clinical practice

Grace period and permissible gap

Allowed days between supplies before exposure is broken, and extra coverage past the last day of supply before discontinuation. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)

GRADE

A system rating the certainty of a body of evidence, downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias. in the pathway →

On the pathway · 04 · Synthesis · Certainty of evidence (GRADE)

Which certainty-of-evidence concept is in play?

rating confidence in estimatesCertainty of evidence (GRADE)
the named frameworkGRADE

Gross costing

Top-down costing that values a whole episode of care with one aggregate weight such as a DRG payment. in the pathway →

On the pathway · 05 · Decision rule · Costing methods

How to value the resources used?

the overall ideaCosting methods
aggregate top-down unit costsGross costing
itemized bottom-up resource countsMicro-costing
value lost productivity over a lifetimeHuman-capital approach
value productivity loss until replacedFriction-cost approach

Group-sequential design

A design that pre-specifies interim analyses and spends the alpha across them with a stopping boundary. in the pathway →

On the pathway · 00 · Framing · Interim analyses and group-sequential design

Which interim-monitoring element?

the overall ideaInterim analyses and group-sequential design
committee reviewing accruing dataData safety monitoring board
planned looks with stopping rulesGroup-sequential design
false-positive risk to spendType-I error
stringent early-look boundaryO’Brien-Fleming boundary
constant nominal-level boundaryPocock boundary

H

Half-cycle correction

A fix for the counting error from tallying state membership only at cycle boundaries, since on average subjects transition partway through a cycle. in the pathway →

On the pathway · 05 · Decision rule · Cycle length and the half-cycle correction

Which cycle-timing issue applies?

the overall ideaCycle length and the half-cycle correction
adjust for mid-cycle transitionsHalf-cycle correction
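
One common way to apply the correction is to average state membership at the start and end of each cycle (a trapezoidal rule) rather than counting it at one boundary only. The sketch below assumes a hypothetical Markov trace of the proportion alive at each cycle boundary; the function name and values are illustrative.

```python
def life_years(trace, cycle_length=1.0, half_cycle=True):
    """Accumulate time in state from a trace of membership at cycle boundaries.

    trace[t] is the proportion in the state at the start of cycle t;
    the final element is membership at the end of the last cycle.
    """
    if half_cycle:
        # Average start and end membership within each cycle.
        total = sum(
            (trace[t] + trace[t + 1]) / 2 for t in range(len(trace) - 1)
        )
    else:
        # Count membership at the start of each cycle only.
        total = sum(trace[:-1])
    return total * cycle_length


# Hypothetical proportion alive at years 0, 1, 2, 3.
alive = [1.0, 0.8, 0.6, 0.4]
corrected = life_years(alive, half_cycle=True)
uncorrected = life_years(alive, half_cycle=False)
```

Counting at the start of each cycle gives 2.4 life years; the corrected total is 2.1, because deaths are treated as occurring on average halfway through a cycle.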

Hamiltonian Monte Carlo

The gradient-guided MCMC method behind Stan, which mixes far more efficiently in high dimensions than random-walk samplers. in the pathway →

On the pathway · 02 · Model · Bayesian computation (MCMC)

Which sampling or diagnostic tool?

the overall ideaBayesian computation
sampling the posterior generallyMCMC
proposal-and-accept samplerMetropolis-Hastings
sample each parameter conditionallyGibbs sampling
gradient-guided efficient samplerHamiltonian Monte Carlo
check chains have convergedR-hat
check model reproduces the dataPosterior predictive check

Hazard ratios and non-proportional hazards

A hazard ratio assumes a constant effect on instantaneous risk over time; when that fails, the single ratio becomes a censoring-dependent weighted average. in the pathway →

On the pathway · 03 · Estimate · Hazard ratios and non-proportional hazards

Which survival concept?

the overall ideaHazard ratios and non-proportional hazards
censoring unrelated to outcomeNon-informative censoring
summary when hazards are non-proportionalRestricted mean survival time

Health technology assessment and value frameworks

A process, run by a dedicated body, that weighs cost-effectiveness against clinical benefit, budget impact, and equity to reach a coverage or pricing verdict; it is run differently across health systems. in the pathway →

On the pathway · 05 · Decision rule · Health technology assessment and value frameworks

Which HTA framework or body?

the overall ideaHealth technology assessment and value frameworks
UK appraisal agencyNICE
US value-assessment organizationInstitute for Clinical and Economic Review

Health-state utility

A preference-based weight for a health state, anchored at zero (dead) and one (full health), elicited from instruments like the EQ-5D or from time-trade-off and standard-gamble methods. in the pathway →

On the pathway · 05 · Decision rule · QALYs and health-state utilities

Which utility concept do you need?

the overall ideaQALYs and health-state utilities
quality-adjusted life yearsQALY
preference weight for a health stateHealth-state utility
a standardized utility instrumentEQ-5D

Healthy-worker effect

The tendency of an employed cohort to be healthier than the general population. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Heterogeneity

The degree to which studies’ results actually disagree beyond chance, which decides whether a pooled number is informative or a fiction. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test

Heteroscedasticity

Non-constant residual variance, read from a residual-versus-fitted plot and confirmed with Breusch-Pagan or White. in the pathway →

On the pathway · 02 · Model · Checking model assumptions

Which model assumption to check?

the overall ideaChecking model assumptions
non-constant residual varianceHeteroscedasticity
predictors too collinearVariance inflation factor
single points driving the fitCook’s distance
hazard ratio constant over timeProportional hazards
test that proportionality formallySchoenfeld residuals
fix variance without refittingRobust standard errors

Hierarchical Bayesian models

Multilevel models that estimate each group’s parameter while sharing a common prior, pulling estimates toward the mean. in the pathway →

On the pathway · 02 · Model · Hierarchical (multilevel) Bayesian models

Which multilevel Bayesian idea?

the overall ideaHierarchical Bayesian models
borrow strength across groupsPartial pooling

Hierarchical clustering

A clustering method building a nested tree of groupings without fixing the number of clusters in advance. in the pathway →

On the pathway · 02 · Model · Unsupervised learning

What unlabeled-data structure are you finding?

the overall familyUnsupervised learning
grouping similar observationsClustering
nested grouping by linkageHierarchical clustering
partitioning into k groupsK-means
reducing the number of featuresDimensionality reduction
orthogonal variance componentsPrincipal component analysis

High-dimensional propensity score (hdPS)

An algorithm that screens thousands of claims codes to select empirical proxy confounders for the propensity-score model automatically. in the pathway →

On the pathway · 02 · Model · Real-world causal-inference extensions

How do causal methods scale to claims and time?

the overall ideaReal-world causal-inference extensions
you have many candidate covariatesHigh-dimensional propensity score (hdPS)
outcome modeling suits the problemDisease risk score
treatment strategy unfolds over timeClone-censor-weight
censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
you need a simple guardLandmark analysis

HIPAA

The US law governing identifiable health information, which a dataset typically satisfies through de-identification before it is shared for research. in the pathway →

On the pathway · § · Conduct it · Data privacy and security

Which rule or method?

the overall ideaData privacy and security
US health privacy lawHIPAA
EU data protection lawGDPR
de-identify by removing identifiersSafe Harbor
de-identify by statistical opinionExpert determination
generating artificial substitute recordsSynthetic data

Holm’s procedure

A step-down procedure controlling the family-wise error rate with more power than Bonferroni. in the pathway →

On the pathway · 02 · Model · Multiplicity control

How do you control multiple testing?

the overall ideaMultiplicity control
bound any false positiveFamily-wise error rate
bound false positives among rejectionsFalse-discovery rate
simple conservative FWER divisorBonferroni correction
stepwise FWER controlHolm’s procedure
step-up FDR controlBenjamini-Hochberg
test hypotheses in ordered familiesGatekeeping procedure
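
The step-down logic can be sketched as follows: compare the smallest p-value with alpha/m, the next with alpha/(m-1), and so on, stopping at the first failure. A Bonferroni comparison is included to show where the extra power comes from. The function name and p-values are hypothetical.

```python
def holm(p_values, alpha=0.05):
    """Return a list of booleans, True where Holm's procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for step, idx in enumerate(order):
        # The threshold relaxes as hypotheses are rejected.
        if p_values[idx] <= alpha / (m - step):
            rejected[idx] = True
        else:
            break  # first failure stops all further testing
    return rejected


# Hypothetical p-values for four endpoints.
p_values = [0.015, 0.030, 0.004, 0.20]
holm_rejected = holm(p_values)
bonferroni_rejected = [p <= 0.05 / len(p_values) for p in p_values]
```

Bonferroni holds every p-value to 0.0125 and rejects only the hypothesis with p = 0.004. Holm also rejects p = 0.015, because after the first rejection the threshold relaxes to 0.05/3.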

Homogeneity check

The test before pooling for whether stratum-specific estimates differ by more than noise, which would indicate effect modification. in the pathway →

On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)

Which stratified-analysis step are you at?

the overall approachStratified analysis
pooling across strataMantel-Haenszel estimator
combining stratum log effectsWoolf’s method
testing for effect modificationHomogeneity check

Hosmer-Lemeshow statistic

A goodness-of-fit test of whether a model’s predicted risks match observed event rates across groups. in the pathway →

On the pathway · 03 · Estimate · Calibration versus discrimination

Which aspect of predictive performance?

the overall ideaCalibration versus discrimination
ranking cases above non-casesDiscrimination
summarizing ranking across thresholdsAUC
predicted risks match observedCalibration
testing calibration formallyHosmer-Lemeshow statistic

Human-capital approach

Valuing lost productivity by counting all earnings foregone to illness. in the pathway →

On the pathway · 05 · Decision rule · Costing methods

How to value the resources used?

the overall ideaCosting methods
aggregate top-down unit costsGross costing
itemized bottom-up resource countsMicro-costing
value lost productivity over a lifetimeHuman-capital approach
value productivity loss until replacedFriction-cost approach

Hurdle model

A count model with a zero-versus-positive gate followed by a zero-truncated count model for the positive values. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

Hypothetical strategy

An intercurrent-event strategy targeting the outcome had the event not occurred. in the pathway →

On the pathway · 02 · Model · Trial estimands and intercurrent events

How do you handle intercurrent events?

the overall ideaTrial estimands and intercurrent events
events disrupting outcome interpretationIntercurrent events
ignore them, use assigned treatmentTreatment-policy strategy
imagine they did not occurHypothetical strategy
fold event into the outcomeComposite strategy
restrict to a defined subpopulationPrincipal-stratum strategy

I

I-squared

A statistic reporting the fraction of total variation across studies that is beyond chance, summarizing heterogeneity. in the pathway →
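
As a rough illustration of how this fraction is usually computed, the sketch below derives Cochran’s Q from inverse-variance weights and then applies the common formula I² = max(0, (Q − df) / Q), with df equal to the number of studies minus one. The study effects and variances are hypothetical, and the function names `cochran_q` and `i_squared` are made up for this example.

```python
def cochran_q(effects, variances):
    """Return Cochran's Q and the inverse-variance pooled effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, pooled


def i_squared(q, n_studies):
    """Share of total variation beyond chance, floored at zero."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


# Hypothetical log odds ratios and their variances from three studies.
effects = [0.10, 0.30, 0.50]
variances = [0.01, 0.01, 0.01]

q, pooled = cochran_q(effects, variances)
i2 = i_squared(q, len(effects))
print(f"Q = {q:.2f}, pooled = {pooled:.2f}, I-squared = {i2:.0%}")
```

With these made-up inputs Q is well above its degrees of freedom, so most of the observed spread is attributed to heterogeneity and not to chance.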

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test

ICD-10-CM diagnosis codes

Clinical modification of ICD-10 used to code diagnoses and conditions for morbidity reporting. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

ICD-10-PCS procedure codes

Procedure coding system for inpatient hospital procedures. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

ICER

Incremental cost-effectiveness ratio: the extra cost divided by the extra benefit of one option over the next. in the pathway → \[\text{ICER} = \frac{\Delta\text{cost}}{\Delta\text{effect}}, \quad \text{NMB} = \text{effect} \times \text{WTP} - \text{cost}\] where \(\text{ICER}\) is the incremental cost-effectiveness ratio of one option over the next; \(\Delta\text{cost}\) is the extra cost of the option; \(\Delta\text{effect}\) is the extra benefit of the option; \(\text{NMB}\) is the net monetary benefit, the same comparison made linear; \(\text{effect}\) is the health benefit gained; \(\text{WTP}\) is the willingness-to-pay threshold per unit of benefit; \(\text{cost}\) is the cost of the option.
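
The two formulas in this entry can be sketched in a few lines. The costs, effects, and willingness-to-pay threshold below are hypothetical numbers chosen for illustration, and the function names are not from the pathway.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Extra cost divided by extra effect of the new option over the old one."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the options have equal effect")
    return delta_cost / delta_effect


def net_monetary_benefit(effect, cost, wtp):
    """NMB = effect * WTP - cost, the same comparison on a linear scale."""
    return effect * wtp - cost


# Hypothetical options; effects are in QALYs, costs in currency units.
standard = {"cost": 20_000.0, "effect": 5.0}
new_drug = {"cost": 50_000.0, "effect": 6.0}
wtp = 50_000.0  # assumed willingness-to-pay threshold per QALY

ratio = icer(new_drug["cost"], new_drug["effect"],
             standard["cost"], standard["effect"])
nmb_standard = net_monetary_benefit(standard["effect"], standard["cost"], wtp)
nmb_new = net_monetary_benefit(new_drug["effect"], new_drug["cost"], wtp)

print(f"ICER = {ratio:,.0f} per QALY")
print(f"NMB standard = {nmb_standard:,.0f}, NMB new = {nmb_new:,.0f}")
```

When the new option is more effective, an ICER below the threshold and a higher net monetary benefit are the same decision expressed two ways, which is what the example's numbers show.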

On the pathway · 05 · Decision rule · Cost-effectiveness and the ICER

Which economic-evaluation framing fits?

the overall ideaCost-effectiveness and the ICER
extra cost per extra effectICER
value costs and benefits in moneyCost-benefit analysis
effects identical, compare costs onlyCost-minimization analysis
effects in quality-adjusted life yearsCost-utility analysis
value at a willingness thresholdNet monetary benefit
maximum payable per unit benefitWillingness-to-pay threshold

IDE

FDA investigational device exemption, usually needed before a device trial begins. in the pathway →

On the pathway · § · Conduct it · Regulatory pathways and registration

Which regulatory application?

the overall ideaRegulatory pathways and registration
investigational drug applicationIND
investigational device exemptionIDE

Identifying assumptions

The claim each causal design rests on that the data cannot verify, such as parallel trends or an exclusion restriction. in the pathway →

On the pathway · 02 · Model · Identifying assumptions

Which identifying assumption do you need?

the overall ideaIdentifying assumptions
treatment independent of confoundersConditional independence
instrument affects outcome only via exposureExclusion restriction
groups would have tracked togetherParallel trends

Immortal time

A stretch of follow-up during which the outcome could not yet have occurred, a source of bias that target-trial emulation surfaces. in the pathway →

On the pathway · 00 · Framing · Target-trial emulation

Which target-trial element?

the overall ideaTarget-trial emulation
misaligned follow-up start creating biasImmortal time

Immortal time bias

Mistakenly assigning follow-up during which the outcome could not occur to the treated group, manufacturing a survival advantage from bookkeeping. in the pathway →

On the pathway · 01 · Measurement · Immortal time bias

What situation creates this bias?

the overall ideaImmortal time bias

Incidence

The occurrence of new cases, measured as cumulative incidence over a fixed period or as an incidence rate per person-time. in the pathway →

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio

Incidence rate

New cases divided by the person-time at risk, which handles varying follow-up. in the pathway →
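
A minimal sketch of the calculation, using a hypothetical five-person cohort with unequal follow-up. The naive proportion is shown alongside only to illustrate why person-time matters when follow-up varies.

```python
def incidence_rate(n_events, person_time):
    """New cases per unit of person-time at risk."""
    if person_time <= 0:
        raise ValueError("person-time must be positive")
    return n_events / person_time


# Hypothetical cohort: years each person was followed while at risk,
# and whether follow-up ended with the event.
follow_up_years = [2.0, 5.0, 1.5, 5.0, 0.5]
had_event = [True, False, True, False, True]

cases = sum(had_event)
person_years = sum(follow_up_years)
rate = incidence_rate(cases, person_years)

# Simple proportion of people with the event, which ignores how long
# each person was actually observed.
naive_proportion = cases / len(had_event)

print(f"{cases} cases over {person_years} person-years")
print(f"rate = {rate * 100:.1f} per 100 person-years")
```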

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio

Incorporation bias

Bias arising when the index test is itself part of the reference standard it is judged against. in the pathway →

On the pathway · 05 · Decision rule · Diagnostic-accuracy studies

Which accuracy measure or pitfall?

the overall ideaDiagnostic-accuracy studies
how results shift disease oddsLikelihood ratios
index test informs reference standardIncorporation bias
only some get the reference standardVerification bias
unrepresentative case mixSpectrum bias

IND

FDA investigational new drug application, usually needed before a drug trial begins. in the pathway →

On the pathway · § · Conduct it · Regulatory pathways and registration

Which regulatory application?

the overall ideaRegulatory pathways and registration
investigational drug applicationIND
investigational device exemptionIDE

Index date

A single time zero at which eligibility, exposure assignment, and follow-up start are all aligned for each patient. in the pathway →

On the pathway · 01 · Measurement · Assembling the analytic cohort

Which cohort-construction step?

the overall ideaAssembling the analytic cohort
pull and reshape raw source dataExtract-transform-load
set the time-zero anchorIndex date
define pre-index covariate historyLookback window
restrict to treatment initiatorsNew-user design

Induction, latency, and lag windows

Time shifts that delay when exposure can plausibly cause an outcome, excluding implausibly early events. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)

Informative prior

A prior encoding real external knowledge, powerful when data are sparse. in the pathway →

On the pathway · 02 · Model · Choosing a prior

What kind of prior do you need?

the overall choiceChoosing a prior
prior matched to the likelihoodConjugate prior
strong external informationInformative prior
light regularizing informationWeakly-informative prior

Informed consent

The requirement that a participant understand the study, its risks, and their freedom to refuse or withdraw, with extra protection for vulnerable groups. in the pathway →

On the pathway · § · Conduct it · Research ethics and the IRB

Which ethics concept or body?

the overall ideaResearch ethics and the IRB
foundational ethical principlesBelmont principles
genuine uncertainty justifying a trialClinical equipoise
participant’s voluntary agreementInformed consent
body that reviews and approves studiesInstitutional review board

Institute for Clinical and Economic Review

A US body publishing value assessments that anchor drug-price negotiations without a binding cost-per-QALY rule. in the pathway →

On the pathway · 05 · Decision rule · Health technology assessment and value frameworks

Which HTA framework or body?

the overall ideaHealth technology assessment and value frameworks
UK appraisal agencyNICE
US value-assessment organizationInstitute for Clinical and Economic Review

Institutional review board

A body that reviews a study before it starts, weighs risks against benefits, and can halt a protocol or require changes to it. in the pathway →

On the pathway · § · Conduct it · Research ethics and the IRB

Which ethics concept or body?

the overall ideaResearch ethics and the IRB
foundational ethical principlesBelmont principles
genuine uncertainty justifying a trialClinical equipoise
participant’s voluntary agreementInformed consent
body that reviews and approves studiesInstitutional review board

Instrumental variables

A causal design using a variable that shifts exposure and affects the outcome only through it, resting on an exclusion restriction. in the pathway →

On the pathway · 02 · Model · Causal designs without randomization

Which quasi-experimental design fits?

the overall ideaCausal designs without randomization
before-after across exposed and controlDifference-in-differences
a haphazard nudge to exposureInstrumental variables
assignment by a cutoff thresholdRegression discontinuity
weighted donors build a counterfactualSynthetic control

Intention-to-treat

Analyzing every randomized patient in the arm assigned regardless of what they took, preserving randomization. in the pathway →

On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)

Which set of subjects do you analyze?

the overall ideaAnalysis populations
as randomized, regardless of adherenceIntention-to-treat
only those who followed protocolPer-protocol
grouped by treatment actually receivedAs-treated

Interaction term

A term capturing effect modification, letting an effect differ across subgroups instead of being averaged. in the pathway →

On the pathway · 02 · Model · Model modifications (splines, interactions)

How do you flex the model?

the overall ideaModel modifications
effect depends on another variableInteraction term
fixed exposure term in count modelsOffset
smooth nonlinear flexible curvesSplines
additive smooth function componentsGeneralized additive models

Intercurrent events

Things happening after randomization that complicate the outcome, such as stopping the drug, switching, rescue medication, or death. in the pathway →

On the pathway · 02 · Model · Trial estimands and intercurrent events

How do you handle intercurrent events?

the overall ideaTrial estimands and intercurrent events
events disrupting outcome interpretationIntercurrent events
ignore them, use assigned treatmentTreatment-policy strategy
imagine they did not occurHypothetical strategy
fold event into the outcomeComposite strategy
restrict to a defined subpopulationPrincipal-stratum strategy

Interim analyses and group-sequential design

Pre-specified looks at accumulating trial data, with alpha spent across the looks so that peeking does not inflate the false-positive rate. in the pathway →

On the pathway · 00 · Framing · Interim analyses and group-sequential design

Which interim-monitoring element?

the overall ideaInterim analyses and group-sequential design
committee reviewing accruing dataData safety monitoring board
planned looks with stopping rulesGroup-sequential design
false-positive risk to spendType-I error
stringent early-look boundaryO’Brien-Fleming boundary
constant nominal-level boundaryPocock boundary

Interviewer bias

Bias from a data collector’s knowledge of a subject’s status shaping what is recorded. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Intraclass correlation

A measure of reproducibility for a continuous measurement across raters or repeats. in the pathway →

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa

Inverse-probability-of-censoring weighting (IPCW)

Reweighting uncensored patients to stand in for similar censored ones, correcting the informative censoring that artificial censoring or dropout creates. in the pathway →

On the pathway · 02 · Model · Real-world causal-inference extensions

How do causal methods scale to claims and time?

the overall ideaReal-world causal-inference extensions
you have many candidate covariatesHigh-dimensional propensity score (hdPS)
outcome modeling suits the problemDisease risk score
treatment strategy unfolds over timeClone-censor-weight
censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
you need a simple guardLandmark analysis

IPTW

Inverse-probability-of-treatment weighting, which reweights each subject by the inverse of the probability of the treatment they actually received, taken from the propensity score, to balance measured confounders. in the pathway →
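
A small sketch of the weighting step, assuming propensity scores have already been estimated elsewhere. Treated subjects get weight 1/ps and untreated subjects get 1/(1 − ps); the data and the function names are hypothetical, and the weighted difference in means is only one simple way to use the weights.

```python
def iptw_weights(treated, propensity):
    """Inverse probability of the treatment each subject actually received."""
    weights = []
    for a, ps in zip(treated, propensity):
        if not 0.0 < ps < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / ps if a == 1 else 1.0 / (1.0 - ps))
    return weights


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical subjects: treatment indicator, estimated propensity, outcome.
treated = [1, 1, 0, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2, 0.2]
outcome = [3.0, 5.0, 4.0, 2.0, 2.0]

weights = iptw_weights(treated, propensity)

# Weighted outcome means within each arm, then their difference.
mean_treated = weighted_mean(
    [y for y, a in zip(outcome, treated) if a == 1],
    [w for w, a in zip(weights, treated) if a == 1],
)
mean_control = weighted_mean(
    [y for y, a in zip(outcome, treated) if a == 0],
    [w for w, a in zip(weights, treated) if a == 0],
)
effect = mean_treated - mean_control
print(weights, round(effect, 3))
```

Subjects who received a treatment that was unlikely for them get the largest weights, which is why propensity scores near 0 or 1 make the estimate unstable.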

On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)

How do you estimate the causal effect?

the overall ideaCausal estimators
model treatment assignment probabilityPropensity score
reweight by inverse treatment probabilityIPTW
model and average outcomesG-formula
combine outcome and treatment modelsDoubly-robust estimators
targeted machine-learning estimationTMLE

K

K-means

A clustering method partitioning data into k groups by minimizing within-cluster distance to the cluster mean. in the pathway →

On the pathway · 02 · Model · Unsupervised learning

What unlabeled-data structure are you finding?

the overall familyUnsupervised learning
grouping similar observationsClustering
nested grouping by linkageHierarchical clustering
partitioning into k groupsK-means
reducing the number of featuresDimensionality reduction
orthogonal variance componentsPrincipal component analysis

K-nearest neighbours

A predictor using the majority or average of the k closest cases, sensitive to scaling and dimensionality. in the pathway →

On the pathway · 02 · Model · Learning algorithms and ensembles

Which learner or ensemble fits?

the overall ideaLearning algorithms and ensembles
single interpretable splitsDecision tree (machine learning)
classify by closest neighborsK-nearest neighbours
maximum-margin separating boundarySupport vector machine
average parallel bootstrapped modelsBagging
many decorrelated bagged treesRandom forest
sequentially correct prior errorsBoosting

Kendall’s tau

A measure of concordance between two ordinal rankings, with tau-c for rectangular tables. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V

Kruskal-Wallis test

A rank-based alternative to one-way ANOVA when normality is doubtful. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V

Kurtosis

A summary of a distribution’s tail-heaviness, part of reading its shape. in the pathway →

On the pathway · 01 · Measurement · Characterizing the distribution

What shape feature?

the overall ideaCharacterizing the distribution
asymmetry of the distributionSkewness
heaviness of the tailsKurtosis
smoothing a nonlinear trendLOESS smoother

L

Landmark analysis

Classifying exposure status as of a fixed later time and analyzing from there, so early events are not misattributed to exposure. in the pathway →

On the pathway · 02 · Model · Real-world causal-inference extensions

How do causal methods scale to claims and time?

the overall ideaReal-world causal-inference extensions
you have many candidate covariatesHigh-dimensional propensity score (hdPS)
outcome modeling suits the problemDisease risk score
treatment strategy unfolds over timeClone-censor-weight
censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
you need a simple guardLandmark analysis

Lasso

L1 regularization that shrinks some coefficients exactly to zero and so also selects variables. in the pathway →

On the pathway · 02 · Model · Bias-variance and regularization

Which concept or penalty?

the overall ideaBias-variance and regularization
the underlying error tradeoffBias-variance tradeoff
fitting noise, poor generalizationOverfitting
penalizing complexity broadlyRegularization
estimating out-of-sample errorCross-validation
shrink coefficients, keep allRidge regression
shrink and select variablesLasso
blend selection and shrinkageElastic net

LATE

The local average treatment effect, the contrast of potential outcomes among compliers. in the pathway →

On the pathway · 02 · Model · Choosing the estimand

Whose causal effect do you target?

the overall ideaChoosing the estimand
the formal target quantityEstimand
effect across the whole populationATE
effect among the treatedATT
effect among compliers onlyLATE

Lead-time bias

The apparent survival gain from diagnosing earlier without changing the disease course. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Leading question

A survey item whose wording presses the respondent toward a particular answer. in the pathway →

On the pathway · 01 · Measurement · Questionnaire and instrument design

Which questionnaire flaw is in play?

the overall craftQuestionnaire and instrument design
asking two things at onceDouble-barreled question
wording that steers the answerLeading question

Learning algorithms and ensembles

The supervised toolkit beyond regression, including k-nearest neighbours, support vector machines, decision trees, and ensembles. in the pathway →

On the pathway · 02 · Model · Learning algorithms and ensembles

Which learner or ensemble fits?

the overall ideaLearning algorithms and ensembles
single interpretable splitsDecision tree (machine learning)
classify by closest neighborsK-nearest neighbours
maximum-margin separating boundarySupport vector machine
average parallel bootstrapped modelsBagging
many decorrelated bagged treesRandom forest
sequentially correct prior errorsBoosting

Leave-one-out and specification curves

Re-estimating after dropping a single unit, or across many defensible modeling choices, to expose whether a finding rests on one unit or holds broadly. in the pathway →

On the pathway · ∗ · Defend it · Leave-one-out and specification curves

How are you probing specification robustness?

the overall ideaLeave-one-out and specification curves
results across many model choicesSpecification-curve analysis

Length-time bias

The over-representation of slow, indolent cases that screening preferentially catches. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Likelihood ratios

Summaries of a diagnostic table independent of prevalence that update pre-test odds to post-test odds directly. in the pathway → \[\text{LR}+ = \frac{\text{sens}}{1 - \text{spec}}, \quad \text{LR}- = \frac{1 - \text{sens}}{\text{spec}}, \quad \text{post-test odds} = \text{pre-test odds} \times \text{LR}\] where \(\text{LR}+\) is the positive likelihood ratio, how much a positive result raises the odds; \(\text{LR}-\) is the negative likelihood ratio, how much a negative result lowers the odds; \(\text{sens}\) is the sensitivity of the test; \(\text{spec}\) is the specificity of the test; \(\text{pre-test odds}\) is the odds of disease before the test, from prevalence; \(\text{post-test odds}\) is the odds of disease after the test result.
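
The odds update in this entry can be worked through numerically. The sensitivity, specificity, and pre-test probability below are hypothetical, and the helper names are chosen for this sketch only.

```python
def likelihood_ratios(sens, spec):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sens / (1.0 - spec)
    lr_neg = (1.0 - sens) / spec
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob, lr):
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# Hypothetical test and a 10% pre-test probability of disease.
lr_pos, lr_neg = likelihood_ratios(sens=0.90, spec=0.80)
p_after_positive = post_test_probability(0.10, lr_pos)
p_after_negative = post_test_probability(0.10, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
print(f"after a positive result: {p_after_positive:.1%}")
print(f"after a negative result: {p_after_negative:.1%}")
```

The likelihood ratios depend only on sensitivity and specificity, so the same pair can be reapplied to a different pre-test probability without recomputing them.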

On the pathway · 05 · Decision rule · Diagnostic-accuracy studies

Which accuracy measure or pitfall?

the overall ideaDiagnostic-accuracy studies
how results shift disease oddsLikelihood ratios
index test informs reference standardIncorporation bias
only some get the reference standardVerification bias
unrepresentative case mixSpectrum bias

Linear combinations and contrasts

A weighted sum of regression coefficients reported as the quantity of interest, with a standard error drawn from the variance-covariance matrix. in the pathway →
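
As a sketch of the arithmetic, the estimate is the weighted sum of coefficients and its variance is the quadratic form of the weights with the variance-covariance matrix. The coefficients and matrix below are hypothetical, standing in for output from a fitted model with a main effect and an interaction.

```python
import math


def linear_combination(weights, coefs, vcov):
    """Return the weighted sum of coefficients and its standard error."""
    estimate = sum(w * b for w, b in zip(weights, coefs))
    variance = sum(
        weights[i] * vcov[i][j] * weights[j]
        for i in range(len(weights))
        for j in range(len(weights))
    )
    return estimate, math.sqrt(variance)


# Hypothetical fitted values: main effect and interaction coefficient,
# with their variance-covariance matrix.
coefs = [0.50, 0.20]
vcov = [
    [0.04, -0.01],
    [-0.01, 0.09],
]

# Effect in the subgroup where the interaction applies: main + interaction.
estimate, se = linear_combination([1.0, 1.0], coefs, vcov)
print(f"estimate = {estimate:.2f}, SE = {se:.3f}")
```

The off-diagonal covariance matters here: adding the two variances alone would give 0.13, whereas the correct variance of the sum is 0.11.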

On the pathway · 02 · Model · Linear combinations and contrasts

Which comparison of model terms?

the overall ideaLinear combinations and contrasts
a specific weighted group comparisonContrast

Linear regression

A regression for continuous outcomes, returning a mean difference. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

LOESS smoother

A smoother drawn on a scatter to reveal the shape of a relationship before assuming it is linear. in the pathway →

On the pathway · 01 · Measurement · Characterizing the distribution

What shape feature?

the overall ideaCharacterizing the distribution
asymmetry of the distributionSkewness
heaviness of the tailsKurtosis
smoothing a nonlinear trendLOESS smoother

Logistic regression

A regression for binary outcomes, returning an odds ratio. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

LOINC lab codes

Standard vocabulary for identifying laboratory tests and clinical observations. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

Lookback window

The pre-index period in which confounders are measured so adjustment targets baseline causes, not post-exposure variables. in the pathway →

On the pathway · 01 · Measurement · Assembling the analytic cohort

Which cohort-construction step?

the overall ideaAssembling the analytic cohort
pull and reshape raw source dataExtract-transform-load
set the time-zero anchorIndex date
define pre-index covariate historyLookback window
restrict to treatment initiatorsNew-user design

M

MAD

The median absolute deviation, rescaled by 1.4826 so that it estimates the standard deviation when the data are normal. in the pathway →
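
A short sketch using only the Python standard library shows the rescaled MAD next to the ordinary standard deviation, and the robust z-score built from it. The data are hypothetical, with one deliberate outlier.

```python
import statistics


def scaled_mad(values, scale=1.4826):
    """Median absolute deviation from the median, rescaled for normal data."""
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    return scale * statistics.median(abs_dev)


def robust_z(values):
    """Standardize with the median and scaled MAD in place of mean and SD."""
    med = statistics.median(values)
    spread = scaled_mad(values)
    if spread == 0:
        raise ValueError("scaled MAD is zero; robust z-scores are undefined")
    return [(v - med) / spread for v in values]


# Hypothetical measurements with one extreme value.
data = [2.0, 3.0, 3.0, 4.0, 5.0, 40.0]

mad = scaled_mad(data)
sd = statistics.stdev(data)
z = robust_z(data)
print(f"scaled MAD = {mad:.4f}, SD = {sd:.2f}, robust z of 40 = {z[-1]:.1f}")
```

The single outlier inflates the standard deviation roughly tenfold while leaving the scaled MAD near the spread of the other five values, which is the sense in which the measure is robust.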

On the pathway · 02 · Model · Robust statistics for heavy tails

Which robust measure?

the overall ideaRobust statistics for heavy tails
robust spread of the dataMAD
robust outlier-resistant standardizationRobust z-score

Mann-Whitney test

A rank-based alternative to the two-group t-test when normality is doubtful, also called the Wilcoxon rank-sum. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V

Mantel-Haenszel estimator

A method for pooling stratum-specific odds ratios, risk ratios, or rate ratios into one. in the pathway →
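
For the odds-ratio case, the pooled estimate is the sum over strata of a·d/n divided by the sum of b·c/n. The sketch below assumes each stratum is a 2×2 table given as (a, b, c, d) = (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases); the counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across 2x2 tables given as (a, b, c, d) tuples."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator


# Hypothetical strata, for example two age bands.
strata = [
    (10, 20, 5, 40),   # stratum-specific OR = (10*40)/(20*5) = 4.0
    (30, 15, 20, 35),  # stratum-specific OR = (30*35)/(15*20) = 3.5
]

or_mh = mantel_haenszel_or(strata)
print(f"Mantel-Haenszel OR = {or_mh:.3f}")
```

The pooled value falls between the two stratum-specific odds ratios, weighted toward the larger stratum.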

On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)

Which stratified-analysis step are you at?

the overall approachStratified analysis
pooling across strataMantel-Haenszel estimator
combining stratum log effectsWoolf’s method
testing for effect modificationHomogeneity check

MAR

Missing at random: missingness that depends only on observed data, handled by multiple imputation conditional on those data. in the pathway →

On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNAR

Why are values missing?

the overall ideaMissing data
missingness unrelated to anythingMCAR
missingness explained by observed dataMAR
missingness depends on unseen valuesMNAR
fill gaps and pool estimatesMultiple imputation

Marginal structural model

A g-method fitted by inverse-probability-of-treatment weighting to handle time-varying confounding. in the pathway →

On the pathway · 02 · Model · Time-varying confounding and g-methods

How do you handle time-varying confounding?

the overall ideaTime-varying confounding
confounder both affects and respondsTreatment-confounder feedback
weight to remove time-varying confoundingMarginal structural model
model effect directly through timeG-estimation

Markov model

A state-transition model moving a cohort between health states each cycle by a transition matrix, the standard tool for chronic disease. in the pathway → \[p = 1 - \exp(-r \cdot t)\] where \(p\) is the per-cycle transition probability; \(r\) is the rate reported in the published evidence; \(t\) is the cycle length over which the probability applies.
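
The rate-to-probability conversion in this entry can be checked numerically. The sketch assumes a constant rate over the cycle, and the 0.10 annual rate is a hypothetical value.

```python
import math


def rate_to_probability(rate, cycle_length):
    """p = 1 - exp(-r * t), assuming the rate is constant over the cycle."""
    return 1.0 - math.exp(-rate * cycle_length)


annual_rate = 0.10  # hypothetical published rate per person-year

p_year = rate_to_probability(annual_rate, 1.0)
p_month = rate_to_probability(annual_rate, 1.0 / 12.0)

# Twelve monthly cycles should compound back to the annual probability.
p_year_from_months = 1.0 - (1.0 - p_month) ** 12

print(f"annual p = {p_year:.4f}, monthly p = {p_month:.5f}")
print(f"annual p rebuilt from monthly cycles = {p_year_from_months:.4f}")
```

Dividing the annual probability by twelve would not compound back correctly, which is why the conversion goes through the rate when the cycle length changes.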

On the pathway · 05 · Decision rule · Decision-analytic models

Which model structure fits the problem?

the overall ideaDecision-analytic models
branching one-time event sequenceDecision tree (decision analysis)
recurring health states over cyclesMarkov model
survival curves partition statesPartitioned survival model
infection spread depends on prevalenceDynamic transmission model
weight future values lowerDiscounting

Maximum tolerated dose

The highest tolerable dose, the target a phase I dose-finding study estimates. in the pathway →

On the pathway · 00 · Framing · Dose-finding and early-phase designs

Which early-phase design question are you facing?

the overall design familyDose-finding and early-phase designs
dose escalation by fixed cohort rule3+3 design
model-based dose escalationContinual reassessment method
the highest acceptably safe doseMaximum tolerated dose
phase II screening for efficacySimon’s two-stage design

MCAR

Missing completely at random: a benign mechanism where missingness is unrelated to any data. in the pathway →

On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNAR

Why are values missing?

the overall ideaMissing data
missingness unrelated to anythingMCAR
missingness explained by observed dataMAR
missingness depends on unseen valuesMNAR
fill gaps and pool estimatesMultiple imputation

McFadden’s pseudo-R-squared

A rough stand-in for R-squared in generalized linear models, where a true R-squared does not apply. in the pathway →

On the pathway · 03 · Estimate · Model fit, comparison, and prediction error

Which fit or error measure do you need?

the overall ideaModel fit, comparison, and prediction error
variance explained, linear modelR-squared
pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
compare models, penalize parametersAIC
compare models, penalize more heavilyBIC
prediction error, average magnitudeMean absolute error
prediction error, penalize large missesRMSE
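
For McFadden’s pseudo-R-squared, the usual definition compares the fitted model's log-likelihood with that of an intercept-only model. The symbols and the numbers in the worked line are illustrative assumptions, not taken from the pathway. \[R^2_{\text{McF}} = 1 - \frac{\ln L_{\text{model}}}{\ln L_{\text{null}}}\] where \(L_{\text{model}}\) is the likelihood of the fitted model; \(L_{\text{null}}\) is the likelihood of the intercept-only model. With hypothetical values \(\ln L_{\text{null}} = -100\) and \(\ln L_{\text{model}} = -80\), this gives \(1 - (-80)/(-100) = 0.20\).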

MCMC

Markov chain Monte Carlo: drawing a dependent sequence of samples whose long-run distribution is the posterior. in the pathway →

On the pathway · 02 · Model · Bayesian computation (MCMC)

Which sampling or diagnostic tool?

the overall ideaBayesian computation
sampling the posterior generallyMCMC
proposal-and-accept samplerMetropolis-Hastings
sample each parameter conditionallyGibbs sampling
gradient-guided efficient samplerHamiltonian Monte Carlo
check chains have convergedR-hat
check model reproduces the dataPosterior predictive check

Mean absolute error

A prediction error measure in the outcome’s units, used when a few large errors should not dominate. in the pathway →

On the pathway · 03 · Estimate · Model fit, comparison, and prediction error

Which fit or error measure do you need?

the overall ideaModel fit, comparison, and prediction error
variance explained, linear modelR-squared
pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
compare models, penalize parametersAIC
compare models, penalize more heavilyBIC
prediction error, average magnitudeMean absolute error
prediction error, penalize large missesRMSE
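
The contrast between mean absolute error and RMSE is easiest to see on a small example with one large miss. The data below are hypothetical and the function names are illustrative.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average size of the errors, in the outcome's units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; large misses weigh more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical data: three errors of 1 and one error of 10.
actual = [10.0, 12.0, 9.0, 11.0]
predicted = [11.0, 11.0, 10.0, 21.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```

Here the single large miss pulls RMSE well above the mean absolute error, which is the behavior the two entries describe.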

Measurement error and misclassification

Imprecision in measuring a variable, whose effect on an estimate depends on whether the error relates to the outcome. in the pathway →

On the pathway · 01 · Measurement · Measurement error and misclassification

What kind of measurement error?

the overall ideaMeasurement error and misclassification
error unrelated to other variablesNon-differential misclassification
error differing by groupDifferential misclassification
true variance over observed varianceReliability ratio

Measurement-method effects

Two devices or protocols measuring the same quantity can disagree systematically, so a threshold validated under one does not transfer. in the pathway →

On the pathway · 01 · Measurement · Measurement-method effects

What situation is this?

the overall ideaMeasurement-method effects

Measures of disease frequency

The standard forms for counting how often disease occurs, including prevalence, incidence, and rates. in the pathway →

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio

Mediation analysis

Splitting a total effect into a direct effect and an indirect effect running through a mediator. in the pathway → \[\text{total} = \text{direct} + \text{indirect}, \quad \text{indirect} = a \cdot b\] where \(\text{total}\) is the total effect of the exposure on the outcome; \(\text{direct}\) is the effect not running through the mediator; \(\text{indirect}\) is the effect running through the mediator; \(a\) is the exposure-to-mediator coefficient; \(b\) is the mediator-to-outcome coefficient.

On the pathway · 02 · Model · Mediation analysis

Which mediation concept is in play?

the overall methodMediation analysis
decomposing effect through a mediatorNatural direct and indirect effects
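
The product-of-coefficients decomposition in the Mediation analysis entry can be checked numerically. The sketch below assumes linear models fitted by ordinary least squares with no exposure-mediator interaction, which is the setting where total = direct + a·b holds exactly in-sample. The data and helper names are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sum of cross-products around the means (unnormalized covariance)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def simple_slope(x, y):
    """OLS slope of y on x, with an intercept."""
    return cov(x, y) / cov(x, x)

def two_predictor_slopes(x, m, y):
    """OLS slopes of y on x and m, with an intercept, via 2x2 normal equations."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    slope_x = (sxy * smm - smy * sxm) / det
    slope_m = (smy * sxx - sxy * sxm) / det
    return slope_x, slope_m

# Hypothetical data: binary exposure x, mediator m, outcome y.
x = [0, 0, 0, 0, 1, 1, 1, 1]
m = [1, 2, 1, 3, 3, 4, 2, 5]
y = [2, 4, 3, 5, 6, 8, 5, 9]

total = simple_slope(x, y)                  # exposure -> outcome, unadjusted
a = simple_slope(x, m)                      # exposure -> mediator
direct, b = two_predictor_slopes(x, m, y)   # exposure and mediator -> outcome
indirect = a * b
```

With nonlinear models or an exposure-mediator interaction, this simple identity no longer holds and the counterfactual definitions under Natural direct and indirect effects are the relevant ones.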

Mediator

A variable on the causal path from exposure to outcome, left alone when the total effect is the target. in the pathway →

On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworks

Which causal-diagram concept?

the overall frameworkCausal diagrams
the graph notation itselfDAG
common cause of exposure and outcomeConfounder
variable on the causal pathMediator
common effect, conditioning opens biasCollider
rule for sufficient adjustment setsBack-door criterion

Medication possession ratio (MPR)

Total days supplied divided by days in the observation interval, an adherence measure that can exceed one when fills overlap. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)
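
A medication possession ratio can be computed directly from fill records. The sketch below makes two assumptions that vary between studies: the interval is counted inclusively, and supply is summed without truncating days that run past the interval's end. The fill dates are hypothetical.

```python
from datetime import date

def medication_possession_ratio(fills, start, end):
    """fills: list of (fill_date, days_supply) tuples.

    Sums days supplied for fills dated within [start, end] and divides by
    the number of days in that interval (inclusive). Overlapping supply is
    not removed, so the ratio can exceed one.
    """
    interval_days = (end - start).days + 1
    total_supply = sum(days for fill_date, days in fills
                       if start <= fill_date <= end)
    return total_supply / interval_days

# Hypothetical fills: four 30-day supplies, refilled early each time.
fills = [
    (date(2023, 1, 1), 30),
    (date(2023, 1, 25), 30),
    (date(2023, 2, 20), 30),
    (date(2023, 3, 15), 30),
]
mpr = medication_possession_ratio(fills, date(2023, 1, 1), date(2023, 3, 31))
```

In this example 120 days of supply fall in a 90-day interval, so the ratio is above one. Proportion of days covered avoids this by counting each calendar day at most once.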

Meta-analysis and pooling

Combining studies into one estimate using inverse-variance weighting, which sharpens an estimate only when the studies are estimating the same thing. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test
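
Inverse-variance weighting, as named in the Meta-analysis and pooling entry, can be sketched for the fixed-effect case as follows. The study estimates and standard errors are hypothetical log risk ratios.

```python
import math

def fixed_effect_pool(estimates, std_errors):
    """Pool study estimates with weights 1 / SE^2 (fixed-effect model)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log risk ratios and their standard errors.
log_rr = [-0.20, -0.10, -0.30]
se = [0.10, 0.20, 0.10]

pooled, pooled_se = fixed_effect_pool(log_rr, se)
```

The less precise middle study gets a quarter of the weight of each of the others, and the pooled standard error is smaller than any single study's. That gain in precision is only meaningful if the studies estimate the same effect.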

Meta-regression

A technique that tries to explain heterogeneity across studies using study-level covariates. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test

Metropolis-Hastings

A classic MCMC algorithm that proposes a move and accepts or rejects it with a probability chosen so the chain's long-run distribution is the posterior. in the pathway →

On the pathway · 02 · Model · Bayesian computation (MCMC)

Which sampling or diagnostic tool?

the overall ideaBayesian computation
sampling the posterior generallyMCMC
proposal-and-accept samplerMetropolis-Hastings
sample each parameter conditionallyGibbs sampling
gradient-guided efficient samplerHamiltonian Monte Carlo
check chains have convergedR-hat
check model reproduces the dataPosterior predictive check
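
A proposal-and-accept sampler can be written in a few lines. The sketch below is the random-walk special case, where the proposal is symmetric and the Metropolis-Hastings acceptance ratio reduces to a ratio of target densities. The target (a standard normal, up to a constant), step size, seed, and burn-in length are assumptions for illustration.

```python
import math
import random

def metropolis_hastings(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centered on the current value.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)  # a rejected proposal repeats the current value
    return samples

def log_standard_normal(x):
    """Log density of a standard normal, dropping the constant."""
    return -0.5 * x * x

samples = metropolis_hastings(log_standard_normal, start=0.0, n_samples=20000)
kept = samples[2000:]  # discard an assumed burn-in
sample_mean = sum(kept) / len(kept)
sample_var = sum((s - sample_mean) ** 2 for s in kept) / len(kept)
```

The draws are dependent, so diagnostics such as R-hat across several chains are still needed before trusting the summaries.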

Micro-costing

Bottom-up costing that counts each resource used and multiplies it by its unit price. in the pathway →

On the pathway · 05 · Decision rule · Costing methods

How to value the resources used?

the overall ideaCosting methods
aggregate top-down unit costsGross costing
itemized bottom-up resource countsMicro-costing
value lost productivity over a lifetimeHuman-capital approach
value productivity loss until replacedFriction-cost approach

Minimization

An adaptive assignment that places each patient to keep arms balanced across several factors at once. in the pathway →

On the pathway · 00 · Framing · Randomization and blinding

What allocation or masking concern?

the overall ideaRandomization and blinding
hide the upcoming assignmentAllocation concealment
balance arms in small chunksBlock randomization
balance within prognostic strataStratified randomization
dynamically balance many factorsMinimization
mask treatment after allocationBlinding

Missing data

Why a value is missing decides what can be done about it, across the MCAR, MAR, and MNAR mechanisms. in the pathway →

On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNAR

Why are values missing?

the overall ideaMissing data
missingness unrelated to anythingMCAR
missingness explained by observed dataMAR
missingness depends on unseen valuesMNAR
fill gaps and pool estimatesMultiple imputation

MMRM

The mixed model for repeated measures, standard for a longitudinal trial endpoint, using all timepoints and handling dropout under missing-at-random. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

MNAR

Missing not at random: missingness depending on the unseen value itself, needing pattern-mixture or tipping-point sensitivity approaches. in the pathway →

On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNAR

Why are values missing?

the overall ideaMissing data
missingness unrelated to anythingMCAR
missingness explained by observed dataMAR
missingness depends on unseen valuesMNAR
fill gaps and pool estimatesMultiple imputation

Model fit, comparison, and prediction error

The continuous-outcome counterpart to calibration and discrimination, covering variance explained, model comparison, and honest out-of-sample error. in the pathway →

On the pathway · 03 · Estimate · Model fit, comparison, and prediction error

Which fit or error measure do you need?

the overall ideaModel fit, comparison, and prediction error
variance explained, linear modelR-squared
pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
compare models, penalize parametersAIC
compare models, penalize more heavilyBIC
prediction error, average magnitudeMean absolute error
prediction error, penalize large missesRMSE

Model modifications

Standard adaptations to a base regression, including splines, interactions, transformations, and offsets, each answering a specific signal. in the pathway →

On the pathway · 02 · Model · Model modifications (splines, interactions)

How do you flex the model?

the overall ideaModel modifications
effect depends on another variableInteraction term
fixed exposure term in count modelsOffset
smooth nonlinear flexible curvesSplines
additive smooth function componentsGeneralized additive models

Model validation and calibration

Checks that build trust in a model: verification that it is coded correctly and validation across face, internal, external, and predictive layers that it represents reality. in the pathway →

On the pathway · 05 · Decision rule · Model validation and calibration

What situation?

the overall ideaModel validation and calibration
tuning model outputs to realityCalibration (modeling)
confirming the model runs correctlyVerification

Monte Carlo simulation

Generating data under a known process and running the planned analysis over many replicates to study an estimator’s bias, coverage, and required sample size. in the pathway →

On the pathway · ∗ · Defend it · Monte Carlo simulation

What situation?

the overall ideaMonte Carlo simulation
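
The loop described in the Monte Carlo simulation entry (generate data under a known process, run the analysis, repeat) can be sketched for a very simple case: estimating a normal mean with known standard deviation and checking the bias of the estimate and the coverage of its 95% interval. All parameter values and the seed are assumptions for illustration.

```python
import math
import random

def simulate_coverage(true_mean, sigma, n, n_reps, seed=0):
    """Return (average bias, interval coverage) over n_reps simulated datasets."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)  # known-sigma 95% interval
    covered = 0
    total_error = 0.0
    for _ in range(n_reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        estimate = sum(sample) / n
        total_error += estimate - true_mean
        if estimate - half_width <= true_mean <= estimate + half_width:
            covered += 1
    return total_error / n_reps, covered / n_reps

bias, coverage = simulate_coverage(true_mean=5.0, sigma=2.0, n=30, n_reps=2000)
```

In a real study the data-generating step would mimic the planned design, and the same loop run over a grid of sample sizes gives an empirical power or precision curve.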

Multi-criteria decision analysis (MCDA)

Explicitly weighting criteria such as equity and severity when a single ratio cannot capture value. in the pathway →

On the pathway · 05 · Decision rule · Real-world cost and HTA methods

Modeling skewed real-world costs for HTA?

valuing real-world costsReal-world cost and HTA methods
costs are zero-inflatedTwo-part and other cost models
extreme cost outliers existWinsorization and trimming of cost outliers
summarizing population spendPer-member-per-month costing (PMPM/PPPM)
trial ends before lifetimeSurvival extrapolation for HTA
value has multiple dimensionsMulti-criteria decision analysis (MCDA)
prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)

Multiple imputation

Filling in missing values several times conditional on observed data, analyzing each completed dataset, and pooling the estimates, valid when data are missing at random. in the pathway →

On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNAR

Why are values missing?

the overall ideaMissing data
missingness unrelated to anythingMCAR
missingness explained by observed dataMAR
missingness depends on unseen valuesMNAR
fill gaps and pool estimatesMultiple imputation

Multiplicity control

Methods to rein in false positives when many hypotheses are tested, via family-wise error or false-discovery control. in the pathway →

On the pathway · 02 · Model · Multiplicity control

How do you control multiple testing?

the overall ideaMultiplicity control
bound any false positiveFamily-wise error rate
bound false positives among rejectionsFalse-discovery rate
simple conservative FWER divisorBonferroni correction
stepwise FWER controlHolm’s procedure
step-up FDR controlBenjamini-Hochberg
test hypotheses in ordered familiesGatekeeping procedure
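
The three correction procedures listed above differ only in the thresholds they apply to the sorted p-values. The sketch below implements Bonferroni, Holm, and Benjamini-Hochberg on a hypothetical set of five p-values; function names are illustrative.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject when p <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values, alpha=0.05):
    """Step-down FWER control: compare the k-th smallest p to alpha / (m - k + 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):  # rank starts at 0
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject

def benjamini_hochberg(p_values, q=0.05):
    """Step-up FDR control: reject all up to the largest k with p_(k) <= k * q / m."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            reject[i] = True
    return reject

# Hypothetical p-values from five tests.
p_values = [0.001, 0.012, 0.02, 0.039, 0.30]
by_bonferroni = bonferroni(p_values)
by_holm = holm(p_values)
by_bh = benjamini_hochberg(p_values)
```

On these values Bonferroni rejects one hypothesis, Holm two, and Benjamini-Hochberg four, which reflects the looser guarantee of false-discovery control compared with family-wise control.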

Multistage sampling

Nesting sampling stages: sampling primary sampling units, then units within them, often with probability proportional to size. in the pathway →

On the pathway · 01 · Measurement · Survey sampling design

How do you draw the sample?

the overall ideaSurvey sampling design
every unit known nonzero chanceProbability sample
equal-chance draw from frameSimple random sampling
sample within population strataStratified sampling
sample whole groups togetherCluster sampling
sample in successive nested stagesMultistage sampling
variance inflation from clusteringDesign effect

N

Natural direct and indirect effects

The counterfactual framing of mediation, needing no unmeasured confounding of the mediator-outcome relationship. in the pathway →

On the pathway · 02 · Model · Mediation analysis

Which mediation concept is in play?

the overall methodMediation analysis
decomposing effect through a mediatorNatural direct and indirect effects

NDC (National Drug Code)

Identifier encoding drug manufacturer, product, and package, requiring mapping to reach the ingredient level. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

Negative binomial distribution

A distribution for overdispersed counts whose variance exceeds the mean. in the pathway →

On the pathway · 01 · Measurement · Probability distributions and the CLT

Which distribution or sampling result?

the overall ideaProbability distributions
continuous bell-shaped variableNormal distribution
fixed-trial success countsBinomial distribution
counts of rare eventsPoisson distribution
overdispersed count dataNegative binomial distribution
why sample means turn normalCentral limit theorem
spread of a sample estimateStandard error

Negative binomial regression

A count regression used when overdispersion makes the variance exceed the mean. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

Negative control exposure

An exposure sharing the real exposure’s confounding structure but with no plausible causal link to the outcome. in the pathway →

On the pathway · ∗ · Defend it · Negative controls and empirical calibration

Need to detect hidden residual confounding?

probing residual confoundingNegative controls and calibration
checking confounding on exposureNegative control exposure
checking confounding on outcomeNegative control outcome
many negative controls availableEmpirical calibration
quantifying systematic errorQuantitative bias analysis

Negative control outcome

An outcome sharing the real outcome’s confounding structure but that exposure cannot plausibly cause. in the pathway →

On the pathway · ∗ · Defend it · Negative controls and empirical calibration

Need to detect hidden residual confounding?

probing residual confoundingNegative controls and calibration
checking confounding on exposureNegative control exposure
checking confounding on outcomeNegative control outcome
many negative controls availableEmpirical calibration
quantifying systematic errorQuantitative bias analysis

Negative controls and calibration

Using outcomes or exposures with known null effects to detect and correct residual confounding in real analyses. in the pathway →

On the pathway · ∗ · Defend it · Negative controls and empirical calibration

Need to detect hidden residual confounding?

probing residual confoundingNegative controls and calibration
checking confounding on exposureNegative control exposure
checking confounding on outcomeNegative control outcome
many negative controls availableEmpirical calibration
quantifying systematic errorQuantitative bias analysis

Nested case-control

Case-control study inside a defined cohort, sampling controls at the time each case occurs to preserve risk-set comparability. in the pathway →

On the pathway · 00 · Framing · Observational study designs

Which observational design fits the question and dominant bias?

the overall familyObservational study designs
exposure known, follow forwardCohort study
snapshot at one timeCross-sectional study
rare outcome, look backCase-control study
controls sampled within cohortNested case-control
random subcohort, multiple outcomesCase-cohort design
transient trigger, acute eventCase-crossover design
within-person rate comparisonSelf-controlled case series (SCCS)
corrects exposure time trendsCase-time-control design
initiators, active comparatorActive-comparator new-user design
standing source populationDisease registry

Net benefit

A metric weighing true positives against false positives at a threshold probability, going beyond accuracy by accounting for the consequences of acting. in the pathway →

On the pathway · 05 · Decision rule · Decision-curve analysis

Which clinical-utility concept is in play?

the overall methodDecision-curve analysis
utility weighted by thresholdNet benefit
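
Net benefit at a threshold probability is conventionally computed as the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold. The sketch below uses hypothetical counts for a model and for a treat-all strategy in a cohort of 1,000 with 10% prevalence.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit = TP/n - (FP/n) * threshold / (1 - threshold)."""
    weight = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * weight

# Hypothetical cohort: 1000 patients, 100 with the outcome.
# Model at a 10% threshold flags 280 patients: 80 true, 200 false positives.
nb_model = net_benefit(true_pos=80, false_pos=200, n=1000, threshold=0.10)

# Treat-all flags everyone: 100 true positives, 900 false positives.
nb_treat_all = net_benefit(true_pos=100, false_pos=900, n=1000, threshold=0.10)
```

Treat-all has a net benefit of zero when the threshold equals the prevalence, so a positive value for the model at that threshold indicates it adds value over treating everyone or no one. Repeating the calculation across thresholds gives the decision curve.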

Net monetary benefit

A restatement of a cost-effectiveness comparison as effect times willingness-to-pay minus cost, avoiding the awkwardness of ratios and handling dominance. in the pathway →

On the pathway · 05 · Decision rule · Cost-effectiveness and the ICER

Which economic-evaluation framing fits?

the overall ideaCost-effectiveness and the ICER
extra cost per extra effectICER
value costs and benefits in moneyCost-benefit analysis
effects identical, compare costs onlyCost-minimization analysis
effects in quality-adjusted life yearsCost-utility analysis
value at a willingness thresholdNet monetary benefit
maximum payable per unit benefitWillingness-to-pay threshold
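
The definition of net monetary benefit can be written out in the incremental form used when comparing two options. The symbols and the worked numbers are illustrative assumptions. \[\Delta\text{NMB} = \lambda \cdot \Delta E - \Delta C\] where \(\lambda\) is the willingness-to-pay threshold per unit of effect; \(\Delta E\) is the difference in effect between the options; \(\Delta C\) is the difference in cost. With hypothetical values \(\lambda = 50{,}000\), \(\Delta E = 0.5\) QALYs, and \(\Delta C = 20{,}000\), this gives \(\Delta\text{NMB} = 25{,}000 - 20{,}000 = 5{,}000\), which is positive. When \(\Delta E > 0\), that is the same conclusion as an ICER of \(20{,}000 / 0.5 = 40{,}000\) falling below the threshold.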

Net-benefit regression

Converting each patient’s cost and effect into one net-benefit outcome at a willingness-to-pay threshold and regressing it on treatment arm, giving covariate adjustment for free. in the pathway →

On the pathway · 05 · Decision rule · Cost-effectiveness alongside a trial

What situation?

the overall ideaCost-effectiveness alongside a trial
regressing net benefit on covariatesNet-benefit regression

Network meta-analysis

Combining a whole network of trials to estimate every pairwise treatment contrast and rank options, even when no trial compared them all directly. in the pathway →

On the pathway · 04 · Synthesis · Network meta-analysis

Which network meta-analysis concern?

the overall ideaNetwork meta-analysis
comparability across the networkTransitivity
direct versus indirect agreementNode-splitting
rank treatments overallSUCRA

New-user design

A cohort design applying a washout window so prevalent users do not contaminate the comparison. in the pathway →

On the pathway · 01 · Measurement · Assembling the analytic cohort

Which cohort-construction step?

the overall ideaAssembling the analytic cohort
pull and reshape raw source dataExtract-transform-load
set the time-zero anchorIndex date
define pre-index covariate historyLookback window
restrict to treatment initiatorsNew-user design

NICE

A national agency that pairs cost-effectiveness analysis with an explicit cost-per-QALY threshold to reach coverage decisions. in the pathway →

On the pathway · 05 · Decision rule · Health technology assessment and value frameworks

Which HTA framework or body?

the overall ideaHealth technology assessment and value frameworks
UK appraisal agencyNICE
US value-assessment organizationInstitute for Clinical and Economic Review

Node-splitting

A formal check of consistency in a network meta-analysis, comparing the direct and indirect estimate for each contrast to flag disagreement. in the pathway →

On the pathway · 04 · Synthesis · Network meta-analysis

Which network meta-analysis concern?

the overall ideaNetwork meta-analysis
comparability across the networkTransitivity
direct versus indirect agreementNode-splitting
rank treatments overallSUCRA

Nominal group technique

An in-person consensus method structuring convergence through silent ranking then discussion. in the pathway →

On the pathway · 06 · Recommendation · Consensus methods (Delphi, nominal group)

How do experts reach consensus?

the overall ideaConsensus methods (Delphi, nominal group)
anonymous iterative roundsDelphi method
structured in-person rankingNominal group technique

Non-differential misclassification

Measurement error unrelated to the outcome, which usually biases an effect toward the null. in the pathway →

On the pathway · 01 · Measurement · Measurement error and misclassification

What kind of measurement error?

the overall ideaMeasurement error and misclassification
error unrelated to other variablesNon-differential misclassification
error differing by groupDifferential misclassification
true variance over observed varianceReliability ratio

Non-inferiority and equivalence

Trials aiming to show a treatment is not meaningfully worse, or is bounded on both sides, rather than better. in the pathway →

On the pathway · 00 · Framing · Non-inferiority and equivalence

What are you trying to show?

the overall ideaNon-inferiority and equivalence
new is not meaningfully worseNon-inferiority trial
new is neither worse nor betterEquivalence trial
how much worse is tolerableNon-inferiority margin
trial can detect a real differenceAssay sensitivity

Non-inferiority margin

The pre-specified amount by which a new treatment may be worse and still pass, set from clinical tolerability and the control’s advantage. in the pathway →

On the pathway · 00 · Framing · Non-inferiority and equivalence

What are you trying to show?

the overall ideaNon-inferiority and equivalence
new is not meaningfully worseNon-inferiority trial
new is neither worse nor betterEquivalence trial
how much worse is tolerableNon-inferiority margin
trial can detect a real differenceAssay sensitivity

Non-inferiority trial

A trial testing against a shifted null, passing if the effect is no worse than standard by more than a pre-specified margin. in the pathway →

On the pathway · 00 · Framing · Non-inferiority and equivalence

What are you trying to show?

the overall ideaNon-inferiority and equivalence
new is not meaningfully worseNon-inferiority trial
new is neither worse nor betterEquivalence trial
how much worse is tolerableNon-inferiority margin
trial can detect a real differenceAssay sensitivity
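
The non-inferiority decision for a binary adverse outcome can be sketched as a comparison of a confidence bound with the margin. The example below assumes a risk-difference scale, a Wald interval, and a two-sided 95% level (equivalent to a one-sided 2.5% test); the counts and the 5-percentage-point margin are hypothetical.

```python
import math

def noninferiority_risk_difference(events_new, n_new, events_std, n_std,
                                   margin, z=1.96):
    """Return (difference, upper bound, passes) for new minus standard risk.

    The outcome is assumed to be harmful, so the new treatment passes when
    the upper confidence bound of the risk difference is below the margin.
    """
    p_new = events_new / n_new
    p_std = events_std / n_std
    diff = p_new - p_std
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    upper = diff + z * se
    return diff, upper, upper < margin

# Hypothetical trial: 50/500 events on new treatment, 48/500 on standard.
diff, upper, passes = noninferiority_risk_difference(50, 500, 48, 500, margin=0.05)
```

The point estimate alone does not decide the question; the whole interval has to sit on the acceptable side of the margin.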

Non-informative censoring

The assumption that censored subjects are representative of those still at risk, which informative dropout violates. in the pathway →

On the pathway · 03 · Estimate · Hazard ratios and non-proportional hazards

Which survival concept?

the overall ideaHazard ratios and non-proportional hazards
censoring unrelated to outcomeNon-informative censoring
summary when hazards are non-proportionalRestricted mean survival time

Nonresponse bias

Bias from those who do not answer a survey differing systematically from those who do. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Normal distribution

The Gaussian distribution, often used for continuous measurements, whose standardized form is the standard normal (z) distribution. in the pathway →

On the pathway · 01 · Measurement · Probability distributions and the CLT

Which distribution or sampling result?

the overall ideaProbability distributions
continuous bell-shaped variableNormal distribution
fixed-trial success countsBinomial distribution
counts of rare eventsPoisson distribution
overdispersed count dataNegative binomial distribution
why sample means turn normalCentral limit theorem
spread of a sample estimateStandard error

NPI (provider identifier)

National Provider Identifier for the rendering or billing clinician or organization. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

Number needed to treat

Absolute measure of benefit, the number of patients treated to prevent one event, equal to the reciprocal of the absolute risk reduction. in the pathway → \[\text{NNT} = \frac{1}{\text{ARR}}\] where \(\text{NNT}\) is the number needed to treat, how many patients must be treated for one to benefit; \(\text{ARR}\) is the absolute risk reduction, the difference in risk between arms.

On the pathway · 03 · Estimate · Effect measures

Which effect measure to report?

the overall ideaEffect measures
ratio of risks between groupsRisk ratio
ratio of odds between groupsOdds ratio
absolute difference in riskRisk difference
patients treated per outcome preventedNumber needed to treat

O

O’Brien-Fleming boundary

An alpha-spending boundary that is stringent early and near-nominal at the trial’s end. in the pathway →

On the pathway · 00 · Framing · Interim analyses and group-sequential design

Which interim-monitoring element?

the overall ideaInterim analyses and group-sequential design
committee reviewing accruing dataData safety monitoring board
planned looks with stopping rulesGroup-sequential design
false-positive risk to spendType-I error
stringent early-look boundaryO’Brien-Fleming boundary
constant nominal-level boundaryPocock boundary

Observational study designs

The family of non-randomized designs that observe exposures and outcomes as they occur, each chosen to fit a question and limit a specific bias. in the pathway →

On the pathway · 00 · Framing · Observational study designs

Which observational design fits the question and dominant bias?

the overall familyObservational study designs
exposure known, follow forwardCohort study
snapshot at one timeCross-sectional study
rare outcome, look backCase-control study
controls sampled within cohortNested case-control
random subcohort, multiple outcomesCase-cohort design
transient trigger, acute eventCase-crossover design
within-person rate comparisonSelf-controlled case series (SCCS)
corrects exposure time trendsCase-time-control design
initiators, active comparatorActive-comparator new-user design
standing source populationDisease registry

Odds ratio

Ratio of the odds of an outcome between groups, often misread as a risk ratio when the outcome is common, which overstates the effect. in the pathway →

On the pathway · 03 · Estimate · Effect measures

Which effect measure to report?

the overall ideaEffect measures
ratio of risks between groupsRisk ratio
ratio of odds between groupsOdds ratio
absolute difference in riskRisk difference
patients treated per outcome preventedNumber needed to treat
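
The gap between the odds ratio and the risk ratio depends on how common the outcome is. The sketch below computes both from a 2x2 table for a common and a rare outcome with the same risk ratio; the counts are hypothetical.

```python
def risk_ratio(a, b, c, d):
    """a, b: exposed with/without the outcome; c, d: unexposed with/without."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio from the same 2x2 table."""
    return (a * d) / (b * c)

# Common outcome: risks of 60% versus 40%.
rr_common = risk_ratio(60, 40, 40, 60)
or_common = odds_ratio(60, 40, 40, 60)

# Rare outcome: risks of 0.6% versus 0.4%.
rr_rare = risk_ratio(6, 994, 4, 996)
or_rare = odds_ratio(6, 994, 4, 996)
```

Both tables have a risk ratio of 1.5, but the odds ratio is 2.25 for the common outcome and only slightly above 1.5 for the rare one, so reading an odds ratio as a risk ratio is reasonable only when the outcome is rare.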

Offset

A term for exposure time or population at risk that turns a Poisson count model into a rate model. in the pathway →

On the pathway · 02 · Model · Model modifications (splines, interactions)

How do you flex the model?

the overall ideaModel modifications
effect depends on another variableInteraction term
fixed exposure term in count modelsOffset
smooth nonlinear flexible curvesSplines
additive smooth function componentsGeneralized additive models
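
The way an offset turns a count model into a rate model can be written out for a Poisson regression with a log link. The symbols are illustrative assumptions. \[\log E[Y_i] = \log t_i + \beta_0 + \beta_1 x_i\] where \(Y_i\) is the event count for unit \(i\); \(t_i\) is its exposure time or population at risk; \(\log t_i\) is the offset, entered with its coefficient fixed at one; \(x_i\) is a covariate. Moving the offset to the left gives \(\log(E[Y_i] / t_i) = \beta_0 + \beta_1 x_i\), so \(\exp(\beta_1)\) is a rate ratio rather than a ratio of counts.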

OMOP standardized vocabularies (OHDSI)

Common data model mapping heterogeneous source codes to standard concepts so studies run across databases, at some loss of detail. in the pathway →

On the pathway · 01 · Measurement · Claims and coding standards

Which vocabulary encodes each claim field, and what does it capture?

the overall ideaClaims and coding standards
diagnosesICD-10-CM diagnosis codes
inpatient proceduresICD-10-PCS procedure codes
professional servicesCPT/HCPCS codes
dispensed drugsNDC (National Drug Code)
labs and observationsLOINC lab codes
providerNPI (provider identifier)
cross-database mappingOMOP standardized vocabularies (OHDSI)
translating codesCode crosswalks and mappings
drug utilizationATC and defined daily dose (DDD)

Operating characteristics

Sensitivity and specificity describe a test in the abstract, while predictive values describe what a result means for a patient and shift with prevalence. in the pathway →

On the pathway · 05 · Decision rule · Operating characteristics

Which diagnostic-performance measure?

the overall ideaOperating characteristics
true positives among diseasedSensitivity
true negatives among healthySpecificity
disease probability given a resultPredictive values

Operationalizing the variable

Writing a variable definition precise enough, with codes, thresholds, and windows, that two analysts produce the same cases. in the pathway →

On the pathway · 01 · Measurement · Operationalizing the variable

What measurement situation are you in?

turning a concept into a variableOperationalizing the variable

Opportunity cost

The principle that every dollar spent is health some other patient could have had. in the pathway →

On the pathway · 05 · Decision rule · Perspective and the reference case

Whose costs and benefits count?

the overall ideaPerspective and the reference case
standardized analysis conventionsReference case
reporting checklist for economicsCHEERS
count all costs to societySocietal perspective
value of foregone alternativesOpportunity cost

Outcome phenotyping and validation

Treating a claims or EHR outcome as an algorithm whose accuracy must be measured, because imperfect predictive value and sensitivity bias the estimate. in the pathway →

On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation

Is your outcome a validated algorithm or an unchecked code rule?

the overall ideaOutcome phenotyping and validation
the rule itselfClaims/EHR phenotype algorithm
common coding rule1-inpatient / 2-outpatient rule
tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
reference standardEndpoint adjudication and chart review
bundled outcomesComposite endpoint construction
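
As one concrete instance of a code rule, the 1-inpatient / 2-outpatient rule listed above can be sketched in a few lines. The claim layout (a setting label plus a service date) and the function name are assumptions made for illustration; published algorithms often add further conditions that are not shown here.

```python
from datetime import date


def meets_rule(claims):
    """claims: list of (setting, service_date) pairs carrying the diagnosis of interest."""
    inpatient = any(setting == "inpatient" for setting, _ in claims)
    # A set keeps distinct dates only, so two codes on one day count once.
    outpatient_dates = {day for setting, day in claims if setting == "outpatient"}
    return inpatient or len(outpatient_dates) >= 2


rule_out_visit = [("outpatient", date(2020, 3, 1))]
same_day_pair = [("outpatient", date(2020, 3, 1)), ("outpatient", date(2020, 3, 1))]
confirmed = [("outpatient", date(2020, 3, 1)), ("outpatient", date(2020, 4, 15))]
admitted = [("inpatient", date(2020, 5, 2))]

for label, claims in [("rule-out visit", rule_out_visit), ("same-day pair", same_day_pair),
                      ("two dates", confirmed), ("admitted", admitted)]:
    print(label, meets_rule(claims))
```

A rule like this is still only an algorithm: its predictive value and sensitivity have to be measured against a reference standard before the resulting cases are trusted.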

Over-adjustment

Conditioning on a mediator or collider, adding bias while trying to remove it, the mirror image of confounding. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Overdiagnosis

Detecting disease that would never have caused harm, inflating apparent screening benefit. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Overfitting

When a model flexible enough to chase noise fits the training data but fails on new data. in the pathway →

On the pathway · 02 · Model · Bias-variance and regularization

Which concept or penalty?

the overall ideaBias-variance and regularization
the underlying error tradeoffBias-variance tradeoff
fitting noise, poor generalizationOverfitting
penalizing complexity broadlyRegularization
estimating out-of-sample errorCross-validation
shrink coefficients, keep allRidge regression
shrink and select variablesLasso
blend selection and shrinkageElastic net

P

Parallel trends

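The assumption that, absent treatment, the treated and comparison groups would have followed the same outcome trend; it underlies difference-in-differences and cannot be verified directly, only probed in the pre-treatment period. in the pathway →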

On the pathway · 02 · Model · Identifying assumptions

Which identifying assumption do you need?

the overall ideaIdentifying assumptions
treatment independent of potential outcomes given covariatesConditional independence
instrument affects outcome only via exposureExclusion restriction
groups would have tracked togetherParallel trends
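
To make the role of parallel trends concrete, here is a minimal sketch of the two-group, two-period difference-in-differences arithmetic. The group means are invented, and the result equals the treatment effect only if the comparison group's change is what the treated group would have experienced without treatment.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the comparison group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical mean outcomes before and after the policy.
effect = diff_in_diff(treated_pre=10.0, treated_post=14.0,
                      control_pre=9.0, control_post=11.0)
print(effect)
```

The comparison group's change of 2 stands in for the untreated trend, so it is subtracted from the treated group's change of 4.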

Parameter uncertainty

Second-order uncertainty in an input’s true value because it was estimated from finite data, propagated by probabilistic sensitivity analysis. in the pathway →

On the pathway · 05 · Decision rule · Types of uncertainty

Which source of uncertainty?

the overall ideaTypes of uncertainty
uncertainty in input estimatesParameter uncertainty
random variation between individualsStochastic uncertainty
uncertainty in model structureStructural uncertainty
finding where conclusions flipThreshold analysis

Partial pooling

Shrinkage that stabilizes small or sparse groups by borrowing strength from the rest, between pooled and fully separate estimates. in the pathway →

On the pathway · 02 · Model · Hierarchical (multilevel) Bayesian models

Which multilevel Bayesian idea?

the overall ideaHierarchical Bayesian models
borrow strength across groupsPartial pooling
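
One common way to write the shrinkage, under a normal model with known variances (an assumption made here for illustration, with notation that does not come from the pathway):

\[\hat{\theta}_j = \lambda_j \bar{y}_j + (1 - \lambda_j)\,\mu, \qquad \lambda_j = \frac{n_j / \sigma^2}{n_j / \sigma^2 + 1 / \tau^2}\]

where \(\bar{y}_j\) is group \(j\)'s own mean from \(n_j\) observations, \(\mu\) is the overall mean, \(\sigma^2\) is the within-group variance, and \(\tau^2\) is the between-group variance. A small \(n_j\) pushes \(\lambda_j\) toward 0 and pulls the estimate toward \(\mu\); a large group keeps an estimate close to its own mean.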

Partitioned survival model

An oncology model reading state membership straight off the progression-free and overall survival curves rather than a transition matrix. in the pathway →

On the pathway · 05 · Decision rule · Decision-analytic models

Which model structure fits the problem?

the overall ideaDecision-analytic models
branching one-time event sequenceDecision tree (decision analysis)
recurring health states over cyclesMarkov model
survival curves partition statesPartitioned survival model
infection spread depends on prevalenceDynamic transmission model
weight future values lowerDiscounting

Pearson correlation

A measure of linear association between two continuous variables. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V

PECO

The observational cousin of PICO, naming population, exposure, comparator, and outcome. in the pathway →

On the pathway · 00 · Framing · Research question (PICO / PECO)

Which question framework fits?

the overall ideaResearch question
intervention question for a trialPICO
add an explicit time horizonPICOT
add study-design eligibilityPICOS
exposure question for observational workPECO

Per-member-per-month costing (PMPM/PPPM)

Spend divided by member-months of enrollment, so populations with different follow-up can be compared at the budget level. in the pathway →

On the pathway · 05 · Decision rule · Real-world cost and HTA methods

Modeling skewed real-world costs for HTA?

valuing real-world costsReal-world cost and HTA methods
costs are zero-inflatedTwo-part and other cost models
extreme cost outliers existWinsorization and trimming of cost outliers
summarizing population spendPer-member-per-month costing (PMPM/PPPM)
trial ends before lifetimeSurvival extrapolation for HTA
value has multiple dimensionsMulti-criteria decision analysis (MCDA)
prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
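
For per-member-per-month costing, a minimal sketch of the arithmetic follows. The member records and field names are hypothetical.

```python
# Hypothetical members: total allowed cost and months enrolled during the study window.
members = [
    {"cost": 1200.0, "months": 12},
    {"cost": 300.0, "months": 3},
    {"cost": 0.0, "months": 9},
]


def pmpm(records):
    """Total spend divided by total member-months, not the mean of per-member ratios."""
    total_cost = sum(r["cost"] for r in records)
    total_months = sum(r["months"] for r in records)
    return total_cost / total_months


pmpm_value = pmpm(members)
print(f"PMPM = {pmpm_value:.2f}")
```

Dividing totals by totals weights each member by enrollment time, so a member enrolled for three months does not count as much as one enrolled all year.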

Per-protocol

Restricting analysis to those who followed the protocol, which answers the biological question but breaks randomization. in the pathway →

On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)

Which set of subjects do you analyze?

the overall ideaAnalysis populations
as randomized, regardless of adherenceIntention-to-treat
only those who followed protocolPer-protocol
grouped by treatment actually receivedAs-treated

Persistence (time to discontinuation)

Duration from initiation to the first break in supply that exceeds the permissible gap. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)
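
Persistence can be sketched from dispensing records as below. The fill layout, the 30-day default gap, the carry-over of early refills, and dating discontinuation at the end of the last covered supply are all assumptions for illustration; studies differ on each of these choices.

```python
def persistence_days(fills, permissible_gap=30, follow_up_end=365):
    """fills: list of (fill_day, days_supply), with day 0 the date of initiation."""
    fills = sorted(fills)
    start = fills[0][0]
    supply_end = start + fills[0][1]
    for fill_day, days_supply in fills[1:]:
        if fill_day - supply_end > permissible_gap:
            break  # gap exceeded: treat supply_end as the discontinuation date
        # Early refills are carried forward rather than discarded (an assumption).
        supply_end = max(supply_end, fill_day) + days_supply
    return min(supply_end, follow_up_end) - start


fills = [(0, 30), (28, 30), (65, 30), (140, 30)]
print(persistence_days(fills))                      # 30-day permissible gap
print(persistence_days(fills, permissible_gap=60))  # a more lenient gap
```

The same fills give different persistence under different permissible gaps, which is why the gap belongs in the written exposure definition.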

Person-time

Each subject’s time under observation summed across the cohort, the denominator of an incidence rate. in the pathway →

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio
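
Person-time and the incidence rate built on it can be sketched with a toy cohort. The follow-up times and event flags are invented.

```python
# Hypothetical cohort: years under observation and whether the event occurred.
follow_up_years = [2.0, 5.0, 0.5, 3.5, 4.0]
events = [0, 1, 1, 0, 0]

person_years = sum(follow_up_years)            # denominator: summed follow-up
incidence_rate = sum(events) / person_years    # events per person-year
rate_per_1000 = incidence_rate * 1000

print(f"{sum(events)} events over {person_years} person-years "
      f"= {rate_per_1000:.1f} per 1,000 person-years")
```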

Perspective and the reference case

Whose costs and benefits count changes the answer, so a standardized reference case and an impact inventory, which separate the healthcare-sector from the societal perspective, make analyses comparable. in the pathway →

On the pathway · 05 · Decision rule · Perspective and the reference case

Whose costs and benefits count?

the overall ideaPerspective and the reference case
standardized analysis conventionsReference case
reporting checklist for economicsCHEERS
count all costs to societySocietal perspective
value of foregone alternativesOpportunity cost

PICO

Population, intervention, comparator, outcome: a framework forcing a clinical question to be specific enough to design around. in the pathway →

On the pathway · 00 · Framing · Research question (PICO / PECO)

Which question framework fits?

the overall ideaResearch question
intervention question for a trialPICO
add an explicit time horizonPICOT
add study-design eligibilityPICOS
exposure question for observational workPECO

PICOS

PICO with an appended study design, the convention in systematic reviews. in the pathway →

On the pathway · 00 · Framing · Research question (PICO / PECO)

Which question framework fits?

the overall ideaResearch question
intervention question for a trialPICO
add an explicit time horizonPICOT
add study-design eligibilityPICOS
exposure question for observational workPECO

PICOT

PICO with an appended timeframe, the convention in clinical-question teaching. in the pathway →

On the pathway · 00 · Framing · Research question (PICO / PECO)

Which question framework fits?

the overall ideaResearch question
intervention question for a trialPICO
add an explicit time horizonPICOT
add study-design eligibilityPICOS
exposure question for observational workPECO

Placebo and falsification tests

Looking for an effect where none should exist, such as a pre-treatment period or unaffected outcome, to test whether a design is sound. in the pathway →

On the pathway · ∗ · Defend it · Placebo and falsification tests

What situation is this?

the overall ideaPlacebo and falsification tests

Pocock boundary

A group-sequential stopping boundary that applies the same nominal significance threshold at every interim look. in the pathway →

On the pathway · 00 · Framing · Interim analyses and group-sequential design

Which interim-monitoring element?

the overall ideaInterim analyses and group-sequential design
committee reviewing accruing dataData safety monitoring board
planned looks with stopping rulesGroup-sequential design
false-positive risk to spendType-I error
stringent early-look boundaryO’Brien-Fleming boundary
constant nominal-level boundaryPocock boundary

Poisson distribution

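The distribution of counts of events that occur independently at a constant rate, commonly used for rare events, with variance equal to the mean. in the pathway →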

On the pathway · 01 · Measurement · Probability distributions and the CLT

Which distribution or sampling result?

the overall ideaProbability distributions
continuous bell-shaped variableNormal distribution
fixed-trial success countsBinomial distribution
counts of rare eventsPoisson distribution
overdispersed count dataNegative binomial distribution
why sample means turn normalCentral limit theorem
spread of a sample estimateStandard error

Poisson regression

A regression for counts returning a rate ratio, assuming the variance equals the mean. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

Positivity

The identifiability condition that every kind of unit could have received either treatment, also called overlap. in the pathway →

On the pathway · 02 · Model · Potential outcomes and identifiability

Which identifiability condition is at stake?

the overall frameworkPotential outcomes and identifiability
the counterfactual setupPotential-outcomes framework
only one outcome is observedFundamental problem of causal inference
treated and untreated comparableExchangeability
every covariate stratum has bothPositivity
observed equals counterfactual under treatmentConsistency
no interference, single versionSUTVA

Posterior distribution

The updated distribution of a parameter after combining prior belief with the data. in the pathway →

On the pathway · 02 · Model · Bayesian inference

Which Bayesian concept is in play?

the overall frameworkBayesian inference
the updating rule itselfBayes’ theorem
beliefs after seeing dataPosterior distribution
interval summary of the posteriorCredible interval
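
The simplest case of updating a prior into a posterior is the conjugate beta-binomial model, sketched here. The flat Beta(1, 1) prior and the counts are assumptions chosen for illustration.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """A Beta(a, b) prior plus binomial data gives a Beta(a + successes, b + failures) posterior."""
    return prior_a + successes, prior_b + failures


# Hypothetical data: 7 responders out of 20 patients, starting from a flat prior.
post_a, post_b = beta_binomial_update(1, 1, successes=7, failures=13)
posterior_mean = post_a / (post_a + post_b)

print(f"posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
```

The posterior mean lies between the prior mean of 0.5 and the observed proportion of 0.35, closer to the data because the prior carries little weight.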

Posterior predictive check

Asking whether data simulated from the fitted model resemble the real data. in the pathway →

On the pathway · 02 · Model · Bayesian computation (MCMC)

Which sampling or diagnostic tool?

the overall ideaBayesian computation
sampling the posterior generallyMCMC
proposal-and-accept samplerMetropolis-Hastings
sample each parameter conditionallyGibbs sampling
gradient-guided efficient samplerHamiltonian Monte Carlo
check chains have convergedR-hat
check model reproduces the dataPosterior predictive check

Potential outcomes and identifiability

A framework defining a causal effect as the contrast of outcomes under treatment and no treatment, with conditions for estimating it from data. in the pathway →

On the pathway · 02 · Model · Potential outcomes and identifiability

Which identifiability condition is at stake?

the overall frameworkPotential outcomes and identifiability
the counterfactual setupPotential-outcomes framework
only one outcome is observedFundamental problem of causal inference
treated and untreated comparableExchangeability
every covariate stratum has bothPositivity
observed equals counterfactual under treatmentConsistency
no interference, single versionSUTVA

Potential-outcomes framework

Imagining for each unit the outcome under treatment and under no treatment, whose contrast is the causal effect. in the pathway →

On the pathway · 02 · Model · Potential outcomes and identifiability

Which identifiability condition is at stake?

the overall frameworkPotential outcomes and identifiability
the counterfactual setupPotential-outcomes framework
only one outcome is observedFundamental problem of causal inference
treated and untreated comparableExchangeability
every covariate stratum has bothPositivity
observed equals counterfactual under treatmentConsistency
no interference, single versionSUTVA

Pre-registration

A public commitment on ClinicalTrials.gov or the Open Science Framework that locks the endpoint before data are unblinded. in the pathway →

On the pathway · 00 · Framing · Endpoint logic and pre-registration

Which endpoint or pre-specification concern?

the overall ideaEndpoint logic and pre-registration
the main pre-specified outcomePrimary endpoint
a stand-in for the outcomeSurrogate endpoint
criteria validating a surrogatePrentice’s criteria
lock analysis plan in advancePre-registration

Precision

The share of positive predictions that are correct, the same as positive predictive value. in the pathway →

On the pathway · 02 · Model · Classification performance metrics

Which classification metric?

the overall ideaClassification performance metrics
share of predicted positives correctPrecision
share of true positives caughtRecall
balance precision and recallF1 score
tradeoff across all thresholdsPrecision-recall curve
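
The three point metrics above follow directly from confusion-matrix counts. The counts below are invented and the function names are hypothetical.

```python
def precision(tp, fp):
    """Share of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of actual positives that the classifier caught."""
    return tp / (tp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)


# Hypothetical confusion-matrix counts at one threshold.
tp, fp, fn = 30, 10, 20
print(precision(tp, fp), recall(tp, fn), round(f1_score(tp, fp, fn), 3))
```

Recomputing precision and recall at each threshold and plotting one against the other traces out the precision-recall curve.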

Precision-recall curve

A plot of precision against recall across all classification thresholds, a more honest summary of classifier performance than ROC-AUC under class imbalance. in the pathway →

On the pathway · 02 · Model · Classification performance metrics

Which classification metric?

the overall ideaClassification performance metrics
share of predicted positives correctPrecision
share of true positives caughtRecall
balance precision and recallF1 score
tradeoff across all thresholdsPrecision-recall curve

Prediction and machine learning

Flexible models for predicting rather than explaining, judged on out-of-sample error and calibration, not coefficient plausibility. in the pathway →

On the pathway · 02 · Model · Prediction and machine learning

Which prediction concept is in play?

the overall areaPrediction and machine learning
explaining individual predictionsSHAP

Prediction interval

The range a new study’s true effect might fall in, wider than the confidence interval and more honest under substantial heterogeneity. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test

Predictive values

What a positive or negative test result means for the patient in front of you, shifting with the prevalence of disease. in the pathway →

\[\text{PPV} = \frac{\text{sens} \cdot \text{prev}}{\text{sens} \cdot \text{prev} + (1 - \text{spec}) \cdot (1 - \text{prev})}\]

where \(\text{PPV}\) is the positive predictive value, the chance a positive result is a true case; \(\text{sens}\) is the sensitivity, the chance a true case tests positive; \(\text{spec}\) is the specificity, the chance a non-case tests negative; \(\text{prev}\) is the prevalence, the share of the tested population with the disease.

On the pathway · 05 · Decision rule · Operating characteristics

Which diagnostic-performance measure?

the overall ideaOperating characteristics
true positives among diseasedSensitivity
true negatives among healthySpecificity
disease probability given a resultPredictive values
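
To see how predictive values shift with prevalence while sensitivity and specificity stay fixed, a short sketch follows. The test characteristics and prevalences are invented; the negative predictive value is the analogous Bayes' theorem expression.

```python
def ppv(sens, spec, prev):
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def npv(sens, spec, prev):
    """Negative predictive value: chance a negative result is a true non-case."""
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)


# The same hypothetical test (90% sensitive, 90% specific) in two settings.
for prevalence in (0.01, 0.50):
    print(f"prevalence {prevalence:.0%}: "
          f"PPV = {ppv(0.9, 0.9, prevalence):.3f}, NPV = {npv(0.9, 0.9, prevalence):.3f}")
```

At 1% prevalence most positives are false positives even though the test itself has not changed.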

Prentice’s criteria

The formal test for whether a surrogate endpoint validly captures a treatment’s effect on the true clinical outcome. in the pathway →

On the pathway · 00 · Framing · Endpoint logic and pre-registration

Which endpoint or pre-specification concern?

the overall ideaEndpoint logic and pre-registration
the main pre-specified outcomePrimary endpoint
a stand-in for the outcomeSurrogate endpoint
criteria validating a surrogatePrentice’s criteria
lock analysis plan in advancePre-registration

Prevalence

The share of a population that has a condition at a point in time or over a window, reflecting both occurrence and duration. in the pathway →

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio

Primary endpoint

The outcome the sample size is built on and the headline claim is read against, with everything else secondary. in the pathway →

On the pathway · 00 · Framing · Endpoint logic and pre-registration

Which endpoint or pre-specification concern?

the overall ideaEndpoint logic and pre-registration
the main pre-specified outcomePrimary endpoint
a stand-in for the outcomeSurrogate endpoint
criteria validating a surrogatePrentice’s criteria
lock analysis plan in advancePre-registration

Principal component analysis

A dimensionality-reduction method finding the orthogonal directions of greatest variance. in the pathway →

On the pathway · 02 · Model · Unsupervised learning

What unlabeled-data structure are you finding?

the overall familyUnsupervised learning
grouping similar observationsClustering
nested grouping by linkageHierarchical clustering
partitioning into k groupsK-means
reducing the number of featuresDimensionality reduction
orthogonal variance componentsPrincipal component analysis

Principal-stratum strategy

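An intercurrent-event strategy that targets the effect in the subpopulation defined by whether the event would occur under each treatment, for example those who would not have the event under either arm. in the pathway →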

On the pathway · 02 · Model · Trial estimands and intercurrent events

How do you handle intercurrent events?

the overall ideaTrial estimands and intercurrent events
events disrupting outcome interpretationIntercurrent events
ignore them, use assigned treatmentTreatment-policy strategy
imagine they did not occurHypothetical strategy
fold event into the outcomeComposite strategy
restrict to a defined subpopulationPrincipal-stratum strategy

PRISMA

The reporting checklist and flow diagram for systematic reviews. in the pathway →

On the pathway · 06 · Recommendation · Reporting standards

Which study type are you reporting?

the overall ideaReporting standards
randomized controlled trialCONSORT
observational studySTROBE
systematic reviewPRISMA
prediction model studyTRIPOD

Privacy-preserving record linkage (tokenization)

Matching records across datasets using encrypted tokens instead of raw identifiers, so patients can be linked without revealing who they are. in the pathway →

On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkage

Can this data actually answer my question?

the overall ideaData feasibility, enrollment, and linkage
you need observable follow-upContinuous enrollment and observable time
you must size the populationDatabase feasibility and the attrition funnel
you join multiple datasetsPrivacy-preserving record linkage (tokenization)

Probabilistic sensitivity analysis

Propagating parameter uncertainty through a Monte Carlo simulation that draws each parameter from a distribution and reruns the model thousands of times. in the pathway →

On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)

How are you handling cost-effectiveness uncertainty?

the overall ideaUncertainty in cost-effectiveness (PSA)
propagating parameter uncertaintyProbabilistic sensitivity analysis
plotting cost and effect differencesCost-effectiveness plane
probability of being cost-effectiveCost-effectiveness acceptability curve
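
A probabilistic sensitivity analysis can be sketched with the standard library alone. Everything numeric here is an assumption for illustration: the normal distribution for incremental QALYs, the gamma distribution for incremental cost, and the willingness-to-pay threshold. A real model would rerun a full decision model on each draw rather than a one-line net-benefit formula.

```python
import random


def run_psa(n_draws=5000, wtp=50_000, seed=1):
    """Share of Monte Carlo draws in which net monetary benefit is positive."""
    rng = random.Random(seed)
    n_cost_effective = 0
    for _ in range(n_draws):
        # Draw each uncertain input from its (assumed) distribution.
        delta_qaly = rng.gauss(0.10, 0.05)          # incremental QALYs
        delta_cost = rng.gammavariate(16.0, 250.0)  # incremental cost, mean 4,000
        net_monetary_benefit = wtp * delta_qaly - delta_cost
        if net_monetary_benefit > 0:
            n_cost_effective += 1
    return n_cost_effective / n_draws


prob_cost_effective = run_psa()
print(f"P(cost-effective at 50,000 per QALY) = {prob_cost_effective:.2f}")
```

Repeating the calculation over a range of thresholds gives the points of a cost-effectiveness acceptability curve.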

Probability distributions

The theoretical distributions that model data and supply the reference for test statistics. in the pathway →

On the pathway · 01 · Measurement · Probability distributions and the CLT

Which distribution or sampling result?

the overall ideaProbability distributions
continuous bell-shaped variableNormal distribution
fixed-trial success countsBinomial distribution
counts of rare eventsPoisson distribution
overdispersed count dataNegative binomial distribution
why sample means turn normalCentral limit theorem
spread of a sample estimateStandard error

Probability sample

A sample giving every unit a known, nonzero chance of selection, the basis for generalizing to the population. in the pathway →

On the pathway · 01 · Measurement · Survey sampling design

How do you draw the sample?

the overall ideaSurvey sampling design
every unit known nonzero chanceProbability sample
equal-chance draw from frameSimple random sampling
sample within population strataStratified sampling
sample whole groups togetherCluster sampling
sample in successive nested stagesMultistage sampling
variance inflation from clusteringDesign effect

Propensity score

The probability of treatment given covariates, used to match or weight treated and untreated on measured confounders. in the pathway →

On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)

How do you estimate the causal effect?

the overall ideaCausal estimators
model treatment assignment probabilityPropensity score
reweight by inverse treatment probabilityIPTW
model and average outcomesG-formula
combine outcome and treatment modelsDoubly-robust estimators
targeted machine-learning estimationTMLE
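
A toy example may help connect the propensity score to inverse probability weighting. With a single binary covariate, the share treated within each stratum stands in for the fitted propensity model; that substitution, the records, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: (covariate x, treated flag, outcome y).
# Within each stratum of x, treatment raises the outcome by 2.
records = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0), (1, 0, 4.0),
]


def stratum_propensity(rows):
    """Propensity score per stratum: share treated among units with that x."""
    n, n_treated = defaultdict(int), defaultdict(int)
    for x, treated, _ in rows:
        n[x] += 1
        n_treated[x] += treated
    return {x: n_treated[x] / n[x] for x in n}


def iptw_difference(rows):
    """Difference in weighted mean outcomes, weights 1/ps (treated) or 1/(1 - ps) (untreated)."""
    ps = stratum_propensity(rows)
    sums = {1: [0.0, 0.0], 0: [0.0, 0.0]}  # arm -> [sum of w*y, sum of w]
    for x, treated, y in rows:
        weight = 1 / ps[x] if treated else 1 / (1 - ps[x])
        sums[treated][0] += weight * y
        sums[treated][1] += weight
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


def naive_difference(rows):
    """Unweighted difference in means, confounded by x."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)


print(naive_difference(records), iptw_difference(records))
```

Weighting recovers the within-stratum effect here only because x is the sole confounder and both strata contain treated and untreated units.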

Proportion of days covered (PDC)

Fraction of a period during which a patient had drug supply on hand, capping overlapping fills. in the pathway →

On the pathway · 01 · Measurement · Defining exposure in real-world data

How do raw fills become a defined exposure with a start and end?

the overall ideaExposure definition in RWD
building one courseExposure episode construction
tolerating gapsGrace period and permissible gap
shifting the clockInduction, latency, and lag windows
adherence metricProportion of days covered (PDC)
adherence metricMedication possession ratio (MPR)
how long treatedPersistence (time to discontinuation)
standardized spanDrug era (OMOP)
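
Proportion of days covered can be sketched by counting each covered day once. The fill layout and the choice to ignore overlapping supply are assumptions; some definitions shift overlapping supply forward instead.

```python
def pdc(fills, period_days):
    """fills: list of (fill_day, days_supply); day 0 is the first day of the period."""
    covered = set()
    for fill_day, days_supply in fills:
        for day in range(fill_day, fill_day + days_supply):
            if 0 <= day < period_days:
                covered.add(day)  # a set counts a day once even if fills overlap
    return len(covered) / period_days


fills = [(0, 30), (25, 30), (70, 30)]
pdc_value = pdc(fills, period_days=90)
supply_ratio = sum(days for _, days in fills) / 90  # total supply over period length

print(f"PDC = {pdc_value:.3f}, summed supply / period = {supply_ratio:.3f}")
```

Summing days supplied gives 1.0 for these fills, while counting covered days gives a lower value because the overlap and the days past the period end are not counted.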

Proportional hazards

The Cox-model assumption that the hazard ratio stays constant over time, checked with scaled Schoenfeld residuals or a log-log survival plot. in the pathway →

On the pathway · 02 · Model · Checking model assumptions

Which model assumption to check?

the overall ideaChecking model assumptions
non-constant residual varianceHeteroscedasticity
predictors too collinearVariance inflation factor
single points driving the fitCook’s distance
hazard ratio constant over timeProportional hazards
test proportionality formallySchoenfeld residuals
fix variance without refittingRobust standard errors

PROSPERO

The register where a systematic review protocol is recorded before screening, keeping the review from becoming a search for the wanted result. in the pathway →

On the pathway · 04 · Synthesis · Conducting a systematic review

What situation?

the overall ideaConducting a systematic review
registering the review protocolPROSPERO

Publication bias

Positive results being published while null ones vanish, inflating a pooled estimate and often visible as funnel-plot asymmetry. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Q

QALY

Quality-adjusted life year: time spent in a health state multiplied by a utility weight between zero (equivalent to death) and one (full health). in the pathway →

On the pathway · 05 · Decision rule · QALYs and health-state utilities

Which utility concept do you need?

the overall ideaQALYs and health-state utilities
quality-adjusted life yearsQALY
preference weight for a health stateHealth-state utility
a standardized utility instrumentEQ-5D
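
The QALY arithmetic can be sketched for a patient who passes through two health states. The durations and utility weights are invented, and discounting is left out.

```python
def qalys(health_states):
    """health_states: list of (years_in_state, utility_weight) pairs, undiscounted."""
    return sum(years * utility for years, utility in health_states)


# Hypothetical course: 2 years at utility 0.8, then 3 years at utility 0.5.
total_qalys = qalys([(2.0, 0.8), (3.0, 0.5)])
print(f"{total_qalys:.1f} QALYs over 5 life years")
```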

QALYs and health-state utilities

The quality-adjusted life year multiplies time in a health state by a utility weight anchored between zero (death) and one (full health). in the pathway →

On the pathway · 05 · Decision rule · QALYs and health-state utilities

Which utility concept do you need?

the overall ideaQALYs and health-state utilities
quality-adjusted life yearsQALY
preference weight for a health stateHealth-state utility
a standardized utility instrumentEQ-5D

Quantitative bias analysis

Methods that assign explicit numerical assumptions to bias and propagate them into adjusted estimates and intervals. in the pathway →

On the pathway · ∗ · Defend it · Negative controls and empirical calibration

Need to detect hidden residual confounding?

probing residual confoundingNegative controls and calibration
checking confounding on exposureNegative control exposure
checking confounding on outcomeNegative control outcome
many negative controls availableEmpirical calibration
quantifying systematic errorQuantitative bias analysis

Questionnaire and instrument design

Fixing before fieldwork what a survey can measure, through item wording, response format, administration mode, and branching. in the pathway →

On the pathway · 01 · Measurement · Questionnaire and instrument design

Which questionnaire flaw is in play?

the overall craftQuestionnaire and instrument design
asking two things at onceDouble-barreled question
wording that steers the answerLeading question

R

R-hat

A convergence statistic that should sit near 1 when MCMC chains started far apart have mixed. in the pathway →

On the pathway · 02 · Model · Bayesian computation (MCMC)

Which sampling or diagnostic tool?

the overall ideaBayesian computation
sampling the posterior generallyMCMC
proposal-and-accept samplerMetropolis-Hastings
sample each parameter conditionallyGibbs sampling
gradient-guided efficient samplerHamiltonian Monte Carlo
check chains have convergedR-hat
check model reproduces the dataPosterior predictive check
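
The idea behind R-hat can be sketched with the classic between-chain versus within-chain variance comparison. This is the basic form only; current software often reports split or rank-normalized variants. The simulated chains are stand-ins for real MCMC output.

```python
import math
import random


def r_hat(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    m, n = len(chains), len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance, scaled by chain length.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in chain_means)
    # Average within-chain sample variance.
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1)
        for c, mu in zip(chains, chain_means)
    ) / m
    pooled_var = (n - 1) / n * within + between / n
    return math.sqrt(pooled_var / within)


rng = random.Random(0)
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(2)]
stuck = [mixed[0], [x + 5 for x in mixed[1]]]  # second chain sits in another region

print(f"mixed chains: {r_hat(mixed):.3f}")
print(f"stuck chains: {r_hat(stuck):.3f}")
```

When the chains agree, between-chain variance adds almost nothing and the ratio stays near 1; when one chain is stuck elsewhere, the ratio rises well above 1.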

R-squared

The share of outcome variance a model explains, which never falls in-sample as predictors are added, so use the adjusted or out-of-sample version. in the pathway →

On the pathway · 03 · Estimate · Model fit, comparison, and prediction error

Which fit or error measure do you need?

the overall ideaModel fit, comparison, and prediction error
variance explained, linear modelR-squared
pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
compare models, penalize parametersAIC
compare models, penalize more heavilyBIC
prediction error, average magnitudeMean absolute error
prediction error, penalize large missesRMSE
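
The R-squared entry's warning about added predictors can be made concrete with the usual adjusted form, which penalizes the predictor count. A minimal sketch, assuming a linear model with an intercept; the function names and the toy data are illustrative.

```python
def r_squared(y, y_hat):
    """Share of outcome variance explained: 1 - SS_residual / SS_total."""
    y_bar = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    ss_tot = sum((obs - y_bar) ** 2 for obs in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

y = [1, 2, 3, 4, 5]                      # hypothetical outcomes
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]        # hypothetical fitted values
r2 = r_squared(y, y_hat)
print(r2, adjusted_r_squared(r2, n=5, p=1), adjusted_r_squared(r2, n=5, p=3))
```

With the fit held fixed, the adjusted value falls as the predictor count rises from one to three, which is the penalty the raw statistic lacks.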

Random forest

The standard bagging ensemble, averaging many trees trained on bootstrap resamples, with a random subset of predictors considered at each split to decorrelate the trees. in the pathway →

On the pathway · 02 · Model · Learning algorithms and ensembles

Which learner or ensemble fits?

the overall ideaLearning algorithms and ensembles
single interpretable splitsDecision tree (machine learning)
classify by closest neighborsK-nearest neighbours
maximum-margin separating boundarySupport vector machine
average parallel bootstrapped modelsBagging
many decorrelated bagged treesRandom forest
sequentially correct prior errorsBoosting

Random-effects meta-analysis

A pooling model assuming the true effect varies across studies, adding the between-study variance to each study's own variance, which evens out the weights and widens the interval. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test
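
To show how between-study variance enters a random-effects pool, the sketch below uses the DerSimonian-Laird method-of-moments estimate of tau-squared, one common choice among several. It assumes at least two studies with known within-study variances; the function names and the two hypothetical studies are illustrative.

```python
from math import sqrt

def pool(effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, sqrt(1.0 / total)

def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of the between-study variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool(effects, variances)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))  # Cochran's Q
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(effects) - 1)) / c)   # truncated at zero

effects = [0.1, 0.5]         # hypothetical log odds ratios
variances = [0.01, 0.01]     # their squared standard errors
tau2 = dersimonian_laird_tau2(effects, variances)
fixed_est, fixed_se = pool(effects, variances)
random_est, random_se = pool(effects, variances, tau2)
print(tau2, fixed_se, random_se)
```

Here tau-squared is about 0.07, and the pooled standard error grows from about 0.07 under the fixed-effect model to 0.2 under the random-effects model, so the interval widens even though the point estimate stays at 0.3.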

Randomization and blinding

The schemes that assign trial arms and the safeguards, allocation concealment and blinding, that keep that assignment from being gamed or biased. in the pathway →

On the pathway · 00 · Framing · Randomization and blinding

What allocation or masking concern?

the overall ideaRandomization and blinding
hide the upcoming assignmentAllocation concealment
balance arms in small chunksBlock randomization
balance within prognostic strataStratified randomization
dynamically balance many factorsMinimization
mask treatment after allocationBlinding

Real-world causal-inference extensions

Methods extending propensity-score and g-methods to high-dimensional claims data and to treatment and censoring that vary over follow-up time. in the pathway →

On the pathway · 02 · Model · Real-world causal-inference extensions

How do causal methods scale to claims and time?

the overall ideaReal-world causal-inference extensions
you have many candidate covariatesHigh-dimensional propensity score (hdPS)
outcome modeling suits the problemDisease risk score
treatment strategy unfolds over timeClone-censor-weight
censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
you need a simple guardLandmark analysis

Real-world cost and HTA methods

Techniques for modeling skewed real-world costs and extrapolating trial data into health technology assessment decisions. in the pathway →

On the pathway · 05 · Decision rule · Real-world cost and HTA methods

Modeling skewed real-world costs for HTA?

valuing real-world costsReal-world cost and HTA methods
costs are zero-inflatedTwo-part and other cost models
extreme cost outliers existWinsorization and trimming of cost outliers
summarizing population spendPer-member-per-month costing (PMPM/PPPM)
trial ends before lifetimeSurvival extrapolation for HTA
value has multiple dimensionsMulti-criteria decision analysis (MCDA)
prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)

Recall

The share of actual positives the classifier catches, the same as sensitivity. in the pathway →

On the pathway · 02 · Model · Classification performance metrics

Which classification metric?

the overall ideaClassification performance metrics
share of predicted positives correctPrecision
share of true positives caughtRecall
balance precision and recallF1 score
tradeoff across all thresholdsPrecision-recall curve
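
The metrics in this node are simple ratios of confusion-matrix counts. A minimal sketch, with an illustrative function name and hypothetical counts:

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = true_pos / (true_pos + false_pos)   # share of predicted positives that are correct
    recall = true_pos / (true_pos + false_neg)      # share of actual positives caught
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical classifier: 80 true positives, 40 false positives, 20 false negatives.
precision, recall, f1 = precision_recall_f1(80, 40, 20)
print(precision, recall, f1)
```

Recall is 0.8 here while precision is only about 0.67, and F1 sits between them as their harmonic mean.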

Recall bias

Differential memory of past exposure between cases and controls. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Reference case

A standardized set of methods recommended by the Second Panel on Cost-Effectiveness in Health and Medicine, reported alongside any analysis so results are comparable. in the pathway →

On the pathway · 05 · Decision rule · Perspective and the reference case

Whose costs and benefits count?

the overall ideaPerspective and the reference case
standardized analysis conventionsReference case
reporting checklist for economicsCHEERS
count all costs to societySocietal perspective
value of foregone alternativesOpportunity cost

Registries

Purpose-built data for one disease, deep but narrow. in the pathway →

On the pathway · 01 · Measurement · Data sources and their tradeoffs

Which data source or pitfall?

the overall ideaData sources and their tradeoffs
clinical detail from care recordsElectronic health record data
billing records across encountersClaims data
enrolled cohorts for a conditionRegistries
sampled population questionnairesSurvey data
group-level inference pitfallEcological fallacy

Regression discontinuity

A causal design exploiting a cutoff in assignment, resting on the assumption that units just either side of the cutoff are otherwise comparable (continuity at the cutoff). in the pathway →

On the pathway · 02 · Model · Causal designs without randomization

Which quasi-experimental design fits?

the overall ideaCausal designs without randomization
before-after across exposed and controlDifference-in-differences
a haphazard nudge to exposureInstrumental variables
assignment by a cutoff thresholdRegression discontinuity
weighted donors build a counterfactualSynthetic control

Regression families

The principle that the outcome dictates the model, most being a generalized linear model of an outcome distribution plus a link function. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM

Regularization

Penalizing model complexity to buy the right flexibility, through ridge, lasso, or elastic net. in the pathway →

On the pathway · 02 · Model · Bias-variance and regularization

Which concept or penalty?

the overall ideaBias-variance and regularization
the underlying error tradeoffBias-variance tradeoff
fitting noise, poor generalizationOverfitting
penalizing complexity broadlyRegularization
estimating out-of-sample errorCross-validation
shrink coefficients, keep allRidge regression
shrink and select variablesLasso
blend selection and shrinkageElastic net

Regulatory pathways and registration

The regulatory frame around a study informing a regulated decision, including FDA IND or IDE applications and mandatory ClinicalTrials.gov registration and results posting. in the pathway →

On the pathway · § · Conduct it · Regulatory pathways and registration

Which regulatory application?

the overall ideaRegulatory pathways and registration
investigational drug applicationIND
investigational device exemptionIDE

Relative versus absolute

The communication choice of whether to lead with a relative effect, which can sound large, or an absolute effect, where benefit becomes concrete. in the pathway →

On the pathway · 03 · Estimate · Relative versus absolute

Which scale frames the effect?

the overall ideaRelative versus absolute
effect on the absolute scaleAbsolute risk reduction

Reliability

Reproducibility: measuring the same quantity again and getting the same answer. in the pathway →

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa

Reliability and validity

Two independent properties of a measurement: reproducibility on repeat, and whether it measures what it claims. in the pathway →

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa

Reliability ratio

The signal’s share of total variance, by which non-differential error attenuates a true slope. in the pathway → \[\lambda = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}\] where \(\lambda\) is the reliability ratio, the signal’s share of total variance; \(\sigma^2_{\text{true}}\) is the variance of the true values; \(\sigma^2_{\text{error}}\) is the variance of the measurement error.

On the pathway · 01 · Measurement · Measurement error and misclassification

What kind of measurement error?

the overall ideaMeasurement error and misclassification
error unrelated to other variablesNon-differential misclassification
error differing by groupDifferential misclassification
true variance over observed varianceReliability ratio
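
The attenuation described in the reliability ratio entry can be checked with a small simulation. This is a sketch under stated assumptions: classical, non-differential error added to a single predictor in a linear model, with hypothetical variances and slope; the helper `slope` is illustrative.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

random.seed(0)
true_slope, var_true, var_error = 0.5, 4.0, 1.0
reliability = var_true / (var_true + var_error)     # lambda = 0.8

x_true = [random.gauss(0, var_true ** 0.5) for _ in range(20000)]
y = [true_slope * x + random.gauss(0, 1) for x in x_true]
x_observed = [x + random.gauss(0, var_error ** 0.5) for x in x_true]

observed_slope = slope(x_observed, y)               # expected near lambda * true_slope = 0.4
corrected_slope = observed_slope / reliability      # dividing by lambda undoes the attenuation
print(reliability, observed_slope, corrected_slope)
```

The slope fitted on the error-prone predictor lands near 0.4 rather than 0.5, and dividing by the reliability ratio recovers a value near the true slope.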

Reporting standards

Checklists like CONSORT, STROBE, PRISMA, and TRIPOD that make a study’s methods auditable by requiring the details that let a reader judge it. in the pathway →

On the pathway · 06 · Recommendation · Reporting standards

Which study type are you reporting?

the overall ideaReporting standards
randomized controlled trialCONSORT
observational studySTROBE
systematic reviewPRISMA
prediction model studyTRIPOD

Research ethics and the IRB

Modern research ethics rests on the three Belmont principles and is enforced before a study starts by an institutional review board weighing risks against benefits. in the pathway →

On the pathway · § · Conduct it · Research ethics and the IRB

Which ethics concept or body?

the overall ideaResearch ethics and the IRB
foundational ethical principlesBelmont principles
genuine uncertainty justifying a trialClinical equipoise
participant’s voluntary agreementInformed consent
body that reviews and approves studiesInstitutional review board

Research question

A study’s question written specifically enough to act on, using PICO or PECO to fix population, intervention or exposure, comparator, and outcome. in the pathway →

On the pathway · 00 · Framing · Research question (PICO / PECO)

Which question framework fits?

the overall ideaResearch question
intervention question for a trialPICO
add an explicit time horizonPICOT
add study-design eligibilityPICOS
exposure question for observational workPECO

Restricted mean survival time

A survival summary, the average event-free time up to a chosen horizon, that remains meaningful under non-proportional hazards and gives a number a patient can actually use. in the pathway →

On the pathway · 03 · Estimate · Hazard ratios and non-proportional hazards

Which survival concept?

the overall ideaHazard ratios and non-proportional hazards
censoring unrelated to outcomeNon-informative censoring
summary when hazards are non-proportionalRestricted mean survival time

Ridge regression

L2 regularization that shrinks coefficients toward zero. in the pathway →

On the pathway · 02 · Model · Bias-variance and regularization

Which concept or penalty?

the overall ideaBias-variance and regularization
the underlying error tradeoffBias-variance tradeoff
fitting noise, poor generalizationOverfitting
penalizing complexity broadlyRegularization
estimating out-of-sample errorCross-validation
shrink coefficients, keep allRidge regression
shrink and select variablesLasso
blend selection and shrinkageElastic net
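
For intuition about how the L2 penalty shrinks a coefficient, the one-predictor case has a closed form. A minimal sketch, assuming a single centered predictor and no intercept; the function name `ridge_slope` and the data are illustrative.

```python
def ridge_slope(x, y, penalty):
    """Ridge slope for one centered predictor: sum(x*y) / (sum(x^2) + penalty)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + penalty)

x = [-1, 0, 1]      # hypothetical centered predictor
y = [-2, 0, 2]      # outcome with a least-squares slope of 2
slopes = [ridge_slope(x, y, penalty) for penalty in (0, 2, 18)]
print(slopes)
```

The slope moves from 2 with no penalty to 1 and then 0.2 as the penalty grows, approaching zero without ever reaching it.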

Risk calculators and prediction tools

A model packaged for bedside use that carries its development population with it, so external validation and recalibration matter before its output drives action. in the pathway →

On the pathway · 05 · Decision rule · Risk calculators and prediction tools

What situation?

the overall ideaRisk calculators and prediction tools

Risk difference

Absolute effect measure: the risk in the exposed group minus the risk in the unexposed group. in the pathway →

On the pathway · 03 · Estimate · Effect measures

Which effect measure to report?

the overall ideaEffect measures
ratio of risks between groupsRisk ratio
ratio of odds between groupsOdds ratio
absolute difference in riskRisk difference
patients treated per outcome preventedNumber needed to treat
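
The four effect measures in this node all come from the same two risks. A minimal sketch with hypothetical trial counts; the function name and dictionary keys are illustrative.

```python
def effect_measures(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk ratio, odds ratio, risk difference, and number needed to treat."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    risk_difference = risk_exposed - risk_unexposed
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return {
        "risk_ratio": risk_exposed / risk_unexposed,
        "odds_ratio": odds_exposed / odds_unexposed,
        "risk_difference": risk_difference,
        # Reciprocal of the absolute risk difference; infinite when the risks are equal.
        "number_needed_to_treat": float("inf") if risk_difference == 0 else 1 / abs(risk_difference),
    }

# Hypothetical trial: 15 of 100 treated patients and 20 of 100 controls have the event.
result = effect_measures(15, 100, 20, 100)
print(result)
```

The same data give a risk ratio of 0.75, an odds ratio of about 0.71, a risk difference of -0.05, and a number needed to treat of about 20.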

Risk ratio

Relative effect measure: the risk in the exposed group divided by the risk in the unexposed group. in the pathway →

On the pathway · 03 · Estimate · Effect measures

Which effect measure to report?

the overall ideaEffect measures
ratio of risks between groupsRisk ratio
ratio of odds between groupsOdds ratio
absolute difference in riskRisk difference
patients treated per outcome preventedNumber needed to treat

Risk-of-bias appraisal

Scoring how a study’s design and conduct threaten its result, domain by domain, using structured tools like RoB 2 for randomized trials and ROBINS-I for non-randomized studies of interventions. in the pathway →

On the pathway · 04 · Synthesis · Risk-of-bias appraisal

Which risk-of-bias tool fits?

the overall ideaRisk-of-bias appraisal
randomized trialsRoB 2
non-randomized intervention studiesROBINS-I

RMSE

Root mean squared error, the prediction error in the outcome’s own units that punishes large misses hardest. in the pathway →

On the pathway · 03 · Estimate · Model fit, comparison, and prediction error

Which fit or error measure do you need?

the overall ideaModel fit, comparison, and prediction error
variance explained, linear modelR-squared
pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
compare models, penalize parametersAIC
compare models, penalize more heavilyBIC
prediction error, average magnitudeMean absolute error
prediction error, penalize large missesRMSE
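
The difference between RMSE and mean absolute error shows up when the same total error is concentrated in one large miss. A minimal sketch with hypothetical prediction errors; the function names are illustrative.

```python
from math import sqrt

def mae(errors):
    """Mean absolute error: the average size of a miss."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean squared error: squaring weights large misses more heavily."""
    return sqrt(sum(e * e for e in errors) / len(errors))

even_errors = [1, 1, 1, 1]      # four small misses
one_big_miss = [0, 0, 0, 4]     # same total error, concentrated in one prediction
print(mae(even_errors), mae(one_big_miss), rmse(even_errors), rmse(one_big_miss))
```

Both sets have a mean absolute error of 1, but the RMSE doubles from 1 to 2 when the error is concentrated in a single prediction.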

RoB 2

A structured tool for scoring risk of bias in randomized trials, domain by domain. in the pathway →

On the pathway · 04 · Synthesis · Risk-of-bias appraisal

Which risk-of-bias tool fits?

the overall ideaRisk-of-bias appraisal
randomized trialsRoB 2
non-randomized intervention studiesROBINS-I

ROBINS-I

A structured tool for scoring risk of bias in non-randomized studies of interventions, domain by domain. in the pathway →

On the pathway · 04 · Synthesis · Risk-of-bias appraisal

Which risk-of-bias tool fits?

the overall ideaRisk-of-bias appraisal
randomized trialsRoB 2
non-randomized intervention studiesROBINS-I

Robust standard errors

Heteroscedasticity-robust (sandwich) standard errors, the modern default for non-constant variance. in the pathway →

On the pathway · 02 · Model · Checking model assumptions

Which model assumption to check?

the overall ideaChecking model assumptions
non-constant residual varianceHeteroscedasticity
predictors too collinearVariance inflation factor
single points driving the fitCook’s distance
hazard ratio constant over timeProportional hazards
test that proportionality formallySchoenfeld residuals
fix variance without refittingRobust standard errors

Robust statistics for heavy tails

Median-based summaries and MAD-scaled z-scores that resist the outliers which dominate means and standard deviations in heavy-tailed data. in the pathway →

On the pathway · 02 · Model · Robust statistics for heavy tails

Which robust measure?

the overall ideaRobust statistics for heavy tails
robust spread of the dataMAD
robust outlier-resistant standardizationRobust z-score

Robust z-score

A z-score built from the median and MAD so extreme points no longer set the scale. in the pathway → \[z = \frac{x - \text{median}}{1.4826 \times \text{MAD}}\] where \(z\) is the robust z-score for a value; \(x\) is the value being scored; \(\text{median}\) is the median of the data, the robust center; \(\text{MAD}\) is the median absolute deviation, the robust spread; \(1.4826\) rescales the MAD so that it estimates the standard deviation when the data are normally distributed.

On the pathway · 02 · Model · Robust statistics for heavy tails

Which robust measure?

the overall ideaRobust statistics for heavy tails
robust spread of the dataMAD
robust outlier-resistant standardizationRobust z-score
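
The robust z-score formula in this entry can be sketched directly with the standard library. The data are hypothetical and the function names are illustrative; the sketch assumes the MAD is nonzero.

```python
from statistics import mean, median, stdev

def robust_z(x, data):
    """Robust z-score: (x - median) / (1.4826 * MAD)."""
    center = median(data)
    mad = median([abs(v - center) for v in data])
    if mad == 0:
        raise ValueError("MAD is zero; the robust z-score is undefined")
    return (x - center) / (1.4826 * mad)

def classic_z(x, data):
    """Ordinary z-score from the mean and sample standard deviation."""
    return (x - mean(data)) / stdev(data)

data = [9, 10, 10, 10, 11, 12, 100]     # hypothetical values with one extreme point
print(classic_z(100, data), robust_z(100, data))
```

The extreme point inflates the mean and standard deviation so much that its ordinary z-score is only about 2.3, while its robust z-score is about 61.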

Rosenbaum bounds

A method quantifying how much unmeasured confounding would overturn a result in matched designs, analogous to the E-value. in the pathway →

On the pathway · ∗ · Defend it · Bias quantification

How do you quantify unmeasured bias?

the overall ideaBias quantification
strength needed to explain awayE-value
hidden bias in matched designsRosenbaum bounds

S

Safe Harbor

A HIPAA de-identification method that strips eighteen specified identifiers from a dataset. in the pathway →

On the pathway · § · Conduct it · Data privacy and security

Which rule or method?

the overall ideaData privacy and security
US health privacy lawHIPAA
EU data protection lawGDPR
de-identify by removing identifiersSafe Harbor
de-identify by statistical opinionExpert determination
generating artificial substitute recordsSynthetic data

Safety and adverse-event analysis

Tabulating adverse events by type and severity on the safety population, compared as risk differences or exposure-adjusted rates, deliberately not corrected for multiplicity. in the pathway →

On the pathway · 03 · Estimate · Safety and adverse-event analysis

Which safety analysis element?

the overall ideaSafety and adverse-event analysis
subjects who received any treatmentSafety population

Safety population

Everyone who received any treatment, the set on which adverse events are counted, rather than the randomized set. in the pathway →

On the pathway · 03 · Estimate · Safety and adverse-event analysis

Which safety analysis element?

the overall ideaSafety and adverse-event analysis
subjects who received any treatmentSafety population

Sampling bias

A sample that does not represent the target population. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Schoenfeld residuals

Scaled residuals used to check the proportional-hazards assumption of a Cox model. in the pathway →

On the pathway · 02 · Model · Checking model assumptions

Which model assumption to check?

the overall ideaChecking model assumptions
non-constant residual varianceHeteroscedasticity
predictors too collinearVariance inflation factor
single points driving the fitCook’s distance
hazard ratio constant over timeProportional hazards
test that proportionality formallySchoenfeld residuals
fix variance without refittingRobust standard errors

Selection bias

Bias from who ends up in the analysis, including sampling, volunteer, nonresponse, attrition, Berkson’s, healthy-worker, and survivorship variants. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

Self-controlled case series (SCCS)

Models event rates across exposed and unexposed time within each affected person, so each person serves as their own control and confounding by time-fixed characteristics is removed. in the pathway →

On the pathway · 00 · Framing · Observational study designs

Which observational design fits the question and dominant bias?

the overall familyObservational study designs
exposure known, follow forwardCohort study
snapshot at one timeCross-sectional study
rare outcome, look backCase-control study
controls sampled within cohortNested case-control
random subcohort, multiple outcomesCase-cohort design
transient trigger, acute eventCase-crossover design
within-person rate comparisonSelf-controlled case series (SCCS)
corrects exposure time trendsCase-time-control design
initiators, active comparatorActive-comparator new-user design
standing source populationDisease registry

Sensitivity

The proportion of truly diseased patients a test correctly identifies as positive. in the pathway →

On the pathway · 05 · Decision rule · Operating characteristics

Which diagnostic-performance measure?

the overall ideaOperating characteristics
true positives among diseasedSensitivity
true negatives among healthySpecificity
disease probability given a resultPredictive values
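
Sensitivity and specificity are properties of the test, while predictive values also depend on prevalence. The sketch below applies Bayes' rule to hypothetical values; the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Same hypothetical test (90% sensitive, 90% specific) in two settings.
ppv_high, npv_high = predictive_values(0.9, 0.9, prevalence=0.5)
ppv_low, npv_low = predictive_values(0.9, 0.9, prevalence=0.01)
print(ppv_high, ppv_low)
```

The positive predictive value falls from 0.9 at 50% prevalence to about 0.08 at 1% prevalence, although the test itself has not changed.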

Sensitivity analysis

Pre-specified analyses that deliberately vary the assumptions most likely to be challenged and report what happens, more credible than analyses run only after review. in the pathway →

On the pathway · ∗ · Defend it · Sensitivity analysis

What robustness situation are you in?

testing how conclusions hold upSensitivity analysis

SHAP

An interpretability tool that partly restores insight into flexible predictive models by attributing each prediction to its input features using Shapley values. in the pathway →

On the pathway · 02 · Model · Prediction and machine learning

Which prediction concept is in play?

the overall areaPrediction and machine learning
explaining individual predictionsSHAP

Simon’s two-stage design

A small single-arm phase II design that stops early when first-stage responses are too few to continue. in the pathway →

On the pathway · 00 · Framing · Dose-finding and early-phase designs

Which early-phase design question are you facing?

the overall design familyDose-finding and early-phase designs
dose escalation by fixed cohort rule3+3 design
model-based dose escalationContinual reassessment method
the highest acceptably safe doseMaximum tolerated dose
phase II screening for efficacySimon’s two-stage design

Simple random sampling

Drawing from one frame with equal selection probability. in the pathway →

On the pathway · 01 · Measurement · Survey sampling design

How do you draw the sample?

the overall ideaSurvey sampling design
every unit known nonzero chanceProbability sample
equal-chance draw from frameSimple random sampling
sample within population strataStratified sampling
sample whole groups togetherCluster sampling
sample in successive nested stagesMultistage sampling
variance inflation from clusteringDesign effect

Skewness

A summary of a distribution’s asymmetry, part of reading its shape. in the pathway →

On the pathway · 01 · Measurement · Characterizing the distribution

What shape feature?

the overall ideaCharacterizing the distribution
asymmetry of the distributionSkewness
heaviness of the tailsKurtosis
smoothing a nonlinear trendLOESS smoother

SNOMED

A clinical terminology and ontology for coding findings in health records; billing claims more often rely on ICD and procedure codes. in the pathway →

On the pathway · 01 · Measurement · Data standards and provenance

Which data standard or provenance layer?

the overall idea of standards and provenanceData standards and provenance
clinical coding terminology for findingsSNOMED
regulatory model for collected trial dataCDISC SDTM
analysis-ready dataset standardADaM

Societal perspective

A costing viewpoint that adds patient time, caregiving, and lost productivity to medical costs, which can flip the verdict for some conditions. in the pathway →

On the pathway · 05 · Decision rule · Perspective and the reference case

Whose costs and benefits count?

the overall ideaPerspective and the reference case
standardized analysis conventionsReference case
reporting checklist for economicsCHEERS
count all costs to societySocietal perspective
value of foregone alternativesOpportunity cost

Sparse data and resampling

Methods for small cell counts or rare events, where standard likelihood is unstable and resampling or exact procedures give trustworthy estimates and intervals. in the pathway →

On the pathway · 02 · Model · Sparse data and resampling

Are cells sparse or analytic standard errors doubtful?

the overall familySparse data and resampling
separation or small samplesFirth penalized regression
very sparse, exact inferenceExact logistic regression
no clean closed-form varianceBootstrap and resampling methods
public-health impact measuresAttributable risk and population attributable fraction (PAF)

Spearman correlation

A rank-based measure of monotone association between two continuous or ordinal variables. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V

Specification-curve analysis

Re-estimating a result across the many defensible modeling choices to show whether a conclusion holds broadly or only along one path. in the pathway →

On the pathway · ∗ · Defend it · Leave-one-out and specification curves

How are you probing specification robustness?

the overall ideaLeave-one-out and specification curves
results across many model choicesSpecification-curve analysis

Specificity

The proportion of truly disease-free patients a test correctly identifies as negative. in the pathway →

On the pathway · 05 · Decision rule · Operating characteristics

Which diagnostic-performance measure?

the overall ideaOperating characteristics
true positives among diseasedSensitivity
true negatives among healthySpecificity
disease probability given a resultPredictive values

Spectrum bias

Inflated test accuracy when cases are floridly sick and controls plainly well, so accuracy at a referral center overstates that in primary care. in the pathway →

On the pathway · 05 · Decision rule · Diagnostic-accuracy studies

Which accuracy measure or pitfall?

the overall ideaDiagnostic-accuracy studies
how results shift disease oddsLikelihood ratios
index test informs reference standardIncorporation bias
only some get the reference standardVerification bias
unrepresentative case mixSpectrum bias

SPIRIT

The reporting standard for a trial protocol, the protocol counterpart to the CONSORT checklist for the finished trial. in the pathway →

On the pathway · § · Conduct it · The study protocol (SPIRIT)

What situation?

the overall ideaThe study protocol (SPIRIT)
reporting standard for protocolsSPIRIT

Splines

Restricted cubic or natural splines that fit a smooth piecewise curve at a few knots, modeling nonlinearity more stably than high-order polynomials. in the pathway →

On the pathway · 02 · Model · Model modifications (splines, interactions)

How do you flex the model?

the overall ideaModel modifications
effect depends on another variableInteraction term
fixed exposure term in count modelsOffset
smooth nonlinear flexible curvesSplines
additive smooth function componentsGeneralized additive models

Standard error

The spread of a sample mean across repeated samples, shrinking with the square root of sample size, so quadrupling n halves it. in the pathway → \[\text{SE} = \frac{\sigma}{\sqrt{n}}\] where \(\text{SE}\) is the standard error of the sample mean; \(\sigma\) is the standard deviation of a single observation; \(n\) is the number of observations averaged.

On the pathway · 01 · Measurement · Probability distributions and the CLT

Which distribution or sampling result?

the overall ideaProbability distributions
continuous bell-shaped variableNormal distribution
fixed-trial success countsBinomial distribution
counts of rare eventsPoisson distribution
overdispersed count dataNegative binomial distribution
why sample means turn normalCentral limit theorem
spread of a sample estimateStandard error
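
The square-root rule in the standard error entry is easy to verify numerically. A minimal sketch with a hypothetical standard deviation; the function name is illustrative.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / sqrt(n)

se_25 = standard_error(10.0, 25)      # 10 / 5
se_100 = standard_error(10.0, 100)    # 10 / 10
print(se_25, se_100)
```

Quadrupling the sample from 25 to 100 cuts the standard error from 2.0 to 1.0.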

Standardized mortality ratio

The ratio of observed to expected events used in indirect age-standardization. in the pathway →

On the pathway · 01 · Measurement · Measures of disease frequency

What frequency are you trying to measure?

the overall ideaMeasures of disease frequency
existing cases at a time pointPrevalence
new cases over follow-upIncidence
new cases as a proportion at riskCumulative incidence
new cases per unit follow-up timeIncidence rate
denominator of summed follow-upPerson-time
unadjusted rate in a populationCrude rate
comparing rates across populationsAge-standardization
observed versus expected deathsStandardized mortality ratio
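
Indirect standardization, as used for the standardized mortality ratio, applies reference rates to the study population's own person-time to get the expected count. A minimal sketch with hypothetical age strata; the function name and all numbers are illustrative.

```python
def standardized_mortality_ratio(observed_deaths, reference_rates, person_years):
    """Observed deaths divided by the deaths expected at reference rates."""
    expected = sum(rate * time for rate, time in zip(reference_rates, person_years))
    return observed_deaths / expected

reference_rates = [0.001, 0.01]     # deaths per person-year in the reference population, by age stratum
person_years = [10000, 2000]        # the study population's person-time in the same strata
smr = standardized_mortality_ratio(45, reference_rates, person_years)
print(smr)
```

The reference rates predict 30 deaths, so 45 observed deaths give a ratio of 1.5, or 50% more deaths than expected.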

Statistical programming and TFLs

Delivering analysis as pre-specified tables, figures, and listings, with credibility enforced by independent double-programming reconciled value by value. in the pathway →

On the pathway · § · Conduct it · Statistical programming: TFLs and double-programming QC

What situation?

the overall ideaStatistical programming and TFLs
the reported tables and figuresTFLs
independent reproduction for QCDouble-programming

Stochastic uncertainty

First-order random variation between otherwise identical individuals, the noise a microsimulation has to average out. in the pathway →

On the pathway · 05 · Decision rule · Types of uncertainty

Which source of uncertainty?

the overall ideaTypes of uncertainty
uncertainty in input estimatesParameter uncertainty
random variation between individualsStochastic uncertainty
uncertainty in model structureStructural uncertainty
finding where conclusions flipThreshold analysis

Stratified analysis

Controlling confounding by splitting data on the confounder, estimating within each stratum, and pooling the estimates. in the pathway →

On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)

Which stratified-analysis step are you at?

the overall approachStratified analysis
pooling across strataMantel-Haenszel estimator
combining stratum log effectsWoolf’s method
testing for effect modificationHomogeneity check
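
The split, estimate, and pool steps can be sketched with the Mantel-Haenszel odds ratio. The 2x2 cell layout, the function name, and the counts are assumptions made for illustration.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    Each stratum is a 2x2 table (a, b, c, d):
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical counts, split on a binary confounder.
strata = [(10, 20, 5, 40), (30, 10, 20, 20)]

# Step 1: estimate within each stratum.
stratum_ors = [(a * d) / (b * c) for a, b, c, d in strata]
# Step 2: pool across strata.
pooled_or = mantel_haenszel_or(strata)
print(stratum_ors, round(pooled_or, 3))
```

The pooled value lands between the stratum-specific odds ratios of 4 and 3. If the stratum estimates differed sharply, a homogeneity check would come before any pooling.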

Stratified randomization

Randomization that balances a few strong prognostic factors within strata. in the pathway →

On the pathway · 00 · Framing · Randomization and blinding

What allocation or masking concern?

the overall ideaRandomization and blinding
hide the upcoming assignmentAllocation concealment
balance arms in small chunksBlock randomization
balance within prognostic strataStratified randomization
dynamically balance many factorsMinimization
mask treatment after allocationBlinding

Stratified sampling

Splitting the frame into strata and sampling within each, allowing precise oversampling of a small subgroup at the cost of unequal selection probabilities. in the pathway →

On the pathway · 01 · Measurement · Survey sampling design

How do you draw the sample?

the overall ideaSurvey sampling design
every unit known nonzero chanceProbability sample
equal-chance draw from frameSimple random sampling
sample within population strataStratified sampling
sample whole groups togetherCluster sampling
sample in successive nested stagesMultistage sampling
variance inflation from clusteringDesign effect

Strength of recommendation

How firmly a guideline body is willing to speak, signaled by ACC/AHA class and level of evidence or GRADE’s strong-versus-conditional split; the strength should track the certainty of evidence. in the pathway →

On the pathway · 06 · Recommendation · Strength of recommendation

How strong is the recommendation?

the overall ideaStrength of recommendation

STROBE

The reporting checklist for observational studies. in the pathway →

On the pathway · 06 · Recommendation · Reporting standards

Which study type are you reporting?

the overall ideaReporting standards
randomized controlled trialCONSORT
observational studySTROBE
systematic reviewPRISMA
prediction model studyTRIPOD

Structural uncertainty

Uncertainty in a model’s own form, meaning which states exist and which functional form is assumed; often larger than parameter uncertainty yet routinely ignored. in the pathway →

On the pathway · 05 · Decision rule · Types of uncertainty

Which source of uncertainty?

the overall ideaTypes of uncertainty
uncertainty in input estimatesParameter uncertainty
random variation between individualsStochastic uncertainty
uncertainty in model structureStructural uncertainty
finding where conclusions flipThreshold analysis

Study biases, by rung

A family of biases mapped to the rung where each enters, spanning selection, information, confounding, synthesis, and screening biases. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

SUCRA

Surface under the cumulative ranking curve, summarizing a treatment’s rank where 100 percent is certainly best and 0 percent certainly worst. in the pathway →

On the pathway · 04 · Synthesis · Network meta-analysis

Which network meta-analysis concern?

the overall ideaNetwork meta-analysis
comparability across the networkTransitivity
direct versus indirect agreementNode-splitting
rank treatments overallSUCRA

Supervised and unsupervised learning

The split in machine learning by whether the data carry an outcome label. in the pathway →

On the pathway · 02 · Model · Supervised and unsupervised learning

Are outcome labels available?

the overall ideaSupervised and unsupervised learning
learn from labeled outcomesSupervised learning

Supervised learning

Learning to predict a known target label such as a diagnosis, cost, or survival time. in the pathway →

On the pathway · 02 · Model · Supervised and unsupervised learning

Are outcome labels available?

the overall ideaSupervised and unsupervised learning
learn from labeled outcomesSupervised learning

Support vector machine

A classifier finding the widest-margin boundary between classes, using a kernel to bend it nonlinearly. in the pathway →

On the pathway · 02 · Model · Learning algorithms and ensembles

Which learner or ensemble fits?

the overall ideaLearning algorithms and ensembles
single interpretable splitsDecision tree (machine learning)
classify by closest neighborsK-nearest neighbours
maximum-margin separating boundarySupport vector machine
average parallel bootstrapped modelsBagging
many decorrelated bagged treesRandom forest
sequentially correct prior errorsBoosting

Surrogate endpoint

A lab marker or scan standing in for a clinical outcome, trustworthy only once validated to capture the treatment’s effect on what patients feel. in the pathway →

On the pathway · 00 · Framing · Endpoint logic and pre-registration

Which endpoint or pre-specification concern?

the overall ideaEndpoint logic and pre-registration
the main pre-specified outcomePrimary endpoint
a stand-in for the outcomeSurrogate endpoint
criteria validating a surrogatePrentice’s criteria
lock analysis plan in advancePre-registration

Survey data

A probability sample built for population estimates that generalizes well once its weights and design are respected. in the pathway →

On the pathway · 01 · Measurement · Data sources and their tradeoffs

Which data source or pitfall?

the overall ideaData sources and their tradeoffs
clinical detail from care recordsElectronic health record data
billing records across encountersClaims data
enrolled cohorts for a conditionRegistries
sampled population questionnairesSurvey data
group-level inference pitfallEcological fallacy

Survey sampling design

The probability-sampling scheme by which a sample is drawn so it can generalize to the population. in the pathway →

On the pathway · 01 · Measurement · Survey sampling design

How do you draw the sample?

the overall ideaSurvey sampling design
every unit known nonzero chanceProbability sample
equal-chance draw from frameSimple random sampling
sample within population strataStratified sampling
sample whole groups togetherCluster sampling
sample in successive nested stagesMultistage sampling
variance inflation from clusteringDesign effect

Survey skip patterns

Branching where a gate question routes a respondent past inapplicable items, so a skipped item is blank by design rather than missing. in the pathway →

On the pathway · 01 · Measurement · Survey instruments: skip patterns and branching

Which skip-logic element is in play?

the overall ideaSurvey skip patterns
item routing later questionsGate question

Survey weight

The factor correcting for unequal selection so a sample represents the population it was drawn from. in the pathway →

On the pathway · 01 · Measurement · Complex-sample design and survey weighting

Which weighting or design adjustment?

the overall ideaComplex-sample design and survey weighting
scale respondents to the populationSurvey weight
precision lost to the designEffective sample size
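
A small sketch of how weights change an estimate and what they cost in precision. It assumes the Kish approximation for effective sample size, which may differ from the method taught on the pathway; the function names, values, and weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Mean in which each respondent counts in proportion to their weight."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def kish_effective_n(weights):
    """Kish approximation: (sum of weights)^2 / sum of squared weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical sample: four respondents from an oversampled group (weight 1)
# and two from an undersampled group (weight 4).
values = [10.0, 10.0, 10.0, 10.0, 20.0, 20.0]
weights = [1.0, 1.0, 1.0, 1.0, 4.0, 4.0]

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
n_eff = kish_effective_n(weights)
print(round(unweighted, 2), round(weighted, 2), n_eff)
```

The weighted mean moves toward the undersampled group, and six respondents carry roughly the information of four equally weighted ones under this approximation.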

Survival extrapolation for HTA

Fitting parametric or flexible models to observed survival and projecting beyond the trial horizon. in the pathway →

On the pathway · 05 · Decision rule · Real-world cost and HTA methods

Modeling skewed real-world costs for HTA?

valuing real-world costsReal-world cost and HTA methods
costs are zero-inflatedTwo-part and other cost models
extreme cost outliers existWinsorization and trimming of cost outliers
summarizing population spendPer-member-per-month costing (PMPM/PPPM)
trial ends before lifetimeSurvival extrapolation for HTA
value has multiple dimensionsMulti-criteria decision analysis (MCDA)
prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)

Survivorship bias

Bias from studying only the units that lasted long enough to be observed. in the pathway →

On the pathway · ∗ · Defend it · Study biases, by rung

Which bias is threatening the study?

the overall map of biasesStudy biases, by rung
who got sampled or enrolledSampling bias
distorted entry into the studySelection bias
selection among hospitalized patientsBerkson’s bias
conditioning on a common effectOver-adjustment
employed groups appear healthierHealthy-worker effect
only survivors are observedSurvivorship bias
differential loss to follow-upAttrition bias
nonresponders differ systematicallyNonresponse bias
a common cause of exposure and outcomeConfounding
treatment chosen by prognosisConfounding by indication
unequal outcome ascertainmentDetection bias
inaccurate recall of exposureRecall bias
interviewer shapes responsesInterviewer bias
earlier detection inflates survivalLead-time bias
slow cases preferentially detectedLength-time bias
detecting harmless diseaseOverdiagnosis
selective reporting of resultsPublication bias

SUTVA

The stable-unit-treatment-value assumption that one unit’s treatment does not affect another’s outcome, ruling out interference or spillover. in the pathway →

On the pathway · 02 · Model · Potential outcomes and identifiability

Which identifiability condition is at stake?

the overall frameworkPotential outcomes and identifiability
the counterfactual setupPotential-outcomes framework
only one outcome is observedFundamental problem of causal inference
treated and untreated comparableExchangeability
every covariate stratum has bothPositivity
observed equals counterfactual under treatmentConsistency
no interference, single versionSUTVA

Synthetic control

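A causal design that builds its comparison unit as a weighted combination of untreated donor units, so that the combination tracks the treated unit before the intervention and stands in for its counterfactual afterward. in the pathway →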

On the pathway · 02 · Model · Causal designs without randomization

Which quasi-experimental design fits?

the overall ideaCausal designs without randomization
before-after across exposed and controlDifference-in-differences
a haphazard nudge to exposureInstrumental variables
assignment by a cutoff thresholdRegression discontinuity
weighted donors build a counterfactualSynthetic control

Synthetic data

New records drawn from a generative model fit to real data, reproducing the joint distribution without copying individuals, needing privacy and fidelity audits. in the pathway → \[\frac{dx}{dt} = v(x,t)\] where \(x\) is the point being transported from the noise distribution toward the data distribution; \(t\) is time along the continuous path, running from 0 to 1; \(v(x,t)\) is the learned velocity field that moves \(x\) at each point and time.

On the pathway · § · Conduct it · Data privacy and security

Which rule or method?

the overall ideaData privacy and security
US health privacy lawHIPAA
EU data protection lawGDPR
de-identify by removing identifiersSafe Harbor
de-identify by statistical opinionExpert determination
generating artificial substitute recordsSynthetic data

T

T-test

A test comparing the mean of a continuous outcome between two groups; its equal-variance form is equivalent to a linear regression on a binary group indicator. in the pathway →

On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)

Which two-variable association are you testing?

the overall family of testsBivariate tests
mean across two groupsT-test
means across three or more groupsANOVA
ranks across two groupsMann-Whitney test
ranks across three or more groupsKruskal-Wallis test
two categorical variables, large countsChi-square test
two categorical variables, small countsFisher’s exact test
linear correlation of two continuousPearson correlation
monotonic correlation, rankedSpearman correlation
concordance-based rank correlationKendall’s tau
effect size for two meansCohen’s d
effect size for ANOVAEta-squared
effect size for categorical associationCramer’s V
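
The equivalence between the t-test and regression on a binary indicator holds for the equal-variance (pooled) form of the test. The sketch below computes both statistics by hand so the match can be seen; the data and function names are hypothetical.

```python
import math


def pooled_t(group0, group1):
    """Equal-variance two-sample t statistic for mean(group1) - mean(group0)."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))


def ols_slope_t(x, y):
    """t statistic for the slope in a simple least-squares regression of y on x."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return slope / slope_se


# Hypothetical outcome values in two groups.
group0 = [5.1, 4.8, 6.0, 5.5]
group1 = [6.2, 7.1, 6.8, 7.4, 6.5]

indicator = [0] * len(group0) + [1] * len(group1)
outcome = group0 + group1

t_pooled = pooled_t(group0, group1)
t_regression = ols_slope_t(indicator, outcome)
print(t_pooled, t_regression)
```

The two statistics agree up to floating-point rounding. The unequal-variance (Welch) version of the test does not share this exact equivalence.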

Target-trial emulation

Imagining the randomized trial you would have run, writing its protocol, then building the observational analysis to match it. in the pathway →

On the pathway · 00 · Framing · Target-trial emulation

Which target-trial element?

the overall ideaTarget-trial emulation
misaligned follow-up start creating biasImmortal time

Tau-squared

The between-study variance added to each study’s within-study variance before inverse-variance weighting in a random-effects meta-analysis, often estimated by DerSimonian-Laird. in the pathway →

On the pathway · 04 · Synthesis · Meta-analysis and pooling

Which pooling or heterogeneity tool?

the overall ideaMeta-analysis and pooling
one true effect assumedFixed-effect meta-analysis
effects vary across studiesRandom-effects meta-analysis
how much effects varyHeterogeneity
testing for heterogeneityCochran’s Q
proportion of variance from heterogeneityI-squared
between-study variance estimateTau-squared
range for a new studyPrediction interval
explaining heterogeneity by covariatesMeta-regression
visualizing small-study effectsFunnel plot
testing funnel asymmetryEgger’s test
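
A minimal sketch of the DerSimonian-Laird moment estimator of tau-squared and the random-effects weights that follow from it. Effect sizes, variances, and function names are hypothetical, and the sketch leaves out refinements such as confidence intervals for tau-squared.

```python
def dersimonian_laird_tau2(effects, variances):
    """Moment estimate of between-study variance, truncated at zero."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are needed")
    weights = [1.0 / v for v in variances]  # fixed-effect weights
    sum_w = sum(weights)
    pooled_fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    return max(0.0, (q - (k - 1)) / c)


def random_effects_mean(effects, variances, tau2):
    """Pooled mean with weights 1 / (within-study variance + tau2)."""
    weights = [1.0 / (v + tau2) for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


# Hypothetical study effects (for example, log odds ratios) and variances.
effects = [0.10, 0.30, 0.50, 0.70]
variances = [0.01, 0.01, 0.01, 0.01]

tau2 = dersimonian_laird_tau2(effects, variances)
pooled = random_effects_mean(effects, variances, tau2)
print(round(tau2, 4), round(pooled, 3))
```

Adding tau-squared to every study's variance flattens the weights, so large studies dominate less than they would under a fixed-effect model.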

TFLs

Tables, figures, and listings: the programmed outputs of an analysis, whose shells are pre-specified in the statistical analysis plan. in the pathway →

On the pathway · § · Conduct it · Statistical programming: TFLs and double-programming QC

What situation?

the overall ideaStatistical programming and TFLs
the reported tables and figuresTFLs
independent reproduction for QCDouble-programming

The evidence-recommendation gap

The distance between how firmly a guideline is worded and the actual support beneath it, whether an extrapolated threshold, a single trial, or mere expert consensus. in the pathway →

On the pathway · 06 · Recommendation · The evidence–recommendation gap

What recommendation situation are you in?

moving from evidence to adviceThe evidence-recommendation gap

The statistical analysis plan

The document pre-committing, before unblinding, exactly how the primary question will be answered, which is what makes a confirmatory analysis confirmatory. in the pathway →

On the pathway · § · Conduct it · The statistical analysis plan

What situation?

the overall ideaThe statistical analysis plan

The study protocol (SPIRIT)

The master plan every other document hangs from, covering objectives, eligibility, intervention, outcomes, sample size, analysis, ethics, and dissemination. in the pathway →

On the pathway · § · Conduct it · The study protocol (SPIRIT)

What situation?

the overall ideaThe study protocol (SPIRIT)
reporting standard for protocolsSPIRIT

Threshold analysis

An analysis finding the input value at which a decision flips. in the pathway →

On the pathway · 05 · Decision rule · Types of uncertainty

Which source of uncertainty?

the overall ideaTypes of uncertainty
uncertainty in input estimatesParameter uncertainty
random variation between individualsStochastic uncertainty
uncertainty in model structureStructural uncertainty
finding where conclusions flipThreshold analysis
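
One way to find the flip point is a root search on the quantity that drives the decision. The sketch below assumes a decision based on the sign of incremental net monetary benefit and varies a single input, price; the model, its parameter values, and the function names are all hypothetical.

```python
def incremental_nmb(price, wtp=50_000.0, qaly_gain=0.2, other_costs=2_000.0):
    """Incremental net monetary benefit of a hypothetical new treatment."""
    return wtp * qaly_gain - (price + other_costs)


def find_threshold(f, low, high, tol=1e-6):
    """Bisection search for the input value where f changes sign."""
    f_low, f_high = f(low), f(high)
    if f_low * f_high > 0:
        raise ValueError("the decision does not flip between low and high")
    while high - low > tol:
        mid = (low + high) / 2
        if f(mid) * f_low > 0:
            low = mid   # same sign as the lower end: the flip lies above mid
        else:
            high = mid  # sign changed (or hit zero): the flip lies at or below mid
    return (low + high) / 2


threshold_price = find_threshold(incremental_nmb, 0.0, 20_000.0)
print(round(threshold_price, 2))
```

Below the threshold price the treatment has positive net benefit in this toy model, and above it the decision reverses. Bisection assumes a single sign change over the search range.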

Thresholds and cut points

Turning a continuous risk or measurement into a yes/no action, a convenient but lossy choice that trades sensitivity against specificity and encodes a value judgment. in the pathway →

On the pathway · 05 · Decision rule · Thresholds and cut points

Where to set the decision cutoff?

the overall ideaThresholds and cut points

Time-varying confounding

When a confounder is itself affected by past treatment, breaking ordinary adjustment and requiring g-methods. in the pathway →

On the pathway · 02 · Model · Time-varying confounding and g-methods

How do you handle time-varying confounding?

the overall ideaTime-varying confounding
confounder both affects and respondsTreatment-confounder feedback
weight to remove time-varying confoundingMarginal structural model
model effect directly through timeG-estimation

TMLE

Targeted maximum likelihood estimation, a doubly-robust estimator combining a propensity and an outcome model. in the pathway →

On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)

How do you estimate the causal effect?

the overall ideaCausal estimators
model treatment assignment probabilityPropensity score
reweight by inverse treatment probabilityIPTW
model and average outcomesG-formula
combine outcome and treatment modelsDoubly-robust estimators
targeted machine-learning estimationTMLE

Traceability

The rule that every analysis value trace back through ADaM to its SDTM source and original case-report form. in the pathway →

On the pathway · 01 · Measurement · Assembling a clinical trial dataset

What are you building or tracing?

the overall ideaAssembling a clinical trial dataset
one row per subjectADSL
link result back to sourceTraceability

Transitivity

The assumption that trials are similar enough in populations and methods that an indirect comparison through a common comparator is valid. in the pathway →

On the pathway · 04 · Synthesis · Network meta-analysis

Which network meta-analysis concern?

the overall ideaNetwork meta-analysis
comparability across the networkTransitivity
direct versus indirect agreementNode-splitting
rank treatments overallSUCRA

Treatment-confounder feedback

When a confounder both responds to past treatment and guides the next, common in chronic-disease cohorts. in the pathway →

On the pathway · 02 · Model · Time-varying confounding and g-methods

How do you handle time-varying confounding?

the overall ideaTime-varying confounding
confounder both affects and respondsTreatment-confounder feedback
weight to remove time-varying confoundingMarginal structural model
model effect directly through timeG-estimation

Treatment-policy strategy

An intercurrent-event strategy that counts the outcome regardless of the event, in the intention-to-treat spirit. in the pathway →

On the pathway · 02 · Model · Trial estimands and intercurrent events

How do you handle intercurrent events?

the overall ideaTrial estimands and intercurrent events
events disrupting outcome interpretationIntercurrent events
ignore them, use assigned treatmentTreatment-policy strategy
imagine they did not occurHypothetical strategy
fold event into the outcomeComposite strategy
restrict to a defined subpopulationPrincipal-stratum strategy

Trial estimands and intercurrent events

A trial’s precise question, stated under the ICH E9(R1) framework, with a named strategy for events that occur after randomization. in the pathway →

On the pathway · 02 · Model · Trial estimands and intercurrent events

How do you handle intercurrent events?

the overall ideaTrial estimands and intercurrent events
events disrupting outcome interpretationIntercurrent events
ignore them, use assigned treatmentTreatment-policy strategy
imagine they did not occurHypothetical strategy
fold event into the outcomeComposite strategy
restrict to a defined subpopulationPrincipal-stratum strategy

TRIPOD

The reporting checklist for prediction models. in the pathway →

On the pathway · 06 · Recommendation · Reporting standards

Which study type are you reporting?

the overall ideaReporting standards
randomized controlled trialCONSORT
observational studySTROBE
systematic reviewPRISMA
prediction model studyTRIPOD

Two-part and other cost models

Models separating whether cost occurred from how much, plus robust GLMs for skewed cost data. in the pathway →

On the pathway · 05 · Decision rule · Real-world cost and HTA methods

Modeling skewed real-world costs for HTA?

valuing real-world costsReal-world cost and HTA methods
costs are zero-inflatedTwo-part and other cost models
extreme cost outliers existWinsorization and trimming of cost outliers
summarizing population spendPer-member-per-month costing (PMPM/PPPM)
trial ends before lifetimeSurvival extrapolation for HTA
value has multiple dimensionsMulti-criteria decision analysis (MCDA)
prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)

Type-I error

The false-positive rate, which unplanned peeking at accumulating data inflates. in the pathway →

On the pathway · 00 · Framing · Interim analyses and group-sequential design

Which interim-monitoring element?

the overall ideaInterim analyses and group-sequential design
committee reviewing accruing dataData safety monitoring board
planned looks with stopping rulesGroup-sequential design
false-positive risk to spendType-I error
stringent early-look boundaryO’Brien-Fleming boundary
constant nominal-level boundaryPocock boundary
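
The inflation from unplanned peeking can be seen in a small simulation under the null hypothesis. This sketch assumes a one-sample z-test with known unit variance, five evenly spaced looks, and the unadjusted 1.96 cutoff at every look; the function name, look schedule, and seed are illustrative choices.

```python
import math
import random


def simulate_peeking(n_trials=2000, looks=(20, 40, 60, 80, 100),
                     z_crit=1.96, seed=1):
    """Return (rejection rate at final look only, rejection rate at any look)."""
    rng = random.Random(seed)
    reject_final = 0
    reject_any = 0
    for _ in range(n_trials):
        total = 0.0
        n = 0
        crossed = False
        z = 0.0
        for look in looks:
            # Accrue null data (true mean 0, sd 1) up to this look.
            while n < look:
                total += rng.gauss(0.0, 1.0)
                n += 1
            z = total / math.sqrt(n)  # z statistic for the mean, known sd = 1
            if abs(z) > z_crit:
                crossed = True
        if abs(z) > z_crit:  # z now holds the final-look statistic
            reject_final += 1
        if crossed:
            reject_any += 1
    return reject_final / n_trials, reject_any / n_trials


rate_final, rate_any = simulate_peeking()
print(rate_final, rate_any)
```

Testing once at the end should reject near the nominal 5 percent, while stopping at the first nominally significant look rejects noticeably more often. Group-sequential boundaries such as O’Brien-Fleming or Pocock raise the cutoff at each look to bring the overall rate back to the planned level.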

Types of uncertainty

Naming the kinds of uncertainty (parameter, stochastic, heterogeneity, structural) because each needs different tools to handle honestly. in the pathway →

On the pathway · 05 · Decision rule · Types of uncertainty

Which source of uncertainty?

the overall ideaTypes of uncertainty
uncertainty in input estimatesParameter uncertainty
random variation between individualsStochastic uncertainty
uncertainty in model structureStructural uncertainty
finding where conclusions flipThreshold analysis

U

Uncertainty and inference

Reporting the range compatible with the data via confidence intervals, accounting for clustering, since statistical significance is not clinical importance. in the pathway →

On the pathway · 03 · Estimate · Uncertainty and inference

How to express estimate uncertainty?

the overall ideaUncertainty and inference
a plausible range for the estimateConfidence interval

Uncertainty in cost-effectiveness (PSA)

Methods showing how fragile an ICER is, from one-way and tornado analyses to probabilistic sensitivity analysis propagating parameter uncertainty through Monte Carlo simulation. in the pathway →

On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)

How are you handling cost-effectiveness uncertainty?

the overall ideaUncertainty in cost-effectiveness (PSA)
propagating parameter uncertaintyProbabilistic sensitivity analysis
plotting cost and effect differencesCost-effectiveness plane
probability of being cost-effectiveCost-effectiveness acceptability curve

Unsupervised learning

Finding structure in data with no outcome label, through clustering or dimensionality reduction. in the pathway →

On the pathway · 02 · Model · Unsupervised learning

What unlabeled-data structure are you finding?

the overall familyUnsupervised learning
grouping similar observationsClustering
nested grouping by linkageHierarchical clustering
partitioning into k groupsK-means
reducing the number of featuresDimensionality reduction
orthogonal variance componentsPrincipal component analysis

V

Validity

Whether an instrument measures what it claims, through content, construct, and criterion validity. in the pathway →

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa

Value of information (EVPI)

Pricing the decision uncertainty that remains, where the expected value of perfect information is the expected loss from deciding under current uncertainty. in the pathway →

On the pathway · 05 · Decision rule · Value of information (EVPI)

What is more evidence worth?

the overall ideaValue of information (EVPI)
value of removing all uncertaintyEVPI
value of a specific future studyExpected value of sample information
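
A minimal sketch of EVPI computed from probabilistic sensitivity analysis draws: the average of the best net benefit in each draw, minus the net benefit of the strategy that is best on average. The three draws, two strategies, and the function name are hypothetical.

```python
def evpi(net_benefit_draws):
    """EVPI from PSA output: rows are draws, columns are strategies."""
    n_draws = len(net_benefit_draws)
    n_strategies = len(net_benefit_draws[0])
    mean_by_strategy = [
        sum(row[j] for row in net_benefit_draws) / n_draws
        for j in range(n_strategies)
    ]
    # Decide now: commit to the strategy with the best expected net benefit.
    value_current_info = max(mean_by_strategy)
    # Decide with perfect information: pick the best strategy in every draw.
    value_perfect_info = sum(max(row) for row in net_benefit_draws) / n_draws
    return value_perfect_info - value_current_info


# Hypothetical net benefit for strategies A and B in three PSA draws.
draws = [(10.0, 5.0), (2.0, 8.0), (6.0, 7.0)]
evpi_value = evpi(draws)
print(round(evpi_value, 3))
```

Strategy B is best on average here, but A wins in the first draw; the EVPI is the average gain from never being caught on the wrong side. Real analyses use thousands of draws and scale the per-person value to the affected population.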

Variance inflation factor

A diagnostic for multicollinearity among predictors. in the pathway →

On the pathway · 02 · Model · Checking model assumptions

Which model assumption to check?

the overall ideaChecking model assumptions
non-constant residual varianceHeteroscedasticity
predictors too collinearVariance inflation factor
single points driving the fitCook’s distance
hazard ratio constant over timeProportional hazards
test that proportionality formallySchoenfeld residuals
fix variance without refittingRobust standard errors

Verification

Checking whether a model is coded correctly, that the implementation does the math intended. in the pathway →

On the pathway · 05 · Decision rule · Model validation and calibration

What situation?

the overall ideaModel validation and calibration
tuning model outputs to realityCalibration (modeling)
confirming the model runs correctlyVerification

Verification bias

Bias arising when only test-positive patients go on to receive the reference standard. in the pathway →

On the pathway · 05 · Decision rule · Diagnostic-accuracy studies

Which accuracy measure or pitfall?

the overall ideaDiagnostic-accuracy studies
how results shift disease oddsLikelihood ratios
index test informs reference standardIncorporation bias
only some get the reference standardVerification bias
unrepresentative case mixSpectrum bias

W

Weakly-informative prior

A prior that gently regularizes without committing to much. in the pathway →

On the pathway · 02 · Model · Choosing a prior

What kind of prior do you need?

the overall choiceChoosing a prior
prior matched to the likelihoodConjugate prior
strong external informationInformative prior
light regularizing informationWeakly-informative prior

Weighted kappa

A kappa that credits near-misses on an ordinal scale. in the pathway →

On the pathway · 01 · Measurement · Reliability and validity

Which measurement property are you assessing?

the overall ideaReliability and validity
consistency of measurementReliability
measuring the intended constructValidity
internal consistency of scale itemsCronbach’s alpha
agreement on continuous measuresIntraclass correlation
plot method agreement and biasBland-Altman plot
categorical agreement, two ratersCohen’s kappa
ordered-category agreement, two ratersWeighted kappa
categorical agreement, many ratersFleiss’ kappa

Willingness-to-pay threshold

The benchmark amount a payer will pay per unit of benefit, against which an incremental cost-effectiveness ratio is judged. in the pathway →

On the pathway · 05 · Decision rule · Cost-effectiveness and the ICER

Which economic-evaluation framing fits?

the overall ideaCost-effectiveness and the ICER
extra cost per extra effectICER
value costs and benefits in moneyCost-benefit analysis
effects identical, compare costs onlyCost-minimization analysis
effects in quality-adjusted life yearsCost-utility analysis
value at a willingness thresholdNet monetary benefit
maximum payable per unit benefitWillingness-to-pay threshold
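
The comparison against a threshold can be written either as an ICER check or as net monetary benefit. The sketch below uses hypothetical incremental costs, QALYs, and a threshold of 50,000 per QALY chosen only for illustration.

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the effect difference is zero")
    return delta_cost / delta_effect


def net_monetary_benefit(delta_cost, delta_effect, wtp):
    """Incremental NMB: effect valued at the threshold, minus extra cost."""
    return wtp * delta_effect - delta_cost


# Hypothetical increments for a new treatment versus its comparator.
delta_cost = 12_000.0
delta_qaly = 0.3
wtp = 50_000.0  # assumed willingness-to-pay per QALY

ratio = icer(delta_cost, delta_qaly)
nmb = net_monetary_benefit(delta_cost, delta_qaly, wtp)
print(round(ratio), round(nmb))
```

When the effect difference is positive, an ICER below the threshold and a positive net monetary benefit are the same statement. Net monetary benefit avoids the sign ambiguities a ratio has when the new option is less effective.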

Winsorization and trimming of cost outliers

Capping or dropping extreme cost values so a few catastrophic claims do not dominate the mean. in the pathway →

On the pathway · 05 · Decision rule · Real-world cost and HTA methods

Modeling skewed real-world costs for HTA?

valuing real-world costsReal-world cost and HTA methods
costs are zero-inflatedTwo-part and other cost models
extreme cost outliers existWinsorization and trimming of cost outliers
summarizing population spendPer-member-per-month costing (PMPM/PPPM)
trial ends before lifetimeSurvival extrapolation for HTA
value has multiple dimensionsMulti-criteria decision analysis (MCDA)
prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
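
The difference between capping and dropping can be shown on a handful of costs. The cap is passed in directly here as an assumption; in practice it would usually be a pre-specified percentile. Function names and cost values are hypothetical.

```python
def winsorize_upper(values, cap):
    """Replace every value above the cap with the cap itself."""
    return [min(v, cap) for v in values]


def trim_upper(values, cap):
    """Drop every value above the cap."""
    return [v for v in values if v <= cap]


def mean(values):
    return sum(values) / len(values)


# Hypothetical per-patient costs with one catastrophic claim.
costs = [100.0, 200.0, 300.0, 400.0, 1_000_000.0]
cap = 400.0  # assumed cap, for illustration only

raw_mean = mean(costs)
winsorized_mean = mean(winsorize_upper(costs, cap))
trimmed_mean = mean(trim_upper(costs, cap))
print(raw_mean, winsorized_mean, trimmed_mean)
```

Winsorizing keeps the patient in the denominator while trimming removes them. Both lower the mean, and both discard real spending, so the choice and the cap belong in the analysis plan.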

Woolf’s method

A method for pooling stratum-specific association estimates by taking an inverse-variance weighted average of the stratum log odds ratios. in the pathway →

On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)

Which stratified-analysis step are you at?

the overall approachStratified analysis
pooling across strataMantel-Haenszel estimator
combining stratum log effectsWoolf’s method
testing for effect modificationHomogeneity check
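
Woolf’s method is commonly described as an inverse-variance weighted average of stratum log odds ratios. The sketch below follows that description; the 2x2 cell layout, the function name, and the counts are hypothetical, and it assumes no zero cells.

```python
import math


def woolf_pooled_or(strata):
    """Return (pooled odds ratio, standard error of the pooled log odds ratio).

    Each stratum is a 2x2 table (a, b, c, d):
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for a, b, c, d in strata:
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d  # variance of the log odds ratio
        weight = 1 / variance
        weighted_sum += weight * log_or
        total_weight += weight
    pooled_log_or = weighted_sum / total_weight
    return math.exp(pooled_log_or), math.sqrt(1 / total_weight)


# Hypothetical counts, split on a binary confounder.
strata = [(10, 20, 5, 40), (30, 10, 20, 20)]
pooled_or, se_log_or = woolf_pooled_or(strata)
print(round(pooled_or, 2), round(se_log_or, 3))
```

The pooled value falls between the stratum odds ratios of 4 and 3. Because the weights depend on cell reciprocals, sparse strata make this approach unstable, which is one reason the Mantel-Haenszel estimator is often preferred.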

Z

Zero-inflated model

A count model mixing a structural-zero process with a count process when zeros pile up. in the pathway →

On the pathway · 02 · Model · Regression families

Which regression for your outcome?

the overall ideaRegression families
unifying exponential-family frameworkGLM
continuous outcomeLinear regression
binary outcomeLogistic regression
count outcomePoisson regression
overdispersed countsNegative binomial regression
excess zeros in countsZero-inflated model
separate zero and positive partsHurdle model
correlated or clustered outcomesGEE
repeated measures over timeMMRM
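
The mixture behind a zero-inflated model can be written out as a probability mass function. This sketch assumes a Poisson count process; the parameter names `pi_zero` and `lam` and their values are hypothetical.

```python
import math


def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson probability of observing count k.

    pi_zero: probability of a structural zero
    lam:     Poisson mean of the count process
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from either process.
        return pi_zero + (1 - pi_zero) * poisson
    return (1 - pi_zero) * poisson


pi_zero = 0.3
lam = 2.0

p_zero_zip = zip_pmf(0, pi_zero, lam)
p_zero_poisson = math.exp(-lam)
total = sum(zip_pmf(k, pi_zero, lam) for k in range(60))
print(round(p_zero_zip, 4), round(p_zero_poisson, 4), round(total, 6))
```

With these values the model puts far more mass at zero than a plain Poisson with the same mean parameter, which is the pile-up the definition describes. A hurdle model differs in that all zeros come from one process.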