Glossary
From Data to Bedside · every term in the pathway, defined and linked
This glossary indexes every concept in the pathway. Each entry gives a one-line definition and links to the pathway node where the term is taught in full; on the pathway, the term links back here. Terms are listed alphabetically.
- 1-inpatient / 2-outpatient rule
-
Counting a case from one inpatient diagnosis or two outpatient diagnoses on separate dates to filter out rule-out codes. in the pathway →
On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation · Is your outcome a validated algorithm or an unchecked code rule?
- the overall idea: Outcome phenotyping and validation
- the rule itself: Claims/EHR phenotype algorithm
- common coding rule: 1-inpatient / 2-outpatient rule
- tuning the rule: Algorithm validation (PPV and sensitivity tradeoff)
- reference standard: Endpoint adjudication and chart review
- bundled outcomes: Composite endpoint construction
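The rule can be sketched as a small function. This is a minimal illustration, assuming each claim arrives as a `(setting, service_date)` pair for one patient and one diagnosis; the function name and field values are hypothetical, and real algorithms often add a minimum gap or a maximum window between the two outpatient dates, which is not modeled here.

```python
from datetime import date

def meets_case_rule(claims):
    """Apply the 1-inpatient / 2-outpatient rule to one patient's claims.

    `claims` is a list of (setting, service_date) tuples, where setting is
    "inpatient" or "outpatient" (hypothetical labels for this sketch).
    """
    # A single inpatient diagnosis qualifies on its own.
    if any(setting == "inpatient" for setting, _ in claims):
        return True
    # Otherwise require outpatient diagnoses on at least two separate dates.
    outpatient_dates = {day for setting, day in claims if setting == "outpatient"}
    return len(outpatient_dates) >= 2

# Two outpatient codes on the same day look like a rule-out workup.
same_day = [("outpatient", date(2023, 1, 5)), ("outpatient", date(2023, 1, 5))]
two_days = [("outpatient", date(2023, 1, 5)), ("outpatient", date(2023, 3, 9))]
admitted = [("inpatient", date(2023, 2, 1))]

print(meets_case_rule(same_day), meets_case_rule(two_days), meets_case_rule(admitted))
# prints: False True True
```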
- 3+3 design
-
A classic rule-based phase I design that escalates the dose in cohorts of three patients until dose-limiting toxicity appears. in the pathway →
On the pathway · 00 · Framing · Dose-finding and early-phase designs · Which early-phase design question are you facing?
- the overall design family: Dose-finding and early-phase designs
- dose escalation by fixed cohort rule: 3+3 design
- model-based dose escalation: Continual reassessment method
- the highest acceptably safe dose: Maximum tolerated dose
- phase II screening for efficacy: Simon’s two-stage design
A
- Absolute risk reduction
-
The absolute difference in risk between groups; its reciprocal is the number needed to treat. in the pathway →
On the pathway · 03 · Estimate · Relative versus absoluteWhich scale frames the effect?
- the overall ideaRelative versus absolute
- effect on the absolute scaleAbsolute risk reduction
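As a worked illustration of the reciprocal relationship, the sketch below computes the absolute risk reduction and the number needed to treat from two risks. The risks are made-up values and the function names are hypothetical.

```python
def absolute_risk_reduction(risk_control, risk_treated):
    """Difference in risk between the control and treated groups."""
    return risk_control - risk_treated

def number_needed_to_treat(arr):
    """Reciprocal of the absolute risk reduction."""
    if arr == 0:
        raise ValueError("NNT is undefined when the risks are equal")
    return 1 / arr

# Hypothetical risks: 20% of controls and 15% of treated patients have the event.
arr = absolute_risk_reduction(0.20, 0.15)
nnt = number_needed_to_treat(arr)
print(f"ARR = {arr:.2f}, NNT = {nnt:.0f}")
```

With these inputs the risk falls by five percentage points, so about 20 patients must be treated to prevent one event.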
- Accelerated failure time (AFT) models
-
Parametric models of log survival time that yield a time ratio, offering an alternative when proportional hazards fails. in the pathway →
On the pathway · 03 · Estimate · Competing risks and parametric survivalDoes a competing event block the outcome?
- modeling time to eventCompeting risks and survival models
- a competing event existsCompeting risks
- you want etiologyCause-specific hazard
- you want absolute riskCumulative incidence function (CIF)
- modeling absolute risk directlyFine-Gray subdistribution hazard
- proportional hazards failsAccelerated failure time (AFT) models
- some patients are curedCure models
- Active-comparator new-user design
-
Restricts to initiators of a treatment versus an active alternative, curbing confounding by indication and prevalent-user and immortal-time distortions. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- ADaM
-
A CDISC standard for analysis-ready clinical trial datasets derived from SDTM. in the pathway →
On the pathway · 01 · Measurement · Data standards and provenanceWhich data standard or provenance layer?
- the overall idea of standards and provenanceData standards and provenance
- clinical coding terminology for findingsSNOMED
- regulatory model for collected trial dataCDISC SDTM
- analysis-ready dataset standardADaM
- ADSL
-
A subject-level ADaM dataset with one row per trial participant. in the pathway →
On the pathway · 01 · Measurement · Assembling a clinical trial datasetWhat are you building or tracing?
- the overall ideaAssembling a clinical trial dataset
- one row per subjectADSL
- link result back to sourceTraceability
- Age-standardization
-
Adjusting rates to a standard population so comparisons are not confounded by differing age structures, done directly or indirectly. in the pathway →
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
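Direct standardization can be sketched as a weighted average of age-specific rates, with weights taken from a standard population. All numbers below are invented for illustration, and the function name is hypothetical.

```python
def weighted_rate(stratum_rates, weights):
    """Average the stratum-specific rates using the given population weights."""
    total = sum(weights)
    return sum(rate * w for rate, w in zip(stratum_rates, weights)) / total

# Hypothetical rates per 1,000 person-years in three age bands (young, middle, old).
# Both populations have identical age-specific rates.
rates_a = [1.0, 5.0, 20.0]
rates_b = [1.0, 5.0, 20.0]

# Population A is young, population B is old.
pop_a = [700, 200, 100]
pop_b = [200, 300, 500]
standard = [500, 300, 200]  # assumed standard population

# Crude rates weight each population by its own age structure.
crude_a = weighted_rate(rates_a, pop_a)
crude_b = weighted_rate(rates_b, pop_b)

# Directly standardized rates weight both by the same standard population.
std_a = weighted_rate(rates_a, standard)
std_b = weighted_rate(rates_b, standard)

print(crude_a, crude_b)  # the crude rates differ only because of age structure
print(std_a, std_b)      # the standardized rates agree
```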
- AIC
-
An information criterion trading goodness of fit against the number of parameters to compare candidate models, including non-nested ones; lower is better. in the pathway →
On the pathway · 03 · Estimate · Model fit, comparison, and prediction errorWhich fit or error measure do you need?
- the overall ideaModel fit, comparison, and prediction error
- variance explained, linear modelR-squared
- pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
- compare models, penalize parametersAIC
- compare models, penalize more heavilyBIC
- prediction error, average magnitudeMean absolute error
- prediction error, penalize large missesRMSE
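The two information criteria in this list can be computed from a fitted model's maximized log-likelihood. The sketch below uses the standard formulas; the log-likelihoods, parameter counts, and sample size are made-up values chosen so the two criteria disagree.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits to the same 400 observations.
small = {"loglik": -520.0, "k": 3}
large = {"loglik": -516.0, "k": 6}
n = 400

aic_small, aic_large = aic(small["loglik"], small["k"]), aic(large["loglik"], large["k"])
bic_small, bic_large = bic(small["loglik"], small["k"], n), bic(large["loglik"], large["k"], n)

print(aic_small, aic_large)  # AIC prefers the larger model here
print(bic_small, bic_large)  # BIC, with its heavier penalty, prefers the smaller one
```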
- Algorithm validation (PPV and sensitivity tradeoff)
-
Tightening a rule raises positive predictive value but lowers sensitivity, and vice versa. in the pathway →
On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validation · Is your outcome a validated algorithm or an unchecked code rule?
- the overall idea: Outcome phenotyping and validation
- the rule itself: Claims/EHR phenotype algorithm
- common coding rule: 1-inpatient / 2-outpatient rule
- tuning the rule: Algorithm validation (PPV and sensitivity tradeoff)
- reference standard: Endpoint adjudication and chart review
- bundled outcomes: Composite endpoint construction
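The tradeoff can be seen by scoring a loose and a strict rule against the same chart-reviewed cases. The counts below are invented for illustration; both rules are judged against the same 100 true cases.

```python
def positive_predictive_value(true_pos, false_pos):
    """Share of algorithm-flagged patients who are confirmed cases."""
    return true_pos / (true_pos + false_pos)

def sensitivity(true_pos, false_neg):
    """Share of confirmed cases that the algorithm flags."""
    return true_pos / (true_pos + false_neg)

# Hypothetical chart-review results for two versions of a code rule.
loose = {"tp": 90, "fp": 60, "fn": 10}   # e.g. any single code
strict = {"tp": 70, "fp": 10, "fn": 30}  # e.g. repeated codes required

for name, r in (("loose", loose), ("strict", strict)):
    ppv = positive_predictive_value(r["tp"], r["fp"])
    sens = sensitivity(r["tp"], r["fn"])
    print(f"{name}: PPV = {ppv:.3f}, sensitivity = {sens:.3f}")
```

Tightening the rule in this example lifts PPV from 0.600 to 0.875 while sensitivity drops from 0.900 to 0.700.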
- Allocation concealment
-
A safeguard ensuring the next treatment assignment cannot be foreseen and gamed. in the pathway →
On the pathway · 00 · Framing · Randomization and blindingWhat allocation or masking concern?
- the overall ideaRandomization and blinding
- hide the upcoming assignmentAllocation concealment
- balance arms in small chunksBlock randomization
- balance within prognostic strataStratified randomization
- dynamically balance many factorsMinimization
- mask treatment after allocationBlinding
- Analysis populations
-
Who counts in the analysis, contrasting intention-to-treat, per-protocol, and as-treated, itself a choice of estimand. in the pathway →
On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)Which set of subjects do you analyze?
- the overall ideaAnalysis populations
- as randomized, regardless of adherenceIntention-to-treat
- only those who followed protocolPer-protocol
- grouped by treatment actually receivedAs-treated
- ANOVA
-
Analysis of variance, extending the t-test to compare a continuous outcome across more than two groups. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- As-treated
-
Analyzing patients by the treatment they actually received. in the pathway →
On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)Which set of subjects do you analyze?
- the overall ideaAnalysis populations
- as randomized, regardless of adherenceIntention-to-treat
- only those who followed protocolPer-protocol
- grouped by treatment actually receivedAs-treated
- Assay sensitivity
-
The assumption that a trial could have detected a real difference had one existed, since a sloppy trial looks non-inferior. in the pathway →
On the pathway · 00 · Framing · Non-inferiority and equivalenceWhat are you trying to show?
- the overall ideaNon-inferiority and equivalence
- new is not meaningfully worseNon-inferiority trial
- new is neither worse nor betterEquivalence trial
- how much worse is tolerableNon-inferiority margin
- trial can detect a real differenceAssay sensitivity
- Assembling a clinical trial dataset
-
Standardizing trial data from case-report forms through CDISC SDTM into ADaM analysis datasets, governed by traceability. in the pathway →
On the pathway · 01 · Measurement · Assembling a clinical trial datasetWhat are you building or tracing?
- the overall ideaAssembling a clinical trial dataset
- one row per subjectADSL
- link result back to sourceTraceability
- Assembling the analytic cohort
-
Turning a research database into one analysis-ready table via extract-transform-load, fixing its grain and time structure. in the pathway →
On the pathway · 01 · Measurement · Assembling the analytic cohortWhich cohort-construction step?
- the overall ideaAssembling the analytic cohort
- pull and reshape raw source dataExtract-transform-load
- set the time-zero anchorIndex date
- define pre-index covariate historyLookback window
- restrict to treatment initiatorsNew-user design
- ATC and defined daily dose (DDD)
-
WHO classification grouping drugs by therapeutic class, paired with a standard daily dose unit for comparable utilization. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- ATE
-
The average treatment effect, the contrast of potential outcomes over everyone. in the pathway →
\[\text{ATE} = E[Y(1) - Y(0)]\]
where \(\text{ATE}\) is the average treatment effect over the whole population; \(Y(1)\) is the outcome a unit would have under treatment; \(Y(0)\) is the outcome the same unit would have under no treatment.
On the pathway · 02 · Model · Choosing the estimandWhose causal effect do you target?
- the overall ideaChoosing the estimand
- the formal target quantityEstimand
- effect across the whole populationATE
- effect among the treatedATT
- effect among compliers onlyLATE
- ATT
-
The average treatment effect on the treated, the contrast of potential outcomes among treated units. in the pathway →
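In the potential-outcomes notation used in the ATE entry, the definition can be written as follows; the treatment indicator \(A\) is notation assumed here rather than taken from the pathway.

\[\text{ATT} = E[Y(1) - Y(0) \mid A = 1]\]

where \(\text{ATT}\) is the average treatment effect on the treated; \(Y(1)\) and \(Y(0)\) are the outcomes a unit would have under treatment and under no treatment; \(A = 1\) restricts the average to units that actually received treatment.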
On the pathway · 02 · Model · Choosing the estimandWhose causal effect do you target?
- the overall ideaChoosing the estimand
- the formal target quantityEstimand
- effect across the whole populationATE
- effect among the treatedATT
- effect among compliers onlyLATE
- Attributable risk and population attributable fraction (PAF)
-
Attributable risk is the excess risk among the exposed; PAF is the share of population cases removable by eliminating the exposure. in the pathway →
On the pathway · 02 · Model · Sparse data and resamplingAre cells sparse or analytic standard errors doubtful?
- the overall familySparse data and resampling
- separation or small samplesFirth penalized regression
- very sparse, exact inferenceExact logistic regression
- no clean closed-form varianceBootstrap and resampling methods
- public-health impact measuresAttributable risk and population attributable fraction (PAF)
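Both measures follow directly from group risks. The sketch below uses invented risks and an invented exposure prevalence, and it assumes the exposure-outcome association is causal and unconfounded, which real data rarely guarantee.

```python
def attributable_risk(risk_exposed, risk_unexposed):
    """Excess risk among the exposed."""
    return risk_exposed - risk_unexposed

def population_attributable_fraction(risk_population, risk_unexposed):
    """Share of population risk that would vanish if everyone were unexposed."""
    return (risk_population - risk_unexposed) / risk_population

# Hypothetical inputs: 25% of the population is exposed.
prevalence = 0.25
risk_exposed, risk_unexposed = 0.30, 0.10

# Overall risk is the prevalence-weighted mix of the two group risks.
risk_population = prevalence * risk_exposed + (1 - prevalence) * risk_unexposed

ar = attributable_risk(risk_exposed, risk_unexposed)
paf = population_attributable_fraction(risk_population, risk_unexposed)
print(f"AR = {ar:.2f}, PAF = {paf:.3f}")
```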
- Attrition bias
-
Bias from differential loss to follow-up over time between groups. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- AUC
-
The probability that a randomly chosen case gets a higher predicted risk than a randomly chosen non-case, where 0.5 is chance and 1 is perfect ranking. in the pathway →
On the pathway · 03 · Estimate · Calibration versus discriminationWhich aspect of predictive performance?
- the overall ideaCalibration versus discrimination
- ranking cases above non-casesDiscrimination
- summarizing ranking across thresholdsAUC
- predicted risks match observedCalibration
- testing calibration formallyHosmer-Lemeshow statistic
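The probabilistic definition of the AUC can be computed literally by comparing every case with every non-case. This is a minimal sketch with made-up predicted risks; counting ties as half a win is a common convention assumed here.

```python
def auc(case_scores, noncase_scores):
    """Fraction of case/non-case pairs in which the case has the higher score."""
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # assumed convention: a tie counts as half
    return wins / (len(case_scores) * len(noncase_scores))

# Hypothetical predicted risks.
cases = [0.9, 0.7, 0.4]
noncases = [0.6, 0.4, 0.2, 0.1]

print(auc(cases, noncases))  # 10.5 winning pairs out of 12 -> 0.875
```

The double loop is quadratic in the sample size; it is meant to mirror the definition, not to be efficient.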
B
- Back-door criterion
-
The rule that reads the adjustment set straight off a causal diagram. in the pathway →
On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworksWhich causal-diagram concept?
- the overall frameworkCausal diagrams
- the graph notation itselfDAG
- common cause of exposure and outcomeConfounder
- variable on the causal pathMediator
- common effect, conditioning opens biasCollider
- rule for sufficient adjustment setsBack-door criterion
- Bagging
-
Training many trees on bootstrap resamples and averaging them to lower variance, the basis of the random forest. in the pathway →
On the pathway · 02 · Model · Learning algorithms and ensemblesWhich learner or ensemble fits?
- the overall ideaLearning algorithms and ensembles
- single interpretable splitsDecision tree (machine learning)
- classify by closest neighborsK-nearest neighbours
- maximum-margin separating boundarySupport vector machine
- average parallel bootstrapped modelsBagging
- many decorrelated bagged treesRandom forest
- sequentially correct prior errorsBoosting
- Bayes’ theorem
-
The rule that the posterior is proportional to the likelihood times the prior. in the pathway →
\[\text{posterior} \propto \text{likelihood} \times \text{prior}\]
where \(\text{posterior}\) is the updated distribution of the parameter after seeing the data; \(\text{likelihood}\) is what the data say about the parameter; \(\text{prior}\) is what you believed about the parameter before seeing the data.
On the pathway · 02 · Model · Bayesian inferenceWhich Bayesian concept is in play?
- the overall frameworkBayesian inference
- the updating rule itselfBayes’ theorem
- beliefs after seeing dataPosterior distribution
- interval summary of the posteriorCredible interval
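The updating rule can be illustrated on a coarse grid of candidate values for a proportion. This sketch assumes a binomial likelihood and a flat prior over five candidate values; the grid, the data, and the function name are all hypothetical.

```python
def grid_posterior(successes, trials, grid, prior):
    """Posterior over a grid of candidate proportions: likelihood times prior, normalized."""
    failures = trials - successes
    # The binomial coefficient is the same for every candidate, so it cancels out.
    unnormalized = [
        (p ** successes) * ((1 - p) ** failures) * weight
        for p, weight in zip(grid, prior)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

grid = [0.1, 0.3, 0.5, 0.7, 0.9]   # candidate values of the proportion
prior = [0.2] * len(grid)          # flat prior: every candidate equally plausible

# Hypothetical data: 7 successes in 10 trials.
posterior = grid_posterior(7, 10, grid, prior)
for p, prob in zip(grid, posterior):
    print(f"p = {p}: posterior = {prob:.3f}")
```

After seeing the data, most of the posterior mass sits on 0.7, the candidate closest to the observed proportion.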
- Bayesian computation
-
Exploring posteriors with no closed form by simulation, principally Markov chain Monte Carlo. in the pathway →
On the pathway · 02 · Model · Bayesian computation (MCMC)Which sampling or diagnostic tool?
- the overall ideaBayesian computation
- sampling the posterior generallyMCMC
- proposal-and-accept samplerMetropolis-Hastings
- sample each parameter conditionallyGibbs sampling
- gradient-guided efficient samplerHamiltonian Monte Carlo
- check chains have convergedR-hat
- check model reproduces the dataPosterior predictive check
- Bayesian inference
-
Treating the parameter as a random quantity with a distribution the data update, yielding a posterior summarized by a credible interval. in the pathway →
On the pathway · 02 · Model · Bayesian inferenceWhich Bayesian concept is in play?
- the overall frameworkBayesian inference
- the updating rule itselfBayes’ theorem
- beliefs after seeing dataPosterior distribution
- interval summary of the posteriorCredible interval
- Belmont principles
-
The three principles underpinning research ethics: respect for persons, beneficence, and justice. in the pathway →
On the pathway · § · Conduct it · Research ethics and the IRBWhich ethics concept or body?
- the overall ideaResearch ethics and the IRB
- foundational ethical principlesBelmont principles
- genuine uncertainty justifying a trialClinical equipoise
- participant’s voluntary agreementInformed consent
- body that reviews and approves studiesInstitutional review board
- Benjamini-Hochberg
-
A procedure controlling the false-discovery rate among rejected hypotheses. in the pathway →
On the pathway · 02 · Model · Multiplicity controlHow do you control multiple testing?
- the overall ideaMultiplicity control
- bound any false positiveFamily-wise error rate
- bound false positives among rejectionsFalse-discovery rate
- simple conservative FWER divisorBonferroni correction
- stepwise FWER controlHolm’s procedure
- step-up FDR controlBenjamini-Hochberg
- test hypotheses in ordered familiesGatekeeping procedure
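The step-up procedure can be sketched in a few lines. The p-values below are invented, and the function returns rejection decisions in the original input order.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest p upward

    # Find the largest rank k whose p-value is at or below k * q / m.
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_passing_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            rejected[idx] = True
    return rejected

# Hypothetical p-values, deliberately unsorted.
p = [0.5, 0.001, 0.035, 0.025, 0.028]
print(benjamini_hochberg(p))
# prints: [False, True, True, True, True]
```

Note that 0.025 is rejected even though it exceeds its own threshold of 0.02, because a larger p-value further up the ranking passes; that is what "step-up" means.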
- Berkson’s bias
-
The spurious association produced by conditioning on hospital admission. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Bias quantification
-
Putting a number on how much unmeasured confounding it would take to overturn a result, as one pre-specified sensitivity analysis. in the pathway →
On the pathway · ∗ · Defend it · Bias quantificationHow do you quantify unmeasured bias?
- the overall ideaBias quantification
- strength needed to explain awayE-value
- hidden bias in matched designsRosenbaum bounds
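For the E-value listed above, a commonly cited closed form for a risk-ratio point estimate is RR + sqrt(RR × (RR − 1)), with protective estimates inverted first. The sketch below applies that formula; the function name and the input values are illustrative, and applying it to odds or hazard ratios needs further approximations not covered here.

```python
import math

def e_value(risk_ratio):
    """E-value for a risk-ratio point estimate."""
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective estimates are inverted so the formula always sees RR >= 1.
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))

print(round(e_value(2.0), 3))  # 2 + sqrt(2), about 3.414
print(round(e_value(0.5), 3))  # same strength of confounding needed
```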
- Bias-variance and regularization
-
The tradeoff between a model too simple to capture signal and one flexible enough to chase noise, managed by regularization. in the pathway →
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
- Bias-variance tradeoff
-
The balance where a too-simple model underfits and a too-flexible model overfits, with expected prediction error equal to squared bias plus variance plus irreducible noise. in the pathway →
\[\text{expected prediction error} = \text{bias}^2 + \text{variance} + \text{irreducible noise}\]
where \(\text{expected prediction error}\) is the average error on new data; \(\text{bias}^2\) is the squared error from a model too simple to capture the signal; \(\text{variance}\) is the error from a model flexible enough to chase noise; \(\text{irreducible noise}\) is the variation no model can remove.
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
- BIC
-
An information criterion like AIC that penalizes each extra parameter by the log of the sample size rather than by 2, so it favors smaller models in all but the smallest samples; lower is better. in the pathway →
On the pathway · 03 · Estimate · Model fit, comparison, and prediction errorWhich fit or error measure do you need?
- the overall ideaModel fit, comparison, and prediction error
- variance explained, linear modelR-squared
- pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
- compare models, penalize parametersAIC
- compare models, penalize more heavilyBIC
- prediction error, average magnitudeMean absolute error
- prediction error, penalize large missesRMSE
- Binomial distribution
-
The distribution of the number of successes in a fixed number of independent trials that share the same success probability. in the pathway →
On the pathway · 01 · Measurement · Probability distributions and the CLTWhich distribution or sampling result?
- the overall ideaProbability distributions
- continuous bell-shaped variableNormal distribution
- fixed-trial success countsBinomial distribution
- counts of rare eventsPoisson distribution
- overdispersed count dataNegative binomial distribution
- why sample means turn normalCentral limit theorem
- spread of a sample estimateStandard error
- Bivariate tests
-
Classical tests of whether two variables are associated, each a special case of a regression model. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Bland-Altman plot
-
A plot of differences against means that reveals systematic disagreement two methods can have despite high correlation. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
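The quantities usually drawn on a Bland-Altman plot are the mean difference (bias) and the limits of agreement at the bias plus or minus 1.96 standard deviations of the differences. The sketch below computes them without plotting; the paired readings are invented and the function name is hypothetical.

```python
from statistics import mean, stdev

def bland_altman_summary(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread

# Hypothetical paired readings: method B tracks method A closely but reads higher.
method_a = [100, 110, 120, 130, 140]
method_b = [104, 116, 124, 136, 145]

bias, lower, upper = bland_altman_summary(method_a, method_b)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

The two methods here are almost perfectly correlated, yet method B reads about five units higher throughout, which is the kind of systematic disagreement the plot is meant to expose.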
- Blinding
-
Keeping patients, clinicians, and outcome assessors unaware of the assigned arm to prevent the bias that knowing it introduces. in the pathway →
On the pathway · 00 · Framing · Randomization and blindingWhat allocation or masking concern?
- the overall ideaRandomization and blinding
- hide the upcoming assignmentAllocation concealment
- balance arms in small chunksBlock randomization
- balance within prognostic strataStratified randomization
- dynamically balance many factorsMinimization
- mask treatment after allocationBlinding
- Block randomization
-
Permuted-block randomization that keeps trial arms close to equal in size as enrollment proceeds. in the pathway →
On the pathway · 00 · Framing · Randomization and blindingWhat allocation or masking concern?
- the overall ideaRandomization and blinding
- hide the upcoming assignmentAllocation concealment
- balance arms in small chunksBlock randomization
- balance within prognostic strataStratified randomization
- dynamically balance many factorsMinimization
- mask treatment after allocationBlinding
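Permuted-block allocation can be sketched as shuffling a balanced block of arm labels and repeating. The block size, arm labels, and seed below are arbitrary choices for illustration; a real trial would generate and conceal the list through a validated system.

```python
import random

def permuted_block_sequence(n_blocks, block_size=4, arms=("A", "B"), seed=0):
    """Build an allocation list in which every block holds each arm equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence

allocation = permuted_block_sequence(n_blocks=3)
print(allocation)
print(allocation.count("A"), allocation.count("B"))  # equal after every full block
```

With blocks of four and two arms, the arms can never differ by more than two participants at any point during enrollment.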
- Bonferroni correction
-
Dividing alpha by the number of tests to control the family-wise error rate. in the pathway →
On the pathway · 02 · Model · Multiplicity controlHow do you control multiple testing?
- the overall ideaMultiplicity control
- bound any false positiveFamily-wise error rate
- bound false positives among rejectionsFalse-discovery rate
- simple conservative FWER divisorBonferroni correction
- stepwise FWER controlHolm’s procedure
- step-up FDR controlBenjamini-Hochberg
- test hypotheses in ordered familiesGatekeeping procedure
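The single-step Bonferroni divisor and Holm's stepwise refinement from the list above can be compared side by side. This is a minimal sketch with invented p-values; both functions return decisions in the original input order.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject when a p-value is at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values, alpha=0.05):
    """Step down through the sorted p-values, relaxing the divisor at each step."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for step, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - step):
            rejected[idx] = True
        else:
            break  # stop at the first failure; all larger p-values are retained
    return rejected

# Hypothetical p-values from four tests.
p = [0.001, 0.011, 0.02, 0.5]
print(bonferroni(p))  # [True, True, False, False]
print(holm(p))        # [True, True, True, False]
```

Both control the family-wise error rate, but Holm rejects at least as many hypotheses as Bonferroni on the same data.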
- Boosting
-
Fitting trees in sequence, each correcting the last’s residuals, to lower bias, as in gradient boosting and XGBoost. in the pathway →
On the pathway · 02 · Model · Learning algorithms and ensemblesWhich learner or ensemble fits?
- the overall ideaLearning algorithms and ensembles
- single interpretable splitsDecision tree (machine learning)
- classify by closest neighborsK-nearest neighbours
- maximum-margin separating boundarySupport vector machine
- average parallel bootstrapped modelsBagging
- many decorrelated bagged treesRandom forest
- sequentially correct prior errorsBoosting
- Bootstrap and resampling methods
-
Repeatedly resampling the observed data with replacement and recomputing the estimate, building an empirical sampling distribution for intervals when analytic standard errors are awkward. in the pathway →
On the pathway · 02 · Model · Sparse data and resamplingAre cells sparse or analytic standard errors doubtful?
- the overall familySparse data and resampling
- separation or small samplesFirth penalized regression
- very sparse, exact inferenceExact logistic regression
- no clean closed-form varianceBootstrap and resampling methods
- public-health impact measuresAttributable risk and population attributable fraction (PAF)
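The resampling loop can be sketched with the standard library alone. This example builds a percentile interval for a mean; the data, the number of resamples, and the seed are arbitrary, and the percentile method is only one of several ways to form a bootstrap interval.

```python
import random
from statistics import mean

def bootstrap_percentile_ci(data, statistic=mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `statistic` computed on `data`."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    n = len(data)
    # Resample with replacement, recompute the estimate, and sort the results.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical lengths of stay in days.
stays = [2, 3, 3, 4, 5, 5, 6, 8, 11, 21]
low, high = bootstrap_percentile_ci(stays)
print(f"mean = {mean(stays):.1f}, 95% percentile interval = ({low:.1f}, {high:.1f})")
```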
- Budget impact analysis
-
Projecting the total cost to a specific budget holder of adopting an intervention across the eligible population over a near-term horizon under realistic uptake. in the pathway →
On the pathway · 05 · Decision rule · Budget impact analysisWhat affordability question are you in?
- estimating the cost of adoptionBudget impact analysis
C
- Calibration
-
Whether a model’s predicted risks match observed event rates, read off a calibration plot or tested with goodness-of-fit. in the pathway →
On the pathway · 03 · Estimate · Calibration versus discriminationWhich aspect of predictive performance?
- the overall ideaCalibration versus discrimination
- ranking cases above non-casesDiscrimination
- summarizing ranking across thresholdsAUC
- predicted risks match observedCalibration
- testing calibration formallyHosmer-Lemeshow statistic
- Calibration (modeling)
-
Tuning an unobservable parameter until a model’s outputs match observed targets, with the resulting uncertainty carried forward. in the pathway →
On the pathway · 05 · Decision rule · Model validation and calibrationWhat situation?
- the overall ideaModel validation and calibration
- tuning model outputs to realityCalibration (modeling)
- confirming the model runs correctlyVerification
- Calibration versus discrimination
-
Discrimination asks whether a model ranks higher-risk patients above lower-risk ones, while calibration asks whether predicted risks match observed rates. in the pathway →
On the pathway · 03 · Estimate · Calibration versus discriminationWhich aspect of predictive performance?
- the overall ideaCalibration versus discrimination
- ranking cases above non-casesDiscrimination
- summarizing ranking across thresholdsAUC
- predicted risks match observedCalibration
- testing calibration formallyHosmer-Lemeshow statistic
- Case-cohort design
-
Samples a random subcohort plus all cases, letting one comparison group serve several outcomes from the same source population. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Case-control study
-
Starts from outcome status, comparing prior exposure in cases versus controls; efficient for rare outcomes and long latencies. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Case-crossover design
-
Self-controlled design comparing a person’s exposure shortly before an event to their own earlier reference periods, suited to transient triggers. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Case-time-control design
-
Adds a control group to the case-crossover design to adjust for exposure trends over calendar time. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Causal designs without randomization
-
A set of designs, each neutralizing a specific dominant threat to causal inference, matched to the threat endangering the question. in the pathway →
On the pathway · 02 · Model · Causal designs without randomizationWhich quasi-experimental design fits?
- the overall ideaCausal designs without randomization
- before-after across exposed and controlDifference-in-differences
- a haphazard nudge to exposureInstrumental variables
- assignment by a cutoff thresholdRegression discontinuity
- weighted donors build a counterfactualSynthetic control
- Causal diagrams
-
A directed acyclic graph of assumed causal effects that sorts each covariate into a confounder, mediator, or collider. in the pathway →
On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworksWhich causal-diagram concept?
- the overall frameworkCausal diagrams
- the graph notation itselfDAG
- common cause of exposure and outcomeConfounder
- variable on the causal pathMediator
- common effect, conditioning opens biasCollider
- rule for sufficient adjustment setsBack-door criterion
- Causal estimators
-
Methods that turn a fixed design and adjustment set into a number, including propensity scores and g-methods. in the pathway →
On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)How do you estimate the causal effect?
- the overall ideaCausal estimators
- model treatment assignment probabilityPropensity score
- reweight by inverse treatment probabilityIPTW
- model and average outcomesG-formula
- combine outcome and treatment modelsDoubly-robust estimators
- targeted machine-learning estimationTMLE
- Cause-specific hazard
-
Instantaneous rate of one event type among patients who have not yet had any event and so remain at risk, used to study etiology and biological mechanism. in the pathway →
On the pathway · 03 · Estimate · Competing risks and parametric survivalDoes a competing event block the outcome?
- modeling time to eventCompeting risks and survival models
- a competing event existsCompeting risks
- you want etiologyCause-specific hazard
- you want absolute riskCumulative incidence function (CIF)
- modeling absolute risk directlyFine-Gray subdistribution hazard
- proportional hazards failsAccelerated failure time (AFT) models
- some patients are curedCure models
- CDISC SDTM
-
A CDISC standard for structuring clinical trial tabulation data. in the pathway →
On the pathway · 01 · Measurement · Data standards and provenanceWhich data standard or provenance layer?
- the overall idea of standards and provenanceData standards and provenance
- clinical coding terminology for findingsSNOMED
- regulatory model for collected trial dataCDISC SDTM
- analysis-ready dataset standardADaM
- Central limit theorem
-
The result that the sampling distribution of the mean of a large enough sample is approximately normal whatever the underlying shape, provided the variance is finite, enabling z- and t-based inference. in the pathway →
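As a sketch of what the theorem licenses, assuming independent draws with mean \(\mu\) and finite variance \(\sigma^2\) (symbols introduced here for illustration), the sample mean behaves approximately as:
\[\bar{X}_n \overset{\text{approx.}}{\sim} \mathcal{N}\!\left(\mu, \frac{\sigma^2}{n}\right)\]
where \(\bar{X}_n\) is the mean of \(n\) observations; the standard deviation of the approximating normal, \(\sigma/\sqrt{n}\), is the standard error of the mean.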
On the pathway · 01 · Measurement · Probability distributions and the CLTWhich distribution or sampling result?
- the overall ideaProbability distributions
- continuous bell-shaped variableNormal distribution
- fixed-trial success countsBinomial distribution
- counts of rare eventsPoisson distribution
- overdispersed count dataNegative binomial distribution
- why sample means turn normalCentral limit theorem
- spread of a sample estimateStandard error
- Certainty of evidence (GRADE)
-
Rating how much confidence a body of evidence warrants, separately from effect size, downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias. in the pathway →
On the pathway · 04 · Synthesis · Certainty of evidence (GRADE)Which certainty-of-evidence concept is in play?
- rating confidence in estimatesCertainty of evidence (GRADE)
- the named frameworkGRADE
- Characterizing the distribution
-
Examining what you measured, its shape, spread, and relationships, before assuming a model or summary is honest. in the pathway →
On the pathway · 01 · Measurement · Characterizing the distributionWhat shape feature?
- the overall ideaCharacterizing the distribution
- asymmetry of the distributionSkewness
- heaviness of the tailsKurtosis
- smoothing a nonlinear trendLOESS smoother
- Charlson comorbidity index (CCI)
-
A weighted count of selected serious conditions, originally calibrated to predict one-year mortality, used as a single comorbidity summary. in the pathway →
On the pathway · 01 · Measurement · Comorbidity and frailty adjustmentHow do I adjust for how sick patients already were?
- the overall ideaComorbidity and frailty adjustment
- you need a mortality-weighted scoreCharlson comorbidity index (CCI)
- you want broad comorbidity coverageElixhauser comorbidity measures
- patients are older or frailClaims-based frailty index
- Checking model assumptions
-
The diagnostics for the checkable statistical assumptions of a regression, distinct from a causal identifying assumption. in the pathway →
On the pathway · 02 · Model · Checking model assumptionsWhich model assumption to check?
- the overall ideaChecking model assumptions
- non-constant residual varianceHeteroscedasticity
- predictors too collinearVariance inflation factor
- single points driving the fitCook’s distance
- hazard ratio constant over timeProportional hazards
- test that proportionality formallySchoenfeld residuals
- fix variance without refittingRobust standard errors
- CHEERS
-
The reporting checklist for economic evaluations, the economic-evaluation member of the reporting-standards family. in the pathway →
On the pathway · 05 · Decision rule · Perspective and the reference caseWhose costs and benefits count?
- the overall ideaPerspective and the reference case
- standardized analysis conventionsReference case
- reporting checklist for economicsCHEERS
- count all costs to societySocietal perspective
- value of foregone alternativesOpportunity cost
- Chi-square test
-
A test of independence between two categorical variables, with Fisher’s exact test used when cell counts are small. in the pathway →
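A minimal sketch of the Pearson statistic behind the test, assuming a table of observed counts; the function name and the counts are hypothetical. It also returns the smallest expected count, since a common rule of thumb (not a rule stated in this glossary) is to prefer Fisher's exact test when expected counts fall below about 5.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, min_expected

# Hypothetical 2 x 2 table: exposure (rows) by outcome (columns)
stat, df, min_expected = chi_square_statistic([[30, 20], [20, 30]])
```

The statistic is referred to a chi-square distribution with `df` degrees of freedom to obtain a p-value; that step is left out here to keep the sketch dependency-free.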
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Choosing a prior
-
Selecting the distribution encoding belief before the data, the most attacked part of a Bayesian analysis. in the pathway →
On the pathway · 02 · Model · Choosing a priorWhat kind of prior do you need?
- the overall choiceChoosing a prior
- prior matched to the likelihoodConjugate prior
- strong external informationInformative prior
- light regularizing informationWeakly-informative prior
- Choosing the estimand
-
Naming the exact quantity to be estimated, which effect and in whom, before choosing the method. in the pathway →
On the pathway · 02 · Model · Choosing the estimandWhose causal effect do you target?
- the overall ideaChoosing the estimand
- the formal target quantityEstimand
- effect across the whole populationATE
- effect among the treatedATT
- effect among compliers onlyLATE
- Claims and coding standards
-
The coded vocabularies behind each claim field, where analysis depends on knowing what each captures and how they map to one another. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- Claims data
-
Billing-driven encounter and prescription data covering a payer’s population broadly, where a code is a bill, not a diagnosis, and clinical detail is thin. in the pathway →
On the pathway · 01 · Measurement · Data sources and their tradeoffsWhich data source or pitfall?
- the overall ideaData sources and their tradeoffs
- clinical detail from care recordsElectronic health record data
- billing records across encountersClaims data
- enrolled cohorts for a conditionRegistries
- sampled population questionnairesSurvey data
- group-level inference pitfallEcological fallacy
- Claims-based frailty index
-
A frailty proxy built from diagnosis and service codes, approximating functional decline when direct frailty assessment is unavailable in the data. in the pathway →
On the pathway · 01 · Measurement · Comorbidity and frailty adjustmentHow do I adjust for how sick patients already were?
- the overall ideaComorbidity and frailty adjustment
- you need a mortality-weighted scoreCharlson comorbidity index (CCI)
- you want broad comorbidity coverageElixhauser comorbidity measures
- patients are older or frailClaims-based frailty index
- Claims/EHR phenotype algorithm
-
A rule mapping recorded codes and encounters to a presumed clinical event or condition. in the pathway →
On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validationIs your outcome a validated algorithm or an unchecked code rule?
- the overall ideaOutcome phenotyping and validation
- the rule itselfClaims/EHR phenotype algorithm
- common coding rule1-inpatient / 2-outpatient rule
- tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
- reference standardEndpoint adjudication and chart review
- bundled outcomesComposite endpoint construction
- Classification performance metrics
-
Measures read off the confusion matrix of predicted versus actual, including precision, recall, and F1. in the pathway →
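A minimal sketch of how the three named measures fall out of the confusion-matrix cells, assuming binary labels with `1` as the positive class; the function name and the example labels are hypothetical.

```python
def classification_metrics(actual, predicted, positive=1):
    """Precision, recall, and F1 from paired actual and predicted labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    # Precision: share of predicted positives that are correct
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of actual positives that were caught
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 4 true cases, 4 predicted cases, 3 in common
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(actual, predicted)
```

Returning 0.0 when a denominator is empty is a convention chosen for this sketch; other conventions treat those cases as undefined.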
On the pathway · 02 · Model · Classification performance metricsWhich classification metric?
- the overall ideaClassification performance metrics
- share of predicted positives correctPrecision
- share of true positives caughtRecall
- balance precision and recallF1 score
- tradeoff across all thresholdsPrecision-recall curve
- Clinical equipoise
-
Genuine uncertainty in the expert community about which trial arm is better, the ethical license to randomize patients. in the pathway →
On the pathway · § · Conduct it · Research ethics and the IRBWhich ethics concept or body?
- the overall ideaResearch ethics and the IRB
- foundational ethical principlesBelmont principles
- genuine uncertainty justifying a trialClinical equipoise
- participant’s voluntary agreementInformed consent
- body that reviews and approves studiesInstitutional review board
- Clone-censor-weight
-
A per-protocol target-trial method that clones patients into each strategy, censors deviators, and reweights to avoid immortal time bias. in the pathway →
On the pathway · 02 · Model · Real-world causal-inference extensionsHow do causal methods scale to claims and time?
- the overall ideaReal-world causal-inference extensions
- you have many candidate covariatesHigh-dimensional propensity score (hdPS)
- outcome modeling suits the problemDisease risk score
- treatment strategy unfolds over timeClone-censor-weight
- censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
- you need a simple guardLandmark analysis
- Cluster sampling
-
Drawing whole groups such as schools or blocks to cut field cost when no list of individuals exists. in the pathway →
On the pathway · 01 · Measurement · Survey sampling designHow do you draw the sample?
- the overall ideaSurvey sampling design
- every unit known nonzero chanceProbability sample
- equal-chance draw from frameSimple random sampling
- sample within population strataStratified sampling
- sample whole groups togetherCluster sampling
- sample in successive nested stagesMultistage sampling
- variance inflation from clusteringDesign effect
- Clustering
-
Grouping similar observations, used for phenotyping disease subtypes from a panel of measurements. in the pathway →
On the pathway · 02 · Model · Unsupervised learningWhat unlabeled-data structure are you finding?
- the overall familyUnsupervised learning
- grouping similar observationsClustering
- nested grouping by linkageHierarchical clustering
- partitioning into k groupsK-means
- reducing the number of featuresDimensionality reduction
- orthogonal variance componentsPrincipal component analysis
- Cochran’s Q
-
A statistical test for heterogeneity across studies in a meta-analysis. in the pathway →
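A minimal sketch of the statistic, assuming its usual inverse-variance form: each study's squared deviation from the fixed-effect pooled estimate, weighted by the inverse of its variance. The function name and the study values are hypothetical.

```python
def cochrans_q(effects, variances):
    """Cochran's Q, the pooled estimate, and the degrees of freedom."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, pooled, len(effects) - 1

# Hypothetical log risk ratios and their variances from three studies
q, pooled, df = cochrans_q([0.10, 0.30, 0.50], [0.04, 0.04, 0.04])
```

Under the hypothesis of no heterogeneity, Q is referred to a chi-square distribution with `df` (number of studies minus one) degrees of freedom.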
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
- Code crosswalks and mappings
-
Lookup tables translating one vocabulary into another, each translation lossy in known ways. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- Cohen’s d
-
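The standardized mean difference, the gap between two group means expressed in standard-deviation units, reported as the effect size accompanying a t-test. in the pathway →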
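In its common pooled-standard-deviation form (one of several variants, shown here as an illustration):
\[d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\]
where \(\bar{x}_1\) and \(\bar{x}_2\) are the two group means; \(s_1^2\) and \(s_2^2\) are the group variances; \(n_1\) and \(n_2\) are the group sizes; \(s_p\) is the pooled standard deviation.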
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Cohen’s kappa
-
A measure of two raters’ categorical agreement corrected for what chance alone would produce. in the pathway →
\[\kappa = \frac{p_o - p_e}{1 - p_e}\]
where \(\kappa\) is Cohen’s kappa, the chance-corrected agreement; \(p_o\) is the observed agreement between the two raters; \(p_e\) is the agreement expected if the raters labelled independently.
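The formula above can be sketched directly in code. This is a minimal illustration, assuming two equal-length lists of categorical labels; the function name and the reviewer labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # p_o: share of items on which the two raters gave the same label
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # p_e: agreement expected if each rater labelled independently
    # at their own marginal rates
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    # Undefined (division by zero) when p_e == 1, i.e. both raters
    # used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical chart-review labels from two reviewers
reviewer_1 = ["case", "case", "case", "non", "non", "non", "non", "case", "non", "non"]
reviewer_2 = ["case", "case", "non", "non", "non", "non", "case", "case", "non", "non"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

Here the reviewers agree on 8 of 10 charts (\(p_o = 0.80\)) while chance alone would give \(p_e = 0.52\), so \(\kappa = 0.28 / 0.48 \approx 0.58\).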
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
- Cohort study
-
Follows defined people forward from exposure to outcome; prospective when assembled before outcomes occur, retrospective when reconstructed from existing records. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Collider
-
A common effect of two variables, where adjusting actively opens bias rather than removing it. in the pathway →
On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworksWhich causal-diagram concept?
- the overall frameworkCausal diagrams
- the graph notation itselfDAG
- common cause of exposure and outcomeConfounder
- variable on the causal pathMediator
- common effect, conditioning opens biasCollider
- rule for sufficient adjustment setsBack-door criterion
- Comorbidity and frailty adjustment
-
Summarizing a patient’s baseline illness burden from claims into a validated score used to adjust for confounding by underlying health. in the pathway →
On the pathway · 01 · Measurement · Comorbidity and frailty adjustmentHow do I adjust for how sick patients already were?
- the overall ideaComorbidity and frailty adjustment
- you need a mortality-weighted scoreCharlson comorbidity index (CCI)
- you want broad comorbidity coverageElixhauser comorbidity measures
- patients are older or frailClaims-based frailty index
- Competing risks
-
A setting where one event, such as death, prevents the event of interest from ever occurring. in the pathway →
On the pathway · 03 · Estimate · Competing risks and parametric survivalDoes a competing event block the outcome?
- modeling time to eventCompeting risks and survival models
- a competing event existsCompeting risks
- you want etiologyCause-specific hazard
- you want absolute riskCumulative incidence function (CIF)
- modeling absolute risk directlyFine-Gray subdistribution hazard
- proportional hazards failsAccelerated failure time (AFT) models
- some patients are curedCure models
- Competing risks and survival models
-
Methods for time-to-event data where competing events block the outcome or where parametric forms replace the proportional hazards assumption. in the pathway →
On the pathway · 03 · Estimate · Competing risks and parametric survivalDoes a competing event block the outcome?
- modeling time to eventCompeting risks and survival models
- a competing event existsCompeting risks
- you want etiologyCause-specific hazard
- you want absolute riskCumulative incidence function (CIF)
- modeling absolute risk directlyFine-Gray subdistribution hazard
- proportional hazards failsAccelerated failure time (AFT) models
- some patients are curedCure models
- Complex-sample design and survey weighting
-
Design-aware analysis using survey weights, strata, and primary sampling units so an oversampled, clustered sample speaks for its population. in the pathway →
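A minimal sketch of how unequal weights cost precision, assuming Kish's approximation for the effective sample size (this glossary does not name a specific formula, so treat the choice as an assumption). The function name and the weights are hypothetical, and the approximation ignores clustering and stratification.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# Hypothetical survey weights: four respondents stand for one person each,
# one respondent stands for four
n_eff = kish_effective_sample_size([1, 1, 1, 1, 4])
```

Five respondents carry the information of about 3.2 equally weighted ones; with equal weights the function returns the actual sample size.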
On the pathway · 01 · Measurement · Complex-sample design and survey weightingWhich weighting or design adjustment?
- the overall ideaComplex-sample design and survey weighting
- scale respondents to the populationSurvey weight
- precision lost to the designEffective sample size
- Composite endpoint construction
-
Combining several outcome phenotypes into one variable, where the weakest component dominates overall measurement error. in the pathway →
On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validationIs your outcome a validated algorithm or an unchecked code rule?
- the overall ideaOutcome phenotyping and validation
- the rule itselfClaims/EHR phenotype algorithm
- common coding rule1-inpatient / 2-outpatient rule
- tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
- reference standardEndpoint adjudication and chart review
- bundled outcomesComposite endpoint construction
- Composite strategy
-
An intercurrent-event strategy that folds the event into the endpoint. in the pathway →
On the pathway · 02 · Model · Trial estimands and intercurrent eventsHow do you handle intercurrent events?
- the overall ideaTrial estimands and intercurrent events
- events disrupting outcome interpretationIntercurrent events
- ignore them, use assigned treatmentTreatment-policy strategy
- imagine they did not occurHypothetical strategy
- fold event into the outcomeComposite strategy
- restrict to a defined subpopulationPrincipal-stratum strategy
- Conditional independence
-
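The unverifiable assumption underlying propensity-score methods: given the measured covariates, treatment assignment is independent of the potential outcomes, so no unmeasured confounding remains. in the pathway →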
On the pathway · 02 · Model · Identifying assumptionsWhich identifying assumption do you need?
- the overall ideaIdentifying assumptions
- treatment as-if random given covariatesConditional independence
- instrument affects outcome only via exposureExclusion restriction
- groups would have tracked togetherParallel trends
- Conducting a systematic review
-
A protocol-driven, pre-registered search with reproducible strings, dual independent screening, structured extraction, and a PRISMA flow diagram accounting for every record. in the pathway →
On the pathway · 04 · Synthesis · Conducting a systematic reviewWhat situation?
- the overall ideaConducting a systematic review
- registering the review protocolPROSPERO
- Confidence interval
-
The range of values compatible with the data around a point estimate, frequently misread as a direct probability statement about the true value. in the pathway →
On the pathway · 03 · Estimate · Uncertainty and inferenceHow to express estimate uncertainty?
- the overall ideaUncertainty and inference
- a plausible range for the estimateConfidence interval
- Confounder
-
A common cause of exposure and outcome, which you adjust for. in the pathway →
On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworksWhich causal-diagram concept?
- the overall frameworkCausal diagrams
- the graph notation itselfDAG
- common cause of exposure and outcomeConfounder
- variable on the causal pathMediator
- common effect, conditioning opens biasCollider
- rule for sufficient adjustment setsBack-door criterion
- Confounding
-
A common cause of exposure and outcome that distorts the estimate, with confounding by indication the clinical archetype. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Confounding by indication
-
The clinical archetype of confounding, or channeling, where the reason for treatment also predicts the outcome. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Conjugate prior
-
A prior chosen so the posterior shares its form and the update is closed-form, such as a beta prior with a binomial likelihood. in the pathway →
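For the beta-binomial pair named above, the closed-form update can be written as follows, with \(a\) and \(b\) introduced here as illustrative symbols for the prior parameters:
\[\theta \sim \mathrm{Beta}(a, b), \quad y \mid \theta \sim \mathrm{Binomial}(n, \theta) \;\Longrightarrow\; \theta \mid y \sim \mathrm{Beta}(a + y,\; b + n - y)\]
where \(\theta\) is the event probability; \(y\) is the number of events in \(n\) trials. As a hypothetical example, a \(\mathrm{Beta}(1, 1)\) prior updated with 7 events in 20 patients gives a \(\mathrm{Beta}(8, 14)\) posterior with mean \(8/22 \approx 0.36\).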
On the pathway · 02 · Model · Choosing a priorWhat kind of prior do you need?
- the overall choiceChoosing a prior
- prior matched to the likelihoodConjugate prior
- strong external informationInformative prior
- light regularizing informationWeakly-informative prior
- Consensus methods (Delphi, nominal group)
-
Formal methods for a panel to converge on a recommendation when evidence underdetermines it, including the Delphi method, nominal group technique, and RAND/UCLA method. in the pathway →
On the pathway · 06 · Recommendation · Consensus methods (Delphi, nominal group)How do experts reach consensus?
- the overall ideaConsensus methods (Delphi, nominal group)
- anonymous iterative roundsDelphi method
- structured in-person rankingNominal group technique
- Consistency
-
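The identifiability condition that a person’s observed outcome equals their potential outcome under the treatment actually received, which requires the treatment to be a well-defined intervention so a potential outcome means something specific. in the pathway →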
On the pathway · 02 · Model · Potential outcomes and identifiabilityWhich identifiability condition is at stake?
- the overall frameworkPotential outcomes and identifiability
- the counterfactual setupPotential-outcomes framework
- only one outcome is observedFundamental problem of causal inference
- treated and untreated comparableExchangeability
- every covariate stratum has bothPositivity
- observed equals counterfactual under treatmentConsistency
- no interference, single versionSUTVA
- CONSORT
-
The reporting checklist for randomized trials. in the pathway →
On the pathway · 06 · Recommendation · Reporting standardsWhich study type are you reporting?
- the overall ideaReporting standards
- randomized controlled trialCONSORT
- observational studySTROBE
- systematic reviewPRISMA
- prediction model studyTRIPOD
- Continual reassessment method
-
A model-based phase I design estimating the maximum tolerated dose more efficiently with fewer patients overdosed. in the pathway →
On the pathway · 00 · Framing · Dose-finding and early-phase designsWhich early-phase design question are you facing?
- the overall design familyDose-finding and early-phase designs
- dose escalation by fixed cohort rule3+3 design
- model-based dose escalationContinual reassessment method
- the highest acceptably safe doseMaximum tolerated dose
- phase II screening for efficacySimon’s two-stage design
- Continuous enrollment and observable time
-
Requiring uninterrupted coverage so that a patient’s care is captured, letting absence of a code mean absence of care. in the pathway →
On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkageCan this data actually answer my question?
- the overall ideaData feasibility, enrollment, and linkage
- you need observable follow-upContinuous enrollment and observable time
- you must size the populationDatabase feasibility and the attrition funnel
- you join multiple datasetsPrivacy-preserving record linkage (tokenization)
- Contrast
-
A weighted sum of coefficients estimating a quantity such as a subgroup effect when the model carries an interaction. in the pathway →
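As an illustration, assume a linear model with a treatment indicator \(T\), a subgroup indicator \(G\), and their interaction (the model and symbols are hypothetical):
\[E[Y] = \beta_0 + \beta_1 T + \beta_2 G + \beta_3 TG\]
The treatment effect in the subgroup \(G = 1\) is then the contrast with weights \((0, 1, 0, 1)\) on the coefficients:
\[\hat{\beta}_1 + \hat{\beta}_3, \qquad \mathrm{Var}(\hat{\beta}_1 + \hat{\beta}_3) = \mathrm{Var}(\hat{\beta}_1) + \mathrm{Var}(\hat{\beta}_3) + 2\,\mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_3)\]
where the covariance term is why the contrast’s standard error cannot be read off the two coefficients’ standard errors alone.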
On the pathway · 02 · Model · Linear combinations and contrastsWhich comparison of model terms?
- the overall ideaLinear combinations and contrasts
- a specific weighted group comparisonContrast
- Cook’s distance
-
A diagnostic for influential points in a regression. in the pathway →
On the pathway · 02 · Model · Checking model assumptionsWhich model assumption to check?
- the overall ideaChecking model assumptions
- non-constant residual varianceHeteroscedasticity
- predictors too collinearVariance inflation factor
- single points driving the fitCook’s distance
- hazard ratio constant over timeProportional hazards
- test that proportionality formallySchoenfeld residuals
- fix variance without refittingRobust standard errors
- Cost-benefit analysis
-
Economic evaluation that monetizes the health benefit so it can be compared directly with cost. in the pathway →
On the pathway · 05 · Decision rule · Cost-effectiveness and the ICERWhich economic-evaluation framing fits?
- the overall ideaCost-effectiveness and the ICER
- extra cost per extra effectICER
- value costs and benefits in moneyCost-benefit analysis
- effects identical, compare costs onlyCost-minimization analysis
- effects in quality-adjusted life yearsCost-utility analysis
- value at a willingness thresholdNet monetary benefit
- maximum payable per unit benefitWillingness-to-pay threshold
- Cost-effectiveness acceptability curve
-
A curve reading off the probability that each option is the best buy at each willingness-to-pay threshold. in the pathway →
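A minimal sketch for the two-option case, assuming a probabilistic analysis has already produced draws of incremental cost and incremental effect for a new option against its comparator. At each threshold the curve is the share of draws with positive incremental net monetary benefit. The function name and the draws are hypothetical.

```python
def acceptability_curve(draws, thresholds):
    """Share of (incremental_cost, incremental_effect) draws favouring the new option."""
    curve = []
    for wtp in thresholds:
        # A draw favours the new option when wtp * effect gain exceeds its extra cost
        favourable = sum(wtp * d_effect - d_cost > 0 for d_cost, d_effect in draws)
        curve.append((wtp, favourable / len(draws)))
    return curve

# Hypothetical draws: (incremental cost, incremental QALYs)
draws = [(10000, 0.5), (20000, 0.4), (5000, 0.1), (-2000, 0.2)]
curve = acceptability_curve(draws, [0, 30000, 60000, 100000])
```

A real analysis would use thousands of draws; with more than two options, the curve for each option is instead the share of draws in which that option has the highest net monetary benefit.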
On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)How are you handling cost-effectiveness uncertainty?
- the overall ideaUncertainty in cost-effectiveness (PSA)
- propagating parameter uncertaintyProbabilistic sensitivity analysis
- plotting cost and effect differencesCost-effectiveness plane
- probability of being cost-effectiveCost-effectiveness acceptability curve
- Cost-effectiveness alongside a trial
-
Estimating cost-effectiveness directly from a trial’s patient-level cost and outcome data, often via net-benefit regression, with high internal validity but a short horizon. in the pathway →
On the pathway · 05 · Decision rule · Cost-effectiveness alongside a trialWhat situation?
- the overall ideaCost-effectiveness alongside a trial
- regressing net benefit on covariatesNet-benefit regression
- Cost-effectiveness and the ICER
-
Economic evaluation putting cost and benefit on the same page, with the incremental cost-effectiveness ratio judged against a willingness-to-pay threshold. in the pathway →
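A minimal sketch of the ratio and its comparison with a threshold, assuming effects are measured in QALYs; the function names, costs, effects, and threshold are all hypothetical.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_effect == 0:
        raise ValueError("ICER is undefined when the effects are equal")
    return d_cost / d_effect

def net_monetary_benefit(d_cost, d_effect, wtp):
    """Value of the effect gain at the threshold, minus the extra cost."""
    return wtp * d_effect - d_cost

# Hypothetical options: new costs 52,000 for 6.0 QALYs, old costs 40,000 for 5.5
ratio = icer(52000, 40000, 6.0, 5.5)
nmb = net_monetary_benefit(52000 - 40000, 6.0 - 5.5, wtp=50000)
```

The ratio of 24,000 per QALY sits below the assumed 50,000 threshold, which matches the positive net monetary benefit. The direct comparison of ratio and threshold is only meaningful when the new option is both more costly and more effective; in other quadrants of the cost-effectiveness plane the sign of the ratio is ambiguous, which is one reason net monetary benefit is often preferred.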
On the pathway · 05 · Decision rule · Cost-effectiveness and the ICERWhich economic-evaluation framing fits?
- the overall ideaCost-effectiveness and the ICER
- extra cost per extra effectICER
- value costs and benefits in moneyCost-benefit analysis
- effects identical, compare costs onlyCost-minimization analysis
- effects in quality-adjusted life yearsCost-utility analysis
- value at a willingness thresholdNet monetary benefit
- maximum payable per unit benefitWillingness-to-pay threshold
- Cost-effectiveness plane
-
The plane on which a probabilistic analysis plots its cloud of incremental cost-and-effect pairs. in the pathway →
On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)How are you handling cost-effectiveness uncertainty?
- the overall ideaUncertainty in cost-effectiveness (PSA)
- propagating parameter uncertaintyProbabilistic sensitivity analysis
- plotting cost and effect differencesCost-effectiveness plane
- probability of being cost-effectiveCost-effectiveness acceptability curve
- Cost-minimization analysis
-
Economic evaluation that compares only costs, applicable only when the outcomes of the options are genuinely equal. in the pathway →
On the pathway · 05 · Decision rule · Cost-effectiveness and the ICERWhich economic-evaluation framing fits?
- the overall ideaCost-effectiveness and the ICER
- extra cost per extra effectICER
- value costs and benefits in moneyCost-benefit analysis
- effects identical, compare costs onlyCost-minimization analysis
- effects in quality-adjusted life yearsCost-utility analysis
- value at a willingness thresholdNet monetary benefit
- maximum payable per unit benefitWillingness-to-pay threshold
- Cost-utility analysis
-
Economic evaluation measuring benefit in quality-adjusted life years so different conditions become comparable. in the pathway →
On the pathway · 05 · Decision rule · Cost-effectiveness and the ICERWhich economic-evaluation framing fits?
- the overall ideaCost-effectiveness and the ICER
- extra cost per extra effectICER
- value costs and benefits in moneyCost-benefit analysis
- effects identical, compare costs onlyCost-minimization analysis
- effects in quality-adjusted life yearsCost-utility analysis
- value at a willingness thresholdNet monetary benefit
- maximum payable per unit benefitWillingness-to-pay threshold
- Costing methods
-
How the cost in cost-effectiveness is estimated, from micro-costing each resource to gross costing a whole episode, sorted into direct medical, direct non-medical, and indirect costs. in the pathway →
On the pathway · 05 · Decision rule · Costing methodsHow to value the resources used?
- the overall ideaCosting methods
- aggregate top-down unit costsGross costing
- itemized bottom-up resource countsMicro-costing
- value lost productivity over a lifetimeHuman-capital approach
- value productivity loss until replacedFriction-cost approach
- CPT/HCPCS codes
-
Codes for professional services, procedures, and supplies in outpatient and physician billing. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- Cramer’s V
-
An effect-size measure for a chi-square table. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
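A minimal sketch of how Cramer’s V follows from the chi-square statistic of a table of counts, assuming no continuity correction and no empty rows or columns. The function name is illustrative.

```python
import math


def cramers_v(table):
    """Cramer's V for an r x c table of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1  # smaller dimension minus one
    return math.sqrt(chi2 / (n * k))
```

The value runs from 0 (no association) to 1 (perfect association), whatever the table size.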
- Credible interval
-
A range the parameter lies in with stated probability, a direct probability statement the frequentist interval cannot make. in the pathway →
On the pathway · 02 · Model · Bayesian inferenceWhich Bayesian concept is in play?
- the overall frameworkBayesian inference
- the updating rule itselfBayes’ theorem
- beliefs after seeing dataPosterior distribution
- interval summary of the posteriorCredible interval
- Cronbach’s alpha
-
A gauge of the internal consistency of a multi-item scale. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
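For Cronbach’s alpha, a small sketch using the usual variance form: the number of items, the sum of item variances, and the variance of the total score. The data layout (one list of scores per item) is an assumption made for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: k lists of scores, one list per item, one score per respondent."""
    k = len(items)
    item_var_sum = sum(variance(scores) for scores in items)
    totals = [sum(person) for person in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - item_var_sum / variance(totals))
```

Items that move together push the total-score variance well above the sum of item variances, which drives alpha toward 1.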
- Cross-sectional study
-
Measures exposure and outcome at a single point in time, giving prevalence cheaply but rarely establishing temporal order. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Cross-validation
-
Estimating out-of-sample error on held-out folds to choose the right model flexibility. in the pathway →
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
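The idea of held-out folds can be sketched in plain Python. This toy version uses contiguous folds without shuffling and a deliberately trivial model (the training mean), both simplifications assumed for illustration.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cv_mean_squared_error(y, k):
    """Out-of-sample squared error of a predict-the-training-mean model."""
    errors = []
    for train, test in k_fold_indices(len(y), k):
        mean_train = sum(y[i] for i in train) / len(train)
        errors.extend((y[i] - mean_train) ** 2 for i in test)
    return sum(errors) / len(errors)
```

Comparing this held-out error across candidate models or penalty strengths is how the flexibility gets chosen.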
- Crude rate
-
The unadjusted whole-population frequency, which confounds comparisons across populations with different age structures. in the pathway →
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
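A short sketch of why the crude rate misleads across age structures, using direct standardization as the remedy. The two age strata and the standard population are hypothetical.

```python
def crude_rate(events, population):
    """Total events over total population, ignoring age structure."""
    return sum(events) / sum(population)


def directly_standardized_rate(events, population, standard_population):
    """Weight each stratum-specific rate by a shared standard population."""
    stratum_rates = [e / p for e, p in zip(events, population)]
    weighted = sum(r * s for r, s in zip(stratum_rates, standard_population))
    return weighted / sum(standard_population)


# Hypothetical young and old strata.
events = [10, 90]
population = [1000, 1000]
standard = [3000, 1000]  # a younger standard population

crude = crude_rate(events, population)                               # 0.05
adjusted = directly_standardized_rate(events, population, standard)  # about 0.03
```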
- Cumulative incidence
-
The risk of disease: new cases over a fixed period divided by the population at risk. in the pathway →
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
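To make the contrast with the incidence rate concrete, a minimal sketch with invented counts: the same 30 new cases give a proportion when divided by people at risk and a rate when divided by person-time.

```python
def cumulative_incidence(new_cases, at_risk_at_start):
    """Risk: new cases over a fixed period per person initially at risk."""
    return new_cases / at_risk_at_start


def incidence_rate(new_cases, person_time):
    """Rate: new cases per unit of summed follow-up time."""
    return new_cases / person_time


risk = cumulative_incidence(30, 1000)  # a proportion over the period
rate = incidence_rate(30, 2500)        # cases per person-year
```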
- Cumulative incidence function (CIF)
-
Probability of experiencing the event by a given time, accounting for competing events that remove patients. in the pathway →
On the pathway · 03 · Estimate · Competing risks and parametric survivalDoes a competing event block the outcome?
- modeling time to eventCompeting risks and survival models
- a competing event existsCompeting risks
- you want etiologyCause-specific hazard
- you want absolute riskCumulative incidence function (CIF)
- modeling absolute risk directlyFine-Gray subdistribution hazard
- proportional hazards failsAccelerated failure time (AFT) models
- some patients are curedCure models
- Cure models
-
Survival models that split the population into a cured fraction and a susceptible fraction with its own distribution. in the pathway →
On the pathway · 03 · Estimate · Competing risks and parametric survivalDoes a competing event block the outcome?
- modeling time to eventCompeting risks and survival models
- a competing event existsCompeting risks
- you want etiologyCause-specific hazard
- you want absolute riskCumulative incidence function (CIF)
- modeling absolute risk directlyFine-Gray subdistribution hazard
- proportional hazards failsAccelerated failure time (AFT) models
- some patients are curedCure models
- Cycle length and the half-cycle correction
-
Two timing choices in a state-transition model: a cycle short enough to miss no important event, and a correction for transitions occurring partway through a cycle. in the pathway →
On the pathway · 05 · Decision rule · Cycle length and the half-cycle correctionWhich cycle-timing issue applies?
- the overall ideaCycle length and the half-cycle correction
- adjust for mid-cycle transitionsHalf-cycle correction
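One common form of the half-cycle correction averages state membership at the start and end of each cycle. The sketch below assumes a single "alive" state and hypothetical proportions.

```python
def life_years(alive, cycle_length_years=1.0, half_cycle=True):
    """alive[t] is the proportion alive at the start of cycle t;
    the final entry is the proportion alive at the end of the last cycle."""
    if half_cycle:
        # Credit each cycle with the average of its start and end membership.
        per_cycle = [(a + b) / 2 for a, b in zip(alive, alive[1:])]
    else:
        # Credit each cycle with its start membership only (overstates time alive).
        per_cycle = alive[:-1]
    return cycle_length_years * sum(per_cycle)


trace = [1.0, 0.8, 0.6]  # hypothetical cohort trace over two annual cycles
uncorrected = life_years(trace, half_cycle=False)  # about 1.8
corrected = life_years(trace)                      # about 1.6
```

Shortening the cycle shrinks the gap between the two figures, which is why the two timing choices are discussed together.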
D
- DAG
-
A directed acyclic graph: variables as nodes and assumed causal effects as arrows, with no cycles. in the pathway →
On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworksWhich causal-diagram concept?
- the overall frameworkCausal diagrams
- the graph notation itselfDAG
- common cause of exposure and outcomeConfounder
- variable on the causal pathMediator
- common effect, conditioning opens biasCollider
- rule for sufficient adjustment setsBack-door criterion
- Data feasibility, enrollment, and linkage
-
Confirming a database can answer the question, that follow-up is observable, and that datasets are joined without exposing patient identities. in the pathway →
On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkageCan this data actually answer my question?
- the overall ideaData feasibility, enrollment, and linkage
- you need observable follow-upContinuous enrollment and observable time
- you must size the populationDatabase feasibility and the attrition funnel
- you join multiple datasetsPrivacy-preserving record linkage (tokenization)
- Data management and reproducibility
-
The discipline between collection and analysis, from clean data capture and a database lock to a scripted, version-controlled pipeline that regenerates the numbers. in the pathway →
On the pathway · § · Conduct it · Data management and reproducibilityWhich data-management step are you at?
- the overall practiceData management and reproducibility
- freezing the dataset before analysisDatabase lock
- Data privacy and security
-
The duty owed to people in health data, governed by HIPAA in the US and GDPR in Europe, with de-identification or synthetic data enabling research sharing. in the pathway →
On the pathway · § · Conduct it · Data privacy and securityWhich rule or method?
- the overall ideaData privacy and security
- US health privacy lawHIPAA
- EU data protection lawGDPR
- de-identify by removing identifiersSafe Harbor
- de-identify by statistical opinionExpert determination
- generating artificial substitute recordsSynthetic data
- Data safety monitoring board
-
An independent board, separate from the sponsor, that reviews accruing data and recommends whether to stop a trial early for efficacy, futility, or harm. in the pathway →
On the pathway · 00 · Framing · Interim analyses and group-sequential designWhich interim-monitoring element?
- the overall ideaInterim analyses and group-sequential design
- committee reviewing accruing dataData safety monitoring board
- planned looks with stopping rulesGroup-sequential design
- false-positive risk to spendType-I error
- stringent early-look boundaryO’Brien-Fleming boundary
- constant nominal-level boundaryPocock boundary
- Data sources and their tradeoffs
-
Each data source carries a characteristic strength and bias that bounds every question it can answer. in the pathway →
On the pathway · 01 · Measurement · Data sources and their tradeoffsWhich data source or pitfall?
- the overall ideaData sources and their tradeoffs
- clinical detail from care recordsElectronic health record data
- billing records across encountersClaims data
- enrolled cohorts for a conditionRegistries
- sampled population questionnairesSurvey data
- group-level inference pitfallEcological fallacy
- Data standards and provenance
-
The structure a datapoint inherits from how it was recorded, through CDISC standards or coding ontologies. in the pathway →
On the pathway · 01 · Measurement · Data standards and provenanceWhich data standard or provenance layer?
- the overall idea of standards and provenanceData standards and provenance
- clinical coding terminology for findingsSNOMED
- regulatory model for collected trial dataCDISC SDTM
- analysis-ready dataset standardADaM
- Database feasibility and the attrition funnel
-
Counting how many patients survive each eligibility criterion to judge whether a source supports the planned study. in the pathway →
On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkageCan this data actually answer my question?
- the overall ideaData feasibility, enrollment, and linkage
- you need observable follow-upContinuous enrollment and observable time
- you must size the populationDatabase feasibility and the attrition funnel
- you join multiple datasetsPrivacy-preserving record linkage (tokenization)
- Database lock
-
A dated point after which no value in a study database changes silently, marking the clean source for analysis. in the pathway →
On the pathway · § · Conduct it · Data management and reproducibilityWhich data-management step are you at?
- the overall practiceData management and reproducibility
- freezing the dataset before analysisDatabase lock
- Decision tree (decision analysis)
-
A model mapping a one-off choice and its probabilistic consequences, clean for an acute decision but clumsy once events repeat. in the pathway →
On the pathway · 05 · Decision rule · Decision-analytic modelsWhich model structure fits the problem?
- the overall ideaDecision-analytic models
- branching one-time event sequenceDecision tree (decision analysis)
- recurring health states over cyclesMarkov model
- survival curves partition statesPartitioned survival model
- infection spread depends on prevalenceDynamic transmission model
- weight future values lowerDiscounting
- Decision tree (machine learning)
-
A predictor that splits the predictor space into regions by successive cut-points, interpretable but unstable on its own. in the pathway →
On the pathway · 02 · Model · Learning algorithms and ensemblesWhich learner or ensemble fits?
- the overall ideaLearning algorithms and ensembles
- single interpretable splitsDecision tree (machine learning)
- classify by closest neighborsK-nearest neighbours
- maximum-margin separating boundarySupport vector machine
- average parallel bootstrapped modelsBagging
- many decorrelated bagged treesRandom forest
- sequentially correct prior errorsBoosting
- Decision-analytic models
-
Models estimating lifetime costs and QALYs that are rarely observed directly, from decision trees and Markov models to microsimulation and transmission models. in the pathway →
On the pathway · 05 · Decision rule · Decision-analytic modelsWhich model structure fits the problem?
- the overall ideaDecision-analytic models
- branching one-time event sequenceDecision tree (decision analysis)
- recurring health states over cyclesMarkov model
- survival curves partition statesPartitioned survival model
- infection spread depends on prevalenceDynamic transmission model
- weight future values lowerDiscounting
- Decision-curve analysis
-
Weighing the trade-offs of acting on a test or model directly in terms of net benefit across the range of thresholds a clinician might hold. in the pathway →
On the pathway · 05 · Decision rule · Decision-curve analysisWhich clinical-utility concept is in play?
- the overall methodDecision-curve analysis
- utility weighted by thresholdNet benefit
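Net benefit at a single threshold is usually written as true positives minus false positives weighted by the odds of the threshold, both per patient. A minimal sketch with hypothetical counts:

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit of acting on a model at one threshold probability."""
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Reference strategy: act on everyone."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


nb_model = net_benefit(true_pos=20, false_pos=30, n=100, threshold=0.2)  # about 0.125
```

Repeating the calculation over a range of thresholds, alongside the treat-all and treat-none (zero) references, gives the decision curve.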
- Delphi method
-
A consensus method where an expert panel answers in iterative anonymous rounds, revising after seeing a statistical summary, so opinion converges without face-to-face pressure. in the pathway →
On the pathway · 06 · Recommendation · Consensus methods (Delphi, nominal group)How do experts reach consensus?
- the overall ideaConsensus methods (Delphi, nominal group)
- anonymous iterative roundsDelphi method
- structured in-person rankingNominal group technique
- Descriptive epidemiology
-
Describing a health event by person, place, and time to generate hypotheses and fix the frequency measure reported. in the pathway →
On the pathway · 00 · Framing · Descriptive epidemiology: person, place, timeWhat situation?
- the overall ideaDescriptive epidemiology
- Design effect
-
The factor by which clustering inflates variance, used to scale up the target sample size to hold the effective sample size. in the pathway →
\[\text{DEFF} = \frac{\text{Var}_{\text{complex}}}{\text{Var}_{\text{SRS}}}\]
where \(\text{DEFF}\) is the design effect, the variance penalty from the complex design; \(\text{Var}_{\text{complex}}\) is the variance under the actual complex sampling design; \(\text{Var}_{\text{SRS}}\) is the variance a simple random sample of the same size would give.
On the pathway · 01 · Measurement · Survey sampling designHow do you draw the sample?
- the overall ideaSurvey sampling design
- every unit known nonzero chanceProbability sample
- equal-chance draw from frameSimple random sampling
- sample within population strataStratified sampling
- sample whole groups togetherCluster sampling
- sample in successive nested stagesMultistage sampling
- variance inflation from clusteringDesign effect
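A small sketch of using the design effect at the planning stage. The ratio follows the definition above; the intracluster-correlation shortcut is the common approximation for equal-sized clusters and is an addition for illustration, not something this entry states.

```python
import math


def design_effect(var_complex, var_srs):
    """Variance under the actual design over the simple-random-sample variance."""
    return var_complex / var_srs


def deff_from_icc(cluster_size, icc):
    """Common approximation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def inflated_sample_size(n_srs, deff):
    """Scale a simple-random-sample target up to hold the effective sample size."""
    return math.ceil(n_srs * deff)


n_needed = inflated_sample_size(400, 1.5)  # 600
```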
- Detection bias
-
Differential ascertainment of the outcome by exposure group, also called observer bias. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Diagnostic-accuracy studies
-
Study design measuring how well an index test discriminates disease against a reference standard, prone to spectrum, verification, and incorporation bias. in the pathway →
On the pathway · 05 · Decision rule · Diagnostic-accuracy studiesWhich accuracy measure or pitfall?
- the overall ideaDiagnostic-accuracy studies
- how results shift disease oddsLikelihood ratios
- index test informs reference standardIncorporation bias
- only some get the reference standardVerification bias
- unrepresentative case mixSpectrum bias
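Likelihood ratios turn sensitivity and specificity into a multiplier on the odds of disease. The sketch below uses hypothetical accuracy figures and a hypothetical pre-test probability.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for an index test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


lr_pos, lr_neg = likelihood_ratios(sensitivity=0.9, specificity=0.8)
after_positive = post_test_probability(0.2, lr_pos)
```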
- Difference-in-differences
-
A causal design that compares the before-after change in an exposed group with the change in a control group, removing fixed group differences and shared time trends, and resting on a parallel-trends assumption. in the pathway →
On the pathway · 02 · Model · Causal designs without randomizationWhich quasi-experimental design fits?
- the overall ideaCausal designs without randomization
- before-after across exposed and controlDifference-in-differences
- a haphazard nudge to exposureInstrumental variables
- assignment by a cutoff thresholdRegression discontinuity
- weighted donors build a counterfactualSynthetic control
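The difference-in-differences estimate itself is simple arithmetic on four group means. A sketch with invented numbers:

```python
def difference_in_differences(exposed_pre, exposed_post, control_pre, control_post):
    """Change in the exposed group minus change in the control group."""
    return (exposed_post - exposed_pre) - (control_post - control_pre)


# Hypothetical mean outcomes before and after a policy.
effect = difference_in_differences(
    exposed_pre=10, exposed_post=12, control_pre=8, control_post=9
)  # 1
```

The control group's change stands in for what the exposed group would have done without exposure, which is exactly what parallel trends assumes.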
- Differential misclassification
-
Measurement error whose rate depends on another study variable, such as exposure error that differs by outcome status, which can bias an effect in either direction and is harder to reason about. in the pathway →
On the pathway · 01 · Measurement · Measurement error and misclassificationWhat kind of measurement error?
- the overall ideaMeasurement error and misclassification
- error unrelated to other variablesNon-differential misclassification
- error differing by groupDifferential misclassification
- true variance over observed varianceReliability ratio
- Dimensionality reduction
-
Compressing many correlated variables into a few, through PCA or nonlinear methods. in the pathway →
On the pathway · 02 · Model · Unsupervised learningWhat unlabeled-data structure are you finding?
- the overall familyUnsupervised learning
- grouping similar observationsClustering
- nested grouping by linkageHierarchical clustering
- partitioning into k groupsK-means
- reducing the number of featuresDimensionality reduction
- orthogonal variance componentsPrincipal component analysis
- Discounting
-
Converting future costs and effects to present value over a model’s time horizon. in the pathway →
On the pathway · 05 · Decision rule · Decision-analytic modelsWhich model structure fits the problem?
- the overall ideaDecision-analytic models
- branching one-time event sequenceDecision tree (decision analysis)
- recurring health states over cyclesMarkov model
- survival curves partition statesPartitioned survival model
- infection spread depends on prevalenceDynamic transmission model
- weight future values lowerDiscounting
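A minimal sketch of present-value discounting, assuming annual values with the first year undiscounted. The 10% rate in the example is chosen only to make the arithmetic easy to follow.

```python
def present_value(values, rate):
    """Discount a stream of yearly values; values[0] occurs now."""
    return sum(v / (1 + rate) ** t for t, v in enumerate(values))


# 110 received one year from now is worth about 100 today at a 10% rate.
pv = present_value([0, 110], rate=0.10)
```

The same function applies to costs and to effects such as QALYs, though the rate used for each is a modelling choice.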
- Discrimination
-
Whether a model ranks higher-risk patients above lower-risk ones, measured by the AUC. in the pathway →
On the pathway · 03 · Estimate · Calibration versus discriminationWhich aspect of predictive performance?
- the overall ideaCalibration versus discrimination
- ranking cases above non-casesDiscrimination
- summarizing ranking across thresholdsAUC
- predicted risks match observedCalibration
- testing calibration formallyHosmer-Lemeshow statistic
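The AUC can be read as the share of case and non-case pairs that the model ranks correctly, with ties counted as half. A plain-Python sketch under that reading:

```python
def auc(scores_cases, scores_noncases):
    """Probability that a random case scores above a random non-case."""
    concordant = 0.0
    for case in scores_cases:
        for noncase in scores_noncases:
            if case > noncase:
                concordant += 1.0
            elif case == noncase:
                concordant += 0.5
    return concordant / (len(scores_cases) * len(scores_noncases))


# Hypothetical predicted risks: three of the four pairs are ranked correctly.
example = auc([0.9, 0.4], [0.5, 0.3])  # 0.75
```

The pairwise loop is quadratic, so it suits illustration more than large datasets.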
- Disease registry
-
A systematically maintained roster of people with a condition or exposure that supplies a standing population for many designs. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Disease risk score
-
A summary that models outcome risk from covariates instead of treatment probability, offering an alternative to the propensity score. in the pathway →
On the pathway · 02 · Model · Real-world causal-inference extensionsHow do causal methods scale to claims and time?
- the overall ideaReal-world causal-inference extensions
- you have many candidate covariatesHigh-dimensional propensity score (hdPS)
- outcome modeling suits the problemDisease risk score
- treatment strategy unfolds over timeClone-censor-weight
- censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
- you need a simple guardLandmark analysis
- Dose-finding and early-phase designs
-
Early studies that find the tolerable dose and the efficacy signal before a confirmatory trial. in the pathway →
On the pathway · 00 · Framing · Dose-finding and early-phase designsWhich early-phase design question are you facing?
- the overall design familyDose-finding and early-phase designs
- dose escalation by fixed cohort rule3+3 design
- model-based dose escalationContinual reassessment method
- the highest acceptably safe doseMaximum tolerated dose
- phase II screening for efficacySimon’s two-stage design
- Double-barreled question
-
A survey item that asks two things at once. in the pathway →
On the pathway · 01 · Measurement · Questionnaire and instrument designWhich questionnaire flaw is in play?
- the overall craftQuestionnaire and instrument design
- asking two things at onceDouble-barreled question
- wording that steers the answerLeading question
- Double-programming
-
Independent re-derivation of a dataset or output by a second programmer without seeing the first, reconciled value by value as the sign-off. in the pathway →
What situation?
- the overall ideaStatistical programming and TFLs
- the reported tables and figuresTFLs
- independent reproduction for QCDouble-programming
- Doubly-robust estimators
-
Estimators such as augmented IPW and TMLE that combine a propensity and an outcome model and stay consistent if either is right. in the pathway →
On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)How do you estimate the causal effect?
- the overall ideaCausal estimators
- model treatment assignment probabilityPropensity score
- reweight by inverse treatment probabilityIPTW
- model and average outcomesG-formula
- combine outcome and treatment modelsDoubly-robust estimators
- targeted machine-learning estimationTMLE
- Drug era (OMOP)
-
A derived continuous exposure span in the OMOP model built from raw drug records using an explicit persistence gap. in the pathway →
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
- Dynamic transmission model
-
An infectious-disease model capturing how treating one person changes others’ risk through herd immunity, which a fixed-risk cohort model cannot. in the pathway →
On the pathway · 05 · Decision rule · Decision-analytic modelsWhich model structure fits the problem?
- the overall ideaDecision-analytic models
- branching one-time event sequenceDecision tree (decision analysis)
- recurring health states over cyclesMarkov model
- survival curves partition statesPartitioned survival model
- infection spread depends on prevalenceDynamic transmission model
- weight future values lowerDiscounting
E
- E-value
-
A measure of how strong a hidden confounder would have to be, in association with both treatment and outcome, to explain away an observed result. in the pathway →
\[E = \text{RR} + \sqrt{\text{RR}(\text{RR} - 1)}\]
where \(E\) is the E-value, the smallest association a hidden confounder would need with both treatment and outcome to explain the estimate away; \(\text{RR}\) is the observed risk ratio, taken above 1 (for a protective effect, apply the formula to its reciprocal).
On the pathway · ∗ · Defend it · Bias quantificationHow do you quantify unmeasured bias?
- the overall ideaBias quantification
- strength needed to explain awayE-value
- hidden bias in matched designsRosenbaum bounds
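The E-value formula above translates directly into a short function, including the reciprocal step for protective effects. The function name is illustrative.

```python
import math


def e_value(rr):
    """E-value for an observed risk ratio."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr  # protective effect: apply the formula to the reciprocal
    return rr + math.sqrt(rr * (rr - 1))


strong = e_value(2.0)  # 2 + sqrt(2), about 3.41
```

A risk ratio of exactly 1 gives an E-value of 1, meaning no hidden confounding is needed to explain a null result.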
- Ecological fallacy
-
Reading a group-level association as if it held for individuals, a trap of aggregated data. in the pathway →
On the pathway · 01 · Measurement · Data sources and their tradeoffsWhich data source or pitfall?
- the overall ideaData sources and their tradeoffs
- clinical detail from care recordsElectronic health record data
- billing records across encountersClaims data
- enrolled cohorts for a conditionRegistries
- sampled population questionnairesSurvey data
- group-level inference pitfallEcological fallacy
- Effect measures
-
The scales for reporting a result, including relative measures like risk and odds ratios and absolute measures like risk difference and number needed to treat. in the pathway →
\[\text{RR} = \frac{\text{risk}_{\text{exposed}}}{\text{risk}_{\text{unexposed}}}, \quad \text{RD} = \text{risk}_{\text{exposed}} - \text{risk}_{\text{unexposed}}\]
where \(\text{RR}\) is the risk ratio, a relative measure; \(\text{RD}\) is the risk difference, an absolute measure; \(\text{risk}_{\text{exposed}}\) is the outcome risk in the exposed group; \(\text{risk}_{\text{unexposed}}\) is the outcome risk in the unexposed group.
On the pathway · 03 · Estimate · Effect measuresWhich effect measure to report?
- the overall ideaEffect measures
- ratio of risks between groupsRisk ratio
- ratio of odds between groupsOdds ratio
- absolute difference in riskRisk difference
- patients treated per outcome preventedNumber needed to treat
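A sketch that computes the four measures listed here from one two-by-two table, so the relative and absolute scales can be compared side by side. The counts are hypothetical.

```python
def effect_measures(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk ratio, odds ratio, risk difference, and number needed to treat."""
    risk_e = events_exposed / n_exposed
    risk_u = events_unexposed / n_unexposed
    rd = risk_e - risk_u
    return {
        "risk_ratio": risk_e / risk_u,
        "odds_ratio": (risk_e / (1 - risk_e)) / (risk_u / (1 - risk_u)),
        "risk_difference": rd,
        # NNT is the reciprocal of the absolute risk difference.
        "nnt": float("inf") if rd == 0 else 1 / abs(rd),
    }


# Hypothetical trial: 10 of 100 treated and 20 of 100 untreated have the outcome.
result = effect_measures(10, 100, 20, 100)
```

Here the risk is halved on the relative scale, while on the absolute scale about ten patients are treated per outcome prevented.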
- Effective sample size
-
The sample size discounted by the design effect, so a design effect of 2 leaves only the precision of a simple random sample half the size. in the pathway →
\[n_{\text{eff}} = \frac{n}{\text{DEFF}}\]
where \(n_{\text{eff}}\) is the effective sample size, the precision the design actually delivers; \(n\) is the achieved sample size; \(\text{DEFF}\) is the design effect, the variance penalty from clustering and unequal weighting.
On the pathway · 01 · Measurement · Complex-sample design and survey weightingWhich weighting or design adjustment?
- the overall ideaComplex-sample design and survey weighting
- scale respondents to the populationSurvey weight
- precision lost to the designEffective sample size
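A minimal sketch of the formula above, plus a widely used approximation that estimates the effective sample size from the survey weights alone. The weights-based version covers only the unequal-weighting part of the penalty and is an addition for illustration.

```python
def effective_sample_size(n, deff):
    """Achieved sample size divided by the design effect."""
    return n / deff


def kish_effective_sample_size(weights):
    """Approximation from weights: (sum of w) squared over sum of w squared."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


halved = effective_sample_size(1000, 2.0)             # 500.0
unequal = kish_effective_sample_size([1, 1, 2, 2])    # 3.6
```

Equal weights return the full sample size; the more the weights vary, the further the effective size falls below it.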
- Egger’s test
-
A statistical test for funnel-plot asymmetry, used to check for publication bias. in the pathway →
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
- Elastic net
-
A regularization that blends ridge and lasso penalties. in the pathway →
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
- Electronic health record data
-
Clinically rich data recorded for care, so messy, single-system, and informatively missing rather than research-ready. in the pathway →
On the pathway · 01 · Measurement · Data sources and their tradeoffsWhich data source or pitfall?
- the overall ideaData sources and their tradeoffs
- clinical detail from care recordsElectronic health record data
- billing records across encountersClaims data
- enrolled cohorts for a conditionRegistries
- sampled population questionnairesSurvey data
- group-level inference pitfallEcological fallacy
- Elixhauser comorbidity measures
-
A broader set of comorbidity categories, often kept as separate indicators rather than one number, to adjust for diverse baseline conditions. in the pathway →
On the pathway · 01 · Measurement · Comorbidity and frailty adjustmentHow do I adjust for how sick patients already were?
- the overall ideaComorbidity and frailty adjustment
- you need a mortality-weighted scoreCharlson comorbidity index (CCI)
- you want broad comorbidity coverageElixhauser comorbidity measures
- patients are older or frailClaims-based frailty index
- Empirical calibration
-
Fitting the spread of estimates from many negative controls, where the true effect is assumed null, to recalibrate p-values and intervals for observed systematic error. in the pathway →
On the pathway · ∗ · Defend it · Negative controls and empirical calibrationNeed to detect hidden residual confounding?
- probing residual confoundingNegative controls and calibration
- checking confounding on exposureNegative control exposure
- checking confounding on outcomeNegative control outcome
- many negative controls availableEmpirical calibration
- quantifying systematic errorQuantitative bias analysis
- Endpoint adjudication and chart review
-
Clinician review of source records, blinded to exposure, serving as the reference standard for validation. in the pathway →
On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validationIs your outcome a validated algorithm or an unchecked code rule?
- the overall ideaOutcome phenotyping and validation
- the rule itselfClaims/EHR phenotype algorithm
- common coding rule1-inpatient / 2-outpatient rule
- tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
- reference standardEndpoint adjudication and chart review
- bundled outcomesComposite endpoint construction
- Endpoint logic and pre-registration
-
Fixing the primary endpoint the sample size rests on and publicly committing to it before unblinding, keeping confirmatory analyses confirmatory. in the pathway →
On the pathway · 00 · Framing · Endpoint logic and pre-registrationWhich endpoint or pre-specification concern?
- the overall ideaEndpoint logic and pre-registration
- the main pre-specified outcomePrimary endpoint
- a stand-in for the outcomeSurrogate endpoint
- criteria validating a surrogatePrentice’s criteria
- lock analysis plan in advancePre-registration
- EQ-5D
-
A preference-based instrument used to derive the utility weights that anchor quality-adjusted life years. in the pathway →
On the pathway · 05 · Decision rule · QALYs and health-state utilitiesWhich utility concept do you need?
- the overall ideaQALYs and health-state utilities
- quality-adjusted life yearsQALY
- preference weight for a health stateHealth-state utility
- a standardized utility instrumentEQ-5D
- Equivalence trial
-
A trial that bounds the difference between treatments on both sides. in the pathway →
On the pathway · 00 · Framing · Non-inferiority and equivalenceWhat are you trying to show?
- the overall ideaNon-inferiority and equivalence
- new is not meaningfully worseNon-inferiority trial
- new is neither worse nor betterEquivalence trial
- how much worse is tolerableNon-inferiority margin
- trial can detect a real differenceAssay sensitivity
- Estimand
-
The exact quantity to be estimated: which effect, in whom. in the pathway →
On the pathway · 02 · Model · Choosing the estimandWhose causal effect do you target?
- the overall ideaChoosing the estimand
- the formal target quantityEstimand
- effect across the whole populationATE
- effect among the treatedATT
- effect among compliers onlyLATE
- Eta-squared
-
An ANOVA effect-size measure, the share of variance the groups explain. in the pathway →
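As a sketch in standard one-way ANOVA notation (the symbols are the usual textbook ones, not taken from the pathway text), the share of variance can be written as:
\[\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{total}}}\]
where \(\eta^2\) is eta-squared; \(SS_{\text{between}}\) is the between-group sum of squares; \(SS_{\text{total}}\) is the total sum of squares.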
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Evidence-to-decision
-
Frameworks making the move from evidence to a recommendation explicit, weighing benefits and harms alongside values, feasibility, equity, and cost. in the pathway →
On the pathway · 06 · Recommendation · Evidence-to-decisionWhat situation?
- the overall ideaEvidence-to-decision
- EVPI
-
Expected value of perfect information: an upper bound on what further research could be worth, equal to the expected loss from deciding under current uncertainty. in the pathway →
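A minimal sketch of how the quantity is often computed from a probabilistic sensitivity analysis, assuming each simulated parameter draw yields a net benefit for every strategy. The function name `evpi` and the toy numbers are hypothetical, chosen only for illustration.

```python
def evpi(net_benefit_draws):
    """Per-decision EVPI from simulated net benefits.

    net_benefit_draws: list of rows, one row per simulated parameter draw;
    each row holds the net benefit of every strategy under that draw.
    """
    n = len(net_benefit_draws)
    k = len(net_benefit_draws[0])
    # Deciding now: commit to the strategy with the best average net benefit.
    mean_nb = [sum(row[j] for row in net_benefit_draws) / n for j in range(k)]
    value_current = max(mean_nb)
    # Deciding with perfect information: pick the best strategy in every draw.
    value_perfect = sum(max(row) for row in net_benefit_draws) / n
    return value_perfect - value_current


# Hypothetical draws: strategy A is certain, strategy B is uncertain.
draws = [[10.0, 12.0], [10.0, 6.0], [10.0, 11.0], [10.0, 9.0]]
print(evpi(draws))  # 0.75
```

In this toy case strategy A is chosen under current uncertainty (mean 10.0 versus 9.5), and perfect information would be worth 0.75 per decision because B would be picked in the two draws where it wins.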
On the pathway · 05 · Decision rule · Value of information (EVPI)What is more evidence worth?
- the overall ideaValue of information (EVPI)
- value of removing all uncertaintyEVPI
- value of a specific future studyExpected value of sample information
- Exact logistic regression
-
Conditions on sufficient statistics and enumerates the permutation distribution, giving valid inference without asymptotic approximations when data are very sparse. in the pathway →
On the pathway · 02 · Model · Sparse data and resamplingAre cells sparse or analytic standard errors doubtful?
- the overall familySparse data and resampling
- separation or small samplesFirth penalized regression
- very sparse, exact inferenceExact logistic regression
- no clean closed-form varianceBootstrap and resampling methods
- public-health impact measuresAttributable risk and population attributable fraction (PAF)
- Exchangeability
-
The identifiability condition that treated and untreated are comparable once confounders are controlled, meaning no unmeasured confounding. in the pathway →
On the pathway · 02 · Model · Potential outcomes and identifiabilityWhich identifiability condition is at stake?
- the overall frameworkPotential outcomes and identifiability
- the counterfactual setupPotential-outcomes framework
- only one outcome is observedFundamental problem of causal inference
- treated and untreated comparableExchangeability
- every covariate stratum has bothPositivity
- observed equals counterfactual under treatmentConsistency
- no interference, single versionSUTVA
- Exclusion restriction
-
The unverifiable assumption underlying instrumental-variable designs: the instrument affects the outcome only through the exposure. in the pathway →
On the pathway · 02 · Model · Identifying assumptionsWhich identifying assumption do you need?
- the overall ideaIdentifying assumptions
- treatment independent of potential outcomes given confoundersConditional independence
- instrument affects outcome only via exposureExclusion restriction
- groups would have tracked togetherParallel trends
- Expected value of partial perfect information (EVPPI)
-
Prices the value of resolving uncertainty in specific parameters, identifying which uncertainty is worth further research. in the pathway →
On the pathway · 05 · Decision rule · Real-world cost and HTA methodsModeling skewed real-world costs for HTA?
- valuing real-world costsReal-world cost and HTA methods
- costs are zero-inflatedTwo-part and other cost models
- extreme cost outliers existWinsorization and trimming of cost outliers
- summarizing population spendPer-member-per-month costing (PMPM/PPPM)
- trial ends before lifetimeSurvival extrapolation for HTA
- value has multiple dimensionsMulti-criteria decision analysis (MCDA)
- prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
- Expected value of sample information
-
A measure valuing a study of a given design and size, going beyond perfect information to price real research. in the pathway →
On the pathway · 05 · Decision rule · Value of information (EVPI)What is more evidence worth?
- the overall ideaValue of information (EVPI)
- value of removing all uncertaintyEVPI
- value of a specific future studyExpected value of sample information
- Expert determination
-
A HIPAA de-identification route where a statistician certifies the re-identification risk is very small. in the pathway →
On the pathway · § · Conduct it · Data privacy and securityWhich rule or method?
- the overall ideaData privacy and security
- US health privacy lawHIPAA
- EU data protection lawGDPR
- de-identify by removing identifiersSafe Harbor
- de-identify by statistical opinionExpert determination
- generating artificial substitute recordsSynthetic data
- Exposure definition in RWD
-
Turning prescription or claim records into an exposure variable with a defined start, window, and end so it is clear who is treated and when. in the pathway →
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
- Exposure episode construction
-
Stitching consecutive fills into a continuous treatment span using rules for combining overlapping or sequential supplies. in the pathway →
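One way the stitching could look in code, as a sketch rather than a reference implementation. The function name `build_episodes`, the 30-day permissible gap, and the choice to carry forward overlapping supply from early refills are all assumptions made for illustration; real studies set these rules per protocol.

```python
from datetime import date, timedelta


def build_episodes(fills, permissible_gap_days=30):
    """Stitch (fill_date, days_supply) records into continuous exposure episodes.

    Returns a list of (start, end) pairs, where end is the last covered day.
    Assumption: an early refill is appended after the current supply runs out.
    """
    episodes = []
    for fill_date, days_supply in sorted(fills):
        if episodes:
            start, end = episodes[-1]
            # Uncovered days between supplies = (fill_date - end) - 1.
            if fill_date <= end + timedelta(days=permissible_gap_days + 1):
                new_start = max(fill_date, end + timedelta(days=1))
                new_end = new_start + timedelta(days=days_supply - 1)
                episodes[-1] = (start, max(end, new_end))
                continue
        # First fill, or gap too long: open a new episode.
        episodes.append((fill_date, fill_date + timedelta(days=days_supply - 1)))
    return episodes


# Hypothetical fill history: an early refill, a tolerated gap, then a long break.
fills = [
    (date(2024, 1, 1), 30),
    (date(2024, 1, 25), 30),
    (date(2024, 3, 15), 30),
    (date(2024, 6, 1), 30),
]
for start, end in build_episodes(fills):
    print(start, end)
```

With these inputs the first three fills merge into one episode and the June fill opens a second, because the break before it exceeds the permissible gap.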
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
- Extract-transform-load
-
Pulling from source tables, deriving study variables from operational definitions, and assembling one analysis-ready table. in the pathway →
On the pathway · 01 · Measurement · Assembling the analytic cohortWhich cohort-construction step?
- the overall ideaAssembling the analytic cohort
- pull and reshape raw source dataExtract-transform-load
- set the time-zero anchorIndex date
- define pre-index covariate historyLookback window
- restrict to treatment initiatorsNew-user design
F
- F1 score
-
The harmonic mean of precision and recall, high only when both are. in the pathway →
\[F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}\]
where \(F_1\) is the F1 score, the harmonic mean of precision and recall; \(\text{precision}\) is the share of positive predictions that are correct; \(\text{recall}\) is the share of true positives caught.
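A short worked example of the formula above, computed from confusion-matrix counts. The function name `f1_score` and the counts are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical classifier: precision 0.8, recall 0.5.
print(round(f1_score(tp=40, fp=10, fn=40), 3))  # 0.615
```

The result sits closer to the weaker of the two components (0.5) than their arithmetic mean of 0.65 would, which is the sense in which the score is high only when both are.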
On the pathway · 02 · Model · Classification performance metricsWhich classification metric?
- the overall ideaClassification performance metrics
- share of predicted positives correctPrecision
- share of true positives caughtRecall
- balance precision and recallF1 score
- tradeoff across all thresholdsPrecision-recall curve
- False-discovery rate
-
The expected share of false positives among rejections, controlled by the Benjamini-Hochberg procedure; less strict than family-wise control and so better suited to screening. in the pathway →
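A minimal sketch of the Benjamini-Hochberg step-up rule, assuming independent or positively dependent tests. The function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered p-value is at most (k / m) * q.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k_max = rank
    # Step-up: reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


p = [0.02, 0.021, 0.03, 0.6]
print(benjamini_hochberg(p))  # [True, True, True, False]
```

Note that 0.02 exceeds its own threshold of 0.0125 yet is still rejected, because a larger p-value further up the ranking meets its threshold; that is what distinguishes a step-up rule from a step-down one.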
On the pathway · 02 · Model · Multiplicity controlHow do you control multiple testing?
- the overall ideaMultiplicity control
- bound any false positiveFamily-wise error rate
- bound false positives among rejectionsFalse-discovery rate
- simple conservative FWER divisorBonferroni correction
- stepwise FWER controlHolm’s procedure
- step-up FDR controlBenjamini-Hochberg
- test hypotheses in ordered familiesGatekeeping procedure
- Family-wise error rate
-
The chance of even one false positive, held down by Bonferroni or Holm’s step-down procedure. in the pathway →
On the pathway · 02 · Model · Multiplicity controlHow do you control multiple testing?
- the overall ideaMultiplicity control
- bound any false positiveFamily-wise error rate
- bound false positives among rejectionsFalse-discovery rate
- simple conservative FWER divisorBonferroni correction
- stepwise FWER controlHolm’s procedure
- step-up FDR controlBenjamini-Hochberg
- test hypotheses in ordered familiesGatekeeping procedure
- Fine-Gray subdistribution hazard
-
A hazard that models the cumulative incidence function directly, giving covariate effects on absolute risk. in the pathway →
On the pathway · 03 · Estimate · Competing risks and parametric survivalDoes a competing event block the outcome?
- modeling time to eventCompeting risks and survival models
- a competing event existsCompeting risks
- you want etiologyCause-specific hazard
- you want absolute riskCumulative incidence function (CIF)
- modeling absolute risk directlyFine-Gray subdistribution hazard
- proportional hazards failsAccelerated failure time (AFT) models
- some patients are curedCure models
- Firth penalized regression
-
Adds a bias-reducing penalty to the likelihood, keeping coefficient estimates finite and less biased even under separation in small or sparse data. in the pathway →
On the pathway · 02 · Model · Sparse data and resamplingAre cells sparse or analytic standard errors doubtful?
- the overall familySparse data and resampling
- separation or small samplesFirth penalized regression
- very sparse, exact inferenceExact logistic regression
- no clean closed-form varianceBootstrap and resampling methods
- public-health impact measuresAttributable risk and population attributable fraction (PAF)
- Fisher’s exact test
-
A test of association between categorical variables used when cell counts are small. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Fixed-effect meta-analysis
-
A pooling model assuming every study estimates one common effect, weighting each only by the inverse of its variance. in the pathway →
\[w = \frac{1}{\text{variance}}\]
where \(w\) is the weight a study receives in the pooled estimate; \(\text{variance}\) is the variance of that study’s effect estimate.
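The weighting above can be sketched as follows. The function name `fixed_effect_pool` and the three study estimates are hypothetical; the estimates are imagined as log risk ratios.

```python
import math


def fixed_effect_pool(estimates, variances):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Under the common-effect assumption, the pooled variance is 1 / sum of weights.
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se


# Hypothetical studies: the first is the most precise and dominates the pool.
pooled, se = fixed_effect_pool([-0.2, -0.4, 0.1], [0.01, 0.04, 0.04])
print(round(pooled, 3), round(se, 3))  # -0.183 0.082
```

The first study carries weight 100 against 25 for each of the others, so the pooled value lands near its estimate of -0.2.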
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
- Fleiss’ kappa
-
A kappa extending chance-corrected agreement past two raters. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
- Friction-cost approach
-
Valuing lost productivity by counting only earnings lost until a worker is replaced. in the pathway →
On the pathway · 05 · Decision rule · Costing methodsHow to value the resources used?
- the overall ideaCosting methods
- aggregate top-down unit costsGross costing
- itemized bottom-up resource countsMicro-costing
- value lost productivity over a lifetimeHuman-capital approach
- value productivity loss until replacedFriction-cost approach
- Fundamental problem of causal inference
-
That only one of a unit’s potential outcomes is ever observed. in the pathway →
On the pathway · 02 · Model · Potential outcomes and identifiabilityWhich identifiability condition is at stake?
- the overall frameworkPotential outcomes and identifiability
- the counterfactual setupPotential-outcomes framework
- only one outcome is observedFundamental problem of causal inference
- treated and untreated comparableExchangeability
- every covariate stratum has bothPositivity
- observed equals counterfactual under treatmentConsistency
- no interference, single versionSUTVA
- Funnel plot
-
A scatter of each study's effect estimate against its precision, used to check for publication bias in a meta-analysis; asymmetry suggests missing small null studies. in the pathway →
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
G
- G-estimation
-
A g-method estimating a structural nested model for time-varying confounding. in the pathway →
On the pathway · 02 · Model · Time-varying confounding and g-methodsHow do you handle time-varying confounding?
- the overall ideaTime-varying confounding
- confounder both affects and respondsTreatment-confounder feedback
- weight to remove time-varying confoundingMarginal structural model
- model effect directly through timeG-estimation
- G-formula
-
G-computation: modeling the outcome under each treatment and averaging over the covariate distribution. in the pathway →
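A minimal sketch of the idea with a single categorical confounder, using stratum-specific outcome means as a saturated outcome model. The function name `g_formula_risk_difference`, the helper `expand`, and the counts are hypothetical; the sketch assumes every covariate stratum contains both treated and untreated people and will fail with a division by zero otherwise.

```python
from collections import defaultdict


def g_formula_risk_difference(records):
    """records: iterable of (covariate, treated, outcome) with treated in {0, 1}."""
    records = list(records)
    sums = defaultdict(float)
    counts = defaultdict(int)
    stratum_n = defaultdict(int)
    for l, a, y in records:
        sums[(l, a)] += y
        counts[(l, a)] += 1
        stratum_n[l] += 1
    n = len(records)
    risk = {}
    for a in (0, 1):
        # Predict the outcome under treatment level a in each stratum,
        # then average over the covariate distribution of the whole sample.
        risk[a] = sum(
            stratum_n[l] / n * sums[(l, a)] / counts[(l, a)] for l in stratum_n
        )
    return risk[1] - risk[0]


def expand(l, a, events, total):
    """Hypothetical helper: turn counts into individual records."""
    return [(l, a, 1)] * events + [(l, a, 0)] * (total - events)


# Hypothetical data: sicker patients (l = 1) are treated more often.
data = (
    expand(0, 1, 2, 20) + expand(0, 0, 16, 80)
    + expand(1, 1, 32, 80) + expand(1, 0, 10, 20)
)
print(round(g_formula_risk_difference(data), 3))  # -0.1
```

In this toy dataset the crude comparison (34% versus 26% events) makes treatment look harmful, while the standardized risk difference of about -0.10 shows a benefit once the covariate distribution is held fixed.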
On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)How do you estimate the causal effect?
- the overall ideaCausal estimators
- model treatment assignment probabilityPropensity score
- reweight by inverse treatment probabilityIPTW
- model and average outcomesG-formula
- combine outcome and treatment modelsDoubly-robust estimators
- targeted machine-learning estimationTMLE
- Gate question
-
A question that routes a respondent past items that do not apply, creating by-design blanks. in the pathway →
On the pathway · 01 · Measurement · Survey instruments: skip patterns and branchingWhich skip-logic element is in play?
- the overall ideaSurvey skip patterns
- item routing later questionsGate question
- Gatekeeping procedure
-
A hierarchical procedure ordering trial hypotheses and spending alpha down the sequence, testing a secondary endpoint only if the primary won. in the pathway →
On the pathway · 02 · Model · Multiplicity controlHow do you control multiple testing?
- the overall ideaMultiplicity control
- bound any false positiveFamily-wise error rate
- bound false positives among rejectionsFalse-discovery rate
- simple conservative FWER divisorBonferroni correction
- stepwise FWER controlHolm’s procedure
- step-up FDR controlBenjamini-Hochberg
- test hypotheses in ordered familiesGatekeeping procedure
- GDPR
-
The European regulation imposing a stricter consent-and-purpose regime on personal data than US rules. in the pathway →
On the pathway · § · Conduct it · Data privacy and securityWhich rule or method?
- the overall ideaData privacy and security
- US health privacy lawHIPAA
- EU data protection lawGDPR
- de-identify by removing identifiersSafe Harbor
- de-identify by statistical opinionExpert determination
- generating artificial substitute recordsSynthetic data
- GEE
-
Generalized estimating equations, used for clustered or repeated measures. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- Generalizability and transportability
-
Generalizability asks whether the study sample represents the target population; transportability formalizes when an estimate can be carried to a different population. in the pathway →
On the pathway · 04 · Synthesis · Generalizability and transportabilityDo findings carry to other populations?
- the overall ideaGeneralizability and transportability
- Generalized additive models
-
Models that extend splines to fit smooth nonlinear predictor effects. in the pathway →
On the pathway · 02 · Model · Model modifications (splines, interactions)How do you flex the model?
- the overall ideaModel modifications
- effect depends on another variableInteraction term
- fixed exposure term in count modelsOffset
- smooth nonlinear flexible curvesSplines
- additive smooth function componentsGeneralized additive models
- Gibbs sampling
-
A classic MCMC algorithm for drawing posterior samples. in the pathway →
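A toy illustration of sampling each parameter from its conditional distribution in turn, assuming a standard bivariate normal target with correlation `rho`, where both conditionals are known normals. The function name, seed, and run lengths are hypothetical choices.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_draws=5000, burn_in=500, seed=0):
    """Draw (x, y) pairs from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)
    x = y = 0.0
    draws = []
    for step in range(n_draws + burn_in):
        # x | y ~ Normal(rho * y, 1 - rho^2), then y | x by symmetry.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if step >= burn_in:
            draws.append((x, y))
    return draws


draws = gibbs_bivariate_normal(rho=0.6)
mean_x = sum(x for x, _ in draws) / len(draws)
print(len(draws), round(mean_x, 2))
```

Successive draws are correlated, so the effective sample size is smaller than the number of draws; in practice convergence would be checked with diagnostics such as R-hat.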
On the pathway · 02 · Model · Bayesian computation (MCMC)Which sampling or diagnostic tool?
- the overall ideaBayesian computation
- sampling the posterior generallyMCMC
- proposal-and-accept samplerMetropolis-Hastings
- sample each parameter conditionallyGibbs sampling
- gradient-guided efficient samplerHamiltonian Monte Carlo
- check chains have convergedR-hat
- check model reproduces the dataPosterior predictive check
- GLM
-
A generalized linear model: a choice of outcome distribution plus a link function. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- Good clinical practice
-
The operational standard (ICH E6) making a trial’s data trustworthy through defined responsibilities, a followed protocol, source-data verification, and an audit trail. in the pathway →
On the pathway · § · Conduct it · Good clinical practiceWhat conduct standard governs the trial?
- the ethical and quality standardGood clinical practice
- Grace period and permissible gap
-
Allowed days between supplies before exposure is broken, and extra coverage past the last day of supply before discontinuation. in the pathway →
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
- GRADE
-
A system rating the certainty of a body of evidence, downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias. in the pathway →
On the pathway · 04 · Synthesis · Certainty of evidence (GRADE)Which certainty-of-evidence concept is in play?
- rating confidence in estimatesCertainty of evidence (GRADE)
- the named frameworkGRADE
- Gross costing
-
Top-down costing that values a whole episode of care with one aggregate weight such as a DRG payment. in the pathway →
On the pathway · 05 · Decision rule · Costing methodsHow to value the resources used?
- the overall ideaCosting methods
- aggregate top-down unit costsGross costing
- itemized bottom-up resource countsMicro-costing
- value lost productivity over a lifetimeHuman-capital approach
- value productivity loss until replacedFriction-cost approach
- Group-sequential design
-
A design that pre-specifies interim analyses and spends the alpha across them with a stopping boundary. in the pathway →
On the pathway · 00 · Framing · Interim analyses and group-sequential designWhich interim-monitoring element?
- the overall ideaInterim analyses and group-sequential design
- committee reviewing accruing dataData safety monitoring board
- planned looks with stopping rulesGroup-sequential design
- false-positive risk to spendType-I error
- stringent early-look boundaryO’Brien-Fleming boundary
- constant nominal-level boundaryPocock boundary
H
- Half-cycle correction
-
A fix for the counting error from tallying state membership only at cycle boundaries, since on average subjects transition partway through a cycle. in the pathway →
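A small sketch of the effect on a survival trace, assuming the correction is implemented as a trapezoidal average of state membership at the start and end of each cycle (one common implementation among several). The function name `life_years` and the trace are hypothetical.

```python
def life_years(alive_by_cycle, cycle_length_years=1.0, half_cycle=True):
    """Accumulate time alive from the proportion alive at each cycle boundary."""
    if half_cycle:
        # Average membership at the start and end of each cycle.
        pairs = zip(alive_by_cycle, alive_by_cycle[1:])
        total = sum((start + end) / 2 for start, end in pairs)
    else:
        # Count membership at the start of each cycle only, which overstates
        # time alive when deaths occur partway through the cycle.
        total = sum(alive_by_cycle[:-1])
    return total * cycle_length_years


# Hypothetical trace over three annual cycles.
trace = [1.0, 0.75, 0.5, 0.25]
print(life_years(trace, half_cycle=False))  # 2.25
print(life_years(trace, half_cycle=True))   # 1.875
```

The uncorrected tally credits a full year to everyone alive at the start of a cycle; the corrected one credits roughly half a cycle to those who leave the state during it.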
On the pathway · 05 · Decision rule · Cycle length and the half-cycle correctionWhich cycle-timing issue applies?
- the overall ideaCycle length and the half-cycle correction
- adjust for mid-cycle transitionsHalf-cycle correction
- Hamiltonian Monte Carlo
-
A gradient-guided MCMC sampler, the engine of Stan, that mixes far more efficiently in high dimensions than random-walk samplers. in the pathway →
On the pathway · 02 · Model · Bayesian computation (MCMC)Which sampling or diagnostic tool?
- the overall ideaBayesian computation
- sampling the posterior generallyMCMC
- proposal-and-accept samplerMetropolis-Hastings
- sample each parameter conditionallyGibbs sampling
- gradient-guided efficient samplerHamiltonian Monte Carlo
- check chains have convergedR-hat
- check model reproduces the dataPosterior predictive check
- Hazard ratios and non-proportional hazards
-
A hazard ratio assumes a constant effect on instantaneous risk over time; when that fails, the single ratio becomes a censoring-dependent weighted average. in the pathway →
On the pathway · 03 · Estimate · Hazard ratios and non-proportional hazardsWhich survival concept?
- the overall ideaHazard ratios and non-proportional hazards
- censoring unrelated to outcomeNon-informative censoring
- summary when hazards are non-proportionalRestricted mean survival time
- Health technology assessment and value frameworks
-
The process, run differently by bodies across health systems, of weighing cost-effectiveness against clinical benefit, budget impact, and equity to reach a coverage or pricing verdict. in the pathway →
Which HTA framework or body?
- the overall ideaHealth technology assessment and value frameworks
- UK appraisal agencyNICE
- US value-assessment organizationInstitute for Clinical and Economic Review
- Health-state utility
-
A preference-based weight for a health state, anchored at zero (dead) and one (full health), elicited with instruments like the EQ-5D or with time-trade-off and standard-gamble methods. in the pathway →
On the pathway · 05 · Decision rule · QALYs and health-state utilitiesWhich utility concept do you need?
- the overall ideaQALYs and health-state utilities
- quality-adjusted life yearsQALY
- preference weight for a health stateHealth-state utility
- a standardized utility instrumentEQ-5D
- Healthy-worker effect
-
The tendency of an employed cohort to be healthier than the general population. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Heterogeneity
-
The degree to which studies’ results actually disagree beyond chance, which decides whether a pooled number is informative or a fiction. in the pathway →
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
- Heteroscedasticity
-
Non-constant residual variance, read from a residual-versus-fitted plot and confirmed with Breusch-Pagan or White. in the pathway →
On the pathway · 02 · Model · Checking model assumptionsWhich model assumption to check?
- the overall ideaChecking model assumptions
- non-constant residual varianceHeteroscedasticity
- predictors too collinearVariance inflation factor
- single points driving the fitCook’s distance
- hazard ratio constant over timeProportional hazards
- test that proportionality formallySchoenfeld residuals
- fix variance without refittingRobust standard errors
- Hierarchical Bayesian models
-
Multilevel models that estimate each group’s parameter while sharing a common prior, pulling estimates toward the mean. in the pathway →
On the pathway · 02 · Model · Hierarchical (multilevel) Bayesian modelsWhich multilevel Bayesian idea?
- the overall ideaHierarchical Bayesian models
- borrow strength across groupsPartial pooling
- Hierarchical clustering
-
A clustering method building a nested tree of groupings without fixing the number of clusters in advance. in the pathway →
On the pathway · 02 · Model · Unsupervised learningWhat unlabeled-data structure are you finding?
- the overall familyUnsupervised learning
- grouping similar observationsClustering
- nested grouping by linkageHierarchical clustering
- partitioning into k groupsK-means
- reducing the number of featuresDimensionality reduction
- orthogonal variance componentsPrincipal component analysis
- High-dimensional propensity score (hdPS)
-
An algorithm that screens thousands of claims codes to select empirical proxy confounders for the propensity-score model automatically. in the pathway →
On the pathway · 02 · Model · Real-world causal-inference extensionsHow do causal methods scale to claims and time?
- the overall ideaReal-world causal-inference extensions
- you have many candidate covariatesHigh-dimensional propensity score (hdPS)
- outcome modeling suits the problemDisease risk score
- treatment strategy unfolds over timeClone-censor-weight
- censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
- you need a simple guardLandmark analysis
- HIPAA
-
The US law governing identifiable health information, which a dataset must satisfy through de-identification before sharing for research. in the pathway →
On the pathway · § · Conduct it · Data privacy and securityWhich rule or method?
- the overall ideaData privacy and security
- US health privacy lawHIPAA
- EU data protection lawGDPR
- de-identify by removing identifiersSafe Harbor
- de-identify by statistical opinionExpert determination
- generating artificial substitute recordsSynthetic data
- Holm’s procedure
-
A step-down procedure controlling the family-wise error rate with more power than Bonferroni. in the pathway →
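A minimal sketch of the step-down rule. The function name `holm` and the p-values are hypothetical.

```python
def holm(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses Holm's procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for step, i in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1);
        # testing stops at the first p-value that fails its threshold.
        if p_values[i] <= alpha / (m - step):
            rejected[i] = True
        else:
            break
    return rejected


p = [0.04, 0.011, 0.6, 0.015]
print(holm(p))  # [False, True, False, True]
```

Bonferroni would compare every p-value with 0.05 / 4 = 0.0125 and reject only 0.011; Holm also rejects 0.015 because the second threshold relaxes to 0.05 / 3.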
On the pathway · 02 · Model · Multiplicity controlHow do you control multiple testing?
- the overall ideaMultiplicity control
- bound any false positiveFamily-wise error rate
- bound false positives among rejectionsFalse-discovery rate
- simple conservative FWER divisorBonferroni correction
- stepwise FWER controlHolm’s procedure
- step-up FDR controlBenjamini-Hochberg
- test hypotheses in ordered familiesGatekeeping procedure
- Homogeneity check
-
The test before pooling for whether stratum-specific estimates differ by more than noise, which would indicate effect modification. in the pathway →
On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)Which stratified-analysis step are you at?
- the overall approachStratified analysis
- pooling across strataMantel-Haenszel estimator
- combining stratum log effectsWoolf’s method
- testing for effect modificationHomogeneity check
- Hosmer-Lemeshow statistic
-
A goodness-of-fit test of whether a model’s predicted risks match observed event rates across groups. in the pathway →
On the pathway · 03 · Estimate · Calibration versus discriminationWhich aspect of predictive performance?
- the overall ideaCalibration versus discrimination
- ranking cases above non-casesDiscrimination
- summarizing ranking across thresholdsAUC
- predicted risks match observedCalibration
- testing calibration formallyHosmer-Lemeshow statistic
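
A rough sketch of the statistic itself, assuming binary outcomes coded 0/1, predicted risks strictly between 0 and 1, and equal-sized groups formed by sorting on predicted risk. The function name `hosmer_lemeshow` is hypothetical, and the sketch stops at the statistic without computing a p-value.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Sum over risk groups of (observed - expected)^2 / (expected * (1 - expected / size))."""
    pairs = sorted(zip(p, y))  # order subjects by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # skip empty groups when n < groups
        size = len(chunk)
        expected = sum(pi for pi, _ in chunk)  # expected events in the group
        observed = sum(yi for _, yi in chunk)  # observed events in the group
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return stat


# Illustrative data: four subjects all given a predicted risk of 0.5, one event observed.
print(hosmer_lemeshow([0, 0, 0, 1], [0.5, 0.5, 0.5, 0.5], groups=1))  # 1.0
```

A value near zero means observed and expected event counts agree within each group; the statistic is conventionally referred to a chi-square distribution.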
- Human-capital approach
-
Valuing lost productivity by counting all earnings forgone to illness. in the pathway →
On the pathway · 05 · Decision rule · Costing methodsHow to value the resources used?
- the overall ideaCosting methods
- aggregate top-down unit costsGross costing
- itemized bottom-up resource countsMicro-costing
- value lost productivity over a lifetimeHuman-capital approach
- value productivity loss until replacedFriction-cost approach
- Hurdle model
-
A count model with a zero-versus-positive gate followed by a truncated count. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- Hypothetical strategy
-
An intercurrent-event strategy targeting the outcome had the event not occurred. in the pathway →
On the pathway · 02 · Model · Trial estimands and intercurrent eventsHow do you handle intercurrent events?
- the overall ideaTrial estimands and intercurrent events
- events disrupting outcome interpretationIntercurrent events
- ignore them, use assigned treatmentTreatment-policy strategy
- imagine they did not occurHypothetical strategy
- fold event into the outcomeComposite strategy
- restrict to a defined subpopulationPrincipal-stratum strategy
I
- I-squared
-
A statistic reporting the fraction of total variation across studies that is beyond chance, summarizing heterogeneity. in the pathway →
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
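
As a hedged illustration, I-squared can be computed from Cochran's Q under inverse-variance weighting as max(0, (Q - df) / Q), with df equal to the number of studies minus one. The function name and the three study effects below are made up for the example.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, I-squared) for study effects with known within-study variances."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero when Q falls below its degrees of freedom.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Illustrative log effects and variances for three studies.
q, i2 = cochran_q_and_i2([0.10, 0.30, 0.50], [0.01, 0.01, 0.01])
print(round(q, 3), round(i2, 3))  # 8.0 0.75
```

Here Q is 8 on 2 degrees of freedom, so about 75% of the observed variation is attributed to heterogeneity and not to chance.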
- ICD-10-CM diagnosis codes
-
Clinical modification of ICD-10 used to code diagnoses and conditions for morbidity reporting. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- ICD-10-PCS procedure codes
-
Procedure coding system for inpatient hospital procedures. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- ICER
-
Incremental cost-effectiveness ratio: the extra cost divided by the extra benefit of one option over the next. in the pathway →
\[\text{ICER} = \frac{\Delta\text{cost}}{\Delta\text{effect}}, \quad \text{NMB} = \text{effect} \times \text{WTP} - \text{cost}\]
where \(\text{ICER}\) is the incremental cost-effectiveness ratio of one option over the next; \(\Delta\text{cost}\) is the extra cost of the option; \(\Delta\text{effect}\) is the extra benefit of the option; \(\text{NMB}\) is the net monetary benefit, the same comparison made linear; \(\text{effect}\) is the health benefit gained; \(\text{WTP}\) is the willingness-to-pay threshold per unit of benefit; \(\text{cost}\) is the cost of the option.
On the pathway · 05 · Decision rule · Cost-effectiveness and the ICERWhich economic-evaluation framing fits?
- the overall ideaCost-effectiveness and the ICER
- extra cost per extra effectICER
- value costs and benefits in moneyCost-benefit analysis
- effects identical, compare costs onlyCost-minimization analysis
- effects in quality-adjusted life yearsCost-utility analysis
- value at a willingness thresholdNet monetary benefit
- maximum payable per unit benefitWillingness-to-pay threshold
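
A small worked example of the two formulas above, with invented costs, QALYs, and a willingness-to-pay threshold of 50,000 per QALY. The function names are illustrative, and the ICER is undefined when the two options have identical effects.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    return (cost_new - cost_old) / (effect_new - effect_old)


def net_monetary_benefit(effect, cost, wtp):
    """NMB = effect x WTP - cost, the same comparison on a linear money scale."""
    return effect * wtp - cost


# Hypothetical options: the new one costs 10,000 more and adds 0.5 QALY.
print(icer(30_000, 20_000, 5.5, 5.0))             # 20000.0 per QALY
print(net_monetary_benefit(5.5, 30_000, 50_000))  # 245000.0
print(net_monetary_benefit(5.0, 20_000, 50_000))  # 230000.0
```

The new option has the higher NMB, which agrees with its ICER of 20,000 sitting below the 50,000 threshold.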
- IDE
-
FDA investigational device exemption, usually needed before a device trial begins. in the pathway →
On the pathway · § · Conduct it · Regulatory pathways and registrationWhich regulatory application?
- the overall ideaRegulatory pathways and registration
- investigational drug applicationIND
- investigational device exemptionIDE
- Identifying assumptions
-
The claim each causal design rests on that the data cannot verify, such as parallel trends or an exclusion restriction. in the pathway →
On the pathway · 02 · Model · Identifying assumptionsWhich identifying assumption do you need?
- the overall ideaIdentifying assumptions
- treatment independent of confoundersConditional independence
- instrument affects outcome only via exposureExclusion restriction
- groups would have tracked togetherParallel trends
- Immortal time
-
A stretch of follow-up during which the outcome could not yet have occurred, a bias target-trial emulation surfaces. in the pathway →
On the pathway · 00 · Framing · Target-trial emulationWhich target-trial element?
- the overall ideaTarget-trial emulation
- misaligned follow-up start creating biasImmortal time
- Immortal time bias
-
Mistakenly assigning follow-up during which the outcome could not occur to the treated group, manufacturing a survival advantage from bookkeeping. in the pathway →
On the pathway · 01 · Measurement · Immortal time biasWhat situation creates this bias?
- the overall ideaImmortal time bias
- Incidence
-
The occurrence of new cases, measured as cumulative incidence over a fixed period or as an incidence rate per person-time. in the pathway →
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
- Incidence rate
-
New cases divided by the person-time at risk, which handles varying follow-up. in the pathway →
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
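
A minimal sketch of the calculation, assuming each subject contributes follow-up in years until the event or the end of observation. The follow-up times and event flags are invented for illustration.

```python
def incidence_rate(new_cases, person_time):
    """New cases divided by the summed person-time at risk."""
    return new_cases / person_time


# Hypothetical cohort of five people with unequal follow-up (years) and event flags.
follow_up_years = [2.0, 0.5, 3.0, 1.5, 3.0]
events = [0, 1, 0, 1, 0]

rate = incidence_rate(sum(events), sum(follow_up_years))
print(rate)         # 0.2 cases per person-year
print(rate * 1000)  # 200.0 cases per 1,000 person-years
```

Because the denominator is summed person-time and not a head count, subjects followed for six months and for three years are both used in proportion to the time they were at risk.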
- Incorporation bias
-
Bias arising when the index test is itself part of the reference standard it is judged against. in the pathway →
On the pathway · 05 · Decision rule · Diagnostic-accuracy studiesWhich accuracy measure or pitfall?
- the overall ideaDiagnostic-accuracy studies
- how results shift disease oddsLikelihood ratios
- index test informs reference standardIncorporation bias
- only some get the reference standardVerification bias
- unrepresentative case mixSpectrum bias
- IND
-
FDA investigational new drug application, usually needed before a drug trial begins. in the pathway →
On the pathway · § · Conduct it · Regulatory pathways and registrationWhich regulatory application?
- the overall ideaRegulatory pathways and registration
- investigational drug applicationIND
- investigational device exemptionIDE
- Index date
-
A single time zero at which eligibility, exposure assignment, and follow-up start are all aligned for each patient. in the pathway →
On the pathway · 01 · Measurement · Assembling the analytic cohortWhich cohort-construction step?
- the overall ideaAssembling the analytic cohort
- pull and reshape raw source dataExtract-transform-load
- set the time-zero anchorIndex date
- define pre-index covariate historyLookback window
- restrict to treatment initiatorsNew-user design
- Induction, latency, and lag windows
-
Time shifts that delay when exposure can plausibly cause an outcome, excluding implausibly early events. in the pathway →
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
- Informative prior
-
A prior encoding real external knowledge, powerful when data are sparse. in the pathway →
On the pathway · 02 · Model · Choosing a priorWhat kind of prior do you need?
- the overall choiceChoosing a prior
- prior matched to the likelihoodConjugate prior
- strong external informationInformative prior
- light regularizing informationWeakly-informative prior
- Informed consent
-
The requirement that a participant understand the study, its risks, and their freedom to refuse or withdraw, with extra protection for vulnerable groups. in the pathway →
On the pathway · § · Conduct it · Research ethics and the IRBWhich ethics concept or body?
- the overall ideaResearch ethics and the IRB
- foundational ethical principlesBelmont principles
- genuine uncertainty justifying a trialClinical equipoise
- participant’s voluntary agreementInformed consent
- body that reviews and approves studiesInstitutional review board
- Institute for Clinical and Economic Review
-
A US body publishing value assessments that anchor drug-price negotiations without a binding cost-per-QALY rule. in the pathway →
Which HTA framework or body?
- the overall ideaHealth technology assessment and value frameworks
- UK appraisal agencyNICE
- US value-assessment organizationInstitute for Clinical and Economic Review
- Institutional review board
-
A body that reviews a study before it starts, weighing risks against benefits and able to halt or modify a protocol. in the pathway →
On the pathway · § · Conduct it · Research ethics and the IRBWhich ethics concept or body?
- the overall ideaResearch ethics and the IRB
- foundational ethical principlesBelmont principles
- genuine uncertainty justifying a trialClinical equipoise
- participant’s voluntary agreementInformed consent
- body that reviews and approves studiesInstitutional review board
- Instrumental variables
-
A causal design using a variable that shifts exposure but affects the outcome only through that exposure, an exclusion restriction the data cannot verify. in the pathway →
On the pathway · 02 · Model · Causal designs without randomizationWhich quasi-experimental design fits?
- the overall ideaCausal designs without randomization
- before-after across exposed and controlDifference-in-differences
- a haphazard nudge to exposureInstrumental variables
- assignment by a cutoff thresholdRegression discontinuity
- weighted donors build a counterfactualSynthetic control
- Intention-to-treat
-
Analyzing every randomized patient in the arm assigned regardless of what they took, preserving randomization. in the pathway →
On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)Which set of subjects do you analyze?
- the overall ideaAnalysis populations
- as randomized, regardless of adherenceIntention-to-treat
- only those who followed protocolPer-protocol
- grouped by treatment actually receivedAs-treated
- Interaction term
-
A term capturing effect modification, letting an effect differ across subgroups instead of being averaged. in the pathway →
On the pathway · 02 · Model · Model modifications (splines, interactions)How do you flex the model?
- the overall ideaModel modifications
- effect depends on another variableInteraction term
- fixed exposure term in count modelsOffset
- smooth nonlinear flexible curvesSplines
- additive smooth function componentsGeneralized additive models
- Intercurrent events
-
Things happening after randomization that complicate the outcome, such as stopping the drug, switching, rescue medication, or death. in the pathway →
On the pathway · 02 · Model · Trial estimands and intercurrent eventsHow do you handle intercurrent events?
- the overall ideaTrial estimands and intercurrent events
- events disrupting outcome interpretationIntercurrent events
- ignore them, use assigned treatmentTreatment-policy strategy
- imagine they did not occurHypothetical strategy
- fold event into the outcomeComposite strategy
- restrict to a defined subpopulationPrincipal-stratum strategy
- Interim analyses and group-sequential design
-
Pre-specified looks at accumulating trial data that spend alpha across them so peeking does not inflate the false-positive rate. in the pathway →
On the pathway · 00 · Framing · Interim analyses and group-sequential designWhich interim-monitoring element?
- the overall ideaInterim analyses and group-sequential design
- committee reviewing accruing dataData safety monitoring board
- planned looks with stopping rulesGroup-sequential design
- false-positive risk to spendType-I error
- stringent early-look boundaryO’Brien-Fleming boundary
- constant nominal-level boundaryPocock boundary
- Interviewer bias
-
Bias from a data collector’s knowledge of a subject’s status shaping what is recorded. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Intraclass correlation
-
A measure of reproducibility for a continuous measurement across raters or repeats. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
- Inverse-probability-of-censoring weighting (IPCW)
-
Reweighting uncensored patients to stand in for similar censored ones, correcting the informative censoring that artificial censoring or dropout creates. in the pathway →
On the pathway · 02 · Model · Real-world causal-inference extensionsHow do causal methods scale to claims and time?
- the overall ideaReal-world causal-inference extensions
- you have many candidate covariatesHigh-dimensional propensity score (hdPS)
- outcome modeling suits the problemDisease risk score
- treatment strategy unfolds over timeClone-censor-weight
- censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
- you need a simple guardLandmark analysis
- IPTW
-
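Inverse-probability-of-treatment weighting, which reweights each subject by the inverse of the probability of the treatment they actually received (the propensity score for the treated, one minus it for the untreated) to balance measured confounders. in the pathway →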
On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)How do you estimate the causal effect?
- the overall ideaCausal estimators
- model treatment assignment probabilityPropensity score
- reweight by inverse treatment probabilityIPTW
- model and average outcomesG-formula
- combine outcome and treatment modelsDoubly-robust estimators
- targeted machine-learning estimationTMLE
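
A minimal sketch of unstabilized weights, assuming the propensity scores have already been estimated and lie strictly between 0 and 1. The function name `iptw_weights` and the four subjects are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Weight = 1 / P(treatment actually received | covariates)."""
    return [
        1.0 / ps if t == 1 else 1.0 / (1.0 - ps)
        for t, ps in zip(treated, propensity)
    ]


# Hypothetical treatment indicators and estimated propensity scores.
treated = [1, 0, 1, 0]
propensity = [0.8, 0.75, 0.25, 0.5]
print(iptw_weights(treated, propensity))
```

A treated subject who was unlikely to be treated (score 0.25) and an untreated subject who was likely to be treated (score 0.75) both receive a weight of about 4, so the rarer treatment choices count for more in the weighted sample.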
K
- K-means
-
A clustering method partitioning data into k groups by minimizing the squared distance from each point to its cluster mean. in the pathway →
On the pathway · 02 · Model · Unsupervised learningWhat unlabeled-data structure are you finding?
- the overall familyUnsupervised learning
- grouping similar observationsClustering
- nested grouping by linkageHierarchical clustering
- partitioning into k groupsK-means
- reducing the number of featuresDimensionality reduction
- orthogonal variance componentsPrincipal component analysis
- K-nearest neighbours
-
A predictor using the majority or average of the k closest cases, sensitive to scaling and dimensionality. in the pathway →
On the pathway · 02 · Model · Learning algorithms and ensemblesWhich learner or ensemble fits?
- the overall ideaLearning algorithms and ensembles
- single interpretable splitsDecision tree (machine learning)
- classify by closest neighborsK-nearest neighbours
- maximum-margin separating boundarySupport vector machine
- average parallel bootstrapped modelsBagging
- many decorrelated bagged treesRandom forest
- sequentially correct prior errorsBoosting
- Kendall’s tau
-
A measure of concordance between two ordinal rankings, with tau-c for rectangular tables. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Kruskal-Wallis test
-
A rank-based alternative to one-way ANOVA when normality is doubtful. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Kurtosis
-
A summary of a distribution’s tail-heaviness, part of reading its shape. in the pathway →
On the pathway · 01 · Measurement · Characterizing the distributionWhat shape feature?
- the overall ideaCharacterizing the distribution
- asymmetry of the distributionSkewness
- heaviness of the tailsKurtosis
- smoothing a nonlinear trendLOESS smoother
L
- Landmark analysis
-
Classifying exposure status as of a fixed later time and analyzing from there, so early events are not misattributed to exposure. in the pathway →
On the pathway · 02 · Model · Real-world causal-inference extensionsHow do causal methods scale to claims and time?
- the overall ideaReal-world causal-inference extensions
- you have many candidate covariatesHigh-dimensional propensity score (hdPS)
- outcome modeling suits the problemDisease risk score
- treatment strategy unfolds over timeClone-censor-weight
- censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
- you need a simple guardLandmark analysis
- Lasso
-
L1 regularization that shrinks some coefficients exactly to zero and so also selects variables. in the pathway →
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
- LATE
-
The local average treatment effect, the contrast of potential outcomes among compliers. in the pathway →
On the pathway · 02 · Model · Choosing the estimandWhose causal effect do you target?
- the overall ideaChoosing the estimand
- the formal target quantityEstimand
- effect across the whole populationATE
- effect among the treatedATT
- effect among compliers onlyLATE
- Lead-time bias
-
The apparent survival gain from diagnosing earlier without changing the disease course. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Leading question
-
A survey item whose wording presses the respondent toward a particular answer. in the pathway →
On the pathway · 01 · Measurement · Questionnaire and instrument designWhich questionnaire flaw is in play?
- the overall craftQuestionnaire and instrument design
- asking two things at onceDouble-barreled question
- wording that steers the answerLeading question
- Learning algorithms and ensembles
-
The supervised toolkit beyond regression, including k-nearest neighbours, support vector machines, decision trees, and ensembles. in the pathway →
On the pathway · 02 · Model · Learning algorithms and ensemblesWhich learner or ensemble fits?
- the overall ideaLearning algorithms and ensembles
- single interpretable splitsDecision tree (machine learning)
- classify by closest neighborsK-nearest neighbours
- maximum-margin separating boundarySupport vector machine
- average parallel bootstrapped modelsBagging
- many decorrelated bagged treesRandom forest
- sequentially correct prior errorsBoosting
- Leave-one-out and specification curves
-
Re-estimating after dropping a single unit, or across many defensible modeling choices, to expose whether a finding rests on one unit or holds broadly. in the pathway →
On the pathway · ∗ · Defend it · Leave-one-out and specification curvesHow are you probing specification robustness?
- the overall ideaLeave-one-out and specification curves
- results across many model choicesSpecification-curve analysis
- Length-time bias
-
The over-representation of slow, indolent cases that screening preferentially catches. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Likelihood ratios
-
Summaries of a diagnostic table independent of prevalence that update pre-test odds to post-test odds directly. in the pathway →
\[\text{LR}+ = \frac{\text{sens}}{1 - \text{spec}}, \quad \text{LR}- = \frac{1 - \text{sens}}{\text{spec}}, \quad \text{post-test odds} = \text{pre-test odds} \times \text{LR}\]
where \(\text{LR}+\) is the positive likelihood ratio, how much a positive result raises the odds; \(\text{LR}-\) is the negative likelihood ratio, how much a negative result lowers the odds; \(\text{sens}\) is the sensitivity of the test; \(\text{spec}\) is the specificity of the test; \(\text{pre-test odds}\) is the odds of disease before the test, from prevalence; \(\text{post-test odds}\) is the odds of disease after the test result.
On the pathway · 05 · Decision rule · Diagnostic-accuracy studiesWhich accuracy measure or pitfall?
- the overall ideaDiagnostic-accuracy studies
- how results shift disease oddsLikelihood ratios
- index test informs reference standardIncorporation bias
- only some get the reference standardVerification bias
- unrepresentative case mixSpectrum bias
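
The formulas above can be worked through with a short sketch. The sensitivity, specificity, and pre-test probability are invented for illustration, and the function names are assumptions.

```python
def likelihood_ratios(sens, spec):
    """Return (LR+, LR-) from sensitivity and specificity."""
    return sens / (1 - spec), (1 - sens) / spec


def post_test_probability(pre_test_prob, lr):
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


# Hypothetical test: 90% sensitive, 80% specific, 20% pre-test probability.
lr_pos, lr_neg = likelihood_ratios(0.9, 0.8)
print(round(lr_pos, 3), round(lr_neg, 3))            # 4.5 0.125
print(round(post_test_probability(0.2, lr_pos), 3))  # 0.529
print(round(post_test_probability(0.2, lr_neg), 3))  # 0.03
```

The likelihood ratios depend only on sensitivity and specificity; prevalence enters through the pre-test odds.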
- Linear combinations and contrasts
-
A weighted sum of regression coefficients reported as the quantity of interest, with a standard error drawn from the variance-covariance matrix. in the pathway →
On the pathway · 02 · Model · Linear combinations and contrastsWhich comparison of model terms?
- the overall ideaLinear combinations and contrasts
- a specific weighted group comparisonContrast
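
A short sketch of how the standard error follows from the variance-covariance matrix: for weights w, coefficients b, and matrix V, the estimate is the sum of w_i b_i and its variance is the double sum of w_i V_ij w_j. The coefficients and matrix below are made up, and the function name is hypothetical.

```python
import math


def linear_combination(weights, coefs, vcov):
    """Return (estimate, standard error) of a weighted sum of coefficients."""
    k = len(coefs)
    estimate = sum(w * b for w, b in zip(weights, coefs))
    # Quadratic form w' V w, including the covariance cross-terms.
    variance = sum(
        weights[i] * vcov[i][j] * weights[j] for i in range(k) for j in range(k)
    )
    return estimate, math.sqrt(variance)


# Hypothetical two-coefficient model.
coefs = [0.5, 0.2]
vcov = [[0.04, 0.01], [0.01, 0.09]]
print(linear_combination([1, 1], coefs, vcov))   # sum of the two coefficients
print(linear_combination([1, -1], coefs, vcov))  # contrast between them
```

The covariance term raises the variance of the sum to 0.15 and lowers the variance of the contrast to 0.11, which is why the two standard errors cannot be combined from the diagonal alone.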
- Linear regression
-
A regression for continuous outcomes, returning a mean difference. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- LOESS smoother
-
A smoother drawn on a scatter to reveal the shape of a relationship before assuming it is linear. in the pathway →
On the pathway · 01 · Measurement · Characterizing the distributionWhat shape feature?
- the overall ideaCharacterizing the distribution
- asymmetry of the distributionSkewness
- heaviness of the tailsKurtosis
- smoothing a nonlinear trendLOESS smoother
- Logistic regression
-
A regression for binary outcomes, returning an odds ratio. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- LOINC lab codes
-
Standard vocabulary for identifying laboratory tests and clinical observations. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- Lookback window
-
The pre-index period in which confounders are measured so adjustment targets baseline causes, not post-exposure variables. in the pathway →
On the pathway · 01 · Measurement · Assembling the analytic cohortWhich cohort-construction step?
- the overall ideaAssembling the analytic cohort
- pull and reshape raw source dataExtract-transform-load
- set the time-zero anchorIndex date
- define pre-index covariate historyLookback window
- restrict to treatment initiatorsNew-user design
M
- MAD
-
The median absolute deviation, rescaled by 1.4826 so that it estimates the standard deviation when the data are normally distributed. in the pathway →
On the pathway · 02 · Model · Robust statistics for heavy tailsWhich robust measure?
- the overall ideaRobust statistics for heavy tails
- robust spread of the dataMAD
- robust outlier-resistant standardizationRobust z-score
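
A minimal sketch of the scaled MAD and the robust z-score built on it, using only the standard library. The function names and the five data points are illustrative.

```python
from statistics import median


def scaled_mad(values, scale=1.4826):
    """Median absolute deviation from the median, rescaled to the SD under normality."""
    center = median(values)
    return scale * median([abs(v - center) for v in values])


def robust_z(values):
    """Standardize with the median and scaled MAD in place of the mean and SD."""
    center = median(values)
    spread = scaled_mad(values)
    return [(v - center) / spread for v in values]


# Illustrative data with one extreme value.
data = [1, 2, 3, 4, 100]
print(scaled_mad(data))                # 1.4826
print([round(z, 2) for z in robust_z(data)])
```

The single extreme value leaves the median and MAD untouched, so its robust z-score stands far from the rest; a mean-and-SD z-score would be pulled toward it.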
- Mann-Whitney test
-
A rank-based alternative to the two-group t-test when normality is doubtful, also called the Wilcoxon rank-sum. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Mantel-Haenszel estimator
-
A method for pooling stratum-specific odds ratios, risk ratios, or rate ratios into one. in the pathway →
On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)Which stratified-analysis step are you at?
- the overall approachStratified analysis
- pooling across strataMantel-Haenszel estimator
- combining stratum log effectsWoolf’s method
- testing for effect modificationHomogeneity check
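A minimal sketch of the Mantel-Haenszel pooled odds ratio, assuming each stratum is a 2×2 table given as `(a, b, c, d)` = exposed cases, exposed non-cases, unexposed cases, unexposed non-cases. The function name and the counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Two hypothetical strata (for example, younger and older patients).
strata = [(10, 20, 5, 40), (15, 10, 10, 20)]
pooled_or = mantel_haenszel_or(strata)
print(f"Mantel-Haenszel OR = {pooled_or:.3f}")
```

A pooled estimate like this is only a sensible summary when the stratum-specific odds ratios are reasonably similar, which is what the homogeneity check examines.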
- MAR
-
Missing at random: missingness depending only on observed data, handled by multiple imputation conditional on it. in the pathway →
On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNARWhy are values missing?
- the overall ideaMissing data
- missingness unrelated to anythingMCAR
- missingness explained by observed dataMAR
- missingness depends on unseen valuesMNAR
- fill gaps and pool estimatesMultiple imputation
- Marginal structural model
-
A g-method fitted by inverse-probability-of-treatment weighting to handle time-varying confounding. in the pathway →
On the pathway · 02 · Model · Time-varying confounding and g-methodsHow do you handle time-varying confounding?
- the overall ideaTime-varying confounding
- confounder both affects and respondsTreatment-confounder feedback
- weight to remove time-varying confoundingMarginal structural model
- model effect directly through timeG-estimation
- Markov model
-
A state-transition model moving a cohort between health states each cycle by a transition matrix, the standard tool for chronic disease. in the pathway →
\[p = 1 - \exp(-r \cdot t)\]
where \(p\) is the per-cycle transition probability; \(r\) is the rate reported in the published evidence; \(t\) is the cycle length over which the probability applies.
On the pathway · 05 · Decision rule · Decision-analytic modelsWhich model structure fits the problem?
- the overall ideaDecision-analytic models
- branching one-time event sequenceDecision tree (decision analysis)
- recurring health states over cyclesMarkov model
- survival curves partition statesPartitioned survival model
- infection spread depends on prevalenceDynamic transmission model
- weight future values lowerDiscounting
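The rate-to-probability formula in the Markov model entry can be sketched as follows. The rate of 0.10 per year and the monthly cycle are made-up values, and the conversion assumes the rate is constant over the cycle.

```python
import math

def rate_to_probability(rate, cycle_length):
    """Per-cycle transition probability p = 1 - exp(-r * t)."""
    return 1.0 - math.exp(-rate * cycle_length)

annual_rate = 0.10                                    # hypothetical published rate, per year
p_year = rate_to_probability(annual_rate, 1.0)        # one-year cycle
p_month = rate_to_probability(annual_rate, 1.0 / 12)  # one-month cycle

# Twelve monthly cycles should reproduce the one-year probability.
p_year_from_months = 1.0 - (1.0 - p_month) ** 12
print(f"annual p = {p_year:.4f}, monthly p = {p_month:.5f}")
```

Dividing an annual probability by 12 would not give the same monthly value, which is why the conversion goes through the rate.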
- Maximum tolerated dose
-
The highest dose with acceptable toxicity, the target a phase I dose-finding study estimates. in the pathway →
On the pathway · 00 · Framing · Dose-finding and early-phase designsWhich early-phase design question are you facing?
- the overall design familyDose-finding and early-phase designs
- dose escalation by fixed cohort rule3+3 design
- model-based dose escalationContinual reassessment method
- the highest acceptably safe doseMaximum tolerated dose
- phase II screening for efficacySimon’s two-stage design
- MCAR
-
Missing completely at random: a benign mechanism where missingness is unrelated to any data, observed or unobserved. in the pathway →
On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNARWhy are values missing?
- the overall ideaMissing data
- missingness unrelated to anythingMCAR
- missingness explained by observed dataMAR
- missingness depends on unseen valuesMNAR
- fill gaps and pool estimatesMultiple imputation
- McFadden’s pseudo-R-squared
-
A rough stand-in for R-squared in generalized linear models, where a true R-squared does not apply. in the pathway →
On the pathway · 03 · Estimate · Model fit, comparison, and prediction errorWhich fit or error measure do you need?
- the overall ideaModel fit, comparison, and prediction error
- variance explained, linear modelR-squared
- pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
- compare models, penalize parametersAIC
- compare models, penalize more heavilyBIC
- prediction error, average magnitudeMean absolute error
- prediction error, penalize large missesRMSE
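For reference, the usual form of McFadden's measure compares the fitted model's log-likelihood with that of an intercept-only model. The symbols below are notation chosen here, not taken from the pathway.
\[R^2_{\text{McFadden}} = 1 - \frac{\ln \hat{L}_{\text{model}}}{\ln \hat{L}_{\text{null}}}\]
where \(\hat{L}_{\text{model}}\) is the maximized likelihood of the fitted model; \(\hat{L}_{\text{null}}\) is the maximized likelihood of the intercept-only model. Values sit between 0 and 1 but tend to run well below a linear-model R-squared, so the two should not be compared directly.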
- MCMC
-
Markov chain Monte Carlo: drawing a dependent sequence of samples whose long-run distribution is the posterior. in the pathway →
On the pathway · 02 · Model · Bayesian computation (MCMC)Which sampling or diagnostic tool?
- the overall ideaBayesian computation
- sampling the posterior generallyMCMC
- proposal-and-accept samplerMetropolis-Hastings
- sample each parameter conditionallyGibbs sampling
- gradient-guided efficient samplerHamiltonian Monte Carlo
- check chains have convergedR-hat
- check model reproduces the dataPosterior predictive check
- Mean absolute error
-
A prediction error measure in the outcome’s units, used when a few large errors should not dominate. in the pathway →
On the pathway · 03 · Estimate · Model fit, comparison, and prediction errorWhich fit or error measure do you need?
- the overall ideaModel fit, comparison, and prediction error
- variance explained, linear modelR-squared
- pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
- compare models, penalize parametersAIC
- compare models, penalize more heavilyBIC
- prediction error, average magnitudeMean absolute error
- prediction error, penalize large missesRMSE
- Measurement error and misclassification
-
Imprecision in measuring a variable, whose effect on an estimate depends on whether the error relates to the outcome. in the pathway →
On the pathway · 01 · Measurement · Measurement error and misclassificationWhat kind of measurement error?
- the overall ideaMeasurement error and misclassification
- error unrelated to other variablesNon-differential misclassification
- error differing by groupDifferential misclassification
- true variance over observed varianceReliability ratio
- Measurement-method effects
-
Two devices or protocols measuring the same quantity can disagree systematically, so a threshold validated under one does not transfer. in the pathway →
On the pathway · 01 · Measurement · Measurement-method effectsWhat situation is this?
- the overall ideaMeasurement-method effects
- Measures of disease frequency
-
The standard forms for counting how often disease occurs, including prevalence, incidence, and rates. in the pathway →
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
- Mediation analysis
-
Splitting a total effect into a direct effect and an indirect effect running through a mediator. in the pathway →
\[\text{total} = \text{direct} + \text{indirect}, \quad \text{indirect} = a \cdot b\]
where \(\text{total}\) is the total effect of the exposure on the outcome; \(\text{direct}\) is the effect not running through the mediator; \(\text{indirect}\) is the effect running through the mediator; \(a\) is the exposure-to-mediator coefficient; \(b\) is the mediator-to-outcome coefficient.
On the pathway · 02 · Model · Mediation analysisWhich mediation concept is in play?
- the overall methodMediation analysis
- decomposing effect through a mediatorNatural direct and indirect effects
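A small worked example of the decomposition in the mediation analysis entry, using made-up coefficients. It assumes linear models with no exposure-mediator interaction, the setting in which the product \(a \cdot b\) is interpretable as the indirect effect.

```python
def decompose_effect(a, b, direct):
    """Product-of-coefficients decomposition: total = direct + a * b."""
    indirect = a * b
    total = direct + indirect
    proportion_mediated = indirect / total
    return total, indirect, proportion_mediated

# Hypothetical coefficients from two fitted linear models.
a = 0.5        # exposure -> mediator
b = 0.4        # mediator -> outcome, adjusting for exposure
direct = 0.3   # exposure -> outcome, adjusting for mediator

total, indirect, prop = decompose_effect(a, b, direct)
print(f"total = {total:.2f}, indirect = {indirect:.2f}, mediated = {prop:.0%}")
```

With these values the indirect effect is 0.2 of a total of 0.5, so 40% of the effect runs through the mediator.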
- Mediator
-
A variable on the causal path from exposure to outcome, left alone when the total effect is the target. in the pathway →
On the pathway · 02 · Model · Causal diagrams (DAGs) and conceptual frameworksWhich causal-diagram concept?
- the overall frameworkCausal diagrams
- the graph notation itselfDAG
- common cause of exposure and outcomeConfounder
- variable on the causal pathMediator
- common effect, conditioning opens biasCollider
- rule for sufficient adjustment setsBack-door criterion
- Medication possession ratio (MPR)
-
Total days supplied divided by days in the observation interval, an adherence measure that can exceed one with overlaps. in the pathway →
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
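To make the medication possession ratio concrete, here is a minimal sketch with hypothetical fill records. The function name `medication_possession_ratio` is an assumption for illustration.

```python
def medication_possession_ratio(days_supplied, interval_days):
    """Total days supplied divided by days in the observation interval."""
    return sum(days_supplied) / interval_days

# Hypothetical patients observed for 90 days.
early_refiller = medication_possession_ratio([30, 30, 30, 30], 90)  # overlapping fills
sparse_filler = medication_possession_ratio([30, 30], 90)

print(f"early refiller MPR = {early_refiller:.2f}")
print(f"sparse filler MPR = {sparse_filler:.2f}")
```

The first patient's overlapping refills push the ratio above one, which is the behavior the definition warns about.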
- Meta-analysis and pooling
-
Combining studies into one estimate using inverse-variance weighting, which sharpens an estimate only when the studies are estimating the same thing. in the pathway →
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
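Inverse-variance weighting can be sketched for the fixed-effect case as below. The two study estimates and standard errors are invented, and the random-effects version, which would add a between-study variance to each study's variance, is not shown.

```python
import math

def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log hazard ratios from two studies.
log_hr = [0.2, 0.4]
se = [0.1, 0.2]
pooled, pooled_se = fixed_effect_pool(log_hr, se)
print(f"pooled log HR = {pooled:.3f} (SE {pooled_se:.3f})")
```

The more precise study gets four times the weight here, so the pooled value sits closer to 0.2 than to 0.4.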
- Meta-regression
-
A technique that tries to explain heterogeneity across studies using study-level covariates. in the pathway →
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
- Metropolis-Hastings
-
A classic MCMC algorithm for drawing posterior samples. in the pathway →
On the pathway · 02 · Model · Bayesian computation (MCMC)Which sampling or diagnostic tool?
- the overall ideaBayesian computation
- sampling the posterior generallyMCMC
- proposal-and-accept samplerMetropolis-Hastings
- sample each parameter conditionallyGibbs sampling
- gradient-guided efficient samplerHamiltonian Monte Carlo
- check chains have convergedR-hat
- check model reproduces the dataPosterior predictive check
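The proposal-and-accept idea can be illustrated with a random-walk Metropolis sampler, the symmetric-proposal special case of Metropolis-Hastings. The target here is a standard normal chosen purely for illustration; in practice `log_target` would be an unnormalized log posterior, and the step size and seed are arbitrary assumptions.

```python
import math
import random

def log_target(x):
    """Unnormalized log density of a standard normal (illustrative target)."""
    return -0.5 * x * x

def metropolis_hastings(n_samples, step=1.0, start=0.0, seed=0):
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)           # symmetric random-walk proposal
        log_accept = log_target(proposal) - log_target(x)
        if rng.random() < math.exp(min(0.0, log_accept)):
            x = proposal                              # accept the move
        samples.append(x)                             # on rejection, repeat current value
    return samples

draws = metropolis_hastings(20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

Successive draws are correlated, so the chain needs many more iterations than an independent sample would to reach the same precision; that is one reason convergence diagnostics such as R-hat are checked.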
- Micro-costing
-
Bottom-up costing that counts each resource used and multiplies it by its unit price. in the pathway →
On the pathway · 05 · Decision rule · Costing methodsHow to value the resources used?
- the overall ideaCosting methods
- aggregate top-down unit costsGross costing
- itemized bottom-up resource countsMicro-costing
- value lost productivity over a lifetimeHuman-capital approach
- value productivity loss until replacedFriction-cost approach
- Minimization
-
An adaptive assignment method that allocates each new patient to the arm that best keeps the arms balanced across several factors at once. in the pathway →
On the pathway · 00 · Framing · Randomization and blindingWhat allocation or masking concern?
- the overall ideaRandomization and blinding
- hide the upcoming assignmentAllocation concealment
- balance arms in small chunksBlock randomization
- balance within prognostic strataStratified randomization
- dynamically balance many factorsMinimization
- mask treatment after allocationBlinding
- Missing data
-
Why a value is missing decides what can be done about it, across the MCAR, MAR, and MNAR mechanisms. in the pathway →
On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNARWhy are values missing?
- the overall ideaMissing data
- missingness unrelated to anythingMCAR
- missingness explained by observed dataMAR
- missingness depends on unseen valuesMNAR
- fill gaps and pool estimatesMultiple imputation
- MMRM
-
The mixed model for repeated measures, standard for a longitudinal trial endpoint, using all timepoints and handling dropout under missing-at-random. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- MNAR
-
Missing not at random: missingness depending on the unseen value itself, needing pattern-mixture or tipping-point sensitivity approaches. in the pathway →
On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNARWhy are values missing?
- the overall ideaMissing data
- missingness unrelated to anythingMCAR
- missingness explained by observed dataMAR
- missingness depends on unseen valuesMNAR
- fill gaps and pool estimatesMultiple imputation
- Model fit, comparison, and prediction error
-
The continuous-outcome counterpart to calibration and discrimination, covering variance explained, model comparison, and honest out-of-sample error. in the pathway →
On the pathway · 03 · Estimate · Model fit, comparison, and prediction errorWhich fit or error measure do you need?
- the overall ideaModel fit, comparison, and prediction error
- variance explained, linear modelR-squared
- pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
- compare models, penalize parametersAIC
- compare models, penalize more heavilyBIC
- prediction error, average magnitudeMean absolute error
- prediction error, penalize large missesRMSE
- Model modifications
-
Standard adaptations to a base regression, including splines, interactions, transformations, and offsets, each answering a specific signal. in the pathway →
On the pathway · 02 · Model · Model modifications (splines, interactions)How do you flex the model?
- the overall ideaModel modifications
- effect depends on another variableInteraction term
- fixed exposure term in count modelsOffset
- smooth nonlinear flexible curvesSplines
- additive smooth function componentsGeneralized additive models
- Model validation and calibration
-
Checks that build trust in a model: verification that it is coded correctly and validation across face, internal, external, and predictive layers that it represents reality. in the pathway →
On the pathway · 05 · Decision rule · Model validation and calibrationWhat situation?
- the overall ideaModel validation and calibration
- tuning model outputs to realityCalibration (modeling)
- confirming the model runs correctlyVerification
- Monte Carlo simulation
-
Generating data under a known process and running the planned analysis over many replicates to study an estimator’s bias, coverage, and required sample size. in the pathway →
On the pathway · ∗ · Defend it · Monte Carlo simulationWhat situation?
- the overall ideaMonte Carlo simulation
- Multi-criteria decision analysis (MCDA)
-
Explicitly weighting criteria such as equity and severity when a single ratio cannot capture value. in the pathway →
On the pathway · 05 · Decision rule · Real-world cost and HTA methodsModeling skewed real-world costs for HTA?
- valuing real-world costsReal-world cost and HTA methods
- costs are zero-inflatedTwo-part and other cost models
- extreme cost outliers existWinsorization and trimming of cost outliers
- summarizing population spendPer-member-per-month costing (PMPM/PPPM)
- trial ends before lifetimeSurvival extrapolation for HTA
- value has multiple dimensionsMulti-criteria decision analysis (MCDA)
- prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
- Multiple imputation
-
Filling in missing values several times conditional on observed data and pooling the resulting estimates, valid when data are missing at random. in the pathway →
On the pathway · 01 · Measurement · Missing data: MCAR, MAR, MNARWhy are values missing?
- the overall ideaMissing data
- missingness unrelated to anythingMCAR
- missingness explained by observed dataMAR
- missingness depends on unseen valuesMNAR
- fill gaps and pool estimatesMultiple imputation
- Multiplicity control
-
Methods to rein in false positives when many hypotheses are tested, via family-wise error or false-discovery control. in the pathway →
On the pathway · 02 · Model · Multiplicity controlHow do you control multiple testing?
- the overall ideaMultiplicity control
- bound any false positiveFamily-wise error rate
- bound false positives among rejectionsFalse-discovery rate
- simple conservative FWER divisorBonferroni correction
- stepwise FWER controlHolm’s procedure
- step-up FDR controlBenjamini-Hochberg
- test hypotheses in ordered familiesGatekeeping procedure
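The three standard corrections listed above can be compared side by side. This is a plain-Python sketch on made-up p-values; the function names are assumptions, and a real analysis would normally use a vetted statistics library.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject when p <= alpha / m (controls the family-wise error rate)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def holm(pvals, alpha=0.05):
    """Step-down FWER control: stop at the first p-value that fails."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject

def benjamini_hochberg(pvals, q=0.05):
    """Step-up FDR control: reject the k smallest, k = largest rank passing."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject

pvals = [0.001, 0.012, 0.02, 0.035, 0.30]  # hypothetical p-values
print("Bonferroni:", bonferroni(pvals))
print("Holm:      ", holm(pvals))
print("BH:        ", benjamini_hochberg(pvals))
```

On these values Bonferroni rejects one hypothesis, Holm two, and Benjamini-Hochberg four, reflecting the move from strict family-wise control to the more permissive false-discovery control.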
- Multistage sampling
-
Nesting sampling stages: sampling primary sampling units, then units within them, often with probability proportional to size. in the pathway →
On the pathway · 01 · Measurement · Survey sampling designHow do you draw the sample?
- the overall ideaSurvey sampling design
- every unit known nonzero chanceProbability sample
- equal-chance draw from frameSimple random sampling
- sample within population strataStratified sampling
- sample whole groups togetherCluster sampling
- sample in successive nested stagesMultistage sampling
- variance inflation from clusteringDesign effect
N
- Natural direct and indirect effects
-
The counterfactual framing of mediation, needing no unmeasured confounding of the mediator-outcome relationship. in the pathway →
On the pathway · 02 · Model · Mediation analysisWhich mediation concept is in play?
- the overall methodMediation analysis
- decomposing effect through a mediatorNatural direct and indirect effects
- NDC (National Drug Code)
-
Identifier encoding drug manufacturer, product, and package, requiring mapping to reach the ingredient level. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- Negative binomial distribution
-
A distribution for overdispersed counts whose variance exceeds the mean. in the pathway →
On the pathway · 01 · Measurement · Probability distributions and the CLTWhich distribution or sampling result?
- the overall ideaProbability distributions
- continuous bell-shaped variableNormal distribution
- fixed-trial success countsBinomial distribution
- counts of rare eventsPoisson distribution
- overdispersed count dataNegative binomial distribution
- why sample means turn normalCentral limit theorem
- spread of a sample estimateStandard error
- Negative binomial regression
-
A count regression used when overdispersion makes the variance exceed the mean. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- Negative control exposure
-
An exposure sharing the real exposure’s confounding structure but with no plausible causal link to the outcome. in the pathway →
On the pathway · ∗ · Defend it · Negative controls and empirical calibrationNeed to detect hidden residual confounding?
- probing residual confoundingNegative controls and calibration
- checking confounding on exposureNegative control exposure
- checking confounding on outcomeNegative control outcome
- many negative controls availableEmpirical calibration
- quantifying systematic errorQuantitative bias analysis
- Negative control outcome
-
An outcome sharing the real outcome’s confounding structure but that exposure cannot plausibly cause. in the pathway →
On the pathway · ∗ · Defend it · Negative controls and empirical calibrationNeed to detect hidden residual confounding?
- probing residual confoundingNegative controls and calibration
- checking confounding on exposureNegative control exposure
- checking confounding on outcomeNegative control outcome
- many negative controls availableEmpirical calibration
- quantifying systematic errorQuantitative bias analysis
- Negative controls and calibration
-
Using outcomes or exposures with known null effects to detect and correct residual confounding in real analyses. in the pathway →
On the pathway · ∗ · Defend it · Negative controls and empirical calibrationNeed to detect hidden residual confounding?
- probing residual confoundingNegative controls and calibration
- checking confounding on exposureNegative control exposure
- checking confounding on outcomeNegative control outcome
- many negative controls availableEmpirical calibration
- quantifying systematic errorQuantitative bias analysis
- Nested case-control
-
Case-control study inside a defined cohort, sampling controls at the time each case occurs to preserve risk-set comparability. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Net benefit
-
A metric weighing true positives against false positives at a threshold probability, going beyond accuracy by accounting for the consequences of acting. in the pathway →
On the pathway · 05 · Decision rule · Decision-curve analysisWhich clinical-utility concept is in play?
- the overall methodDecision-curve analysis
- utility weighted by thresholdNet benefit
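A sketch of the net benefit calculation as commonly used in decision-curve analysis, where false positives are weighted by the odds of the threshold probability. The counts and the 10% threshold are invented for illustration.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit = TP/n - FP/n * (pt / (1 - pt))."""
    odds = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * odds

# Hypothetical model flagging 200 of 1000 patients at a 10% threshold.
nb_model = net_benefit(true_pos=80, false_pos=120, n=1000, threshold=0.10)
print(f"net benefit = {nb_model:.4f}")
```

A threshold of 10% implies that one true positive is worth nine false positives, so the 120 false positives cost the equivalent of about 13 true positives per 1000 patients.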
- Net monetary benefit
-
A restatement of a cost-effectiveness comparison as effect times willingness-to-pay minus cost, avoiding the awkwardness of ratios and handling dominance. in the pathway →
On the pathway · 05 · Decision rule · Cost-effectiveness and the ICERWhich economic-evaluation framing fits?
- the overall ideaCost-effectiveness and the ICER
- extra cost per extra effectICER
- value costs and benefits in moneyCost-benefit analysis
- effects identical, compare costs onlyCost-minimization analysis
- effects in quality-adjusted life yearsCost-utility analysis
- value at a willingness thresholdNet monetary benefit
- maximum payable per unit benefitWillingness-to-pay threshold
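The net monetary benefit restatement can be shown with a short calculation. The costs, QALYs, and the willingness-to-pay threshold of 30,000 per QALY are hypothetical values.

```python
def net_monetary_benefit(effect, cost, willingness_to_pay):
    """NMB = effect * willingness-to-pay - cost."""
    return willingness_to_pay * effect - cost

wtp = 30_000  # hypothetical threshold, currency units per QALY

nmb_new = net_monetary_benefit(effect=6.5, cost=40_000, willingness_to_pay=wtp)
nmb_old = net_monetary_benefit(effect=6.0, cost=30_000, willingness_to_pay=wtp)
incremental_nmb = nmb_new - nmb_old

print(f"incremental NMB = {incremental_nmb:,.0f}")
```

A positive incremental value means the new option is preferred at that threshold. Here the ICER is 20,000 per QALY, below the threshold, and the incremental net monetary benefit is correspondingly positive.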
- Net-benefit regression
-
Converting each patient’s cost and effect into one net-benefit outcome at a willingness-to-pay threshold and regressing it on treatment arm, giving covariate adjustment for free. in the pathway →
On the pathway · 05 · Decision rule · Cost-effectiveness alongside a trialWhat situation?
- the overall ideaCost-effectiveness alongside a trial
- regressing net benefit on covariatesNet-benefit regression
- Network meta-analysis
-
Combining a whole network of trials to estimate every pairwise treatment contrast and rank options, even when no trial compared them all directly. in the pathway →
On the pathway · 04 · Synthesis · Network meta-analysisWhich network meta-analysis concern?
- the overall ideaNetwork meta-analysis
- comparability across the networkTransitivity
- direct versus indirect agreementNode-splitting
- rank treatments overallSUCRA
- New-user design
-
A cohort design applying a washout window so prevalent users do not contaminate the comparison. in the pathway →
On the pathway · 01 · Measurement · Assembling the analytic cohortWhich cohort-construction step?
- the overall ideaAssembling the analytic cohort
- pull and reshape raw source dataExtract-transform-load
- set the time-zero anchorIndex date
- define pre-index covariate historyLookback window
- restrict to treatment initiatorsNew-user design
- NICE
-
A national agency that pairs cost-effectiveness analysis with an explicit cost-per-QALY threshold to reach coverage decisions. in the pathway →
Which HTA framework or body?
- the overall ideaHealth technology assessment and value frameworks
- UK appraisal agencyNICE
- US value-assessment organizationInstitute for Clinical and Economic Review
- Node-splitting
-
A formal check of consistency in a network meta-analysis, comparing the direct and indirect estimate for each contrast to flag disagreement. in the pathway →
On the pathway · 04 · Synthesis · Network meta-analysisWhich network meta-analysis concern?
- the overall ideaNetwork meta-analysis
- comparability across the networkTransitivity
- direct versus indirect agreementNode-splitting
- rank treatments overallSUCRA
- Nominal group technique
-
An in-person consensus method structuring convergence through silent ranking then discussion. in the pathway →
On the pathway · 06 · Recommendation · Consensus methods (Delphi, nominal group)How do experts reach consensus?
- the overall ideaConsensus methods (Delphi, nominal group)
- anonymous iterative roundsDelphi method
- structured in-person rankingNominal group technique
- Non-differential misclassification
-
Measurement error unrelated to the outcome, which usually biases an effect toward the null. in the pathway →
On the pathway · 01 · Measurement · Measurement error and misclassificationWhat kind of measurement error?
- the overall ideaMeasurement error and misclassification
- error unrelated to other variablesNon-differential misclassification
- error differing by groupDifferential misclassification
- true variance over observed varianceReliability ratio
- Non-inferiority and equivalence
-
Trials aiming to show that a treatment is not meaningfully worse (non-inferiority), or that its effect lies within a margin on both sides (equivalence), rather than that it is better. in the pathway →
On the pathway · 00 · Framing · Non-inferiority and equivalenceWhat are you trying to show?
- the overall ideaNon-inferiority and equivalence
- new is not meaningfully worseNon-inferiority trial
- new is neither worse nor betterEquivalence trial
- how much worse is tolerableNon-inferiority margin
- trial can detect a real differenceAssay sensitivity
- Non-inferiority margin
-
The pre-specified amount by which a new treatment may be worse and still pass, set from clinical tolerability and the control’s advantage. in the pathway →
On the pathway · 00 · Framing · Non-inferiority and equivalenceWhat are you trying to show?
- the overall ideaNon-inferiority and equivalence
- new is not meaningfully worseNon-inferiority trial
- new is neither worse nor betterEquivalence trial
- how much worse is tolerableNon-inferiority margin
- trial can detect a real differenceAssay sensitivity
- Non-inferiority trial
-
A trial testing against a shifted null, passing if the new treatment is worse than standard by no more than a pre-specified margin. in the pathway →
On the pathway · 00 · Framing · Non-inferiority and equivalenceWhat are you trying to show?
- the overall ideaNon-inferiority and equivalence
- new is not meaningfully worseNon-inferiority trial
- new is neither worse nor betterEquivalence trial
- how much worse is tolerableNon-inferiority margin
- trial can detect a real differenceAssay sensitivity
- Non-informative censoring
-
The assumption that censored subjects are representative of those still at risk, which informative dropout violates. in the pathway →
On the pathway · 03 · Estimate · Hazard ratios and non-proportional hazardsWhich survival concept?
- the overall ideaHazard ratios and non-proportional hazards
- censoring unrelated to outcomeNon-informative censoring
- summary when hazards are non-proportionalRestricted mean survival time
- Nonresponse bias
-
Bias from those who do not answer a survey differing systematically from those who do. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Normal distribution
-
The Gaussian distribution, often used for continuous measurements, whose standardized form gives the z-score. in the pathway →
On the pathway · 01 · Measurement · Probability distributions and the CLTWhich distribution or sampling result?
- the overall ideaProbability distributions
- continuous bell-shaped variableNormal distribution
- fixed-trial success countsBinomial distribution
- counts of rare eventsPoisson distribution
- overdispersed count dataNegative binomial distribution
- why sample means turn normalCentral limit theorem
- spread of a sample estimateStandard error
- NPI (provider identifier)
-
National Provider Identifier for the rendering or billing clinician or organization. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- Number needed to treat
-
Absolute measure of benefit, the number of patients treated to prevent one event, equal to the reciprocal of the absolute risk reduction. in the pathway →
\[\text{NNT} = \frac{1}{\text{ARR}}\]
where \(\text{NNT}\) is the number needed to treat, how many patients must be treated for one to benefit; \(\text{ARR}\) is the absolute risk reduction, the difference in risk between arms.
On the pathway · 03 · Estimate · Effect measuresWhich effect measure to report?
- the overall ideaEffect measures
- ratio of risks between groupsRisk ratio
- ratio of odds between groupsOdds ratio
- absolute difference in riskRisk difference
- patients treated per outcome preventedNumber needed to treat
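A worked example of the NNT formula with made-up event risks. Rounding up to a whole patient follows common reporting convention and is an assumption here, not something the glossary specifies.

```python
import math

def number_needed_to_treat(risk_control, risk_treated):
    """NNT = 1 / ARR, where ARR = control risk - treated risk."""
    arr = risk_control - risk_treated
    if arr <= 0:
        raise ValueError("No absolute risk reduction; NNT is undefined.")
    return 1.0 / arr

# Hypothetical trial: 20% event risk on control, 15% on treatment.
nnt = number_needed_to_treat(0.20, 0.15)
nnt_reported = math.ceil(nnt)
print(f"NNT = {nnt:.1f}, reported as {nnt_reported}")
```

An absolute risk reduction of five percentage points corresponds to treating about 20 patients to prevent one event.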
O
- O’Brien-Fleming boundary
-
An alpha-spending boundary that is stringent early and near-nominal at the trial’s end. in the pathway →
On the pathway · 00 · Framing · Interim analyses and group-sequential designWhich interim-monitoring element?
- the overall ideaInterim analyses and group-sequential design
- committee reviewing accruing dataData safety monitoring board
- planned looks with stopping rulesGroup-sequential design
- false-positive risk to spendType-I error
- stringent early-look boundaryO’Brien-Fleming boundary
- constant nominal-level boundaryPocock boundary
- Observational study designs
-
The family of non-randomized designs that observe exposures and outcomes as they occur, each chosen to fit a question and limit a specific bias. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Odds ratio
-
Ratio of the odds of an outcome between groups, often misread as a risk ratio when the outcome is common, which overstates the effect. in the pathway →
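A small numeric sketch makes the misreading concrete. The risks below are invented for illustration: in both scenarios the risk ratio is 2, but the odds ratio stays close to 2 only when the outcome is rare.

```python
def odds(p: float) -> float:
    """Convert a risk (probability) to odds."""
    return p / (1.0 - p)


def risk_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    return risk_exposed / risk_unexposed


def odds_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    return odds(risk_exposed) / odds(risk_unexposed)


# Hypothetical scenarios: (risk in exposed, risk in unexposed).
scenarios = {"rare outcome": (0.02, 0.01), "common outcome": (0.60, 0.30)}

for label, (r1, r0) in scenarios.items():
    print(f"{label}: RR = {risk_ratio(r1, r0):.2f}, OR = {odds_ratio(r1, r0):.2f}")
```

With the common outcome the odds ratio is 3.5 against a risk ratio of 2, so reading the odds ratio as "3.5 times the risk" would overstate the effect.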
On the pathway · 03 · Estimate · Effect measuresWhich effect measure to report?
- the overall ideaEffect measures
- ratio of risks between groupsRisk ratio
- ratio of odds between groupsOdds ratio
- absolute difference in riskRisk difference
- patients treated per outcome preventedNumber needed to treat
- Offset
-
A term for exposure time or population at risk, entered on the log scale with its coefficient fixed at one, that turns a Poisson count model into a rate model. in the pathway →
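One way to see why the offset produces a rate model, sketched for a single covariate under the usual log link:
\[\log \mathbb{E}[Y_i] = \log(t_i) + \beta_0 + \beta_1 x_i \quad\Longleftrightarrow\quad \log \frac{\mathbb{E}[Y_i]}{t_i} = \beta_0 + \beta_1 x_i\]
where \(Y_i\) is the event count for unit \(i\); \(t_i\) is its exposure time or population at risk; \(\log(t_i)\) is the offset, entered with no coefficient to estimate; \(\beta_1\) is then the log rate ratio per unit of \(x_i\). The symbols are illustrative choices, not notation taken from the pathway.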
On the pathway · 02 · Model · Model modifications (splines, interactions)How do you flex the model?
- the overall ideaModel modifications
- effect depends on another variableInteraction term
- fixed exposure term in count modelsOffset
- smooth nonlinear flexible curvesSplines
- additive smooth function componentsGeneralized additive models
- OMOP standardized vocabularies (OHDSI)
-
Common data model mapping heterogeneous source codes to standard concepts so studies run across databases, at some loss of detail. in the pathway →
On the pathway · 01 · Measurement · Claims and coding standardsWhich vocabulary encodes each claim field, and what does it capture?
- the overall ideaClaims and coding standards
- diagnosesICD-10-CM diagnosis codes
- inpatient proceduresICD-10-PCS procedure codes
- professional servicesCPT/HCPCS codes
- dispensed drugsNDC (National Drug Code)
- labs and observationsLOINC lab codes
- providerNPI (provider identifier)
- cross-database mappingOMOP standardized vocabularies (OHDSI)
- translating codesCode crosswalks and mappings
- drug utilizationATC and defined daily dose (DDD)
- Operating characteristics
-
Sensitivity and specificity describe a test in the abstract, while predictive values describe what a result means for a patient and shift with prevalence. in the pathway →
On the pathway · 05 · Decision rule · Operating characteristicsWhich diagnostic-performance measure?
- the overall ideaOperating characteristics
- true positives among diseasedSensitivity
- true negatives among healthySpecificity
- disease probability given a resultPredictive values
- Operationalizing the variable
-
Writing a variable definition precise enough, with codes, thresholds, and windows, that two analysts produce the same cases. in the pathway →
On the pathway · 01 · Measurement · Operationalizing the variableWhat measurement situation are you in?
- turning a concept into a variableOperationalizing the variable
- Opportunity cost
-
The principle that every dollar spent is health some other patient could have had. in the pathway →
On the pathway · 05 · Decision rule · Perspective and the reference caseWhose costs and benefits count?
- the overall ideaPerspective and the reference case
- standardized analysis conventionsReference case
- reporting checklist for economicsCHEERS
- count all costs to societySocietal perspective
- value of foregone alternativesOpportunity cost
- Outcome phenotyping and validation
-
Treating a claims or EHR outcome as an algorithm whose accuracy must be measured, because its predictive value and sensitivity bias the estimate. in the pathway →
On the pathway · 01 · Measurement · Outcome phenotyping and algorithm validationIs your outcome a validated algorithm or an unchecked code rule?
- the overall ideaOutcome phenotyping and validation
- the rule itselfClaims/EHR phenotype algorithm
- common coding rule1-inpatient / 2-outpatient rule
- tuning the ruleAlgorithm validation (PPV and sensitivity tradeoff)
- reference standardEndpoint adjudication and chart review
- bundled outcomesComposite endpoint construction
- Over-adjustment
-
Conditioning on a mediator or collider, adding bias while trying to remove it, the mirror image of confounding. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Overdiagnosis
-
Detecting disease that would never have caused harm, inflating apparent screening benefit. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Overfitting
-
When a model flexible enough to chase noise fits the training data but fails on new data. in the pathway →
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
P
- Parallel trends
-
The unverifiable assumption underlying difference-in-differences: absent treatment, the treated and comparison groups would have followed the same outcome trend. in the pathway →
On the pathway · 02 · Model · Identifying assumptionsWhich identifying assumption do you need?
- the overall ideaIdentifying assumptions
- treatment independent of confoundersConditional independence
- instrument affects outcome only via exposureExclusion restriction
- groups would have tracked togetherParallel trends
- Parameter uncertainty
-
Second-order uncertainty in an input’s true value because it was estimated from finite data, propagated by probabilistic sensitivity analysis. in the pathway →
On the pathway · 05 · Decision rule · Types of uncertaintyWhich source of uncertainty?
- the overall ideaTypes of uncertainty
- uncertainty in input estimatesParameter uncertainty
- random variation between individualsStochastic uncertainty
- uncertainty in model structureStructural uncertainty
- finding where conclusions flipThreshold analysis
- Partial pooling
-
Shrinkage that stabilizes small or sparse groups by borrowing strength from the rest, giving estimates that fall between the fully pooled and fully separate extremes. in the pathway →
On the pathway · 02 · Model · Hierarchical (multilevel) Bayesian modelsWhich multilevel Bayesian idea?
- the overall ideaHierarchical Bayesian models
- borrow strength across groupsPartial pooling
- Partitioned survival model
-
An oncology model reading state membership straight off the progression-free and overall survival curves rather than a transition matrix. in the pathway →
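A minimal sketch of that read-off, assuming the common three-state structure (progression-free, progressed, dead) and survival probabilities already evaluated at one time point. The function name and the example values are hypothetical.

```python
def partition_states(pfs: float, overall_survival: float) -> dict:
    """Split a cohort into three states from two survival probabilities.

    pfs: probability of being alive and progression-free at time t.
    overall_survival: probability of being alive at time t.
    """
    # Fitted curves can cross; cap PFS at OS so no state goes negative.
    pfs = min(pfs, overall_survival)
    return {
        "progression_free": pfs,
        "progressed": overall_survival - pfs,
        "dead": 1.0 - overall_survival,
    }


# Hypothetical values read from the two curves at one time point.
state_shares = partition_states(pfs=0.40, overall_survival=0.70)
print(state_shares)
```

Repeating the calculation at each model cycle gives the state membership over time without estimating any transition probabilities.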
On the pathway · 05 · Decision rule · Decision-analytic modelsWhich model structure fits the problem?
- the overall ideaDecision-analytic models
- branching one-time event sequenceDecision tree (decision analysis)
- recurring health states over cyclesMarkov model
- survival curves partition statesPartitioned survival model
- infection spread depends on prevalenceDynamic transmission model
- weight future values lowerDiscounting
- Pearson correlation
-
A measure of linear association between two continuous variables. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- means across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- PECO
-
The observational cousin of PICO, naming population, exposure, comparator, and outcome. in the pathway →
On the pathway · 00 · Framing · Research question (PICO / PECO)Which question framework fits?
- the overall ideaResearch question
- intervention question for a trialPICO
- add an explicit time horizonPICOT
- add study-design eligibilityPICOS
- exposure question for observational workPECO
- Per-member-per-month costing (PMPM/PPPM)
-
Spend divided by member-months of enrollment, so that populations with different follow-up can be compared at the budget level. in the pathway →
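A short sketch of the arithmetic, using three invented members with different enrollment lengths. The record layout is an assumption for illustration.

```python
# Hypothetical records: (member id, months enrolled, total allowed cost).
members = [
    ("A", 12, 2400.0),
    ("B", 6, 300.0),
    ("C", 3, 0.0),  # enrolled but no spend: still contributes member-months
]

total_cost = sum(cost for _, _, cost in members)
member_months = sum(months for _, months, _ in members)

# PMPM pools spend and enrollment time before dividing.
pmpm = total_cost / member_months
print(f"PMPM = {pmpm:.2f} over {member_months} member-months")
```

Pooling before dividing weights each member by time enrolled; averaging each member's own monthly cost would give a different figure.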
On the pathway · 05 · Decision rule · Real-world cost and HTA methodsModeling skewed real-world costs for HTA?
- valuing real-world costsReal-world cost and HTA methods
- costs are zero-inflatedTwo-part and other cost models
- extreme cost outliers existWinsorization and trimming of cost outliers
- summarizing population spendPer-member-per-month costing (PMPM/PPPM)
- trial ends before lifetimeSurvival extrapolation for HTA
- value has multiple dimensionsMulti-criteria decision analysis (MCDA)
- prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
- Per-protocol
-
Restricting analysis to those who followed the protocol, which answers the biological question but breaks randomization. in the pathway →
On the pathway · 02 · Model · Analysis populations (ITT vs per-protocol)Which set of subjects do you analyze?
- the overall ideaAnalysis populations
- as randomized, regardless of adherenceIntention-to-treat
- only those who followed protocolPer-protocol
- grouped by treatment actually receivedAs-treated
- Persistence (time to discontinuation)
-
Duration from treatment initiation to the first break in supply that exceeds the permissible gap. in the pathway →
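The sketch below shows one way to compute persistence from dispensing records. It makes two assumptions that studies handle differently: leftover supply from an early refill is carried forward, and discontinuation is dated at the end of the last supply before the gap rather than at the end of the grace period. The function name and fill data are hypothetical.

```python
from datetime import date, timedelta


def days_persistent(fills, permissible_gap_days):
    """Days from first fill to the end of supply before the first excessive gap.

    fills: list of (fill_date, days_supply) tuples.
    """
    fills = sorted(fills)
    start = fills[0][0]
    # supply_end is the first day with no drug on hand.
    supply_end = start + timedelta(days=fills[0][1])
    for fill_date, days_supply in fills[1:]:
        gap = (fill_date - supply_end).days
        if gap > permissible_gap_days:
            break  # treatment counts as discontinued at supply_end
        # Early refills extend from the existing supply end (carry-forward).
        supply_end = max(supply_end, fill_date) + timedelta(days=days_supply)
    return (supply_end - start).days


fills = [(date(2024, 1, 1), 30), (date(2024, 2, 5), 30), (date(2024, 4, 20), 30)]
print(days_persistent(fills, permissible_gap_days=30))
print(days_persistent(fills, permissible_gap_days=60))
```

The same fills give different persistence under a 30-day and a 60-day permissible gap, which is why the gap should be stated with the result.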
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
- Person-time
-
Each subject’s time under observation summed across the cohort, the denominator of an incidence rate. in the pathway →
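As a brief illustration with invented follow-up data, person-time is the sum of each subject's observed time, and the incidence rate divides events by that sum.

```python
# Hypothetical cohort: years under observation and whether the event occurred.
follow_up_years = [2.0, 0.5, 3.5, 1.0, 3.0]
had_event = [False, True, False, True, False]

person_years = sum(follow_up_years)  # denominator of the incidence rate
events = sum(had_event)              # True counts as 1

rate_per_1000_py = 1000 * events / person_years
print(f"{events} events over {person_years} person-years "
      f"= {rate_per_1000_py:.0f} per 1,000 person-years")
```

Subjects who have the event or are censored early contribute less time, so two cohorts of the same size can have very different denominators.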
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
- Perspective and the reference case
-
Whose costs count changes the answer, so a standardized reference case and impact inventory make analyses comparable, distinguishing healthcare-sector from societal perspectives. in the pathway →
On the pathway · 05 · Decision rule · Perspective and the reference caseWhose costs and benefits count?
- the overall ideaPerspective and the reference case
- standardized analysis conventionsReference case
- reporting checklist for economicsCHEERS
- count all costs to societySocietal perspective
- value of foregone alternativesOpportunity cost
- PICO
-
Population, intervention, comparator, outcome: a framework forcing a clinical question to be specific enough to design around. in the pathway →
On the pathway · 00 · Framing · Research question (PICO / PECO)Which question framework fits?
- the overall ideaResearch question
- intervention question for a trialPICO
- add an explicit time horizonPICOT
- add study-design eligibilityPICOS
- exposure question for observational workPECO
- PICOS
-
PICO with an appended study design, the convention in systematic reviews. in the pathway →
On the pathway · 00 · Framing · Research question (PICO / PECO)Which question framework fits?
- the overall ideaResearch question
- intervention question for a trialPICO
- add an explicit time horizonPICOT
- add study-design eligibilityPICOS
- exposure question for observational workPECO
- PICOT
-
PICO with an appended timeframe, the convention in clinical-question teaching. in the pathway →
On the pathway · 00 · Framing · Research question (PICO / PECO)Which question framework fits?
- the overall ideaResearch question
- intervention question for a trialPICO
- add an explicit time horizonPICOT
- add study-design eligibilityPICOS
- exposure question for observational workPECO
- Placebo and falsification tests
-
Looking for an effect where none should exist, such as a pre-treatment period or unaffected outcome, to test whether a design is sound. in the pathway →
On the pathway · ∗ · Defend it · Placebo and falsification testsWhat situation is this?
- the overall ideaPlacebo and falsification tests
- Pocock boundary
-
An alpha-spending boundary that holds a constant threshold across interim looks. in the pathway →
On the pathway · 00 · Framing · Interim analyses and group-sequential designWhich interim-monitoring element?
- the overall ideaInterim analyses and group-sequential design
- committee reviewing accruing dataData safety monitoring board
- planned looks with stopping rulesGroup-sequential design
- false-positive risk to spendType-I error
- stringent early-look boundaryO’Brien-Fleming boundary
- constant nominal-level boundaryPocock boundary
- Poisson distribution
-
The distribution of counts of rare, independent events occurring at a constant rate over a fixed interval, with variance equal to the mean. in the pathway →
On the pathway · 01 · Measurement · Probability distributions and the CLTWhich distribution or sampling result?
- the overall ideaProbability distributions
- continuous bell-shaped variableNormal distribution
- fixed-trial success countsBinomial distribution
- counts of rare eventsPoisson distribution
- overdispersed count dataNegative binomial distribution
- why sample means turn normalCentral limit theorem
- spread of a sample estimateStandard error
- Poisson regression
-
A regression for counts returning a rate ratio, assuming the variance equals the mean. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- Positivity
-
The identifiability condition that every kind of unit could have received either treatment, also called overlap. in the pathway →
On the pathway · 02 · Model · Potential outcomes and identifiabilityWhich identifiability condition is at stake?
- the overall frameworkPotential outcomes and identifiability
- the counterfactual setupPotential-outcomes framework
- only one outcome is observedFundamental problem of causal inference
- treated and untreated comparableExchangeability
- every covariate stratum has bothPositivity
- observed equals counterfactual under treatmentConsistency
- no interference, single versionSUTVA
- Posterior distribution
-
The updated distribution of a parameter after combining prior belief with the data. in the pathway →
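The simplest concrete case is a conjugate update, sketched here for a response probability with a Beta prior and binomial data. The prior and the counts are invented, and real analyses rarely have such a closed form.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    return a / (a + b)


# Hypothetical: flat Beta(1, 1) prior, then 7 responders among 10 patients.
post_a, post_b = beta_binomial_update(1, 1, successes=7, failures=3)
posterior_mean = beta_mean(post_a, post_b)
print(f"Posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
```

The posterior mean sits between the prior mean of 0.5 and the observed proportion of 0.7, pulled toward the data as the sample grows.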
On the pathway · 02 · Model · Bayesian inferenceWhich Bayesian concept is in play?
- the overall frameworkBayesian inference
- the updating rule itselfBayes’ theorem
- beliefs after seeing dataPosterior distribution
- interval summary of the posteriorCredible interval
- Posterior predictive check
-
Asking whether data simulated from the fitted model resemble the real data. in the pathway →
On the pathway · 02 · Model · Bayesian computation (MCMC)Which sampling or diagnostic tool?
- the overall ideaBayesian computation
- sampling the posterior generallyMCMC
- proposal-and-accept samplerMetropolis-Hastings
- sample each parameter conditionallyGibbs sampling
- gradient-guided efficient samplerHamiltonian Monte Carlo
- check chains have convergedR-hat
- check model reproduces the dataPosterior predictive check
- Potential outcomes and identifiability
-
A framework defining a causal effect as the contrast of outcomes under treatment and no treatment, with conditions for estimating it from data. in the pathway →
On the pathway · 02 · Model · Potential outcomes and identifiabilityWhich identifiability condition is at stake?
- the overall frameworkPotential outcomes and identifiability
- the counterfactual setupPotential-outcomes framework
- only one outcome is observedFundamental problem of causal inference
- treated and untreated comparableExchangeability
- every covariate stratum has bothPositivity
- observed equals counterfactual under treatmentConsistency
- no interference, single versionSUTVA
- Potential-outcomes framework
-
Imagining for each unit the outcome under treatment and under no treatment, whose contrast is the causal effect. in the pathway →
On the pathway · 02 · Model · Potential outcomes and identifiabilityWhich identifiability condition is at stake?
- the overall frameworkPotential outcomes and identifiability
- the counterfactual setupPotential-outcomes framework
- only one outcome is observedFundamental problem of causal inference
- treated and untreated comparableExchangeability
- every covariate stratum has bothPositivity
- observed equals counterfactual under treatmentConsistency
- no interference, single versionSUTVA
- Pre-registration
-
A public commitment on ClinicalTrials.gov or the Open Science Framework that locks the endpoint before data are unblinded. in the pathway →
On the pathway · 00 · Framing · Endpoint logic and pre-registrationWhich endpoint or pre-specification concern?
- the overall ideaEndpoint logic and pre-registration
- the main pre-specified outcomePrimary endpoint
- a stand-in for the outcomeSurrogate endpoint
- criteria validating a surrogatePrentice’s criteria
- lock analysis plan in advancePre-registration
- Precision
-
The share of positive predictions that are correct, the same as positive predictive value. in the pathway →
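A quick sketch from confusion-matrix counts, with recall and the F1 score from the same pathway node included for comparison. The counts are made up.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical classifier: 40 true positives, 10 false positives, 60 missed cases.
tp, fp, fn = 40, 10, 60
p, r = precision(tp, fp), recall(tp, fn)
print(f"precision = {p:.2f}, recall = {r:.2f}, F1 = {f1_score(p, r):.3f}")
```

Here most flagged cases are real (precision 0.80) even though the classifier misses most true cases (recall 0.40), which is why the two are reported together.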
On the pathway · 02 · Model · Classification performance metricsWhich classification metric?
- the overall ideaClassification performance metrics
- share of predicted positives correctPrecision
- share of true positives caughtRecall
- balance precision and recallF1 score
- tradeoff across all thresholdsPrecision-recall curve
- Precision-recall curve
-
A plot of precision against recall across all decision thresholds, a more honest summary of classifier performance than ROC-AUC under class imbalance. in the pathway →
On the pathway · 02 · Model · Classification performance metricsWhich classification metric?
- the overall ideaClassification performance metrics
- share of predicted positives correctPrecision
- share of true positives caughtRecall
- balance precision and recallF1 score
- tradeoff across all thresholdsPrecision-recall curve
- Prediction and machine learning
-
Flexible models for predicting rather than explaining, judged on out-of-sample error and calibration, not coefficient plausibility. in the pathway →
On the pathway · 02 · Model · Prediction and machine learningWhich prediction concept is in play?
- the overall areaPrediction and machine learning
- explaining individual predictionsSHAP
- Prediction interval
-
The range a new study’s true effect might fall in, wider than the confidence interval and more honest under substantial heterogeneity. in the pathway →
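One commonly used approximate form for a random-effects meta-analysis of \(k\) studies, given here as a sketch rather than the only available method:
\[\hat{\mu} \pm t_{k-2} \sqrt{\hat{\tau}^{2} + \text{SE}(\hat{\mu})^{2}}\]
where \(\hat{\mu}\) is the pooled random-effects estimate; \(\text{SE}(\hat{\mu})\) is its standard error; \(\hat{\tau}^{2}\) is the estimated between-study variance; \(t_{k-2}\) is the chosen quantile of the t distribution with \(k-2\) degrees of freedom, for example the 97.5th percentile for a 95% interval. The \(\hat{\tau}^{2}\) term is what makes the interval wider than the confidence interval whenever heterogeneity is present.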
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
- Predictive values
-
What a positive or negative test result means for the patient in front of you, shifting with the prevalence of disease. in the pathway →
\[\text{PPV} = \frac{\text{sens} \cdot \text{prev}}{\text{sens} \cdot \text{prev} + (1 - \text{spec}) \cdot (1 - \text{prev})}\]
where \(\text{PPV}\) is the positive predictive value, the chance a positive result is a true case; \(\text{sens}\) is the sensitivity, the chance a true case tests positive; \(\text{spec}\) is the specificity, the chance a non-case tests negative; \(\text{prev}\) is the prevalence, the share of the tested population with the disease.
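The sketch below evaluates the formula for a hypothetical test with 90% sensitivity and 90% specificity at several prevalences. The negative predictive value is added as the analogous quantity for a negative result; it is not spelled out in the entry above.

```python
def ppv(sens: float, spec: float, prev: float) -> float:
    """Positive predictive value: P(disease | positive result)."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def npv(sens: float, spec: float, prev: float) -> float:
    """Negative predictive value: P(no disease | negative result)."""
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)


# Hypothetical test characteristics, held fixed while prevalence varies.
sens, spec = 0.90, 0.90
for prev in (0.01, 0.10, 0.50):
    print(f"prev = {prev:.2f}: PPV = {ppv(sens, spec, prev):.3f}, "
          f"NPV = {npv(sens, spec, prev):.3f}")
```

The same test yields a PPV of about 8% at 1% prevalence and 90% at 50% prevalence, so a positive result means very different things in screening and in a referral clinic.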
On the pathway · 05 · Decision rule · Operating characteristicsWhich diagnostic-performance measure?
- the overall ideaOperating characteristics
- true positives among diseasedSensitivity
- true negatives among healthySpecificity
- disease probability given a resultPredictive values
- Prentice’s criteria
-
The formal test for whether a surrogate endpoint validly captures a treatment’s effect on the true clinical outcome. in the pathway →
On the pathway · 00 · Framing · Endpoint logic and pre-registrationWhich endpoint or pre-specification concern?
- the overall ideaEndpoint logic and pre-registration
- the main pre-specified outcomePrimary endpoint
- a stand-in for the outcomeSurrogate endpoint
- criteria validating a surrogatePrentice’s criteria
- lock analysis plan in advancePre-registration
- Prevalence
-
The share of a population that has a condition at a point in time or over a window, reflecting both occurrence and duration. in the pathway →
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
- Primary endpoint
-
The outcome the sample size is built on and the headline claim is read against, with everything else secondary. in the pathway →
On the pathway · 00 · Framing · Endpoint logic and pre-registrationWhich endpoint or pre-specification concern?
- the overall ideaEndpoint logic and pre-registration
- the main pre-specified outcomePrimary endpoint
- a stand-in for the outcomeSurrogate endpoint
- criteria validating a surrogatePrentice’s criteria
- lock analysis plan in advancePre-registration
- Principal component analysis
-
A dimensionality-reduction method finding the orthogonal directions of greatest variance. in the pathway →
On the pathway · 02 · Model · Unsupervised learningWhat unlabeled-data structure are you finding?
- the overall familyUnsupervised learning
- grouping similar observationsClustering
- nested grouping by linkageHierarchical clustering
- partitioning into k groupsK-means
- reducing the number of featuresDimensionality reduction
- orthogonal variance componentsPrincipal component analysis
- Principal-stratum strategy
-
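An intercurrent-event strategy restricting to the stratum of patients defined by whether the event would occur under each treatment, for example those who would not have it on either arm. in the pathway →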
On the pathway · 02 · Model · Trial estimands and intercurrent eventsHow do you handle intercurrent events?
- the overall ideaTrial estimands and intercurrent events
- events disrupting outcome interpretationIntercurrent events
- ignore them, use assigned treatmentTreatment-policy strategy
- imagine they did not occurHypothetical strategy
- fold event into the outcomeComposite strategy
- restrict to a defined subpopulationPrincipal-stratum strategy
- PRISMA
-
The reporting checklist and flow diagram for systematic reviews. in the pathway →
On the pathway · 06 · Recommendation · Reporting standardsWhich study type are you reporting?
- the overall ideaReporting standards
- randomized controlled trialCONSORT
- observational studySTROBE
- systematic reviewPRISMA
- prediction model studyTRIPOD
- Privacy-preserving record linkage (tokenization)
-
Matching records across datasets using encrypted tokens instead of raw identifiers, so patients can be linked without revealing who they are. in the pathway →
On the pathway · 01 · Measurement · Data feasibility, enrollment, and linkageCan this data actually answer my question?
- the overall ideaData feasibility, enrollment, and linkage
- you need observable follow-upContinuous enrollment and observable time
- you must size the populationDatabase feasibility and the attrition funnel
- you join multiple datasetsPrivacy-preserving record linkage (tokenization)
- Probabilistic sensitivity analysis
-
Propagating parameter uncertainty through a Monte Carlo simulation that draws each parameter from a distribution and reruns the model thousands of times. in the pathway →
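A toy version of that loop is sketched below using only the standard library. Everything in it is an assumption for illustration: the one-line decision model, the distributions and their parameters, the fixed drug cost, and the willingness-to-pay threshold used to summarize each draw as a net monetary benefit.

```python
import random


def probability_cost_effective(n_draws=5000, wtp=50_000.0, seed=1):
    """Share of Monte Carlo draws in which the new drug has positive
    net monetary benefit at willingness-to-pay `wtp` per QALY."""
    rng = random.Random(seed)
    drug_cost = 500.0  # treated as known, so not sampled
    wins = 0
    for _ in range(n_draws):
        # Draw each uncertain input from its (hypothetical) distribution.
        p_event_control = rng.betavariate(20, 80)      # baseline event risk
        relative_risk = rng.lognormvariate(-0.35, 0.15)
        cost_per_event = rng.gammavariate(100, 100)    # mean 10,000
        qaly_loss_per_event = rng.betavariate(30, 70)  # mean 0.3

        # Rerun the (very small) model with this draw.
        events_avoided = p_event_control * (1 - relative_risk)
        delta_cost = drug_cost - events_avoided * cost_per_event
        delta_qaly = events_avoided * qaly_loss_per_event

        if wtp * delta_qaly - delta_cost > 0:
            wins += 1
    return wins / n_draws


print(probability_cost_effective())
```

Repeating the call over a range of `wtp` values would trace out a cost-effectiveness acceptability curve.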
On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)How are you handling cost-effectiveness uncertainty?
- the overall ideaUncertainty in cost-effectiveness (PSA)
- propagating parameter uncertaintyProbabilistic sensitivity analysis
- plotting cost and effect differencesCost-effectiveness plane
- probability of being cost-effectiveCost-effectiveness acceptability curve
- Probability distributions
-
The theoretical distributions that model data and supply the reference for test statistics. in the pathway →
On the pathway · 01 · Measurement · Probability distributions and the CLTWhich distribution or sampling result?
- the overall ideaProbability distributions
- continuous bell-shaped variableNormal distribution
- fixed-trial success countsBinomial distribution
- counts of rare eventsPoisson distribution
- overdispersed count dataNegative binomial distribution
- why sample means turn normalCentral limit theorem
- spread of a sample estimateStandard error
- Probability sample
-
A sample giving every unit a known, nonzero chance of selection, the basis for generalizing to the population. in the pathway →
On the pathway · 01 · Measurement · Survey sampling designHow do you draw the sample?
- the overall ideaSurvey sampling design
- every unit known nonzero chanceProbability sample
- equal-chance draw from frameSimple random sampling
- sample within population strataStratified sampling
- sample whole groups togetherCluster sampling
- sample in successive nested stagesMultistage sampling
- variance inflation from clusteringDesign effect
- Propensity score
-
The probability of treatment given covariates, used to match or weight treated and untreated on measured confounders. in the pathway →
On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)How do you estimate the causal effect?
- the overall ideaCausal estimators
- model treatment assignment probabilityPropensity score
- reweight by inverse treatment probabilityIPTW
- model and average outcomesG-formula
- combine outcome and treatment modelsDoubly-robust estimators
- targeted machine-learning estimationTMLE
- Proportion of days covered (PDC)
-
Fraction of days in a period on which a patient had drug supply on hand, with overlapping fills handled so that no day is counted twice and the value cannot exceed one. in the pathway →
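A minimal sketch of one way to compute PDC, assuming overlapping days are simply counted once; some definitions instead shift the overlapping supply forward. The function name, window, and fill data are hypothetical.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, window_start, window_end):
    """PDC over an inclusive window.

    fills: list of (fill_date, days_supply) tuples.
    """
    covered = set()  # a set keeps each calendar day at most once
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if window_start <= day <= window_end:
                covered.add(day)
    window_days = (window_end - window_start).days + 1
    return len(covered) / window_days


# Hypothetical fills: the second overlaps the first by 10 days.
fills = [(date(2024, 1, 1), 30), (date(2024, 1, 21), 30), (date(2024, 3, 1), 30)]
pdc = proportion_of_days_covered(fills, date(2024, 1, 1), date(2024, 3, 30))
print(f"PDC = {pdc:.3f}")
```

Summing days supplied without removing the overlap would give 90 of 90 days here, whereas the de-duplicated count is 80 of 90.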
On the pathway · 01 · Measurement · Defining exposure in real-world dataHow do raw fills become a defined exposure with a start and end?
- the overall ideaExposure definition in RWD
- building one courseExposure episode construction
- tolerating gapsGrace period and permissible gap
- shifting the clockInduction, latency, and lag windows
- adherence metricProportion of days covered (PDC)
- adherence metricMedication possession ratio (MPR)
- how long treatedPersistence (time to discontinuation)
- standardized spanDrug era (OMOP)
- Proportional hazards
-
The Cox-model assumption that the hazard ratio is constant over time, checked with scaled Schoenfeld residuals or a log-log survival plot. in the pathway →
On the pathway · 02 · Model · Checking model assumptionsWhich model assumption to check?
- the overall ideaChecking model assumptions
- non-constant residual varianceHeteroscedasticity
- predictors too collinearVariance inflation factor
- single points driving the fitCook’s distance
- hazard ratio constant over timeProportional hazards
- test proportionality formallySchoenfeld residuals
- fix variance without refittingRobust standard errors
- PROSPERO
-
The register where a systematic review protocol is recorded before screening, keeping the review from becoming a search for the wanted result. in the pathway →
On the pathway · 04 · Synthesis · Conducting a systematic reviewWhat situation?
- the overall ideaConducting a systematic review
- registering the review protocolPROSPERO
- Publication bias
-
Positive results being published while null ones vanish, inflating a pooled estimate and often visible as funnel-plot asymmetry. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
Q
- QALY
-
Quality-adjusted life year: time spent in a health state multiplied by a utility weight between zero, equivalent to death, and one, full health. in the pathway →
On the pathway · 05 · Decision rule · QALYs and health-state utilitiesWhich utility concept do you need?
- the overall ideaQALYs and health-state utilities
- quality-adjusted life yearsQALY
- preference weight for a health stateHealth-state utility
- a standardized utility instrumentEQ-5D
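The multiplication in the definition above can be sketched in a few lines. This is a minimal illustration without discounting; the function name `qalys` and the patient profile are hypothetical, chosen only to show the arithmetic.

```python
def qalys(states):
    """Sum duration (years) x utility weight over a sequence of health states.

    `states` is a list of (years, utility) pairs; utility runs from
    0 (equivalent to death) to 1 (full health). No discounting is applied.
    """
    total = 0.0
    for years, utility in states:
        if years < 0:
            raise ValueError("duration must be non-negative")
        if not 0.0 <= utility <= 1.0:
            raise ValueError("this sketch expects utilities between 0 and 1")
        total += years * utility
    return total


# Hypothetical profile: 2 years at utility 0.8, then 3 years at utility 0.5.
profile = [(2.0, 0.8), (3.0, 0.5)]
print(round(qalys(profile), 2))
```

Five calendar years here are worth about 3.1 QALYs, because most of the time is spent in the lower-utility state.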
- QALYs and health-state utilities
-
The quality-adjusted life year multiplies time in a health state by a utility weight anchored between zero (death) and one (full health). in the pathway →
On the pathway · 05 · Decision rule · QALYs and health-state utilitiesWhich utility concept do you need?
- the overall ideaQALYs and health-state utilities
- quality-adjusted life yearsQALY
- preference weight for a health stateHealth-state utility
- a standardized utility instrumentEQ-5D
- Quantitative bias analysis
-
Methods that assign explicit numerical assumptions to bias and propagate them into adjusted estimates and intervals. in the pathway →
On the pathway · ∗ · Defend it · Negative controls and empirical calibrationNeed to detect hidden residual confounding?
- probing residual confoundingNegative controls and calibration
- checking confounding on exposureNegative control exposure
- checking confounding on outcomeNegative control outcome
- many negative controls availableEmpirical calibration
- quantifying systematic errorQuantitative bias analysis
- Questionnaire and instrument design
-
Fixing before fieldwork what a survey can measure, through item wording, response format, administration mode, and branching. in the pathway →
On the pathway · 01 · Measurement · Questionnaire and instrument designWhich questionnaire flaw is in play?
- the overall craftQuestionnaire and instrument design
- asking two things at onceDouble-barreled question
- wording that steers the answerLeading question
R
- R-hat
-
A convergence statistic that should sit near 1 when MCMC chains started far apart have mixed. in the pathway →
On the pathway · 02 · Model · Bayesian computation (MCMC)Which sampling or diagnostic tool?
- the overall ideaBayesian computation
- sampling the posterior generallyMCMC
- proposal-and-accept samplerMetropolis-Hastings
- sample each parameter conditionallyGibbs sampling
- gradient-guided efficient samplerHamiltonian Monte Carlo
- check chains have convergedR-hat
- check model reproduces the dataPosterior predictive check
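As a rough sketch of what the statistic compares, the classic (non-split) Gelman-Rubin form sets between-chain variance against within-chain variance. The function name `r_hat` and the toy chains are hypothetical, and modern samplers usually report a rank-normalized split version that this sketch does not implement.

```python
import math
from statistics import mean, variance


def r_hat(chains):
    """Classic Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    within = mean(variance(c) for c in chains)        # W: average within-chain variance
    between = n * variance([mean(c) for c in chains])  # B: n times variance of chain means
    pooled = (n - 1) / n * within + between / n        # pooled estimate of posterior variance
    return math.sqrt(pooled / within)


# Hypothetical chains: the first pair sits in different regions, the second pair overlaps.
stuck = [[0, 1, 0, 1], [10, 11, 10, 11]]
mixed = [[0, 1, 0, 1], [1, 0, 1, 0]]
print(r_hat(stuck), r_hat(mixed))
```

Chains that have not mixed give a value well above 1, because the chain means disagree far more than the draws within each chain do.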
- R-squared
-
The share of outcome variance a model explains, which climbs mechanically as predictors are added, so use the adjusted or out-of-sample version. in the pathway →
On the pathway · 03 · Estimate · Model fit, comparison, and prediction errorWhich fit or error measure do you need?
- the overall ideaModel fit, comparison, and prediction error
- variance explained, linear modelR-squared
- pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
- compare models, penalize parametersAIC
- compare models, penalize more heavilyBIC
- prediction error, average magnitudeMean absolute error
- prediction error, penalize large missesRMSE
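One common form of the adjusted version mentioned above, given as a sketch for an ordinary least-squares model with an intercept (the symbols \(n\) and \(p\) are introduced here for illustration):
\[R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}\]
where \(R^2_{\text{adj}}\) is the adjusted R-squared; \(R^2\) is the unadjusted share of variance explained; \(n\) is the number of observations; \(p\) is the number of predictors, excluding the intercept.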
- Random forest
-
A bagging ensemble that averages many trees trained on bootstrap resamples, with a random subset of predictors considered at each split to decorrelate the trees. in the pathway →
On the pathway · 02 · Model · Learning algorithms and ensemblesWhich learner or ensemble fits?
- the overall ideaLearning algorithms and ensembles
- single interpretable splitsDecision tree (machine learning)
- classify by closest neighborsK-nearest neighbours
- maximum-margin separating boundarySupport vector machine
- average parallel bootstrapped modelsBagging
- many decorrelated bagged treesRandom forest
- sequentially correct prior errorsBoosting
- Random-effects meta-analysis
-
A pooling model assuming the true effect varies across studies, adding the between-study variance to each study’s own variance before weighting, which evens out the weights and widens the interval. in the pathway →
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
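The weighting described above can be sketched with the DerSimonian-Laird estimator, one common way (among several) to estimate the between-study variance. The function name and the three study effects below are hypothetical, and the interval would normally be built from the returned standard error.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau-squared estimate.

    effects: per-study effect estimates (for example log risk ratios)
    variances: per-study within-study variances
    Returns (pooled effect, standard error, tau-squared).
    """
    w = [1.0 / v for v in variances]                   # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                      # truncated at zero
    w_re = [1.0 / (v + tau2) for v in variances]       # between-study variance added
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


# Hypothetical studies with equal precision but scattered effects.
pooled, se, tau2 = dersimonian_laird([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
print(pooled, se, tau2)
```

With these inputs the fixed-effect standard error would be about 0.06, while the random-effects standard error is about 0.23, which is the interval widening the definition refers to.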
- Randomization and blinding
-
The schemes that assign trial arms and the safeguards, allocation concealment and blinding, that keep that assignment from being gamed or biased. in the pathway →
On the pathway · 00 · Framing · Randomization and blindingWhat allocation or masking concern?
- the overall ideaRandomization and blinding
- hide the upcoming assignmentAllocation concealment
- balance arms in small chunksBlock randomization
- balance within prognostic strataStratified randomization
- dynamically balance many factorsMinimization
- mask treatment after allocationBlinding
- Real-world causal-inference extensions
-
Methods extending propensity-score and g-methods to high-dimensional claims data and to treatment and censoring that vary over follow-up time. in the pathway →
On the pathway · 02 · Model · Real-world causal-inference extensionsHow do causal methods scale to claims and time?
- the overall ideaReal-world causal-inference extensions
- you have many candidate covariatesHigh-dimensional propensity score (hdPS)
- outcome modeling suits the problemDisease risk score
- treatment strategy unfolds over timeClone-censor-weight
- censoring depends on covariatesInverse-probability-of-censoring weighting (IPCW)
- you need a simple guardLandmark analysis
- Real-world cost and HTA methods
-
Techniques for modeling skewed real-world costs and extrapolating trial data into health technology assessment decisions. in the pathway →
On the pathway · 05 · Decision rule · Real-world cost and HTA methodsModeling skewed real-world costs for HTA?
- valuing real-world costsReal-world cost and HTA methods
- costs are zero-inflatedTwo-part and other cost models
- extreme cost outliers existWinsorization and trimming of cost outliers
- summarizing population spendPer-member-per-month costing (PMPM/PPPM)
- trial ends before lifetimeSurvival extrapolation for HTA
- value has multiple dimensionsMulti-criteria decision analysis (MCDA)
- prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
- Recall
-
The share of true positives caught, the same as sensitivity. in the pathway →
On the pathway · 02 · Model · Classification performance metricsWhich classification metric?
- the overall ideaClassification performance metrics
- share of predicted positives correctPrecision
- share of true positives caughtRecall
- balance precision and recallF1 score
- tradeoff across all thresholdsPrecision-recall curve
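The three threshold-specific metrics in this list follow directly from confusion-matrix counts. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, false-negative counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("need at least one predicted positive and one actual positive")
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of true positives caught (sensitivity)
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1


# Hypothetical classifier: 8 true positives, 2 false positives, 8 missed cases.
print(precision_recall_f1(tp=8, fp=2, fn=8))
```

Here precision is 0.8 but recall is only 0.5, and the F1 score of about 0.62 sits closer to the weaker of the two.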
- Recall bias
-
Differential memory of past exposure between cases and controls. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Reference case
-
A standardized set of methods recommended by the Second Panel on Cost-Effectiveness in Health and Medicine, reported alongside any analysis so results are comparable. in the pathway →
On the pathway · 05 · Decision rule · Perspective and the reference caseWhose costs and benefits count?
- the overall ideaPerspective and the reference case
- standardized analysis conventionsReference case
- reporting checklist for economicsCHEERS
- count all costs to societySocietal perspective
- value of foregone alternativesOpportunity cost
- Registries
-
Purpose-built data for one disease, deep but narrow. in the pathway →
On the pathway · 01 · Measurement · Data sources and their tradeoffsWhich data source or pitfall?
- the overall ideaData sources and their tradeoffs
- clinical detail from care recordsElectronic health record data
- billing records across encountersClaims data
- enrolled cohorts for a conditionRegistries
- sampled population questionnairesSurvey data
- group-level inference pitfallEcological fallacy
- Regression discontinuity
-
A causal design exploiting a cutoff in an assignment variable, resting on the assumption that potential outcomes are continuous at the cutoff. in the pathway →
On the pathway · 02 · Model · Causal designs without randomizationWhich quasi-experimental design fits?
- the overall ideaCausal designs without randomization
- before-after across exposed and controlDifference-in-differences
- a haphazard nudge to exposureInstrumental variables
- assignment by a cutoff thresholdRegression discontinuity
- weighted donors build a counterfactualSynthetic control
- Regression families
-
The principle that the outcome type dictates the model, with most choices being generalized linear models that pair an outcome distribution with a link function. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
- Regularization
-
Penalizing model complexity to buy the right flexibility, through ridge, lasso, or elastic net. in the pathway →
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
- Regulatory pathways and registration
-
The regulatory frame around a study informing a regulated decision, including FDA IND or IDE applications and mandatory ClinicalTrials.gov registration and results posting. in the pathway →
On the pathway · § · Conduct it · Regulatory pathways and registrationWhich regulatory application?
- the overall ideaRegulatory pathways and registration
- investigational drug applicationIND
- investigational device exemptionIDE
- Relative versus absolute
-
The communication choice of whether to lead with a relative effect, which can sound large, or an absolute effect, where benefit becomes concrete. in the pathway →
On the pathway · 03 · Estimate · Relative versus absoluteWhich scale frames the effect?
- the overall ideaRelative versus absolute
- effect on the absolute scaleAbsolute risk reduction
- Reliability
-
Reproducibility: measuring the same quantity again and getting the same answer. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
- Reliability and validity
-
Two independent properties of a measurement: reproducibility on repeat, and whether it measures what it claims. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
- Reliability ratio
-
The signal’s share of total variance, by which non-differential error attenuates a true slope. in the pathway →
\[\lambda = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}\]
where \(\lambda\) is the reliability ratio, the signal’s share of total variance; \(\sigma^2_{\text{true}}\) is the variance of the true values; \(\sigma^2_{\text{error}}\) is the variance of the measurement error.
On the pathway · 01 · Measurement · Measurement error and misclassificationWhat kind of measurement error?
- the overall ideaMeasurement error and misclassification
- error unrelated to other variablesNon-differential misclassification
- error differing by groupDifferential misclassification
- true variance over observed varianceReliability ratio
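To make the attenuation concrete, the sketch below applies the formula above to a single predictor in a simple linear regression with classical non-differential error, which is the setting where the observed slope is expected to equal \(\lambda\) times the true slope. The function names and variances are hypothetical.

```python
def reliability_ratio(var_true, var_error):
    """Signal variance as a share of total (signal plus error) variance."""
    if var_true <= 0 or var_error < 0:
        raise ValueError("var_true must be positive and var_error non-negative")
    return var_true / (var_true + var_error)


def attenuated_slope(true_slope, var_true, var_error):
    """Expected observed slope for one error-prone predictor (classical error assumed)."""
    return reliability_ratio(var_true, var_error) * true_slope


# Hypothetical exposure: true variance 4, measurement-error variance 1.
lam = reliability_ratio(4.0, 1.0)
print(lam, attenuated_slope(2.0, 4.0, 1.0))
```

With a reliability ratio of 0.8, a true slope of 2.0 would be expected to appear as 1.6.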
- Reporting standards
-
Checklists like CONSORT, STROBE, PRISMA, and TRIPOD that make a study’s methods auditable by requiring the details that let a reader judge it. in the pathway →
On the pathway · 06 · Recommendation · Reporting standardsWhich study type are you reporting?
- the overall ideaReporting standards
- randomized controlled trialCONSORT
- observational studySTROBE
- systematic reviewPRISMA
- prediction model studyTRIPOD
- Research ethics and the IRB
-
Modern research ethics rests on the three Belmont principles and is enforced before a study starts by an institutional review board weighing risks against benefits. in the pathway →
On the pathway · § · Conduct it · Research ethics and the IRBWhich ethics concept or body?
- the overall ideaResearch ethics and the IRB
- foundational ethical principlesBelmont principles
- genuine uncertainty justifying a trialClinical equipoise
- participant’s voluntary agreementInformed consent
- body that reviews and approves studiesInstitutional review board
- Research question
-
A study’s question written specifically enough to act on, using PICO or PECO to fix population, intervention or exposure, comparator, and outcome. in the pathway →
On the pathway · 00 · Framing · Research question (PICO / PECO)Which question framework fits?
- the overall ideaResearch question
- intervention question for a trialPICO
- add an explicit time horizonPICOT
- add study-design eligibilityPICOS
- exposure question for observational workPECO
- Restricted mean survival time
-
The average event-free time up to a fixed horizon (the area under the survival curve to that point), a summary that remains meaningful under non-proportional hazards and gives a number a patient can actually use. in the pathway →
On the pathway · 03 · Estimate · Hazard ratios and non-proportional hazardsWhich survival concept?
- the overall ideaHazard ratios and non-proportional hazards
- censoring unrelated to outcomeNon-informative censoring
- summary when hazards are non-proportionalRestricted mean survival time
- Ridge regression
-
L2 regularization that shrinks coefficients toward zero without setting any of them exactly to zero. in the pathway →
On the pathway · 02 · Model · Bias-variance and regularizationWhich concept or penalty?
- the overall ideaBias-variance and regularization
- the underlying error tradeoffBias-variance tradeoff
- fitting noise, poor generalizationOverfitting
- penalizing complexity broadlyRegularization
- estimating out-of-sample errorCross-validation
- shrink coefficients, keep allRidge regression
- shrink and select variablesLasso
- blend selection and shrinkageElastic net
- Risk calculators and prediction tools
-
A model packaged for bedside use that carries its development population with it, so external validation and recalibration matter before its output drives action. in the pathway →
On the pathway · 05 · Decision rule · Risk calculators and prediction toolsWhat situation?
- the overall ideaRisk calculators and prediction tools
- Risk difference
-
Absolute effect measure: the risk in the exposed group minus the risk in the unexposed group. in the pathway →
On the pathway · 03 · Estimate · Effect measuresWhich effect measure to report?
- the overall ideaEffect measures
- ratio of risks between groupsRisk ratio
- ratio of odds between groupsOdds ratio
- absolute difference in riskRisk difference
- patients treated per outcome preventedNumber needed to treat
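The four measures in this list can all be read off one two-by-two table. A minimal sketch with a hypothetical function name and made-up counts, assuming no empty cells:

```python
def effect_measures(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk ratio, odds ratio, risk difference, and NNT from two-group counts."""
    risk_exp = events_exposed / n_exposed
    risk_unexp = events_unexposed / n_unexposed
    odds_exp = events_exposed / (n_exposed - events_exposed)
    odds_unexp = events_unexposed / (n_unexposed - events_unexposed)
    risk_difference = risk_exp - risk_unexp
    return {
        "risk_ratio": risk_exp / risk_unexp,
        "odds_ratio": odds_exp / odds_unexp,
        "risk_difference": risk_difference,
        # Reciprocal of the absolute risk difference; undefined when it is zero.
        "nnt": 1 / abs(risk_difference) if risk_difference != 0 else float("inf"),
    }


# Hypothetical trial: 10 of 100 treated and 20 of 100 untreated have the outcome.
print(effect_measures(10, 100, 20, 100))
```

The same data give a risk ratio of 0.5, an odds ratio of about 0.44, a risk difference of -0.10, and a number needed to treat of 10, which is why the choice of measure shapes how large the effect sounds.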
- Risk ratio
-
Relative effect measure: the risk in the exposed group divided by the risk in the unexposed group. in the pathway →
On the pathway · 03 · Estimate · Effect measuresWhich effect measure to report?
- the overall ideaEffect measures
- ratio of risks between groupsRisk ratio
- ratio of odds between groupsOdds ratio
- absolute difference in riskRisk difference
- patients treated per outcome preventedNumber needed to treat
- Risk-of-bias appraisal
-
Scoring, domain by domain, how a study’s design and conduct threaten its result, using structured tools like RoB 2 for randomized trials and ROBINS-I for non-randomized studies of interventions. in the pathway →
On the pathway · 04 · Synthesis · Risk-of-bias appraisalWhich risk-of-bias tool fits?
- the overall ideaRisk-of-bias appraisal
- randomized trialsRoB 2
- non-randomized intervention studiesROBINS-I
- RMSE
-
Root mean squared error, the prediction error in the outcome’s own units that punishes large misses hardest. in the pathway →
On the pathway · 03 · Estimate · Model fit, comparison, and prediction errorWhich fit or error measure do you need?
- the overall ideaModel fit, comparison, and prediction error
- variance explained, linear modelR-squared
- pseudo-fit for logistic modelMcFadden’s pseudo-R-squared
- compare models, penalize parametersAIC
- compare models, penalize more heavilyBIC
- prediction error, average magnitudeMean absolute error
- prediction error, penalize large missesRMSE
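The difference between the two prediction-error measures in this list shows up as soon as one miss is much larger than the rest. A minimal sketch with hypothetical function names and data:

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the misses, in the outcome's own units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: squaring weights large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical predictions: three misses of 1 unit and one miss of 5 units.
actual = [0.0, 0.0, 0.0, 0.0]
predicted = [1.0, 1.0, 1.0, 5.0]
print(mean_absolute_error(actual, predicted), rmse(actual, predicted))
```

The mean absolute error is 2.0, while the RMSE is the square root of 7, about 2.65, because the single 5-unit miss dominates the squared sum.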
- RoB 2
-
A structured tool for scoring risk of bias in randomized trials, domain by domain. in the pathway →
On the pathway · 04 · Synthesis · Risk-of-bias appraisalWhich risk-of-bias tool fits?
- the overall ideaRisk-of-bias appraisal
- randomized trialsRoB 2
- non-randomized intervention studiesROBINS-I
- ROBINS-I
-
A structured tool for scoring risk of bias in non-randomized studies of interventions, domain by domain. in the pathway →
On the pathway · 04 · Synthesis · Risk-of-bias appraisalWhich risk-of-bias tool fits?
- the overall ideaRisk-of-bias appraisal
- randomized trialsRoB 2
- non-randomized intervention studiesROBINS-I
- Robust standard errors
-
Heteroscedasticity-robust (sandwich) standard errors, the modern default for non-constant variance. in the pathway →
On the pathway · 02 · Model · Checking model assumptionsWhich model assumption to check?
- the overall ideaChecking model assumptions
- non-constant residual varianceHeteroscedasticity
- predictors too collinearVariance inflation factor
- single points driving the fitCook’s distance
- hazard ratio constant over timeProportional hazards
- test that proportionality formallySchoenfeld residuals
- fix variance without refittingRobust standard errors
- Robust statistics for heavy tails
-
Median-based summaries and MAD-scaled z-scores that resist the outliers which dominate means and standard deviations in heavy-tailed data. in the pathway →
On the pathway · 02 · Model · Robust statistics for heavy tailsWhich robust measure?
- the overall ideaRobust statistics for heavy tails
- robust spread of the dataMAD
- robust outlier-resistant standardizationRobust z-score
- Robust z-score
-
A z-score built from the median and MAD so extreme points no longer set the scale. in the pathway →
\[z = \frac{x - \text{median}}{1.4826 \times \text{MAD}}\]
where \(z\) is the robust z-score for a value; \(x\) is the value being scored; \(\text{median}\) is the median of the data, the robust center; \(\text{MAD}\) is the median absolute deviation, the robust spread; \(1.4826\) rescales the MAD so that it estimates the standard deviation when the data are normal.
On the pathway · 02 · Model · Robust statistics for heavy tailsWhich robust measure?
- the overall ideaRobust statistics for heavy tails
- robust spread of the dataMAD
- robust outlier-resistant standardizationRobust z-score
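The formula above translates directly into code. This is a minimal sketch using only the standard library; the function name and the sample values are hypothetical, and the sketch refuses to score data whose MAD is zero.

```python
from statistics import median

NORMAL_SCALE = 1.4826  # rescales the MAD to estimate the SD under normality


def robust_z_scores(values):
    """Score each value as (x - median) / (1.4826 * MAD)."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; more than half the values are identical")
    return [(v - center) / (NORMAL_SCALE * mad) for v in values]


# Hypothetical heavy-tailed sample: one extreme point among small values.
data = [1, 2, 3, 4, 100]
print(robust_z_scores(data))
```

The median is 3 and the MAD is 1, so the value 100 scores about 65, while an ordinary z-score computed from the mean and standard deviation of the same data would be under 2 because the outlier inflates both.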
- Rosenbaum bounds
-
A method quantifying how much unmeasured confounding would overturn a result in matched designs, analogous to the E-value. in the pathway →
On the pathway · ∗ · Defend it · Bias quantificationHow do you quantify unmeasured bias?
- the overall ideaBias quantification
- strength needed to explain awayE-value
- hidden bias in matched designsRosenbaum bounds
S
- Safe Harbor
-
A HIPAA de-identification method that strips eighteen specified identifiers from a dataset. in the pathway →
On the pathway · § · Conduct it · Data privacy and securityWhich rule or method?
- the overall ideaData privacy and security
- US health privacy lawHIPAA
- EU data protection lawGDPR
- de-identify by removing identifiersSafe Harbor
- de-identify by statistical opinionExpert determination
- generating artificial substitute recordsSynthetic data
- Safety and adverse-event analysis
-
Tabulating adverse events by type and severity on the safety population, compared as risk differences or exposure-adjusted rates, deliberately not corrected for multiplicity. in the pathway →
On the pathway · 03 · Estimate · Safety and adverse-event analysisWhich safety analysis element?
- the overall ideaSafety and adverse-event analysis
- subjects who received any treatmentSafety population
- Safety population
-
Everyone who received any treatment, the set on which adverse events are counted, rather than the randomized set. in the pathway →
On the pathway · 03 · Estimate · Safety and adverse-event analysisWhich safety analysis element?
- the overall ideaSafety and adverse-event analysis
- subjects who received any treatmentSafety population
- Sampling bias
-
A sample that does not represent the target population. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Schoenfeld residuals
-
Scaled residuals used to check the proportional-hazards assumption of a Cox model. in the pathway →
On the pathway · 02 · Model · Checking model assumptionsWhich model assumption to check?
- the overall ideaChecking model assumptions
- non-constant residual varianceHeteroscedasticity
- predictors too collinearVariance inflation factor
- single points driving the fitCook’s distance
- hazard ratio constant over timeProportional hazards
- test that proportionality formallySchoenfeld residuals
- fix variance without refittingRobust standard errors
- Selection bias
-
Bias from who ends up in the analysis, including sampling, volunteer, nonresponse, attrition, Berkson’s, healthy-worker, and survivorship variants. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- Self-controlled case series (SCCS)
-
Models event rates across exposed and unexposed time within each affected person, removing confounding by characteristics that stay fixed within a person. in the pathway →
On the pathway · 00 · Framing · Observational study designsWhich observational design fits the question and dominant bias?
- the overall familyObservational study designs
- exposure known, follow forwardCohort study
- snapshot at one timeCross-sectional study
- rare outcome, look backCase-control study
- controls sampled within cohortNested case-control
- random subcohort, multiple outcomesCase-cohort design
- transient trigger, acute eventCase-crossover design
- within-person rate comparisonSelf-controlled case series (SCCS)
- corrects exposure time trendsCase-time-control design
- initiators, active comparatorActive-comparator new-user design
- standing source populationDisease registry
- Sensitivity
-
The proportion of truly diseased patients a test correctly identifies as positive. in the pathway →
On the pathway · 05 · Decision rule · Operating characteristicsWhich diagnostic-performance measure?
- the overall ideaOperating characteristics
- true positives among diseasedSensitivity
- true negatives among healthySpecificity
- disease probability given a resultPredictive values
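The measures in this list can be computed from a two-by-two table of test result against true disease status. The sketch below uses hypothetical function names and counts, and obtains the positive predictive value from Bayes’ rule at a prevalence the caller supplies.

```python
def operating_characteristics(tp, fn, tn, fp):
    """Sensitivity and specificity from counts against the reference standard."""
    sensitivity = tp / (tp + fn)   # true positives among the diseased
    specificity = tn / (tn + fp)   # true negatives among the disease-free
    return sensitivity, specificity


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive result, at a stated prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical study: 100 diseased (90 test positive), 100 disease-free (80 test negative).
sens, spec = operating_characteristics(tp=90, fn=10, tn=80, fp=20)
print(sens, spec, positive_predictive_value(sens, spec, prevalence=0.10))
```

Sensitivity and specificity are 0.9 and 0.8 here, yet at 10% prevalence only about one positive result in three reflects true disease, which is why predictive values must be read against the population being tested.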
- Sensitivity analysis
-
Pre-specified analyses that deliberately vary the assumptions most likely to be challenged and report what happens, more credible than analyses run only after review. in the pathway →
On the pathway · ∗ · Defend it · Sensitivity analysisWhat robustness situation are you in?
- testing how conclusions hold upSensitivity analysis
- SHAP
-
An interpretability tool (SHapley Additive exPlanations) that attributes each individual prediction to its input features, partly restoring insight into flexible predictive models. in the pathway →
On the pathway · 02 · Model · Prediction and machine learningWhich prediction concept is in play?
- the overall areaPrediction and machine learning
- explaining individual predictionsSHAP
- Simon’s two-stage design
-
A small single-arm phase II design that stops early when first-stage responses are too few to continue. in the pathway →
On the pathway · 00 · Framing · Dose-finding and early-phase designsWhich early-phase design question are you facing?
- the overall design familyDose-finding and early-phase designs
- dose escalation by fixed cohort rule3+3 design
- model-based dose escalationContinual reassessment method
- the highest acceptably safe doseMaximum tolerated dose
- phase II screening for efficacySimon’s two-stage design
- Simple random sampling
-
Drawing from one frame with equal selection probability. in the pathway →
On the pathway · 01 · Measurement · Survey sampling designHow do you draw the sample?
- the overall ideaSurvey sampling design
- every unit known nonzero chanceProbability sample
- equal-chance draw from frameSimple random sampling
- sample within population strataStratified sampling
- sample whole groups togetherCluster sampling
- sample in successive nested stagesMultistage sampling
- variance inflation from clusteringDesign effect
- Skewness
-
A summary of a distribution’s asymmetry, part of reading its shape. in the pathway →
On the pathway · 01 · Measurement · Characterizing the distributionWhat shape feature?
- the overall ideaCharacterizing the distribution
- asymmetry of the distributionSkewness
- heaviness of the tailsKurtosis
- smoothing a nonlinear trendLOESS smoother
- SNOMED
-
A clinical coding terminology for findings, diagnoses, and procedures, used mainly in electronic health records rather than claims. in the pathway →
On the pathway · 01 · Measurement · Data standards and provenanceWhich data standard or provenance layer?
- the overall idea of standards and provenanceData standards and provenance
- clinical coding terminology for findingsSNOMED
- regulatory model for collected trial dataCDISC SDTM
- analysis-ready dataset standardADaM
- Societal perspective
-
A costing viewpoint that adds patient time, caregiving, and lost productivity to medical costs, which can flip the verdict for some conditions. in the pathway →
On the pathway · 05 · Decision rule · Perspective and the reference caseWhose costs and benefits count?
- the overall ideaPerspective and the reference case
- standardized analysis conventionsReference case
- reporting checklist for economicsCHEERS
- count all costs to societySocietal perspective
- value of foregone alternativesOpportunity cost
- Sparse data and resampling
-
Methods for small cell counts or rare events, where standard likelihood is unstable and resampling or exact procedures give trustworthy estimates and intervals. in the pathway →
On the pathway · 02 · Model · Sparse data and resamplingAre cells sparse or analytic standard errors doubtful?
- the overall familySparse data and resampling
- separation or small samplesFirth penalized regression
- very sparse, exact inferenceExact logistic regression
- no clean closed-form varianceBootstrap and resampling methods
- public-health impact measuresAttributable risk and population attributable fraction (PAF)
- Spearman correlation
-
A rank-based measure of monotone association between two continuous or ordinal variables. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Specification-curve analysis
-
Re-estimating a result across the many defensible modeling choices to show whether a conclusion holds broadly or only along one path. in the pathway →
On the pathway · ∗ · Defend it · Leave-one-out and specification curvesHow are you probing specification robustness?
- the overall ideaLeave-one-out and specification curves
- results across many model choicesSpecification-curve analysis
- Specificity
-
The proportion of truly disease-free patients a test correctly identifies as negative. in the pathway →
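Written against a 2×2 table of test result by true disease status, the definition above corresponds to the standard formula below. The symbols \(TN\) and \(FP\) are notation introduced here for illustration.
\[\text{Specificity} = \frac{TN}{TN + FP}\]
where \(TN\) is the number of disease-free patients who test negative; \(FP\) is the number of disease-free patients who test positive. As a hypothetical example, 90 true negatives and 10 false positives give a specificity of 0.90.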
On the pathway · 05 · Decision rule · Operating characteristicsWhich diagnostic-performance measure?
- the overall ideaOperating characteristics
- true positives among diseasedSensitivity
- true negatives among healthySpecificity
- disease probability given a resultPredictive values
- Spectrum bias
-
Inflated test accuracy when cases are floridly sick and controls plainly well, so accuracy at a referral center overstates that in primary care. in the pathway →
On the pathway · 05 · Decision rule · Diagnostic-accuracy studiesWhich accuracy measure or pitfall?
- the overall ideaDiagnostic-accuracy studies
- how results shift disease oddsLikelihood ratios
- index test informs reference standardIncorporation bias
- only some get the reference standardVerification bias
- unrepresentative case mixSpectrum bias
- SPIRIT
-
The reporting standard for a trial protocol, the protocol counterpart to the CONSORT checklist for the finished trial. in the pathway →
On the pathway · § · Conduct it · The study protocol (SPIRIT)What situation?
- the overall ideaThe study protocol (SPIRIT)
- reporting standard for protocolsSPIRIT
- Splines
-
Restricted cubic (natural) splines that fit a smooth piecewise curve joined at a few knots, modeling nonlinearity more stably than high-order polynomials. in the pathway →
On the pathway · 02 · Model · Model modifications (splines, interactions)How do you flex the model?
- the overall ideaModel modifications
- effect depends on another variableInteraction term
- fixed exposure term in count modelsOffset
- smooth nonlinear flexible curvesSplines
- additive smooth function componentsGeneralized additive models
- Standard error
-
The spread of a sample mean across repeated samples, shrinking in proportion to one over the square root of sample size, so quadrupling n halves it. in the pathway →
\[\text{SE} = \frac{\sigma}{\sqrt{n}}\]
where \(\text{SE}\) is the standard error of the sample mean; \(\sigma\) is the standard deviation of a single observation; \(n\) is the number of observations averaged.
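The formula can be checked numerically with a minimal sketch. The function name `standard_error` and the values of \(\sigma\) and \(n\) are hypothetical, chosen only to show the effect of quadrupling the sample size.

```python
import math


def standard_error(sigma, n):
    """Standard error of a sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


# Hypothetical values: sigma = 10, n = 100, then n quadrupled to 400.
se_100 = standard_error(10.0, 100)
se_400 = standard_error(10.0, 400)
```

With these inputs the standard error falls from 1.0 to 0.5, which is the halving the definition describes.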
On the pathway · 01 · Measurement · Probability distributions and the CLTWhich distribution or sampling result?
- the overall ideaProbability distributions
- continuous bell-shaped variableNormal distribution
- fixed-trial success countsBinomial distribution
- counts of rare eventsPoisson distribution
- overdispersed count dataNegative binomial distribution
- why sample means turn normalCentral limit theorem
- spread of a sample estimateStandard error
- Standardized mortality ratio
-
The ratio of observed to expected events used in indirect age-standardization. in the pathway →
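A minimal sketch of the calculation, assuming expected events are obtained by applying reference-population rates to the study population's person-time in each age band. The function name, age bands, and rates are hypothetical.

```python
def standardized_mortality_ratio(observed_deaths, strata):
    """Observed deaths divided by deaths expected under reference rates.

    strata: iterable of (person_years_in_study_population, reference_rate)
    pairs, one per age band.
    """
    expected = sum(person_years * rate for person_years, rate in strata)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed_deaths / expected


# Hypothetical age bands: (person-years, reference deaths per person-year).
strata = [(1000, 0.001), (2000, 0.005), (500, 0.020)]
smr = standardized_mortality_ratio(30, strata)
```

Here about 21 deaths are expected, so 30 observed deaths give a ratio of roughly 1.43; a value above 1 means more events were observed than the reference rates predict.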
On the pathway · 01 · Measurement · Measures of disease frequencyWhat frequency are you trying to measure?
- the overall ideaMeasures of disease frequency
- existing cases at a time pointPrevalence
- new cases over follow-upIncidence
- new cases as a proportion at riskCumulative incidence
- new cases per unit follow-up timeIncidence rate
- denominator of summed follow-upPerson-time
- unadjusted rate in a populationCrude rate
- comparing rates across populationsAge-standardization
- observed versus expected deathsStandardized mortality ratio
- Statistical programming and TFLs
-
Delivering analysis as pre-specified tables, figures, and listings, with credibility enforced by independent double-programming reconciled value by value. in the pathway →
What situation?
- the overall ideaStatistical programming and TFLs
- the reported tables and figuresTFLs
- independent reproduction for QCDouble-programming
- Stochastic uncertainty
-
First-order random variation between otherwise identical individuals, the noise a microsimulation has to average out. in the pathway →
On the pathway · 05 · Decision rule · Types of uncertaintyWhich source of uncertainty?
- the overall ideaTypes of uncertainty
- uncertainty in input estimatesParameter uncertainty
- random variation between individualsStochastic uncertainty
- uncertainty in model structureStructural uncertainty
- finding where conclusions flipThreshold analysis
- Stratified analysis
-
Controlling confounding by splitting data on the confounder, estimating within each stratum, and pooling the estimates. in the pathway →
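One way to pool across strata is the Mantel-Haenszel odds ratio, sketched below for a set of 2×2 tables. The function name, the cell labelling, and the counts are hypothetical and serve only to illustrate the pooling step.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio.

    tables: iterable of (a, b, c, d) per stratum, where
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined")
    return numerator / denominator


# Hypothetical strata of a confounder; each stratum's own odds ratio is 4.
tables = [(10, 20, 5, 40), (20, 10, 10, 20)]
pooled_or = mantel_haenszel_or(tables)
```

Since both stratum-specific odds ratios equal 4 in this toy example, the pooled estimate is also 4; pooling is only sensible when the stratum estimates are reasonably homogeneous.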
On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)Which stratified-analysis step are you at?
- the overall approachStratified analysis
- pooling across strataMantel-Haenszel estimator
- combining stratum log effectsWoolf’s method
- testing for effect modificationHomogeneity check
- Stratified randomization
-
Randomization that balances a few strong prognostic factors within strata. in the pathway →
On the pathway · 00 · Framing · Randomization and blindingWhat allocation or masking concern?
- the overall ideaRandomization and blinding
- hide the upcoming assignmentAllocation concealment
- balance arms in small chunksBlock randomization
- balance within prognostic strataStratified randomization
- dynamically balance many factorsMinimization
- mask treatment after allocationBlinding
- Stratified sampling
-
Splitting the frame into strata and sampling within each, allowing precise oversampling of a small subgroup at the cost of unequal selection probabilities. in the pathway →
On the pathway · 01 · Measurement · Survey sampling designHow do you draw the sample?
- the overall ideaSurvey sampling design
- every unit known nonzero chanceProbability sample
- equal-chance draw from frameSimple random sampling
- sample within population strataStratified sampling
- sample whole groups togetherCluster sampling
- sample in successive nested stagesMultistage sampling
- variance inflation from clusteringDesign effect
- Strength of recommendation
-
How firmly a guideline body is willing to speak, signaled by ACC/AHA class and level of evidence or GRADE’s strong-versus-conditional split; it should track the certainty of evidence. in the pathway →
On the pathway · 06 · Recommendation · Strength of recommendationHow strong is the recommendation?
- the overall ideaStrength of recommendation
- STROBE
-
The reporting checklist for observational studies. in the pathway →
On the pathway · 06 · Recommendation · Reporting standardsWhich study type are you reporting?
- the overall ideaReporting standards
- randomized controlled trialCONSORT
- observational studySTROBE
- systematic reviewPRISMA
- prediction model studyTRIPOD
- Structural uncertainty
-
Uncertainty in a model’s own form (which states exist and which functional form is used), often larger than parameter uncertainty yet routinely ignored. in the pathway →
On the pathway · 05 · Decision rule · Types of uncertaintyWhich source of uncertainty?
- the overall ideaTypes of uncertainty
- uncertainty in input estimatesParameter uncertainty
- random variation between individualsStochastic uncertainty
- uncertainty in model structureStructural uncertainty
- finding where conclusions flipThreshold analysis
- Study biases, by rung
-
A family of biases mapped to the rung where each enters, spanning selection, information, confounding, synthesis, and screening biases. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- SUCRA
-
Surface under the cumulative ranking curve, summarizing a treatment’s rank where 100 percent is certainly best and 0 percent certainly worst. in the pathway →
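A minimal sketch of the calculation for one treatment, assuming its rank probabilities (for example from a Bayesian network meta-analysis) are already available. The function name and the probabilities are hypothetical; the result is a proportion, so multiply by 100 for the percentage scale used above.

```python
def sucra(rank_probabilities):
    """SUCRA from one treatment's rank probabilities.

    rank_probabilities[k] is the probability of holding rank k + 1,
    with rank 1 the best. Returns a proportion between 0 and 1.
    """
    a = len(rank_probabilities)
    if a < 2:
        raise ValueError("need at least two treatments")
    if abs(sum(rank_probabilities) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    area = 0.0
    # Sum the cumulative ranking probabilities over the first a - 1 ranks.
    for p in rank_probabilities[:-1]:
        cumulative += p
        area += cumulative
    return area / (a - 1)


certainly_best = sucra([1.0, 0.0, 0.0])
certainly_worst = sucra([0.0, 0.0, 1.0])
coin_flip = sucra([0.5, 0.5])
```

The two extreme inputs reproduce the 100 percent and 0 percent anchors, and a treatment equally likely to be first or second of two scores 50 percent.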
On the pathway · 04 · Synthesis · Network meta-analysisWhich network meta-analysis concern?
- the overall ideaNetwork meta-analysis
- comparability across the networkTransitivity
- direct versus indirect agreementNode-splitting
- rank treatments overallSUCRA
- Supervised and unsupervised learning
-
The split in machine learning by whether the data carry an outcome label. in the pathway →
On the pathway · 02 · Model · Supervised and unsupervised learningAre outcome labels available?
- the overall ideaSupervised and unsupervised learning
- learn from labeled outcomesSupervised learning
- Supervised learning
-
Learning to predict a known target label such as a diagnosis, cost, or survival time. in the pathway →
On the pathway · 02 · Model · Supervised and unsupervised learningAre outcome labels available?
- the overall ideaSupervised and unsupervised learning
- learn from labeled outcomesSupervised learning
- Support vector machine
-
A classifier finding the widest-margin boundary between classes, using a kernel to bend it nonlinearly. in the pathway →
On the pathway · 02 · Model · Learning algorithms and ensemblesWhich learner or ensemble fits?
- the overall ideaLearning algorithms and ensembles
- single interpretable splitsDecision tree (machine learning)
- classify by closest neighborsK-nearest neighbours
- maximum-margin separating boundarySupport vector machine
- average parallel bootstrapped modelsBagging
- many decorrelated bagged treesRandom forest
- sequentially correct prior errorsBoosting
- Surrogate endpoint
-
A lab marker or scan standing in for a clinical outcome, trustworthy only once validated to capture the treatment’s effect on what patients feel. in the pathway →
On the pathway · 00 · Framing · Endpoint logic and pre-registrationWhich endpoint or pre-specification concern?
- the overall ideaEndpoint logic and pre-registration
- the main pre-specified outcomePrimary endpoint
- a stand-in for the outcomeSurrogate endpoint
- criteria validating a surrogatePrentice’s criteria
- lock analysis plan in advancePre-registration
- Survey data
-
A probability sample built for population estimates that generalizes well once its weights and design are respected. in the pathway →
On the pathway · 01 · Measurement · Data sources and their tradeoffsWhich data source or pitfall?
- the overall ideaData sources and their tradeoffs
- clinical detail from care recordsElectronic health record data
- billing records across encountersClaims data
- enrolled cohorts for a conditionRegistries
- sampled population questionnairesSurvey data
- group-level inference pitfallEcological fallacy
- Survey sampling design
-
The probability-sampling scheme by which a sample is drawn so it can generalize to the population. in the pathway →
On the pathway · 01 · Measurement · Survey sampling designHow do you draw the sample?
- the overall ideaSurvey sampling design
- every unit known nonzero chanceProbability sample
- equal-chance draw from frameSimple random sampling
- sample within population strataStratified sampling
- sample whole groups togetherCluster sampling
- sample in successive nested stagesMultistage sampling
- variance inflation from clusteringDesign effect
- Survey skip patterns
-
Branching where a gate question routes a respondent past inapplicable items, so a skipped item is blank by design rather than missing. in the pathway →
On the pathway · 01 · Measurement · Survey instruments: skip patterns and branchingWhich skip-logic element is in play?
- the overall ideaSurvey skip patterns
- item routing later questionsGate question
- Survey weight
-
The factor correcting for unequal selection so a sample represents the population it was drawn from. in the pathway →
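In its simplest form, the weight is the inverse of a respondent's selection probability. The sketch below assumes that base design weight only, ignoring nonresponse and calibration adjustments; the function names and data are hypothetical.

```python
def design_weight(selection_probability):
    """Base design weight: one over the probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample: a small subgroup oversampled at 1 in 2,
# everyone else sampled at 1 in 10. Outcome is a 0/1 indicator.
outcomes = [1, 1, 0, 0]
selection_probabilities = [0.5, 0.5, 0.1, 0.1]
weights = [design_weight(p) for p in selection_probabilities]

unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_mean(outcomes, weights)
```

The unweighted proportion is 0.5 because the oversampled subgroup dominates the sample; weighting brings the estimate down to about 0.17, reflecting that subgroup's smaller share of the population.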
On the pathway · 01 · Measurement · Complex-sample design and survey weightingWhich weighting or design adjustment?
- the overall ideaComplex-sample design and survey weighting
- scale respondents to the populationSurvey weight
- precision lost to the designEffective sample size
- Survival extrapolation for HTA
-
Fitting parametric or flexible models to observed survival and projecting beyond the trial horizon. in the pathway →
On the pathway · 05 · Decision rule · Real-world cost and HTA methodsModeling skewed real-world costs for HTA?
- valuing real-world costsReal-world cost and HTA methods
- costs are zero-inflatedTwo-part and other cost models
- extreme cost outliers existWinsorization and trimming of cost outliers
- summarizing population spendPer-member-per-month costing (PMPM/PPPM)
- trial ends before lifetimeSurvival extrapolation for HTA
- value has multiple dimensionsMulti-criteria decision analysis (MCDA)
- prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
- Survivorship bias
-
Bias from studying only the units that lasted long enough to be observed. in the pathway →
On the pathway · ∗ · Defend it · Study biases, by rungWhich bias is threatening the study?
- the overall map of biasesStudy biases, by rung
- who got sampled or enrolledSampling bias
- distorted entry into the studySelection bias
- selection among hospitalized patientsBerkson’s bias
- conditioning on a common effectOver-adjustment
- employed groups appear healthierHealthy-worker effect
- only survivors are observedSurvivorship bias
- differential loss to follow-upAttrition bias
- nonresponders differ systematicallyNonresponse bias
- a common cause of exposure and outcomeConfounding
- treatment chosen by prognosisConfounding by indication
- unequal outcome ascertainmentDetection bias
- inaccurate recall of exposureRecall bias
- interviewer shapes responsesInterviewer bias
- earlier detection inflates survivalLead-time bias
- slow cases preferentially detectedLength-time bias
- detecting harmless diseaseOverdiagnosis
- selective reporting of resultsPublication bias
- SUTVA
-
The stable-unit-treatment-value assumption that one unit’s treatment does not affect another’s outcome, ruling out interference or spillover. in the pathway →
On the pathway · 02 · Model · Potential outcomes and identifiabilityWhich identifiability condition is at stake?
- the overall frameworkPotential outcomes and identifiability
- the counterfactual setupPotential-outcomes framework
- only one outcome is observedFundamental problem of causal inference
- treated and untreated comparableExchangeability
- every covariate stratum has bothPositivity
- observed equals counterfactual under treatmentConsistency
- no interference, single versionSUTVA
- Synthetic control
-
A causal design that constructs a comparison unit from a weighted combination of untreated donor units, which then serves as the counterfactual for the treated unit. in the pathway →
On the pathway · 02 · Model · Causal designs without randomizationWhich quasi-experimental design fits?
- the overall ideaCausal designs without randomization
- before-after across exposed and controlDifference-in-differences
- a haphazard nudge to exposureInstrumental variables
- assignment by a cutoff thresholdRegression discontinuity
- weighted donors build a counterfactualSynthetic control
- Synthetic data
-
New records drawn from a generative model fit to real data, reproducing the joint distribution without copying individuals, needing privacy and fidelity audits. in the pathway →
\[\frac{dx}{dt} = v(x,t)\]
where \(x\) is the point being transported from the noise distribution toward the data distribution; \(t\) is time along the continuous path, running from 0 to 1; \(v(x,t)\) is the learned velocity field that moves \(x\) at each point and time.
On the pathway · § · Conduct it · Data privacy and securityWhich rule or method?
- the overall ideaData privacy and security
- US health privacy lawHIPAA
- EU data protection lawGDPR
- de-identify by removing identifiersSafe Harbor
- de-identify by statistical opinionExpert determination
- generating artificial substitute recordsSynthetic data
T
- T-test
-
A test comparing a continuous outcome between two groups, equivalent to a linear regression on a binary indicator. in the pathway →
On the pathway · 02 · Model · Bivariate tests (t-test, chi-square)Which two-variable association are you testing?
- the overall family of testsBivariate tests
- mean across two groupsT-test
- means across three or more groupsANOVA
- ranks across two groupsMann-Whitney test
- ranks across three or more groupsKruskal-Wallis test
- two categorical variables, large countsChi-square test
- two categorical variables, small countsFisher’s exact test
- linear correlation of two continuousPearson correlation
- monotonic correlation, rankedSpearman correlation
- concordance-based rank correlationKendall’s tau
- effect size for two meansCohen’s d
- effect size for ANOVAEta-squared
- effect size for categorical associationCramer’s V
- Target-trial emulation
-
Imagining the randomized trial you would have run, writing its protocol, then building the observational analysis to match it. in the pathway →
On the pathway · 00 · Framing · Target-trial emulationWhich target-trial element?
- the overall ideaTarget-trial emulation
- misaligned follow-up start creating biasImmortal time
- Tau-squared
-
The between-study variance, added to each study’s within-study variance when computing its weight in a random-effects meta-analysis, often estimated by DerSimonian-Laird. in the pathway →
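A minimal sketch of the DerSimonian-Laird moment estimator, showing where the between-study variance enters: it is added to each study's within-study variance before inverting to get the weight. Function names and the two-study toy data are hypothetical.

```python
def dersimonian_laird_tau2(effects, variances):
    """DerSimonian-Laird moment estimate of between-study variance."""
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need matching effects and variances for 2+ studies")
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sum_w = sum(w)
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Truncate at zero so the variance estimate is never negative.
    return max(0.0, (q - (k - 1)) / c)


def random_effects_weights(variances, tau2):
    """Random-effects weights: 1 / (within-study variance + tau^2)."""
    return [1.0 / (v + tau2) for v in variances]


# Hypothetical log effect estimates and within-study variances.
effects = [0.0, 4.0]
variances = [1.0, 1.0]
tau2 = dersimonian_laird_tau2(effects, variances)
re_weights = random_effects_weights(variances, tau2)
```

In this toy case Q is 8 on 1 degree of freedom, giving an estimate of 7; each study's weight drops from 1 to 1/8 once that variance is added.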
On the pathway · 04 · Synthesis · Meta-analysis and poolingWhich pooling or heterogeneity tool?
- the overall ideaMeta-analysis and pooling
- one true effect assumedFixed-effect meta-analysis
- effects vary across studiesRandom-effects meta-analysis
- how much effects varyHeterogeneity
- testing for heterogeneityCochran’s Q
- proportion of variance from heterogeneityI-squared
- between-study variance estimateTau-squared
- range for a new studyPrediction interval
- explaining heterogeneity by covariatesMeta-regression
- visualizing small-study effectsFunnel plot
- testing funnel asymmetryEgger’s test
- TFLs
-
Tables, figures, and listings: the programmed outputs of an analysis, whose shells are pre-specified in the statistical analysis plan. in the pathway →
What situation?
- the overall ideaStatistical programming and TFLs
- the reported tables and figuresTFLs
- independent reproduction for QCDouble-programming
- The evidence-recommendation gap
-
The distance between how firmly a guideline is worded and the actual support beneath it, whether an extrapolated threshold, a single trial, or mere expert consensus. in the pathway →
On the pathway · 06 · Recommendation · The evidence–recommendation gapWhat recommendation situation are you in?
- moving from evidence to adviceThe evidence-recommendation gap
- The statistical analysis plan
-
The document pre-committing, before unblinding, exactly how the primary question will be answered, which is what makes a confirmatory analysis confirmatory. in the pathway →
On the pathway · § · Conduct it · The statistical analysis planWhat situation?
- the overall ideaThe statistical analysis plan
- The study protocol (SPIRIT)
-
The master plan every other document hangs from, covering objectives, eligibility, intervention, outcomes, sample size, analysis, ethics, and dissemination. in the pathway →
On the pathway · § · Conduct it · The study protocol (SPIRIT)What situation?
- the overall ideaThe study protocol (SPIRIT)
- reporting standard for protocolsSPIRIT
- Threshold analysis
-
An analysis finding the input value at which a decision flips. in the pathway →
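As a sketch of the idea, the code below searches by bisection for the input value at which a net-benefit function changes sign. The function names, the linear net-benefit model, and every number in it are hypothetical, and the search assumes the decision flips exactly once over the range examined.

```python
def find_threshold(net_benefit, low, high, tolerance=1e-9):
    """Bisection search for the input value where net_benefit crosses zero."""
    f_low = net_benefit(low)
    f_high = net_benefit(high)
    if f_low * f_high > 0:
        raise ValueError("decision does not flip between low and high")
    while high - low > tolerance:
        mid = (low + high) / 2
        f_mid = net_benefit(mid)
        if f_low * f_mid <= 0:
            high = mid  # sign change lies in the lower half
        else:
            low, f_low = mid, f_mid  # sign change lies in the upper half
    return (low + high) / 2


def incremental_net_benefit(price):
    """Hypothetical model: value of health gain minus incremental cost."""
    willingness_to_pay = 50_000  # per QALY, assumed
    qaly_gain = 0.2              # assumed
    other_costs = 2_000          # assumed
    return willingness_to_pay * qaly_gain - other_costs - price


threshold_price = find_threshold(incremental_net_benefit, 0.0, 20_000.0)
```

Under these assumed inputs the decision flips at a price of 8,000: below it the new option has positive net benefit, above it negative.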
On the pathway · 05 · Decision rule · Types of uncertaintyWhich source of uncertainty?
- the overall ideaTypes of uncertainty
- uncertainty in input estimatesParameter uncertainty
- random variation between individualsStochastic uncertainty
- uncertainty in model structureStructural uncertainty
- finding where conclusions flipThreshold analysis
- Thresholds and cut points
-
Turning a continuous risk or measurement into a yes/no action, a convenient but lossy choice that trades sensitivity against specificity and encodes a value judgment. in the pathway →
On the pathway · 05 · Decision rule · Thresholds and cut pointsWhere to set the decision cutoff?
- the overall ideaThresholds and cut points
- Time-varying confounding
-
When a confounder is itself affected by past treatment, breaking ordinary adjustment and requiring g-methods. in the pathway →
On the pathway · 02 · Model · Time-varying confounding and g-methodsHow do you handle time-varying confounding?
- the overall ideaTime-varying confounding
- confounder both affects and respondsTreatment-confounder feedback
- weight to remove time-varying confoundingMarginal structural model
- model effect directly through timeG-estimation
- TMLE
-
Targeted maximum likelihood estimation, a doubly-robust estimator combining a propensity and an outcome model. in the pathway →
On the pathway · 02 · Model · Causal estimators (propensity scores, g-methods)How do you estimate the causal effect?
- the overall ideaCausal estimators
- model treatment assignment probabilityPropensity score
- reweight by inverse treatment probabilityIPTW
- model and average outcomesG-formula
- combine outcome and treatment modelsDoubly-robust estimators
- targeted machine-learning estimationTMLE
- Traceability
-
The rule that every analysis value trace back through ADaM to its SDTM source and original case-report form. in the pathway →
On the pathway · 01 · Measurement · Assembling a clinical trial datasetWhat are you building or tracing?
- the overall ideaAssembling a clinical trial dataset
- one row per subjectADSL
- link result back to sourceTraceability
- Transitivity
-
The assumption that trials are similar enough in populations and methods that an indirect comparison through a common comparator is valid. in the pathway →
On the pathway · 04 · Synthesis · Network meta-analysisWhich network meta-analysis concern?
- the overall ideaNetwork meta-analysis
- comparability across the networkTransitivity
- direct versus indirect agreementNode-splitting
- rank treatments overallSUCRA
- Treatment-confounder feedback
-
When a confounder both responds to past treatment and guides the next, common in chronic-disease cohorts. in the pathway →
On the pathway · 02 · Model · Time-varying confounding and g-methodsHow do you handle time-varying confounding?
- the overall ideaTime-varying confounding
- confounder both affects and respondsTreatment-confounder feedback
- weight to remove time-varying confoundingMarginal structural model
- model effect directly through timeG-estimation
- Treatment-policy strategy
-
An intercurrent-event strategy that counts the outcome regardless of the event, in the intention-to-treat spirit. in the pathway →
On the pathway · 02 · Model · Trial estimands and intercurrent eventsHow do you handle intercurrent events?
- the overall ideaTrial estimands and intercurrent events
- events disrupting outcome interpretationIntercurrent events
- ignore them, use assigned treatmentTreatment-policy strategy
- imagine they did not occurHypothetical strategy
- fold event into the outcomeComposite strategy
- restrict to a defined subpopulationPrincipal-stratum strategy
- Trial estimands and intercurrent events
-
A trial’s precise question, stated under the ICH E9(R1) framework, with a named strategy for events that occur after randomization. in the pathway →
On the pathway · 02 · Model · Trial estimands and intercurrent eventsHow do you handle intercurrent events?
- the overall ideaTrial estimands and intercurrent events
- events disrupting outcome interpretationIntercurrent events
- ignore them, use assigned treatmentTreatment-policy strategy
- imagine they did not occurHypothetical strategy
- fold event into the outcomeComposite strategy
- restrict to a defined subpopulationPrincipal-stratum strategy
- TRIPOD
-
The reporting checklist for prediction models. in the pathway →
On the pathway · 06 · Recommendation · Reporting standardsWhich study type are you reporting?
- the overall ideaReporting standards
- randomized controlled trialCONSORT
- observational studySTROBE
- systematic reviewPRISMA
- prediction model studyTRIPOD
- Two-part and other cost models
-
Models separating whether cost occurred from how much, plus robust GLMs for skewed cost data. in the pathway →
On the pathway · 05 · Decision rule · Real-world cost and HTA methodsModeling skewed real-world costs for HTA?
- valuing real-world costsReal-world cost and HTA methods
- costs are zero-inflatedTwo-part and other cost models
- extreme cost outliers existWinsorization and trimming of cost outliers
- summarizing population spendPer-member-per-month costing (PMPM/PPPM)
- trial ends before lifetimeSurvival extrapolation for HTA
- value has multiple dimensionsMulti-criteria decision analysis (MCDA)
- prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
- Type-I error
-
The false-positive rate, which unplanned peeking at accumulating data inflates. in the pathway →
On the pathway · 00 · Framing · Interim analyses and group-sequential designWhich interim-monitoring element?
- the overall ideaInterim analyses and group-sequential design
- committee reviewing accruing dataData safety monitoring board
- planned looks with stopping rulesGroup-sequential design
- false-positive risk to spendType-I error
- stringent early-look boundaryO’Brien-Fleming boundary
- constant nominal-level boundaryPocock boundary
- Types of uncertainty
-
Naming the kinds of uncertainty (parameter, stochastic, heterogeneity, structural) because each needs different tools to handle honestly. in the pathway →
On the pathway · 05 · Decision rule · Types of uncertaintyWhich source of uncertainty?
- the overall ideaTypes of uncertainty
- uncertainty in input estimatesParameter uncertainty
- random variation between individualsStochastic uncertainty
- uncertainty in model structureStructural uncertainty
- finding where conclusions flipThreshold analysis
U
- Uncertainty and inference
-
Reporting the range compatible with the data via confidence intervals, accounting for clustering, since statistical significance is not clinical importance. in the pathway →
On the pathway · 03 · Estimate · Uncertainty and inferenceHow to express estimate uncertainty?
- the overall ideaUncertainty and inference
- a plausible range for the estimateConfidence interval
- Uncertainty in cost-effectiveness (PSA)
-
Methods showing how fragile an ICER is, from one-way and tornado analyses to probabilistic sensitivity analysis propagating parameter uncertainty through Monte Carlo simulation. in the pathway →
On the pathway · 05 · Decision rule · Uncertainty in cost-effectiveness (PSA)How are you handling cost-effectiveness uncertainty?
- the overall ideaUncertainty in cost-effectiveness (PSA)
- propagating parameter uncertaintyProbabilistic sensitivity analysis
- plotting cost and effect differencesCost-effectiveness plane
- probability of being cost-effectiveCost-effectiveness acceptability curve
- Unsupervised learning
-
Finding structure in data with no outcome label, through clustering or dimensionality reduction. in the pathway →
On the pathway · 02 · Model · Unsupervised learningWhat unlabeled-data structure are you finding?
- the overall familyUnsupervised learning
- grouping similar observationsClustering
- nested grouping by linkageHierarchical clustering
- partitioning into k groupsK-means
- reducing the number of featuresDimensionality reduction
- orthogonal variance componentsPrincipal component analysis
V
- Validity
-
Whether an instrument measures what it claims, through content, construct, and criterion validity. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
- Value of information (EVPI)
-
Pricing the decision uncertainty that remains, where the expected value of perfect information is the expected loss from deciding under current uncertainty. in the pathway →
On the pathway · 05 · Decision rule · Value of information (EVPI)What is more evidence worth?
- the overall ideaValue of information (EVPI)
- value of removing all uncertaintyEVPI
- value of a specific future studyExpected value of sample information
- Variance inflation factor
-
A diagnostic for multicollinearity among predictors. in the pathway →
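For a given predictor, the diagnostic is usually computed from an auxiliary regression of that predictor on all the others. The index \(j\) and the symbol \(R_j^2\) are notation introduced here for illustration.
\[\text{VIF}_j = \frac{1}{1 - R_j^2}\]
where \(\text{VIF}_j\) is the variance inflation factor for predictor \(j\); \(R_j^2\) is the coefficient of determination from regressing predictor \(j\) on the remaining predictors. As a hypothetical example, \(R_j^2 = 0.9\) gives a factor of 10, meaning the variance of that coefficient is ten times what it would be with uncorrelated predictors.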
On the pathway · 02 · Model · Checking model assumptionsWhich model assumption to check?
- the overall ideaChecking model assumptions
- non-constant residual varianceHeteroscedasticity
- predictors too collinearVariance inflation factor
- single points driving the fitCook’s distance
- hazard ratio constant over timeProportional hazards
- test that proportionality formallySchoenfeld residuals
- fix variance without refittingRobust standard errors
- Verification
-
Checking whether a model is coded correctly, that the implementation does the math intended. in the pathway →
On the pathway · 05 · Decision rule · Model validation and calibrationWhat situation?
- the overall ideaModel validation and calibration
- tuning model outputs to realityCalibration (modeling)
- confirming the model runs correctlyVerification
- Verification bias
-
Bias arising when only test-positive patients go on to receive the reference standard. in the pathway →
On the pathway · 05 · Decision rule · Diagnostic-accuracy studiesWhich accuracy measure or pitfall?
- the overall ideaDiagnostic-accuracy studies
- how results shift disease oddsLikelihood ratios
- index test informs reference standardIncorporation bias
- only some get the reference standardVerification bias
- unrepresentative case mixSpectrum bias
W
- Weakly-informative prior
-
A prior that gently regularizes without committing to much. in the pathway →
On the pathway · 02 · Model · Choosing a priorWhat kind of prior do you need?
- the overall choiceChoosing a prior
- prior matched to the likelihoodConjugate prior
- strong external informationInformative prior
- light regularizing informationWeakly-informative prior
- Weighted kappa
-
A kappa that credits near-misses on an ordinal scale. in the pathway →
On the pathway · 01 · Measurement · Reliability and validityWhich measurement property are you assessing?
- the overall ideaReliability and validity
- consistency of measurementReliability
- measuring the intended constructValidity
- internal consistency of scale itemsCronbach’s alpha
- agreement on continuous measuresIntraclass correlation
- plot method agreement and biasBland-Altman plot
- categorical agreement, two ratersCohen’s kappa
- ordered-category agreement, two ratersWeighted kappa
- categorical agreement, many ratersFleiss’ kappa
- Willingness-to-pay threshold
-
The benchmark amount a payer will pay per unit of benefit, against which an incremental cost-effectiveness ratio is judged. in the pathway →
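One common way to write the comparison is shown below. The symbols \(\Delta C\), \(\Delta E\), and \(\lambda\) are notation assumed here for illustration and are not taken from the pathway.
\[\frac{\Delta C}{\Delta E} < \lambda\]
where \(\Delta C\) is the incremental cost of the new option; \(\Delta E\) is its incremental health benefit, assumed positive; \(\lambda\) is the willingness-to-pay threshold. Under that assumption, the new option is judged cost-effective when the inequality holds.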
On the pathway · 05 · Decision rule · Cost-effectiveness and the ICERWhich economic-evaluation framing fits?
- the overall ideaCost-effectiveness and the ICER
- extra cost per extra effectICER
- value costs and benefits in moneyCost-benefit analysis
- effects identical, compare costs onlyCost-minimization analysis
- effects in quality-adjusted life yearsCost-utility analysis
- value at a willingness thresholdNet monetary benefit
- maximum payable per unit benefitWillingness-to-pay threshold
- Winsorization and trimming of cost outliers
-
Capping or dropping extreme cost values so a few catastrophic claims do not dominate the mean. in the pathway →
On the pathway · 05 · Decision rule · Real-world cost and HTA methodsModeling skewed real-world costs for HTA?
- valuing real-world costsReal-world cost and HTA methods
- costs are zero-inflatedTwo-part and other cost models
- extreme cost outliers existWinsorization and trimming of cost outliers
- summarizing population spendPer-member-per-month costing (PMPM/PPPM)
- trial ends before lifetimeSurvival extrapolation for HTA
- value has multiple dimensionsMulti-criteria decision analysis (MCDA)
- prioritizing which uncertainty mattersExpected value of partial perfect information (EVPPI)
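The difference between winsorizing and trimming can be sketched on a toy cost vector. The helper names, the nearest-rank percentile rule, the 90th-percentile cutoff, and the cost values are assumptions for illustration, not a recommended threshold.

```python
import math


def upper_cutoff(values, pct):
    """Nearest-rank percentile of `values` (pct in (0, 100])."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def winsorize_upper(values, pct):
    """Cap values above the percentile cutoff at the cutoff (keeps every record)."""
    cap = upper_cutoff(values, pct)
    return [min(v, cap) for v in values]


def trim_upper(values, pct):
    """Drop values above the percentile cutoff (loses records)."""
    cap = upper_cutoff(values, pct)
    return [v for v in values if v <= cap]


def mean(values):
    return sum(values) / len(values)


# Made-up annual costs with one catastrophic claim.
costs = [100, 200, 300, 400, 500, 600, 700, 800, 900, 250_000]

raw_mean = mean(costs)
winsorized_mean = mean(winsorize_upper(costs, 90))
trimmed_mean = mean(trim_upper(costs, 90))
print(raw_mean, winsorized_mean, trimmed_mean)
```

Both adjustments pull the mean far below the raw value here, which illustrates the trade-off: the mean is stabilized, but it no longer reflects total spend.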
- Woolf’s method
-
An inverse-variance-weighted method for pooling stratum-specific log odds ratios into a single summary estimate. in the pathway →
On the pathway · 02 · Model · Stratified analysis (Mantel-Haenszel)Which stratified-analysis step are you at?
- the overall approachStratified analysis
- pooling across strataMantel-Haenszel estimator
- combining stratum log effectsWoolf’s method
- testing for effect modificationHomogeneity check
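A minimal sketch of Woolf's pooling for odds ratios, assuming each stratum is a 2x2 table with no zero cells. The function name, the table layout, and the counts are hypothetical.

```python
import math


def woolf_pooled_or(strata):
    """Pool stratum odds ratios by inverse-variance weighting of log ORs.

    Each stratum is (a, b, c, d): exposed cases, exposed non-cases,
    unexposed cases, unexposed non-cases. Assumes every cell is nonzero.
    Returns (pooled odds ratio, standard error of the pooled log OR).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for a, b, c, d in strata:
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d  # variance of the log OR
        weight = 1.0 / variance
        weighted_sum += weight * log_or
        total_weight += weight
    pooled_log_or = weighted_sum / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return math.exp(pooled_log_or), standard_error


# Made-up strata that share an odds ratio of 2.
strata = [(20, 10, 10, 10), (40, 20, 30, 30)]
pooled_or, se_log_or = woolf_pooled_or(strata)

# Approximate 95% confidence interval on the odds-ratio scale.
lower = math.exp(math.log(pooled_or) - 1.96 * se_log_or)
upper = math.exp(math.log(pooled_or) + 1.96 * se_log_or)
print(pooled_or, lower, upper)
```

Because the weights depend on reciprocals of cell counts, sparse strata make this estimator unstable, which is where the Mantel-Haenszel estimator is usually preferred.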
Z
- Zero-inflated model
-
A count model mixing a structural-zero process with a count process when zeros pile up. in the pathway →
On the pathway · 02 · Model · Regression familiesWhich regression for your outcome?
- the overall ideaRegression families
- unifying exponential-family frameworkGLM
- continuous outcomeLinear regression
- binary outcomeLogistic regression
- count outcomePoisson regression
- overdispersed countsNegative binomial regression
- excess zeros in countsZero-inflated model
- separate zero and positive partsHurdle model
- correlated or clustered outcomesGEE
- repeated measures over timeMMRM
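For the zero-inflated model entry, the mixture of a structural-zero process and a count process can be sketched as a zero-inflated Poisson probability mass function. The function name and parameter values are assumptions for illustration; a fitted model would estimate both parameters, often with covariates.

```python
import math


def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson probability of observing count k.

    pi_zero: probability of a structural zero.
    lam: mean of the Poisson count process.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from either the structural-zero or the count process.
        return pi_zero + (1 - pi_zero) * poisson
    return (1 - pi_zero) * poisson


# Illustrative parameters only.
pi_zero, lam = 0.3, 2.0
p_zero_zip = zip_pmf(0, pi_zero, lam)
p_zero_poisson = math.exp(-lam)
total = sum(zip_pmf(k, pi_zero, lam) for k in range(60))
print(p_zero_zip, p_zero_poisson, total)
```

The zero probability exceeds what a plain Poisson with the same mean parameter would give, which is the "zeros pile up" pattern in the definition.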