Glossary

From Data to Bedside · every term in the pathway, defined and linked

This glossary indexes every concept in the pathway. Each entry gives a one-line definition and links to the pathway node where the term is taught in full; on the pathway, the term links back here. Terms are listed alphabetically.

1-inpatient / 2-outpatient rule
Counting a case from one inpatient diagnosis or two outpatient diagnoses on separate dates to filter out rule-out codes. in the pathway →
Is your outcome a validated algorithm or an unchecked code rule?
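As a minimal sketch of the 1-inpatient / 2-outpatient rule, the function below flags a patient as a case from a list of claims for one condition. The tuple layout and the setting labels `"inpatient"` and `"outpatient"` are hypothetical, chosen only for illustration.

```python
from datetime import date

def meets_case_rule(claims):
    """claims: list of (setting, service_date) tuples for one patient and one condition.

    The labels 'inpatient' and 'outpatient' are hypothetical, not a real claims schema.
    """
    has_inpatient = any(setting == "inpatient" for setting, _ in claims)
    # A set collapses repeated codes billed on the same date into one date.
    outpatient_dates = {day for setting, day in claims if setting == "outpatient"}
    return has_inpatient or len(outpatient_dates) >= 2

same_day = [("outpatient", date(2020, 3, 1)), ("outpatient", date(2020, 3, 1))]
two_dates = [("outpatient", date(2020, 3, 1)), ("outpatient", date(2020, 6, 9))]
admitted = [("inpatient", date(2020, 4, 2))]
```

Two outpatient codes on the same date do not qualify here, which is what filters a single rule-out visit. Published algorithms often add a minimum gap or a time window between the two outpatient dates; that refinement is not modeled in this sketch.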
3+3 design
A classic phase I dose-escalation design that treats patients in cohorts of three and escalates until dose-limiting toxicity appears. in the pathway →
Which early-phase design question are you facing?

A

Absolute risk reduction
The absolute difference in risk between groups; its reciprocal is the number needed to treat. in the pathway →
Which scale frames the effect?
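The reciprocal relationship between absolute risk reduction and number needed to treat can be sketched as follows. The risks of 10% and 6% are made-up inputs, not figures from the pathway.

```python
def absolute_risk_reduction(risk_control, risk_treated):
    """Absolute difference in risk between the control and treated groups."""
    return risk_control - risk_treated

def number_needed_to_treat(risk_control, risk_treated):
    """Reciprocal of the absolute risk reduction."""
    arr = absolute_risk_reduction(risk_control, risk_treated)
    if arr == 0:
        raise ValueError("NNT is undefined when the two risks are equal")
    return 1 / arr

arr = absolute_risk_reduction(0.10, 0.06)  # about 0.04
nnt = number_needed_to_treat(0.10, 0.06)   # about 25
```

A negative result means the treated group fared worse, in which case the quantity is usually reported as a number needed to harm.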
Accelerated failure time (AFT) models
Parametric models on log survival time that yield a time ratio, offering an alternative when proportional hazards fails. in the pathway →
Does a competing event block the outcome?
Active-comparator new-user design
Restricts to initiators of a treatment versus an active alternative, curbing confounding by indication and prevalent-user and immortal-time distortions. in the pathway →
Which observational design fits the question and dominant bias?
ADaM
A CDISC standard for analysis-ready clinical trial datasets derived from SDTM. in the pathway →
Which data standard or provenance layer?
ADSL
A subject-level ADaM dataset with one row per trial participant. in the pathway →
What are you building or tracing?
Age-standardization
Adjusting rates to a standard population so comparisons are not confounded by differing age structures, done directly or indirectly. in the pathway →
What frequency are you trying to measure?
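Direct standardization, one of the two approaches named above, can be sketched as a weighted average of stratum-specific rates, with the standard population supplying the weights. The age bands, rates, and standard population counts are invented for illustration.

```python
def directly_standardized_rate(stratum_rates, standard_population):
    """Weight each age-specific rate by the standard population's share of that age band."""
    total = sum(standard_population.values())
    weighted = sum(stratum_rates[band] * standard_population[band]
                   for band in standard_population)
    return weighted / total

# Hypothetical standard population and age-specific rates (events per person-year).
standard = {"0-39": 600, "40-64": 300, "65+": 100}
rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.020}

adjusted = directly_standardized_rate(rates, standard)  # about 0.0041
```

Applying the same standard population to two regions gives rates that differ only because their age-specific rates differ, not because one region is older.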
AIC
An information criterion trading goodness of fit against the number of parameters to compare non-nested models, where lower is better. in the pathway →
Which fit or error measure do you need?
Algorithm validation (PPV and sensitivity tradeoff)
Tightening a rule raises positive predictive value but lowers sensitivity, and vice versa. in the pathway →
Is your outcome a validated algorithm or an unchecked code rule?
Allocation concealment
A safeguard ensuring the next treatment assignment cannot be foreseen and gamed. in the pathway →
What allocation or masking concern?
Analysis populations
Who counts in the analysis: intention-to-treat, per-protocol, or as-treated, a choice that is itself part of the estimand. in the pathway →
Which set of subjects do you analyze?
ANOVA
Analysis of variance, extending the t-test to compare a continuous outcome across more than two groups. in the pathway →
Which two-variable association are you testing?
As-treated
Analyzing patients by the treatment they actually received. in the pathway →
Which set of subjects do you analyze?
Assay sensitivity
The assumption that a trial could have detected a real difference had one existed, since a sloppy trial looks non-inferior. in the pathway →
What are you trying to show?
Assembling a clinical trial dataset
Standardizing trial data from case-report forms through CDISC SDTM into ADaM analysis datasets, governed by traceability. in the pathway →
What are you building or tracing?
Assembling the analytic cohort
Turning a research database into one analysis-ready table via extract-transform-load, fixing its grain and time structure. in the pathway →
Which cohort-construction step?
ATC and defined daily dose (DDD)
WHO classification grouping drugs by therapeutic class, paired with a standard daily dose unit for comparable utilization. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
ATE
The average treatment effect, the contrast of potential outcomes over everyone. in the pathway → \[\text{ATE} = E[Y(1) - Y(0)]\] where \(\text{ATE}\) is the average treatment effect over the whole population; \(Y(1)\) is the outcome a unit would have under treatment; \(Y(0)\) is the outcome the same unit would have under no treatment.
Whose causal effect do you target?
ATT
The average treatment effect on the treated, the contrast of potential outcomes among treated units. in the pathway →
Whose causal effect do you target?
Attributable risk and population attributable fraction (PAF)
Attributable risk is the excess risk among the exposed; PAF is the share of population cases removable by eliminating the exposure. in the pathway →
Are cells sparse or analytic standard errors doubtful?
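One common way to compute the population attributable fraction is Levin's formula, shown here as a sketch; it assumes the risk ratio is unconfounded, and the symbols are introduced for illustration only. \[\text{PAF} = \frac{p_e(\text{RR} - 1)}{1 + p_e(\text{RR} - 1)}\] where \(\text{PAF}\) is the population attributable fraction; \(p_e\) is the proportion of the population that is exposed; \(\text{RR}\) is the risk ratio comparing exposed with unexposed. With \(p_e = 0.3\) and \(\text{RR} = 2\), \(\text{PAF} = 0.3 / 1.3 \approx 0.23\), so about 23% of cases would be removed by eliminating the exposure.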
Attrition bias
Bias from differential loss to follow-up over time between groups. in the pathway →
Which bias is threatening the study?
AUC
The probability that a randomly chosen case gets a higher predicted risk than a randomly chosen non-case, where 0.5 is chance and 1 is perfect ranking. in the pathway →
Which aspect of predictive performance?
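The pairwise definition of the AUC can be computed directly by comparing every case with every non-case. This is a sketch on made-up predicted risks; counting ties as half a win is a common convention that is assumed here, not stated in the entry.

```python
def auc(case_scores, noncase_scores):
    """Share of case/non-case pairs in which the case has the higher predicted risk."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5  # assumed convention: a tie counts as half a win
    return wins / (len(case_scores) * len(noncase_scores))

cases = [0.9, 0.7, 0.4]          # hypothetical predicted risks for cases
noncases = [0.8, 0.3, 0.2, 0.4]  # hypothetical predicted risks for non-cases
score = auc(cases, noncases)     # 9.5 winning pairs out of 12
```

The double loop is quadratic in sample size, so it suits illustration more than large datasets, where a rank-based calculation is the usual choice.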

B

Back-door criterion
The rule that reads the adjustment set straight off a causal diagram. in the pathway →
Which causal-diagram concept?
Bagging
Training many trees on bootstrap resamples and averaging them to lower variance, the basis of the random forest. in the pathway →
Which learner or ensemble fits?
Bayes’ theorem
The rule that the posterior is proportional to the likelihood times the prior. in the pathway → \[\text{posterior} \propto \text{likelihood} \times \text{prior}\] where \(\text{posterior}\) is the updated distribution of the parameter after seeing the data; \(\text{likelihood}\) is what the data say about the parameter; \(\text{prior}\) is what you believed about the parameter before seeing the data.
Which Bayesian concept is in play?
Bayesian computation
Exploring posteriors with no closed form by simulation, principally Markov chain Monte Carlo. in the pathway →
Which sampling or diagnostic tool?
Bayesian inference
Treating the parameter as a random quantity with a distribution the data update, yielding a posterior summarized by a credible interval. in the pathway →
Which Bayesian concept is in play?
Belmont principles
The three principles underpinning research ethics: respect for persons, beneficence, and justice. in the pathway →
Which ethics concept or body?
Benjamini-Hochberg
A procedure controlling the false-discovery rate among rejected hypotheses. in the pathway →
How do you control multiple testing?
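The step-up logic of the Benjamini-Hochberg procedure can be sketched in a few lines. The p-values are invented, and the Bonferroni helper is included only for contrast.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a reject/keep flag per hypothesis, controlling the false-discovery rate at q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * q.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected

def bonferroni(p_values, alpha=0.05):
    """Reject only p-values at or below alpha divided by the number of tests."""
    return [p <= alpha / len(p_values) for p in p_values]

p_values = [0.001, 0.012, 0.021, 0.039, 0.30]  # hypothetical test results
bh_flags = benjamini_hochberg(p_values)
bonf_flags = bonferroni(p_values)
```

On these values Benjamini-Hochberg rejects the four smallest p-values while Bonferroni rejects only the first, which shows how much less conservative false-discovery control can be.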
Berkson’s bias
The spurious association produced by conditioning on hospital admission. in the pathway →
Which bias is threatening the study?
Bias quantification
Putting a number on how much unmeasured confounding it would take to overturn a result, as one pre-specified sensitivity analysis. in the pathway →
How do you quantify unmeasured bias?
Bias-variance and regularization
The tradeoff between a model too simple to capture signal and one flexible enough to chase noise, managed by regularization. in the pathway →
Which concept or penalty?
Bias-variance tradeoff
The balance where a too-simple model underfits and a too-flexible model overfits, with expected prediction error equal to squared bias plus variance plus irreducible noise. in the pathway → \[\text{expected prediction error} = \text{bias}^2 + \text{variance} + \text{irreducible noise}\] where \(\text{expected prediction error}\) is the average error on new data; \(\text{bias}^2\) is the squared error from a model too simple to capture the signal; \(\text{variance}\) is the error from a model flexible enough to chase noise; \(\text{irreducible noise}\) is the variation no model can remove.
Which concept or penalty?
BIC
An information criterion like AIC but penalizing each extra parameter more heavily, so it favors smaller models, where lower is better. in the pathway →
Which fit or error measure do you need?
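For reference, the usual definitions of the two information criteria make the difference in penalty explicit; the symbols below are introduced for illustration. \[\text{AIC} = 2k - 2\ln\hat{L}, \quad \text{BIC} = k\ln n - 2\ln\hat{L}\] where \(k\) is the number of estimated parameters; \(\hat{L}\) is the maximized likelihood of the model; \(n\) is the number of observations. BIC charges \(\ln n\) per parameter against AIC's 2, so its penalty is the heavier one whenever \(\ln n > 2\), that is, from \(n = 8\) observations upward.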
Binomial distribution
The distribution of the number of successes in a fixed number of independent trials that share one success probability. in the pathway →
Which distribution or sampling result?
Bivariate tests
Classical tests of whether two variables are associated, each a special case of a regression model. in the pathway →
Which two-variable association are you testing?
Bland-Altman plot
A plot of differences against means that reveals systematic disagreement two methods can have despite high correlation. in the pathway →
Which measurement property are you assessing?
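The quantities behind a Bland-Altman plot can be sketched without any plotting: the pairwise means go on the x-axis, the differences on the y-axis, and the bias and limits of agreement are drawn as horizontal lines. The readings are invented, and the limits use the conventional mean difference plus or minus 1.96 standard deviations.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return plot coordinates plus the bias and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = mean(diffs)          # systematic disagreement between the methods
    spread = stdev(diffs)       # sample standard deviation of the differences
    limits = (bias - 1.96 * spread, bias + 1.96 * spread)
    return means, diffs, bias, limits

# Hypothetical paired readings: method A reads 1 to 2 units higher throughout.
method_a = [10, 12, 14, 16, 18, 20]
method_b = [9, 10, 13, 14, 17, 18]
means, diffs, bias, (lower, upper) = bland_altman(method_a, method_b)
```

The two series here are almost perfectly correlated, yet the bias of 1.5 units shows they do not agree, which is the point the plot is designed to expose.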
Blinding
Keeping patients, clinicians, and outcome assessors unaware of the assigned arm to prevent the bias that knowing it introduces. in the pathway →
What allocation or masking concern?
Block randomization
Permuted-block randomization that keeps trial arms close to equal in size as enrollment proceeds. in the pathway →
What allocation or masking concern?
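A minimal sketch of permuted-block randomization follows. The arm labels, block size, and seed are illustrative choices; a real trial would generate and conceal the list through a validated system.

```python
import random

def permuted_block_sequence(n_blocks, block_size=4, arms=("A", "B"), seed=0):
    """Build an allocation list in which every block holds each arm equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the list can be regenerated
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)     # random order within the block
        sequence.extend(block)
    return sequence

allocation = permuted_block_sequence(n_blocks=3, block_size=4, seed=1)
```

Because each block is balanced, the arms can never differ by more than half a block in size at any point in enrollment. Fixed small blocks also make the last assignment in a block predictable, which is why block size is often varied in practice.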
Bonferroni correction
Dividing alpha by the number of tests to control the family-wise error rate. in the pathway →
How do you control multiple testing?
Boosting
Fitting trees in sequence, each correcting the last’s residuals, to lower bias, as in gradient boosting and XGBoost. in the pathway →
Which learner or ensemble fits?
Bootstrap and resampling methods
Repeatedly resampling the observed data with replacement and recomputing the estimate, building an empirical sampling distribution for intervals when analytic standard errors are awkward. in the pathway →
Are cells sparse or analytic standard errors doubtful?
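The resample-and-recompute loop can be sketched as a percentile bootstrap interval. The data, the number of resamples, and the seed are arbitrary illustrations, and the percentile method is only one of several bootstrap intervals.

```python
import random
from statistics import median

def bootstrap_percentile_ci(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile interval from the empirical distribution of a resampled statistic."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([rng.choice(data) for _ in range(n)])  # one resample with replacement
        for _ in range(n_resamples)
    )
    lower = estimates[int(alpha / 2 * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical skewed sample, where a formula-based interval for the median is awkward.
lengths_of_stay = [1, 2, 2, 3, 3, 3, 4, 5, 7, 9, 14, 30]
low, high = bootstrap_percentile_ci(lengths_of_stay, median)
```

The same function works for any statistic passed in, which is the practical appeal when no analytic standard error is at hand.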
Budget impact analysis
Projecting the total cost to a specific budget holder of adopting an intervention across the eligible population over a near-term horizon under realistic uptake. in the pathway →
What affordability question are you in?

C

Calibration
Whether a model’s predicted risks match observed event rates, read off a calibration plot or tested with goodness-of-fit. in the pathway →
Which aspect of predictive performance?
Calibration (modeling)
Tuning an unobservable parameter until a model’s outputs match observed targets, with the resulting uncertainty carried forward. in the pathway →
What situation?
Calibration versus discrimination
Discrimination asks whether a model ranks higher-risk patients above lower-risk ones, while calibration asks whether predicted risks match observed rates. in the pathway →
Which aspect of predictive performance?
Case-cohort design
Samples a random subcohort plus all cases, letting one comparison group serve several outcomes from the same source population. in the pathway →
Which observational design fits the question and dominant bias?
Case-control study
Starts from outcome status, comparing prior exposure in cases versus controls; efficient for rare outcomes and long latencies. in the pathway →
Which observational design fits the question and dominant bias?
Case-crossover design
Self-controlled design comparing a person’s exposure shortly before an event to their own earlier reference periods, suited to transient triggers. in the pathway →
Which observational design fits the question and dominant bias?
Case-time-control design
Adds a control group to the case-crossover design to adjust for exposure trends over calendar time. in the pathway →
Which observational design fits the question and dominant bias?
Causal designs without randomization
A set of designs, each neutralizing a specific dominant threat to causal inference, matched to the threat endangering the question. in the pathway →
Which quasi-experimental design fits?
Causal diagrams
A directed acyclic graph of assumed causal effects that sorts each covariate into a confounder, mediator, or collider. in the pathway →
Which causal-diagram concept?
Causal estimators
Methods that turn a fixed design and adjustment set into a number, including propensity scores and g-methods. in the pathway →
How do you estimate the causal effect?
Cause-specific hazard
Instantaneous event rate among patients still at risk, used to study etiology and biological mechanism. in the pathway →
Does a competing event block the outcome?
CDISC SDTM
A CDISC standard for structuring clinical trial tabulation data. in the pathway →
Which data standard or provenance layer?
Central limit theorem
The result that the mean of a large enough sample is approximately normal whatever the underlying shape, enabling z- and t-based inference. in the pathway →
Which distribution or sampling result?
Certainty of evidence (GRADE)
Rating how much confidence a body of evidence warrants, separately from effect size, downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias. in the pathway →
Which certainty-of-evidence concept is in play?
Characterizing the distribution
Examining what you measured, its shape, spread, and relationships, before assuming a model or summary is honest. in the pathway →
What shape feature?
Charlson comorbidity index (CCI)
A weighted count of selected serious conditions, originally calibrated to predict one-year mortality, used as a single comorbidity summary. in the pathway →
How do I adjust for how sick patients already were?
Checking model assumptions
The diagnostics for the checkable statistical assumptions of a regression, distinct from a causal identifying assumption. in the pathway →
Which model assumption to check?
CHEERS
The reporting checklist for economic evaluations, the economic-evaluation member of the reporting-standards family. in the pathway →
Whose costs and benefits count?
Chi-square test
A test of independence between two categorical variables, with Fisher’s exact test used when cell counts are small. in the pathway →
Which two-variable association are you testing?
Choosing a prior
Selecting the distribution encoding belief before the data, the most attacked part of a Bayesian analysis. in the pathway →
What kind of prior do you need?
Choosing the estimand
Naming the exact quantity to be estimated, which effect and in whom, before choosing the method. in the pathway →
Whose causal effect do you target?
Claims and coding standards
The coded vocabularies behind each claim field, where analysis depends on knowing what each captures and how they map to one another. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
Claims data
Billing-driven encounter and prescription data covering a payer’s population broadly, where a code is a bill not a diagnosis and clinical detail is thin. in the pathway →
Which data source or pitfall?
Claims-based frailty index
A frailty proxy built from diagnosis and service codes, approximating functional decline when direct frailty assessment is unavailable in data. in the pathway →
How do I adjust for how sick patients already were?
Claims/EHR phenotype algorithm
A rule mapping recorded codes and encounters to a presumed clinical event or condition. in the pathway →
Is your outcome a validated algorithm or an unchecked code rule?
Classification performance metrics
Measures read off the confusion matrix of predicted versus actual, including precision, recall, and F1. in the pathway →
Which classification metric?
Clinical equipoise
Genuine uncertainty in the expert community about which trial arm is better, the ethical license to randomize patients. in the pathway →
Which ethics concept or body?
Clone-censor-weight
A per-protocol target-trial method that clones patients into each strategy, censors deviators, and reweights to avoid immortal time bias. in the pathway →
How do causal methods scale to claims and time?
Cluster sampling
Drawing whole groups such as schools or blocks to cut field cost when no list of individuals exists. in the pathway →
How do you draw the sample?
Clustering
Grouping similar observations, used for phenotyping disease subtypes from a panel of measurements. in the pathway →
What unlabeled-data structure are you finding?
Cochran’s Q
A statistical test for heterogeneity across studies in a meta-analysis. in the pathway →
Which pooling or heterogeneity tool?
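Cochran's Q can be sketched as the weighted sum of squared deviations of each study's effect from the inverse-variance pooled effect. The three effects and standard errors below are invented.

```python
def cochrans_q(effects, std_errors):
    """Return Q, the fixed-effect pooled estimate, and the degrees of freedom."""
    weights = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return q, pooled, len(effects) - 1

# Hypothetical study effects (for example log risk ratios) and their standard errors.
q, pooled, df = cochrans_q([0.2, 0.4, 0.6], [0.1, 0.1, 0.1])
```

Under the hypothesis that all studies share one true effect, Q is referred to a chi-square distribution with the returned degrees of freedom, so a Q of about 8 on 2 degrees of freedom points to heterogeneity.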
Code crosswalks and mappings
Lookup tables translating one vocabulary into another, each translation lossy in known ways. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
Cohen’s d
The effect-size measure accompanying a t-test: the difference between two group means divided by their pooled standard deviation. in the pathway →
Which two-variable association are you testing?
Cohen’s kappa
A measure of two raters’ categorical agreement corrected for what chance alone would produce. in the pathway → \[\kappa = \frac{p_o - p_e}{1 - p_e}\] where \(\kappa\) is Cohen’s kappa, the chance-corrected agreement; \(p_o\) is the observed agreement between the two raters; \(p_e\) is the agreement expected if the raters labelled independently.
Which measurement property are you assessing?
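The kappa formula above can be computed from two raters' labels as follows. The ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters labelling the same items."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Expected agreement if each rater labelled independently at their own marginal rates.
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of ten charts: the raters disagree on two of them.
rater_a = ["y"] * 5 + ["n"] * 5
rater_b = ["y"] * 4 + ["n", "y"] + ["n"] * 4
kappa = cohens_kappa(rater_a, rater_b)  # p_o = 0.8, p_e = 0.5, kappa about 0.6
```

Raw agreement is 80%, but half of that would be expected by chance with these marginals, so kappa credits only the remainder.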
Cohort study
Follows defined people forward from exposure to outcome; prospective when assembled before outcomes occur, retrospective when reconstructed from existing records. in the pathway →
Which observational design fits the question and dominant bias?
Collider
A common effect of two variables, where adjusting for it opens a biasing path rather than closing one. in the pathway →
Which causal-diagram concept?
Comorbidity and frailty adjustment
Summarizing a patient’s baseline illness burden from claims into a validated score used to adjust for confounding by underlying health. in the pathway →
How do I adjust for how sick patients already were?
Competing risks
A setting where one event, such as death, prevents the event of interest from ever occurring. in the pathway →
Does a competing event block the outcome?
Competing risks and survival models
Methods for time-to-event data where competing events block the outcome or where parametric forms replace the proportional hazards assumption. in the pathway →
Does a competing event block the outcome?
Complex-sample design and survey weighting
Design-aware analysis using survey weights, strata, and primary sampling units so an oversampled, clustered sample speaks for its population. in the pathway →
Which weighting or design adjustment?
Composite endpoint construction
Combining several outcome phenotypes into one variable, where the weakest component dominates overall measurement error. in the pathway →
Is your outcome a validated algorithm or an unchecked code rule?
Composite strategy
An intercurrent-event strategy that folds the event into the endpoint. in the pathway →
How do you handle intercurrent events?
Conditional independence
The unverifiable assumption underlying propensity-score methods: given the measured covariates, treatment assignment is independent of the potential outcomes. in the pathway →
Which identifying assumption do you need?
Conducting a systematic review
A protocol-driven, pre-registered search with reproducible strings, dual independent screening, structured extraction, and a PRISMA flow diagram accounting for every record. in the pathway →
What situation?
Confidence interval
The range of values compatible with the data around a point estimate, frequently misread as a direct probability statement about the true value. in the pathway →
How to express estimate uncertainty?
Confounder
A common cause of exposure and outcome, which you adjust for. in the pathway →
Which causal-diagram concept?
Confounding
A common cause of exposure and outcome that distorts the estimate, with confounding by indication the clinical archetype. in the pathway →
Which bias is threatening the study?
Confounding by indication
The clinical archetype of confounding, or channeling, where the reason for treatment also predicts the outcome. in the pathway →
Which bias is threatening the study?
Conjugate prior
A prior chosen so the posterior shares its form and the update is closed-form, such as a beta prior with a binomial likelihood. in the pathway →
What kind of prior do you need?
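The closed-form update mentioned above is short enough to write out: with a beta prior and binomial data, the posterior is a beta distribution whose parameters are the prior parameters plus the observed counts. The uniform Beta(1, 1) prior and the 7-of-10 data are illustrative.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data under a Beta prior."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Hypothetical example: uniform prior, then 7 responders among 10 patients.
post_a, post_b = beta_binomial_update(1, 1, successes=7, failures=3)
posterior_mean = beta_mean(post_a, post_b)  # 8 / 12
```

The posterior mean of about 0.67 sits between the prior mean of 0.5 and the observed proportion of 0.7, pulled toward the data because ten observations outweigh a prior worth two.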
Consensus methods (Delphi, nominal group)
Formal methods for a panel to converge on a recommendation when evidence underdetermines it, including the Delphi method, nominal group technique, and RAND/UCLA method. in the pathway →
How do experts reach consensus?
Consistency
The identifiability condition that the treatment is a well-defined intervention so a potential outcome means something specific. in the pathway →
Which identifiability condition is at stake?
CONSORT
The reporting checklist for randomized trials. in the pathway →
Which study type are you reporting?
Continual reassessment method
A model-based phase I design estimating the maximum tolerated dose more efficiently with fewer patients overdosed. in the pathway →
Which early-phase design question are you facing?
Continuous enrollment and observable time
Requiring uninterrupted coverage so that a patient’s care is captured, letting absence of a code mean absence of care. in the pathway →
Can this data actually answer my question?
Contrast
A weighted sum of coefficients estimating a quantity such as a subgroup effect when the model carries an interaction. in the pathway →
Which comparison of model terms?
Cook’s distance
A diagnostic for influential points in a regression. in the pathway →
Which model assumption to check?
Cost-benefit analysis
Economic evaluation that monetizes the health benefit so it can be compared directly with cost. in the pathway →
Which economic-evaluation framing fits?
Cost-effectiveness acceptability curve
A curve reading off the probability that each option is the best buy at each willingness-to-pay threshold. in the pathway →
How are you handling cost-effectiveness uncertainty?
Cost-effectiveness alongside a trial
Estimating cost-effectiveness directly from a trial’s patient-level cost and outcome data, often via net-benefit regression, with high internal validity but a short horizon. in the pathway →
What situation?
Cost-effectiveness and the ICER
Economic evaluation putting cost and benefit on the same page, with the incremental cost-effectiveness ratio judged against a willingness-to-pay threshold. in the pathway →
Which economic-evaluation framing fits?
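As a sketch of how the incremental cost-effectiveness ratio is formed and judged, the function below divides incremental cost by incremental effect. The costs, QALYs, and the threshold of 50,000 per QALY are hypothetical.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost per unit of incremental effect (for example per QALY)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the options have equal effects")
    return delta_cost / delta_effect

ratio = icer(cost_new=12000, cost_old=10000, effect_new=5.5, effect_old=5.0)
threshold = 50000  # hypothetical willingness-to-pay per QALY
acceptable = ratio <= threshold
```

The comparison with the threshold is meaningful only when the new option is both more costly and more effective; a negative ratio can mean either a dominant or a dominated option and has to be read from the signs of the two differences.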
Cost-effectiveness plane
The plane on which a probabilistic analysis plots its cloud of incremental cost-and-effect pairs. in the pathway →
How are you handling cost-effectiveness uncertainty?
Cost-minimization analysis
Economic evaluation that compares only costs, applicable only when the outcomes of the options are genuinely equal. in the pathway →
Which economic-evaluation framing fits?
Cost-utility analysis
Economic evaluation measuring benefit in quality-adjusted life years so different conditions become comparable. in the pathway →
Which economic-evaluation framing fits?
Costing methods
How the cost in cost-effectiveness is estimated, from micro-costing each resource to gross costing a whole episode, sorted into direct medical, direct non-medical, and indirect costs. in the pathway →
How to value the resources used?
CPT/HCPCS codes
Codes for professional services, procedures, and supplies in outpatient and physician billing. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
Cramer’s V
An effect-size measure for a chi-square table. in the pathway →
Which two-variable association are you testing?
Credible interval
A range the parameter lies in with stated probability, a direct probability statement the frequentist interval cannot make. in the pathway →
Which Bayesian concept is in play?
Cronbach’s alpha
A gauge of the internal consistency of a multi-item scale. in the pathway →
Which measurement property are you assessing?
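A minimal sketch of Cronbach's alpha uses the standard variance form: the number of items over one less, times one minus the ratio of summed item variances to the variance of the total score. The item scores are invented.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, aligned so position i is respondent i."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical three-item scale answered identically on every item by four respondents.
items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha = cronbach_alpha(items)
```

Items that move together perfectly, as in this toy example, give an alpha of 1; items that are unrelated push it toward 0.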
Cross-sectional study
Measures exposure and outcome at a single point in time, giving prevalence cheaply but rarely establishing temporal order. in the pathway →
Which observational design fits the question and dominant bias?
Cross-validation
Estimating out-of-sample error on held-out folds to choose the right model flexibility. in the pathway →
Which concept or penalty?
Crude rate
The unadjusted whole-population frequency, which confounds comparisons across populations with different age structures. in the pathway →
What frequency are you trying to measure?
Cumulative incidence
The risk of disease: new cases over a fixed period divided by the population at risk. in the pathway →
What frequency are you trying to measure?
Cumulative incidence function (CIF)
Probability of experiencing the event by a given time, accounting for competing events that remove patients. in the pathway →
Does a competing event block the outcome?
Cure models
Survival models that split the population into a cured fraction and a susceptible fraction with its own distribution. in the pathway →
Does a competing event block the outcome?
Cycle length and the half-cycle correction
Two timing choices in a state-transition model: a cycle short enough to miss no important event, and a correction for transitions occurring partway through a cycle. in the pathway →
Which cycle-timing issue applies?

D

DAG
A directed acyclic graph: variables as nodes and assumed causal effects as arrows, with no cycles. in the pathway →
Which causal-diagram concept?
Data feasibility, enrollment, and linkage
Confirming a database can answer the question, that follow-up is observable, and that datasets are joined without exposing patient identities. in the pathway →
Can this data actually answer my question?
Data management and reproducibility
The discipline between collection and analysis, from clean data capture and a database lock to a scripted, version-controlled pipeline that regenerates the numbers. in the pathway →
Which data-management step are you at?
Data privacy and security
The duty owed to people in health data, governed by HIPAA in the US and GDPR in Europe, with de-identification or synthetic data enabling research sharing. in the pathway →
Which rule or method?
Data safety monitoring board
An independent board, not the sponsor, that decides whether to stop a trial early for efficacy, futility, or harm. in the pathway →
Which interim-monitoring element?
Data sources and their tradeoffs
Each data source carries a characteristic strength and bias that bounds every question it can answer. in the pathway →
Which data source or pitfall?
Data standards and provenance
The structure a datapoint inherits from how it was recorded, through CDISC standards or coding ontologies. in the pathway →
Which data standard or provenance layer?
Database feasibility and the attrition funnel
Counting how many patients survive each eligibility criterion to judge whether a source supports the planned study. in the pathway →
Can this data actually answer my question?
Database lock
A dated point after which no value in a study database changes silently, marking the clean source for analysis. in the pathway →
Which data-management step are you at?
Decision tree (decision analysis)
A model mapping a one-off choice and its probabilistic consequences, clean for an acute decision but clumsy once events repeat. in the pathway →
Which model structure fits the problem?
Decision tree (machine learning)
A model that recursively splits the predictor space into regions, interpretable but unstable on its own. in the pathway →
Which learner or ensemble fits?
Decision-analytic models
Models estimating lifetime costs and QALYs that are rarely observed directly, from decision trees and Markov models to microsimulation and transmission models. in the pathway →
Which model structure fits the problem?
Decision-curve analysis
Weighing the trade-offs of acting on a test or model directly in terms of net benefit across the range of thresholds a clinician might hold. in the pathway →
Which clinical-utility concept is in play?
Delphi method
A consensus method where an expert panel answers in iterative anonymous rounds, revising after seeing a statistical summary, so opinion converges without face-to-face pressure. in the pathway →
How do experts reach consensus?
Descriptive epidemiology
Describing a health event by person, place, and time to generate hypotheses and fix the frequency measure reported. in the pathway →
Design effect
The factor by which clustering inflates variance, used to scale up the target sample size to hold the effective sample size. in the pathway → \[\text{DEFF} = \frac{\text{Var}_{\text{complex}}}{\text{Var}_{\text{SRS}}}\] where \(\text{DEFF}\) is the design effect, the variance penalty from the complex design; \(\text{Var}_{\text{complex}}\) is the variance under the actual complex sampling design; \(\text{Var}_{\text{SRS}}\) is the variance a simple random sample of the same size would give.
How do you draw the sample?
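For cluster samples, a common approximation to the design effect is one plus the average cluster size minus one, times the intracluster correlation. That approximation is an assumption added here, not part of the definition above, and the inputs are illustrative.

```python
import math

def cluster_design_effect(mean_cluster_size, icc):
    """Approximate DEFF for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc

def inflated_sample_size(n_simple_random, deff):
    """Sample size needed under the complex design to match a simple random sample."""
    return math.ceil(n_simple_random * deff)

deff = cluster_design_effect(mean_cluster_size=9, icc=0.125)  # 2.0
n_needed = inflated_sample_size(400, deff)                    # 800
n_effective = n_needed / deff                                 # 400.0
```

With a design effect of 2, doubling the sample to 800 holds the effective sample size at the 400 a simple random sample would have needed.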
Detection bias
Differential ascertainment of the outcome by exposure group, also called observer bias. in the pathway →
Which bias is threatening the study?
Diagnostic-accuracy studies
Study design measuring how well an index test discriminates disease against a reference standard, prone to spectrum, verification, and incorporation bias. in the pathway →
Which accuracy measure or pitfall?
Difference-in-differences
A causal design that compares the before-after change in a treated group with the change in an untreated group, removing time-invariant confounding and resting on a parallel-trends assumption. in the pathway →
Which quasi-experimental design fits?
Differential misclassification
Measurement error whose size depends on another study variable, such as exposure error that differs by outcome status, which can bias an effect in either direction and is harder to reason about. in the pathway →
What kind of measurement error?
Dimensionality reduction
Compressing many correlated variables into a few, through PCA or nonlinear methods. in the pathway →
What unlabeled-data structure are you finding?
Discounting
Converting future costs and effects to present value over a model’s time horizon. in the pathway →
Which model structure fits the problem?
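Discounting to present value can be sketched as dividing each year's amount by one plus the rate, raised to the number of years elapsed. The 3% rate and the cost stream are illustrative, and treating the first year as undiscounted is a convention assumed here.

```python
def present_value(yearly_amounts, rate):
    """Sum of yearly amounts, each discounted by (1 + rate) ** years_from_now."""
    return sum(amount / (1 + rate) ** year
               for year, amount in enumerate(yearly_amounts))

# Hypothetical cost of 100 per year for three years, discounted at 3%.
costs = [100, 100, 100]
pv = present_value(costs, rate=0.03)  # about 291.35
```

The same function applies to effects such as QALYs, so costs and benefits falling in later years both count for less than those falling now.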
Discrimination
Whether a model ranks higher-risk patients above lower-risk ones, measured by the AUC. in the pathway →
Which aspect of predictive performance?
Disease registry
A systematically maintained roster of people with a condition or exposure that supplies a standing population for many designs. in the pathway →
Which observational design fits the question and dominant bias?
Disease risk score
A summary that models outcome risk from covariates instead of treatment probability, offering an alternative to the propensity score. in the pathway →
How do causal methods scale to claims and time?
Dose-finding and early-phase designs
Early studies that find the tolerable dose and the efficacy signal before a confirmatory trial. in the pathway →
Which early-phase design question are you facing?
Double-barreled question
A survey item that asks two things at once. in the pathway →
Which questionnaire flaw is in play?
Double-programming
Independent re-derivation of a dataset or output by a second programmer without seeing the first, reconciled value by value as the sign-off. in the pathway →
What situation?
Doubly-robust estimators
Estimators such as augmented IPW and TMLE that combine a propensity and an outcome model and stay consistent if either is right. in the pathway →
How do you estimate the causal effect?
Drug era (OMOP)
A derived continuous exposure span in the OMOP model built from raw drug records using an explicit persistence gap. in the pathway →
How do raw fills become a defined exposure with a start and end?
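The idea of stitching raw records into continuous spans with a persistence gap can be sketched as an interval merge. This is an illustration of the logic only, not the OMOP derivation script; the 30-day gap is a parameter chosen for the example, and the dates are invented.

```python
from datetime import date

def build_drug_eras(exposures, persistence_gap_days=30):
    """Merge (start, end) exposure records for one patient and one drug into eras.

    Records separated by no more than persistence_gap_days join the same era.
    """
    eras = []
    for start, end in sorted(exposures):
        if eras and (start - eras[-1][1]).days <= persistence_gap_days:
            eras[-1][1] = max(eras[-1][1], end)  # extend the current era
        else:
            eras.append([start, end])            # gap too long: start a new era
    return [tuple(era) for era in eras]

# Hypothetical fills: an 11-day gap, then an 82-day gap.
fills = [
    (date(2021, 1, 1), date(2021, 1, 30)),
    (date(2021, 2, 10), date(2021, 3, 11)),
    (date(2021, 6, 1), date(2021, 6, 30)),
]
eras = build_drug_eras(fills)
```

The first two fills merge into one era because the gap between them is within the allowance, while the third starts a new era. Changing the gap changes the exposure definition, so it is worth stating explicitly in a protocol.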
Dynamic transmission model
An infectious-disease model capturing how treating one person changes others’ risk through herd immunity, which a fixed-risk cohort model cannot. in the pathway →
Which model structure fits the problem?

E

E-value
A measure of how strong a hidden confounder would have to be, in association with both treatment and outcome, to explain away an observed result. in the pathway → \[E = \text{RR} + \sqrt{\text{RR}(\text{RR} - 1)}\] where \(E\) is the E-value, the smallest association a hidden confounder would need with both treatment and outcome to explain the estimate away; \(\text{RR}\) is the observed risk ratio, taken above 1 (for a protective effect, apply the formula to its reciprocal).
How do you quantify unmeasured bias?
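The E-value formula above translates directly into code, including the reciprocal step for protective effects. The risk ratios passed in are illustrative.

```python
import math

def e_value(risk_ratio):
    """E-value for an observed risk ratio, inverting protective effects first."""
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = 1 / risk_ratio if risk_ratio < 1 else risk_ratio
    return rr + math.sqrt(rr * (rr - 1))

harmful = e_value(2.0)     # 2 + sqrt(2), about 3.41
protective = e_value(0.5)  # same value, via the reciprocal
null = e_value(1.0)        # 1.0: no confounding is needed to explain a null result
```

An observed risk ratio of 2 could be explained away only by a hidden confounder associated with both treatment and outcome by a risk ratio of about 3.4 each.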
Ecological fallacy
Reading a group-level association as if it held for individuals, a trap of aggregated data. in the pathway →
Which data source or pitfall?
Effect measures
The scales for reporting a result, including relative measures like risk and odds ratios and absolute measures like risk difference and number needed to treat. in the pathway → \[\text{RR} = \frac{\text{risk}_{\text{exposed}}}{\text{risk}_{\text{unexposed}}}, \quad \text{RD} = \text{risk}_{\text{exposed}} - \text{risk}_{\text{unexposed}}\] where \(\text{RR}\) is the risk ratio, a relative measure; \(\text{RD}\) is the risk difference, an absolute measure; \(\text{risk}_{\text{exposed}}\) is the outcome risk in the exposed group; \(\text{risk}_{\text{unexposed}}\) is the outcome risk in the unexposed group.
Which effect measure to report?
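As a small worked sketch of the two formulas above, the following computes both measures for two made-up pairs of risks. The function names and the numbers are illustrative assumptions, not data from the pathway.

```python
def risk_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    """Relative measure: risk in the exposed over risk in the unexposed."""
    return risk_exposed / risk_unexposed


def risk_difference(risk_exposed: float, risk_unexposed: float) -> float:
    """Absolute measure: risk in the exposed minus risk in the unexposed."""
    return risk_exposed - risk_unexposed


# Two hypothetical settings with the same relative effect.
for exposed, unexposed in [(0.10, 0.05), (0.002, 0.001)]:
    rr = risk_ratio(exposed, unexposed)
    rd = risk_difference(exposed, unexposed)
    print(f"RR = {rr:.2f}, RD = {rd:.3f}")
```

Both settings give a risk ratio of about 2, yet the risk differences differ fifty-fold, which is why the choice of scale changes how large an effect looks.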
Effective sample size
The sample size discounted by the design effect, so a design effect of 2 leaves the precision of half the respondents. in the pathway → \[n_{\text{eff}} = \frac{n}{\text{DEFF}}\] where \(n_{\text{eff}}\) is the effective sample size, the precision the design actually delivers; \(n\) is the achieved sample size; \(\text{DEFF}\) is the design effect, the variance penalty from clustering and unequal weighting.
Which weighting or design adjustment?
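The discounting in the effective sample size entry is a single division. A minimal sketch, with a hypothetical function name and made-up survey numbers:

```python
def effective_sample_size(n: float, deff: float) -> float:
    """Achieved sample size discounted by the design effect."""
    if deff <= 0:
        raise ValueError("design effect must be positive")
    return n / deff


# 1,000 respondents under a design effect of 2 carry the precision of 500.
print(effective_sample_size(1000, 2.0))  # 500.0
```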
Egger’s test
A statistical test for funnel-plot asymmetry, used to check for publication bias. in the pathway →
Which pooling or heterogeneity tool?
Elastic net
A regularization that blends ridge and lasso penalties. in the pathway →
Which concept or penalty?
Electronic health record data
Clinically rich data recorded for care, so messy, single-system, and informatively missing rather than research-ready. in the pathway →
Which data source or pitfall?
Elixhauser comorbidity measures
A broader set of comorbidity categories, often kept as separate indicators rather than one number, to adjust for diverse baseline conditions. in the pathway →
How do I adjust for how sick patients already were?
Empirical calibration
Fitting the spread of many null estimates to recalibrate p-values and intervals for observed systematic error. in the pathway →
Need to detect hidden residual confounding?
Endpoint adjudication and chart review
Clinician review of source records, blinded to exposure, serving as the reference standard for validation. in the pathway →
Is your outcome a validated algorithm or an unchecked code rule?
Endpoint logic and pre-registration
Fixing the primary endpoint the sample size rests on and publicly committing to it before unblinding, keeping confirmatory analyses confirmatory. in the pathway →
Which endpoint or pre-specification concern?
EQ-5D
A preference-based instrument used to derive the utility weights that anchor quality-adjusted life years. in the pathway →
Which utility concept do you need?
Equivalence trial
A trial that bounds the difference between treatments on both sides. in the pathway →
What are you trying to show?
Estimand
The exact quantity to be estimated: which effect, in whom. in the pathway →
Whose causal effect do you target?
Eta-squared
An ANOVA effect-size measure, the share of variance the groups explain. in the pathway →
Which two-variable association are you testing?
Evidence-to-decision
Frameworks making the move from evidence to a recommendation explicit, weighing benefits and harms alongside values, feasibility, equity, and cost. in the pathway →
What situation?
EVPI
Expected value of perfect information: an upper bound on what further research could be worth, equal to the expected loss from deciding under current uncertainty. in the pathway →
What is more evidence worth?
Exact logistic regression
Conditions on sufficient statistics and enumerates the permutation distribution, giving valid inference without asymptotic approximations when data are very sparse. in the pathway →
Are cells sparse or analytic standard errors doubtful?
Exchangeability
The identifiability condition that treated and untreated are comparable once confounders are controlled, meaning no unmeasured confounding. in the pathway →
Which identifiability condition is at stake?
Exclusion restriction
The unverifiable assumption underlying instrumental-variable designs: that the instrument affects the outcome only through the exposure. in the pathway →
Which identifying assumption do you need?
Expected value of partial perfect information (EVPPI)
Prices the value of resolving uncertainty in specific parameters, identifying which uncertainty is worth further research. in the pathway →
Modeling skewed real-world costs for HTA?
Expected value of sample information
A measure valuing a study of a given design and size, going beyond perfect information to price real research. in the pathway →
What is more evidence worth?
Expert determination
A HIPAA de-identification route where a statistician certifies the re-identification risk is very small. in the pathway →
Which rule or method?
Exposure definition in RWD
Turning prescription or claim records into an exposure variable with a defined start, window, and end so it is clear who is treated and when. in the pathway →
How do raw fills become a defined exposure with a start and end?
Exposure episode construction
Stitching consecutive fills into a continuous treatment span using rules for combining overlapping or sequential supplies. in the pathway →
How do raw fills become a defined exposure with a start and end?
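One way to picture episode construction is the sketch below. It is an illustration under stated assumptions, not the pathway's own algorithm: the function name `build_episodes`, the 30-day default gap, and the rule that overlapping supply is absorbed rather than carried forward are all choices made for this example.

```python
from datetime import date, timedelta


def build_episodes(fills, permissible_gap_days=30):
    """Stitch (fill_date, days_supply) records into (start, end) episodes.

    A fill continues the current episode when the number of uncovered days
    since the episode's last covered day is at most permissible_gap_days.
    Overlapping supply is absorbed, not shifted forward (an assumption).
    """
    episodes = []
    for fill_date, days_supply in sorted(fills):
        supply_end = fill_date + timedelta(days=days_supply - 1)
        if episodes:
            start, end = episodes[-1]
            uncovered_days = (fill_date - end).days - 1
            if uncovered_days <= permissible_gap_days:
                # Same episode: extend its end if this fill reaches further.
                episodes[-1] = (start, max(end, supply_end))
                continue
        episodes.append((fill_date, supply_end))
    return episodes


fills = [
    (date(2024, 1, 1), 30),  # covers 1 Jan to 30 Jan
    (date(2024, 2, 5), 30),  # 5 uncovered days, so the episode continues
    (date(2024, 6, 1), 30),  # long gap, so a new episode starts
]
for start, end in build_episodes(fills):
    print(start, "to", end)
```

Changing `permissible_gap_days` changes who counts as continuously exposed, so the value is best fixed in the protocol before outcomes are examined.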
Extract-transform-load
Pulling from source tables, deriving study variables from operational definitions, and assembling one analysis-ready table. in the pathway →
Which cohort-construction step?

F

F1 score
The harmonic mean of precision and recall, high only when both are. in the pathway → \[F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}\] where \(F_1\) is the F1 score, the harmonic mean of precision and recall; \(\text{precision}\) is the share of positive predictions that are correct; \(\text{recall}\) is the share of true positives caught.
Which classification metric?
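The F1 formula above can be computed directly from confusion-matrix counts. A minimal sketch; the function name and the counts are made up for illustration, and returning 0 when there are no true positives is a convention chosen here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0  # convention: no true positives means F1 of zero
    precision = tp / (tp + fp)  # share of positive predictions that are correct
    recall = tp / (tp + fn)     # share of true positives caught
    return 2 * precision * recall / (precision + recall)


# Precision 0.8 and recall 0.5 give an F1 below their arithmetic mean of 0.65.
print(round(f1_score(tp=80, fp=20, fn=80), 3))  # 0.615
```

The harmonic mean is pulled toward the weaker of the two components, which is why F1 is high only when both precision and recall are.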
False-discovery rate
The expected share of false positives among rejections, controlled by the Benjamini-Hochberg procedure and better suited to large-scale screening than family-wise control. in the pathway →
How do you control multiple testing?
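The Benjamini-Hochberg procedure named above can be sketched as a step-up rule: sort the p-values, find the largest rank k whose p-value is at most k/m times the target rate, and reject everything up to that rank. The function name and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest rank k (1-based) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
# [True, False, False, False]
print(benjamini_hochberg([0.03, 0.03, 0.03, 0.50]))
# [True, True, True, False]
```

The second call shows the step-up behaviour: the smallest p-values miss their own thresholds, but the third-ranked one passes, so all three are rejected together.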
Family-wise error rate
The chance of even one false positive, held down by Bonferroni or Holm’s step-down procedure. in the pathway →
How do you control multiple testing?
Fine-Gray subdistribution hazard
A hazard that models the cumulative incidence function directly, giving covariate effects on absolute risk. in the pathway →
Does a competing event block the outcome?
Firth penalized regression
Adds a bias-reducing penalty to the likelihood, keeping coefficient estimates finite and less biased even under separation in small or sparse data. in the pathway →
Are cells sparse or analytic standard errors doubtful?
Fisher’s exact test
A test of association between categorical variables used when cell counts are small. in the pathway →
Which two-variable association are you testing?
Fixed-effect meta-analysis
A pooling model assuming every study estimates one common effect, weighting each only by the inverse of its variance. in the pathway → \[w = \frac{1}{\text{variance}}\] where \(w\) is the weight a study receives in the pooled estimate; \(\text{variance}\) is the variance of that study’s effect estimate.
Which pooling or heterogeneity tool?
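The inverse-variance weight above leads to a pooled estimate in a few lines. This sketch uses made-up study estimates; the function name is hypothetical, and the pooled standard error (the square root of one over the summed weights) is the standard fixed-effect result rather than something stated in the entry.

```python
import math


def fixed_effect_pool(estimates, variances):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1 / v for v in variances]  # w = 1 / variance
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    se = math.sqrt(1 / total_weight)
    return pooled, se


# Two hypothetical studies (for example, log risk ratios with their variances).
pooled, se = fixed_effect_pool([0.2, 0.4], [0.01, 0.04])
print(round(pooled, 3), round(se, 3))  # 0.24 0.089
```

The more precise study carries four times the weight here, so the pooled value sits much closer to 0.2 than to 0.4.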
Fleiss’ kappa
A kappa extending chance-corrected agreement past two raters. in the pathway →
Which measurement property are you assessing?
Friction-cost approach
Valuing lost productivity by counting only earnings lost until a worker is replaced. in the pathway →
How to value the resources used?
Fundamental problem of causal inference
That only one of a unit’s potential outcomes is ever observed. in the pathway →
Which identifiability condition is at stake?
Funnel plot
A plot used to check for publication bias in a meta-analysis, where asymmetry suggests missing null studies. in the pathway →
Which pooling or heterogeneity tool?

G

G-estimation
A g-method estimating a structural nested model for time-varying confounding. in the pathway →
How do you handle time-varying confounding?
G-formula
G-computation: modeling the outcome under each treatment and averaging over the covariate distribution. in the pathway →
How do you estimate the causal effect?
Gate question
A question that routes a respondent past items that do not apply, creating by-design blanks. in the pathway →
Which skip-logic element is in play?
Gatekeeping procedure
A hierarchical procedure ordering trial hypotheses and spending alpha down the sequence, testing a secondary endpoint only if the primary won. in the pathway →
How do you control multiple testing?
GDPR
The European regulation imposing a stricter consent-and-purpose regime on personal data than US rules. in the pathway →
Which rule or method?
GEE
Generalized estimating equations, used for clustered or repeated measures. in the pathway →
Which regression for your outcome?
Generalizability and transportability
Generalizability asks whether the study sample represents the target population; transportability formalizes when an estimate can be carried to a different population. in the pathway →
Do findings carry to other populations?
Generalized additive models
Models that extend generalized linear models with spline-based smooth terms to fit nonlinear predictor effects. in the pathway →
How do you flex the model?
Gibbs sampling
A classic MCMC algorithm for drawing posterior samples. in the pathway →
Which sampling or diagnostic tool?
GLM
A generalized linear model: a choice of outcome distribution plus a link function. in the pathway →
Which regression for your outcome?
Good clinical practice
The operational standard (ICH E6) making a trial’s data trustworthy through defined responsibilities, a followed protocol, source-data verification, and an audit trail. in the pathway →
What conduct standard governs the trial?
Grace period and permissible gap
Allowed days between supplies before exposure is broken, and extra coverage past the last day of supply before discontinuation. in the pathway →
How do raw fills become a defined exposure with a start and end?
GRADE
A system rating the certainty of a body of evidence, downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias. in the pathway →
Which certainty-of-evidence concept is in play?
Gross costing
Top-down costing that values a whole episode of care with one aggregate weight such as a DRG payment. in the pathway →
How to value the resources used?
Group-sequential design
A design that pre-specifies interim analyses and spends the alpha across them with a stopping boundary. in the pathway →
Which interim-monitoring element?

H

Half-cycle correction
A fix for the counting error from tallying state membership only at cycle boundaries, since on average subjects transition partway through a cycle. in the pathway →
Which cycle-timing issue applies?
Hamiltonian Monte Carlo
The MCMC engine of Stan, which mixes far more efficiently in high dimensions than random-walk samplers. in the pathway →
Which sampling or diagnostic tool?
Hazard ratios and non-proportional hazards
A hazard ratio assumes a constant effect on instantaneous risk over time; when that fails, the single ratio becomes a censoring-dependent weighted average. in the pathway →
Which survival concept?
Health technology assessment and value frameworks
The process of weighing cost-effectiveness against clinical benefit, budget impact, and equity to reach a coverage or pricing verdict, run differently across health systems. in the pathway →
Which HTA framework or body?
Health-state utility
A preference-based weight for a health state, anchored at zero for death and one for full health, elicited with instruments like the EQ-5D or with time-trade-off and standard-gamble methods. in the pathway →
Which utility concept do you need?
Healthy-worker effect
The tendency of an employed cohort to be healthier than the general population. in the pathway →
Which bias is threatening the study?
Heterogeneity
The degree to which studies’ results actually disagree beyond chance, which decides whether a pooled number is informative or a fiction. in the pathway →
Which pooling or heterogeneity tool?
Heteroscedasticity
Non-constant residual variance, read from a residual-versus-fitted plot and confirmed with Breusch-Pagan or White. in the pathway →
Which model assumption to check?
Hierarchical Bayesian models
Multilevel models that estimate each group’s parameter while sharing a common prior, pulling estimates toward the mean. in the pathway →
Which multilevel Bayesian idea?
Hierarchical clustering
A clustering method building a nested tree of groupings without fixing the number of clusters in advance. in the pathway →
What unlabeled-data structure are you finding?
High-dimensional propensity score (hdPS)
An algorithm that screens thousands of claims codes to select empirical proxy confounders for the propensity-score model automatically. in the pathway →
How do causal methods scale to claims and time?
HIPAA
The US law governing identifiable health information, which a dataset must satisfy through de-identification before sharing for research. in the pathway →
Which rule or method?
Holm’s procedure
A step-down procedure controlling the family-wise error rate with more power than Bonferroni. in the pathway →
How do you control multiple testing?
Homogeneity check
The test before pooling for whether stratum-specific estimates differ by more than noise, which would indicate effect modification. in the pathway →
Which stratified-analysis step are you at?
Hosmer-Lemeshow statistic
A goodness-of-fit test of whether a model’s predicted risks match observed event rates across groups. in the pathway →
Which aspect of predictive performance?
Human-capital approach
Valuing lost productivity by counting all earnings foregone to illness. in the pathway →
How to value the resources used?
Hurdle model
A count model with a zero-versus-positive gate followed by a truncated count. in the pathway →
Which regression for your outcome?
Hypothetical strategy
An intercurrent-event strategy targeting the outcome had the event not occurred. in the pathway →
How do you handle intercurrent events?

I

I-squared
A statistic reporting the fraction of total variation across studies that is beyond chance, summarizing heterogeneity. in the pathway →
Which pooling or heterogeneity tool?
ICD-10-CM diagnosis codes
Clinical modification of ICD-10 used to code diagnoses and conditions for morbidity reporting. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
ICD-10-PCS procedure codes
Procedure coding system for inpatient hospital procedures. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
ICER
Incremental cost-effectiveness ratio: the extra cost divided by the extra benefit of one option over the next. in the pathway → \[\text{ICER} = \frac{\Delta\text{cost}}{\Delta\text{effect}}, \quad \text{NMB} = \text{effect} \times \text{WTP} - \text{cost}\] where \(\text{ICER}\) is the incremental cost-effectiveness ratio of one option over the next; \(\Delta\text{cost}\) is the extra cost of the option; \(\Delta\text{effect}\) is the extra benefit of the option; \(\text{NMB}\) is the net monetary benefit, the same comparison made linear; \(\text{effect}\) is the health benefit gained; \(\text{WTP}\) is the willingness-to-pay threshold per unit of benefit; \(\text{cost}\) is the cost of the option.
Which economic-evaluation framing fits?
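The two formulas in the ICER entry can be checked against each other with a small example. The costs, effects, and willingness-to-pay threshold below are invented for illustration, and the function names are hypothetical.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost divided by incremental effect."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the effects are equal")
    return delta_cost / delta_effect


def net_monetary_benefit(effect, cost, wtp):
    """Effect valued at the willingness-to-pay threshold, minus cost."""
    return effect * wtp - cost


# Hypothetical options: effects in QALYs, costs in currency units.
print(icer(30_000, 5.0, 20_000, 4.5))             # 20000.0 per QALY
print(net_monetary_benefit(5.0, 30_000, 30_000))  # 120000.0
print(net_monetary_benefit(4.5, 20_000, 30_000))  # 115000.0
```

With a threshold of 30,000 per QALY the new option's ICER of 20,000 falls below it, and the same conclusion appears as the higher net monetary benefit.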
IDE
FDA investigational device exemption, usually needed before a device trial begins. in the pathway →
Which regulatory application?
Identifying assumptions
The claim each causal design rests on that the data cannot verify, such as parallel trends or an exclusion restriction. in the pathway →
Which identifying assumption do you need?
Immortal time
A stretch of follow-up during which the outcome could not yet have occurred, a source of bias that target-trial emulation surfaces. in the pathway →
Which target-trial element?
Immortal time bias
Mistakenly assigning follow-up during which the outcome could not occur to the treated group, manufacturing a survival advantage from bookkeeping. in the pathway →
What situation creates this bias?
Incidence
The rate of new cases, measured as cumulative incidence over a fixed period or as an incidence rate per person-time. in the pathway →
What frequency are you trying to measure?
Incidence rate
New cases divided by the person-time at risk, which handles varying follow-up. in the pathway →
What frequency are you trying to measure?
Incorporation bias
Bias arising when the index test is itself part of the reference standard it is judged against. in the pathway →
Which accuracy measure or pitfall?
IND
FDA investigational new drug application, usually needed before a drug trial begins. in the pathway →
Which regulatory application?
Index date
A single time zero at which eligibility, exposure assignment, and follow-up start are all aligned for each patient. in the pathway →
Which cohort-construction step?
Induction, latency, and lag windows
Time shifts that delay when exposure can plausibly cause an outcome, excluding implausibly early events. in the pathway →
How do raw fills become a defined exposure with a start and end?
Informative prior
A prior encoding real external knowledge, powerful when data are sparse. in the pathway →
What kind of prior do you need?
Informed consent
The requirement that a participant understand the study, its risks, and their freedom to refuse or withdraw, with extra protection for vulnerable groups. in the pathway →
Which ethics concept or body?
Institute for Clinical and Economic Review
A US body publishing value assessments that anchor drug-price negotiations without a binding cost-per-QALY rule. in the pathway →
Which HTA framework or body?
Institutional review board
A body that reviews a study before it starts, weighing risks against benefits and able to halt or modify a protocol. in the pathway →
Which ethics concept or body?
Instrumental variables
A causal design using a variable that affects the outcome only through the exposure, resting on an exclusion restriction. in the pathway →
Which quasi-experimental design fits?
Intention-to-treat
Analyzing every randomized patient in the arm assigned regardless of what they took, preserving randomization. in the pathway →
Which set of subjects do you analyze?
Interaction term
A term capturing effect modification, letting an effect differ across subgroups instead of being averaged. in the pathway →
How do you flex the model?
Intercurrent events
Things happening after randomization that complicate the outcome, such as stopping the drug, switching, rescue medication, or death. in the pathway →
How do you handle intercurrent events?
Interim analyses and group-sequential design
Pre-specified looks at accumulating trial data that spend alpha across them so peeking does not inflate the false-positive rate. in the pathway →
Which interim-monitoring element?
Interviewer bias
Bias from a data collector’s knowledge of a subject’s status shaping what is recorded. in the pathway →
Which bias is threatening the study?
Intraclass correlation
A measure of reproducibility for a continuous measurement across raters or repeats. in the pathway →
Which measurement property are you assessing?
Inverse-probability-of-censoring weighting (IPCW)
Reweighting uncensored patients to stand in for similar censored ones, correcting the informative censoring that artificial censoring or dropout creates. in the pathway →
How do causal methods scale to claims and time?
IPTW
Inverse-probability-of-treatment weighting, which reweights subjects by the inverse of their propensity score to balance measured confounders. in the pathway →
How do you estimate the causal effect?

K

K-means
A clustering method partitioning data into k groups by minimizing within-cluster distance to the cluster mean. in the pathway →
What unlabeled-data structure are you finding?
K-nearest neighbours
A predictor using the majority or average of the k closest cases, sensitive to scaling and dimensionality. in the pathway →
Which learner or ensemble fits?
Kendall’s tau
A measure of concordance between two ordinal rankings, with tau-c for rectangular tables. in the pathway →
Which two-variable association are you testing?
Kruskal-Wallis test
A rank-based alternative to one-way ANOVA when normality is doubtful. in the pathway →
Which two-variable association are you testing?
Kurtosis
A summary of a distribution’s tail-heaviness, part of reading its shape. in the pathway →
What shape feature?

L

Landmark analysis
Classifying exposure status as of a fixed later time and analyzing from there, so early events are not misattributed to exposure. in the pathway →
How do causal methods scale to claims and time?
Lasso
L1 regularization that shrinks some coefficients exactly to zero and so also selects variables. in the pathway →
Which concept or penalty?
LATE
The local average treatment effect, the contrast of potential outcomes among compliers. in the pathway →
Whose causal effect do you target?
Lead-time bias
The apparent survival gain from diagnosing earlier without changing the disease course. in the pathway →
Which bias is threatening the study?
Leading question
A survey item whose wording presses the respondent toward a particular answer. in the pathway →
Which questionnaire flaw is in play?
Learning algorithms and ensembles
The supervised toolkit beyond regression, including k-nearest neighbours, support vector machines, decision trees, and ensembles. in the pathway →
Which learner or ensemble fits?
Leave-one-out and specification curves
Re-estimating after dropping a single unit, or across many defensible modeling choices, to expose whether a finding rests on one unit or holds broadly. in the pathway →
How are you probing specification robustness?
Length-time bias
The over-representation of slow, indolent cases that screening preferentially catches. in the pathway →
Which bias is threatening the study?
Likelihood ratios
Summaries of a diagnostic table independent of prevalence that update pre-test odds to post-test odds directly. in the pathway → \[\text{LR}+ = \frac{\text{sens}}{1 - \text{spec}}, \quad \text{LR}- = \frac{1 - \text{sens}}{\text{spec}}, \quad \text{post-test odds} = \text{pre-test odds} \times \text{LR}\] where \(\text{LR}+\) is the positive likelihood ratio, how much a positive result raises the odds; \(\text{LR}-\) is the negative likelihood ratio, how much a negative result lowers the odds; \(\text{sens}\) is the sensitivity of the test; \(\text{spec}\) is the specificity of the test; \(\text{pre-test odds}\) is the odds of disease before the test, from prevalence; \(\text{post-test odds}\) is the odds of disease after the test result.
Which accuracy measure or pitfall?
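The odds-updating step in the likelihood ratio entry can be sketched as follows. The sensitivity, specificity, and prevalence are made-up values, and the function names are hypothetical.

```python
def likelihood_ratios(sens: float, spec: float):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob: float, lr: float) -> float:
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


lr_pos, lr_neg = likelihood_ratios(sens=0.9, spec=0.8)
print(round(lr_pos, 2), round(lr_neg, 3))               # 4.5 0.125
print(round(post_test_probability(0.20, lr_pos), 3))    # 0.529
print(round(post_test_probability(0.20, lr_neg), 3))    # 0.03
```

Starting from a 20% pre-test probability, a positive result raises it to roughly 53% and a negative result lowers it to roughly 3%.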
Linear combinations and contrasts
A weighted sum of regression coefficients reported as the quantity of interest, with a standard error drawn from the variance-covariance matrix. in the pathway →
Which comparison of model terms?
Linear regression
A regression for continuous outcomes, returning a mean difference. in the pathway →
Which regression for your outcome?
LOESS smoother
A smoother drawn on a scatter to reveal the shape of a relationship before assuming it is linear. in the pathway →
What shape feature?
Logistic regression
A regression for binary outcomes, returning an odds ratio. in the pathway →
Which regression for your outcome?
LOINC lab codes
Standard vocabulary for identifying laboratory tests and clinical observations. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
Lookback window
The pre-index period in which confounders are measured so adjustment targets baseline causes, not post-exposure variables. in the pathway →
Which cohort-construction step?

M

MAD
The median absolute deviation, rescaled by 1.4826 so that it estimates the standard deviation when the data are normal. in the pathway →
Which robust measure?
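A minimal sketch of the rescaled MAD, using only the standard library. The function name and the data are illustrative; the constant 1.4826 is the one quoted in the entry.

```python
import statistics


def scaled_mad(values):
    """Median absolute deviation multiplied by 1.4826."""
    med = statistics.median(values)
    abs_devs = [abs(x - med) for x in values]
    return 1.4826 * statistics.median(abs_devs)


data = [1, 2, 3, 4, 100]  # one gross outlier
print(round(scaled_mad(data), 4))        # 1.4826
print(round(statistics.stdev(data), 1))  # far larger, driven by the outlier
```

The single value of 100 inflates the ordinary standard deviation but leaves the MAD untouched, which is the sense in which the measure is robust.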
Mann-Whitney test
A rank-based alternative to the two-group t-test when normality is doubtful, also called the Wilcoxon rank-sum. in the pathway →
Which two-variable association are you testing?
Mantel-Haenszel estimator
A method for pooling stratum-specific odds ratios, risk ratios, or rate ratios into one. in the pathway →
Which stratified-analysis step are you at?
MAR
Missing at random: missingness depending only on observed data, handled by multiple imputation conditional on it. in the pathway →
Why are values missing?
Marginal structural model
A g-method fitted by inverse-probability-of-treatment weighting to handle time-varying confounding. in the pathway →
How do you handle time-varying confounding?
Markov model
A state-transition model moving a cohort between health states each cycle by a transition matrix, the standard tool for chronic disease. in the pathway → \[p = 1 - \exp(-r \cdot t)\] where \(p\) is the per-cycle transition probability; \(r\) is the rate reported in the published evidence; \(t\) is the cycle length over which the probability applies.
Which model structure fits the problem?
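The rate-to-probability conversion above feeds directly into a cohort trace. The sketch below assumes a hypothetical two-state model (well, dead) with a made-up annual rate of 0.1 and one-year cycles; the function names are illustrative.

```python
import math


def rate_to_probability(rate: float, cycle_length: float) -> float:
    """Per-cycle transition probability from a constant rate."""
    return 1 - math.exp(-rate * cycle_length)


def run_cohort(start, transition, n_cycles):
    """Move a cohort distribution through the transition matrix each cycle."""
    trace = [list(start)]
    n_states = len(start)
    for _ in range(n_cycles):
        current = trace[-1]
        nxt = [
            sum(current[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        trace.append(nxt)
    return trace


p_die = rate_to_probability(rate=0.1, cycle_length=1.0)
transition = [
    [1 - p_die, p_die],  # from well: stay well or die
    [0.0, 1.0],          # dead is absorbing
]
for cycle, (well, dead) in enumerate(run_cohort([1.0, 0.0], transition, 3)):
    print(cycle, round(well, 4), round(dead, 4))
```

Each row of the transition matrix sums to one, so the cohort total stays at one in every cycle.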
Maximum tolerated dose
The highest tolerable dose, the target a phase I dose-finding study estimates. in the pathway →
Which early-phase design question are you facing?
MCAR
Missing completely at random: a benign mechanism where missingness is unrelated to any data. in the pathway →
Why are values missing?
McFadden’s pseudo-R-squared
A rough stand-in for R-squared in generalized linear models, where a true R-squared does not apply. in the pathway →
Which fit or error measure do you need?
MCMC
Markov chain Monte Carlo: drawing a dependent sequence of samples whose long-run distribution is the posterior. in the pathway →
Which sampling or diagnostic tool?
Mean absolute error
A prediction error measure in the outcome’s units, used when a few large errors should not dominate. in the pathway →
Which fit or error measure do you need?
Measurement error and misclassification
Imprecision in measuring a variable, whose effect on an estimate depends on whether the error relates to the outcome. in the pathway →
What kind of measurement error?
Measurement-method effects
Two devices or protocols measuring the same quantity can disagree systematically, so a threshold validated under one does not transfer. in the pathway →
What situation is this?
Measures of disease frequency
The standard forms for counting how often disease occurs, including prevalence, incidence, and rates. in the pathway →
What frequency are you trying to measure?
Mediation analysis
Splitting a total effect into a direct effect and an indirect effect running through a mediator. in the pathway → \[\text{total} = \text{direct} + \text{indirect}, \quad \text{indirect} = a \cdot b\] where \(\text{total}\) is the total effect of the exposure on the outcome; \(\text{direct}\) is the effect not running through the mediator; \(\text{indirect}\) is the effect running through the mediator; \(a\) is the exposure-to-mediator coefficient; \(b\) is the mediator-to-outcome coefficient.
Which mediation concept is in play?
Mediator
A variable on the causal path from exposure to outcome, left alone when the total effect is the target. in the pathway →
Which causal-diagram concept?
Medication possession ratio (MPR)
Total days supplied divided by days in the observation interval, an adherence measure that can exceed one with overlaps. in the pathway →
How do raw fills become a defined exposure with a start and end?
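The medication possession ratio is simple arithmetic, sketched here with made-up fills. The function name is hypothetical.

```python
def medication_possession_ratio(days_supplied, interval_days):
    """Total days supplied divided by days in the observation interval."""
    if interval_days <= 0:
        raise ValueError("observation interval must be positive")
    return sum(days_supplied) / interval_days


# Three 30-day fills over a 120-day interval.
print(medication_possession_ratio([30, 30, 30], 120))  # 0.75
# Five 30-day fills in the same interval: overlaps push the ratio above one.
print(medication_possession_ratio([30] * 5, 120))      # 1.25
```

Because overlapping supply is counted in full, some analysts cap the ratio at one; whether to do so is a study-design choice rather than part of the definition above.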
Meta-analysis and pooling
Combining studies into one estimate using inverse-variance weighting, which sharpens an estimate only when the studies are estimating the same thing. in the pathway →
Which pooling or heterogeneity tool?
Meta-regression
A technique that tries to explain heterogeneity across studies using study-level covariates. in the pathway →
Which pooling or heterogeneity tool?
Metropolis-Hastings
A classic MCMC algorithm for drawing posterior samples. in the pathway →
Which sampling or diagnostic tool?
Micro-costing
Bottom-up costing that counts each resource used and multiplies it by its unit price. in the pathway →
How to value the resources used?
Minimization
An adaptive assignment that places each patient to keep arms balanced across several factors at once. in the pathway →
What allocation or masking concern?
Missing data
Why a value is missing decides what can be done about it, across the MCAR, MAR, and MNAR mechanisms. in the pathway →
Why are values missing?
MMRM
The mixed model for repeated measures, standard for a longitudinal trial endpoint, using all timepoints and handling dropout under missing-at-random. in the pathway →
Which regression for your outcome?
MNAR
Missing not at random: missingness depending on the unseen value itself, needing pattern-mixture or tipping-point sensitivity approaches. in the pathway →
Why are values missing?
Model fit, comparison, and prediction error
The continuous-outcome counterpart to calibration and discrimination, covering variance explained, model comparison, and honest out-of-sample error. in the pathway →
Which fit or error measure do you need?
Model modifications
Standard adaptations to a base regression, including splines, interactions, transformations, and offsets, each answering a specific signal. in the pathway →
How do you flex the model?
Model validation and calibration
Checks that build trust in a model: verification that it is coded correctly and validation across face, internal, external, and predictive layers that it represents reality. in the pathway →
What situation?
Monte Carlo simulation
Generating data under a known process and running the planned analysis over many replicates to study an estimator’s bias, coverage, and required sample size. in the pathway →
What situation?
Multi-criteria decision analysis (MCDA)
Explicitly weighting criteria such as equity and severity when a single ratio cannot capture value. in the pathway →
Modeling skewed real-world costs for HTA?
Multiple imputation
Filling in missing values conditional on observed data, valid when data are missing at random. in the pathway →
Why are values missing?
Multiplicity control
Methods to rein in false positives when many hypotheses are tested, via family-wise error or false-discovery control. in the pathway →
How do you control multiple testing?
Multistage sampling
Nesting sampling stages: sampling primary sampling units, then units within them, often with probability proportional to size. in the pathway →
How do you draw the sample?

N

Natural direct and indirect effects
The counterfactual framing of mediation, needing no unmeasured confounding of the mediator-outcome relationship. in the pathway →
Which mediation concept is in play?
NDC (National Drug Code)
Identifier encoding drug manufacturer, product, and package, requiring mapping to reach the ingredient level. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
Negative binomial distribution
A distribution for overdispersed counts whose variance exceeds the mean. in the pathway →
Which distribution or sampling result?
Negative binomial regression
A count regression used when overdispersion makes the variance exceed the mean. in the pathway →
Which regression for your outcome?
Negative control exposure
An exposure sharing the real exposure’s confounding structure but with no plausible causal link to the outcome. in the pathway →
Need to detect hidden residual confounding?
Negative control outcome
An outcome sharing the real outcome’s confounding structure but that exposure cannot plausibly cause. in the pathway →
Need to detect hidden residual confounding?
Negative controls and calibration
Using outcomes or exposures with known null effects to detect and correct residual confounding in real analyses. in the pathway →
Need to detect hidden residual confounding?
Nested case-control
Case-control study inside a defined cohort, sampling controls at the time each case occurs to preserve risk-set comparability. in the pathway →
Which observational design fits the question and dominant bias?
Net benefit
A metric weighing true positives against false positives at a threshold probability, going beyond accuracy by accounting for the consequences of acting. in the pathway →
Which clinical-utility concept is in play?
Net monetary benefit
A restatement of a cost-effectiveness comparison as effect times willingness-to-pay minus cost, avoiding the awkwardness of ratios and handling dominance. in the pathway →
Which economic-evaluation framing fits?
Net-benefit regression
Converting each patient’s cost and effect into one net-benefit outcome at a willingness-to-pay threshold and regressing it on treatment arm, giving covariate adjustment for free. in the pathway →
What situation?
Network meta-analysis
Combining a whole network of trials to estimate every pairwise treatment contrast and rank options, even when no trial compared them all directly. in the pathway →
Which network meta-analysis concern?
New-user design
A cohort design applying a washout window so prevalent users do not contaminate the comparison. in the pathway →
Which cohort-construction step?
NICE
A national agency that pairs cost-effectiveness analysis with an explicit cost-per-QALY threshold to reach coverage decisions. in the pathway →
Which HTA framework or body?
Node-splitting
A formal check of consistency in a network meta-analysis, comparing the direct and indirect estimate for each contrast to flag disagreement. in the pathway →
Which network meta-analysis concern?
Nominal group technique
An in-person consensus method structuring convergence through silent ranking then discussion. in the pathway →
How do experts reach consensus?
Non-differential misclassification
Measurement error unrelated to the outcome, which usually biases an effect toward the null. in the pathway →
What kind of measurement error?
Non-inferiority and equivalence
Trials aiming to show a treatment is not meaningfully worse, or is bounded on both sides, rather than better. in the pathway →
What are you trying to show?
Non-inferiority margin
The pre-specified amount by which a new treatment may be worse and still pass, set from clinical tolerability and the control’s advantage. in the pathway →
What are you trying to show?
Non-inferiority trial
A trial testing against a shifted null, passing if the effect is no worse than standard by more than a pre-specified margin. in the pathway →
What are you trying to show?
Non-informative censoring
The assumption that censored subjects are representative of those still at risk, which informative dropout violates. in the pathway →
Which survival concept?
Nonresponse bias
Bias from those who do not answer a survey differing systematically from those who do. in the pathway →
Which bias is threatening the study?
Normal distribution
The Gaussian distribution, often used for continuous measurements, whose standardized form is the z distribution. in the pathway →
Which distribution or sampling result?
NPI (provider identifier)
National Provider Identifier for the rendering or billing clinician or organization. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
Number needed to treat
Absolute measure of benefit, the number of patients treated to prevent one event, equal to the reciprocal of the absolute risk reduction. in the pathway → \[\text{NNT} = \frac{1}{\text{ARR}}\] where \(\text{NNT}\) is the number needed to treat, how many patients must be treated for one to benefit; \(\text{ARR}\) is the absolute risk reduction, the difference in risk between arms.
Which effect measure to report?
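A minimal sketch of the reciprocal relationship in the formula above. The function name and the two risks are hypothetical.

```python
def number_needed_to_treat(risk_control: float, risk_treated: float) -> float:
    """NNT = 1 / ARR, where ARR is the control risk minus the treated risk."""
    arr = risk_control - risk_treated
    if arr <= 0:
        raise ValueError("treatment does not reduce risk; NNT is undefined here")
    return 1 / arr


# Hypothetical risks: 10% of controls and 6% of treated patients have the event.
nnt = number_needed_to_treat(risk_control=0.10, risk_treated=0.06)
print(round(nnt, 1))  # reports commonly round the NNT up to a whole patient
```

An absolute risk reduction of four percentage points corresponds to treating about 25 patients to prevent one event.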

O

O’Brien-Fleming boundary
An alpha-spending boundary that is stringent early and near-nominal at the trial’s end. in the pathway →
Which interim-monitoring element?
Observational study designs
The family of non-randomized designs that observe exposures and outcomes as they occur, each chosen to fit a question and limit a specific bias. in the pathway →
Which observational design fits the question and dominant bias?
Odds ratio
Ratio of the odds of an outcome between groups, often misread as a risk ratio when the outcome is common, which overstates the effect. in the pathway →
Which effect measure to report?
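To see why the odds ratio overstates a risk ratio only when the outcome is common, the sketch below computes both from two hypothetical 2×2 tables with the same risk ratio. All counts are invented for illustration.

```python
def risk_ratio(events_exp: int, n_exp: int, events_unexp: int, n_unexp: int) -> float:
    """Risk in the exposed divided by risk in the unexposed."""
    return (events_exp / n_exp) / (events_unexp / n_unexp)


def odds_ratio(events_exp: int, n_exp: int, events_unexp: int, n_unexp: int) -> float:
    """Odds in the exposed divided by odds in the unexposed."""
    odds_exp = events_exp / (n_exp - events_exp)
    odds_unexp = events_unexp / (n_unexp - events_unexp)
    return odds_exp / odds_unexp


# Common outcome: 60/100 exposed versus 40/100 unexposed.
rr_common = risk_ratio(60, 100, 40, 100)
or_common = odds_ratio(60, 100, 40, 100)

# Rare outcome: 6/1000 exposed versus 4/1000 unexposed.
rr_rare = risk_ratio(6, 1000, 4, 1000)
or_rare = odds_ratio(6, 1000, 4, 1000)

print(round(rr_common, 2), round(or_common, 2))
print(round(rr_rare, 2), round(or_rare, 3))
```

Both tables have a risk ratio of 1.5, but the odds ratio is 2.25 for the common outcome and only about 1.503 for the rare one.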
Offset
A term for exposure time or population at risk that turns a Poisson count model into a rate model. in the pathway →
How do you flex the model?
OMOP standardized vocabularies (OHDSI)
Common data model mapping heterogeneous source codes to standard concepts so studies run across databases, at some loss of detail. in the pathway →
Which vocabulary encodes each claim field, and what does it capture?
Operating characteristics
Sensitivity and specificity describe a test in the abstract, while predictive values describe what a result means for a patient and shift with prevalence. in the pathway →
Which diagnostic-performance measure?
Operationalizing the variable
Writing a variable definition precise enough, with codes, thresholds, and windows, that two analysts produce the same cases. in the pathway →
What measurement situation are you in?
Opportunity cost
The principle that every dollar spent is health some other patient could have had. in the pathway →
Whose costs and benefits count?
Outcome phenotyping and validation
Treating a claims or EHR outcome as an algorithm whose accuracy must be measured, because its predictive value and sensitivity bias the estimate. in the pathway →
Is your outcome a validated algorithm or an unchecked code rule?
Over-adjustment
Conditioning on a mediator or collider, adding bias while trying to remove it, the mirror image of confounding. in the pathway →
Which bias is threatening the study?
Overdiagnosis
Detecting disease that would never have caused harm, inflating apparent screening benefit. in the pathway →
Which bias is threatening the study?
Overfitting
When a model flexible enough to chase noise fits the training data but fails on new data. in the pathway →
Which concept or penalty?

P

Parallel trends
The unverifiable assumption underlying difference-in-differences: absent treatment, the treated and comparison groups would have followed the same outcome trend. in the pathway →
Which identifying assumption do you need?
Parameter uncertainty
Second-order uncertainty in an input’s true value because it was estimated from finite data, propagated by probabilistic sensitivity analysis. in the pathway →
Which source of uncertainty?
Partial pooling
Shrinkage that stabilizes small or sparse groups by borrowing strength from the rest, between pooled and fully separate estimates. in the pathway →
Which multilevel Bayesian idea?
Partitioned survival model
An oncology model reading state membership straight off the progression-free and overall survival curves rather than a transition matrix. in the pathway →
Which model structure fits the problem?
Pearson correlation
A measure of linear association between two continuous variables. in the pathway →
Which two-variable association are you testing?
PECO
The observational cousin of PICO, naming population, exposure, comparator, and outcome. in the pathway →
Which question framework fits?
  • the overall idea: Research question
  • intervention question for a trial: PICO
  • add an explicit time horizon: PICOT
  • add study-design eligibility: PICOS
  • exposure question for observational work: PECO
Per-member-per-month costing (PMPM/PPPM)
Spend normalized by enrollment time, comparing populations with different follow-up at the budget level. in the pathway →
Modeling skewed real-world costs for HTA?
Per-protocol
Restricting analysis to those who followed the protocol, which answers the biological question but breaks randomization. in the pathway →
Which set of subjects do you analyze?
Persistence (time to discontinuation)
Duration from initiation to the first break in supply that exceeds the permissible gap. in the pathway →
How do raw fills become a defined exposure with a start and end?
Person-time
Each subject’s time under observation summed across the cohort, the denominator of an incidence rate. in the pathway →
What frequency are you trying to measure?
Perspective and the reference case
Whose costs count changes the answer, so a standardized reference case and impact inventory make analyses comparable, distinguishing healthcare-sector from societal perspectives. in the pathway →
Whose costs and benefits count?
PICO
Population, intervention, comparator, outcome: a framework forcing a clinical question to be specific enough to design around. in the pathway →
Which question framework fits?
  • the overall idea: Research question
  • intervention question for a trial: PICO
  • add an explicit time horizon: PICOT
  • add study-design eligibility: PICOS
  • exposure question for observational work: PECO
PICOS
PICO with an appended study design, the convention in systematic reviews. in the pathway →
Which question framework fits?
  • the overall idea: Research question
  • intervention question for a trial: PICO
  • add an explicit time horizon: PICOT
  • add study-design eligibility: PICOS
  • exposure question for observational work: PECO
PICOT
PICO with an appended timeframe, the convention in clinical-question teaching. in the pathway →
Which question framework fits?
  • the overall idea: Research question
  • intervention question for a trial: PICO
  • add an explicit time horizon: PICOT
  • add study-design eligibility: PICOS
  • exposure question for observational work: PECO
Placebo and falsification tests
Looking for an effect where none should exist, such as a pre-treatment period or unaffected outcome, to test whether a design is sound. in the pathway →
What situation is this?
Pocock boundary
An alpha-spending boundary that holds a constant threshold across interim looks. in the pathway →
Which interim-monitoring element?
Poisson distribution
The distribution of counts of rare events. in the pathway →
Which distribution or sampling result?
Poisson regression
A regression for counts returning a rate ratio, assuming the variance equals the mean. in the pathway →
Which regression for your outcome?
Positivity
The identifiability condition that every kind of unit could have received either treatment, also called overlap. in the pathway →
Which identifiability condition is at stake?
Posterior distribution
The updated distribution of a parameter after combining prior belief with the data. in the pathway →
Which Bayesian concept is in play?
Posterior predictive check
Asking whether data simulated from the fitted model resemble the real data. in the pathway →
Which sampling or diagnostic tool?
Potential outcomes and identifiability
A framework defining a causal effect as the contrast of outcomes under treatment and no treatment, with conditions for estimating it from data. in the pathway →
Which identifiability condition is at stake?
Potential-outcomes framework
Imagining for each unit the outcome under treatment and under no treatment, whose contrast is the causal effect. in the pathway →
Which identifiability condition is at stake?
Pre-registration
A public commitment on ClinicalTrials.gov or the Open Science Framework that locks the endpoint before data are unblinded. in the pathway →
Which endpoint or pre-specification concern?
Precision
The share of positive predictions that are correct, the same as positive predictive value. in the pathway →
Which classification metric?
Precision-recall curve
A more honest summary than ROC-AUC of classifier performance under class imbalance. in the pathway →
Which classification metric?
Prediction and machine learning
Flexible models for predicting rather than explaining, judged on out-of-sample error and calibration, not coefficient plausibility. in the pathway →
Which prediction concept is in play?
Prediction interval
The range a new study’s true effect might fall in, wider than the confidence interval and more honest under substantial heterogeneity. in the pathway →
Which pooling or heterogeneity tool?
Predictive values
What a positive or negative test result means for the patient in front of you, shifting with the prevalence of disease. in the pathway → \[\text{PPV} = \frac{\text{sens} \cdot \text{prev}}{\text{sens} \cdot \text{prev} + (1 - \text{spec}) \cdot (1 - \text{prev})}\] where \(\text{PPV}\) is the positive predictive value, the chance a positive result is a true case; \(\text{sens}\) is the sensitivity, the chance a true case tests positive; \(\text{spec}\) is the specificity, the chance a non-case tests negative; \(\text{prev}\) is the prevalence, the share of the tested population with the disease.
Which diagnostic-performance measure?
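The formula above can be evaluated directly to show how the same test gives very different positive predictive values as prevalence changes. The sensitivity, specificity, and prevalences below are hypothetical.

```python
def positive_predictive_value(sens: float, spec: float, prev: float) -> float:
    """PPV = sens*prev / (sens*prev + (1 - spec)*(1 - prev))."""
    true_pos = sens * prev                # share of the population testing true positive
    false_pos = (1 - spec) * (1 - prev)   # share of the population testing false positive
    return true_pos / (true_pos + false_pos)


# Same hypothetical test (90% sensitive, 90% specific) in two settings.
ppv_screening = positive_predictive_value(sens=0.90, spec=0.90, prev=0.01)
ppv_referral = positive_predictive_value(sens=0.90, spec=0.90, prev=0.30)
print(round(ppv_screening, 3), round(ppv_referral, 3))
```

At 1 percent prevalence roughly one positive in twelve is a true case; at 30 percent prevalence roughly four in five are.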
Prentice’s criteria
The formal test for whether a surrogate endpoint validly captures a treatment’s effect on the true clinical outcome. in the pathway →
Which endpoint or pre-specification concern?
Prevalence
The share of a population that has a condition at a point in time or over a window, reflecting both occurrence and duration. in the pathway →
What frequency are you trying to measure?
Primary endpoint
The outcome the sample size is built on and the headline claim is read against, with everything else secondary. in the pathway →
Which endpoint or pre-specification concern?
Principal component analysis
A dimensionality-reduction method finding the orthogonal directions of greatest variance. in the pathway →
What unlabeled-data structure are you finding?
Principal-stratum strategy
An intercurrent-event strategy restricting to those who would never have the event. in the pathway →
How do you handle intercurrent events?
PRISMA
The reporting checklist and flow diagram for systematic reviews. in the pathway →
Which study type are you reporting?
Privacy-preserving record linkage (tokenization)
Matching records across datasets using encrypted tokens instead of raw identifiers, so patients can be linked without revealing who they are. in the pathway →
Can this data actually answer my question?
Probabilistic sensitivity analysis
Propagating parameter uncertainty through a Monte Carlo simulation that draws each parameter from a distribution and reruns the model thousands of times. in the pathway →
How are you handling cost-effectiveness uncertainty?
Probability distributions
The theoretical distributions that model data and supply the reference for test statistics. in the pathway →
Which distribution or sampling result?
Probability sample
A sample giving every unit a known, nonzero chance of selection, the basis for generalizing to the population. in the pathway →
How do you draw the sample?
Propensity score
The probability of treatment given covariates, used to match or weight treated and untreated on measured confounders. in the pathway →
How do you estimate the causal effect?
Proportion of days covered (PDC)
Fraction of a period during which a patient had drug supply on hand, capping overlapping fills. in the pathway →
How do raw fills become a defined exposure with a start and end?
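One simple way to cap overlapping fills is to count each calendar day at most once. The sketch below does that; the fill records and function name are hypothetical, and it deliberately does not shift early refills forward, which some PDC conventions do.

```python
def proportion_of_days_covered(fills: list[tuple[int, int]], period_days: int) -> float:
    """Fraction of days in [0, period_days) with drug on hand.

    Each fill is (start_day, days_supply), with days counted from the
    start of the measurement period. A day covered by two fills counts once.
    """
    covered = set()
    for start_day, days_supply in fills:
        for day in range(start_day, start_day + days_supply):
            if 0 <= day < period_days:  # ignore supply outside the period
                covered.add(day)
    return len(covered) / period_days


# Hypothetical fills over a 90-day period: an early refill, then a gap.
fills = [(0, 30), (25, 30), (70, 30)]
pdc = proportion_of_days_covered(fills, period_days=90)
print(round(pdc, 3))
```

Days 0 to 54 and 70 to 89 are covered, so 75 of 90 days count, even though 90 days of supply were dispensed.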
Proportional hazards
The Cox-model assumption that the hazard ratio stays constant over follow-up, checked with scaled Schoenfeld residuals or a log-log survival plot. in the pathway →
Which model assumption to check?
PROSPERO
The register where a systematic review protocol is recorded before screening, keeping the review from becoming a search for the wanted result. in the pathway →
What situation?
Publication bias
Positive results being published while null ones vanish, inflating a pooled estimate and often visible as funnel-plot asymmetry. in the pathway →
Which bias is threatening the study?

Q

QALY
Quality-adjusted life year: time spent in a health state multiplied by a utility weight between zero (equivalent to death) and one (full health). in the pathway →
Which utility concept do you need?
QALYs and health-state utilities
The quality-adjusted life year multiplies time in a health state by a utility weight anchored between zero (death) and one (full health). in the pathway →
Which utility concept do you need?
Quantitative bias analysis
Methods that assign explicit numerical assumptions to bias and propagate them into adjusted estimates and intervals. in the pathway →
Need to detect hidden residual confounding?
Questionnaire and instrument design
Fixing before fieldwork what a survey can measure, through item wording, response format, administration mode, and branching. in the pathway →
Which questionnaire flaw is in play?

R

R-hat
A convergence statistic that should sit near 1 when MCMC chains started far apart have mixed. in the pathway →
Which sampling or diagnostic tool?
R-squared
The share of outcome variance a model explains, which climbs mechanically as predictors are added, so use the adjusted or out-of-sample version. in the pathway →
Which fit or error measure do you need?
Random forest
The standard bagging ensemble, averaging many trees trained on bootstrap resamples. in the pathway →
Which learner or ensemble fits?
Random-effects meta-analysis
A pooling model assuming the true effect varies across studies, adding between-study variance to each study’s own variance when forming its weight and so widening the interval. in the pathway →
Which pooling or heterogeneity tool?
Randomization and blinding
The schemes that assign trial arms and the safeguards, allocation concealment and blinding, that keep that assignment from being gamed or biased. in the pathway →
What allocation or masking concern?
Real-world causal-inference extensions
Methods extending propensity-score and g-methods to high-dimensional claims data and to treatment and censoring that vary over follow-up time. in the pathway →
How do causal methods scale to claims and time?
Real-world cost and HTA methods
Techniques for modeling skewed real-world costs and extrapolating trial data into health technology assessment decisions. in the pathway →
Modeling skewed real-world costs for HTA?
Recall
The share of true positives caught, the same as sensitivity. in the pathway →
Which classification metric?
Recall bias
Differential memory of past exposure between cases and controls. in the pathway →
Which bias is threatening the study?
Reference case
A standardized set of methods recommended by the Second Panel on Cost-Effectiveness in Health and Medicine, reported alongside any analysis so results are comparable. in the pathway →
Whose costs and benefits count?
Registries
Purpose-built data for one disease, deep but narrow. in the pathway →
Which data source or pitfall?
Regression discontinuity
A causal design exploiting a cutoff in an assignment variable, resting on the assumption that outcomes would vary continuously across the cutoff absent treatment. in the pathway →
Which quasi-experimental design fits?
Regression families
The principle that the outcome dictates the model, most being a generalized linear model of an outcome distribution plus a link function. in the pathway →
Which regression for your outcome?
Regularization
Penalizing model complexity to buy the right flexibility, through ridge, lasso, or elastic net. in the pathway →
Which concept or penalty?
Regulatory pathways and registration
The regulatory frame around a study informing a regulated decision, including FDA IND or IDE applications and mandatory ClinicalTrials.gov registration and results posting. in the pathway →
Which regulatory application?
Relative versus absolute
The communication choice of whether to lead with a relative effect, which can sound large, or an absolute effect, where benefit becomes concrete. in the pathway →
Which scale frames the effect?
Reliability
Reproducibility: measuring the same quantity again and getting the same answer. in the pathway →
Which measurement property are you assessing?
Reliability and validity
Two independent properties of a measurement: reproducibility on repeat, and whether it measures what it claims. in the pathway →
Which measurement property are you assessing?
Reliability ratio
The signal’s share of total variance, by which non-differential error attenuates a true slope. in the pathway → \[\lambda = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}\] where \(\lambda\) is the reliability ratio, the signal’s share of total variance; \(\sigma^2_{\text{true}}\) is the variance of the true values; \(\sigma^2_{\text{error}}\) is the variance of the measurement error.
What kind of measurement error?
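A small sketch of the attenuation described above: under non-differential error in the exposure, the slope you expect to observe is the true slope times the reliability ratio. The variances and slope are hypothetical.

```python
def reliability_ratio(var_true: float, var_error: float) -> float:
    """lambda = var_true / (var_true + var_error)."""
    return var_true / (var_true + var_error)


def attenuated_slope(true_slope: float, var_true: float, var_error: float) -> float:
    """Expected observed slope when the exposure carries non-differential error."""
    return true_slope * reliability_ratio(var_true, var_error)


# Hypothetical values: true exposure variance 4, error variance 1.
lam = reliability_ratio(var_true=4.0, var_error=1.0)
observed = attenuated_slope(true_slope=2.0, var_true=4.0, var_error=1.0)
print(lam, observed)
```

With a reliability ratio of 0.8, a true slope of 2.0 is expected to appear as 1.6, pulled toward the null.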
Reporting standards
Checklists like CONSORT, STROBE, PRISMA, and TRIPOD that make a study’s methods auditable by requiring the details that let a reader judge it. in the pathway →
Which study type are you reporting?
Research ethics and the IRB
Modern research ethics rests on the three Belmont principles and is enforced before a study starts by an institutional review board weighing risks against benefits. in the pathway →
Which ethics concept or body?
Research question
A study’s question written specifically enough to act on, using PICO or PECO to fix population, intervention or exposure, comparator, and outcome. in the pathway →
Which question framework fits?
  • the overall idea: Research question
  • intervention question for a trial: PICO
  • add an explicit time horizon: PICOT
  • add study-design eligibility: PICOS
  • exposure question for observational work: PECO
Restricted mean survival time
A survival summary that remains meaningful under non-proportional hazards and gives a number a patient can actually use. in the pathway →
Which survival concept?
Ridge regression
L2 regularization that shrinks coefficients toward zero. in the pathway →
Which concept or penalty?
Risk calculators and prediction tools
A model packaged for bedside use that carries its development population with it, so external validation and recalibration matter before its output drives action. in the pathway →
Risk difference
Absolute effect measure: the risk in the exposed group minus the risk in the unexposed group. in the pathway →
Which effect measure to report?
Risk ratio
Relative effect measure: the risk in the exposed group divided by the risk in the unexposed group. in the pathway →
Which effect measure to report?
Risk-of-bias appraisal
Scoring how a study’s design and conduct threaten its result domain by domain, using structured tools like RoB 2 for trials and ROBINS-I for observational studies. in the pathway →
Which risk-of-bias tool fits?
RMSE
Root mean squared error, the prediction error in the outcome’s own units that punishes large misses hardest. in the pathway →
Which fit or error measure do you need?
RoB 2
A structured tool for scoring risk of bias in randomized trials, domain by domain. in the pathway →
Which risk-of-bias tool fits?
ROBINS-I
A structured tool for scoring risk of bias in observational studies, domain by domain. in the pathway →
Which risk-of-bias tool fits?
Robust standard errors
Heteroscedasticity-robust (sandwich) standard errors, the modern default for non-constant variance. in the pathway →
Which model assumption to check?
Robust statistics for heavy tails
Median-based summaries and MAD-scaled z-scores that resist the outliers which dominate means and standard deviations in heavy-tailed data. in the pathway →
Which robust measure?
Robust z-score
A z-score built from the median and MAD so extreme points no longer set the scale. in the pathway → \[z = \frac{x - \text{median}}{1.4826 \times \text{MAD}}\] where \(z\) is the robust z-score for a value; \(x\) is the value being scored; \(\text{median}\) is the median of the data, the robust center; \(\text{MAD}\) is the median absolute deviation, the robust spread; \(1.4826\) rescales the MAD so that it estimates the standard deviation when the data are normally distributed.
Which robust measure?
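The formula above translates directly into a few lines using only the standard library. The sample values are hypothetical, and the zero-MAD guard is an added assumption about how to handle constant data.

```python
from statistics import median


def robust_z_scores(values: list[float]) -> list[float]:
    """Robust z-scores: (x - median) / (1.4826 * MAD)."""
    center = median(values)
    mad = median([abs(v - center) for v in values])  # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    scale = 1.4826 * mad
    return [(v - center) / scale for v in values]


# Hypothetical lengths of stay in days, with one extreme value.
stays = [10, 12, 11, 13, 12, 11, 95]
z = robust_z_scores(stays)
print([round(score, 2) for score in z])
```

The median is 12 and the MAD is 1, so the extreme value scores about 56 while the ordinary values stay within about 1.3 of zero, which a mean-and-SD z-score would not show as clearly because the outlier inflates the standard deviation.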
Rosenbaum bounds
A method quantifying how much unmeasured confounding would overturn a result in matched designs, analogous to the E-value. in the pathway →
How do you quantify unmeasured bias?

S

Safe Harbor
A HIPAA de-identification method that strips eighteen specified identifiers from a dataset. in the pathway →
Which rule or method?
Safety and adverse-event analysis
Tabulating adverse events by type and severity on the safety population, compared as risk differences or exposure-adjusted rates, deliberately not corrected for multiplicity. in the pathway →
Which safety analysis element?
Safety population
Everyone who received any treatment, the set on which adverse events are counted, rather than the randomized set. in the pathway →
Which safety analysis element?
Sampling bias
A sample that does not represent the target population. in the pathway →
Which bias is threatening the study?
Schoenfeld residuals
Scaled residuals used to check the proportional-hazards assumption of a Cox model. in the pathway →
Which model assumption to check?
Selection bias
Bias from who ends up in the analysis, including sampling, volunteer, nonresponse, attrition, Berkson’s, healthy-worker, and survivorship variants. in the pathway →
Which bias is threatening the study?
Self-controlled case series (SCCS)
Models event rates across exposed and unexposed time within each affected person, removing all time-fixed within-person confounding. in the pathway →
Which observational design fits the question and dominant bias?
Sensitivity
The proportion of truly diseased patients a test correctly identifies as positive. in the pathway →
Which diagnostic-performance measure?
Sensitivity analysis
Pre-specified analyses that deliberately vary the assumptions most likely to be challenged and report what happens, more credible than analyses run only after review. in the pathway →
What robustness situation are you in?
SHAP
An interpretability tool that attributes each prediction to its input features, partly restoring insight into flexible predictive models. in the pathway →
Which prediction concept is in play?
Simon’s two-stage design
A small single-arm phase II design that stops early when first-stage responses are too few to continue. in the pathway →
Which early-phase design question are you facing?
Simple random sampling
Drawing from one frame with equal selection probability. in the pathway →
How do you draw the sample?
Skewness
A summary of a distribution’s asymmetry, part of reading its shape. in the pathway →
What shape feature?
SNOMED
A clinical coding ontology used for claims and records. in the pathway →
Which data standard or provenance layer?
Societal perspective
A costing viewpoint that adds patient time, caregiving, and lost productivity to medical costs, which can flip the verdict for some conditions. in the pathway →
Whose costs and benefits count?
Sparse data and resampling
Methods for small cell counts or rare events, where standard likelihood is unstable and resampling or exact procedures give trustworthy estimates and intervals. in the pathway →
Are cells sparse or analytic standard errors doubtful?
Spearman correlation
A measure of monotone association between two continuous variables. in the pathway →
Which two-variable association are you testing?
Specification-curve analysis
Re-estimating a result across the many defensible modeling choices to show whether a conclusion holds broadly or only along one path. in the pathway →
How are you probing specification robustness?
Specificity
The proportion of truly disease-free patients a test correctly identifies as negative. in the pathway →
Which diagnostic-performance measure?
Spectrum bias
Inflated test accuracy when cases are floridly sick and controls plainly well, so accuracy at a referral center overstates that in primary care. in the pathway →
Which accuracy measure or pitfall?
SPIRIT
The reporting standard for a trial protocol, the protocol counterpart to the CONSORT checklist for the finished trial. in the pathway →
What situation?
Splines
Restricted cubic or natural splines that fit a smooth piecewise curve at a few knots, modeling nonlinearity more stably than high-order polynomials. in the pathway →
How do you flex the model?
Standard error
The spread of a sample mean, shrinking with the square root of sample size, so quadrupling n halves it. in the pathway → \[\text{SE} = \frac{\sigma}{\sqrt{n}}\] where \(\text{SE}\) is the standard error of the sample mean; \(\sigma\) is the standard deviation of a single observation; \(n\) is the number of observations averaged.
Which distribution or sampling result?
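A quick check of the square-root rule in the formula above, with a hypothetical standard deviation and sample sizes.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


# Hypothetical standard deviation of 10 units.
se_25 = standard_error(sigma=10.0, n=25)
se_100 = standard_error(sigma=10.0, n=100)
print(se_25, se_100)
```

Going from 25 to 100 observations quadruples n and halves the standard error, from 2.0 to 1.0.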
Standardized mortality ratio
The ratio of observed to expected events used in indirect age-standardization. in the pathway →
What frequency are you trying to measure?
Statistical programming and TFLs
Delivering analysis as pre-specified tables, figures, and listings, with credibility enforced by independent double-programming reconciled value by value. in the pathway →
What situation?
Stochastic uncertainty
First-order random variation between otherwise identical individuals, the noise a microsimulation has to average out. in the pathway →
Which source of uncertainty?
Stratified analysis
Controlling confounding by splitting data on the confounder, estimating within each stratum, and pooling the estimates. in the pathway →
Which stratified-analysis step are you at?
Stratified randomization
Randomization that balances a few strong prognostic factors within strata. in the pathway →
What allocation or masking concern?
Stratified sampling
Splitting the frame into strata and sampling within each, allowing precise oversampling of a small subgroup at the cost of unequal selection probabilities. in the pathway →
How do you draw the sample?
Strength of recommendation
How firmly a guideline body is willing to speak, signaled by ACC/AHA class and level of evidence or GRADE’s strong-versus-conditional split, which should track the certainty of evidence. in the pathway →
How strong is the recommendation?
STROBE
The reporting checklist for observational studies. in the pathway →
Which study type are you reporting?
Structural uncertainty
Uncertainty in a model’s own form (which states exist, which functional form applies), often larger than parameter uncertainty yet routinely ignored. in the pathway →
Which source of uncertainty?
Study biases, by rung
A family of biases mapped to the rung where each enters, spanning selection, information, confounding, synthesis, and screening biases. in the pathway →
Which bias is threatening the study?
SUCRA
Surface under the cumulative ranking curve, summarizing a treatment’s rank where 100 percent is certainly best and 0 percent certainly worst. in the pathway →
Which network meta-analysis concern?
Supervised and unsupervised learning
The split in machine learning by whether the data carry an outcome label. in the pathway →
Are outcome labels available?
Supervised learning
Learning to predict a known target label such as a diagnosis, cost, or survival time. in the pathway →
Are outcome labels available?
Support vector machine
A classifier finding the widest-margin boundary between classes, using a kernel to bend it nonlinearly. in the pathway →
Which learner or ensemble fits?
Surrogate endpoint
A lab marker or scan standing in for a clinical outcome, trustworthy only once validated to capture the treatment’s effect on what patients feel. in the pathway →
Which endpoint or pre-specification concern?
Survey data
A probability sample built for population estimates that generalizes well once its weights and design are respected. in the pathway →
Which data source or pitfall?
Survey sampling design
The probability-sampling scheme by which a sample is drawn so it can generalize to the population. in the pathway →
How do you draw the sample?
Survey skip patterns
Branching where a gate question routes a respondent past inapplicable items, so a skipped item is blank by design rather than missing. in the pathway →
Which skip-logic element is in play?
Survey weight
The factor correcting for unequal selection so a sample represents the population it was drawn from. in the pathway →
Which weighting or design adjustment?
Survival extrapolation for HTA
Fitting parametric or flexible models to observed survival and projecting beyond the trial horizon. in the pathway →
Modeling skewed real-world costs for HTA?
Survivorship bias
Bias from studying only the units that lasted long enough to be observed. in the pathway →
Which bias is threatening the study?
SUTVA
The stable-unit-treatment-value assumption that one unit’s treatment does not affect another’s outcome, ruling out interference or spillover. in the pathway →
Which identifiability condition is at stake?
Synthetic control
A causal design that constructs a comparison unit as a weighted combination of untreated units, to neutralize a dominant threat to inference. in the pathway →
Which quasi-experimental design fits?
Synthetic data
New records drawn from a generative model fit to real data, reproducing the joint distribution without copying individuals, needing privacy and fidelity audits. in the pathway → One such generator, a flow-based model, moves noise toward data along \[\frac{dx}{dt} = v(x,t)\] where \(x\) is the point being transported from the noise distribution toward the data distribution; \(t\) is time along the continuous path, running from 0 to 1; \(v(x,t)\) is the learned velocity field that moves \(x\) at each point and time.
Which rule or method?

T

T-test
A test comparing a continuous outcome between two groups, equivalent to a linear regression on a binary indicator. in the pathway →
Which two-variable association are you testing?
Target-trial emulation
Imagining the randomized trial you would have run, writing its protocol, then building the observational analysis to match it. in the pathway →
Which target-trial element?
Tau-squared
The between-study variance, added to each study’s within-study variance when computing its weight in a random-effects meta-analysis, often estimated by the DerSimonian-Laird method. in the pathway →
Which pooling or heterogeneity tool?
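A minimal sketch of the DerSimonian-Laird moment estimator, assuming each study supplies an effect estimate and its within-study variance. The function names and the three-study input are hypothetical.

```python
def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of the between-study variance."""
    w = [1.0 / v for v in variances]              # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)  # truncated at zero

def random_effects_pooled(effects, variances):
    """Pooled estimate using weights 1 / (within-study variance + tau2)."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2

# Hypothetical effects (for example, log odds ratios) and their variances.
pooled, tau2 = random_effects_pooled([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
```

A larger tau-squared flattens the weights across studies, because the shared between-study term dominates the differences in within-study precision.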
TFLs
Tables, figures, and listings: the programmed outputs of an analysis, whose shells are pre-specified in the statistical analysis plan. in the pathway →
What situation?
The evidence-recommendation gap
The distance between how firmly a guideline is worded and the actual support beneath it, whether an extrapolated threshold, a single trial, or mere expert consensus. in the pathway →
What recommendation situation are you in?
The statistical analysis plan
The document pre-committing, before unblinding, exactly how the primary question will be answered, which is what makes a confirmatory analysis confirmatory. in the pathway →
What situation?
The study protocol (SPIRIT)
The master plan every other document hangs from, covering objectives, eligibility, intervention, outcomes, sample size, analysis, ethics, and dissemination. in the pathway →
What situation?
Threshold analysis
An analysis finding the input value at which a decision flips. in the pathway →
Which source of uncertainty?
Thresholds and cut points
Turning a continuous risk or measurement into a yes/no action, a convenient but lossy choice that trades sensitivity against specificity and encodes a value judgment. in the pathway →
Where to set the decision cutoff?
Time-varying confounding
When a confounder is itself affected by past treatment, breaking ordinary adjustment and requiring g-methods. in the pathway →
How do you handle time-varying confounding?
TMLE
Targeted maximum likelihood estimation, a doubly-robust estimator combining a propensity and an outcome model. in the pathway →
How do you estimate the causal effect?
Traceability
The rule that every analysis value trace back through ADaM to its SDTM source and original case-report form. in the pathway →
What are you building or tracing?
Transitivity
The assumption that trials are similar enough in populations and methods that an indirect comparison through a common comparator is valid. in the pathway →
Which network meta-analysis concern?
Treatment-confounder feedback
When a confounder both responds to past treatment and guides the next, common in chronic-disease cohorts. in the pathway →
How do you handle time-varying confounding?
Treatment-policy strategy
An intercurrent-event strategy that counts the outcome regardless of the event, in the intention-to-treat spirit. in the pathway →
How do you handle intercurrent events?
Trial estimands and intercurrent events
A trial’s precise question, stated under the ICH E9 R1 framework, with a named strategy for events that occur after randomization. in the pathway →
How do you handle intercurrent events?
TRIPOD
The reporting checklist for prediction models. in the pathway →
Which study type are you reporting?
Two-part and other cost models
Models separating whether cost occurred from how much, plus robust GLMs for skewed cost data. in the pathway →
Modeling skewed real-world costs for HTA?
Type-I error
The false-positive rate, which unplanned peeking at accumulating data inflates. in the pathway →
Which interim-monitoring element?
Types of uncertainty
Naming the kinds of uncertainty (parameter, stochastic, heterogeneity, structural) because each needs different tools to handle honestly. in the pathway →
Which source of uncertainty?

U

Uncertainty and inference
Reporting the range compatible with the data via confidence intervals, accounting for clustering, since statistical significance is not clinical importance. in the pathway →
How to express estimate uncertainty?
Uncertainty in cost-effectiveness (PSA)
Methods showing how fragile an ICER is, from one-way and tornado analyses to probabilistic sensitivity analysis propagating parameter uncertainty through Monte Carlo simulation. in the pathway →
How are you handling cost-effectiveness uncertainty?
Unsupervised learning
Finding structure in data with no outcome label, through clustering or dimensionality reduction. in the pathway →
What unlabeled-data structure are you finding?

V

Validity
Whether an instrument measures what it claims, through content, construct, and criterion validity. in the pathway →
Which measurement property are you assessing?
Value of information (EVPI)
Pricing the decision uncertainty that remains, where the expected value of perfect information is the expected loss from deciding under current uncertainty. in the pathway →
What is more evidence worth?
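As a sketch of the calculation, assuming net-benefit values for each strategy are already available from probabilistic sensitivity analysis draws, EVPI is the mean of the per-draw best value minus the best of the per-strategy means. The strategy names and numbers below are invented.

```python
def evpi(net_benefit_draws):
    """net_benefit_draws maps each strategy to its net benefit in every draw."""
    strategies = list(net_benefit_draws.values())
    n_draws = len(strategies[0])
    # Decide now: commit to the strategy with the best expected net benefit.
    value_current = max(sum(nb) / n_draws for nb in strategies)
    # Decide with perfect information: pick the best strategy in each draw.
    value_perfect = sum(max(draw) for draw in zip(*strategies)) / n_draws
    return value_perfect - value_current

# Hypothetical net benefits over four simulated parameter draws.
draws = {
    "new_drug": [10.0, 20.0, 30.0, 40.0],
    "usual_care": [25.0, 25.0, 25.0, 25.0],
}
value = evpi(draws)
```

Here both strategies have the same expected net benefit, yet knowing the true parameters would change the choice in half the draws, so the information still has positive value.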
Variance inflation factor
A diagnostic for multicollinearity, measuring how much a coefficient’s variance is inflated by that predictor’s correlation with the other predictors. in the pathway →
Which model assumption to check?
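The general formula is VIF = 1 / (1 − R²), where R² comes from regressing one predictor on all the others. The sketch below assumes the two-predictor special case, where that R² is just the squared Pearson correlation; the function names and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictors with correlation 0.6.
vif = vif_two_predictors([1, 2, 3, 4], [2, 1, 4, 3])
```

With more than two predictors the R² must come from a full auxiliary regression, which this sketch does not attempt.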
Verification
Checking whether a model is coded correctly, that the implementation does the math intended. in the pathway →
What situation?
Verification bias
Bias arising when only test-positive patients go on to receive the reference standard. in the pathway →
Which accuracy measure or pitfall?

W

Weakly-informative prior
A prior that gently regularizes without committing to much. in the pathway →
What kind of prior do you need?
Weighted kappa
A kappa that credits near-misses on an ordinal scale. in the pathway →
Which measurement property are you assessing?
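A minimal sketch of weighted kappa from a square rater-by-rater table, assuming disagreement weights that grow with the distance between ordinal categories. The function name and the `power` parameter are illustrative choices: 1 gives linear weights and 2 gives quadratic weights.

```python
def weighted_kappa(table, power=1):
    """Weighted kappa; table[i][j] counts items rated i by one rater, j by the other."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal, 1 at the extremes
            observed += w * table[i][j] / n
            expected += w * row_tot[i] * col_tot[j] / n ** 2
    return 1.0 - observed / expected

# Hypothetical 3-category tables.
perfect = weighted_kappa([[5, 0, 0], [0, 5, 0], [0, 0, 5]])
near_miss = weighted_kappa([[4, 1, 0], [1, 3, 1], [0, 1, 4]])
```

Because a one-category disagreement carries a smaller weight than a two-category one, raters who miss by a single step are penalized less than raters who miss by the full width of the scale.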
Willingness-to-pay threshold
The benchmark amount a payer will pay per unit of benefit, against which an incremental cost-effectiveness ratio is judged. in the pathway →
Which economic-evaluation framing fits?
Winsorization and trimming of cost outliers
Capping or dropping extreme cost values so a few catastrophic claims do not dominate the mean. in the pathway →
Modeling skewed real-world costs for HTA?
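The difference between the two approaches can be sketched in a few lines, assuming the cap is supplied directly rather than derived from a percentile. The function names, cost values, and cap are hypothetical.

```python
def winsorize_upper(costs, cap):
    """Keep every record but pull values above the cap down to the cap."""
    return [min(c, cap) for c in costs]

def trim_upper(costs, cap):
    """Drop records whose cost exceeds the cap."""
    return [c for c in costs if c <= cap]

def mean(values):
    return sum(values) / len(values)

# Hypothetical claims with one catastrophic cost.
costs = [100, 200, 300, 400, 100000]
cap = 1000

raw_mean = mean(costs)
winsorized_mean = mean(winsorize_upper(costs, cap))
trimmed_mean = mean(trim_upper(costs, cap))
```

Winsorizing keeps the sample size intact while trimming shrinks it. Both change the estimand: if the extreme claims are real, the adjusted mean no longer estimates the mean cost.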
Woolf’s method
An inverse-variance method for pooling stratum-specific log odds ratios into a single summary estimate. in the pathway →
Which stratified-analysis step are you at?
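A minimal sketch of the pooling step, assuming the association measure is the odds ratio and every 2×2 cell is non-zero (the variance formula breaks down otherwise). The function name, the cell ordering, and the counts are illustrative.

```python
from math import exp, log, sqrt

def woolf_pooled_or(strata):
    """strata: list of (a, b, c, d) with a, b = exposed cases and non-cases
    and c, d = unexposed cases and non-cases. Returns (OR, lower, upper)."""
    num = den = 0.0
    for a, b, c, d in strata:
        log_or = log((a * d) / (b * c))
        var = 1 / a + 1 / b + 1 / c + 1 / d   # variance of the log odds ratio
        weight = 1 / var                      # inverse-variance weight
        num += weight * log_or
        den += weight
    pooled = num / den
    se = sqrt(1 / den)
    return exp(pooled), exp(pooled - 1.96 * se), exp(pooled + 1.96 * se)

# Two hypothetical strata that share an odds ratio of 4.
pooled_or, lower, upper = woolf_pooled_or([(10, 20, 5, 40), (10, 20, 5, 40)])
```

With sparse strata the Mantel-Haenszel estimator is often preferred, since it tolerates zero cells that this approach cannot.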

Z

Zero-inflated model
A count model mixing a structural-zero process with a count process when zeros pile up. in the pathway →
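As a sketch of the mixture, assuming the count process is Poisson (a zero-inflated Poisson model), a zero can arise either from the structural-zero process with probability `pi` or from the Poisson process itself. The parameter names and values are illustrative.

```python
from math import exp, factorial

def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson with structural-zero share pi."""
    poisson = exp(-lam) * lam ** k / factorial(k)
    if k == 0:
        return pi + (1 - pi) * poisson   # structural zeros plus sampling zeros
    return (1 - pi) * poisson

# Hypothetical parameters: 30% structural zeros, Poisson mean 2.
p_zero = zip_pmf(0, 0.3, 2.0)
p_zero_plain = zip_pmf(0, 0.0, 2.0)      # ordinary Poisson for comparison
total = sum(zip_pmf(k, 0.3, 2.0) for k in range(60))
```

The excess of `p_zero` over `p_zero_plain` is the pile-up of zeros that an ordinary Poisson model would fail to reproduce.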


Two ways to take this further: