Sampling describes the process of selecting a group of individuals (or observations) from a particular population.
Research based on a representative sample of the larger population increases the generalizability of study results. Findings from sample-based studies cannot be reliably extrapolated to the population at large unless the sample adequately represents the broader universe.
Last, J. A Dictionary of Epidemiology. 4th ed. Oxford: Oxford University Press (2001).
Typically represents direct medical cost savings resulting from reduced health care resource utilization and/or indirect savings resulting from improvements in productivity or reduction in disability or workers’ compensation. Savings is one measure used to demonstrate the financial impact of population health management programs.
Savings are usually calculated using an actuarially adjusted trended comparison methodology or a multivariate regression methodology. A robust and rigorous study design is crucial to derivation of the observations that are used in the savings calculation. Savings may be expressed in terms of total dollars or at the person level (e.g., dollars per member per month, dollars per chronic member per month, dollars per participant, dollars per eligible). Estimated savings are often used to calculate return on investment, a ratio of savings to costs.
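As a purely hypothetical sketch of the trended-comparison approach described above (every figure below is invented for illustration), a savings and return-on-investment calculation might look like this:

```python
# Hypothetical trended-comparison savings calculation.
# All figures (baseline PMPM, trend factor, member months, fees) are illustrative.

baseline_pmpm = 310.00    # baseline-period cost per member per month
trend_factor = 1.08       # actuarially estimated cost trend (8%)
actual_pmpm = 322.00      # observed PMPM in the measurement period
member_months = 120_000   # exposure in the measurement period
program_fees = 900_000.00 # total program cost

expected_pmpm = baseline_pmpm * trend_factor  # what costs "would have been"
savings_pmpm = expected_pmpm - actual_pmpm    # per-member-per-month savings
total_savings = savings_pmpm * member_months
roi = total_savings / program_fees            # return on investment ratio

print(f"expected PMPM: {expected_pmpm:.2f}")   # 334.80
print(f"savings PMPM:  {savings_pmpm:.2f}")    # 12.80
print(f"total savings: {total_savings:,.0f}")  # 1,536,000
print(f"ROI:           {roi:.2f}")             # 1.71
```

A real calculation would also risk-adjust the trend and restrict the comparison to a consistently defined population, as the guidelines cited below discuss.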
Refer to the Care Continuum Alliance Outcomes Guidelines, Volume 5 for further detail on estimating savings.
Johns Hopkins/American Healthways. “Standard Outcome Metrics and Evaluation Methodology for Disease Management Programs.” Disease Management 6(3)(Fall 2003) 121-138.
Care Continuum Alliance. Outcomes Guidelines Report. Vol. 5. Washington, DC: Care Continuum Alliance (2010).
Secondary payer refers to situations where an individual has insurance coverage provided by more than one health insurance plan. When a patient is covered by two payers (e.g., health plans, public programs, or insurance policies), one insurer is defined as having the primary responsibility for paying claims (the “primary payer”), while the other pays its share of the remaining charges eligible under its benefit plan. The Medicare program, for example, does not always have primary responsibility for paying a beneficiary’s medical expenses.
Hence, for an individual with both private and Medicare insurance, the private payer should cover (i.e., pay) claims for its member before Medicare, which is considered the secondary payer. The private payer would cover all expenses up to its normal benefits limits, and then Medicare would pay that portion of the remaining expenses that are eligible under Medicare benefits.
Medicare secondary payer (MSP) refers to situations where the Medicare program does not have primary responsibility for paying a beneficiary’s medical expenses. The Medicare beneficiary may be entitled to other coverage, which should pay before Medicare.
Through a series of regulations, Medicare has shifted costs to private sources of payment, such as group health plans, liability insurance (including self-insurance), no-fault insurance, or workers’ compensation where they are in place (for example, for active employees age 65 and over). This program of shifting costs is referred to as the Medicare Secondary Payer (MSP) program. Under MSP, employer-sponsored health plans may be the primary payers of benefits when individuals are covered by both an employer plan and Medicare.
See also coordination of benefits and primary payer.
“Coordination of Benefits (COB).” Dec. 2004. 16 Feb. 2006 http://wedi.org/cmsUploads/pdfUpload/WhitePaper/pub/COBWhitePaper200412.pdf
Glossary: Centers for Medicare and Medicaid Services. US Department of Health and Human Services. 9 Feb. 2006 http://www.cms.hhs.gov/apps/glossary/default.asp?Letter=S&Language=English
“Medicare Secondary Payer (MSP) Manual Rev 65.” March 2009. http://www.cms.gov/manuals/downloads/msp105c01.pdf
This term refers to care undertaken by the patient him- or herself or provided by the patient’s family members, friends, or others in the community who are not medical professionals. In population health management, it relates to patient education and the empowerment that allows the individual (or family) to manage his or her own health/illness based on evidence-based medicine without continuous medical or other professional consultation. It may also be called non-professional care.
Population health management programs provide tools such as nurse call lines, pamphlets, newsletters, videotapes, CDs, and online services to educate patients and their families. Self-care can be powerful and effective when patients are provided with sufficient tools via care management. Once the patient or family is able to take greater control of their own health, outcomes have been observed to improve and high-cost utilization to decrease. Feedback to physicians by the patients themselves or via a population health management program is also included in the self-care context.
See also health status (self-reported).
Informal Care. 1998. 23 Feb. 2006 http://www.ecosa.org/csi/glossary.nsf/termlinks/Informal_care?opendocument
Kipp, R., Towner, C., and Levin, H. In Todd, W. and Nash, D., eds. Disease Management: A Systems Approach to Improving Patient Outcomes. Jossey-Bass (2001).
The probability of correctly identifying an individual with a specific condition or characteristic based on results of a test, diagnostic process, or algorithm.
Sensitivity is computed from the number of individuals who have the condition/characteristic of interest and are identified as such (true positives) and the number of individuals who have the condition/characteristic but are not identified (false negatives). The higher the sensitivity, the more likely an individual with the condition/characteristic of interest is to be correctly identified (i.e., more true positives, fewer false negatives). There is a trade-off between sensitivity and specificity.
Sensitivity = True positives / (True positives + False negatives)
See also predictive value and specificity.
Hennekens C, Buring J. Epidemiology in Medicine. Boston: Little, Brown & Company (1987).
The probability of correctly identifying an individual who does not have a specific condition or characteristic based on results of a test, diagnostic process, or algorithm.
Specificity is computed from the number of individuals who do not have the characteristic of interest and are identified as such (true negatives) and the number of individuals who do not have the characteristic but are identified with the condition (false positives). The higher the specificity, the more likely an individual without the characteristic/condition of interest is to be correctly identified (i.e., more true negatives, fewer false positives). There is a trade-off between sensitivity and specificity.
Specificity = True negatives / (True negatives + False positives)
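Both formulas can be computed directly from the four cells of a 2×2 confusion matrix; the counts below are invented for illustration:

```python
# Sensitivity and specificity from a 2x2 confusion matrix.
# Counts are illustrative, not from any real screening program.

tp, fn = 80, 20   # have the condition: correctly identified / missed
tn, fp = 850, 50  # do not have it: correctly ruled out / flagged in error

sensitivity = tp / (tp + fn)  # P(identified | has the condition)
specificity = tn / (tn + fp)  # P(ruled out | does not have the condition)

print(f"sensitivity = {sensitivity:.2f}")  # 0.80
print(f"specificity = {specificity:.2f}")  # 0.94
```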
See also predictive value and sensitivity.
Hennekens C, Buring J. Epidemiology in Medicine. Boston: Little, Brown & Company (1987).
Provision of financial or another measurable type of support for a program or research project.
Standard Error of Measurement (SEM)
Standard error is the expected variability in the difference between a true score (actual value in the person being tested) and a test score (the value obtained through a measurement device).
In the context of population health management, providers and others in the position of evaluating changes in an individual’s or population’s clinical outcomes should consider whether differences are due to true change (for example, the member actually lowered his cholesterol) or to SEM (for example, the individual’s cholesterol level has not changed, but a slightly lower value is reported because of measurement error in a particular testing situation).
The standard error of measurement is often denoted σmeas. The formula for calculating the standard error of measurement is:

Sm = S √(1 − R)

Where Sm = the standard error of measurement (σmeas)
S = the standard deviation of the scores
R = the reliability coefficient of the test instrument
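Applying the formula Sm = S √(1 − R) with illustrative values for S and R:

```python
import math

# Standard error of measurement: Sm = S * sqrt(1 - R).
# The values of S and R below are illustrative.

s = 12.0  # standard deviation of the observed scores
r = 0.91  # reliability coefficient of the test instrument

sem = s * math.sqrt(1 - r)
print(f"SEM = {sem:.2f}")  # 3.60
```

A highly reliable instrument (R close to 1) yields a small SEM, so less of an observed change need be attributed to measurement error.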
Baumgartner, T., et al. Measurement for Evaluation in Physical Education & Exercise Science, 7th ed. New York: McGraw-Hill (2002).
A statistic is a number or value that is computed from a sample (in contrast to a parameter, which is a measure relating to the entire population). Statistics computed from a random sample are subject to random variation or sampling error.
Definition: Statistic. 23 Feb. 2006 http://www.cmh.edu/stats/definitions/statistic.htm
Statistical significance is a concept used to describe how likely the observed results would be to occur by chance alone. In making statistical calculations, we set a significance level, also called the Type I error rate or α (the complement of the confidence level). The significance level is the probability of concluding there is a difference between two groups when the groups are not actually different. A significance level of .05 or .01 is generally used. To determine whether a finding is statistically significant, we calculate a P value for the test statistic and compare it with the value of α.
The P value numerically depicts the probability that a test statistic falls within the range that would have been observed if the null hypothesis (i.e., no difference between the true value and the test value) were true. In other words, the P value states the probability that the difference observed in a comparison of two groups could have occurred if the groups were actually alike. The usual notation is the letter P followed by a decimal value such as 0.01 or 0.05, or the abbreviation n.s. (not significant). The symbol < (less than) a specific decimal is also used. In most population health management evaluations, a study result with a probability value less than 5% (P < 0.05) or 1% (P < 0.01) is considered statistically significant, meaning that the result is sufficiently unlikely to have occurred by chance.
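One way to make the P value concrete, without assuming any particular distribution, is a permutation test: it estimates how often a difference in group means at least as large as the observed one would arise if group labels were arbitrary. The data below are simulated, illustrative values:

```python
import random

# Two-sided permutation test for a difference in group means.
# Group values are invented for illustration.

random.seed(7)
group_a = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2, 4.7, 5.4]  # e.g., intervention
group_b = [6.4, 6.1, 7.0, 5.8, 6.6, 6.9, 7.2, 6.0]  # e.g., comparison

observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
pooled = group_a + group_b

# Count how many random relabelings produce a difference at least as extreme.
extreme = 0
n_perm = 10_000
for _ in range(n_perm):
    random.shuffle(pooled)
    a, b = pooled[:len(group_a)], pooled[len(group_a):]
    if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
        extreme += 1

p_value = extreme / n_perm
print(f"P = {p_value:.4f}")  # a small P means the difference is unlikely under the null
```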
See also confidence intervals and statistical tests.
Last, J. A Dictionary of Epidemiology, 4th ed. Oxford: Oxford University Press (2001).
Spiegel, M. et al. Schaum's Outline of Probability and Statistics, 2nd ed. New York: McGraw-Hill (2000).
A statistical test provides a procedure for accepting or rejecting a hypothesis or assertion about a population that is being studied.
Cumming’s Prediction Measure (CPM)
CPM = 1 – (mean absolute prediction error) / (mean absolute deviation from average)
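Applying the CPM formula above to a small set of invented actual and predicted costs:

```python
# Cumming's Prediction Measure:
# CPM = 1 - (mean absolute prediction error) / (mean absolute deviation from average)
# Costs and predictions below are illustrative.

actual = [100.0, 250.0, 400.0, 150.0, 600.0]
predicted = [120.0, 230.0, 380.0, 200.0, 550.0]

mean_actual = sum(actual) / len(actual)
mape = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
mad = sum(abs(a - mean_actual) for a in actual) / len(actual)

cpm = 1 - mape / mad
print(f"CPM = {cpm:.3f}")  # 0.800
```

A CPM of 0 means the predictions do no better than simply using the population average; values closer to 1 indicate better predictive performance.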
R-square is a measure of the fit of a least-squares model such as a linear regression. Essentially, it measures the proportion (or percent) of variation in the dependent variable that is explained by the model. The denominator in this measure is essentially the variance of the dependent variable multiplied by the number of observations minus one (i.e., its degrees of freedom):

R-square = SS(Model) / SS(Total), where SS(Model) represents the variation of the dependent variable that is explained by the model and SS(Total) is the total variation of the dependent variable.
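A minimal sketch, with invented data points, of computing R-square for a simple least-squares line fit by hand:

```python
# R-square as explained variation over total variation, for an
# ordinary least-squares line fit by hand. Data are illustrative.

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least-squares slope and intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

ss_total = sum((y - mean_y) ** 2 for y in ys)              # total variation
ss_resid = sum((y - (slope * x + intercept)) ** 2          # unexplained variation
               for x, y in zip(xs, ys))
r_square = 1 - ss_resid / ss_total

print(f"R-square = {r_square:.4f}")  # 0.9980
```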
See sensitivity and specificity.
Cumming, R., Knutson, D., Cameron, B., and Derrick, B. “A Comparative Analysis of Claims-based Methods of Health Risk Assessment for Commercial Populations.” Schaumburg, IL: Society of Actuaries (2002).
SOA Home Page. Society of Actuaries. 16 Feb. 2006 http://www.soa.org/ccm/content/
Spiegel, M. et al. Schaum’s Outline of Probability and Statistics, 2nd ed. New York: McGraw-Hill (2000).
In population health management, stratification has two meanings:
- Stratification is a method of randomization that is frequently used to prevent chance imbalances in important factors between groups in a randomized controlled trial. If another variable is known or suspected to influence the outcome of interest, then equal numbers of members from each stratum of that variable are allocated into the different groups to prevent that variable from confounding the association between the outcome variable and the exposure(s) of interest.
- Stratification is a process for sorting a population of eligible members into groups relating to their relative need for total population management interventions. Stratification may integrate a variety of data, if available, including claims, pharmacy, laboratory, health risks, or consumer-reported data.
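A minimal sketch of the first meaning, stratified random assignment, using hypothetical member IDs and condition strata (one simple scheme among many):

```python
import random
from collections import defaultdict

# Stratified randomization sketch: within each stratum of a known prognostic
# factor (here, chronic condition), members are shuffled and alternately
# assigned so the factor cannot confound the group comparison.
# Member IDs and strata are hypothetical.

members = [
    ("m01", "diabetes"), ("m02", "diabetes"), ("m03", "diabetes"), ("m04", "diabetes"),
    ("m05", "asthma"), ("m06", "asthma"), ("m07", "asthma"), ("m08", "asthma"),
]

random.seed(42)
strata = defaultdict(list)
for member_id, condition in members:
    strata[condition].append(member_id)

assignment = {}
for condition, ids in strata.items():
    random.shuffle(ids)
    for i, member_id in enumerate(ids):
        assignment[member_id] = "intervention" if i % 2 == 0 else "control"

# Each stratum contributes equal numbers to both arms.
for condition, ids in strata.items():
    arms = [assignment[m] for m in ids]
    print(condition, arms.count("intervention"), arms.count("control"))  # 2 2
```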
See also risk adjustment and predictive modeling.
Cumming, R., Knutson, D., Cameron, B. and Derrick, B. A Comparative Analysis of Claims-Based Methods of Health Risk Assessment for Commercial Populations. Schaumburg, IL: Society of Actuaries (2002).
Iezzoni, L. Risk Adjustment for Measuring Health Care Outcomes, 3rd ed. Chicago: Health Administration Press (2003).
Oleckno, W. Essential Epidemiology. Prospect Heights: Waveland Press, Inc. (2002).
The study design describes the manner in which the evaluation is structured and conducted.
The objective of measurement is to render a quantitative or qualitative description. To describe a program outcome, one can describe a state or a change that, to be valuable, must be valid. Validity has many flavors. For this purpose, one can imagine a hierarchy of validity types:
- Accuracy: measurements correctly capture data and compute metrics. This definition is independent of the meaning of the results.
- Internal validity: metrics and study design are constructed and executed in a way consistent with the intent of the measurement under specific circumstances. The measurement can be reasonably expected to represent what it is intended to measure – for example, a difference in the utilization of services from what would otherwise have been expected for this population.
- External validity: metrics and study design are constructed and executed in a way consistent with the intent of the measurement generally. The measurement can be reasonably expected to represent the impact of the intervention across instances and settings. As such, measures and study outcomes are comparable broadly across programs and/or organizations – a difference in the utilization of services from what would otherwise be expected for any similar population, for example.
The following framework outlines the prevailing approaches that are being or could be used to assess program effectiveness. The more rigorous a method’s ability to produce conclusive or valid results, the more impractical it is likely to be for routine, periodic reporting. These are noted in both the complexity and key issues rows.
| DESIGN | Randomized Controlled Trial | Non-Experimental | Concurrent Industry Trend | Historic Control | Pre/Post |
| --- | --- | --- | --- | --- | --- |
| Source of Comparison Group | Randomly selected group withheld from the program, drawn from the population for whom the program was implemented. | Population for whom the program was not implemented (not contracted), or population for whom the program was implemented but individuals were not contacted or chose not to participate. | Composite of populations for whom the program was implemented. | Expected trend (typically the observed trend of individuals without the measured conditions in the population for whom the program was implemented) applied to the population for whom the program was implemented. | Population for whom the program was implemented, in the prior year. |
| Threats to Validity | Very low, internal and external. | Low (controlling for appropriate selection bias), internal and external. | Moderate, internal and external. | High, external. | High, internal and external. |
| Complexity | Very high | High | Currently high | Low | Very low |
| Key Issues | Not suited for routine reporting due to cost, timing, and limited generalizability. | Cost; availability of a comparison group and of complete comparison group data. | High upfront investment cost. | Questionable validity without risk adjustment. | Regression to the mean; selection bias. |
Care Continuum Alliance Outcomes Guidelines Report, Vol. 5.
Randomized Controlled Trial (RCT): A randomized controlled trial is a scientific experiment to evaluate the effectiveness of an intervention (or a product). Subjects are randomly allocated to the treatment group, which receives the intervention, and the control group, which does not. The random assignment is meant to maximize the probability that both groups are fully comparable at baseline so that later differences can be attributed to the intervention. Thus, the RCT is considered the gold standard of study design. Its disadvantages are the complexity of organizing such a trial, concerns about the limited generalizability of the findings because of the tight control over the intervention under trial conditions, and the reluctance to assign subjects to a control group that may not receive treatment.
Study designs with non-experimental comparison groups: A strong non-experimental design requires the comparison group to be drawn from comparable populations or to use individuals matched to the intervention group members. Comparability can also be achieved with statistical adjustment. Below are some examples of control group opportunities:
- Non-contracted group: This can be referred to as a natural experiment. In this design, researchers try to find a naturally occurring control group that is equivalent to the treatment group in all aspects other than exposure to the intervention. Thus, the design is similar to an RCT but without the random assignment of individuals. For example, if a health plan offers a care management intervention to the fully insured but not to the self-insured business lines in the same region, the members in the self-insured products could be investigated as an equivalent, non-randomly assigned comparison group.
- Non-qualified individuals: The comparison group consists of individuals who were not identified for the chronic care management intervention. This group can include individuals with no chronic conditions, with chronic conditions that are not part of the chronic care management contract, or with chronic conditions that are part of the chronic care management contract but fail to meet program inclusion criteria.
- Non-contacted individuals: The comparison group consists of individuals who were eligible for the intervention but not contacted by the chronic care management organization. Examples are individuals with incomplete addresses or missing phone numbers.
- Non-participants: The comparison group consists of individuals who were identified for a chronic care management intervention but chose not to participate.
Concurrent Industry Trend: This trend would be estimated by combining longitudinal data from a large number of care management organizations that reflect the overall industry reasonably well. The trend would then become the equivalent of the S&P 500 for the chronic care management industry, i.e., a national benchmark against which individual programs would be compared. For a fair comparison, the trend would have to be adjusted to account for differences in region, case mix, etc., between the national peer group data and the individual program. Of note, a program’s effect would not reflect the impact compared to no treatment but compared to the rest of the industry.
Historic control: This design compares the trend in the sub-population equivalent to the intervention population prior to start of the intervention and afterward to an expected trend based on analysis of the rest of the population. Any difference in observed findings to the expected results is credited to the intervention.
Pre-post comparison: This design compares baseline to post-intervention data. The difference is counted as program effect.
Most industry activity has focused on the historic control method. Comparison group methods in which the comparison group is drawn from within the same population in which the program was implemented, such as non-participants, are currently emerging, in terms of both population-level and individual (matched control) comparisons. Opportunities to conduct studies in which the comparison group is identified from outside the program population, such as a different employer group or region of the health plan, are limited. Rarer still is the prospective randomized controlled design, which is best suited for special studies and research.
Care Continuum Alliance. (2010). Outcomes Guidelines Report, vol. 5. Washington, DC: Care Continuum Alliance.
A survey is a gathering of a sample of data or opinions considered to be representative of a whole and is one of several methods for collecting data from a group or population. A survey can be used to estimate the views of a group, the distribution of characteristics in a sample population, or changes in opinions or results over time. Surveys usually employ instruments such as questionnaires or interviews and can be delivered by a variety of methods, including postal mail, telephone, text, internet, e-mail, or face-to-face.
Survey parameters and content development are key to understanding potential variables that affect survey results. Parameters to consider include the survey method, the time commitment required to complete the survey, the quality of the survey questions, the literacy level of the population being surveyed, and any judgment of responses required by the surveyor or by agencies adept at the development of surveys.
A survey questionnaire should include a brief introduction indicating who is conducting the survey, how the information will be used, and reassuring participants that their responses will be kept confidential. Credibility can be further established by suggesting the sponsoring organization has a neutral viewpoint and no inherent bias.
Biases should also be considered in the analysis of survey results, and they should be anticipated in advance, since biases that are too large may undermine the legitimacy of the survey findings. The three most common types of bias are:
- The Hawthorne effect: Respondents tend to answer differently simply because they have been selected for a survey; the special recognition leads them to answer in the way they view as most pleasing to the researcher.
- The “habit” bias: Respondents who are given similar questions tend to fall into a habit of answering them similarly without considering each on its merits.
- Non-respondent bias: The assumption that individuals who have not responded to a survey feel the same as those who have responded.
These biases can be minimized by placing personal questions at the end of the questionnaire and by changing the format of questions throughout the questionnaire. Due to the nature of the survey, confidentiality should be a consideration in determining the method of delivery and use of analytical results.
Survey completion is usually voluntary and may or may not be incentivized, either monetarily or in some other form of remuneration, for completion. Surveys can be completed as frequently as necessary to gather the necessary data points to be used for analytical evaluation of results.
Dillman, D., Smyth, J., and Christian, L. Internet, Mail, and Mixed-Mode Surveys: The Tailored Design Method. New York: John Wiley and Sons (2008).
Dillman, D. Mail and Telephone Surveys: The Total Design Method. New York: John Wiley and Sons (1978).