PHM Glossary: I

Identified (Population)

For specific programs, the identified population is composed of program-eligible members who meet the identification criteria of a wellness or care management program. For a more detailed description, see population.

Identification, Predictive Modeling, Risk Ranking, Risk Adjustment, and Stratification

These terms are often used interchangeably in health care applications. The lack of consistent usage can lead to confusion. We propose a set of definitions that we have found to be useful in practical applications and that overcome the duplication and redundancy often found in this area.


Identification refers to the process whereby claims data are used to identify conditions or therapies (or gaps in therapies) in a targeted population. It is often the first step in a predictive modeling, risk ranking, risk adjustment, or stratification exercise. The targeted population (or subpopulation) is first identified through demographic and other criteria (not claims-related) and then the condition and/or treatment criteria are applied. Condition and treatment identification is derived principally from claims data (may be augmented by self-reported data). For a typical set of definitions see chronic conditions/disease definitions.

Claims data contain valuable information about the patient who incurred the claim and the provider responsible for the treatment. For example, a claim in the ICD-9-CM series 250.xx indicates the possibility of a diabetes diagnosis; a claim with a GPI group 44 indicates a treatment with an asthma medication. For the purposes of analysis, it is sometimes helpful to classify patients into categories, such as:

  • Wellness—those whose services include check-ups, flu shots, etc.;
  • Chronic—those with a chronic condition—for example diabetes, asthma, etc.;
  • Acute—those with an acute condition—for example maternity, hip surgery, etc.; and
  • Catastrophic—those patients with multiple co-morbidities or high cost.

The electronic patient record will flag all transactions recorded for the patient record as indicating a particular diagnosis (or diagnoses). For example, a patient will be identified as diabetic, using condition rule sets such as those defined in chronic condition/disease definitions, if the patient record indicates within a 12-month period the presence of:

  • One hospital visit with primary diagnosis of diabetes; or
  • Two or more office visits with primary diagnosis of diabetes; or
  • Three or more 30-day prescriptions filled for GPI class 27.
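The rule set above can be sketched in code. This is a minimal illustration, not a production rule engine: the `Claim` record, its field names, and the `kind` labels are hypothetical, each pharmacy claim is assumed to represent one 30-day fill, and "one hospital visit" is read as "at least one."

```python
from dataclasses import dataclass

@dataclass
class Claim:
    kind: str             # hypothetical labels: "hospital", "office", or "rx"
    primary_dx: str = ""  # primary ICD-9-CM diagnosis code, e.g. "250.00"
    gpi_class: str = ""   # first two digits of the GPI code (pharmacy claims)

def is_diabetes_dx(code):
    # ICD-9-CM 250.xx series, as in the example in the text
    return code.startswith("250.")

def identified_as_diabetic(claims):
    """Apply the three rules to one patient's claims from a 12-month window."""
    hospital = sum(1 for c in claims
                   if c.kind == "hospital" and is_diabetes_dx(c.primary_dx))
    office = sum(1 for c in claims
                 if c.kind == "office" and is_diabetes_dx(c.primary_dx))
    # Assumption: each pharmacy claim is a single 30-day prescription fill.
    fills = sum(1 for c in claims if c.kind == "rx" and c.gpi_class == "27")
    return hospital >= 1 or office >= 2 or fills >= 3

# One office visit alone does not meet the criteria; two do.
print(identified_as_diabetic([Claim("office", "250.02")]))      # False
print(identified_as_diabetic([Claim("office", "250.02")] * 2))  # True
```

Running the function over every member of the targeted population yields the unranked list of identified patients.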

The result of an identification process is a list of patients, for example, a list of diabetics. Although the identification process itself represents a simple form of risk ranking or stratification, the resulting list is not prioritized or risk-ranked in any way.

Predictive modeling is the process of forecasting future health care expenditures, resource utilization, or adverse clinical events (such as inpatient admissions) based on differences in individuals’ health status. An important objective of predictive modeling is the allocation of population health management resources to patients who are predicted to incur high health care expenses (in the absence of some kind of care intervention). With predictive modeling, individual patients are assigned a risk score based on their likelihood of using health services in the near future. In addition, predictive modeling techniques can be used to identify patients for control and intervention groups for population health management program evaluation.

One result of the predictive modeling process is a risk ranking of the target population. Risk ranking is the process of ordering or prioritizing a population, from highest severity to lowest severity. Risk ranking is valuable for population health management applications: when the ranking is combined with the predicted frequency of the specific event and with data on the cost of interventions, it can be used to determine the economic opportunity of an intervention program. A number of commercial vendors provide predictive modeling and risk ranking services. Risk ranking can be run periodically and applies rules in a standard and repeatable way (usually using a computer).
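As a rough illustration of risk ranking and economic opportunity, the sketch below orders members by a predicted event probability and estimates a net benefit per member. The member IDs, the scores, and the net-benefit formula (expected avoided cost minus intervention cost) are assumptions made for this example; the glossary does not prescribe a specific formula.

```python
def risk_rank(scores):
    """Return member IDs ordered from highest to lowest risk score."""
    return sorted(scores, key=lambda member: scores[member], reverse=True)

def expected_net_benefit(p_event, event_cost, intervention_cost, reduction):
    # Hypothetical formula: the share of expected event cost avoided by the
    # intervention, minus what the intervention itself costs.
    return p_event * reduction * event_cost - intervention_cost

# Hypothetical predicted probabilities of a hospitalization in the next year.
scores = {"A": 0.10, "B": 0.60, "C": 0.30}
print(risk_rank(scores))  # ['B', 'C', 'A']
```

Under these assumptions, an intervention is worth offering only to members whose expected net benefit is positive, which is one way to turn a ranking into a cutoff.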

[Figure not reproduced; axis label: Cumulative Total Population]


Risk has no single definition in care management. For example, there are patients who exhibit many risk factors who never have a claim, while previously nonrisky patients undergo lengthy treatment and have high claims. There is also a related concept, severity, which is clinical in nature and may or may not cause a patient to exhibit increased financial risk. It is useful for the purpose of care management (which is often program-oriented) to take a financial approach to risk classification. To that end, we propose the following definition: “A high-risk patient is one who has a high probability of experiencing an event.” Thus patient risk is directly related to the probability that the patient will experience a particular event. (The event in question may be defined by the user; it could be a hospitalization, claims in excess of $X, or condition-specific claims or utilization.)

For chronic care and case management programs, the predicted high-cost event may often be a hospitalization. Thus, a patient with a high likelihood of having a hospitalization in the next year is defined as a more risky patient than one who has a low probability of experiencing the event. Once a definition of risk can be agreed within the organization, issues of stratification and prediction may be more readily resolved.

Risk also has a time dimension. A patient who is identified as high risk (e.g., in the top decile of all patients) in January 2004, based on the prior 12 months’ claims, may exhibit a different risk level in July 2004, based on claims incurred through June 2004, and in January 2005, based on claims incurred in 2004. Indeed, if the program has worked to reduce the patient’s utilization, the patient’s 2004 claims, and therefore risk ranking, should be lower.

Stratification is the process of assigning risk-ranked members to intervention categories. Stratification may be based on claims-based risk ranking only, but most often it also includes self-reported data or nurse evaluation. Stratification is a workflow process; the operational importance of stratification is the correspondence between risk strata and interventions or groups of interventions (programs).

A patient may be risk-ranked high but stratified low (if, for example, the patient is well-controlled or offers no opportunity for behavior change).
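The distinction between risk rank and stratum can be sketched as follows. The percentile cutoffs, the stratum names, and the two override flags are hypothetical; in practice the overrides would come from self-reported data or a nurse evaluation.

```python
def stratify(risk_percentile, well_controlled=False, open_to_change=True):
    """Assign an intervention stratum from a claims-based risk percentile,
    adjusted by (hypothetical) self-reported or nurse-evaluation findings."""
    # A high-ranked member is stratified low when the condition is
    # well-controlled or there is no opportunity for behavior change.
    if well_controlled or not open_to_change:
        return "low"
    if risk_percentile >= 90:   # illustrative cutoff: top decile
        return "high"
    if risk_percentile >= 50:   # illustrative cutoff
        return "medium"
    return "low"

print(stratify(95))                        # high
print(stratify(95, well_controlled=True))  # low
```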


There is an important consequence of these definitions of risk ranking and stratification. Risk ranking is always based on claims and/or demographic data and, because such data are interrogated monthly or quarterly, is periodic. Stratification, on the other hand, is dynamic and based on member needs and the member’s situation at a point in time. Because the definition of stratification includes claims-based risk ranking, the process of stratification may begin with claims-based risk ranks and be complemented by self-reported and nurse-evaluation data.

Impactibility Models (Predictive Models)

An impactibility (predictive) model predicts who will acquire a disease, an adverse event related to a disease, or change from one health (functioning) state to another, where these outcomes are impactible with some specific intervention such as taking or stopping a medication, doing a test, reducing avoidable medical costs, making a behavioral change, or changing the person’s environment. Impactibility models tend to focus on specific sets of circumstances where there is an identifiable gap in care, such as someone with heart failure who is not taking a beta-blocker.


Many predictive models predict high-cost patients in a future period. However, many predicted high-cost patients may not be good candidates for chronic care management (examples are trauma, accident, some cancers, etc.). Like all predictive models, an impactibility model attempts to predict who is at risk of high costs from chronic disease, a disease-related adverse event, or a change from one health state to another (e.g., functional status). Unlike most models, which predict an outcome (such as cost, service utilization, or health state), impactibility models identify people with specific circumstances that can be changed to beneficially influence a health outcome, and they predict the change in risk for those who successfully alter the impactible factor.

The list below shows what is considered impactible and illustrates the spectrum of what can be incorporated into impactibility models.

  1. Non-adherence to a drug of proven benefit for a condition
  2. Taking a drug that is deleterious for a condition
  3. Not being monitored for the untoward effects of a drug
  4. Not being monitored for preventable sequelae of a disease
  5. Engaging in behaviors that increase the likelihood of acquiring or worsening a disease
  6. Living in an environment that increases risk for a disease or for its complications
  7. Having health beliefs that reduce one’s ability to understand and manage their condition
  8. Being ready to move from less to greater readiness to change


There are several advantages to using impactibility modeling to identify appropriate people for an intervention. First, simply knowing that someone has a disease, or knowing its severity (or even its likelihood of progressing from low to high cost), does not indicate how impactible that person’s disease (or its outcomes) is. Second, the change in the probability of adverse events for a person (or population) can often be modeled from published adverse event rates from studies of populations that had and did not have the impactible factor.

Impactibility models should not be used in isolation. First, it is difficult to uncover all impactible factors: the claims or patient-supplied data may be missing (or inaccurate), and knowledge of the patient’s readiness to change, health beliefs, health practices, or living environment might not be available. Second, without an estimate of someone’s disease severity or risk for utilization or high cost, impactibility alone is insufficient to prioritize the work of chronic care management.
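Modeling the change in adverse event probability from published rates can be sketched with simple arithmetic. The event rates and population size below are invented for illustration, and the calculation assumes that closing the gap in care moves a person from the "with factor" rate to the "without factor" rate, which real studies may not support.

```python
def absolute_risk_reduction(rate_with_factor, rate_without_factor):
    """Difference in adverse event rates between those who have the
    impactible factor (e.g., a gap in care) and those who do not."""
    return rate_with_factor - rate_without_factor

def expected_events_avoided(n_people, rate_with_factor, rate_without_factor):
    # Assumption: everyone counted successfully alters the impactible factor.
    return n_people * absolute_risk_reduction(rate_with_factor,
                                              rate_without_factor)

# Hypothetical: 200 members with the gap, 30% vs. 20% annual event rates.
print(round(expected_events_avoided(200, 0.30, 0.20)))  # 20
```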


Lewis GH. “Impactibility Models: Identifying the Subgroup of High-Risk Patients Most Amenable to Hospital-Avoidance Programs.” Milbank Quarterly; 88(2).

“Expect Greater Use of eHealth in Disease Management in 2004.” Disease Management News. 10 Jan. 2004. 16 Feb. 2006


Incidence

The incidence of a disease is the rate at which new cases occur (or are identified) in a population during a specified period. It differs from prevalence, which quantifies the proportion of a population with a disease at a given point in time.


The incidence rate (also called incidence density or force of morbidity) is defined as:

Incidence rate = (number of new cases of a condition during a given period of time) / (total person-time of observation)

In presenting the incidence rate, it is important to specify the time unit: person-days, person-years, etc.

In the example below, four individuals are followed over the course of 7 years (January 1990 to January 1997).


Patient      Years at risk (1/90 to 1/97)   Onset of disease
Patient W    2 yr                           No
Patient X    5.5 yr                         Yes
Patient Y    4 yr                           No
Patient Z    5 yr                           Yes
Total        16.5 yr

A simple (but incorrect) count of the number of person-years would be 28.0 (7 years * 4 individuals). This is incorrect because information is not available on all four individuals for the entire 7-year period. The number of person-years during which information is available is 16.5 (actuaries call this number of person-years the exposure to risk).

The incidence rate in this example is two cases per 16.5 person-years (patients X and Z), or 12.1 cases per 100 person-years.
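The calculation can be reproduced in a few lines. The years at risk and the two cases come from the example; the function name is an arbitrary choice for this sketch.

```python
def incidence_rate(new_cases, person_time):
    """New cases divided by total person-time of observation."""
    return new_cases / person_time

# Years at risk for each patient in the example.
years_at_risk = {"W": 2.0, "X": 5.5, "Y": 4.0, "Z": 5.0}
cases = ["X", "Z"]  # patients with onset of disease

person_years = sum(years_at_risk.values())
rate = incidence_rate(len(cases), person_years)

print(person_years)          # 16.5
print(round(100 * rate, 1))  # 12.1 cases per 100 person-years
```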

There is a related measure of incidence called cumulative incidence (CI). This measure is the proportion of the population at risk that develops the condition over a given period of time. Applying the above data to different years:

CI (1990): 1 case/2 individuals = 50 cases/100 individuals

CI (1991): 0 cases/3 individuals = 0 cases/100 individuals

CI (1990-1): 1 case/3 individuals = 33 cases/100 individuals (2-year rate)

These examples show considerable variation because of the small samples involved. Comparison with the incidence rate calculation shows that the latter is a more accurate estimate of the underlying force of morbidity.
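The cumulative incidence figures above follow from a simple proportion, sketched here with the case counts and denominators taken directly from the three examples.

```python
def cumulative_incidence_per_100(new_cases, individuals_at_risk):
    """Proportion of individuals who develop the condition, per 100."""
    return 100 * new_cases / individuals_at_risk

print(cumulative_incidence_per_100(1, 2))         # 50.0  -> CI (1990)
print(cumulative_incidence_per_100(0, 3))         # 0.0   -> CI (1991)
print(round(cumulative_incidence_per_100(1, 3)))  # 33    -> CI (1990-1)
```

Because the denominators are counts of individuals rather than person-time, adding or removing a single person changes the result sharply in samples this small.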


Hennekens, C. and Buring, J. Epidemiology in Medicine. Boston: Little, Brown & Company (1987).

Mausner, J., and Kramer, S. Epidemiology, An Introductory Text. 2nd ed. Philadelphia: Elsevier (1985).

Morton, R., Hebel, J., and McCarter, R. A Study Guide to Epidemiology and Biostatistics. 3rd ed. Gaithersburg, MD: Aspen Publications (1990).

Rothman, K., and Greenland, S. Modern Epidemiology. 2nd ed. Baltimore: Lippincott, Williams & Wilkins (1998).

Independent Variable

A manipulated variable in an experiment or study whose presence or degree determines the change in the dependent variable is referred to as an independent variable. It is a variable that is not affected by any other variables with which it is compared. The group requesting a study may have some control over the independent variable, such as the type of disease management intervention offered. But the independent variable is not always under the researcher’s control—for example, when a population health management participant is also exposed to some other program. Demographic factors (e.g., age or gender) may also be independent variables.


Independent variable = temperature; dependent variable = ice cream sales

Ice cream sales go up on hot days.


Definition: Independent Variable. 16 Feb. 2006

independent variable. (n.d.). The American Heritage® Dictionary of the English Language, Fourth Edition. Retrieved April 29, 2010.


Indicator

An indicator is a variable, such as a sign, symptom, or other means of measurement, that is used to reveal the degree to which another variable is valued or changed.


A depression questionnaire is an indicator that measures the presence of distinct depressive symptoms; if a person possesses a certain number of symptoms, the questionnaire indicates that the person is indeed depressed.


indicator. (n.d.). Wall Street Words. Retrieved April 29, 2010.

Indicator Variable (Dummy Variable)

A binary explanatory variable (a variable that can be categorized into only two groups) that is analyzed to determine whether the two categories are associated with significantly different outcomes. The two categories of the indicator variable do not have quantitative meaning; however, they are coded as numerical values, 0 and 1, so that they can be analyzed using linear or logistic regression analysis, which are forms of statistical analysis that require quantitative values. Very often, an indicator variable is coded 0 to denote the absence of a condition and 1 to denote its presence. An indicator variable is also called a dummy variable.

Use of dummy variables usually increases model fit (coefficient of determination), but at a cost of fewer degrees of freedom and loss of generality of the model. Too many dummy variables result in a model that does not provide any general conclusions.


Cancer patients who receive treatment A are coded as 1, and cancer patients who do not receive treatment A are coded as 0. These two groups are then compared using linear regression analysis to determine whether they have significantly different lengths of life.
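A small sketch of this example, using ordinary least squares with a single 0/1 regressor. The survival figures are invented for illustration, and the sketch computes only the coefficients, not the significance test. With one dummy variable, the intercept equals the mean outcome of the group coded 0 and the slope equals the difference between the two group means.

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: 1 = received treatment A, 0 = did not.
treatment_a = [1, 1, 1, 0, 0, 0]
months_survived = [30, 36, 33, 24, 27, 21]

intercept, slope = simple_ols(treatment_a, months_survived)
print(intercept)  # 24.0 -> mean survival without treatment A
print(slope)      # 9.0  -> additional months associated with treatment A
```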


Insourcing

The practice of subcontracting work to another company that is under the same general ownership.


insourcing. (n.d.). Collins English Dictionary - Complete & Unabridged 10th Edition. Retrieved December 08, 2010.


Interoperability

The ability of health information systems to work together within and across organizational boundaries in order to advance the effective delivery of healthcare for individuals and communities. The ability of systems or components to exchange health information and to use the information that has been exchanged accurately, securely, and verifiably, when and where needed.


HIMSS

The University of Kansas Center for Health Informatics

Intervention (Intervention Program)

Intervention (intervention program) is a planned and defined activity administered by an external source with the aim of inducing change in the individual’s behavior to achieve ideal health.


Examples of interventions are telephonic contacts, mail, e-mail, SMS/text, educational videos, literature, and face-to-face visits.