
Leveraging big data in population health management



Population health management takes into account many determinants of health, including medical care, social and physical environments and related services, genetics, and individual behavior. Many different types of data may be used to guide population health management programs and to estimate program value. In addition to the variety of data required for these programs, big population health program data are characterized by large volume, high velocity, and inconsistent data flows. This manuscript describes how big data analytics have been used to craft a population health program to help improve the lives of about four million older adults who have an AARP® Medicare Supplement Insurance plan (i.e., a Medigap plan). Plan enrollees have access to a wellness program, holistic care coordination programs, two telephone-based advice lines, concierge support for insurance and medical care needs, and a program designed to help reduce unnecessary emergency room (ER) visits.


During 2009–2011, these program components led to several improvements in health care. For example, increased duration in care coordination was associated with fewer hospital readmissions, and participants were significantly more likely to have recurring physician office visits and recommended laboratory tests. Participants in ER decision support reduced their ER visits by 1299 visits per 1000 insureds, compared with a reduction of 1121 visits per 1000 insureds for individuals who did not participate in the program. Better depression management helped reduce depression symptoms in 59 % of participants engaged in that program. Big data analytics of member data suggested the need for a wellness program feature, which began in 2014. Analytics of disease management services offered in 2009–2011 helped to combine and refocus these program features to enhance effectiveness in later years.


Using big data to help manage and evaluate a population health program has led to several improvements in health care. Program management, reporting, and evaluation processes generate additional data which, when analyzed, continue to refine program implementation and quality. Future improvements to this program may include enhanced integration of social service programs that will generate their own data streams for analyses designed to further improve health and wellbeing.


During the last decade, the use of health management programs to help people live healthier lives has increased, especially for those with multiple chronic conditions [1]. A good population health management approach takes into account the multiple determinants of health, including the adequacy of medical care, public health interventions, social and physical environments and related services, genetics, and individual behavior [2].

At a minimum, population health management programs try to enhance health by finding those with unmet health needs and taking measures to close these gaps [3]. This promotes quality in health care, with a focus on getting the right care to the right patient at the right time [4]. The value of integrating social service provision (e.g., to improve living environments, reduce violence, etc.) with health management programs to improve health and wellbeing has been acknowledged [5, 6], but so far there has been only limited movement toward it. Given the broad potential scope of population health management, its implementation may require leveraging big data sources that exist across the healthcare and other social systems [7]. Below we describe how this can work for those who have traditional fee-for-service Medicare coverage along with a Medicare Supplement Insurance (i.e., Medigap) plan.

Medicare and Medigap

Medicare is the U.S. federal health insurance program for about 55 million people age 65 or older and for those with disabilities and end-stage renal disease. Much of what we know about health and the use of healthcare services for these people comes from analyses of Medicare program data [8]. The Centers for Medicare & Medicaid Services, a branch of the Department of Health and Human Services, runs the Medicare program. In 2010, about 39 million people (accounting for about 99 % of those 65 or older) had at least some Medicare coverage [9].

Medicare has four different parts, designated A through D. Parts A and B, which pertain to institutional care (e.g., hospital and nursing home care), the use of durable medical equipment, and outpatient (non-institutional) care, are often called ‘original’ or ‘traditional fee-for-service’ programs. These programs do not require beneficiaries to use a formal network of medical care providers. Medicare beneficiaries can choose any providers they wish, as long as those providers have chosen to participate in the federal Medicare program and abide by its conditions of participation.

Many people who purchase Parts A and B also choose to purchase a standalone Part D prescription drug plan (PDP), to pay for pharmaceutical care. Pharmaceutical coverage is also included within most Part C (Medicare Advantage) plans. Part C plans include health maintenance organizations, preferred provider organizations, or other managed care arrangements that do require use of specific networks of healthcare providers. About 30 % of Medicare beneficiaries choose Part C plans, while the rest use Parts A and B [10]. About 45 % of Medicare beneficiaries have stand-alone Part D coverage (i.e., outside of Part C plans) [11]. Because Medicare Parts A and B do not pay for all of the costs of healthcare services, those with high healthcare utilization may face high out-of-pocket fees for their care. About 21 % of beneficiaries with Part A and Part B coverage purchase additional Medicare supplement coverage, also known as a Medigap plan, to offset many of these out-of-pocket costs [12].

Medigap coverage is popular; about 48 % of non-institutionalized Medicare beneficiaries had Medigap coverage in 2012 [12]. However, little is known about Medigap insureds because their data have not been readily available for analysis. Research is needed to understand how to help those who purchase Medigap coverage live healthier lives. Thus, as described below, population health management programs are now being tested for many Medicare beneficiaries who purchase Medigap coverage via AARP-branded Medicare Supplement (Medigap) plans. Big data analytics are the key to the development and success of these population health management programs.

Big data

The data used by population health programs are already big, but have the potential to be even bigger. Many types of data are typically used to manage these programs or to estimate program value. Medical and health data may include information on health management program engagement, health risks that are collected via the use of survey-based health assessments, as well as health insurance claims and membership files. Information on the social or physical environment comes from external data sources that describe characteristics of home life, neighborhood, and the local supply and quality of health care. Still other data collected from surveys may be used to measure health-related quality of life and satisfaction with insurance arrangements, perceptions of access to care, and the quality of care received.

This paper demonstrates how big data collection and analytic processes allow researchers and program managers to focus on a broad set of population health management program outcomes with such data. Specifically, we describe the use of big data to support a population health program for individuals who purchased an AARP Medigap plan, which supplements their fee-for-service Medicare coverage. Currently, about four million adults have enrolled in an AARP Medigap plan insured either by UnitedHealthcare Insurance Company or UnitedHealthcare Insurance Company of New York. These plans are offered in all 50 states, Washington D.C., and various U.S. territories.


Population health program components

Individuals with an AARP Medigap plan have access to several population health program offerings (Fig. 1). Some of these, such as the Nurse HealthLine, benefit everyone regardless of where they lie in their healthcare journey. Conversely, other program components, such as Treatment Decision Support and Advanced Illness, benefit those transitioning through defined health events. Regardless of health status, accessing any of these services generates data that are aggregated for reporting and research, which in turn informs program management and refinement. The interactions of about four million individuals with these population health programs result in a data volume and frequency sufficient to create big data. Below we briefly describe each program component and the data it produces, to show how the use of these data contributes to population health management for AARP Medigap insureds.

Fig. 1

This diagram reflects the breadth of offerings in the population health management program and the types of data required to develop, manage, and evaluate that program

First, there are two health risk appraisal (HRA) survey questionnaires. An abbreviated HRA is offered to all new Medigap plan enrollees. This ‘mini-HRA’ contains 17 questions about current health conditions, prescription drug use, limitations in activities of daily living, and frequency of hospitalization in the last year. In addition, all insureds (not just the new members) have online access to another HRA with 93 questions. The data from both HRAs are analyzed to find individuals with significant health challenges; these insureds are then contacted to ask if they would like to participate in one or more of the programs shown in Fig. 1, to help them remain well or better coordinate care for their chronic health conditions.

All four million AARP Medigap insureds also have access to the Nurse HealthLine, a telephone-based assistance service staffed by nurses. Callers to the Nurse HealthLine get suggestions about how to find a healthcare provider or learn where best to receive care for pressing acute or chronic health issues. Information on the costs and benefits of the AARP-branded Nurse HealthLine has been published elsewhere [13].

Next, a telephone-based pilot called the Trusted Health Partner is available to AARP Medigap insureds in parts of Texas. This program feature provides a single point of contact for those who want help with insurance services or with their healthcare needs. Once engaged with a Trusted Health Partner, insureds receive several outbound phone calls and a survey to determine their health-related needs, and program staff members work to address those needs. Common issues include finding high-quality medical providers, addressing health insurance questions, and finding community support services to address health or social service needs. Help provided may take the form of education, advice, assistance, or referrals to health topic subject matter experts who can address more detailed questions if necessary. Other referrals can be made to nurses who can provide advice about how best to meet clinical care needs. Meanwhile, other ancillary medical specialists are available to provide advice on nutrition and caregiving to other family members or loved ones.

Finally, a new fitness and wellness pilot called “At Your Best” was launched in 2014 for AARP Medigap insureds in test markets in New Jersey and Missouri [14]. Depending on program performance, this population health management program feature may expand to additional states. At Your Best offers personalized coaching to help participants improve their health and wellness, as well as support through online health resources, one-on-one telephonic wellness coaching, and ties to various community-based activities. With the online activity, individuals can take a wellness assessment, customize their profiles, and start a two-week personalized fitness, nutrition, or health risk reduction plan. Telephone-based wellness coaching is available to help participants set health and wellness goals (e.g., lose weight, become more active, or reduce stress), create an action plan, and stay connected for motivation and support. Participants can also attend local events, such as a weekly walking club or nutrition class, and they get discounted memberships at a local fitness center.

Other population health program components reach out to those who may benefit from targeted interventions. These individuals are usually found through analyses of Medicare Part A, Part B, or Part D data but can also be identified through self-reported data, such as data from the HRA. These data support the operations of a Treatment Decision Support service, an Emergency Room Decision Support service, and a care coordination process called “MyCarePath.” Each of these program components is introduced below.

Treatment Decision Support nurses reach out by telephone to discuss available treatment options with individuals whose Medicare claims data provide evidence of chronic knee, hip, or back pain. In another targeted initiative, the Emergency Room Decision Support (ERDS) service identifies those who appear to be over-utilizing emergency rooms. ERDS staff contact high ER users by telephone to discuss their healthcare needs and provide suggestions about how to find a medical provider in a non-ER setting, if warranted [15].
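The identification step behind ERDS can be illustrated with a minimal sketch. The record layout, the function name, and the threshold of three visits are hypothetical and are not taken from the program. The only external fact assumed is that place-of-service code 23 denotes a hospital emergency room in the standard CMS code set.

```python
from collections import defaultdict

ER_PLACE_OF_SERVICE = "23"  # CMS place-of-service code for a hospital emergency room
VISIT_THRESHOLD = 3         # hypothetical cutoff, not taken from the program


def flag_high_er_users(claims, threshold=VISIT_THRESHOLD):
    """Return sorted member IDs with at least `threshold` distinct ER visit dates."""
    er_dates = defaultdict(set)
    for claim in claims:
        if claim["place_of_service"] == ER_PLACE_OF_SERVICE:
            # Several claim lines on the same day count as a single visit.
            er_dates[claim["member_id"]].add(claim["service_date"])
    return sorted(m for m, dates in er_dates.items() if len(dates) >= threshold)


# Illustrative records only; field names are assumptions.
claims = [
    {"member_id": "A", "place_of_service": "23", "service_date": "2014-01-05"},
    {"member_id": "A", "place_of_service": "23", "service_date": "2014-01-05"},
    {"member_id": "A", "place_of_service": "23", "service_date": "2014-02-11"},
    {"member_id": "A", "place_of_service": "23", "service_date": "2014-03-20"},
    {"member_id": "B", "place_of_service": "11", "service_date": "2014-01-09"},
    {"member_id": "B", "place_of_service": "23", "service_date": "2014-04-02"},
]
high_users = flag_high_er_users(claims)
print(high_users)  # ['A']
```

Counting distinct service dates rather than claim lines is one simple way to avoid inflating visit counts when a single ER visit produces several claims.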

Finally, the MyCarePath pilot was offered in New York, Ohio, North Carolina, and parts of Texas in 2014; New Jersey was added in 2015. This pilot uses computerized algorithms to find and engage people with multiple chronic health conditions who may benefit from additional care coordination and ancillary support. Direct referrals to MyCarePath may also be accepted. MyCarePath provides an individualized, patient-centered, and holistic approach to managing care, focusing on overall physical and mental health rather than on specific chronic illnesses. It is staffed by licensed registered nurses who work with plan enrollees to address their personal goals and both medical and non-medical needs. More information about MyCarePath can be found in Hawkins et al. [16].

Population health programs require big data to work well

Several types of data are generated on behalf of those with AARP Medigap plans to support the population health program components mentioned above. These data vary in format and frequency, as noted below.

First, each insured has an administrative (health plan membership) record that includes basic demographic and contact information, as well as information about his or her plan and dates of coverage. There are 10 different standardized Medigap plan types available in most states, and these differ with regard to coverage of deductibles, copayments, and other out-of-pocket expenses. Administrative data are updated monthly.

Second, those who use traditional fee-for-service health care generate Medicare Part A and/or Part B insurance claims, which are submitted by doctors, hospitals, and other providers to Medicare intermediaries. These intermediaries verify Medicare enrollment and process the claims so providers can be paid for the hospital inpatient, outpatient, emergency room, pharmaceutical, or other services they provide. This process yields claim streams that are sent to the Medigap insurer (e.g., UnitedHealthcare) to reimburse members for Medigap-covered services. Part A and Part B claims data include fields with standardized codes for medical diagnoses and procedures, the amounts billed by the healthcare provider, and the amounts actually paid by Medicare, the AARP Medigap plan, and the individual. These claim records also include place of service codes, which indicate whether care was provided in the emergency room, inpatient service, ambulatory services, laboratory, long-term care unit, or ancillary service areas.

For those with the AARP Medicare Part D plan, claims for prescription pharmaceuticals are submitted by pharmacies or pharmacy benefit management companies to UnitedHealthcare’s Part D insurance program. The Part D program pays for prescription pharmaceuticals and generates another stream of data about these program services, which are maintained in a database separate from, but linkable to, the medical claims and administrative data. The pharmacy claims files include the name and class of drug, the National Drug Code Identifier, the prescribed drug dosage, prescription fill dates, and number of days of pharmaceutical care covered by that particular claim. Part D data tend to arrive more quickly than medical claims do (i.e., usually within just a few days of receiving the pharmaceutical service, whereas medical claims tend to arrive a few weeks or more after the medical service is received).

The previously described HRA survey data are also used to support the population health program, providing information gathered about demographics, medical problems, and limitations in activities of daily living. In addition, random samples of Medigap insureds are asked to complete the Medicare Consumer Assessment of Healthcare Providers and Systems (CAHPS) fee-for-service questionnaire. This questionnaire is the national standard for measuring consumer experiences with health plans [17, 18]. The CAHPS survey is fielded annually and collects member demographics and information about health status and satisfaction and experiences with healthcare services. Between 2009 and 2011, the CAHPS questionnaire was sent to a random sample of individuals with an AARP Medigap plan residing in 10 states. During 2012 and 2013, the survey was sent to two groups. The first group consisted of individuals who were eligible to participate in the MyCarePath program and who resided within the MyCarePath pilot market. The second group was a random sample of AARP Medigap insureds residing in five states outside of the MyCarePath pilot market. In 2014, the first group remained unchanged, but the second group consisted of a random sample of individuals who were eligible for MyCarePath but chose not to participate (i.e., non-participants). Starting in 2015, a longitudinal CAHPS survey process was initiated to include MyCarePath participants, non-participants, and participants who dis-enrolled from MyCarePath (i.e., dis-enrollees).

Still other data used to manage the population health program are obtained from outside sources, including U.S. Census data (updated every 10 years), the Dartmouth Atlas (updated every two or three years, on no specific schedule), and the KBM Group (updated every 2 months). U.S. Census data are available at the ZIP code level and are used to identify demographic differences across these geographic regions. Similarly, Dartmouth Atlas data are used to describe differences in the supply of healthcare services in the geographic areas where Medigap insureds live. Additionally, the AmeriLINK Data Sourcing system generated by the KBM Group is used to find information about socioeconomic status for each qualified member. The KBM Group generates this information by collecting data from public records, purchase transactions, U.S. Census data, and consumer surveys [19].

The factors described by all of these data are known to influence healthcare utilization and expenditures, so the data are valuable for analyses of population health program experiences [20]. They can also be used to help assess access to care, quality of care, and health-related quality of life.

Finally, program qualification and participation data are stored in separate files that are linkable to all of the above data sources. These data reflect information about individuals prior to participation, such as how they became qualified and outreach attempts to engage them. These data also contain information about the extent of program participation, including number of contacts, type of contacts, healthcare gaps that were identified and closed during the program, duration of participation, and detailed notes. These data are updated monthly.

The Optum eSync platform is used to aggregate the data noted above, link those data to each other, and quantify the need for healthcare services to close gaps in care and better coordinate care for AARP Medigap insureds. This platform helps program staff and participants to work together to determine which gaps in health care to address first, second, and so on. By doing so, staff help make sure program participants receive appropriate care from suitable providers, learn to take better care of themselves, and live healthier lifestyles.
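The linkage of separately stored sources described above can be sketched in a few lines. This is only an illustration and does not describe how the Optum eSync platform works internally. It assumes that every source carries a shared identifier, here given the hypothetical name `member_id`, and all field names and values are invented for the example.

```python
from collections import defaultdict


def link_sources(membership, medical_claims, pharmacy_claims, program_records):
    """Build one linked record per member from four separately stored sources."""
    linked = {
        row["member_id"]: {"membership": row, "medical": [], "pharmacy": [], "program": []}
        for row in membership
    }
    sources = (
        ("medical", medical_claims),
        ("pharmacy", pharmacy_claims),
        ("program", program_records),
    )
    unmatched = defaultdict(int)
    for name, rows in sources:
        for row in rows:
            record = linked.get(row["member_id"])
            if record is None:
                # No membership record found; count it for later data cleaning.
                unmatched[name] += 1
            else:
                record[name].append(row)
    return linked, dict(unmatched)


# Illustrative rows only; all fields are assumptions.
membership = [{"member_id": "M1", "plan": "F"}, {"member_id": "M2", "plan": "N"}]
medical = [
    {"member_id": "M1", "diagnosis": "diabetes"},
    {"member_id": "M9", "diagnosis": "hypertension"},
]
pharmacy = [{"member_id": "M2", "drug_class": "statin", "days_supply": 30}]
program = [{"member_id": "M1", "component": "MyCarePath", "contacts": 4}]

linked, unmatched = link_sources(membership, medical, pharmacy, program)
print(len(linked["M1"]["medical"]), unmatched)  # 1 {'medical': 1}
```

Tracking unmatched rows separately reflects the point that linkage and cleaning go together: a claim that cannot be tied to a membership record is a data quality signal rather than something to drop silently.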

Data storage requirements

Assembling the administrative, claims, and program data for the four million insureds with an AARP Medigap plan requires significant resources, including time and infrastructure. On average, about 13 million claims are submitted each month for Medicare Part A, Part B, and Part D services. These data are refreshed monthly and are managed in 131 tables, comprising nearly five terabytes of storage.

To put this in perspective, one terabyte can hold about 17,000 hours of music, 1000 hours of video, or 310,000 photos [21]. In addition, these data must be stored for over three years for auditing purposes. Another seven terabytes of space store the historical versions of these data files. To manage this volume, significant resources are spent on data acquisition, aggregation, linkage, cleaning, and analyses, as well as data infrastructure and security. Finally, these data need to be integrated to allow program staff to better coordinate care for AARP Medigap insureds, and to facilitate program management, reporting, evaluation, and continuous quality improvement.
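The figures quoted above allow some back-of-envelope arithmetic. The sketch below is a rough illustration only: it rounds "nearly five" terabytes to five, treats a terabyte as 10^12 bytes, and treats "over three years" as a 36-month lower bound, none of which is stated precisely in the text.

```python
TB = 10**12  # decimal terabyte in bytes (assumption; the text does not specify)

current_tb = 5              # "nearly five" terabytes, rounded up
historical_tb = 7           # historical versions of the data files
insureds = 4_000_000        # approximate number of AARP Medigap insureds
claims_per_month = 13_000_000
retention_months = 36       # "over three years", taken as a lower bound

total_tb = current_tb + historical_tb
bytes_per_insured = current_tb * TB / insureds
claims_retained = claims_per_month * retention_months

print(total_tb)                                # 12
print(f"{bytes_per_insured / 1e6:.2f} MB")     # 1.25 MB
print(f"{claims_retained:,}")                  # 468,000,000
```

Under these assumptions the current tables average a little over one megabyte per insured, and at least 468 million claims fall within the retention window.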

Analytics used to support population health and research

To get the right person to the right care at the right time, a robust analytics platform is required, including computer programs to manage clinical resources, assess health risks, and measure service quality. For example, Optum’s Evidence-based Management (Symmetry® EBM Connect®) software is used with medical claims, pharmacy claims, laboratory claims, and enrollment data to estimate compliance with current evidence-based best practices for the treatment of clinical conditions and the use of preventive services. Individuals who fail to meet best practice recommendations can be found and approached with an invitation to enroll in one of the population health program components to reduce gaps in care. Other Optum software, known as Optum™ ImpactPro™, uses medical claims, pharmacy claims, and demographic variables to proactively predict those who will have higher than average future healthcare usage. These individuals also may benefit from participation in the population health program.

In addition, medical claims, pharmacy claims, demographic variables, and survey data are used for program management and evaluation. Quarterly reports based on these data provide a comprehensive assessment of program performance. These reports show trends over time in socioeconomic factors; qualification for the program; participation rates; operational metrics describing program services; the quality of care received during program participation; satisfaction with program services; and several inpatient, outpatient, and other healthcare utilization and expenditure metrics. Many of the values of these metrics are compared to performance targets, so program staff and others can monitor performance during any particular quarter and view trends over several quarters, document positive and negative findings, and learn where improvements need to be made.

Quarterly reports are complemented by annual program evaluations that use advanced analytic methods to control for case mix differences between program participants and non-participants and thus generate program impact estimates. Finally, predictive modeling based upon personal characteristics is used to find individuals who are most likely to engage when invited to participate and to succeed once engaged. Individuals with higher predicted probabilities of program engagement and success receive higher priority in program outreach efforts. Our team refers to this predictive modeling-based process as “propensity to succeed analysis,” which has been described in more detail elsewhere [22].
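The outreach prioritization idea can be shown with a minimal sketch. It assumes that two probabilities have already been predicted for each candidate and combines them by multiplication. That combination rule, the field names, and the capacity limit are assumptions made for illustration; the actual propensity to succeed models are described in [22].

```python
def prioritize_outreach(candidates, capacity):
    """Rank candidates by a combined score and return the top `capacity` member IDs."""
    ranked = sorted(
        candidates,
        # Hypothetical score: chance of engaging times chance of succeeding once engaged.
        key=lambda c: c["p_engage"] * c["p_succeed"],
        reverse=True,
    )
    return [c["member_id"] for c in ranked[:capacity]]


# Illustrative predicted probabilities; not program data.
candidates = [
    {"member_id": "A", "p_engage": 0.8, "p_succeed": 0.5},  # score 0.40
    {"member_id": "B", "p_engage": 0.6, "p_succeed": 0.9},  # score 0.54
    {"member_id": "C", "p_engage": 0.3, "p_succeed": 0.4},  # score 0.12
    {"member_id": "D", "p_engage": 0.9, "p_succeed": 0.2},  # score 0.18
]
outreach_list = prioritize_outreach(candidates, capacity=2)
print(outreach_list)  # ['B', 'A']
```

In this example member D is the most likely to engage but ranks low because the predicted chance of success is small, which is the kind of trade-off a combined score makes visible.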


In the last several years, UnitedHealth Group has amassed a large data repository on individuals with AARP Medigap coverage. These data have been used to support a multi-component health program in several ways, as described next.

MyCarePath care coordination

Data were used to enhance engagement, monitor performance, and determine if participants had improved quality of care and reduced costs when compared with individuals who were eligible to participate but chose not to do so.

First, a survey and several focus groups were conducted, as described elsewhere [23]. The top reasons given for MyCarePath participation included perceptions that participation would be beneficial, convenient, and provided at no additional cost. The main reason given for not engaging was that no benefit from participation was perceived. Reasons given for disengaging included a lack of time, not believing MyCarePath to be helpful, or confusion regarding the services it would provide. A key finding from the focus groups was that individuals who felt they were not getting sufficient support from their medical providers, or who needed a sounding board, were more likely to participate.

Next, anecdotal data suggested that individuals who completed the HRAs were more likely to participate in MyCarePath. Recent multivariate modeling of these data showed that this was true even after adjusting for case mix differences between participants and non-participants. Those who completed the HRA were over twice as likely to participate when compared with individuals who had not completed an HRA. This modeling was also used to identify several other conditions and programmatic variables that significantly influenced participation.

The quarterly metrics report provides ongoing indicators of MyCarePath processes and performance. Through the assembly and integration of administrative, claims, pharmacy, and survey data, MyCarePath staff can determine whether participation rates meet targets, whether performance metrics are within desired bounds, whether participants are satisfied with the care they receive, and whether they report improved quality of life as a result of participation. Claims data are also used to gauge compliance with evidence-based medicine metrics. Results are shared and discussed with AARP Services, Incorporated (ASI) management and with external consultants. (ASI, a subsidiary of AARP, manages the provider relationships for and performs quality control oversight of the products and services that carry the AARP brand [24].) Ozminkowski and Serxner described these program reporting processes in a previous publication [25].

Figure 2 shows an example of how these metrics are summarized and interpreted. In this example, diabetes patients are being tracked on three quality metrics, including having an annual office visit with their physician, having required blood tests, and adhering to prescribed medications. “Target” represents the program’s goal, while the observed value is the metric value achieved during that quarter. As long as the observed value is within the range of the lower and upper tracking bounds, the metric’s status is considered “On track,” and no action is required. However, if the observed value falls outside of the tracking bounds, the metric may warrant investigation. A quarterly metric trend line provides a quick assessment regarding the trend’s history over the last several quarters.

Fig. 2

An example of a quarterly report that tracks quality metrics for patients with diabetes
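The status rule described for the quarterly report can be expressed as a short sketch. The metric names follow the diabetes example in the text, but every numeric value, the field names, and the "Investigate" label are hypothetical.

```python
def metric_status(observed, lower_bound, upper_bound):
    """Return 'On track' when the observed value lies within the tracking bounds."""
    if lower_bound <= observed <= upper_bound:
        return "On track"
    return "Investigate"  # hypothetical label for values outside the bounds


# Illustrative quarterly values for the three diabetes quality metrics.
quarterly_metrics = [
    {"name": "Annual office visit", "target": 0.90, "lower": 0.85, "upper": 0.95, "observed": 0.91},
    {"name": "Required blood tests", "target": 0.85, "lower": 0.80, "upper": 0.90, "observed": 0.78},
    {"name": "Medication adherence", "target": 0.80, "lower": 0.70, "upper": 0.85, "observed": 0.80},
]

statuses = {
    m["name"]: metric_status(m["observed"], m["lower"], m["upper"])
    for m in quarterly_metrics
}
for name, status in statuses.items():
    print(f"{name}: {status}")
```

Here only the blood test metric falls below its lower tracking bound, so it is the one that would be flagged for investigation.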

While these quarterly metric reports are timely and valuable, the data are aggregate statistics that are not adjusted for case mix differences between program participants and non-participants. Scientifically rigorous program evaluations were made possible by aggregating the data from sources mentioned above. These evaluations use propensity score weighting and other multivariate modeling techniques and are periodically conducted to account for case mix differences. The program evaluations showed that these population health programs have improved the quality of health care over the three-year period from 2009 to 2011. The evaluations of four of the AARP population health management program components are described next.
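The propensity score weighting mentioned above can be illustrated with a minimal sketch. It assumes that each person's propensity score (the modeled probability of participating) has already been estimated, and it uses inverse probability weighting, which is one common scheme; the text does not state which weighting scheme the evaluations used. All records and field names are hypothetical.

```python
def weighted_outcome_difference(records):
    """Weighted mean outcome of participants minus that of non-participants."""
    totals = {True: [0.0, 0.0], False: [0.0, 0.0]}  # [sum of w * y, sum of w]
    for r in records:
        p = r["propensity"]
        # Inverse probability weights: 1/p for participants, 1/(1 - p) otherwise.
        weight = 1.0 / p if r["participant"] else 1.0 / (1.0 - p)
        totals[r["participant"]][0] += weight * r["outcome"]
        totals[r["participant"]][1] += weight
    mean_participants = totals[True][0] / totals[True][1]
    mean_non_participants = totals[False][0] / totals[False][1]
    return mean_participants - mean_non_participants


# Hypothetical records; outcome 1 could stand for a readmission within 30 days.
records = [
    {"participant": True, "propensity": 0.50, "outcome": 1},
    {"participant": True, "propensity": 0.25, "outcome": 0},
    {"participant": False, "propensity": 0.50, "outcome": 0},
    {"participant": False, "propensity": 0.75, "outcome": 1},
]
difference = weighted_outcome_difference(records)
print(round(difference, 3))  # -0.333
```

The weights give more influence to participants who looked unlikely to participate and to non-participants who looked likely to, so the two weighted groups resemble each other more closely on the characteristics in the propensity model.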

In a recent evaluation of the AARP high-risk care coordination program component [16], increased duration in care coordination was associated with fewer hospital readmissions, and participants were significantly more likely to have recurring office visits and recommended laboratory tests. A similar evaluation showed improved quality of care and savings for those who participated in the emergency room decision support component [15]. Results indicated that participants reduced their emergency room visits by 1299 visits per 1000 insureds, compared with a reduction of 1121 visits per 1000 insureds for the non-engaged individuals. Participants also had an incremental decrease of 53 admissions per 1000 insureds, compared with non-participants. An evaluation of the Nurse HealthLine was also conducted [13], focusing on its nurse-provided telephone-based triage service. The evaluation found that 55 % of callers to the Nurse HealthLine were adherent, implying that they were more likely to get the right care at the right time.
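The emergency room figures reported above imply an incremental reduction that can be worked out directly. The sketch below simply subtracts the two reported values; it is an arithmetic illustration and not a re-analysis of the evaluation in [15].

```python
participant_reduction = 1299      # ER visits reduced per 1000 insureds, participants
non_participant_reduction = 1121  # same measure, non-participants

# Incremental reduction attributable to the comparison between the two groups.
incremental_per_1000 = participant_reduction - non_participant_reduction
# Size of that increment relative to the non-participant reduction.
relative_gain = incremental_per_1000 / non_participant_reduction

print(incremental_per_1000)    # 178
print(f"{relative_gain:.1%}")  # 15.9%
```

In other words, the reported reduction among participants was 178 visits per 1000 insureds larger than among non-participants, or roughly 16 % more.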

Finally, an internal evaluation of the depression management component of the population health program, similar in design to the high-risk care coordination evaluation previously mentioned, found reduced depression symptoms in 59 % of those who were engaged in the depression management feature. Although the difference was not statistically significant, engaged individuals were also about 40 % less likely to be readmitted within 30 days of a previous hospitalization.

Evaluations have also identified a few pilot program features that were not as successful as the above-mentioned components and have therefore been modified. These include disease management services for members with coronary artery disease, congestive heart failure, and diabetes, as well as another telephone-based case management feature. The disease management program components were designed to help coordinate care for specific diseases, while the telephonic case management component was designed for individuals whose medical claims data predicted higher than average future healthcare expenditures, but who were healthier than those targeted for high-risk care coordination. To provide a more meaningful population health experience, the disease management, depression management, telephonic case management, and high-risk care coordination components were merged into the MyCarePath program in 2014. Individuals already participating in these components were transferred to MyCarePath, while newly identified individuals who would previously have qualified for these components were offered MyCarePath participation.


This paper describes how big data have successfully been used to support a population health management program for individuals who are 65 years of age or older and have fee-for-service Medicare coverage with additional Medicare Supplement (Medigap) insurance. As described above, this program would be much less effective without the development and integration of numerous and disparate data sources and the computational ability to manage and analyze these data.

Establishing, monitoring, and evaluating population health programs rely on big data. Big data are emerging as an important resource in other parts of the healthcare industry, as well. For example, 59 % of respondents to the Big Data Cure survey of 150 federal healthcare and healthcare research executives said they believed that fulfilling their agency’s objectives within the next 5 years depends on leveraging big data [26]. In this report, 57 % of survey respondents also stated that the use of mobile and wireless devices will improve patient care, and 53 % planned to utilize machine-to-machine technologies (i.e., technologies used to collect, monitor, or store healthcare information without human intervention) within the next 2 years.

This paper describes a number of data sources used to deliver a population health program for older adults with Medigap coverage, yet there are still other big data sources to be explored, as illustrated by applications outside our own work. Recent technologies, such as genomics and proteomics, have led to exponential growth in medical knowledge but also produce vast amounts of data [27]. For example, the University of Pittsburgh Medical Center conducts genomic research that is used to facilitate early detection, diagnosis, and intervention for a multitude of disorders in newborns, as well as to diagnose and treat cancer.

Other types of data can be more fully utilized as well. For example, the medical staff at Duke University has now linked electronic health records with information from a geographic information system. This allows Duke researchers and clinicians to select, visualize, and predictively study groups of patients with a healthcare issue of interest in a geographic map in real time [28]. In another example, the American Society of Clinical Oncology is developing software to collect disparate electronic health record data from cancer patients in order to develop a big clinical database intended to speed learning about treatment options, support efforts to improve quality of care, and hasten the development of new medications [29]. Finally, although most data are currently generated by humans, there is considerable interest in the proliferation of data generated by sensors or intelligent devices, such as smartphones, as this is projected to be a major data source in the next 10 years [30]. These advances pose exciting opportunities and challenges for analyses that can improve population health programs.

One might characterize the development of population health programs so far in terms of a phrase such as ‘let’s do the best we can with the data we are able to collect from within the confines of the health or healthcare domain, then expand from there.’ Limiting data development, aggregation, analytic, reporting, and evaluation work to these data sources has in and of itself been a huge effort with big data. With the AARP population health management effort, about 20 full-time staff are involved, but over 100 more have contributed time to the endeavor since 2007, when the population health management program efforts began. This does not count the hundreds of staff who process and pay Medigap claims or perform other insurance functions.

The next frontier will involve efforts to better integrate social services with healthcare services and population health programs. Doing this successfully may improve individuals’ access to the medical care and social service providers they need and help reduce healthcare and social service costs. Currently, Medicare will not pay for social services from non-medical care providers, and a literal act of Congress would be required for it to do so. Progress has been slow, but the need to integrate health and social services much more closely has been noted in the popular press and discussed in many peer-reviewed articles, such as Dunn and Hayes [6].

The big data implications of integrating healthcare data with other forms of data may be staggering. Perhaps the most notable example is the integration of social service, tax, and other data needed to run the state and federal health insurance exchanges established under the Affordable Care Act (Obamacare), which required hundreds of staff from many federal and state agencies.

As new types of data are integrated into population health programs, data security and privacy will become an even bigger concern. Big data from smartphones, wearable devices, and other devices can greatly enhance the capabilities of population health programs, but the risk of data theft increases with each new type of device that generates private information. For example, the U.S. Government Accountability Office reported that during 2011 the number of different malicious software programs targeting mobile devices increased by 185 % [31].

It is important to address security issues at each contributing data source in order to protect private information. The larger the perceived risk of contributing data from multiple sources, the less likely that patients and others will be willing to do so, offsetting some of the potential health gains associated with big data aggregation and analysis. It is worth mentioning that smartphone or other device data are not being collected to aid in the management or evaluation of the AARP programs at this time. The utility, safety, and legal ramifications of doing this require further evaluation. In the future, if collecting these data can be shown to substantially improve care coordination and quality of life, then the use of these data sources may be pursued.


As illustrated here, a population health program can be used to help older adults live healthier lives. The management of population health programs has benefitted from recent advances in big data; here we described how big data were successfully leveraged to execute a comprehensive population health program for AARP Medigap insureds age 65 or older. In addition, data generated from these programs have been used to routinely monitor program performance and to conduct in-depth program evaluations. New technologies will further enhance the opportunity to improve population health and social service programs through the use of big data.


ASI, AARP Services, Incorporated; CAHPS, Consumer Assessment of Healthcare Providers and Systems; ER, emergency room; ERDS, Emergency Room Decision Support; HRA, health risk appraisal


  1. James P, Kuzel A, Thompson B, Davis A, Grumbach K. Evolving perspectives on population health management. Ann Fam Med. 2014;12:481–2.

  2. Kindig D, Stoddart G. What is population health? Am J Public Health. 2003;93:383.

  3. Chen EH, Bodenheimer T. Improving population health through team-based panel management: comment on “Electronic medical record reminders and panel management to improve primary care of elderly patients”. Arch Intern Med. 2011;171:1558–9.

  4. Clancy CM. What is Health Care Quality and Who Decides. In Committee on Finance Subcommittee on Health Care United States Senate, Agency for Health Care Research and Quality, U.S. Department of Health and Human Services. 2009. Accessed 27 Sep 2014.

  5. Centers for Disease Control and Prevention. Healthy People 2020 Accessed 11 Nov 2015.

  6. Dunn JR, Hayes MV. Social inequality, population health, and housing: a study of two Vancouver neighborhoods. Soc Sci Med. 2000;51:563–87.

  7. Hartman C. Healthcare’s growing data opportunity. Leveraging clinical intelligence to elevate population health management strategies. Health Manag Technol. 2014;35:24.

  8. Centers For Medicare & Medicaid Services. Medicare & You. 2014. Accessed 16 Jun 2014.

  9. Centers For Medicare & Medicaid Services. Medicare Enrollment Reports. 2010. Accessed 16 Jun 2014.

  10. The Boards of Trustees Federal Hospital Insurance and Federal Supplementary Medical Insurance Trust Funds: The 2015 Annual Report of the Boards of Trustees of the Federal Hospital Insurance and Federal Supplementary Medical Insurance Trust Funds. 2015. Accessed 10 Nov 2015.

  11. The Henry J. Kaiser Family Foundation: Medicare Prescription Drug Plans: Stand Alone PDP Enrollees as a Percent of Total Medicare Population. Accessed 11 Nov 2015.

  12. AHIP. Center for Policy and Research: Beneficiaries with Medigap Coverage. 2015. Accessed 29 Apr 2015.

  13. Navratil-Strawn JL, Hawkins K, Wells TS, Ozminkowski RJ, Hawkins-Koch J, Chan H, Hartley SK, Migliori RJ, Yeh CS. Listening to the nurse pays off: an integrated Nurse HealthLine programme was associated with significant cost savings. J Nurs Manag. 2013;22:837–47.

  14. MacConnel E. UnitedHealthcare is introducing a new wellness initiative. Total Benefits Solutions Inc. 2014. Accessed 17 Mar 2015.

  15. Navratil-Strawn JL, Hawkins K, Wells TS, Ozminkowski RJ, Hartley SK, Migliori RJ, Yeh CS. An emergency room decision-support program that increased physician office visits, decreased emergency room visits, and saved money. Popul Health Manag. 2014;17:257–64.

  16. Hawkins K, Parker PM, Hommer CE, Bhattarai GR, Huang J, Wells TS, Ozminkowski RJ, Yeh CS. Evaluation of a High-Risk Case Management Pilot Program for Medicare Beneficiaries with Medigap Coverage. Popul Health Manag. 2014;18:93–103.

  17. Hays RD, Shaul JA, Williams VS, Lubalin JS, Harris-Kojetin LD, Sweeny SF, Cleary PD. Psychometric properties of the CAHPS 1.0 survey measures. Consumer Assessment of Health Plans Study. Med Care. 1999;37:22–31.

  18. Hargraves JL, Hays RD, Cleary PD. Psychometric properties of the Consumer Assessment of Health Plans Study (CAHPS) 2.0 adult core survey. Health Serv Res. 2003;38:1509–27.

  19. KBM Group. AmeriLINK® Data Sourcing. Accessed 2 Aug 2013.

  20. Wennberg JE, Fisher ES, Skinner JS. Geography and the debate over Medicare reform. Health Aff (Millwood). 2002; Suppl Web Exclusives:W96-114.

  21. Foo M. How much can a 1 TB external hard drive hold? 2012. Accessed 2 Oct 2014.

  22. Hawkins K, Ozminkowski RJ, Mujahid A, Wells TS, Bhattarai GR, Wang S, Hommer CE, Huang J, Migliori RJ, Yeh CS. Propensity to succeed: Prioritizing individuals most likely to benefit from care coordination. Popul Health Manag. 2015 [Epub ahead of print].

  23. Hawkins K, Wells TS, Hommer CE, Ozminkowski RJ, Richards DM, Yeh CS. Factors driving engagement decisions in care coordination programs. Prof Case Manag. 2014;19:216–23.

  24. Bagley M. Lawrence Flanagan Named President and CEO Of AARP Services, Inc. AARP Press Center. 2014. Accessed 24 Mar 2015.

  25. Ozminkowski RJ, Serxner S. Tell the right story with your program reporting processes. In Corporate Wellness Magazine. 2012. Accessed 16 Dec 2014.

  26. MeriTalk and EMC. The Big Data Cure. 2014. Accessed 6 Oct 2014.

  27. Howe D, Costanzo M, Fey P, Gojobori T, Hannick L, Hide W, Hill DP, Kania R, Schaeffer M, St Pierre S, et al. Big data: The future of biocuration. Nature. 2008;455:47–50.

  28. Ericson J. Duke Medicine’s Big Data Plan to Improve Population Health. 2014. Accessed 6 Oct 2014.

  29. Winslow R. ‘Big Data’ for Cancer Care. Vast Storehouse of Patient Records Will Let Doctors Search for Effective Treatment. In Wall Street Journal. 2013. Accessed 6 Oct 2014.

  30. Bean R. If you think big data’s challenges are tough now… In Big Idea: Data & Analytics Blog. 2015. Accessed 29 Jan 2015.

  31. U.S. Government Accountability Office. Better Implementation of Controls for Mobile Devices Should Be Encouraged. GAO-12-757. 2012. Accessed 14 Apr 2015.


The authors thank Stephanie J. MacLeod MS for her editorial assistance and critical review of this manuscript.

Availability of supporting data

Internal data were used; these data cannot be released to the public.

Authors’ contributions

TSW, KH, and RJO contributed to the concept and design of the study. TSW, RJO, GRB, KH, and DGA wrote and revised the manuscript. All authors read and approved the final manuscript.

Competing interests

This work was funded by the Medicare Supplement Health Insurance Program. Ronald Ozminkowski, Kevin Hawkins, Timothy Wells, Gandhi Bhattarai, and Stephanie MacLeod are all employed by UnitedHealth Group and own stock in UnitedHealth Group. Douglas Armstrong is employed by AARP Services, Inc. Their compensation was not dependent upon the results obtained in this research, and they retained full independence in the conduct of this research. The authors declare that they have no competing interests.

Author information

Corresponding author

Correspondence to Timothy S. Wells.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Wells, T.S., Ozminkowski, R.J., Hawkins, K. et al. Leveraging big data in population health management. Big Data Anal 1, 1 (2016).
