Developing an Agenda for Population Aging and Social Research in Low- and Middle-Income Countries (LMICs): Proceedings of a Workshop (2024)

Chapter: 6 Use of Existing Data

Suggested Citation: "6 Use of Existing Data." National Academies of Sciences, Engineering, and Medicine. 2024. Developing an Agenda for Population Aging and Social Research in Low- and Middle-Income Countries (LMICs): Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/27415.

6

Use of Existing Data

There is a great deal of existing data that could be leveraged to improve our knowledge of aging in low- and middle-income countries (LMICs), said David Weir (University of Michigan, planning committee member), who moderated the fifth session of the workshop. This session focused on identifying data in LMICs that may be of interest for examining life-course trajectories of development and aging, including early-life prospective data or retrospective data from current older cohorts; data linkages; or leveraging existing cohort studies established for other nonaging purposes. Presenters had been given a series of questions to guide their comments:

What administrative linkages should we consider to enhance the utility of existing data?
How can we continue to foster data sharing and deal with data sharing issues?
What existing cohort studies might be used as samples for longitudinal studies of aging?
What sampling frames (including administrative data, as well as censuses, etc.) are available to lower the cost of finding potential participants?

LONGITUDINAL HOUSEHOLD SURVEYS

The Living Standards Measurement Study (LSMS) is the World Bank’s flagship household survey program, said Amparo Palacios-López (World Bank), who participated virtually. It was created in 1980 in response to a perceived need for policy-relevant data for measuring poverty; it has since incorporated measures on employment, health, and other indicators. The objective, she said, is to allow policy makers to understand the determinants of these outcomes. The LSMS program has two components: one is focused on measuring living standards and the other is focused on studying the measurements themselves. LSMS supports countries in understanding the living standards of their populations by offering technical assistance and advisory services on all stages of the survey life cycle and by creating capacity within research institutes in client countries. LSMS engages in the study of measurement by conducting research on survey methods and by producing guidelines on best practices.¹

Palacios-López explained that there are three ways that LSMS works with client countries: LSMS-led, LSMS-advised, and LSMS-style. In an LSMS-led process, LSMS is an equal partner with the country and works to design and implement national-level methods under specific LSMS initiatives. In an LSMS-advised process, the LSMS team provides various levels of technical assistance to countries, such as help with revising a questionnaire. In an LSMS-style process, a country uses free LSMS materials with no team involvement: countries follow the guidelines that LSMS has produced and implement their own living standards surveys, said Palacios-López.

The main focus of LSMS has been LMICs, said Palacios-López, with a particular focus on Africa. One specific effort in African countries is the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA).² This is a unique system of longitudinal surveys designed to improve the understanding of household and individual welfare, livelihoods, and smallholder agriculture in Africa. The program has been in operation for more than 15 years, and it has focused on three different work streams. The first stream is data production. LSMS supports the design, implementation, and dissemination of country-owned, multitopic, national panel household surveys. The main difference between these and other similar surveys, said Palacios-López, is that they are panel surveys: the same households or individuals are followed over time.

___________________

¹ https://www.worldbank.org/en/programs/lsms/overview

² https://www.worldbank.org/en/programs/lsms/initiatives/lsms-ISA


The second work stream of LSMS-ISA is focused on methods and tools. In this effort, LSMS and partner countries work to improve methods and tools for survey data collection and analysis through field experiments and rigorous research and development. Palacios-López shared an example of a methodological experiment conducted in Malawi aimed at finding ways to measure time use. Researchers gave participants smartphones to record activities over the course of the day; many participants could not read or write but were able to use images to create a time-use record. The third work stream of LSMS-ISA is conducting and promoting research to inform evidence-based development policies.

LSMS has been working with national statistical offices in eight partner countries: Burkina Faso, Ethiopia, Malawi, Mali, Niger, Nigeria, Tanzania, and Uganda. Together, these countries make up 45% of the population of sub-Saharan Africa, Palacios-López noted. The surveys are integrated into the national statistical systems and implemented by the national statistical offices. They track households and individuals and are representative at the national and regional levels. The surveys use a specific agricultural model designed to collect as much information as possible to inform agricultural policy. Surveys are georeferenced at the household level and the plot level. Computer-assisted personal and telephone interviewing are used to conduct the survey, and all of the data are publicly available.

Palacios-López shared some numbers on the impact of LSMS-ISA:

33 surveys,
160,000+ household interviews,
81,000+ dataset downloads,
6,400+ total publications,
3,000+ total citations,
20+ guidebooks, and
1,400+ guidebook downloads.

More specifically, Palacios-López discussed some selected data from the LSMS surveys. Using surveys from Nigeria and Tanzania, she drew on harmonized longitudinal data and selected the four oldest cohorts. She noted that while the data are representative of each country’s population in general, they may not be representative of the older population specifically. In addition, the surveys do not contain indicators specific to aging or older populations, but there are many indicators that may be relevant. She looked at data on employment status, type of work, health care access, and functional limitations. As expected, employment wanes as individuals age, but older adults remain quite active. In Nigeria, the majority of people 40 and older are employed; of those who are employed, the oldest age group (70+) had the largest share of individuals working in agriculture. Palacios-López noted that while agriculture is a physically demanding job, it is also one that provides subsistence, so it may be used as “insurance” when things do not go well in other sectors. Survey questions about health care found that needs for and access to health care providers rose along with respondents’ ages. In addition, the oldest cohort showed a much higher share of individuals reporting functional limitations.

Longitudinal studies are critical for studying different indicators over time, said Palacios-López. They can provide information on the evolution of different cohorts and support the design of policies that target the aging population. LSMS panel data are a good example of this type of survey, with rich information on sociodemographics. If the LSMS were to be used for aging-specific research, it would be useful to ensure that the sample is representative of older populations and to add modules that are relevant to aging populations in LMICs.

DATA LANDSCAPE IN KENYA

The African population is aging very rapidly, said Anthony Ngugi (The Aga Khan University). In 2020, less than 6% of the population was older than 60. By 2050, this number is projected to increase to around 15%, with the number even higher in some nations. At the same time as this demographic transition is occurring, there is an equally rapid epidemiologic transition with an increase in chronic disease and disability. These transitions are happening, he said, in an area in which there are scarce population-level data on critical domains of aging, such as health, mental health, climate vulnerability, and economic well-being. Data are needed to inform responses to the unique health and socioeconomic challenges that will emerge as the population ages.

There are several likely sources of data that could be explored for evidence related to aging. These include Health Management Information Systems (HMISs); national or regional surveys (e.g., health surveys); longitudinal population studies; and health and demographic surveillance systems (HDSSs), including the use of bureaus of statistics’ sampling frames for additional population data collection. Ngugi discussed each of these in turn, using Kenya as an example.

The HMIS collects data for a specific service or demographic in a facility-based register, and at the end of the month data are aggregated for submission to a higher HMIS office (e.g., subnational level). From each of these offices, data are aggregated again to be transmitted to the national HMIS. Ngugi emphasized that most individual-level data are left at the health facility, usually in paper forms, and only aggregate data are transmitted upward. There is no register for capturing information from the geriatric population since no specific service is designated for this group, said Ngugi. “With really intense effort” it may be possible to extract data on older populations from other registers, such as inpatient or outpatient registers (for people over 5 years old). In addition to this limitation, HMIS suffers from poor availability of data, poor data quality, and low capacity for processing and using data for decision making. A study on the collection of maternal and child health data found that only about 6% of counties had good reporting of deliveries, and only about 26% had good reporting of outpatient visits. Overall, said Ngugi, data quality is low, with incomplete and inconsistent information. HMIS data present “really serious challenges” to their potential utility for informing studies of health and aging, he said.
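The reporting flow described above can be illustrated with a small Python sketch. The facility names, register labels, and ages are invented for illustration and do not come from any real HMIS; the sketch only shows why counts that have been aggregated by register can no longer be broken down by age once they leave the facility.

```python
from collections import Counter

# Hypothetical facility register rows: (facility, register, age in years)
facility_records = [
    ("Facility A", "outpatient_over_5", 72),
    ("Facility A", "outpatient_over_5", 34),
    ("Facility A", "inpatient", 68),
    ("Facility B", "outpatient_over_5", 81),
    ("Facility B", "outpatient_over_5", 19),
]

def monthly_aggregate(records):
    """Count visits per (facility, register); individual ages are dropped here."""
    return Counter((facility, register) for facility, register, _age in records)

def roll_up(facility_counts):
    """Sum facility counts per register, as a higher-level office would."""
    totals = Counter()
    for (_facility, register), n in facility_counts.items():
        totals[register] += n
    return totals

facility_counts = monthly_aggregate(facility_records)
national_counts = roll_up(facility_counts)

# Answerable only from the individual-level rows held at the facility,
# not from facility_counts or national_counts.
visits_60_plus = sum(1 for _f, _r, age in facility_records if age >= 60)

print(dict(national_counts))
print(visits_60_plus)
```

In this toy example the national totals contain no age field at all, so a question about people aged 60 and older would require going back to each facility's register.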

National or regional surveys are another potential source of data for studies on aging. For example, Kenya conducts the National Population & Housing Census every 10 years; it collects information on the population classified by geographical units, age, sex, socioeconomic status (SES), and other parameters. One of the most important outputs of the census, he said, is that it informs the generation of nationally representative sampling frames. The last census in Kenya led to the generation of close to 6,000 sampling clusters of about 100 households each across the country. This enabled nationally representative surveys to be developed, including demographic and health surveys. These surveys collect household information about housing, SES, HIV, chronic disease, and other measures. Thus, the census sampling frame might be a basis for implementing nationally representative surveys of aging, said Ngugi. Another promising source of data is the registry of beneficiaries for the National Safety Net Program. This social protection program began 3 years ago and targets vulnerable children and adults over 70, with plans to reduce the eligibility age to 65. Recipients are identified and registered through an intensive grassroots effort, and their information is held at the Department of Social Protection. This registry, said Ngugi, could be a source for identifying potential participants for aging studies.

Longitudinal population studies have an important place in aging research, although there are currently only two in Kenya specifically on aging: the Health and Aging in Africa: A Longitudinal Study in South Africa (HAALSI) and the Longitudinal Study of Health and Aging in Kenya (LOSHAK).³ Ngugi noted that other longitudinal studies in Kenya are disease- or condition-specific, for example, the Network for Analysing Longitudinal Population-based HIV/AIDS data on Africa⁴ and the H3Africa Consortium⁵ on genomics.

HDSSs are platforms that continuously track the demographics and health of large, well-defined cohorts. These exist in several LMICs in Africa and Asia and provide robust infrastructure resources for nesting longitudinal population studies, said Ngugi. One of the challenges of HDSSs, however, is that they are not nationally representative. In addition, the characteristics of participating populations can change over time toward improved health outcomes; this change gradually makes the sample less similar to the broader population.

Given the drawbacks of these data sources for studying aging, Ngugi asked whether there is potential for linking data across them. Unfortunately, he said, there is “limited feasibility” of linking, due primarily to the limited potential for harmonizing. There are no unique identifiers used across databases, as most sources are designed by different stakeholders. Moreover, as he said earlier, the data in many sources, particularly HMISs, are of poor quality in terms of availability, completeness, timeliness, and consistency. Using statistical methods to link data sources is challenging because respondent identification information (e.g., names, dates of birth, locality) is recorded in nonstandard ways across surveys, even in the same population.
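The linkage problem can be made concrete with a minimal Python sketch. It is an illustration only, not a method attributed to any speaker: the records, field names, weights, and threshold are all invented assumptions, and production linkage typically relies on a formal probabilistic model such as the Fellegi-Sunter framework rather than the ad hoc score used here. The sketch shows how an exact comparison of name, birth year, and locality fails when spelling and dates differ slightly, while a similarity score can still propose a candidate match.

```python
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences vanish."""
    return " ".join(text.lower().split())

def exact_match(record, candidates):
    """Deterministic linkage: succeeds only if every field agrees exactly."""
    key = (normalize(record["name"]), record["birth_year"], normalize(record["locality"]))
    for cand in candidates:
        cand_key = (normalize(cand["name"]), cand["birth_year"], normalize(cand["locality"]))
        if cand_key == key:
            return cand
    return None

def score(record, cand):
    """Ad hoc similarity in [0, 1]; the weights are arbitrary assumptions."""
    name_sim = SequenceMatcher(None, normalize(record["name"]), normalize(cand["name"])).ratio()
    year_ok = 1.0 if abs(record["birth_year"] - cand["birth_year"]) <= 1 else 0.0
    place_ok = 1.0 if normalize(record["locality"]) == normalize(cand["locality"]) else 0.0
    return 0.6 * name_sim + 0.2 * year_ok + 0.2 * place_ok

def link(record, candidates, threshold=0.85):
    """Return the best-scoring candidate, or None if it falls below the threshold."""
    best = max(candidates, key=lambda c: score(record, c))
    return best if score(record, best) >= threshold else None

# Invented example records: the same person recorded differently in two sources.
survey_record = {"name": "Wanjiru Kamau", "birth_year": 1950, "locality": "Kilifi"}
register = [
    {"id": "b1", "name": "WANJIRU  KAMAO", "birth_year": 1951, "locality": "kilifi "},
    {"id": "b2", "name": "Otieno Odhiambo", "birth_year": 1950, "locality": "Kisumu"},
]

print(exact_match(survey_record, register))  # no exact agreement on all fields
print(link(survey_record, register))         # fuzzy score proposes a candidate
```

Any threshold trades false matches against missed matches, which is one reason the poor completeness and consistency described above make statistical linkage hard in practice.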

Despite these challenges, said Ngugi, there are many stakeholders working on data in Africa who recognize these problems and are taking active steps to address them. One initiative is the African Population Cohort Consortium, funded by the Wellcome Trust. It is designed to bring together longitudinal population studies that track the health of large groups of people and to develop resources where gaps exist. When this project is mature, it will provide a pan-African network that can be used to host longitudinal population studies in the region.

Another promising project is the Implementation Network for Sharing Population Information from Research Entities (INSPIRE Network);⁶ this project is focused on data harmonization and works to empower data producers to collect data that are sharable. In addition, the INSPIRE Network works with data users and provides tools for data discoverability. Researchers and other stakeholders who have an interest in longitudinal studies of health and aging should consider engaging with these initiatives, said Ngugi. Also, there is tremendous potential to leverage existing data platforms in order to further research on aging. For example, the National Sample Survey and Evaluation Programme that is generated out of each census has been utilized for national surveys on conditions such as AIDS or malaria; researchers on aging could do likewise. LOSHAK is using the census frames to identify participants for national data collection, collaborating with the Kenya National Bureau of Statistics on this effort. In addition, researchers can explore the potential to piggyback on national surveys that are conducted on a regular basis by adding modules on aging.

___________________

³ https://sites.google.com/umich.edu/loshak/home and https://academic.oup.com/innovateage/article/7/Supplement_1/1155/7490267

⁴ https://academic.oup.com/ije/article/45/1/83/2363877

⁵ https://h3africa.org/

⁶ https://aphrc.org/project/inspire-implementation-network-for-sharing-population-information-from-research-entities/

Ngugi closed by identifying some of the data sharing considerations and challenges in Africa. The data governance landscape has shifted dramatically over the last several years, particularly since the passage of data protection regulations in Europe in 2016. The number of countries in Africa with comprehensive data protection laws has more than doubled, from about 15 to more than 30. Kenya enacted the Kenya Data Protection Act in 2019; this act made a number of important changes:

enhanced the constitutional provisions on right to privacy;
established the Office of the Data Protection Commissioner;
regulated the processing of personal data;
provided for the rights of data “subjects” and obligations of data “controllers” and “processors”; and
outlined stiff penalties for noncompliance.

Many institutions in Kenya are currently establishing frameworks to ensure compliance, said Ngugi. Although new data collection initiatives need to complete elaborate Data Protection & Privacy Impact Assessments (DPPIAs), research data appear to be exempt from most provisions if they meet certain standards: subjects give specific and adequate consent, data are sufficiently anonymized, personal data are not transferred outside the country, and a DPPIA is completed and submitted. These new regulations are restrictive, said Ngugi, but not prohibitive. It is critical for all stakeholders, including international partners, to work with local stakeholders to navigate these data governance regulations.

KEY CHALLENGES IN DATA

Andrew Steptoe (University College London) discussed the key data challenges and opportunities that other presenters identified, focusing on 3 questions:

How can we use existing cohort studies of younger people in LMICs for aging research?
How robust are retrospective assessments of early- and midlife experience?
What are the challenges of administrative data linkage and sharing?

There are a number of existing cohort studies that could be leveraged to study aging in LMICs, said Steptoe. The Health and Retirement Study (HRS) International Family of Studies began in higher-income countries, but there has been growing involvement of lower-income countries, including Mexico, Brazil, South Africa, China, and Kenya. Studies that focus on younger cohorts may not be immediately relevant to aging research but could be important in the future as cohorts age. These are often not nationally representative but tend to be quite representative in terms of SES. These studies include the Pelotas Birth Cohort study in Brazil, the Cebu Longitudinal Health and Nutrition Study in the Philippines, the Guatemalan Survey of Family Health, and the New Delhi Birth Cohort Study, among others. In addition, the World Bank Living Standards Measurement Study program, as discussed by Palacios-López, has impressive data on demographics, standards of living, and work across age cohorts.

Steptoe noted several challenges with repurposing general studies for aging research. First, attrition can be high in longitudinal studies, particularly as people age. Second, studies that were designed for one purpose (e.g., child health) may lack the measures needed to study aging—for example, health behaviors, cognitive changes, or loneliness. Third, it is critical to obtain consent from participants for further contact once the initial study is completed. Failing to do this can cause challenges, and newer data privacy laws have made it very difficult to reestablish contact with participants.

Next, Steptoe discussed the robustness of retrospective assessments of early- and midlife experiences. This is a crucial issue for aging studies, he said, because they often depend on this kind of information. Factors such as early-life SES, adverse childhood experiences, reproductive history, occupational history, and health behaviors in midlife are typically assessed with retrospective life history questionnaires; examples include the Survey of Health, Ageing and Retirement in Europe (SHARE), the Health and Retirement Study (HRS), the China Health and Retirement Longitudinal Study (CHARLS), and the English Longitudinal Study of Ageing (ELSA). Life history questionnaires need to be presented with care, said Steptoe, because it can be difficult for some older people to recall the exact timing of different experiences earlier in their lives. Some experiences are easier to recall than others. Marital history, number of children, and occupations are likely to be accurate, but things like body weight or physical activity patterns are very difficult for people to recall retrospectively. There have been several efforts to compare concurrent and retrospective accounts. One such effort found that children whose parents reported at the time that they experienced chronic health conditions and financial hardship did not always remember these realities when asked for a retrospective account later in life. Steptoe said that this is not surprising, given that children may not be particularly aware of their health challenges or their family’s financial problems, but it does highlight a need for caution in looking at these types of data (Smith, 2009).

Finally, Steptoe touched on the challenges of administrative data linking and sharing. Administrative data provide a great source of information, including records from social security, Medicare-type programs, employer-provided pension plan information, national death records, hospital admissions, and outpatient consultations. The feasibility of linking to these sources varies greatly across countries, he said, and in the United Kingdom and Europe linkage is becoming more restricted. Data privacy laws have made stakeholders “very nervous” about sharing with researchers in other countries, making the development of widely available data linkage a serious challenge. This is a critical issue, he said, that needs to be resolved in many countries, and it needs to be resolved at a higher level rather than on a case-by-case basis.

DISCUSSION

The speakers in this session, said Minki Chatterji (National Institute on Aging), presented a number of opportunities for leveraging existing data and also a number of challenges associated with using these data. Some of these opportunities seem like “low-hanging fruit,” but she emphasized the importance of focusing on the most important research questions to answer and working backward toward the data. It can be easy to get “mired” in all the data, but data are not helpful if they are not aligned with the priorities of the research agenda. For example, she said, existing cohort studies present an enormous opportunity for aging research, but it is critical to think about the questions in order to pinpoint the data that are relevant and important. Another major opportunity is triangulating sets of data in order to answer research questions. However, she noted, data sharing remains a big challenge that needs to be addressed.

Ngugi agreed that data sharing is a big challenge, but he said that there are solutions that can move research forward. For example, he and his colleagues are using data science approaches that allow researchers to analyze datasets without holding or viewing the data themselves (e.g., federated analyses, or generating synthetic datasets out of the original data). These solutions are not “100%,” he said, but they may be useful in a context in which data sharing is difficult or impossible.
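As a generic illustration of the federated idea (not a description of the tools Ngugi and his colleagues use), the Python sketch below has each site compute only a count, a sum, and a sum of squares, and a coordinator combine those summaries into a pooled mean and standard deviation. The site values and all names are invented assumptions; real federated platforms add authentication, disclosure checks, and more numerically stable algorithms.

```python
from dataclasses import dataclass
import math

@dataclass
class SiteSummary:
    """Aggregate statistics a site is willing to release; no row-level data."""
    n: int
    total: float
    total_sq: float

def summarize(values):
    """Runs inside each site, on data that never leave it."""
    return SiteSummary(len(values), sum(values), sum(v * v for v in values))

def pooled_mean_sd(summaries):
    """Runs at the coordinator, which sees only the summaries."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sums; simple, but less numerically stable than
    # streaming alternatives when values are large and close together.
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return mean, math.sqrt(variance)

# Invented ages held separately at two hypothetical sites.
site_a_ages = [62, 70, 75]
site_b_ages = [66, 81]

summaries = [summarize(site_a_ages), summarize(site_b_ages)]
mean_age, sd_age = pooled_mean_sd(summaries)
print(mean_age, sd_age)
```

The pooled result matches what an analyst would obtain from the combined rows, although, as the text notes, such approaches are not a complete substitute for access: small sites can still leak information through their summaries.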

Mary Ganguli (University of Pittsburgh; workshop planning committee member) noted two concerns with data sharing that need to be addressed. First, there are ways that deidentified data can be reidentified, so it is important to understand and address this concern. Second, some of the disclosures required by data privacy laws (such as telling participants about using their data in national repositories or for commercial application) are difficult to explain to any research participant, but particularly those in LMICs. Ngugi replied that there has always been a concern about researchers using data from people in LMICs and “making their careers out of it.” Some of the hesitations about data sharing, he said, stem from this concern: communities who spend time and resources collecting data want to be able to analyze and control their own data. This is one reason why building research capacity in LMICs is critically important, he said.

Ngugi shared one approach for analyzing data that stay in the hands of others. Ghana has made 10% of the data from its household census available; researchers can develop analytical code and test it on the 10% sample. If it works, the researcher works with Ghana to run the code on the entire dataset and extract indicators, but the physical data stay with Ghana. Chatterji added that another approach is using a data enclave; given all of the concerns about data sharing, this could be a high-priority area for investments.

Ganguli asked Palacios-López whether the World Bank would be open to broadening the focus of their living standard measurement surveys to include indicators of health. This could be done in 3 ways, said Weir: (1) add a health component to the survey, (2) select some participants for a follow-on study that focuses on health, or (3) allow a third-party group to follow up with participants to conduct a health study. Palacios-López replied that the first option—adding a health component to the survey—would be the most palatable to the World Bank because the organization follows households over a long period of time and would not want to “hand off” participants to another party. The surveys actually have included health and nutrition indicators, and the World Bank is working to modify the surveys in the future to put a larger focus on health and climate change.