
Pediatric Big Data Project

Using big data methods to prevent childhood illness



Neighborhoods can have a major effect on health, and children are particularly vulnerable to the built and social aspects of the neighborhoods where they live. This project, a close collaboration among Drexel University, Children’s Hospital of Philadelphia (CHOP), Independence Blue Cross and the University of Pennsylvania, applies big data methods to extensive environmental data with the aim of preventing childhood illness.

By studying patterns in geographically mapped electronic health records, researchers can better understand factors related to the causes of illnesses such as pediatric asthma, childhood obesity and early childhood caries, and can help prevent avoidable hospitalizations. Researchers will also develop predictors of adverse health outcomes and avoidable hospitalizations by categorizing data by socioeconomic status, race/ethnicity and neighborhood characteristics.

The research team will use big data, machine learning, and traditional epidemiological methods to investigate predictors of pediatric health concerns, including asthma, obesity, and avoidable hospitalizations, as well as causes of disparities in these outcomes. As part of this project, the Drexel Urban Health Collaborative (UHC) is developing a broad suite of place-based measures and partnering with CHOP investigators to link these measures to longitudinal electronic health record (EHR) data.


Children’s Hospital of Philadelphia (CHOP) and University of Pennsylvania

Research Methods

Linking geographically referenced electronic health records (EHR) to place-based data on the physical and social environment is emerging as a method for studying the social and environmental determinants of health and health care outcomes.

The project will use CHOP EHR data from children residing in Philadelphia and surrounding areas, including individual-level clinical data as well as medication use and hospitalization data. The Drexel team will create a range of spatially referenced environmental data that can be linked to individual-level patient data in order to investigate environmental health effects. Environmental data will include neighborhood socioeconomic characteristics and measures of segregation; built environment characteristics, including measures of food and physical activity environments; air pollution, pollen and traffic exposures; and features of the social environment such as crime and safety.
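As a rough illustration of the linkage described above, the sketch below joins patient records to area-level environmental measures through a shared geographic identifier. It is not the project's actual pipeline: the census-tract key, the field names (`tract_id`, `median_income`, `pm25`, `asthma_dx`), the function `link_records` and all values are hypothetical, chosen only to show the general idea.

```python
# Hypothetical place-based measures keyed by a census tract ID.
# All identifiers and values here are invented for illustration.
tract_measures = {
    "42101000100": {"median_income": 52000, "pm25": 9.1},
    "42101000200": {"median_income": 31000, "pm25": 11.4},
}

# Hypothetical patient records that have already been geocoded to a tract.
patients = [
    {"patient_id": "p1", "tract_id": "42101000100", "asthma_dx": True},
    {"patient_id": "p2", "tract_id": "42101000200", "asthma_dx": False},
    {"patient_id": "p3", "tract_id": "42101009999", "asthma_dx": True},
]


def link_records(patient_rows, measures_by_tract):
    """Left-join patient rows to tract-level measures.

    Every patient row is kept. Rows whose tract has no environmental
    data are flagged with linked=False instead of being dropped, so
    that missing linkage can be examined rather than silently ignored.
    """
    linked_rows = []
    for row in patient_rows:
        env = measures_by_tract.get(row["tract_id"])
        merged = dict(row)  # copy so the input rows are not modified
        merged["linked"] = env is not None
        if env is not None:
            merged.update(env)
        linked_rows.append(merged)
    return linked_rows


linked = link_records(patients, tract_measures)
```

Keeping unmatched patients in the output, rather than discarding them, is one reasonable design choice here: in a study of health disparities, records that fail to link may not be missing at random.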

The methods and processes developed as part of this project will be broadly applicable to many EHR datasets, greatly expanding the use of such data in public health surveillance and in clinical and etiologic research.


Research Team


Funding provided by PA Non-Formula Funding through the Children’s Hospital of Philadelphia (CHOP).