UMN Medical School Researchers Creating World’s Largest Genomic Data Set for Acute Lymphoblastic Leukemia

Drs. Logan Spector and Saonli Basu and their research teams have received NIH funding to pool childhood cancer data for use worldwide to improve risk prediction in children and infants.


Dr. Logan Spector, professor of pediatrics at the University of Minnesota Medical School, and Dr. Saonli Basu, professor of biostatistics at the University of Minnesota School of Public Health, are working together on a five-year project to build the world’s largest repository of germline genomic data for acute lymphoblastic leukemia (ALL) and other childhood cancers.

Dr. Spector has been studying childhood cancers for over 20 years. In addition to his responsibilities as a professor at the U of M, he chairs the Childhood Cancer and Leukemia International Consortium (CLIC), whose aim is to answer the question: why do kids get cancer? Childhood cancer is rare; according to Dr. Spector, the incidence of ALL among children in the United States is roughly 60 cases per million.

“And that’s the most common childhood cancer,” he elaborates, “so they get more rare from there. Some of the cancers we’re going to pool are literally one in a million.”

Using Dr. Spector’s connections at CLIC, along with publicly available data, the team will collect genotypes for roughly 20,000 children with ALL from around the world and build an enormously diverse genomic database that may help researchers understand the heritability of ALL in different ancestries.

“We’ll be doing more discovery of gene variants that raise or lower the risk of ALL, and essentially we’ll be doing the world’s largest genome-wide association study (GWAS) for ALL,” explains Dr. Spector.

Dr. Spector invited Dr. Basu to be a co-principal investigator on the grant because he knew that the research needed a strong statistical geneticist to propose the best analyses. Dr. Basu has a wealth of experience applying statistical methodologies and computational algorithms to genetic data sets to understand the genetic bases of a variety of complex diseases.

“I’m really excited to be involved,” Dr. Basu says. “It’s a complex disease. Our research could provide some developing intervention strategies or identify genes and environment, how they contribute to pediatric cancer itself. It’s a challenging task, but it’s an exciting thing if we can contribute to that field.”

Dr. Basu’s task is not an easy one: the statistical techniques typically used in GWAS won’t work well for this data because of its vast ethnic diversity.

“The preprocessing steps are particularly challenging,” she explains. “We are developing analysis like quality control pipelines and guidelines to make sure that the same process is followed everywhere. We’ll have one unique way of harmonizing and integrating all these data sets.”

The team has five years to complete the research per the stipulations of their grant, and they have hit the ground running.

“We haven’t done data cleaning on this scale yet, but my hope is that we will have an analytics data set in about six months, which means we’ll have another four years to finish all the aims,” Dr. Spector says.

With data coming from all over the world, the U of M will serve as the data coordinating center for this massive genomic study. Dr. Spector and Dr. Basu are excited for their team to be at the helm of such a large project, one with the potential to make a profound impact on the lives of children and families affected by childhood cancer.

“It’s possible to screen blood for very rare mutations,” Dr. Spector remarks. “If we understand what makes a child at high risk for ALL, we could periodically sample blood to see if there’s any sign of early mutation. This is very speculative and far in the future, but these possibilities are what’s kept me in this field. It’s the mission to help children and hopefully prevent some childhood cancer, meaning they and their families would never have to go through it.”