(Left to right) Sophia B. Gibson, Nikhita Damaraju, and Gus Gustafson: 'Approaching their task with equal parts determination, persistence, and dedication, along with a heavy dose of realism.'
Seeking ‘an end to the diagnostic odyssey for individuals and their families’
Under the direction of BBI’s Danny Miller, M.D., Ph.D., three graduate students, along with colleagues in the U.S. and five other nations, are pursuing research that promises to address one of precision medicine’s greatest challenges.
They are generating long-read sequencing data from 1000 Genomes Project samples, a diverse collection of samples that can be openly shared, and using that data to identify a broader spectrum of variation, thereby improving the understanding of normal patterns of human genetic variation.
The three students, Nikhita Damaraju, Sophia B. Gibson, and Gus Gustafson, are approaching their task with equal parts determination, persistence, and dedication, along with a heavy dose of realism.
“As we continue to sequence and analyze more samples, our ability to ‘improve our understanding of normal patterns of human variation’ will get continually refined,” said Damaraju, a Ph.D. student in UW’s Public Health Genetics program. “We can observe the impact of using long-reads, particularly in difficult-to-detect regions of the genome. I’m looking forward to seeing this dataset being used as a variant prioritization set for the development of different types of clinical applications.”
Miller and Co-Principal Investigator Evan Eichler, Ph.D., also a BBI member and a UW professor of Genome Sciences, launched this endeavor, the 1000G ONT Sequencing Consortium, in June of 2022, using the Oxford Nanopore platform. Two years into the project, Consortium members from several U.S. universities, as well as academic institutions in Belgium, Canada, Mexico, South Africa and the U.K., posted a pre-print paper on medRxiv on March 5 of this year summarizing initial findings from analyses of the first 100 samples:
“Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads…. Together, these efforts will lead to improved clinical outcomes, new gene-phenotype associations, the use of novel therapies, and an end to the diagnostic odyssey for many of the individuals and their families who are living with an unsolved or incompletely understood genetic condition.”
That “diagnostic odyssey” is something Miller is committed to ending. His goal is to use this technology to help families get genetic answers more quickly and accurately.
“More than half of individuals with a suspected genetic condition remain undiagnosed after a comprehensive clinical evaluation,” he said. “Identifying conditions can take years, and involve multiple sample collections and visits to hospitals. Because long-read sequencing data captures the same information as multiple clinical tests used today, a single long-read sequencing experiment could replace nearly all current clinical genetic testing. I believe this technology will increase the diagnostic rate, decrease the time it takes to make a genetic diagnosis, and reduce barriers to accessing comprehensive clinical testing.”
For the graduate students working with Miller, part of the attraction of the 1000G ONT Sequencing Consortium is the prospect of future findings, especially those with potential clinical applications.
“Thanks to long-read sequencing technology, we now know that every human genome has approximately 25,000 structural variants,” said Gustafson, who is pursuing a Ph.D. in Molecular and Cellular Biology after nearly 10 years as a researcher at Seattle Children’s. “Having a large, comprehensive database of structural variation in healthy individuals from all over the world will power us to recognize rare, potentially disease-causing variants more efficiently and effectively, leading to faster diagnosis and, in the long run, treatment.”
Gibson, who is pursuing a Ph.D. in Genome Sciences, focuses on repeat regions. “Essentially, I look at regions of the genome that contain short, repetitive sequence motifs and what the distribution of the copy number is across samples,” she said. “Repeat expansions reaching a certain size can be associated with disease.”
She believes that “the exciting thing about re-sequencing all of these samples with long reads is that we’re able to cover the highly repetitive gaps that were filled in by the telomere-to-telomere assembly (the first complete, gapless sequence of a human genome, announced in 2022).”
Miller contends that the sequencing consortium lays another important foundation for making long-read sequencing commonplace in clinical practice. His optimism is readily apparent.
“We are laying the groundwork for a future in which complex genetic testing is a routine part of clinical practice,” he said. “For example, I imagine a day when an individual arrives for their annual physical at their general practitioner’s office and discusses the results of a long-read sequencing test that was collected and performed that morning. This could be a screen for variants known to be associated with cancer, or an evaluation of epigenetic changes that are associated with hypertension or diabetes. Honestly, I think the possibilities with this technology are endless.”