MaveDB | Brotman Baty Institute

Large-scale variant effect maps yield insights into protein function, structure, and evolution, improve computational prediction, and inform clinicians. However, researchers have found shortcomings in the availability, discoverability, and dissemination of the underlying data. Many published articles describing large-scale variant effect mapping do not provide variant effect scores for all assayed variants.

BBI faculty and researchers want to address this shortcoming. In collaboration with the University of Washington (UW) and the Melbourne-based WEHI, they have established MaveDB, an open-source, public repository for datasets from Multiplexed Assays of Variant Effect (MAVEs), such as those generated by deep mutational scanning or massively parallel reporter assay experiments. MaveDB is hosted by the Fowler Lab in the UW Department of Genome Sciences in Seattle. This central repository enables researchers to store and publish processed MAVE datasets and metadata, and to link to raw data, using a format that is machine-readable, standardized, and searchable. A web interface makes the data readily accessible for clinical applications, meta-analysis, or reanalysis as computational methods are refined.

Related applications include MaveVis, which visualizes protein variant effect maps and puts them in context by generating heatmaps and integrating them with secondary structure, surface accessibility, interaction interface, and conservation data. It is the first of potentially many such applications advancing knowledge and practical use of MaveDB. The creators of the database also envision developing additional applications, such as tertiary structure analysis, automatic imputation of missing values in variant effect maps, and a dashboard to assist researchers’ interpretation of datasets.

MaveDB, MaveVis, and other systems simplify, standardize, and democratize MAVE data analysis. These tools represent the foundation of a community-driven, open-source platform that allows researchers to explore these comprehensive datasets. The impact of each dataset will increase as the number of assayed variants grows, thereby contributing to an expanding and more thorough understanding of genetic variation and sequence function.
