Jun 17, 2025, 8:13 PM

Biotech firm unveils discovery of one million new microbial species

Highlights
  • Basecamp Research has collected extensive genetic data, uncovering over 1 million new microbial species and nearly 10 billion genes.
  • Experts express skepticism about the potential practical applications of this genetic database without further understanding of the organisms involved.
  • The company aims to utilize this data to train AI models in biology, although the effectiveness of this approach remains uncertain.
Story

In recent years, the British biotech company Basecamp Research has been collecting genetic data from microbes that thrive in extreme environments around the world. The effort has identified more than 1 million new microbial species and approximately 10 billion genes that are new to science. The aim is to build a large database of biodiversity, which Basecamp believes can be used to train an artificial intelligence system, often described as a 'ChatGPT for biology,' that can answer complex questions about Earth's biodiversity.

Despite the excitement surrounding Basecamp's discoveries, scientists remain skeptical about the practical applications of the genetic data. Jörg Overmann, a researcher at the Leibniz Institute DSMZ in Germany, has expressed concern that simply increasing the number of known genetic sequences may not lead to advances in drug discovery or other areas of chemistry without a deeper understanding of the organisms these genes come from. Overmann emphasizes that the relationship between novel genetic sequences and their functional potential is not straightforward, and that greater genetic diversity does not guarantee useful biological insights.

Meanwhile, recent developments in machine learning have produced a range of models that can discern patterns and make predictions from large datasets of biological information. A notable example is AlphaFold, the system for predicting protein structures from amino acid sequences, whose developers shared the 2024 Nobel Prize in Chemistry. Although generative models in biology have improved, experts such as Frances Ding at the University of California, Berkeley, note that they may not be yielding significantly better results, partly because the training datasets lack sufficient biodiversity.

Basecamp Research believes that exposing AI models to the extensive microbial data gathered on its expeditions will help them better capture biological processes. Nathan Frey, a machine learning researcher at Genentech, shares the enthusiasm for this work and highlights the need for innovative data collection methods that prioritize real-world samples over laboratory-sourced datasets. However, researchers remain cautious. They point out that the success of AI models in biology hinges on understanding the relevance and function of the new genetic information, which could require traditional laboratory research on the newly found organisms.
