AI Simulates 500 Million Years of Evolution to Create a Novel Fluorescent Protein

In a groundbreaking fusion of artificial intelligence and evolutionary biology, researchers at EvolutionaryScale and the Arc Institute have developed a novel fluorescent protein, esmGFP, using their advanced AI model, ESM3. This achievement marks a significant milestone in computational biology, demonstrating the potential of AI to simulate extensive evolutionary processes and design functional proteins beyond those found in nature.

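ESM3, a multimodal generative language model, was trained on an extensive dataset comprising over 3.15 billion protein sequences, 236 million protein structures, and 539 million protein annotations. This training enabled the model to reason jointly over the sequence, structure, and function of proteins. The "500 million years" figure describes how far esmGFP sits from known fluorescent proteins: the researchers estimate that natural evolution would need more than 500 million years to produce a protein this different, so generating it amounts to simulating that span of molecular evolution in silico.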

The researchers prompted ESM3 to design a green fluorescent protein (GFP), a type of protein that absorbs light and re-emits it as green fluorescence, and that is widely used as a marker in molecular biology. The AI-generated protein, esmGFP, shares only 58% sequence identity with its closest natural counterpart, a fluorescent protein from the bubble-tip sea anemone (Entacmaea quadricolor). Despite this divergence, esmGFP was synthesized and fluoresced in laboratory tests, validating the model’s capability to design functional proteins that nature has not evolved.
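To make the 58% figure concrete, here is a minimal sketch of one common way to score how close two proteins are: percent identity over an existing pairwise alignment. The function name `percent_identity` and the toy sequences are hypothetical and made up for illustration; they are not esmGFP or any real fluorescent protein, and this is not the researchers' actual pipeline. Conventions also differ on the denominator (some divide by full alignment length, others by the shorter sequence), so treat the choice below as an assumption.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, invented fragments purely for illustration.
toy_a = "MSKGEELFTG"
toy_b = "MSKGAEL-TG"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

A score of 58% on a full-length alignment would mean that roughly four in ten aligned positions carry a different amino acid, which is a large gap for a protein that still folds and fluoresces.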

This advancement holds immense promise for various applications, including drug discovery, environmental monitoring, and synthetic biology. The ability to design proteins with specific functions could lead to the development of new enzymes for breaking down plastics, novel therapeutics, and tools for exploring protein evolution.

As someone deeply engaged in the intersection of AI and biology, I see the development of esmGFP as a clear sign of the transformative potential of integrating computational models with biological research. The capacity of AI to simulate vast evolutionary timescales and generate functional proteins marks a shift in how we approach biological design and discovery.


For more detailed information, you can refer to the original research article published in Science:

https://www.science.org/doi/10.1126/science.ads0018

And the official announcement from EvolutionaryScale:

https://www.evolutionaryscale.ai/blog/esm3-release 
