Don't Look Now, but the Bots Are Designing New Proteins...

Picture tiny protein architects effortlessly combining like pieces of an intricate puzzle to build nanoscale structures with mind-boggling precision.

Dream or nightmare?

These self-assembled protein structures hold the promise of creating entirely new materials with properties that defy our current imagination. But there are those who fear they also hold the key to the annihilation of all humankind…

Welcome to the fusion of machine learning (ML) and protein synthesis. It’s not as far away as you might think.

Say the words “artificial intelligence,” and most people today will probably think of large language models like ChatGPT or one of the AI art generators.

But many other ML techniques are used in various fields with equally exciting applications. Protein prediction and synthesis is one such area. ML is making remarkable advancements with implications for biotechnology and materials science.

It works like this: ML algorithms analyze protein sequences and known structures, then predict the shapes, functions, and interactions of proteins. Deep learning systems like DeepMind's AlphaFold2 have achieved impressive accuracy. This matters because accurate protein prediction helps us understand biological processes, which in turn can help us develop new drugs.

As a logical next step, ML is now being applied to protein synthesis as well. Using AI algorithms, researchers are starting to design new proteins with desired characteristics. For example, they have succeeded in creating more efficient enzymes for speeding up chemical reactions. Proteins that self-assemble into specific structures, like the ones pictured in the introduction, are not quite there yet, but this is a rapidly evolving area of research. Designed proteins will have wide-ranging applications in biotechnology and materials science, including pharmaceuticals, renewable energy, and materials engineering.

However, not everyone is excited about the possibilities of this technology. Some see a scenario where this may all go awry, providing a doorway for a hostile superintelligence to enter the physical realm… I will touch on this further down. But first, let’s dive into the details of protein prediction and synthesis. How does it work? Who’s doing it? And how far have we come? After all, a journey of a thousand miles begins with a single amino acid sequence . . .

Predicting protein structures: How and why?

Predicting protein structures can help us understand how proteins work.

Proteins perform many functions in living organisms. For example, they catalyze chemical reactions, provide structural support, and facilitate communication between cells. The 3D structure of a protein plays a crucial role in determining its function.

Proteins are made up of long chains of amino acids that fold into specific 3D shapes. The folding of a protein is driven by its amino acid sequence and interactions between the amino acids.

Protein structure prediction refers to the computational methods and algorithms used to predict the 3D structure of a protein based on its amino acid sequence. These methods use various principles and techniques from physics, chemistry, and computer science to simulate and model protein folding.

By predicting protein structures, scientists can gain insights into how proteins function at a molecular level. The structure of a protein provides information about its binding pockets and interaction surfaces. This is crucial for understanding how the protein carries out its biological function. Knowledge of protein structures can aid in drug discovery and design, as drugs often target specific proteins and their active sites.

This can speed up the development of treatments for diseases like cancer, Alzheimer's, and infectious diseases.

DeepMind’s AlphaFold and AlphaFold2

In March 2016, DeepMind's AlphaGo program defeated Lee Sedol, one of the world's top players of the board game Go. The objective of the game is to control more territory than the opponent.

AlphaGo’s victory showed that DeepMind’s AI techniques had potential for scientific challenges. Following this, they formed a team to work on protein structure prediction.

In 2018, AlphaFold performed very well in the Critical Assessment of Protein Structure Prediction (CASP), a competition held every two years that evaluates how well computational methods predict protein structures. This result led to the development of AlphaFold2, which won CASP in November 2020 with predictions accurate enough that many researchers considered the protein-folding problem largely solved for single proteins.

DeepMind shared the system's details, code, and predictions and published papers in Nature. They also launched the AlphaFold Protein Structure Database with, at the time of writing, more than 200 million structures.

Traditionally, scientists used experimental techniques like X-ray crystallography and nuclear magnetic resonance spectroscopy to determine protein structures. This was expensive and time-consuming. However, it generated a lot of data that could be used to train AI algorithms. AlphaFold2 combines deep learning with insights from structural biology to predict protein structures accurately from their amino acid sequences, as explained below.

How does AlphaFold2 work?

AlphaFold2 is trained using a vast dataset of known protein structures. By analyzing them, AlphaFold2 learns the patterns and principles that govern protein folding.

AlphaFold2 uses deep learning techniques. Specifically, it uses a type of neural network architecture called a transformer. Transformers have been widely successful in natural language processing (NLP) tasks, such as translation and text summarization. Researchers have adapted these techniques to the field of protein folding prediction. In NLP tasks, the transformer network learns to recognize complex relationships and dependencies in text. Similarly, AlphaFold2 uses this architecture to analyze and understand the intricate relationships in protein sequences.

A key component of AlphaFold2's success is its use of an attention mechanism. This allows the AI system to focus on relevant parts of the protein sequence and their interactions. It reduces the computational search space and improves prediction accuracy.
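To make the idea of attention more concrete, here is a minimal sketch of generic scaled dot-product attention, the basic operation used in transformers. It is not AlphaFold2's actual code, which uses more specialized attention modules. The function names and the toy two-number "embeddings" for three residues are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Generic scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # How similar is this position to every position in the sequence?
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy embeddings for a three-residue fragment (made-up numbers, not real data).
residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = attention(residues, residues, residues)
```

In this toy case, the first residue puts equal weight on itself and the third residue (their embeddings match) and less weight on the second. That is the sense in which attention lets a model "focus" on the positions most relevant to each other.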

AlphaFold2 compares the sequences of related proteins and looks for patterns and similarities. This helps identify parts of the protein sequence that remain similar or unchanged across related proteins. AlphaFold2 uses this evolutionary information to improve structure prediction accuracy.
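The idea of spotting unchanged positions across related proteins can be sketched in a few lines. This is only an illustration of the concept, assuming the sequences are already aligned. The fragments are hypothetical, and the simple "share of the most common residue" score is a stand-in for the far richer evolutionary signal a real system extracts.

```python
from collections import Counter

def column_conservation(alignment):
    """Return, per column, the fraction of sequences sharing the most common symbol."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*alignment):
        # A gap ("-") is counted like any other symbol in this simple sketch.
        _, count = Counter(column).most_common(1)[0]
        scores.append(count / len(column))
    return scores

# Hypothetical aligned fragments from four related proteins ("-" marks a gap).
alignment = [
    "MKTAYIA",
    "MKSAYLA",
    "MRTAY-A",
    "MKTAYIA",
]
scores = column_conservation(alignment)
# Positions where every sequence has the same residue.
conserved = [i for i, s in enumerate(scores) if s == 1.0]
```

Positions that stay the same across relatives are often the ones that matter for the protein's shape or function, which is why this kind of signal is useful for prediction.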

Given a protein's amino acid sequence, AlphaFold2 uses the trained model to predict its 3D structure. The system generates multiple potential structures and assigns confidence scores to each prediction.
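As a rough illustration of how confidence scores might be used to choose between candidate structures, the sketch below ranks candidates by their average per-residue confidence. AlphaFold2 reports a per-residue confidence measure (pLDDT) on a scale of 0 to 100. The model names, the numbers, and the ranking function here are hypothetical.

```python
from statistics import mean

def rank_by_confidence(candidates):
    """Sort candidate structure names from most to least confident on average."""
    return sorted(candidates, key=lambda name: mean(candidates[name]), reverse=True)

# Hypothetical per-residue confidence scores (0-100) for three candidate models.
candidates = {
    "model_1": [92.0, 88.5, 71.0, 64.5],
    "model_2": [95.0, 93.5, 90.0, 85.5],
    "model_3": [60.0, 55.5, 48.0, 40.5],
}
ranking = rank_by_confidence(candidates)
best = ranking[0]
```

Per-residue scores are also useful on their own: a structure can be trustworthy in its well-ordered core while remaining uncertain in flexible loops.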

AlphaFold2 then adjusts the predicted structures based on extra information. This could include physical constraints or energy calculations.

The final output of AlphaFold2 is a predicted protein structure, represented as a 3D arrangement of atoms.
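To show what a "3D arrangement of atoms" looks like as data, here is a small sketch that stores one atom per residue as x, y, z coordinates and lists which residues end up close together in space. The coordinates, the 8 ångström cutoff, and the function name are assumptions chosen for illustration, not output from AlphaFold2.

```python
import math

# Hypothetical coordinates (in ångströms) for one atom from each of four residues.
structure = [
    ("MET", (0.0, 0.0, 0.0)),
    ("LYS", (3.8, 0.0, 0.0)),
    ("THR", (3.8, 3.8, 0.0)),
    ("ALA", (0.0, 3.0, 4.0)),
]

def contacts(structure, cutoff=8.0, min_separation=2):
    """List residue pairs closer than `cutoff`, skipping direct neighbours in the chain."""
    pairs = []
    for i in range(len(structure)):
        for j in range(i + min_separation, len(structure)):
            distance = math.dist(structure[i][1], structure[j][1])
            if distance < cutoff:
                pairs.append((i, j, round(distance, 2)))
    return pairs

contact_pairs = contacts(structure)
```

Residues that are far apart in the sequence but close in space are what give a folded protein its binding pockets and interaction surfaces.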

Designing proteins

A team of biochemists led by David Baker at the University of Washington went beyond prediction. First, they developed an AI system called RoseTTAFold. Like AlphaFold2, it could predict protein structures. But they had even bigger plans. They wanted to use AI to design new proteins for specific purposes. So they did just that. The proteins designed by Baker’s team were then synthesized and produced in live cells.

The ability to design new proteins with specific functions has significant applications in biotechnology and materials science. Custom-designed proteins can pave the way for novel treatments, targeting specific diseases or biological processes with enhanced precision.

By customizing protein structures, researchers can also create new materials with unique properties and improve industrial processes, leading to advancements in areas such as drug delivery systems, biofuels production, and biomaterials engineering. The ability to engineer proteins opens up a world of possibilities for innovation and scientific exploration.

Baker’s team focused on enzymes called luciferases, which produce light in organisms like fireflies. They wanted to create synthetic luciferases that could bind to man-made versions of luciferin, the molecule these enzymes act on to produce light, and remain stable. Using a combination of AI techniques, the team designed thousands of new, custom proteins that don't exist in nature. They then tested these designs to see which could produce light when treated with luciferin. Only a small percentage of the designs worked, but this was still a significant achievement. They used the knowledge gained from this experiment to design more luciferases for different shapes of luciferin.

Baker's team is now working on another AI system called RFdiffusion to improve protein design further. They plan to use it to create a synthetic protein for a nasal spray that can block influenza viruses from infecting cells. They believe that, in future, AI could also be used to design biomaterials, enzymes that break down plastics, and proteins that can capture solar energy.

Why some people worry

There are those who worry that a hostile AI Superintelligence may eventually take over the world or worse. Some of these ideas may sound like bad sci-fi, but many AI experts agree on at least this basic premise: We should be careful.

In the case of protein synthesis, AI researcher Eliezer Yudkowsky foresees a rather specific scenario. Yudkowsky's fear is that a sufficiently advanced AI may eventually become so adept at understanding and manipulating the folding patterns of proteins that it will be able to generate DNA strings. It could then email them to online laboratories that offer services like DNA synthesis. A human intermediary, for financial or other reasons, might be persuaded to receive the FedExed vials containing the synthesized proteins and mix them according to specifications.

The outcome of these steps will be a basic "wet" nanosystem made up of the newly created proteins. This nanosystem, like ribosomes in cells, will be able to receive outside instructions. The next step in this futuristic scenario, as Yudkowsky sees it, would be that the nanosystem could be used to construct more sophisticated systems. These increasingly advanced systems would then be capable of building even more complex structures and the rest could really be, to misuse the famous phrase, the end of history ...

... and why some worry less

We should note here that most AI experts are cautious, but not to this extent. In an interview with ABC News in March, Sam Altman, the CEO of OpenAI (the company behind ChatGPT), said AI "will be the greatest technology humanity has yet developed."

"We can all have an incredible educator in our pocket that's customized for us, that helps us learn ... We can have medical advice for everybody, that's beyond what we can get today. We can have creative tools that help us figure out the new problems we want to solve – wonderful new things to co-create with this technology, for humanity."

When the journalist asked him, "Is there a kill switch, a way to shut the whole thing down?" he answered, "Yes. What really happens, is any engineer can just say we're going to disable this for now."

The interviewer then asked if the AI model could become more powerful than humans. Altman's answer was, "In the sci-fi movies, yes. In our world, this model is sitting on a server. It waits until someone gives it an input."

In short

The future holds immense promise as AI continues to shape the landscape of protein synthesis. ML algorithms are propelling us toward a deeper understanding of biological processes, more effective drug development, and the creation of materials with properties beyond our current imagination. This marriage of AI and protein synthesis may not get as much airtime as the language models and art generators, but it is revolutionizing biotechnology as we know it.
