Interview

Use of AI to Reveal the Secrets of Life

Minkyung Baek, Professor at the Department of Biological Sciences

Professor Minkyung Baek was appointed a professor in the Department of Biological Sciences at SNU, her alma mater, last September, and spent a busier first semester than anyone else. Having re-entered the College of Natural Sciences building as a professor, following three years as a student and a researcher, she spent that period pondering the research projects she will pursue in the future. We examined the changes about to occur in the near future with Professor Baek, a key developer of the AI protein structure prediction program “RoseTTAFold,” which was selected as the “Best Research Achievement of the Year in 2021” by the international research journal Science.

You were appointed a professor in the Department of Biological Sciences at SNU, your alma mater, last fall.
How did you spend your first semester?

I thought I had escaped Building No. 500, College of Natural Sciences, after completing my doctoral degree in 2018 (laugh). Before I knew it, I had moved back into Building No. 500 as a professor and was spending each day in a hectic manner. This period was critical for examining the needs of my research laboratory and devising plans. Although I considered accepting a decent job offer from a company, I chose an academic career because, in the long run, increasing the number of experts in the field of computational biology would be more beneficial than promoting corporate development. I feel rewarded whenever students with an interest in computational biology seek advice from me.

I hear that your research field, computational biology, is a “rising” discipline. Could you provide a brief introduction?

Proteins are the most important biomolecules involved in nearly all phenomena that occur in living organisms. Numerous proteins synthesized through various combinations of amino acids may become the driving force of life, or induce diseases, depending on their three-dimensional structures. Thus, understanding the structure of a protein and its functional mechanisms can contribute enormously to comprehending life processes and to taking the first steps in drug development. The objectives of computational biology are to predict protein structures and interactions through computer-based calculations and to understand life processes in all organisms, including humans, on a deeper level.
Computational biology, as its name implies, is biology with computation (computers) added. I conduct research in front of a computer rather than performing experiments. This discipline aims to accumulate a copious amount of data, ranging from genes to protein sequences, and to understand life through calculations based on these data. It could be described as biological research conducted “without getting your hands wet” (laugh).

AI has expanded the scope of calculations that computers can execute. You are also the developer of “RoseTTAFold,” an AI protein structure prediction software. How does AI function in RoseTTAFold?

Proteins have widely different properties, depending on the combinations of 20 amino acids, and AI is used to analyze and predict the shapes and functions of different proteins based on their sequences. Experimental methods for identifying protein structures, such as X-ray crystallography and cryogenic electron microscopy (cryo-EM), already existed. However, obtaining structural data that people can actually see was time-consuming and very expensive. Although 300 million or more proteins are known to exist on the planet, it takes years to identify the structure of a single protein experimentally. In contrast, RoseTTAFold uses AI designed to achieve a substantial reduction in time and cost. The capability of AI to rapidly identify protein structures allows this work to be performed within several seconds to a couple of hours. Although its accuracy is less than 100%, it even predicts the range of accuracy of its own predictions. As a result, many experimenters nowadays first predict a protein structure with RoseTTAFold and then verify that structure through experiments.

When the international research journal Science selected RoseTTAFold as the “Best Research Achievement of the Year in 2021,” this news must have elated you as a core developer.

Although I knew earlier that I had been nominated, I learned that I had been selected only when I received interview requests from newspapers. I felt incredibly rewarded. I have been devoted to exploring protein structure prediction since my graduate school years. I was exhilarated that more people than I had expected recognized and appreciated the importance of this research. I became more confident that I would be able to persuade others when sharing my research.

The theme of this issue of SNU People is “coexistence with AI.” How do you imagine humanity coexisting with AI in the future?

I do not believe that AI will replace humans. AI will more likely assist humans with what they need to do. Developers will be freed from implementing their ideas in detail thanks to the self-learning of AI, whereas users will be able to gain assistance from AI, such as ChatGPT, in proofreading code or foreign-language text. Although existing jobs will shift into other forms, this will lead to other opportunities. All AI may be vulnerable to errors. The research and development needed to reduce these errors is the responsibility of humans. Moreover, this responsibility also falls on those who use AI, just as I need to verify through my research whether a protein structure predicted by AI is accurate.

I heard the news that you will lead the development of an AI that predicts the interaction of proteins and nucleic acids.
Please share which research you plan to focus more on in the future.

It is nucleic acids, such as DNA, that direct the synthesis of proteins. The recipe for producing proteins based on genetic information is stored in nucleic acids. Although the cells of our body have the same genetic information, they differentiate into different forms, depending on the tissue, such as eyes, skin, and organs, because proteins called “transcription factors” interact with nucleic acids to determine which proteins should be more or less present.
Comprehending the basic interactions between proteins and nucleic acids contributes massively to understanding various life phenomena, including why differences between tissues appear, how protein expression levels are regulated in response to changes in the external environment, and why diseases occur when protein expression levels are abnormal. This understanding can be utilized to cure diseases caused by abnormalities in expression level control, by designing artificial transcription factors that increase or decrease the expression of a specific protein through binding to a specific nucleic acid sequence. It can be further applied to develop crops with excellent resistance to drought and diseases. It is also applicable to the development of gene therapies based on so-called “genetic scissors.” This significance motivates us to develop an AI that predicts protein-nucleic acid interactions, which remains one of the conundrums in the bioscience field. For this purpose, we are striving to improve prediction accuracy by incorporating into AI learning the physicochemical theories that govern the interaction between the two biomolecules. The development of AI that predicts and designs protein-nucleic acid interactions will play a key role in understanding life phenomena, as well as in resolving current social issues, such as disease treatment and food crises.
