▶ Interview
Minkyung Baek, Professor at the Department of Biological Sciences
Professor Minkyung Baek was appointed a professor in the Department of Biological Sciences at SNU, her alma mater, last September, and has had a busier first semester than anyone else. Having re-entered the College of Natural Sciences building as a professor, following three years as a student and a researcher, she spent that period pondering the research projects she will pursue in the future. We discussed the changes about to occur in the near future with Professor Baek, a key developer of the AI protein structure prediction program “RoseTTAFold,” which was selected as the ‘Best Research Achievement of the Year in 2021’ by the international research journal Science.
I thought I had escaped Building No. 500, College of Natural Sciences, after completing my doctoral degree in 2018 (laugh). Before I knew it, I had moved back into Building No. 500 as a professor, and every day since has been hectic. This period was critical for examining the needs of my research laboratory and devising plans. Although I considered accepting a decent job offer from a company, I chose an academic career because increasing the number of experts in the field of computational biology would be more beneficial in the long run than promoting corporate development. I feel rewarded whenever students with an interest in computational biology seek advice from me.
Proteins are the most important biomolecules involved in nearly all phenomena that occur in
living organisms. Numerous proteins synthesized through various combinations of amino acids may
become the driving force of life, or induce diseases, depending on their three-dimensional
structures. Thus, understanding the structure of a protein and its functional mechanisms can
contribute enormously to comprehending life processes and taking the first steps in drug
development. The objectives of computational biology are to predict protein structure
information and interactions through computer-based calculations and understand life processes
in all organisms, including humans, on a deeper level.
Computational biology, as implied in its name, is computation (computer) added to biology. I
conduct research in front of a computer rather than performing experiments. This discipline aims
to accumulate large amounts of data, ranging from genes to protein sequences, and to
understand life through calculations based on these data. This could be described as
biological research conducted “without getting your hands wet” (laugh).
Proteins have widely different properties, depending on how the 20 amino acids are combined, and AI is used to analyze and predict the shapes and functions of different proteins based on their sequences. Experimental methods for determining protein structures, such as X-ray crystallography and cryogenic electron microscopy (cryo-EM), already existed. However, obtaining structural data that people can actually see was time-consuming and very expensive. Although more than 300 million proteins are known to exist on the planet, it takes years to determine the structure of a single protein experimentally. In contrast, RoseTTAFold uses AI to substantially reduce this time and cost. Because AI can identify protein structures rapidly, this work can be performed within several seconds to a couple of hours. Although its accuracy is less than 100%, it also estimates how accurate each prediction is. As a result, many experimenters nowadays first predict a protein structure with RoseTTAFold and then verify that structure through experiments.
Although I knew earlier that I had been nominated, I discovered that I had been selected only when I received interview requests from newspapers. I felt incredibly rewarded. I have been devoted to exploring protein structure prediction since my graduate school years. I was exhilarated that more people than I expected recognized and appreciated the importance of this research. I became more confident that I would be able to persuade others when sharing my research.
I do not believe that AI will replace humans. AI will more likely assist humans with what they need to do. Developers will be freed from implementing ideas in detail thanks to the self-learning of AI, whereas users will be able to get assistance from AI, such as ChatGPT, in proofreading code or foreign-language text. Although existing jobs will shift into other forms, this will lead to other opportunities. All AI may be vulnerable to errors. The research and development needed to reduce these errors is the responsibility of humans. This responsibility also falls on those who use AI, just as I need to verify through my research whether a protein structure predicted by AI is accurate.
It is nucleic acids, such as DNA, that direct the synthesis of proteins. The recipe for producing
proteins based on genetic information is stored in nucleic acids. Although the cells of our body
all have the same genetic information, they differentiate into different forms depending on the
tissue, such as eyes, skin, and organs, because proteins called “transcription factors” interact
with nucleic acids to determine which proteins should be more or less abundant.
A grasp of the basic interactions between proteins and nucleic acids contributes greatly to
understanding various life phenomena, including why differences between tissues appear, how
protein expression levels are regulated to respond to changes in the external environment, and
why diseases occur when protein expression levels are abnormal. This understanding can be used
to treat diseases caused by abnormal control of expression levels, by designing artificial
transcription factors that increase or decrease the expression of a specific protein by binding
to a specific nucleic acid sequence. It can further be applied to develop crops with excellent
resistance to drought and disease. It is also applicable to the development of gene therapies
based on so-called “genetic scissors.” This significance motivates us to develop an AI that
predicts protein-nucleic acid interactions, which remains one of the unsolved problems in the
bioscience field. For this purpose, we are striving to improve prediction accuracy by
incorporating the physicochemical principles that govern the interaction between the two types
of biomolecules into AI training. The development of AI that predicts and designs
protein-nucleic acid interactions will play a key role in understanding life phenomena,
as well as resolving current social issues, such as disease treatments and food crises.