Protein Design: Discussion with David Baker

David Baker is an American biochemist and computational biologist who has pioneered methods to predict and design the three-dimensional structures of proteins.

Located at the University of Washington’s Institute for Protein Design (IPD), the Baker lab aims to develop and apply methods to address current-day challenges through the design of new synthetic proteins.

In early 2018, David Baker was awarded the Hans Neurath Award by the Protein Society for his “recent contribution of unusual merit to basic protein science.”

The Society writes that David’s breakthroughs “reduce to practice what was for many decades the holy grail of protein science: fundamental understanding of the determinants of protein structure and stability that leads to consistent predictive capabilities, including the ability to design protein shapes and functions as desired.”

Discussion with David Baker

Can you tell us your background and how you got into protein design?

I was an undergraduate at Harvard University, then a graduate student at the University of California, Berkeley, where I worked with Professor Randy Schekman on protein transport in yeast. He ultimately won the Nobel Prize a few years ago for that work. I subsequently moved to the University of Washington as a professor and initially started working experimentally on how proteins fold. Once we had learned something about how proteins fold, we started developing computer algorithms for predicting protein structure from amino acid sequence. After we had gotten pretty good at predicting structure from amino acid sequence, we realized we could turn things around and design new structures using these principles: structures that aren’t found in nature. That’s the protein design problem.

Why did protein design pique your interest? Was there a specific sort of problem that you were interested in addressing? Or was this a scientific interest that you already had in the biochemistry of proteins?

In college, I took a biochemistry class and learned about the Protein Folding Problem: how all of the information needed to specify the structure and function of proteins is encoded in their amino acid sequences. I found the concept of having this code in the amino acid sequence fascinating: how this code specifies a three-dimensional structure and function. I was VERY interested.

Later on, I went to graduate school working on something completely different, more related to cell biology problems. Near the end of that period, when I decided to do postdoctoral research at the University of California, San Francisco, I thought it would be interesting to return to my earlier interest in protein folding. So, when I was a postdoc, I did some ‘special case’ work on protein folding, but it was when I came to the University of Washington that I became 100% focused on the fundamental Protein Folding Problem.

It seems that the Institute for Protein Design requires a lot of interdisciplinary work and collaboration. I was struck by the number of disciplines for which you’re an adjunct professor. On top of that, you do your own research, you publish, and you present publicly. Can you tell us about your day-to-day process?

Let me just build on something that you said. One of the reasons I found protein folding absolutely addictive once I started working on it is that it really is an incredibly cross-disciplinary problem. You can attack it from many, many different angles. Let’s briefly go through that:



Coding

Fundamentally, you can think of it as a coding problem. So, you have this sequence. It’s like a code that encodes a protein structure and function. Basically, the sequence completely specifies what the protein structure is and how it functions. So in some sense, that’s a computer science problem. You need an algorithm that can read out the code. It’s like a breaking-the-code problem.


Physics

The folding of proteins is really governed by the laws of physics. And so, it’s a physics problem.


Chemistry

It’s a chemistry problem because proteins fold in a complex environment.


Biology

It’s also a biology problem because protein folding is really the key to biology. It’s the simplest case of biological self-organization. Without protein folding, no living things would exist.


Biochemistry

It’s a biochemistry problem because biochemistry studies the interactions within and between biomolecules, and it’s really a fundamental part of that.


Statistics

In some sense it’s a statistics problem because you have so much data now. It’s kind of a big-data problem. There’s a huge amount of genome sequence data that you can analyze.


Engineering

It’s an engineering problem because you can make new proteins to solve new problems. So that’s why I’m an adjunct professor in biological engineering and in chemical engineering. For chemical engineering, because we make new catalysts that catalyze new chemical reactions, and biological engineering because we can make new types of drugs and new types of biomaterials.


Genomic Science

I’m also an adjunct professor of genome sciences because understanding protein folding is really key to interpreting the genome.

It’s exciting because we’re right in the middle of everything and so the people working in my group come from all of those different areas and have all of those different interests, which we meld together in one pot.

Does that mean that, when you’re teaching the different groups, you’re teaching protein design just through the lens of each of those specific disciplines? Or are you an expert in all those areas?

One of the great things about having such an international research group with probably the smartest, the most talented people interested in this area is that different people have different areas of expertise, and a lot of what goes on in our group is people teaching each other things that they’re expert in and learning from others the things that they didn’t know. It’s a communal brain.

I think of each researcher as a single neuron. If everyone is working independently, you get lots of independent neurons. But if everyone works closely together and communicates, and different researchers, like different neurons, have different capabilities, you get some powerful emergent properties. A lot of what I do is trying to recruit the best people. The research environment here is basically one huge room, and I make sure everyone is always talking to everybody so that people who have complementary expertise connect to work on a problem together. And then we just have a huge amount of communication at all different levels and group sizes within the group.

I couldn’t be an expert in all of those areas. I know a little bit about all of them, but there are people in the group who individually know much more about any one of them.

A huge part of your role, then, is also group management: ensuring that this type of communication goes on and happens at the highest level it can.

Yes. A key role for me is to make sure communication goes on at the highest level. It’s not a management issue in the sense that my desk, where I spend most of my time, is in the lab also. And while I work on my own project, I also wander around the lab to get little groups of people to start talking, and then I move on to the next group, so I am making connections. This is similar to a brain when neurons start connecting: I am trying to make lots of connections every day but in a very informal way as I don’t believe in hierarchy. It’s really a flat organization where everybody is kind of at the same level. There isn’t really much formal management.

That probably also helps facilitate the open dialogue.

That’s the idea: in a flat environment, everyone’s voice is heard equally, whereas when you’re dealing with a lot of hierarchy, you only hear the people right below you. Again, in the brain, it’s not like there’s a hierarchy, or maybe there is a hierarchy of neurons, but the more connections each person has, the better.

Can you talk about the big picture in terms of the institute’s goals and your research in general? What are the real world applications of those goals?

Definitely. Proteins carry out essentially all of the important functions in living things. They are involved in capturing solar energy and using that energy to build up molecules. They do basically all the chemistry that happens in biology: they catalyze all the chemical reactions. All of developmental biology, the interactions within and between cells, is mediated by proteins. And they form the materials that hold living things together, from hair and collagen to abalone shells, in which proteins interact with inorganic minerals.

Because of that variety, there’s a wide range of applications of protein design. Although proteins do amazing things, they are all historical accidents that came about through evolution. There was no plan to make anything that goes on in a human body. It just happened accidentally through evolution. So, the principle of the Institute for Protein Design is that, now that we understand the fundamental concepts underlying protein folding and protein design, we should be able to design new proteins, built from first principles, to solve the problem that we’re seeking to solve.

One way of looking at it is to consider early human technology. Back when we all lived in caves, if you wanted to solve a problem that took digging, you went outside your cave and looked around for a bone that might have roughly the right shape, and then you might sharpen it a bit. If you wanted to make a spear, you’d sharpen a stick or a bone. If you wanted to cross a stream, you’d find a log and roll it over. The basic idea is: you examine the environment around you and modify things to your purposes. Of course, that’s not how technology works in the modern world. If you want to build a building or a bridge, you don’t start by going into a forest and looking for something that has roughly the right shape.

Yet, that’s really how engineering in the biological realm is currently done. The field of protein engineering basically means you find a protein that exists in nature and that does ‘sort of’ what you want, then you tweak it a little bit. At the Institute for Protein Design, we refer to that as “Neanderthal protein design” because it’s very much in that spirit of modifying what you find around you. The big picture for us is: learning how to build proteins completely from scratch.

If you wanted to build an airplane, you could try to modify birds and attach baskets to them so you could get people to fly. However, you wouldn’t get very far because you really can’t get birds to carry people. Instead, if you understand the principles of flight by studying aerodynamics, then you can actually learn how to build flying machines. That’s a pretty good analogy for what we do: we study the principles of protein folding by studying naturally occurring proteins, we learn from those principles, and then we design brand-new proteins that have superior properties in many ways.

The applications are in:

  • Medicine, vaccines: designing and making protein vaccines that elicit much stronger responses.
  • Therapeutics: drugs that are more specific and more potent than the ones available today.
  • Materials: new types of materials that are suited for today’s medical needs.
  • Diagnostics: more sensitive ways of detecting compounds of interest.

Then, outside of the medical realm:

  • Catalysis: catalyzing chemical reactions for which there are currently no catalysts.
  • Nanomaterials: new self-organizing nanomaterials.

Those are probably the main areas. We’re contemplating protein-based computing as well.

Is there a process by which you determine which proteins you’re going to create? How do you decide that?

Like I said, finding the best things to work on is a challenge because proteins do basically everything in living things. Across the full range of functions that proteins carry out, you’re trying to make better ones. The way that we actually do things is: 1) Collaborative idea generation, where all the neurons, meaning everyone, are talking all the time. That’s one way in which new ideas come about. 2) I get daily e-mails from collaborators, or potential collaborators, asking if we’ve thought about a different, new type of problem where our technology could be relevant.

Obviously, we all follow the scientific literature. We’ve talked a lot to both private foundations, like the Michelson Medical Research Foundation headed by Dr. Gary K. Michelson, and to pharmaceutical companies who come to us with their own set of problems and ask if our technology can be applied to them. Lab members will go to meetings, they’ll talk to people, and we get all these different inputs and then we filter through it. That’s sort of how we decide what to do. Sometimes people come to my group with specific ideas and things that they’re excited about working on. It’s really a pretty wide range of ways. I can’t claim that we’ve got the optimal algorithm for deciding what we should do. [laughs]

With your current setup, are you limited to what you are actually working on? Is there a potential to share the software technology with others so that they can benefit from it as well?

The software we developed, Rosetta, is freely available to academics and nonprofit institutions. It’s also available to any company; they just have to pay the University of Washington (UW) a relatively modest license fee. The software we make is completely available, but we’re always improving it. There are about fifty people, former graduate students and postdoctoral fellows who trained and worked with me, some of them now professors at internationally renowned institutions, who are working on protein design as well.

Are you involved in discussions geared towards the general public rather than the scientific community? Is there a general message you hope to get across to the public, whether about protein design specifically or about the importance of funding this research? You’ve hit on a lot of important points and attributes.

Yes. In addition to the public outreach efforts the institute carries out, let me describe the two main public-facing projects that are internationally available:

  • Rosetta@home: Everything we do is highly compute-intensive. When we have computing jobs to run, we send them to Rosetta@home, which distributes them to volunteers all around the world who contribute spare cycles on their computers.

    This is really the key: we couldn’t do what we’re doing without the volunteers in Rosetta@home, because what they’re actually doing is testing all of the proteins we design before we make them. Their computers do the calculations and then send us back the results.

  • Foldit: This grew out of Rosetta@home in response to user feedback. Where Rosetta@home offers a screensaver that shows the protein folding up, Foldit lets participants interact directly with the protein on their computer. It’s basically an online game where you’re trying to design the best proteins. Recently, Foldit players have been designing proteins from scratch quite successfully.

The reason to have these two projects is, first of all, to get people to talk and get excited about the area of protein folding. Participants in Rosetta@home and Foldit get a feeling for it and can get more actively involved as they get more experienced. We try to give as much feedback as we can about their progress, about the current state of protein folding, and about how we’re using their resources.

Fundamentally, the practice of science depends on how society decides to invest resources in an area of science. It’s highly important to communicate to the public what’s coming out of it and why it’s so important.

Carl Zimmer recently covered your work in The New York Times. Given some of the political issues and misunderstandings surrounding science, I wondered whether protein design and synthetic proteins might stir the kind of political reactions provoked by GMOs. Is this a concern you’ve dealt with in any way?

That’s an interesting point. We have not had that problem, because we’re focused on the most immediate application that people think about, which is drugs and therapeutics. I get lots of e-mails from people, or relatives of people, who suffer from a disease. They just want a cure, and they don’t expect that cure to be something natural or already available. Indeed, there’s no particular reason why a cure should be a natural compound. I think the whole GMO issue doesn’t really come up with therapeutics. There are natural or naturopathic medicines, but there’s not this view today that those are somehow superior to modern medicine. In fact, most people would feel that such natural solutions are inferior. Most people feel differently when their health is at stake than they do about their food.

(Interview took place on February 20th, 2018)


Nisha Kaul Cooch is Founder and Principal of BioInnovation Consulting LLC, a life sciences communications firm based in Washington DC. While earning her Ph.D. in Neuroscience, she studied the nature of decision making and information processing. The focus of her current work is entrepreneurship in the biotechnology industry.


Gary Karlin Michelson, M.D. and Alya Michelson from the Michelson Medical Research Foundation are proud benefactors of the Institute for Protein Design at the University of Washington. Their generous support advances cutting edge research towards a revolutionary protein design pipeline.
