In a comical twist of fate, it turns out that The Hitchhiker’s Guide to the Galaxy was telling the truth: the answer is 42.
Life scientists have confirmed Douglas Adams's postulate that 42 is the “answer to the ultimate question of life, the universe, and everything”; 42 million, that is. That turns out to be the number of protein molecules present inside an ordinary cell, the basic building block of life on this planet.
DNA and proteins are two of the most important molecules inside of a cell. DNA is the “blueprint” for the “house” that is you, while proteins are the actual wood, cement, and wiring of your body.
The goal of the “Human Genome Project,” one of the most expensive international research endeavors in modern biology, was to sequence human DNA and catalog every gene it contains. Scientists hoped that sequencing all the genetic information in human DNA would account for the complex behavior and physical traits of people.
Surprisingly, the number of genes actually found fell short of the predicted number by more than two thirds, which suggested that much of the diversity of human traits arises at the protein level. Thus, the focus shifted to figuring out the identities and functions of all the proteins in human cells, the human “proteome.”
Proteins carry out nearly all of the activities inside a cell, and knowing how many copies of each protein are present at a given time provides insight into the relative “health” of a cell.
For example, if a cell becomes infected with a virus, the levels of the proteins that fight the infection will rise. Knowing which proteins are elevated hints at their natural function, and identifying them allows scientists and doctors to design new therapies.
Many efforts have been made to quantify the levels of all proteins encoded in a cell, the cell’s “proteome.”
Unfortunately, none of these efforts has provided the complete picture with a generally accepted number of proteins. In the report “Unification of Protein Abundance Datasets Yields a Quantitative Saccharomyces cerevisiae Proteome,” the authors attempt to coalesce the protein abundance data from 21 separate studies into a single value of “molecules per cell” (no small task given the complexity of the compiled data).
The researchers chose to examine cells of the “budding yeast” Saccharomyces cerevisiae, one of the most well-studied single-celled life forms in biology, in large part because it is used to convert sugar into alcohol.
Life scientists like yeast cells because they behave very similarly to human cells. Thus, when human cells cannot be tested, yeast cells provide an informative substitute.
Since the 21 quantification studies reported their results in different units, this study sought to convert all the distinct values into a single “common unit” that would be an intuitive representation of protein levels: “molecules per cell.”
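To make the idea of a “common unit” concrete, here is a minimal sketch of one way a dataset reported in arbitrary units could be rescaled to molecules per cell. It is not the method used in the report: the log-log calibration approach, the function names, and every number below are assumptions chosen purely for illustration.

```python
import math

def fit_log_linear(arbitrary, molecules):
    """Least-squares fit of log10(molecules) = slope * log10(arbitrary) + intercept."""
    xs = [math.log10(a) for a in arbitrary]
    ys = [math.log10(m) for m in molecules]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def to_molecules_per_cell(value, slope, intercept):
    """Convert one measurement in arbitrary units using the fitted line."""
    return 10 ** (slope * math.log10(value) + intercept)

# Hypothetical calibration proteins: signal in arbitrary units paired with
# an assumed known count in molecules per cell (made-up values).
calib_signal = [10.0, 100.0, 1000.0]
calib_molecules = [500.0, 5000.0, 50000.0]

slope, intercept = fit_log_linear(calib_signal, calib_molecules)
converted = to_molecules_per_cell(250.0, slope, intercept)
print(f"250 arbitrary units is roughly {converted:.0f} molecules per cell")
```

Fitting on a logarithmic scale is a design choice suited to quantities that span several orders of magnitude, as protein levels do; a real analysis would also need to handle noise and missing values.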
The researchers used three distinct methods, producing values that correlated well with each other, suggesting no bias depending on the protocol selected.
Moreover, the analyses showed that some experimental methods for evaluating protein concentration were more sensitive than others, though all were still useful. Together they allowed a reasonably accurate assessment of the abundance of 92% of the yeast proteome.
Two distinct methods provided overlapping results regarding the functions of proteins that were most abundant versus those that were the least, which gave confidence that the final count of the proteins was indeed correct.
The most abundant proteins were associated with the production of new proteins. Proteins responsible for the actual shape of a cell were also found at relatively high levels.
In contrast, proteins necessary for cell growth and DNA repair were the most underrepresented. Some of the proteins found in the yeast cell proteome were not detected at all, suggesting they are only produced under special circumstances.
The researchers were also able to define what constitutes a “low abundance” versus a “high abundance” protein. A protein can be classified as “low abundance” when it exhibits 3 to 822 molecules per cell, while a “high abundance” protein has 140,000 to 750,000 molecules per cell.
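The two ranges quoted above can be written down as a small lookup. This is only a sketch: the function name is hypothetical, and because the article does not say how proteins between 822 and 140,000 molecules per cell are labeled, the code reports them as falling outside both ranges rather than guessing.

```python
# Ranges quoted in the article, in molecules per cell.
LOW_RANGE = (3, 822)
HIGH_RANGE = (140_000, 750_000)

def classify_abundance(molecules_per_cell):
    """Return the article's label for a protein count, if it falls in a quoted range."""
    if LOW_RANGE[0] <= molecules_per_cell <= LOW_RANGE[1]:
        return "low abundance"
    if HIGH_RANGE[0] <= molecules_per_cell <= HIGH_RANGE[1]:
        return "high abundance"
    # The article gives no label for counts outside the two ranges.
    return "outside the quoted ranges"

for count in (50, 2_622, 200_000):
    print(count, "->", classify_abundance(count))
```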
The average level was 2,622 protein molecules per cell. Using the combined datasets, the authors concluded that a yeast cell contains a total of 42,000,000 protein molecules. That is only slightly more than half of the 79,000,000 molecules per cell predicted by standard calculations. This is also interesting given that the number of genes found in human DNA was much lower than anticipated.
Importantly, when these levels were compared to those in cells under various environmental stresses, 1,973 of the 4,100 evaluated proteins showed increases or decreases in their number. Furthermore, the changes were seen predominantly in high abundance proteins, and most of the proteins that changed did so in response to a specific stress.
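The proportions behind the figures in the two paragraphs above are easy to check. The short calculation below uses only numbers quoted in the article; the variable names are just for illustration.

```python
# Figures quoted in the article.
measured_total = 42_000_000   # protein molecules per yeast cell
predicted_total = 79_000_000  # theoretical prediction per cell
changed_proteins = 1_973      # proteins whose levels changed under stress
evaluated_proteins = 4_100    # proteins evaluated under stress

measured_share = measured_total / predicted_total
changed_share = changed_proteins / evaluated_proteins

print(f"Measured vs. predicted total: {measured_share:.1%}")  # 53.2%
print(f"Proteins changing under stress: {changed_share:.1%}")  # 48.1%
```

In other words, the measured total is a little over half of the prediction, and just under half of the evaluated proteins responded to stress.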
The authors cited another study that sought to determine the protein levels associated with the proteome (the full complement of proteins encoded in the DNA of a cell) of a common human cell line called U2OS. In that study, they were only able to assign functions for high abundance proteins.
Interestingly though, proteins seen at higher levels in the human cells mirrored the functionality of those identified in the current study: components of the protein synthesis machinery of the cell. This implies that the yeast cell continues to be a useful surrogate for human cells.
While this investigation's findings advance our knowledge of how protein levels are distributed within a yeast cell, and of how that distribution relates to human cells, many more areas of biological research still need to be integrated with this data.
For example, a single protein can have widely varied functions depending on whether it has been post-translationally modified (3).
“Post-translational modifications” are chemical changes made to the amino acids that make up a protein, and they can dramatically alter that protein's function. It is similar to altering the flavor of a burger by adding cheese or bacon or both. Additionally, levels of different proteins can fluctuate up and down as a result of different “epigenetic modifications,” which are changes in how genes are used that do not involve alterations in the actual DNA sequence.
There is a significant amount of complexity to protein functionality and representation within a cell, but as this investigation has shown, it is not beyond the reach of modern science, or of a certain satirical science fiction writer.