We know almost nothing about thousands of proteins in the human body
Scientists have created an “unknome” of proteins encoded by human genes, whose existence is known but whose functions are mostly not
By Michael Le Page
8 August 2023
Around 20,000 genes that code for proteins have been identified in humans, but the function of many of these proteins is unknown
Tek Image/Science Photo Library/Getty Images
A database dubbed the “unknome”, which ranks proteins according to how much we have learned about them, has revealed that we still know next to nothing about thousands of human proteins. The team behind the database has also shown that at least some of these proteins are essential for survival.
To create the unknome, Sean Munro at the MRC Laboratory of Molecular Biology in Cambridge, UK, and his colleagues started with the 20,000 or so genes for proteins that have been identified in humans. They grouped together closely related human genes or proteins on the basis that they probably have similar functions, resulting in around 7500 protein clusters.
Next, they added closely related proteins found in commonly studied animals, such as mice or fruit flies, to these clusters, as these probably also have the same function. They then gave each protein cluster a score based on how many entries there were about its members in the main repository of information on the functions of genes, known as the Gene Ontology Resource.
A human protein that hasn’t been directly studied still scores highly if an equivalent protein has been well studied in another animal. Proteins also get higher scores for entries that are regarded as more reliable, such as having been published in a journal. The scoring is slightly arbitrary, says Munro, but this is inevitable when trying to work out what we don’t know.
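To make the scoring idea concrete, here is a minimal sketch in Python of how such a "knownness" score could be tallied. It is an illustration, not the team's actual method: the function name `cluster_scores`, the evidence labels and the weights are all assumptions, since the article says only that more reliable entries count for more. Each cluster pools annotations from its human members and their animal equivalents, so a human protein that has never been studied directly still inherits a score from a well-studied relative.

```python
# Hypothetical reliability weights. The article says only that more
# reliable entries score higher; these labels and values are assumed.
EVIDENCE_WEIGHTS = {"published": 1.0, "inferred": 0.5}


def cluster_scores(cluster_members, annotations):
    """Sum weighted annotation entries for each protein cluster.

    cluster_members: dict mapping a cluster id to a list of protein ids
        (human proteins plus equivalents from other animals).
    annotations: list of (protein_id, evidence_label) pairs, standing in
        for entries in a functional annotation database.
    """
    # Reverse lookup: which cluster does each protein belong to?
    protein_to_cluster = {
        protein: cluster
        for cluster, members in cluster_members.items()
        for protein in members
    }
    # Every cluster starts at zero, so unstudied clusters stay visible.
    scores = {cluster: 0.0 for cluster in cluster_members}
    for protein, evidence in annotations:
        cluster = protein_to_cluster.get(protein)
        if cluster is None:
            continue  # annotation for a protein outside every cluster
        scores[cluster] += EVIDENCE_WEIGHTS.get(evidence, 0.0)
    return scores


# Made-up example data, for illustration only.
clusters = {
    "A": ["human_A", "mouse_A"],
    "B": ["human_B", "fly_B"],
}
entries = [
    ("mouse_A", "published"),
    ("mouse_A", "published"),
    ("human_A", "inferred"),
    ("fly_X", "published"),  # not in any cluster, so ignored
]
result = cluster_scores(clusters, entries)
```

In this toy example, cluster A scores 2.5, mostly on the strength of its mouse member, while cluster B scores 0 because none of its members has any entries.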
The best-studied proteins have scores of well over 100. For instance, a protein called sonic hedgehog, which is involved in embryonic development, scores 168, while p53, which helps stop cells turning cancerous, scores 126. However, more than 2200 proteins have scores below 2, 1100 score below 1 and more than 800 score 0.