RESEARCH ARTICLE SUMMARY
PROTEIN STRUCTURE
Principles of assembly reveal a
periodic table of protein complexes
Sebastian E. Ahnert,* Joseph A. Marsh,* Helena Hernández, Carol V. Robinson, Sarah A. Teichmann†
INTRODUCTION: The assembly of proteins into complexes is crucial for most biological
processes. The three-dimensional structures of many thousands of homomeric and heteromeric
protein complexes have now been determined, and this has had a broad impact on our
understanding of biological function and evolution. Despite this, the organizing principles
that underlie the great diversity of protein quaternary structures observed in nature remain
poorly understood, particularly in comparison with protein folds, which have been extensively
classified in terms of their architecture and evolutionary relationships.
RATIONALE: In this work, we sought a comprehensive understanding of the general principles
underlying quaternary structure organization. Our approach was to consider protein complexes
in terms of their assembly. Many protein complexes assemble spontaneously via ordered pathways
in vitro, and these pathways have a strong tendency to be evolutionarily conserved.
Furthermore, there are strong similarities between protein complex assembly and evolutionary
pathways, with assembly pathways often being reflective of evolutionary histories, and vice
versa. This suggests that it may be useful to consider the types of protein complexes that
have evolved from the perspective of what assembly pathways are possible.
RESULTS: We first examined the fundamental steps by which protein complexes can assemble,
using electrospray mass spectrometry experiments, literature-curated assembly data, and a
large-scale analysis of protein complex structures. We found that most assembly steps can be
classified into three basic types: dimerization, cyclization, and heteromeric subunit
addition. By systematically combining different assembly steps in different ways, we were
able to enumerate a large set of possible quaternary structure topologies, or patterns of key
interfaces between the proteins within a complex. The vast majority of real protein complex
structures lie within these topologies. This enables a natural organization of protein
complexes into a “periodic table,” because each heteromer can be related to a simpler
symmetric homomer topology. Exceptions are mostly the result of quaternary structure
assignment errors, or cases where sequence-identical subunits can have different interactions
and thus introduce asymmetry. Many of these asymmetric complexes fit the paradigm of a
periodic table when their assembly role is considered. Finally, we implemented a model based
on the periodic table, which predicts the expected frequencies of each quaternary structure
topology, including those not yet observed. Our model correctly predicts quaternary structure
topologies of recent crystal and electron microscopy structures that are not included in our
original data set.
CONCLUSION: This work explains much of the observed distribution of known protein complexes
in quaternary structure space and provides a framework for understanding their evolution. In
addition, it can contribute considerably to the prediction and modeling of quaternary
structures by specifying which topologies are most likely to be adopted by a complex with a
given stoichiometry, potentially providing constraints for multi-subunit docking and hybrid
methods. Lastly, it could help in the bioengineering of protein complexes by identifying which
topologies are most likely to be stable, and thus which types of essential interfaces need to
be engineered.
RESEARCH
SCIENCE sciencemag.org 11 DECEMBER 2015 • VOL 350 ISSUE 6266 1331
Protein assembly steps lead to a periodic table of protein complexes and can predict likely
quaternary structure topologies. Three main assembly steps are possible: cyclization,
dimerization, and subunit addition. By combining these in different ways, a large set of
possible quaternary structure topologies can be generated. These can be arranged on a periodic
table that describes most known complexes and that can predict previously unobserved
topologies.
The list of author affiliations is available in the full article online.
*These authors contributed equally to this work.
†Corresponding author. E-mail: saraht@ebi.ac.uk
Cite this paper as S. E. Ahnert et al., Science 350, aaa2245 (2015). DOI: 10.1126/science.aaa2245
ON OUR WEB SITE
Read the full article at http://dx.doi.org/10.1126/science.aaa2245
RESEARCH ARTICLE
PROTEIN STRUCTURE
Principles of assembly reveal a
periodic table of protein complexes
Sebastian E. Ahnert,1* Joseph A. Marsh,2,3* Helena Hernández,4 Carol V. Robinson,4 Sarah A. Teichmann1,3,5†
Structural insights into protein complexes have had a broad impact on our understanding of
biological function and evolution. In this work, we sought a comprehensive understanding of the
general principles underlying quaternary structure organization in protein complexes. We first
examined the fundamental steps by which protein complexes can assemble, using experimental
and structure-based characterization of assembly pathways. Most assembly transitions can be
classified into three basic types, which can then be used to exhaustively enumerate a large set
of possible quaternary structure topologies. These topologies, which include the vast majority
of observed protein complex structures, enable a natural organization of protein complexes into
a periodic table. On the basis of this table, we can accurately predict the expected frequencies
of quaternary structure topologies, including those not yet observed. These results have
important implications for quaternary structure prediction, modeling, and engineering.
Evolution has given rise to an enormous variety of protein complexes (1–3). The organizing
principles that underlie this diversity remain poorly understood, particularly in comparison
with protein folds, which have been classified extensively in terms of their architecture
(4–6) and evolution (7, 8). However, network models have shown considerable promise in recent
years for characterizing and comparing protein complexes. For example, complexes are often
represented as networks of associations between proteins, with little consideration for
structure or stoichiometry. Alternatively, a graph representation, which we introduced several
years ago, can be used to capture the main features of quaternary structure topology (9). In
this model, the nodes are the polypeptide chains, defined by their amino acid sequence and
often referred to as subunits, and the edges are the interfaces between physically interacting
chains, weighted according to size.
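As an illustration only, the graph representation described above can be sketched in a few
lines of Python. The class name `ComplexGraph`, the chain labels, and the interface sizes are
hypothetical and are not taken from the article; the example encodes a heterotetramer with two
A chains and two B chains in which the B chains do not contact each other.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set


@dataclass
class ComplexGraph:
    """Toy graph of a protein complex (hypothetical helper, not the authors' code)."""

    # Nodes: chain label -> subunit type (chains with the same sequence share a type).
    chains: Dict[str, str] = field(default_factory=dict)
    # Edges: unordered pair of chain labels -> interface size (made-up units here).
    interfaces: Dict[FrozenSet[str], float] = field(default_factory=dict)

    def add_chain(self, label: str, subunit_type: str) -> None:
        self.chains[label] = subunit_type

    def add_interface(self, a: str, b: str, size: float) -> None:
        if a not in self.chains or b not in self.chains:
            raise KeyError("both chains must be added before their interface")
        self.interfaces[frozenset((a, b))] = size

    def stoichiometry(self) -> Dict[str, int]:
        """Count how many chains of each subunit type are present."""
        return dict(Counter(self.chains.values()))

    def neighbors(self, label: str) -> Set[str]:
        """Chains that share an interface with the given chain."""
        return {other
                for pair in self.interfaces if label in pair
                for other in pair if other != label}


# Hypothetical heterotetramer: two A chains, two B chains, no B-B contact.
g = ComplexGraph()
for label, kind in [("A1", "A"), ("A2", "A"), ("B1", "B"), ("B2", "B")]:
    g.add_chain(label, kind)
g.add_interface("A1", "A2", 1500.0)  # interface sizes are invented for the example
g.add_interface("A1", "B1", 900.0)
g.add_interface("A2", "B2", 900.0)
```

Storing each edge under a `frozenset` of the two chain labels keeps the graph undirected, so
an interface is found regardless of the order in which its two chains are named.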
Many protein complexes assemble spontaneously via ordered pathways in vitro, and we have shown
that these assembly pathways have a strong tendency to be evolutionarily conserved (10, 11).
Furthermore, there are strong similarities between protein complex assembly and evolutionary
pathways, with assembly pathways often being reflective of evolutionary histories, and vice
versa (12). Thus, quaternary structure evolution essentially can be thought of as an assembly
process occurring on an evolutionary time scale. This suggests that it may be useful to
consider the types of protein complexes that have evolved from the perspective of assembly
pathways.
In this work, we attempted to understand and explain the organization of protein complexes in
quaternary structure space, using the principles of assembly. First, by characterizing the
assembly pathways of a large number of protein complexes, we found that assembly can be
explained generally by three basic steps: dimerization, cyclization, and subunit addition.
Combinations of these steps allow us to exhaustively enumerate possible quaternary structure
topologies within a given region of quaternary structure space.
To achieve this, we considered each polypeptide
chain as a distinct self-assembly building block
and considered all the ways in which interfaces
can be distributed across the chains that are pres-
ent in the complex. The large variety of possible
topologies generated by this approach were then
compared to observed structures. We found that
~92% of known protein complex structures are
compatible with this model.
A major benefit of this assembly-centric view of protein complexes is that it enables a natural
organization of complexes into a “periodic table,” ordered by the number of subunit repeats (r)
and the number of subunit types that are unique within a given complex (s). Exceptions are
primarily the result of quaternary structure assignment errors or cases where
sequence-identical subunits can have different interactions and thus introduce asymmetry. Many
of these asymmetric complexes fit the paradigm of a periodic table when their assembly role
(rather than their subunit identity) is considered.
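A minimal sketch of how the two coordinates r and s might be read off a list of chain types is
given here. It rests on an assumption that goes beyond the text above: r is taken to be the
copy number shared by every subunit type, and is left undefined (`None`) for complexes with
uneven stoichiometry. The function name `table_coordinates` is hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def table_coordinates(chain_types: Iterable[str]) -> Tuple[Optional[int], int]:
    """Return (r, s) for a complex given one type label per chain.

    s is the number of distinct subunit types. r is the number of repeats,
    defined here (an assumption) only when every type occurs equally often.
    """
    counts = Counter(chain_types)
    if not counts:
        raise ValueError("a complex needs at least one chain")
    s = len(counts)
    copy_numbers = set(counts.values())
    r = copy_numbers.pop() if len(copy_numbers) == 1 else None
    return r, s


homotetramer = table_coordinates("AAAA")    # one type, four repeats
heterotetramer = table_coordinates("AABB")  # two types, two repeats each
uneven = table_coordinates("AAB")           # uneven stoichiometry, r undefined here
```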
Finally, by combining the periodic table with our enumeration, we introduced a model to predict
the expected frequencies of different quaternary structure topologies. Not only does this model
effectively replicate the relative frequencies of known protein complex structures, it also
predicts the new topologies that are most likely to be observed in the future.
A survey of transitions in the assembly
pathways of protein complexes
To understand the principles that underlie quaternary structure organization, it is useful to
begin by considering the different ways in which protein complexes can assemble. We therefore
first sought to determine the assembly and disassembly [“(dis)assembly”] pathways for as many
protein complexes as possible. Previously, we have used electrospray mass spectrometry to
characterize the (dis)assembly of eight homomers (10) and eight heteromers (11, 13). Whereas
the homomers followed simple pathways, more diversity was observed for the heteromeric
complexes. For this reason, in this study, we experimentally characterized the (dis)assembly
pathways of nine additional heteromers with widely varying quaternary structures (Fig. 1). In
all of these cases, well-defined intermediate subcomplexes could be identified under at least
one set of experimental conditions. All eight homomers and 15 of the 17 heteromers
characterized by electrospray mass spectrometry to date have stoichiometries under native
conditions that are consistent with the published biological units in the Protein Data Bank
(PDB).
We also searched the literature for protein complexes of known structure for which experimental
(dis)assembly data are available, as we have done previously (10, 11). Often, these are cases
where at least two different oligomeric states have been observed under equilibrium conditions.
In total, we identified 11 homomers and 13 heteromers for which some (dis)assembly information
is available in the literature.
We obtained further information on protein
assembly by considering the large number of pro-
tein complexes of known structure. We searched
for pairs of protein complexes where the quater-
nary structure of one complex could be described
as a subset of the other. Such pairs include, for
example, a homodimer and a homotetramer with
highly similar or identical sequences, suggesting
that the tetramer assembles via a dimeric inter-
mediate. Also included are homomer-heteromer
pairs, where the heteromer has acquired a subunit
with respect to the homomers. In total, this ap-
proach identified 154 homomers and 263 heteromers
with putative structure-based assembly information.
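The pair search described above can be caricatured with subunit counts alone. The sketch below
is a deliberate simplification and not the authors' procedure: it ignores sequence similarity
and three-dimensional structure, and treats one complex as a possible subset of another
whenever every subunit type occurs at least as often in the larger complex. The function names
and the three example complexes are hypothetical.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, List, Tuple


def is_proper_subset(small: str, large: str) -> bool:
    """True if `small` could be a subcomplex of `large`, judged by chain counts only."""
    small_counts, large_counts = Counter(small), Counter(large)
    if small_counts == large_counts:
        return False  # identical stoichiometry is not a transition
    return all(large_counts[kind] >= n for kind, n in small_counts.items())


def candidate_pairs(complexes: Dict[str, str]) -> List[Tuple[str, str]]:
    """List (smaller, larger) name pairs that suggest a putative assembly step."""
    return [(a, b) for a, b in permutations(complexes, 2)
            if is_proper_subset(complexes[a], complexes[b])]


# One letter per chain; letters stand for sequence-defined subunit types.
examples = {
    "homodimer": "AA",
    "homotetramer": "AAAA",
    "heterotetramer": "AABB",
}
pairs = candidate_pairs(examples)
```

In this toy data set the homodimer pairs with both tetramers, whereas the homotetramer and the
heterotetramer do not pair with each other, because neither contains all chains of the other.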
We recognize that the structure-based pathways do not represent direct characterization of
assembly. Instead, they indicate that two or more different quaternary structure states have
been observed, and we assume that assembly transitions can occur between them. Even for
biophysically characterized assembly pathways, we do not always have evidence that they are
physiologically relevant. However, the fact that the biophysical and structure-based pathways
have a strong tendency to reflect evolutionary history (10) and to be evolutionarily conserved
(11) does suggest that they have a functional relevance.
1 Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson
Avenue, Cambridge CB3 0HE, UK.
2 Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine,
University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK.
3 European Molecular Biology Laboratory–European Bioinformatics Institute, Wellcome Trust
Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
4 Physical and Theoretical Chemistry Laboratory, Department of Chemistry, University of Oxford,
South Parks Road, Oxford OX1 3QZ, UK.
5 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA,
UK.
Given this large set of assembly data, we next asked what quaternary structure transitions
(assembly steps) tend to be observed. For homomeric complexes, we classified all possible
transitions into three types (Fig. 2A, left). First, there is dimerization, where a doubling
of the complex occurs and a twofold axis of rotational symmetry is formed (e.g.,
monomer-to-dimer or dimer-to-tetramer). Second, there is cyclization, which involves the
assembly of a ring-like quaternary structure with higher-order rotational symmetry (e.g.,
monomer-to-trimer or monomer-to-tetramer). Third, there is fractional transition, an
inherently asymmetric step in which the quaternary structure changes by a non-integer ratio
(e.g., dimer-to-trimer or trimer-to-tetramer).
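The three homomeric step types can be sketched as a rule on subunit counts. This is an
assumption-laden simplification rather than the classification procedure used in the study:
the symmetry that actually forms cannot be read from counts alone, so the rule simply treats a
doubling as dimerization, any larger integer multiple as cyclization, and a non-integer ratio
as a fractional transition. The function name `classify_homomer_step` is hypothetical.

```python
from fractions import Fraction


def classify_homomer_step(n_before: int, n_after: int) -> str:
    """Label a single homomeric assembly step from the subunit counts before and after."""
    if n_before < 1 or n_after <= n_before:
        raise ValueError("an assembly step must increase a positive subunit count")
    ratio = Fraction(n_after, n_before)
    if ratio == 2:
        return "dimerization"            # doubling, e.g. dimer-to-tetramer
    if ratio.denominator == 1:
        return "cyclization"             # integer multiple of 3 or more
    return "fractional transition"       # non-integer ratio, e.g. dimer-to-trimer


labels = {
    (1, 2): classify_homomer_step(1, 2),
    (2, 4): classify_homomer_step(2, 4),
    (1, 3): classify_homomer_step(1, 3),
    (1, 4): classify_homomer_step(1, 4),
    (2, 3): classify_homomer_step(2, 3),
    (3, 4): classify_homomer_step(3, 4),
}
```

Using `Fraction` avoids floating-point comparisons when testing whether the ratio is an
integer.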
For each homomer with assembly data, we identified all the assembly steps that could account
for the transitions between the free monomers, the observed subcomplexes, and the full complex
(see Methods). The distributions of these three different assembly steps are shown in Fig. 2B.
All three data sets show a similar trend, with dimerization being the most common step,
cyclization being the next most common, and fractional transitions being rare. This is
consistent with previous observations of the favorable assembly and evolutionary transitions
between homomers with different symmetries (10).
In heteromers, there are two further assembly
steps that are possible, in addition to the three
steps observed for homomers. These are illus-
trated in Fig. 2A (right): subunit addition, in which
a new subunit is acquired (e.g., monomer-to-
heterodimer); and nonstoichiometric transition,
in which the types of subunits within the hetero-
mer remain the same, but their relative ratios
change (e.g., assembly from 1:1 to 2:1 stoichiometry).
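The two additional heteromeric steps can be added to a count-based rule in the same spirit.
The sketch below is illustrative only and makes several assumptions not stated in the article:
stoichiometries are given as dictionaries of subunit type to copy number, any step that
introduces a new subunit type is labeled subunit addition, and a step in which the existing
types change by different factors is labeled nonstoichiometric. The function name
`classify_heteromer_step` is hypothetical.

```python
from fractions import Fraction
from typing import Dict


def classify_heteromer_step(before: Dict[str, int], after: Dict[str, int]) -> str:
    """Label one assembly step from subunit-type copy numbers before and after."""
    if not before:
        raise ValueError("the starting state needs at least one subunit")
    if set(before) - set(after):
        raise ValueError("subunit types may not be lost during assembly")
    if set(after) - set(before):
        return "subunit addition"                 # a new subunit type is acquired
    # Same subunit types on both sides: compare the growth factor of each type.
    ratios = {Fraction(after[kind], before[kind]) for kind in before}
    if len(ratios) > 1:
        return "nonstoichiometric transition"     # relative ratios change
    ratio = ratios.pop()
    if ratio <= 1:
        raise ValueError("an assembly step must increase the subunit count")
    if ratio == 2:
        return "dimerization"
    if ratio.denominator == 1:
        return "cyclization"
    return "fractional transition"


step_1 = classify_heteromer_step({"A": 1}, {"A": 1, "B": 1})          # monomer to heterodimer
step_2 = classify_heteromer_step({"A": 1, "B": 1}, {"A": 2, "B": 1})  # 1:1 to 2:1
step_3 = classify_heteromer_step({"A": 1, "B": 1}, {"A": 2, "B": 2})  # heterodimer doubles
step_4 = classify_heteromer_step({"A": 2, "B": 1}, {"A": 2, "B": 2})  # AAB to BAAB
```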
The distributions of all five possible assembly
steps for heteromers are shown in Fig. 2C. The
same trend is observed among the three homo-
meric steps, with dimerization being the most com-
mon and few fractional transitions. However, across
all five possible steps, the most common observed
step for heteromers from all three data sets is het-
eromeric subunit addition.
Within the heteromers, there is a difference between the transitions observed in the mass
spectrometry data and those recorded in the other data sets. Specifically, nonstoichiometric
transitions are much more common in mass spectrometry data, as evident from the considerable
number of subcomplex intermediates with uneven stoichiometry (different numbers of each
subunit type) shown in Fig. 1. This can be attributed to two factors: the sensitivity of the
mass spectrometry measurements to low-populated assembly intermediates, and the way in which
the mass spectrometry experiments are performed, namely, over a range of destabilizing
solution conditions designed to progressively disrupt the quaternary structure of the complex.
We know that such nonstoichiometric transitions must occur in many cases where they are not
observed. For example, consider the transition from an AA homodimer to a BAAB heterotetramer,
where there is no interaction between the two B subunits. In this case, an AAB assembly
intermediate should form, given that it is highly improbable that two separate B subunits
would bind simultaneously. However, this asymmetric subcomplex is unlikely to be observed
under non-destabilizing conditions and without highly sensitive mass spectrometry
measurements.
Enumeration of the topological space of
protein complexes
Next, we explored quaternary structure space by
combining different assembly steps to determine
which protein complex topologies are possible.
Given that the protein complex assembly pathways
described above are dominated by dimerization,
cyclization, and subunit addition, we focused on
these three steps.
An important consideration is interface sym-
metry. Dimerization results in a twofold axis of
rotational symmetry, and therefore the interface
formed by dimerization will be isologous (sym-
metric or head-to-head) and will involve two
Fig. 1. Mass spectrometry characterization of heteromer (dis)assembly pathways. For each
characterized complex, the known three-dimensional structure is shown with a representative
mass spectrum, accompanied by graph representations of the full complex and subcomplexes. In
all cases, the full complex is represented by the rightmost graph. A full list of subcomplexes
is provided in table S1. The structures of 3DVA, 3O8O, and 4B7Y shown here differ from those in
the PDB: 3DVA is missing the γ subunit, because it was not present in our sample, and the 4:4
model of 3O8O and the 4:2 model of 4B7Y were built from the unit cell to match the mass
spectrometry data. Colors in the graph representations indicate homomeric isologous (green),
homomeric heterologous (blue), and heteromeric heterologous (red) interfaces; shapes indicate
different subunit types.