Conference Agenda

Session Overview
Session
MS-1: Structural bioinformatics
Time:
Sunday, 15/Aug/2021:
10:20am - 12:45pm

Session Chair: Janusz Marek Bujnicki
Session Chair: Jiri Cerny
Location: Club B

50 1st floor

Invited:  Zhichao Miao (UK)


Session Abstract

For all abstracts of the session as prepared for Acta Crystallographica see PDF in Introduction, or individual abstracts below.


Presentations
10:20am - 10:25am

Introduction to session

Janusz Bujnicki, Jiří Černý



10:25am - 10:55am

RNA-Puzzles - the evaluation and automation of RNA 3D structure prediction

Zhichao Miao1,2,3

1Translational Research Institute of Brain and Brain-Like Intelligence and Department of Anesthesiology, Shanghai Fourth People's Hospital Affiliated to Tongji University School of Medicine, Shanghai 200081, China; 2Newcastle Fibrosis Research Group, Institute of Cellular Medicine, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK; 3European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK

RNA-Puzzles is a collective endeavour dedicated to the advancement and improvement of RNA 3D structure prediction. With agreement from crystallographers, the RNA structures are predicted by various groups before the publication of the crystal structures. Systematic protocols for comparing models and crystal structures are described and analyzed. In RNA-Puzzles, we discuss a) the capabilities and limitations of current methods for sequence-based RNA 3D structure prediction; b) the progress in RNA structure prediction; c) the possible bottlenecks that hold back the field; d) the comparison between automated web servers and human experts; e) the prediction rules, such as coaxial stacking; f) the prediction of structural details and ligand binding; g) the development of novel prediction methods; and h) the potential improvements to be made.

To date, 28 RNAs with crystal structures have been predicted, and many of the predictions have achieved high accuracy in comparison with the crystal structures. We have summarized part of our results in three papers and two community-wide meetings. With the results in RNA-Puzzles, we illustrate that the current bottlenecks in the field may lie in the prediction of non-Watson-Crick interactions and the reconstruction of the global topology. Correct coaxial stacking and tertiary contacts are key for the prediction of RNA architecture, while ligand binding modes can only be predicted at low resolution.

We are now extending the prediction to RNA sequences in the Rfam families. We have predicted structures for 20 RNA families, and some of the predictions have been confirmed by crystal or cryo-EM structures, indicating that predicted models can be used for functional inference. The predicted models have also helped in molecular replacement for crystal structures.

For model evaluation, we present the RNA-Puzzles toolkit, a computational resource including (i) decoy sets generated by different RNA 3D structure prediction methods (raw, for-evaluation and standardized datasets), (ii) tools for 3D structure normalization, analysis, manipulation and visualization (RNA_format, RNA_normalizer, rna-tools) and (iii) 3D structure comparison metric tools (RNAQUA, MCQ4Structures).

With the increasing number of RNA structures being solved, as well as the growth of high-throughput biochemical experiments, RNA 3D structure prediction is becoming routine and accurate. Structure modelling aided by experimental data may effectively help in understanding noncoding RNA function, especially that of viral RNAs.



10:55am - 11:25am

A nucleic acid structural alphabet and conformational analyses at dnatco.datmos.org

Bohdan Schneider

Institute of Biotechnology of the Czech Academy of Sciences, Vestec, Czech Republic

The experimental models archived in the Protein Data Bank provide a rich source of structural information on proteins and nucleic acids. Complex architectures of RNA molecules as well as non-canonical DNA structures prove that the sugar-phosphate backbone is not merely a scaffold that more or less passively accommodates and enables the base pairing and stacking motifs formed by the four nitrogenous bases. In the past, RNA structures attracted more attention [1-5], as their 3D folds are formed by a visibly rich ensemble of backbone geometries. The self-recognition of DNA duplexes posed seemingly fewer challenges to the analysis of their structural details. However, a detailed look showed structurally well-defined conformers [6, 7] that proved useful in discriminating different modes of binding of DNA to transcription factors and to histones in the nucleosome core particle [8]. The analysis has shown that differences in the local DNA structure relate to the specificity of the binding. In 2020, the conformational spaces of DNA and RNA, which were traditionally analyzed separately, were described by one unified set of dinucleotide conformers, called NtC, and by a related structural alphabet, CANA (Conformational Alphabet of Nucleic Acids) [9]. I will briefly describe the principle of the fully automated and robust assignment of the NtC classes and CANA symbols, and give an overview of related tools that help to annotate, validate and refine experimental structures and to build computer models of NA molecules. All these tools are freely available at the web service dnatco.datmos.org [10].
[1] Duarte et al. NAR 31:4755 (2003).
[2] Murray et al. PNAS USA 100:13904 (2003).
[3] Hershkovitz et al. NAR 31:6249 (2003).
[4] Schneider et al. NAR 32:1666 (2004).
[5] Richardson et al. RNA 14:465 (2008).
[6] Svozil et al. NAR 36:3690 (2008).
[7] Schneider et al. Acta Cryst D74:52 (2018).
[8] Schneider et al. Genes 8:278 (2017).
[9] Černý et al. NAR 48:6367 (2020).
[10] Černý et al. Acta Cryst D76:805 (2020).



11:25am - 11:45am

Applications of residue contact predictions in structural biology.

Filomeno Sanchez Rodriguez1,2, Ronan Keegan3, Melanie Vollmar2, Gwyndaf Evans2, Daniel Rigden1

1University of Liverpool, Liverpool, United Kingdom; 2Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, Oxfordshire, United Kingdom; 3STFC, Rutherford Appleton Laboratory, Harwell Oxford, United Kingdom

Recent developments in the field of evolutionary covariance and machine learning have enabled the precise prediction of residue-residue contacts and increasingly accurate inter-residue distance predictions. Access to accurate covariance information has played a pivotal role in the recent advances observed in the field of protein bioinformatics, particularly the improvement of prediction of protein folds by ab initio protein modelling. As this work seeks to showcase, this data is of equal value in the field of X-ray crystallography, with several practical applications in MR, model validation and map interpretation.

The most prevalent technique for the solution of the phase problem in macromolecular crystallography is molecular replacement (MR). In most cases, the availability and detection of a suitable search model, typically a solved structure homologous to the target of interest, is the key limitation of conventional MR. In those cases where no such structure is available, unconventional MR approaches are used. Recent results suggest that even in those cases where no homologous structures are found for a given target, it may still be possible to find suitable search models among unrelated structures, in the form of regions that share high, albeit local, structural similarity with the target. The challenge then becomes the accurate identification of such search models among the vast number of available solved structures. Here we present SWAMP, a novel pipeline for the solution of structures of transmembrane proteins, which exploits the latest advances in residue contact predictions for the detection of fragments later to be used as search models. SWAMP includes a library of ensembles built by clustering commonly observed packings of transmembrane helical pairs in close contact, mined from the available databases. Residue contact predictions are used in the process of search model selection: the contact maximum overlap between the target’s predicted contacts and the observed contacts of each member of the library is used to estimate the likelihood of the helical pair being a successful search model. Preliminary results show that SWAMP is capable of detecting valid search models originating from unrelated solved structures solely exploiting this contact information. This enables the solution of new and challenging structures without the use of experimental phasing techniques, and opens a whole new avenue of research in which predicted contact information is used to extend the reach of unconventional MR.

The final outcome of X-ray crystallographic experiments is the determination of the structure of interest, which requires building a model that satisfies the experimental observations. However, experimental limitations can lead to the presence of unavoidable uncertainties during model building resulting in regions that require validation and potentially further refinement. Many metrics are available for model validation, but are mostly limited to the consideration of the physico-chemical aspects of the model or its match to the map. We present new metrics based on the availability of accurate inter-residue distance predictions, which are then compared with the distances observed in the emerging model. Early results suggest that these metrics are capable of detection of register and other errors, even in challenging cases where conventional metrics may struggle.
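
The idea of comparing predicted inter-residue distances with those observed in a model can be illustrated with a minimal Python sketch. It is not the authors' metric: the function names, the per-residue mean absolute deviation, the 1.0 Å threshold and the toy coordinates are all assumptions chosen for illustration only.

```python
import math


def pairwise_distances(coords):
    """Return {(i, j): distance} for all residue pairs with i < j."""
    dist = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist[(i, j)] = math.dist(coords[i], coords[j])
    return dist


def per_residue_deviation(predicted, observed, n_residues):
    """Mean absolute difference between predicted and observed distances,
    averaged over all pairs that involve each residue."""
    totals = [0.0] * n_residues
    counts = [0] * n_residues
    for (i, j), d_pred in predicted.items():
        if (i, j) not in observed:
            continue
        err = abs(d_pred - observed[(i, j)])
        for k in (i, j):
            totals[k] += err
            counts[k] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def flag_residues(deviations, threshold=1.0):
    """Indices of residues whose deviation exceeds the (hypothetical) threshold."""
    return [i for i, d in enumerate(deviations) if d > threshold]


# Toy data: eight residues on a straight line, 3.8 Å apart, stand in for the
# "predicted" geometry; the model has residue 4 displaced by 6 Å.
reference = [(3.8 * i, 0.0, 0.0) for i in range(8)]
model = list(reference)
model[4] = (3.8 * 4, 6.0, 0.0)

predicted = pairwise_distances(reference)
observed = pairwise_distances(model)
flagged = flag_residues(per_residue_deviation(predicted, observed, len(model)))
print(flagged)
```

In this toy case only the displaced residue accumulates a large average deviation; its neighbours each disagree in a single pair and stay below the threshold.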

Residue contact and inter-residue distance predictions are usually represented as two-dimensional matrices: binary contact maps and distograms, respectively. These typically omit contacts between sequential near neighbours, resulting in a blank space on and near the diagonal of the matrix. A multitude of properties can be predicted by other sequence-based methods, and researchers often need to consider diverse sources of information in order to form a complete and integrated picture for the inference of structural features that can facilitate structure solution. Here we present ConPlot, a web-based application which uses the typically empty space near the contact map or distogram diagonal to display multiple coloured tracks representing other sequence-based predictions. These predictions can be uploaded in various popular file formats. This web application is currently available online at www.conplot.org, along with documentation and examples.
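
To make the blank near-diagonal region concrete, the following Python sketch builds a binary contact map from coordinates and lists the cells that remain empty. It does not reproduce ConPlot's implementation; the 8 Å cutoff, the minimum sequence separation of 3 and the toy hairpin coordinates are assumptions for illustration.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=3):
    """Binary, symmetric contact map; pairs closer in sequence than
    min_separation are skipped, which leaves the diagonal band empty."""
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


def free_diagonal_cells(n, min_separation=3):
    """Cells (i, j) with |i - j| < min_separation: never filled by contacts,
    so they are available for displaying other per-residue annotations."""
    return [(i, j) for i in range(n) for j in range(n)
            if abs(i - j) < min_separation]


# Toy hairpin: two antiparallel strands of four residues, 5 Å apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
          (11.4, 5.0, 0.0), (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0)]

cmap = contact_map(coords)
n_contacts = sum(cmap[i][j] for i in range(8) for j in range(i + 1, 8))
blank = free_diagonal_cells(len(coords))
for row in cmap:
    print("".join(str(v) for v in row))
```

Adjacent residues are only 3.8 Å apart here, yet their cells stay at zero because of the sequence-separation filter; that band is the space the tracks described above can occupy.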



11:45am - 12:05pm

Pepsi-SAXS/SANS - small-angle scattering-guided tools for integrative structural bioinformatics

Sergei Grudinin1, Anne Martel2, Sylvain Prevost2

1CNRS, Grenoble, France; 2ILL, Grenoble, France

I will present some recent developments of our Pepsi package for integrative modeling of macromolecules guided by small-angle scattering profiles. These include very fast tools for the all-atom computation of X-ray and neutron small-angle scattering profiles, called Pepsi-SAXS and Pepsi-SANS, respectively [1,2]. These tools implement algorithms specifically designed to handle two notable properties of large macromolecules and their complexes, such as viral capsids: their high flexibility and high degree of symmetry.

Flexibility of macromolecules is not spontaneous but is linked with their structure and function. Computationally, it can often be approximated with just a few collective coordinates, which can be computed e.g. using Normal Mode Analysis (NMA). NMA determines low-frequency motions at a very low computational cost. These motions are particularly interesting to the structural biology community because they give insight into protein function and dynamics and allow extrapolating biologically relevant motions starting from high-resolution structures. Recently, we have shown that NMA can be extended to model local deformations and to better preserve the stereochemistry of the protein. We have developed a computationally efficient nonlinear NMA method that can be applied to the largest complexes from the Protein Data Bank (PDB) and that preserves local stereochemistry very well [3-5].

Large symmetrical protein structures have seemingly evolved in many organisms because they carry specific morphological and functional advantages compared to small individual protein molecules. Recently we have proposed a novel free-docking method for protein complexes with arbitrary point-group symmetry [6]. It assembles complexes with cyclic symmetry, dihedral symmetry, and also those of high order (tetrahedral, octahedral, and icosahedral). We have also proposed an efficient analytical solution to the inverse problem, that is, the identification of the symmetry group of a protein assembly, with the corresponding axes and their continuous symmetry measures [7-8].

With Pepsi-SAXS and Pepsi-SANS, one can leverage the above-mentioned developments, by optimizing structures along low-frequency "normal modes", performing automatic and adaptive coarse-graining of molecular models, rescoring free-docking predictions, including those of symmetric assemblies, and also optimizing structural transitions. Structural models produced by Pepsi-SAXS/SANS were ranked top in the recent data-assisted protein structure prediction sub-challenge in CASP13 [9].

[1] Grudinin, S. et al. (2017). Acta Cryst. D, D73, 449 – 464. For more information https://team.inria.fr/nano-d/software/pepsi-saxs/
[2] https://team.inria.fr/nano-d/software/pepsi-sans/
[3] Hoffmann, A. & Grudinin, S. (2017). J. Chem. Theory Comput. 13, 2123 – 2134. For more information https://team.inria.fr/nano-d/software/nolb-normal-modes/
[4] Grudinin, S., Laine, E., & Hoffmann, A. (2020). Predicting protein functional motions: an old recipe with a new twist. Biophysical journal, 118(10), 2513-2525.
[5] Laine, E., & Grudinin, S. (2021). HOPMA: Boosting protein functional dynamics with colored contact maps. The Journal of Physical Chemistry B, 125(10), 2577-2588.
[6] Ritchie, D. W. & Grudinin, S (2016). J. Appl. Cryst., 49, 1-10.
[7] Pages, G., Kinzina, E, & Grudinin, S (2018). J. Struct.Biology, 203 (2), 142-148.
[8] Pages, G. & Grudinin, S (2018). J. Struct.Biology, 203 (3), 185-194.
[9] Hura, G. L., ... & Tsutakawa, S. E. (2019). Small angle X‐ray scattering‐assisted protein structure prediction in CASP13 and emergence of solution structure differences. Proteins: Structure, Function, and Bioinformatics, 87(12), 1298-1314.



12:05pm - 12:25pm

Refactoring the B-factor: intuitively extracting structural dynamics from macromolecular disorder

Nicholas M Pearce1, Piet Gros2

1Free University of Amsterdam, Amsterdam, The Netherlands; 2Utrecht University, Utrecht, The Netherlands

Displacement parameters (B-factors) play a crucial role in macromolecular structure determination, yet are rarely used for biological interpretation. This is unfortunate, since they account for the local flexibility of individual protein states/conformations. We have developed a new approach [1] for dividing the disorder information in a macromolecular model into a hierarchical series of components on different length-scales, which reveals the components of the atomic disorder that result from molecular disorder, domain disorder, or local atomic disorder. This makes both molecular and atomic disorder intuitively understandable in terms of likely domain motions and internal atomic motions. We demonstrate this new approach by studying the flexibility of the catalytic site in crystal structures of the SARS-CoV-2 main protease. Additionally, we apply the method to structures determined by cryo-EM, where we can investigate and visualize the flexibility in both the extended and non-extended receptor-binding domains of the SARS-CoV-2 spike glycoprotein, and in the iron-reductase STEAP4, the latter hinting at a mechanism for electron transfer.
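
The notion of splitting disorder into molecular, domain and atomic levels can be illustrated with a deliberately simplified Python sketch. It is not the authors' method: it treats B-factors as isotropic scalars and simply assigns each level the largest value shared by every atom in the group, whereas the published approach is considerably more sophisticated. The function name, the domain labels and the B-factor values are hypothetical.

```python
def decompose_b_factors(b_factors, domains):
    """Split each isotropic B-factor into (molecule, domain, atom) parts.

    molecule: the disorder shared by every atom (global minimum);
    domain:   the extra disorder shared by every atom of a domain;
    atom:     whatever remains for the individual atom.
    The three parts always sum back to the input value.
    """
    molecule = min(b_factors)
    remaining = [b - molecule for b in b_factors]

    domain_level = {}
    for label in set(domains):
        domain_level[label] = min(
            r for r, d in zip(remaining, domains) if d == label
        )

    return [
        (molecule, domain_level[d], r - domain_level[d])
        for r, d in zip(remaining, domains)
    ]


# Hypothetical B-factors (in Å^2) for six atoms in two domains.
b_factors = [20.0, 22.0, 25.0, 35.0, 38.0, 36.0]
domains = ["A", "A", "A", "B", "B", "B"]

components = decompose_b_factors(b_factors, domains)
for b, (mol, dom, atom) in zip(b_factors, components):
    print(f"B={b:5.1f}  molecule={mol:5.1f}  domain={dom:5.1f}  atom={atom:5.1f}")
```

In this toy example domain B carries 15 Å^2 of disorder beyond the molecular level, which would read as a more mobile domain, while the atomic residuals stay small.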



12:25pm - 12:45pm

Computational modeling of RNA 3D structures and RNA-protein complexes, with the use of experimental data

Janusz Marek Bujnicki

International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland

Ribonucleic acid (RNA) molecules are master regulators of cells. They are involved in many molecular processes: they transmit genetic information, sense cellular signals and communicate responses, and even catalyze chemical reactions. RNA function, and in particular its ability to interact with other molecules such as proteins, is encoded in the sequence. Understanding how RNAs and RNA-protein complexes carry out their biological roles requires detailed knowledge of the RNA structure.

Due to limitations in experimental structure determination, complete high-resolution structures are available for only a tiny fraction of all the known RNA molecules crucial for numerous fundamental cellular processes. Fewer than 1% of RCSB entries represent RNA structures, and only around 3% of the RNA families available in the Rfam database have at least one experimentally determined structure. This relative paucity of information compared to what is available for proteins also makes computational RNA 3D structure prediction much less successful. Currently, purely computational RNA 3D structure prediction is limited to sequences shorter than 100 nt.

I will present strategies for computational modeling of RNA and RNA-protein complex structures that utilize SimRNA, a suite of methods developed in my laboratory, which use coarse-grained representations of molecules, rely on the Monte Carlo method for sampling the conformational space, and employ statistical potentials to approximate the energy and identify conformations that correspond to biologically relevant structures. In particular, I will discuss the use of computational approaches for RNA structure determination based on low-resolution experimental data, including low-resolution crystallographic electron density maps and cryo-EM maps.
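
The general principle of Monte Carlo sampling guided by an energy function can be sketched in a few lines of Python. The abstract does not state which Monte Carlo variant, move set or potential is used, so everything below is an assumption: the Metropolis acceptance rule is one common choice, and the one-dimensional `toy_energy` is a hypothetical stand-in for a statistical potential over coarse-grained conformations.

```python
import math
import random


def toy_energy(x):
    """Hypothetical 1D energy landscape with its minimum at x = 2."""
    return (x - 2.0) ** 2


def metropolis_accept(delta_e, temperature, rng):
    """Always accept downhill moves; accept uphill moves with
    probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def sample(n_steps=5000, temperature=0.5, step=0.5, seed=0):
    """Random-walk Metropolis sampling; returns the lowest-energy state seen."""
    rng = random.Random(seed)
    x = -5.0  # arbitrary starting "conformation"
    best_x, best_e = x, toy_energy(x)
    for _ in range(n_steps):
        candidate = x + rng.uniform(-step, step)
        delta_e = toy_energy(candidate) - toy_energy(x)
        if metropolis_accept(delta_e, temperature, rng):
            x = candidate
            e = toy_energy(x)
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = sample()
print(f"lowest-energy state: x={best_x:.3f}, E={best_e:.4f}")
```

In such a scheme, restraints derived from experimental data would typically enter as additional terms in the energy function, so that conformations inconsistent with the data are sampled less often.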

References

1 Ponce-Salvatierra, A. et al. Biosci. Rep. 39, BSR20180430 (2019)
2 Boniecki, M. J. et al. Nucleic Acids Res. 44, e63 (2016)