Conference Agenda


 
 
Session Overview
Session
Poster - 03 Structure prediction: Crystal structure prediction
Time:
Sunday, 15/Aug/2021:
5:10pm - 6:10pm

Session Chair: Qiang Zhu

 


Presentations

Poster session abstracts

Radomír Kužel



Machine Learning on Experimental Crystal Structures to parametrize Models of the Gibbs Energy in Computational Crystallography

Detlef WM Hofmann1, Liudmila LN Kuleshova2

1CRS4, Pula, Italy; 2FlexCryst, Uttenreuth, Germany

We first reported the idea of using data mining techniques to derive a force field from crystallographic structural information at ECM18 (1998) in Prague. Here we outline the machine learning approach, present the results of the validation, and give insight into some applications.

The approach is based on the idea that the lattice energy of experimentally observed crystal structures must fulfill three conditions regarding the Gibbs energy: the lattice energy must be below zero; the crystal structure must be a local minimum; and, if an experimental value is available, the experimental and calculated lattice energies must coincide. These conditions can be used to parametrize a given model (force field or DFT functional) by machine learning. The properly parametrized data mining force field [1] allows the Gibbs lattice energy of all known crystal structures to be calculated within a few hours. Since the Gibbs energy defines the reaction energy, the obtained energies can be used to predict chemical and physical properties of crystals, for instance the formation of co-crystals, polymorph stability, and solubility.
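As a rough illustration of how the three conditions could enter a parametrization, they can be written as penalty terms of a loss that a fitting procedure would minimize over the model parameters. The sketch below is an assumption made for illustration only and is not the authors' implementation: the function name `condition_penalty`, the quadratic penalty forms, and the weights are hypothetical, and a vanishing gradient is only a necessary condition for a local minimum.

```python
from typing import Optional, Sequence


def condition_penalty(
    lattice_energy: float,
    gradient: Sequence[float],
    experimental_energy: Optional[float] = None,
    w_sign: float = 1.0,
    w_min: float = 1.0,
    w_exp: float = 1.0,
) -> float:
    """Penalty for one crystal structure (hypothetical functional form).

    lattice_energy      -- energy calculated by the model being parametrized
    gradient            -- derivatives of that energy with respect to the
                           structural degrees of freedom at the experimental
                           geometry
    experimental_energy -- measured lattice energy, if one is available
    """
    # Condition 1: the lattice energy must be below zero.
    sign_term = max(0.0, lattice_energy) ** 2

    # Condition 2: the experimental structure must be a local minimum.
    # Only stationarity (zero gradient) is penalized here; curvature is ignored.
    min_term = sum(g * g for g in gradient)

    # Condition 3: calculated and experimental energies must coincide.
    if experimental_energy is None:
        exp_term = 0.0
    else:
        exp_term = (lattice_energy - experimental_energy) ** 2

    return w_sign * sign_term + w_min * min_term + w_exp * exp_term
```

Summing this penalty over a set of reference structures would give a single objective for fitting; how the terms are actually weighted or formulated in the data mining force field is not stated in the abstract.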

The parametrized “data mining force field” was validated against the three conditions mentioned above. Firstly, the experimental lattice energies of the reference structures were compared with the calculated energies; the observed errors are on the order of the experimental errors. Secondly, the Gibbs lattice energies were calculated for all crystal structures available in the CSD; the energies were below zero in 99.4% of cases. Finally, the change in density and energy was checked for 500 random structures; the mean error was below 5% for the density and below 2% for the energy.

The very high speed, around 5 s per minimization, makes the model attractive for more complex tasks such as crystal structure prediction. Crystal structure prediction requires several hundred minimizations and a proper similarity index between crystal structures [2]. Even more complex is in silico co-crystal screening [3], which requires hundreds of crystal structure predictions in a reasonable time. On the high-performance computing cluster of CRS4 this can be done within a few days.
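The scale of such a screening can be estimated with simple arithmetic from the quoted 5 s per minimization. The sketch below is only a back-of-the-envelope helper: the function names are hypothetical, the counts passed to it are placeholders for "several hundred" and "hundreds", ideal parallel scaling is assumed, and only minimization time is counted (structure generation and similarity comparison are ignored).

```python
def serial_hours(n_predictions, minimizations_per_prediction,
                 seconds_per_minimization=5.0):
    """Total minimization time in hours if everything ran on one worker."""
    total_seconds = (n_predictions * minimizations_per_prediction
                     * seconds_per_minimization)
    return total_seconds / 3600.0


def wall_clock_hours(n_predictions, minimizations_per_prediction, n_workers,
                     seconds_per_minimization=5.0):
    """Wall-clock hours on n_workers, assuming ideal parallel scaling."""
    return serial_hours(n_predictions, minimizations_per_prediction,
                        seconds_per_minimization) / n_workers


# Placeholder counts (assumptions, not figures from the abstract):
# 300 crystal structure predictions with 500 minimizations each.
example_serial = serial_hours(300, 500)
```

With these placeholder counts the minimizations alone amount to roughly 208 hours of serial time, which shows why a cluster is needed once hundreds of predictions are combined.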

  1. Hofmann, Detlef WM. "Data mining in organic crystallography." Data Mining in Crystallography. Springer, Berlin, Heidelberg, 2009. 89-134.

  2. Hofmann, Detlef Walter Maria, and Ludmila Kuleshova. "New similarity index for crystal structure determination from X-ray powder diagrams." Journal of applied crystallography 38.6 (2005): 861-866.

  3. Stepanovs, Dmitrijs, et al. "Cocrystals of pentoxifylline: In silico and experimental screening." Crystal Growth & Design 15.8 (2015): 3652-3660.

Figure 1: The crystal structure of the cobalt complex JUDLEZ is found to have a positive energy during validation. The reason is a misplaced hydrogen atom. As a consequence, one carbon atom is assigned the atom type “hypercovalent carbon” C, and during the energy calculation a strong repulsive interaction between C4 and C is found. The force field and the database are constantly improved by analyzing such outliers.



Predicting the packing behaviour of porous organic cages

Emma Helen Wolpert, Kim E Jelfs

Department of Chemistry, Imperial College London, Molecular Sciences Research Hub, White City Campus, London, W12 0BZ, UK

Porous organic cages are a subset of porous materials made up of covalently bonded organic molecules that form cages with intrinsic porosity. Unlike extended framework materials, such as metal-organic frameworks, which are connected through covalent or coordination bonds, the assembly of porous organic cages is defined by weak dispersion forces. Therefore, the connectivity between the cages can be easily manipulated by varying the chemical functionality or solvent [1]. This leads to a variety of porous organic cage solids which, depending on the packing behaviour, may contain only intrinsic cavities or have extrinsic pores between the cages, resulting in one-, two-, or three-dimensional pore networks [2]. Consequently, the packing behaviour of the porous organic cages can have a vast effect on the properties of the material [3]. It has been suggested that, in principle, different cages can be combined to produce structures with specific properties [4]. However, reliably predicting the packing behaviour of molecular crystals is challenging because they lack strong bonding networks, which makes targeted design difficult [4].

Although crystal structure prediction can accurately determine crystal energy landscapes, it is computationally expensive to apply to multiple molecular combinations [5]. Here we aim to determine the packing behaviour of porous organic cages through coarse graining. We start by creating a coarse-grained Hamiltonian containing the dominant intermolecular interactions between the cages, informed by force field models. We then aim to employ Monte Carlo simulations using our model in conjunction with hard-particle Monte Carlo simulations [6] to determine the thermodynamic phase behaviour of the packing of the cages. This work focuses on the well-studied porous organic cage CC3 [1] as a proof-of-concept example to determine the extent to which we can use coarse graining to analyse the packing behaviour of other, less well-studied porous organic cages.
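To make the Monte Carlo step concrete, the following is a minimal Metropolis sketch for point particles in a two-dimensional periodic box. It is an illustration under stated assumptions, not the model described above: a Lennard-Jones 12-6 pair term stands in for the coarse-grained cage-cage Hamiltonian, a hard cutoff at half the particle diameter is added for numerical safety, and all function names and parameter values are hypothetical.

```python
import math
import random


def pair_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 energy, a placeholder for the coarse-grained term."""
    if r < 0.5 * sigma:
        return math.inf  # deep overlaps are forbidden (also avoids overflow)
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def minimum_image(d, box):
    """Shortest separation along one axis under periodic boundaries."""
    return d - box * round(d / box)


def particle_energy(i, positions, box):
    """Interaction energy of particle i with all other particles."""
    xi, yi = positions[i]
    total = 0.0
    for j, (xj, yj) in enumerate(positions):
        if j != i:
            dx = minimum_image(xi - xj, box)
            dy = minimum_image(yi - yj, box)
            total += pair_energy(math.hypot(dx, dy))
    return total


def metropolis_accept(delta_e, beta, rng):
    """Accept downhill moves always, uphill moves with exp(-beta * delta_e)."""
    return delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e)


def run_monte_carlo(positions, box, beta, n_steps, max_disp=0.2, seed=0):
    """Single-particle displacement moves; returns positions and acceptance ratio."""
    rng = random.Random(seed)
    positions = [tuple(p) for p in positions]
    accepted = 0
    for _ in range(n_steps):
        i = rng.randrange(len(positions))
        old = positions[i]
        e_old = particle_energy(i, positions, box)
        positions[i] = ((old[0] + rng.uniform(-max_disp, max_disp)) % box,
                        (old[1] + rng.uniform(-max_disp, max_disp)) % box)
        e_new = particle_energy(i, positions, box)
        if metropolis_accept(e_new - e_old, beta, rng):
            accepted += 1
        else:
            positions[i] = old  # reject: restore the previous position
    return positions, accepted / n_steps


# Illustrative run: 16 particles on a square grid in a 6 x 6 box.
start = [(1.5 * i + 0.75, 1.5 * j + 0.75) for i in range(4) for j in range(4)]
final, ratio = run_monte_carlo(start, box=6.0, beta=1.0, n_steps=2000)
```

Repeating such runs over a range of `beta` values is one simple way to probe phase behaviour; production studies of cage packing would use anisotropic particles and a dedicated simulation package rather than this toy loop.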

References: 1. T. Tozawa, J. Jones, S. Swamy, et al. Nature Mater 8, 973–978 (2009)

2. Y. Liu, G. Zhu, W. You, et al. J. Phys. Chem. C 123, 3, 1720–1729 (2019)

3. M. E. Briggs and A. I. Cooper Chem. Mater. 29, 1, 149–157 (2017)

4. J. Jones, T. Hasell, X. Wu, et al. Nature 474, 367–371 (2011)

5. T. Hasell, S. Y. Chong, K. E. Jelfs, et al. J. Am. Chem. Soc. 134, 1, 588–598 (2012)

6. J. A. Anderson, M. E. Irrgang, and S. C. Glotzer. Computer Physics Communications 204, 21-30 (2016)



Come for the drug, stay for the solvent!

Ioana Sovago, Peter Wood

The Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB1 2EZ, UK

The ability to predict physicochemical properties starting from 2-dimensional molecular information is of paramount importance within the crystal engineering discipline, finding applications in industries as diverse as pharmaceuticals, agrochemicals, and pigments.

Within the CCDC, we have been developing a suite of predictive methods to help scientists assess the likely properties of a given small molecule.

The large amount of data available and the fast-growing Artificial Intelligence (AI) field can now facilitate the development of software tools allowing such predictions. Owing to the rise of new and easy-to-implement Machine Learning (ML) algorithms in recent years [1–3], multiple scientific questions have been answered by applying AI approaches. It is now possible to quickly predict NMR spectra using ML models based on quantum calculations [1], which helps with the interpretation of experimental NMR spectra. Space groups can be predicted solely from Pair Distribution Functions [2], and for the first time a new antibiotic was identified using ML, significantly reducing the number of experiments required [3].

We have developed a method that provides an early-stage assessment of the likelihood of solvate formation, so that this can be factored into target compound selection and experimental solid form screening can be planned more effectively. Using a sophisticated machine-learning approach, we can predict solvate formation quickly using only 2D molecular information. Adding an effective assessment of the likelihood of solvate formation to our solid form design toolbox takes us a big step closer to a more complete understanding of the behaviour of compounds in the solid state, as well as to the ability to factor the prediction of solid-state properties into the design stage of a project.

1 W. Gerrard, L. A. Bratholm, M. J. Packer, A. J. Mulholland, D. R. Glowacki and C. P. Butts, Chem. Sci., 2020, 11, 508–515.

2 C. H. Liu, Y. Tao, D. Hsu, Q. Du and S. J. L. Billinge, Acta Crystallogr. Sect. A Found. Adv., 2019, 75, 633–643.

3 J. M. Stokes, K. Yang, K. Swanson, W. Jin, A. Cubillos-Ruiz, N. M. Donghia, C. R. MacNair, S. French, L. A. Carfrae, Z. Bloom-Ackerman, V. M. Tran, A. Chiappino-Pepe, A. H. Badran, I. W. Andrews, E. J. Chory, G. M. Church, E. D. Brown, T. S. Jaakkola, R. Barzilay and J. J. Collins, Cell, 2020, 180, 688-702.e13.



Structural characterization of Cu-tpy-nucleotide ternary complexes

Suma Dilipkumar, Nethaji Munirathinam

Indian Institute of Science, Bengaluru, India

Structural information on ternary metal-aromatic amine-nucleotide complexes is required to explicate the role of metal ions in protein-nucleotide interactions. In the present work, we report two ternary copper complexes, Cu-tpy-GMP (A) and Cu-tpy-CMP (B) [tpy = 2,2’:6’,2”-terpyridine; GMP = guanosine 5’-monophosphate; CMP = cytidine 5’-monophosphate], both of which are 1D (linear) coordination polymers. They crystallize in space groups P2₁ (A) and P2₁2₁2₁ (B) with block and rhombus habits, respectively. In polymer A, the monomer is hexanuclear and the metal ions are bridged by two O atoms of the phosphate group and by O6 and N7 of the heterocyclic base, giving a distorted square pyramidal geometry at all the Cu centres. In the tetranuclear monomeric unit of polymer B, the metal ions are bridged by one O atom of the phosphate group and by O2 and N3 of the heterocyclic base, giving two different geometries: distorted octahedral at two Cu centres and distorted square pyramidal at the other two. Polymer A has 2 units of 5’-GMP and polymer B has the same number of 5’-CMP units in the monomeric asymmetric unit. The spectator molecules are 6 perchlorates, 1 H2O and 3 MeOH in polymer A, whereas polymer B has 4 perchlorates and 16 H2O. The point of polymeric chain extension is at O6 of one nucleotide and a Cu ion in polymer A, but at a sugar-ring OH of one nucleotide and a Cu ion in polymer B. Both structures are stabilized by H-bonding and pi-pi stacking interactions.



CSP: Paracetamol via Grid Search & PSO

Milan Kočí

FSNPE, Czech Technical University in Prague, Prague, Czech Republic

Paracetamol is used mainly in the pharmaceutical industry. It shows a considerable tendency towards polymorphism, and the aim of this work was to predict the crystal structure of its most stable polymorph. We achieved this by generating a large number of structures by Grid Search and subsequently reducing the number of structures using the global optimization algorithm Particle Swarm Optimization.
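The two-stage idea, a coarse grid scan followed by particle swarm refinement of the best candidates, can be sketched on a toy problem. Everything below is an assumption made for illustration: the two-variable function `toy_energy` merely stands in for a lattice energy (the abstract does not state which structural variables or energy model were used), and the function names, swarm coefficients, and grid size are hypothetical.

```python
import itertools
import random


def toy_energy(x):
    """Placeholder objective with its minimum at (1.3, -2.4)."""
    return (x[0] - 1.3) ** 2 + (x[1] + 2.4) ** 2


def grid_search(objective, bounds, points_per_dim, keep):
    """Evaluate a regular grid and return the `keep` lowest-energy points."""
    axes = [[lo + i * (hi - lo) / (points_per_dim - 1)
             for i in range(points_per_dim)] for lo, hi in bounds]
    candidates = [list(p) for p in itertools.product(*axes)]
    candidates.sort(key=objective)
    return candidates[:keep]


def particle_swarm(objective, bounds, start_points, iterations=100,
                   inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Basic global-best PSO started from the given points."""
    rng = random.Random(seed)
    positions = [list(p) for p in start_points]
    velocities = [[0.0] * len(bounds) for _ in positions]
    best_pos = [list(p) for p in positions]          # personal bests
    best_val = [objective(p) for p in positions]
    g = min(range(len(positions)), key=best_val.__getitem__)
    g_pos, g_val = list(best_pos[g]), best_val[g]    # swarm (global) best
    for _ in range(iterations):
        for k, x in enumerate(positions):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[k][d] = (inertia * velocities[k][d]
                                    + cognitive * r1 * (best_pos[k][d] - x[d])
                                    + social * r2 * (g_pos[d] - x[d]))
                # Move and clamp to the search bounds.
                x[d] = min(hi, max(lo, x[d] + velocities[k][d]))
            val = objective(x)
            if val < best_val[k]:
                best_pos[k], best_val[k] = list(x), val
                if val < g_val:
                    g_pos, g_val = list(x), val
    return g_pos, g_val


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
seeds = grid_search(toy_energy, bounds, points_per_dim=11, keep=10)
best_point, best_value = particle_swarm(toy_energy, bounds, seeds)
```

Starting the swarm from the best grid points rather than from random positions mirrors the order described above; because the swarm best is only ever replaced by a lower value, the result can never be worse than the best grid point.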