Introduction

What is a biological macromolecule?

If you are looking at this documentation, you probably already know what a macromolecule is. Macromolecules are essential to cell life and are extremely heterogeneous. They can catalyze chemical reactions (e.g., proteins), transfer information across generations (e.g., DNA) or constitute the cell itself (e.g., lipids, carbohydrates and proteins).

Machine learning in structural biology

The study of the 3D structure of these macromolecules has grown dramatically in importance in recent years. The amount of structural data available in public databases such as the Protein Data Bank is increasing at an unprecedented rate. This has not only expanded our knowledge of structural biology, but has also opened the door to applying machine learning algorithms to biological structural data.

The technological gap

However, biological structures are often hard to handle. Public datasets provide atomic coordinates, but this type of data is not directly usable by standard machine learning algorithms. For this reason, structural bioinformatics has been lagging behind other machine learning fields such as computer vision, where modern neural network architectures, such as 3D convolutional neural networks or transformers, are widely used.
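To make the gap concrete: atomic coordinates are a variable-length list of points, whereas a 3D convolutional network expects a fixed, regular grid. The sketch below shows one generic way to bridge the two, by spreading each atom onto a voxel grid as a Gaussian density. It is an illustration only and is not taken from any library: the function name `voxelize`, its parameters, and the toy coordinates and radii are all assumptions made for this example.

```python
import math


def voxelize(coords, radii, resolution=1.0, padding=2.0):
    """Map atoms onto a dense 3D grid of Gaussian densities.

    Hypothetical helper for illustration; not part of any library.
    coords: list of (x, y, z) tuples in angstroms
    radii:  one radius per atom, used as the Gaussian width
    """
    # Lower and upper corners of a padded bounding box around the atoms.
    origin = [min(c[a] for c in coords) - padding for a in range(3)]
    upper = [max(c[a] for c in coords) + padding for a in range(3)]

    # Number of voxels along each axis.
    shape = [
        int(math.ceil((upper[a] - origin[a]) / resolution)) + 1
        for a in range(3)
    ]
    grid = [
        [[0.0 for _ in range(shape[2])] for _ in range(shape[1])]
        for _ in range(shape[0])
    ]

    # Each atom adds a smooth Gaussian contribution to every voxel.
    # Because the density is a smooth function of the coordinates, the
    # same formula written with tensors would be differentiable.
    for (x, y, z), r in zip(coords, radii):
        for i in range(shape[0]):
            dx = origin[0] + i * resolution - x
            for j in range(shape[1]):
                dy = origin[1] + j * resolution - y
                for k in range(shape[2]):
                    dz = origin[2] + k * resolution - z
                    d2 = dx * dx + dy * dy + dz * dz
                    grid[i][j][k] += math.exp(-d2 / (2.0 * r * r))
    return grid, origin


# Toy input: three atoms with made-up positions and approximate radii.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
radii = [1.7, 1.7, 1.5]
grid, origin = voxelize(coords, radii)
print(len(grid), len(grid[0]), len(grid[0][0]))
```

The nested loops keep the example dependency-free; a practical implementation would vectorize the same computation with tensors and restrict each atom to nearby voxels.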

What is PyUUL?

To overcome this problem we built PyUUL, a PyTorch library that transforms biological structures into differentiable 3D objects suitable for machine learning algorithms developed for computer vision. This library therefore greatly increases the number of neural network architectures applicable to structural bioinformatics. Currently, the user can choose between three types of data representation: voxel-based, surface point cloud and volumetric point cloud. If you want to learn more about PyUUL, you can read the manuscript at:

Possible applications in the scientific world

PyUUL can be used to bring machine learning algorithms from computer vision to structural bioinformatics. Some of the algorithms listed below might be good candidates for future work: