In recent years, machine learning models have become increasingly popular for risk assessment of chemical compounds. However, they are often considered “black boxes” due to their lack of transparency, which leads to skepticism among toxicologists and regulators. To increase confidence in these models, researchers at the University of Vienna suggested carefully identifying the regions of chemical space where these models are weak. They developed an innovative software tool (“MolCompass”) for this purpose and the results of this research approach have just been published in the prestigious Journal of Cheminformatics.
For decades, new pharmaceuticals and cosmetics have been tested on animals. These tests are expensive, raise ethical concerns and often fail to accurately predict human responses. Recently, the European Union supported the RISK-HUNT3R project to develop the next generation of animal-free risk assessment methods; the University of Vienna is a member of the project consortium. Computational methods now make it possible to assess the toxicological and environmental risks of new chemicals entirely by computer, without having to synthesize the compounds. But one question remains: how reliable are these computer models?
It’s all about reliable forecasting
To address this issue, Sergey Sosnin, senior scientist at the Pharmacoinformatics Research Group at the University of Vienna, focused on binary classification. In this context, a machine learning model provides a probability score from 0% to 100% indicating whether a chemical compound is active or not (e.g. toxic or non-toxic, bioaccumulative or non-bioaccumulative, binding or not binding to a specific human protein). This probability reflects the model’s confidence in its prediction. Ideally, the model should be confident only in its correct predictions. If the model is uncertain, giving a confidence score of around 51%, its predictions can be set aside in favor of alternative methods. A challenge arises, however, when the model has full confidence in incorrect predictions.
“This is the real nightmare scenario for a computational toxicologist. If a model predicts that a compound is non-toxic with 99% confidence, but the compound is actually toxic, there is no way of knowing that something is wrong.”
Sergey Sosnin, Senior Scientist, Pharmacoinformatics Research Group, University of Vienna
The only solution is to identify in advance the regions of “chemical space” – the space of all possible organic compounds – where the model has “blind spots”, and to avoid them. To do this, a researcher evaluating the model must check the predicted results for thousands of chemical compounds one by one – a tedious and error-prone task.
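As a rough illustration of what that check involves (a minimal sketch, not the authors’ code), the example below trains a binary toxicity classifier on placeholder descriptors and flags predictions that are both highly confident and wrong. The data, the 90% threshold and the choice of model are all hypothetical.

```python
# Minimal sketch: flag predictions that are confident but incorrect.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 128))          # placeholder molecular descriptors
y = rng.integers(0, 2, size=500)    # placeholder labels: 1 = toxic, 0 = non-toxic

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# The probability of the "toxic" class serves as the model's confidence score.
p_toxic = model.predict_proba(X_test)[:, 1]
predicted = (p_toxic >= 0.5).astype(int)
confidence = np.maximum(p_toxic, 1.0 - p_toxic)   # 0.5 = unsure, 1.0 = certain

# The "nightmare scenario": more than 90% confidence, yet the prediction is wrong.
confidently_wrong = (confidence > 0.9) & (predicted != y_test)
print(f"{confidently_wrong.sum()} of {len(y_test)} test compounds "
      "were mispredicted with >90% confidence")
```

Doing this compound by compound across thousands of structures is exactly the manual effort the Vienna team set out to remove.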
Overcoming this major hurdle
“To help these researchers,” continues Sosnin, “we developed interactive graphical tools that display chemical compounds on a 2D plane, like geographic maps. Using colors, we highlight compounds that were incorrectly predicted with high confidence, so that users can spot them as clusters of red dots. The map is interactive, allowing users to explore the chemical space and zoom in on areas of concern.”
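The sketch below is a generic illustration of that idea, not the MolCompass implementation itself: it computes a simple fingerprint for each compound, projects the fingerprints onto a 2D plane, and colors compounds that were mispredicted with high confidence in red. The SMILES strings, labels and model scores are placeholders.

```python
# Illustrative only: a 2D "map" of compounds with confidently wrong predictions in red.
import numpy as np
import matplotlib.pyplot as plt
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.decomposition import PCA

smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CCN(CC)CC", "c1ccc2ccccc2c1"]
y_true = np.array([0, 1, 0, 1, 1])                    # placeholder experimental labels
p_active = np.array([0.05, 0.97, 0.92, 0.55, 0.88])   # placeholder model scores

# Morgan fingerprints as a simple numerical representation of each molecule.
fps = [AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=1024)
       for s in smiles]
X = np.array([list(fp) for fp in fps])

coords = PCA(n_components=2).fit_transform(X)         # project onto a 2D plane

predicted = (p_active >= 0.5).astype(int)
confidence = np.maximum(p_active, 1 - p_active)
confidently_wrong = (confidence > 0.9) & (predicted != y_true)

plt.scatter(coords[:, 0], coords[:, 1],
            c=np.where(confidently_wrong, "red", "grey"))
plt.title("Compounds mispredicted with high confidence appear as red dots")
plt.show()
```

In the actual tool, such a map is interactive, so clusters of red dots point the user directly to the regions of chemical space where the model should not be trusted.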
The methodology was demonstrated on an estrogen receptor binding model. Visual analysis of the chemical space made clear that the model works well for steroids and polychlorinated biphenyls, for example, but fails completely for small acyclic compounds and should not be used for them.
The software developed in this project is freely available to the community on GitHub. Sergey Sosnin hopes that MolCompass will lead chemists and toxicologists to a better understanding of the limitations of computational models. This study is a step towards a future where animal testing is no longer necessary and the only workplace for a toxicologist is a computer desk.
Journal Reference:
Sosnin, S., et al. (2024). MolCompass: multitool for chemical space navigation and visual validation of QSAR/QSPR models. Journal of Cheminformatics. doi.org/10.1186/s13321-024-00888-z.