Evaluation of Vector Representations of Lipid Nanoparticles in Cheminformatic Predictions of Transfection Efficiency – American Journal of Student Research

Publication Date : Jul-11-2025

DOI: 10.70251/HYJR2348.342328


Author(s) : Shruti Patel


Volume/Issue : Volume 3, Issue 4 (Jul - 2025)



Abstract :

Lipid nanoparticles (LNPs) are a revolutionary drug delivery system for RNA-based therapeutics because they resist degradation and can efficiently transport their contents to distant target cells. Optimizing LNP formulations is essential for improving therapies, yet the best method to computationally represent these formulations for predictive modeling remains unexplored. This study analyzes different strategies for constructing the formulation vector of LNPs to evaluate their impact on predicting transfection efficiency. Specifically, three approaches were examined: (1) fully describing all lipid components using molecular descriptors, (2) fully describing only the cationic lipid while incorporating molar ratios for other components, and (3) fully describing both the ionizable and helper lipids while using molar ratios for the rest. Machine learning models were trained using each formulation representation, revealing minimal differences in prediction accuracy. The results suggest that the structures of the cationic and helper lipids are most critical, and that including molecular descriptors for the PEGylated (PEG) lipid and cholesterol may introduce excess noise in the neural network without improving its performance. This finding can streamline LNP formulation research, in which designing specific LNPs traditionally takes years of testing. By identifying effective strategies to represent LNP formulations, this work contributes to optimizing RNA-based drug delivery systems.
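
The three representation strategies can be illustrated with a minimal sketch. It is not the study's code and rests on several assumptions: the role names, the `formulation_vector` helper, the three-value descriptor vectors, and the molar ratios are all hypothetical placeholders. The abstract's "cationic" and "ionizable" lipid are treated as the same role. A lipid that is "fully described" is assumed to contribute only its descriptors, and every other lipid contributes only its molar ratio.

```python
import numpy as np

# Hypothetical role names for the four lipid components of a formulation.
ROLES = ("ionizable", "helper", "cholesterol", "peg")

# Which roles receive full molecular descriptors under each strategy.
STRATEGIES = {
    "all_described": set(ROLES),
    "ionizable_only": {"ionizable"},
    "ionizable_and_helper": {"ionizable", "helper"},
}

def formulation_vector(descriptors, molar_ratios, described):
    """Build one formulation vector (illustrative assumption, not the paper's method).

    Roles in `described` contribute their descriptor vector;
    all other roles contribute a single molar-ratio value.
    """
    parts = []
    for role in ROLES:
        if role in described:
            parts.append(np.asarray(descriptors[role], dtype=float))
        else:
            parts.append(np.array([molar_ratios[role]], dtype=float))
    return np.concatenate(parts)

# Placeholder data: three made-up descriptor values per lipid.
descriptors = {role: np.arange(3, dtype=float) + i for i, role in enumerate(ROLES)}
# Placeholder molar ratios, chosen only for illustration.
molar_ratios = {"ionizable": 0.50, "helper": 0.10, "cholesterol": 0.385, "peg": 0.015}

vectors = {
    name: formulation_vector(descriptors, molar_ratios, roles)
    for name, roles in STRATEGIES.items()
}

for name, vec in vectors.items():
    print(name, vec.shape)
```

Under these assumptions the second and third strategies yield shorter vectors than the first, which shows concretely how dropping the PEG lipid and cholesterol descriptors reduces the input dimensionality seen by a model.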