Enhancing Microorganism Classification Using Multinomial Logit Regression, k-Nearest Neighbor, and a Hybrid Approach
Publication Date : Nov-13-2025
Author(s) :
Volume/Issue :
Abstract :
This study addresses current challenges in microorganism identification and classification, particularly those that impede timely and accurate medical diagnosis and treatment. Recognizing the constraints of traditional bacterial classification processes, it explores the potential of machine learning to streamline microorganism identification from morphological features. Three modeling techniques were tested: multinomial logistic regression (MLR), k-nearest neighbor (k-NN), and a novel hybrid model integrating the two. The models were trained on a dataset sourced from Kaggle, a Google-owned platform where members of the scientific community publish datasets, and were individually benchmarked on their ability to distinguish between ten distinct microorganism species (Spirogyra, Volvox, Pithophora, Ulothrix, Diatom, Fungi, Yeast, Rhizopus, Penicillium, Aspergillus sp., and Protozoa) based on twenty-four numerical features describing their geometry and structure. The findings showed that while the k-NN model outperformed the multinomial logistic model, the hybrid approach achieved the highest classification accuracy across the ten microorganism species. Compared with the conventional cultivation techniques used in clinical microbiology, machine learning avoids the lengthy processing time that culturing requires, because it can accurately identify pathogens from existing data. As a result, patients, especially those in urgent cases, can receive rapid diagnosis and necessary treatment before a bacterial culture has fully grown.
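The abstract does not say how the hybrid model combines MLR and k-NN. As a rough illustration only, the sketch below assumes one common option: averaging the class probabilities of the two models and picking the most probable class. The data are synthetic (three classes, two features, standing in for the study's species and twenty-four morphological features), and every function name and parameter value here is hypothetical rather than taken from the study.

```python
import math
import random

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mlr_proba(W, x):
    """MLR class probabilities; each row of W holds feature weights plus a bias."""
    return softmax([sum(w * v for w, v in zip(wc[:-1], x)) + wc[-1] for wc in W])

def train_mlr(X, y, n_classes, lr=0.1, epochs=300):
    """Fit multinomial logistic regression with batch gradient descent."""
    n_features = len(X[0])
    W = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            p = mlr_proba(W, xi)
            for c in range(n_classes):
                err = p[c] - (1.0 if c == yi else 0.0)
                for j in range(n_features):
                    grad[c][j] += err * xi[j]
                grad[c][-1] += err  # bias term
        for c in range(n_classes):
            for j in range(n_features + 1):
                W[c][j] -= lr * grad[c][j] / len(X)
    return W

def knn_proba(X, y, x, n_classes, k=5):
    """k-NN class probabilities as vote fractions among the k nearest points."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))[:k]
    votes = [0] * n_classes
    for i in nearest:
        votes[y[i]] += 1
    return [v / k for v in votes]

def hybrid_predict(W, X, y, x, n_classes, k=5):
    """ASSUMPTION: the hybrid averages MLR and k-NN probabilities (soft voting)."""
    p_mlr = mlr_proba(W, x)
    p_knn = knn_proba(X, y, x, n_classes, k)
    avg = [(a + b) / 2 for a, b in zip(p_mlr, p_knn)]
    return max(range(n_classes), key=lambda c: avg[c])

# Synthetic stand-in data: three well-separated clusters, 20 points each.
random.seed(0)
centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
X, y = [], []
for label, (cx, cy) in enumerate(centers):
    for _ in range(20):
        X.append([random.gauss(cx, 0.5), random.gauss(cy, 0.5)])
        y.append(label)

W = train_mlr(X, y, n_classes=3)
for point in ([0.0, 0.0], [4.0, 0.0], [0.0, 4.0]):
    print(point, "->", hybrid_predict(W, X, y, point, n_classes=3))
```

With real morphological features, which typically sit on different scales, standardizing each feature before the k-NN distance computation would likely matter; that step is omitted here for brevity.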
