Proof‑of‑Concept Machine Learning Classifier for Identifying Clouds and Haze in Exoplanet Transmission Spectra – American Journal of Student Research

Publication Date: Mar-05-2026

DOI: 10.70251/HYJR2348.421418


Author(s): Alexander Liu


Volume/Issue: Volume 4, Issue 2 (Mar - 2026)



Abstract:

Traditional methods for determining whether an exoplanet atmosphere is clear, cloudy, or hazy can require substantial manual interpretation of transmission spectra. Here, a proof‑of‑concept machine learning (ML) classifier is developed to assess whether synthetic training data can support automated classification of atmospheric conditions in observed spectra. A dataset of 10,800 synthetic spectra spanning three classes (clear, cloudy, and hazy) was generated using petitRADTRANS. The dataset was split into training, validation, and testing sets (70/15/15 percent), and a random forest classifier was trained and evaluated. The model achieved a testing accuracy of approximately 99–100 percent, with cross‑validated F1 scores above 0.98 for all three classes. The trained model was then applied to five observed exoplanet spectra, which it classified as cloudy with probabilities between 65 and 89 percent. Although the small observed sample and the reliance on synthetic training data limit generalizability, this study demonstrates the potential for ML to accelerate atmospheric characterization workflows. Future work with larger and more diverse datasets will be required to validate the method for broader scientific applications.
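
As a rough illustration of the 70/15/15 split described in the abstract, the sketch below partitions 10,800 labeled items into training, validation, and testing index sets while preserving class proportions. It is not the author's code. The function name `stratified_split`, the fixed seed, and in particular the equal class sizes (3,600 per class) are assumptions made for the example; the abstract does not state the class balance or whether the split was stratified.

```python
import random
from collections import Counter

CLASSES = ("clear", "cloudy", "hazy")
N_SPECTRA = 10800  # dataset size stated in the abstract


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve per-class proportions.

    Hypothetical helper written for this illustration only.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        # Whatever remains goes to the test set.
        test.extend(indices[n_train + n_val:])
    return train, val, test


# Assumption: balanced classes (3600 each); the abstract does not say.
labels = [CLASSES[i % len(CLASSES)] for i in range(N_SPECTRA)]
train_idx, val_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(val_idx), len(test_idx))  # 7560 1620 1620
print(Counter(labels[i] for i in test_idx))
```

Under the balanced-class assumption, each class contributes 2,520 training, 540 validation, and 540 testing spectra. In practice the same partition would more likely be produced with a library routine, and the resulting index sets would be used to train and evaluate the random forest.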