Automated Dyslexia Screening In Czech Elementary Students Through LSTM Analysis Of Eye Tracking Data

Publication Date : Oct-08-2025

DOI: 10.70251/HYJR2348.35637650

Author(s) :

Siya Paigude

Volume/Issue :

Volume 3

Issue 5

(Oct - 2025)

Abstract :

Dyslexia is a neurodevelopmental reading disability that affects 5-10% of the population, impairing the speed and accuracy of word recognition as well as reading fluency and comprehension. Early intervention is essential, yet existing screening methods are often unreliable and slow, sometimes taking months to produce results. Limited accessibility increases the risk of under-diagnosis, leaving many children unidentified. This research explores a deep learning-based approach to preliminary dyslexia screening in Czech 4th-grade students (ages 9-10) using eye-tracking data from the ETDD70 dataset. The dataset contains recordings from 70 participants (35 dyslexic, 35 non-dyslexic), comprising over 7.2 million eye movement data points across three reading tasks: syllable reading, meaningful text comprehension, and pseudo-text decoding. We developed a novel model consisting of three parallel bidirectional Long Short-Term Memory (LSTM) networks with 128-dimensional hidden vectors and multi-head attention mechanisms with 3 attention heads, trained using the AdamW optimizer with learning rate scheduling. The primary contribution of this model is its architecture, which processes and extracts information from the eye-tracking data of all three reading tasks simultaneously. With the 70 participants split into training (46), validation (12), and test (12) sets, the model achieved 83.33% overall accuracy with perfect sensitivity (100% recall) for dyslexic identification, alongside 75% precision for dyslexic classification, 100% precision for non-dyslexic identification, and an F1-score of 85.71%. As a proof-of-concept study with a small sample, these preliminary results demonstrate technical feasibility, but the study lacks the baseline model comparisons needed to establish any architectural advantage. Validation through larger-scale studies with diverse populations and systematic comparison against traditional machine learning approaches is required before clinical application can be considered.
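The reported percentages can be cross-checked with a short calculation. The sketch below assumes the metrics were measured on the 12-participant test set, which the abstract does not state explicitly. Under that assumption, the only confusion-matrix counts consistent with the figures are 6 true positives, 2 false positives, 0 false negatives, and 4 true negatives, with "positive" meaning dyslexic. The function and variable names are illustrative, not taken from the study.

```python
# Confusion-matrix counts inferred from the reported percentages, ASSUMING
# they were measured on the 12-participant test set (not stated in the abstract).
# "Positive" means classified as dyslexic.
tp, fp, fn, tn = 6, 2, 0, 4


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def screening_metrics(tp, fp, fn, tn):
    """Compute the metrics quoted in the abstract from raw counts."""
    total = tp + fp + fn + tn
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, total),
        "recall_dyslexic": recall,
        "precision_dyslexic": precision,
        "precision_non_dyslexic": safe_div(tn, tn + fn),
        "f1_dyslexic": safe_div(2 * precision * recall, precision + recall),
    }


metrics = screening_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these inferred counts, 2 of the 6 non-dyslexic test participants would have been flagged as dyslexic. With only 12 test participants, each misclassification shifts accuracy by roughly 8.33 percentage points, which illustrates why the abstract calls for larger-scale validation.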

American Journal of Student Research®