Feature Selection for Autism Spectrum Disorder Prediction using LASSO Logistic Regression
DOI:
https://doi.org/10.48165/pimrj.2026.3.1.8
Keywords:
Autism Spectrum Disorder (ASD), Machine Learning, Feature Selection, LASSO Logistic Regression, Embedded Methods, Psychiatric Screening, Predictive Modeling, Interpretability
Abstract
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition where early and reliable detection is essential for timely intervention. Machine learning methods are increasingly used to support psychiatric assessments, yet their effectiveness depends strongly on identifying the most relevant features. This study applies a single embedded feature selection approach, LASSO (Least Absolute Shrinkage and Selection Operator) logistic regression, to the publicly available UCI Autism Screening Dataset for Children. The dataset, containing behavioral screening questions and demographic variables, was preprocessed and evaluated using stratified 10-fold cross-validation. LASSO was employed both as a classifier and as a feature selector, shrinking less informative coefficients to zero while retaining the most predictive attributes. Results show that LASSO successfully reduced the dimensionality of the dataset, maintaining strong predictive performance in terms of accuracy, recall, and F1-score. Importantly, family history of autism and specific behavioral responses emerged as consistently influential features. This focused study highlights the value of LASSO as a dual-purpose tool for prediction and feature selection in ASD research. The findings demonstrate that concise, interpretable feature sets can be derived without compromising accuracy, supporting the development of efficient and transparent diagnostic aids for clinical practice.
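The abstract describes LASSO as both a classifier and a feature selector that shrinks less informative coefficients to zero. The following is a minimal sketch of that mechanism, not the study's implementation: the abstract does not name a library, solver, or penalty strength, so the proximal gradient solver, the values of `lam`, `step`, and `n_iter`, and the feature names are all assumptions made for illustration. The data are a hypothetical toy design (every combination of four binary answers coded as -1/+1), not the UCI dataset, and cross-validation is left out to keep the example short.

```python
import itertools
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink toward zero and
    snap anything within the threshold to exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam=0.05, step=0.5, n_iter=500):
    """Minimise mean log-loss + lam * sum(|w_j|) by proximal gradient descent.

    The intercept is not penalised. lam, step and n_iter are illustrative
    assumptions, not values reported by the study.
    Returns (weights, intercept).
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            # Prediction error for this sample under the current model.
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j, xj in enumerate(row):
                grad_w[j] += err * xj / n
        # Plain gradient step for the intercept, gradient step followed by
        # soft-thresholding for the penalised weights.
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical toy data: all 16 combinations of four answers coded -1/+1.
# The label depends only on the first two answers; the last two are noise.
feature_names = ["answer_1", "answer_2", "answer_3", "answer_4"]
X = [list(row) for row in itertools.product([-1.0, 1.0], repeat=4)]
y = [1 if row[0] > 0 or row[1] > 0 else 0 for row in X]

weights, intercept = fit_lasso_logistic(X, y)
selected = [name for name, wj in zip(feature_names, weights) if wj != 0.0]

print("weights:", [round(wj, 3) for wj in weights])
print("selected features:", selected)
```

In this toy design the two uninformative answers end with coefficients of exactly zero, while the two informative ones keep positive weights, which is the dual prediction and selection behaviour the abstract attributes to LASSO. On real data the retained set depends on the penalty strength, which would normally be tuned inside the cross-validation loop.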

