Performance of the supervised learning algorithms in sex estimation of the proximal femur: A comparative study in contemporary Egyptian and Turkish samples
Affiliation: 1. Forensic Medicine and Clinical Toxicology, Faculty of Medicine, Alexandria University, Alexandria, Egypt; 2. Biomedical Engineering, Medical Research Institute, Alexandria University, Egypt; 3. Institute for Intelligent Systems Research and Innovation, Deakin University, Australia; 4. Diagnostic Radiology, Faculty of Medicine, Alexandria University, Egypt; 5. Fixed Prosthodontics, Faculty of Dentistry, Ain Shams University, Egypt; 6. University of Coimbra, Research Centre for Anthropology and Health, Department of Life Sciences, Coimbra, Portugal; 7. University of Coimbra, Laboratory of Forensic Anthropology, Department of Life Sciences, Coimbra, Portugal
Abstract: Sex estimation standards are population specific; however, we argue that machine learning (ML) techniques may enhance biological sex determination in trans-population applications. Linear discriminant analysis (LDA) was compared with nine ML algorithms, including quadratic discriminant analysis (QDA), support vector machine (SVM), decision tree (DT), Gaussian process classifier (GPC), naïve Bayesian classifier (NBC), k-nearest neighbor (KNN), random forest (RFM), and adaptive boosting (AdaBoost). The experiments involved two contemporary populations: a Turkish sample (n = 300) for training and an Egyptian sample (n = 100) for validation. Base models were calibrated using isotonic and sigmoid calibration schemes. Results were analyzed at posterior probability (pp) thresholds of >0.95 and >0.80. At pp = 0.5, the ML algorithms yielded comparable accuracies in the training set (90% to 97%) and the test set (81% to 88%), and these accuracies did not change after calibration. At pp >0.95, the raw RFM, LDA, QDA, and SVM models showed the best performance; however, calibration improved the performance of several classifiers, especially NBC and AdaBoost. By contrast, the performance of the GPC, KNN, and QDA models worsened with calibration. RFM showed the best performance among all models at both thresholds, whereas LDA benefited the most from both calibration methods at pp >0.80. Complex ML models do not necessarily achieve better performance metrics. LDA and QDA remain the fastest and simplest classifiers. We demonstrated that ML can enhance sex estimation on an independent population sample; however, we detected differences in the underlying probability distributions generated by the models, which warrant more cautious application by forensic practitioners.
Keywords: Forensic anthropology; Supervised machine learning algorithms; Femur sexual dimorphism; Regional sex estimation standards; Contemporary metapopulations skeletal database
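
The abstract reports results at several posterior probability (pp) thresholds. As a minimal sketch of what such an evaluation can look like, the Python snippet below computes the accuracy among cases whose pp exceeds a threshold, together with the share of the sample that is classified at that threshold. This is an illustration only and not the authors' code: the function name `evaluate_at_threshold`, the 'M'/'F' label coding, and the example probabilities are all hypothetical assumptions.

```python
from typing import List, Optional, Tuple


def evaluate_at_threshold(
    probs_male: List[float],
    labels: List[str],
    threshold: float = 0.5,
) -> Tuple[Optional[float], float]:
    """Return (accuracy among classified cases, proportion of cases classified).

    probs_male: hypothetical model output, P(male) for each individual.
    labels:     known sex, coded 'M' or 'F' (an assumed coding).
    threshold:  pp cut-off; at 0.5 every case is classified, while above 0.5
                only cases with pp strictly greater than the cut-off are kept.
    """
    classified = 0
    correct = 0
    for p, known in zip(probs_male, labels):
        predicted = "M" if p >= 0.5 else "F"
        pp = max(p, 1.0 - p)  # posterior probability of the predicted class
        if threshold > 0.5 and pp <= threshold:
            continue  # too uncertain: left unclassified at this threshold
        classified += 1
        if predicted == known:
            correct += 1
    coverage = classified / len(labels) if labels else 0.0
    accuracy = correct / classified if classified else None
    return accuracy, coverage


# Made-up probabilities and labels, for illustration only (not study data).
example_probs = [0.99, 0.97, 0.85, 0.60, 0.40, 0.10, 0.03, 0.55]
example_labels = ["M", "M", "M", "F", "F", "F", "F", "M"]

for cutoff in (0.5, 0.80, 0.95):
    acc, cov = evaluate_at_threshold(example_probs, example_labels, cutoff)
    print(f"pp > {cutoff}: accuracy={acc}, classified={cov:.1%}")
```

Raising the threshold trades coverage for confidence: fewer individuals receive a sex estimate, but those that do are classified with higher posterior probability. This is why calibration, which reshapes the probability distribution, can change results at pp >0.95 while leaving the pp = 0.5 accuracy unchanged.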
This article is indexed in ScienceDirect and other databases.