Selecting relevant Fourier transform infrared spectroscopy wavenumbers for clustering authentic and counterfeit drug samples |
| |
Affiliation: | 1. Department of Industrial Engineering, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99-5° andar, Porto Alegre, RS, Brazil; 2. CRETIES — Centro de Referência em Avaliação de Tecnologias e Insumos Estratégicos em Saúde, Av. Osvaldo Aranha, 99-6° andar, Porto Alegre, RS 90035-190, Brazil; 3. Rio Grande do Sul Technical and Scientifical Division, Brazilian Federal Police, Avenida Ipiranga 1365, 90160-093 Porto Alegre, RS, Brazil; 4. Department of Pharmacy, Universidade Federal do Rio Grande do Sul, Av. Ipiranga, 2752, 90610-000 Porto Alegre, RS, Brazil; 1. Netherlands Forensic Institute, The Hague, Leiden University, The Netherlands; 2. Netherlands Forensic Institute, The Hague, The Netherlands |
| |
Abstract: | This paper proposes a novel method for selecting subsets of wavenumbers from attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy that improve the clustering of medicine samples into two groups, i.e., authentic or fraudulent. To that end, we apply principal component analysis (PCA) to ATR-FTIR data and derive two variable importance indices from the PCA parameters. Next, we carry out an iterative variable (i.e., wavenumber) elimination procedure, clustering the samples with the k-means and fuzzy c-means techniques at each step; clustering performance is assessed by the Silhouette Index (SI). The proposed method is compared with a greedy variable selection method, the "leave one variable out at a time" approach, in terms of clustering quality, percentage of retained variables, and computational time. When applied to Viagra ATR-FTIR data, the proposed method increased the average SI from 0.5307 to 0.8603 using 0.61% of the original 661 wavenumbers; for Cialis ATR-FTIR data, the SI increased from 0.7548 to 0.8681 when 1.21% of the original wavenumbers were retained. The retained wavenumbers, located in the 1091–1046 cm⁻¹ region, are associated with lactose, which is typically regarded as a key substance for discriminating between authentic and counterfeit samples. |
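The procedure summarized in the abstract (PCA-based importance index, iterative wavenumber elimination, clustering, SI assessment) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the two importance indices, so the sketch assumes a single index (the absolute loading of each variable on the first principal component); it uses plain 2-means only (no fuzzy c-means); it keeps the subset with the highest SI; and it runs on synthetic toy data rather than ATR-FTIR spectra. All function names are hypothetical.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def two_means(X, iters=100):
    """Plain k-means with k=2. Assumed deterministic init: the first
    sample and the sample farthest from it."""
    centers = [X[0], max(X, key=lambda r: dist(r, X[0]))]
    labels = None
    for _ in range(iters):
        new = [0 if dist(r, centers[0]) <= dist(r, centers[1]) else 1 for r in X]
        if new == labels:
            break
        labels = new
        for k in (0, 1):
            members = [r for r, lab in zip(X, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(X, labels):
    """Mean silhouette over all samples (two clusters, so the 'nearest
    other cluster' is simply the other cluster)."""
    scores = []
    for i, r in enumerate(X):
        own = [dist(r, X[j]) for j in range(len(X)) if j != i and labels[j] == labels[i]]
        other = [dist(r, X[j]) for j in range(len(X)) if labels[j] != labels[i]]
        if not own or not other:
            scores.append(0.0)
            continue
        a, b = sum(own) / len(own), sum(other) / len(other)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def first_pc_loadings(X, iters=200):
    """Loadings of the first principal component via power iteration
    on the centered data (Z^T Z v, renormalized each step)."""
    n, p = len(X), len(X[0])
    means = [sum(col) / n for col in zip(*X)]
    Z = [[x - m for x, m in zip(r, means)] for r in X]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        s = [sum(z * w for z, w in zip(r, v)) for r in Z]
        w = [sum(si * r[j] for si, r in zip(s, Z)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def select_wavenumbers(X, min_vars=1):
    """Backward elimination: drop the variable with the smallest assumed
    importance index each round; return the subset with the best SI."""
    keep = list(range(len(X[0])))
    best_si, best_keep = float("-inf"), list(keep)
    while True:
        sub = [[r[j] for j in keep] for r in X]
        si = silhouette(sub, two_means(sub))
        if si > best_si:
            best_si, best_keep = si, list(keep)
        if len(keep) <= min_vars:
            break
        importance = [abs(w) for w in first_pc_loadings(sub)]
        keep.pop(importance.index(min(importance)))
    return best_keep, best_si

def make_toy_spectra(n_per_group=10, n_noise=5, seed=0):
    """Synthetic stand-in for spectra: column 0 separates the two
    groups, the remaining columns are uninformative noise."""
    rng = random.Random(seed)
    return [[level + rng.uniform(-0.1, 0.1)] + [rng.uniform(-1, 1) for _ in range(n_noise)]
            for level in (0.0, 10.0) for _ in range(n_per_group)]

if __name__ == "__main__":
    X = make_toy_spectra()
    print("SI with all variables:", silhouette(X, two_means(X)))
    print("retained columns, SI:", select_wavenumbers(X))
```

Because the full variable set is evaluated in the first round, the returned SI can never be lower than the SI obtained with all variables. Re-running PCA on every reduced subset keeps the importance index consistent with the variables still in play, at the cost of extra computation.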
| |
Keywords: | |
This article is indexed in ScienceDirect and other databases. |
|