Generalizable Cysteine Quantification in Pea Cultivars from SERS Spectra Using AI
This article has been reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
Rapid quantification of sulfur-containing amino acids, particularly cysteine, in legumes is critical for assessing nutritional quality, screening in breeding programs, and ensuring consistency in quality control. However, conventional methods such as high-performance liquid chromatography (HPLC) are too time-consuming and resource-intensive for high-throughput applications. This study evaluated artificial intelligence models for predicting cysteine concentration from surface-enhanced Raman spectroscopy (SERS) spectra of pea extracts. SERS spectra were acquired from 20 cultivars grown at three geographically distinct locations, with HPLC-measured cysteine concentrations serving as the ground-truth reference. Linear regression, partial least squares regression, support vector regression, random forest regression, and a one-dimensional convolutional neural network (1D-CNN) were compared using within-cultivar splits and leave-one-cultivar-out (LOCO) evaluation. The 1D-CNN achieved an RMSE of 0.008 g/100 g within cultivars and maintained performance under LOCO, while the other models showed limited generalization. Shapley Additive Explanations (SHAP) highlighted informative bands in the 630–760 cm⁻¹ range, and noise modeling was used to optimize scan-count selection.
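The leave-one-cultivar-out (LOCO) protocol named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are made up, the helper names (`loco_splits`, `rmse`, `loco_rmse`, `mean_baseline`) are hypothetical, and the placeholder model simply predicts the training-set mean instead of using spectra. The point is only the splitting logic, in which every sample from one cultivar is held out while the model is fit on all remaining cultivars.

```python
import math
from collections import defaultdict


def loco_splits(cultivars):
    """Yield (held_out, train_idx, test_idx), holding out one cultivar at a time."""
    by_cultivar = defaultdict(list)
    for i, name in enumerate(cultivars):
        by_cultivar[name].append(i)
    for held_out, test_idx in by_cultivar.items():
        train_idx = [i for i, name in enumerate(cultivars) if name != held_out]
        yield held_out, train_idx, test_idx


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def loco_rmse(cultivars, y, fit_predict):
    """Per-cultivar RMSE; fit_predict(train_idx, test_idx) returns test predictions."""
    return {
        held_out: rmse([y[i] for i in test_idx], fit_predict(train_idx, test_idx))
        for held_out, train_idx, test_idx in loco_splits(cultivars)
    }


# Toy data (invented for illustration): cultivar label and cysteine in g/100 g.
cultivars = ["A", "A", "B", "B", "C", "C"]
cysteine = [0.30, 0.32, 0.40, 0.42, 0.35, 0.37]


def mean_baseline(train_idx, test_idx):
    # Placeholder model: predicts the training-set mean for every held-out sample.
    mean = sum(cysteine[i] for i in train_idx) / len(train_idx)
    return [mean] * len(test_idx)


scores = loco_rmse(cultivars, cysteine, mean_baseline)
for name, value in scores.items():
    print(f"held-out cultivar {name}: RMSE = {value:.4f} g/100 g")
```

In a real analysis, `mean_baseline` would be replaced by a model fit on the training spectra. A within-cultivar split, by contrast, lets samples from the same cultivar appear on both sides of the split, which is why LOCO is the stricter test of generalization. scikit-learn's `LeaveOneGroupOut` produces the same kind of split when cultivar labels are passed as `groups`.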
Article activity feed
-
Figure 2.
Are there wavenumbers that are consistent across the two? If there is not a significant degradation in the performance of the 1D-CNN when doing LOCO, why is the SHAP value 10x smaller?
-
To prepare the SERS substrates, prefabricated paper-based SERS (P-SERS) substrates (Metrohm) were handled
Is there a part number for this substrate?
-
using a Raman system equipped with a 785 nm excitation laser delivering 100 mW at the sample surface
Can you provide a part number for the system?
-