Machine Learning approaches in the evaluation of pXRF data in provenance studies of transport amphorae


Abstract

Ceramics manufactured at a specific location from local clayey raw materials, following a particular workflow of clay paste processing, are assumed to exhibit a characteristic composition in terms of elemental concentrations, mineralogical compounds and petrographic fabric. Examining ceramics from different manufacturing sites or regions, for example through elemental analysis, allows compositional categories to be defined as references for the origin of manufacture. Examining ceramics from trading or consumption sites, on the other hand, allows them to be assigned to these reference categories and thus their origin and dissemination to be investigated. In the particular case of transport amphorae, ancient trade networks for commodities such as wine, oil or grain can be investigated. The elemental analysis of archaeological ceramics commonly relies on laboratory methods such as neutron activation analysis (NAA) or wavelength-dispersive X-ray fluorescence spectrometry (WD-XRF). These provide high analytical performance but require sampling, albeit of a minute amount of material, from the ceramic artifact. Handheld portable energy-dispersive XRF (pXRF), on the other hand, allows for non-invasive analysis of large numbers of ceramic artifacts within comparatively short time periods. A major drawback of pXRF, though, is its higher analytical uncertainty, both in precision or reproducibility and in accuracy. This can impede statistical evaluation with the approaches commonly applied to multivariate quantitative data from laboratory analyses, as well as comparison with external reference data. However, even though pXRF data may be noisier, they still reflect compositional similarities and dissimilarities, which might be revealed with alternative approaches to categorization.
An initial case study testing unsupervised machine learning with self-organizing maps (SOM) on a dataset of Hellenistic transport amphorae from the Paphos Agora, a marketplace in Cyprus, indicated the potential of automated categorization of pXRF data through machine learning. In the present case study, supervised machine learning models, such as support vector machines (SVM), random forests (RF) and supervised artificial neural networks (ANN), have been tested on pXRF data of transport amphorae from East Aegean islands and from Paphos. For this, NAA data for a subset of the analysed amphora fragments have been used to predefine compositional categories in the training data. The present data repository at our laboratory comprises c. 2200 measurements of c. 1400 individual amphora fragments from production centres as well as exchange centres in the Eastern Mediterranean region. Even though this is a comparatively large number of data records, the generation of synthetic training data was also tested. The ultimate aim of the present case study is to train a machine learning model for automated pattern recognition and prediction of the origin of manufacture of transport amphorae in order to study trading networks in the region.
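
As a rough illustration of the kind of model comparison described above, the following sketch cross-validates an SVM, a random forest and a small neural network with scikit-learn. Everything in it is an assumption made for illustration: the data are randomly generated stand-ins (three hypothetical compositional groups, six hypothetical element features), and the model settings are arbitrary. It does not reproduce the dataset, features or configuration used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in data (NOT real pXRF measurements):
# 3 compositional groups, 40 fragments each, 6 element features.
n_groups, n_per_group, n_elements = 3, 40, 6
centres = rng.uniform(1.0, 5.0, size=(n_groups, n_elements))
X = np.vstack([
    centre + rng.normal(0.0, 0.3, size=(n_per_group, n_elements))
    for centre in centres
])
# Group labels; in the study these would come from NAA-defined categories.
y = np.repeat(np.arange(n_groups), n_per_group)

# Three supervised classifiers; SVM and ANN are scale-sensitive,
# so they are wrapped in a pipeline with standardization.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "ANN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

# Mean accuracy over stratified 5-fold cross-validation for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean 5-fold accuracy = {score:.2f}")
```

With real measurements, `X` would hold the pXRF element concentrations per fragment and `y` the reference categories; cross-validated accuracy on such data will generally be lower than on these well-separated synthetic groups.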
