TY - JOUR
T1 - Making sense of fossils and artefacts
T2 - a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects
AU - Eijkelboom, Isaak
AU - Schulp, Anne S.
AU - Amkreutz, Luc
AU - Verheul, Dylan
AU - Verschoof-van der Vaart, Wouter
AU - van der Vaart-Verschoof, Sasja
AU - Hogeweg, Laurens
AU - Brunink, Django
AU - Mol, Dick
AU - Peeters, Hans
AU - Wesselingh, Frank
N1 - Publisher Copyright: Copyright 2025 Eijkelboom et al.
PY - 2025/1/1
Y1 - 2025/1/1
AB - Historically, the extensive involvement of citizen scientists in palaeontology and archaeology has resulted in many discoveries and insights. More recently, machine learning has emerged as a broadly applicable tool for analysing large datasets of fossils and artefacts. In the digital age, citizen science (CS) and machine learning (ML) prove to be mutually beneficial, and a combined CS-ML approach is increasingly successful in areas such as biodiversity research. Ever-dropping computational costs and the smartphone revolution have put ML tools in the hands of citizen scientists with the potential to generate high-quality data, create new insights from large datasets and elevate public engagement. However, without an integrated approach, new CS-ML projects may not realise the full scientific and public engagement potential. Furthermore, object-based data gathering of fossils and artefacts comes with different requirements for successful CS-ML approaches than observation-based data gathering in biodiversity monitoring. In this review we investigate best practices and common pitfalls in this new interdisciplinary field in order to formulate a workflow to guide future palaeontological and archaeological projects. Our CS-ML workflow is subdivided in four project phases: (I) preparation, (II) execution, (III) implementation and (IV) reiteration. To reach the objectives and manage the challenges for different subject domains (CS tasks, ML development, research, stakeholder engagement and app/infrastructure development), tasks are formulated and allocated to different roles in the project. We also provide an outline for an integrated online CS platform which will help reach a project’s full scientific and public engagement potential. Finally, to illustrate the implementation of our CS-ML approach in practice and showcase differences with more commonly available biodiversity CS-ML approaches, we discuss the LegaSea project in which fossils and artefacts from sand nourishments in the western Netherlands are studied.
KW - AI
KW - Archaeology
KW - Citizen science
KW - Palaeontology
KW - Project design
U2 - 10.7717/peerj.18927
DO - 10.7717/peerj.18927
M3 - (Systematic) Review article
SN - 2167-8359
VL - 13
JO - PeerJ
JF - PeerJ
IS - 2
M1 - e18927
ER -