Vision and mission
TerraLID envisions a scientific community in which as many data as possible are freely available to all. In this community, TerraLID serves as the central hub for lead isotope data, maintained by the community and well connected to other research data infrastructures. TerraLID significantly facilitates the handling of lead isotope data and helps to improve the quality of published data, the richness of metadata, and the reproducibility of interpretations based on lead isotope data. To turn this vision into reality, several challenges must be overcome.
Data handling
On the data level, the challenges are the large variety in the reported metadata (including their reporting formats) and the manual compilation of data from different sources by each researcher or group. The former is tackled by the TerraLID metadata profile, which allows lead isotope data and their contextual information to be described uniformly. The metadata profile structures the large variety of reported metadata, ensures a consistent level of detail, and guarantees that each metadatum is clearly defined. It resolves ambiguities, e.g., in transliterated site names, by using persistent identifiers to augment the information stored in the TerraLID database with metadata records from other data infrastructures. By doing so, the TerraLID metadata profile provides a common ground for reporting lead isotope data and ensures that sufficient contextual information is available for their reuse. Reporting data according to the metadata profile will make it easy to combine data from different sources, saving individual researchers a lot of time. In addition, the metadata profile increases the transparency and reproducibility of data treatment because it eliminates the individual decisions that researchers currently have to make during data compilation when metadata are insufficiently defined and/or overlap only partially between data sources.
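The role of persistent identifiers in combining sources can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record structure, the field names, the identifier strings, and the isotope ratios are placeholders and do not reflect the actual TerraLID metadata profile or any measured data. It only shows that two sources reporting the same site under different transliterations end up in the same group once both carry the same identifier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LeadIsotopeRecord:
    """Hypothetical, heavily simplified record (not the TerraLID profile)."""
    sample_id: str
    site_name: str    # free-text site name as reported by the source
    site_pid: str     # persistent identifier of the site (placeholder value)
    pb206_204: float  # 206Pb/204Pb, placeholder value
    pb207_204: float  # 207Pb/204Pb, placeholder value
    pb208_204: float  # 208Pb/204Pb, placeholder value
    source: str       # where the record was published


def group_by_site(records):
    """Group records by persistent identifier instead of by reported name."""
    groups = defaultdict(list)
    for record in records:
        groups[record.site_pid].append(record)
    return dict(groups)


# Two sources spell the same site differently but share one identifier.
records = [
    LeadIsotopeRecord("A-1", "Lavrion", "pid:site/0001", 18.0, 15.6, 38.0, "source-A"),
    LeadIsotopeRecord("B-7", "Laurion", "pid:site/0001", 18.1, 15.6, 38.1, "source-B"),
    LeadIsotopeRecord("B-9", "Another site", "pid:site/0002", 18.5, 15.7, 38.5, "source-B"),
]

grouped = group_by_site(records)
for pid, members in grouped.items():
    names = sorted({m.site_name for m in members})
    print(pid, len(members), names)
```

Grouping by the reported name alone would have split the first two records into separate sites; grouping by identifier avoids that decision entirely.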
Data curation
Providing a central hub for the storage of uniformly structured lead isotope data accelerates and facilitates data compilation even further. The coherent data curation workflows of such a hub ensure a level of consistency that can rarely be achieved by individual researchers, especially in the long term. All data published through TerraLID will undergo the same scrutiny to optimally prepare them for reuse, independently of their initial research questions and any interpretations derived from them. At the same time, collecting data in such a central hub makes it very easy to search across datasets and to fine-tune the selection of reference data for a given research question.
Diversity in academic backgrounds
Another challenge is the diverse academic background of the community. Enabling all users to carry out such fine-tuned searches requires access to the database and its content through a well-designed graphical user interface. In addition, intuitive tools that enforce a basic level of scientific quality are necessary to enable researchers without an in-depth background in lead isotope geochemistry to handle the data properly and to critically assess interpretations based on them. This will be realised through the TerraLID web application, which serves both as an intuitive user interface to the database and as a collection of the most common tools for handling lead isotope data. To use these tools, users can temporarily upload their own data and download the results together with all necessary additional information (e.g., full reference list, publication-quality plots, important statistical parameters). Open education materials provide a starting point for training in the lead isotope method for everyone who is new to the method or interested in refreshing their knowledge.
Data sharing
Ultimately, TerraLID contributes to changing the academic culture towards open research practices, with a particular focus on data sharing. This effort is guided by the principle “as open as possible, as closed as necessary”. Being machine-generated data, lead isotope data are usually not copyrightable [1]. Moreover, openly sharing data raises trust in research outcomes because it increases transparency and reproducibility, and it helps to overcome global disparities in data access. The TerraLID team is nevertheless fully aware that data should sometimes not be shared openly because this might, e.g., impact indigenous groups negatively. Such reasons must and will be respected.
Adding value to the community
The team is fully aware that the integration of TerraLID in everyday research practices and active support from the scientific community will ultimately depend on its ease of use. Providing the information for the metadata profile will require additional effort compared to current practices and so will feeding data into TerraLID. Therefore, TerraLID will develop intuitive and efficient templates and tools that guide researchers through data entry and it will try to automatically fill in as much information as possible from other sources. Likewise, data integrity checks and data curation workflows will be automated as much as possible to minimise the effort necessary for maintaining TerraLID.
Giving proper credit
Data compilations such as the ones derived from TerraLID eventually result in a long list of references, and these should include a citation of TerraLID as well. Only then do the authors of the data get appropriate credit for their work (e.g., number of citations), while the impact of TerraLID on the community can be monitored (essential for acquiring future funding). TerraLID will offer users convenient ways to properly cite all publications relevant to their data subset, e.g., by easy import of reference lists into common reference manager software. On a more general level, the requirement for citation directly clashes with the length restrictions of most publication formats. This is a challenge many research data infrastructures face, and technical solutions are currently under development [2]. TerraLID will adopt them as soon as they are available.
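As a rough illustration of what an import-ready reference list could look like, the sketch below collects the distinct references of a data subset and writes them as BibTeX, a format that common reference managers can read. The record layout, the citation keys, and the bibliographic details are invented placeholders; they are not an actual TerraLID export format or real publications.

```python
def unique_references(records):
    """Collect each reference once, in order of first appearance."""
    seen = {}
    for record in records:
        ref = record["reference"]
        seen.setdefault(ref["key"], ref)
    return list(seen.values())


def to_bibtex(ref):
    """Format one reference (hypothetical dict layout) as a BibTeX entry."""
    fields = ",\n".join(f"  {name} = {{{value}}}" for name, value in ref["fields"].items())
    return f"@{ref['type']}{{{ref['key']},\n{fields}\n}}"


# Placeholder references; not real publications.
ref_a = {
    "type": "article",
    "key": "author2020",
    "fields": {"author": "Example Author", "title": "Example title A", "year": "2020"},
}
ref_b = {
    "type": "article",
    "key": "other2021",
    "fields": {"author": "Other Author", "title": "Example title B", "year": "2021"},
}

# A data subset in which two samples come from the same publication.
subset = [
    {"sample_id": "A-1", "reference": ref_a},
    {"sample_id": "A-2", "reference": ref_a},
    {"sample_id": "B-7", "reference": ref_b},
]

references = unique_references(subset)
bibtex = "\n\n".join(to_bibtex(ref) for ref in references)
print(bibtex)
```

The same list could also include an entry for the data infrastructure itself, so that it is cited alongside the original publications.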
Footnotes
[1] Further information in, e.g., the Joint statement by CETAF, SPNHC and BHL on DATA within scientific publications: clarification of [non]copyrightability.
[2] See the Recommendations of the RDA Complex Citation Working Group on Zenodo.