2. Metadata and Ontologies


  • Susanne Arndt, TIB Hannover
  • Dr. Holger Israel, PTB Braunschweig

Key Objective:

Enabling FAIR data management through extensive digitisation of research documentation and processes using interoperable metadata schemata and ontologies.

Metadata & Ontologies – Solutions for Physics

The Task Area (TA) Metadata & Ontologies will work with pilot users representing use cases from all the domains covered by NFDI4Phys. We will bridge the gaps between existing isolated metadata systems and ontologies to underpin physics with a common, interoperable semantic system. To enable semantic interoperability, discovery, and exploitation of research data across physics sub-disciplines, we will develop, deploy, and sustainably operate a terminology service for physicists. The terminology service will include tools for the collaborative curation and (discipline-specific) creation of vocabularies, as well as a machine interface that disseminates a harmonised terminology to the consortium's services developed by other TAs, e.g. FAIR Laboratory.

Our user-driven activities will include the following:

  • A metadata standard will be developed via our paradigmatic use case Polarimetry. It will include information on sample history, measurement conditions (e.g. temperature, external stimuli), software for measurement and data analysis, and parameters of the optical modelling, enabling the TA to derive a minimum information standard for physics.
  • An extended metadata model will be constructed by our Domain Optics and Photonics via our paradigmatic use case Quantum Emitters, with a focus on modularity that allows reuse in other physics domains.
  • Specific metadata schemata for the Domains Biological Physics, Soft Matter and Statistical Physics will be developed through our paradigmatic use case Biological Information Processing in the context of active materials at microscopic length scales, mostly with the help of particle-based mesoscale computer simulations.
  • An ontological model for the uniform documentation of research data in plasma technology is currently being developed by our Domain Cold Plasma and TA Federated Repositories. The model will be extended and disseminated in the plasma community via our paradigmatic use case RDM in low-temperature plasma science and plasma technology. The process-oriented approach will be partially transferred to and adopted by neighbouring sub-domains.
  • Harmonised metadata representations for measurands will be implemented: each data point consists of a numerical value, the appropriate unit, and an estimate of its uncertainty (incl. statistical distributions), complemented by Digital Object Identifiers (DOIs) for the traceability of calibrations. The TA will also liaise with the international metrology community at the intersection of metrology and research data management.
  • Ontological surveys will be conducted by TA Community Interactions.
  • A General ASCII File Reader is being developed as part of TA FAIR Laboratory; it will extract metadata to support the automatic mapping of metadata schemata and metadata values.
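
The measurand representation named in the list above (numerical value, unit, uncertainty with a statistical distribution, and a DOI for the calibration) could be sketched as a small Python record. This is only an illustration under stated assumptions: the class name `Measurand`, its field names, the `"normal"` distribution label, and the placeholder DOI are hypothetical and are not taken from any NFDI4Phys schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import json


@dataclass(frozen=True)
class Measurand:
    """One data point: value, unit, uncertainty, and calibration reference.

    All field names are hypothetical, chosen for illustration only.
    """
    value: float
    unit: str                               # unit symbol, e.g. "K"
    uncertainty: float                      # standard uncertainty, same unit as value
    distribution: str = "normal"            # assumed label for the statistical distribution
    calibration_doi: Optional[str] = None   # DOI of the calibration record, if any

    def __post_init__(self):
        # An uncertainty estimate cannot be negative.
        if self.uncertainty < 0:
            raise ValueError("uncertainty must be non-negative")

    def relative_uncertainty(self) -> float:
        """Uncertainty divided by the magnitude of the value."""
        if self.value == 0:
            raise ZeroDivisionError("relative uncertainty is undefined for value 0")
        return self.uncertainty / abs(self.value)

    def to_json(self) -> str:
        """Serialise the record to JSON with a stable key order."""
        return json.dumps(asdict(self), sort_keys=True)


# Placeholder DOI: not a real calibration record.
temperature = Measurand(
    value=293.15,
    unit="K",
    uncertainty=0.05,
    calibration_doi="10.0000/hypothetical-calibration",
)
print(temperature.to_json())
```

Keeping value, unit, and uncertainty in a single record means they cannot be separated when a data point is exchanged between services, and the optional DOI field leaves room for the calibration traceability mentioned above.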

Metadata standards and ontology development are essential to drive extensive digitisation of research processes and documentation in physics. We therefore want to develop the ontological substructure of our domains as a federated network of subject-specific ontologies that fosters the interoperability of metadata. Ontologies are “formal, explicit specification[s] of […] shared conceptualisation[s]” (Studer et al. 1997) that define logical relations between instances of certain types, which are characterised by attributes. In NFDI4Phys, the ontological substructure particularly needs to describe a typical experiment, as well as analytical or numerical calculations, together with their embedding environment or boundary conditions. Our aim is to create FAIR digital objects (FDOs) that provide the needed “[s]emantic interoperability […] when it comes to linking heterogeneous data from different communities in a machine-actionable way, especially in the context of cross-disciplinary research” (FDO Forum).
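
To make the notion of types, instances, attributes, and logical relations more concrete, the following minimal Python sketch stores a toy ontology as subject-predicate-object triples and infers every type of an instance by following the class hierarchy. All terms in it (`Experiment`, `Polarimetry`, `run42`, `hasSample`, and so on) are invented for illustration and do not come from any NFDI4Phys vocabulary; a real ontology would use a dedicated language and tooling rather than plain tuples.

```python
# Toy ontology as subject-predicate-object triples; all terms are hypothetical.
triples = {
    ("Polarimetry", "subClassOf", "Experiment"),
    ("run42", "instanceOf", "Polarimetry"),
    ("run42", "hasTemperature", "293.15 K"),   # attribute of the instance
    ("run42", "hasSample", "sampleA"),         # relation to another instance
    ("sampleA", "instanceOf", "Sample"),
}


def objects(subject, predicate):
    """Return all objects linked to `subject` via `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def types_of(instance):
    """Return all classes of an instance, following subClassOf transitively."""
    found = set()
    frontier = objects(instance, "instanceOf")
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)
            frontier |= objects(cls, "subClassOf")
    return found


print(sorted(types_of("run42")))
```

The point of the sketch is that `run42` is recognised as an `Experiment` although only its more specific type is stated explicitly; this kind of inference is what lets metadata from different sub-disciplines be queried through shared, more general terms.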

In many communities, the state of the art is still ontology building with little automation. Progress has been made by the Artificial Intelligence (AI) community via machine learning and deep neural networks (cf. Dessi et al. 2020). Since semantic analysis usually lies outside the formal training or interest of physicists, AI approaches will be explored to support the development of well-curated ontologies and metadata networks. However, computing power is still low relative to the enormous complexity of the problem at hand, which prevents a full semantic analysis. We expect this to change with the rise of Quantum Computing, another domain addressed by NFDI4Phys. We will foster and benefit from semantic expertise within the consortium, our good connections to related NFDIs (e.g. NFDI4Chem, NFDI4DataScience, NFDI4Culture), and our close interaction with the FDO Forum and other organisations.
