Call for community review of the SPARC Data Structure (SDS)

The purpose of this document is to solicit community feedback on the SPARC Data Structure (SDS) that was submitted to INCF for endorsement as a standard. The document contains the INCF Standards and Best Practices Committee's review of SDS, and the criteria against which it was evaluated (open, FAIR, testing and implementation, governance, adoption and use, stability and support, extensibility, and comparison to similar standards). For the next 60 days, we are seeking community feedback on SDS.

About SPARC:     
The SPARC Data Structure (SDS) is a consistent file structure and naming convention, based on the Brain Imaging Data Structure (BIDS), that ensures the diverse types of data in SPARC are organized in a similar manner. The current version is SDS 2.0 (released June 24, 2022).

Summary of Discussion:     
Overall, the members of the INCF Standards and Best Practices Committee found that SDS could potentially meet the criteria for INCF endorsement. It is open, has strong documentation, and supports FAIR reasonably well, with evidence of efforts to align it with BIDS and the DANDI metadata structure. Its use is currently imposed on the SPARC community of approximately 500 investigators, with no evidence of use outside of this community. While SDS is inspired by BIDS, it was designed explicitly to accommodate data collection patterns that are fundamentally incompatible with BIDS 1.0 structures (the 20% that BIDS does not attempt to cover). Since SDS has so far been used only within the consortium, it currently lacks a formal governance structure; however, the submitters have indicated that a formal governance structure will be established once SDS is used by groups outside of the consortium.

Recommendation:     
The INCF Standards and Best Practices Committee voted to put SDS forward for community review. The committee is particularly interested in receiving comments from the BIDS and DANDI metadata structure communities. In addition, the committee would like SDS, BIDS, and DANDI to draft a commentary on the relationship between the standards to help the community determine which standard to use.

Authors and Affiliations

Standards and Best Practices Committee, International Neuroinformatics Coordinating Facility, June 2023

Competing Interests

No competing interests were disclosed

SBP Review Criteria (112.56 KB)
Keywords
Modeling
Simulation
Physiology
Anatomy

Comments

5 Comments
#1

Jesus Martinez

Thu, 09/14/2023 - 19:22

Metacell
The SPARC Data Structure (SDS) has proven to be a well-defined and easy-to-use file structure. Metacell has been able to use these SDS-derived files and render them visually in a File Explorer-like application, which makes it easier for users to visualize data and explore the contents of the datasets. Having a consistent file structure is critical from a software development perspective, as it allows us to develop applications for these SPARC datasets.
#2

Bhavesh Patel

Thu, 09/14/2023 - 19:41

FAIR Data Innovations Hub, California Medical Innovations Institute
I am fully in support of the SPARC Data Structure (SDS) becoming an INCF endorsed standard. My affiliation with the SPARC Consortium has allowed me to witness firsthand the transformative impact of the SDS on our data management practices. Initially, my team and I began utilizing the SDS to share our SPARC-funded datasets. Over time, we recognized the profound benefits of the SDS and started using it to structure all our datasets beyond just our SPARC projects. The SDS offers an intuitive framework enriched with robust metadata that facilitate data sharing and reuse among our team members. This not only enhances collaboration within our team but also contributes significantly to the broader scientific community's ability to harness and build upon our research findings once they are disseminated. The SDS has the potential to have a very large impact beyond just the neuroscience community. Indeed, the SDS is designed to accommodate any data type and therefore provides a one-stop solution for structuring any biomedical research data for which there are currently no community-agreed standards.
Competing Interests
I lead the development of SODA, software that simplifies the process of structuring datasets according to the SDS.
#3

Jerry Skefos, …

Thu, 09/14/2023 - 20:17

MetaCell (current), Boston University School of Medicine (former)
SDS is being developed openly on GitHub with an MIT license for any entity to utilize freely. It strictly adheres to the FAIR criteria, employing persistent identifiers (PIDs) and allowing for rich metadata with PIDs. Designed to handle all biomedical research data types, its anticipated use cases expand beyond BIDS and DANDI while still maintaining a commitment to interoperability with these data standards. I encourage the INCF Standards and Best Practices Committee to support the advancement of SDS for the benefit of the broader scientific community.
Competing Interests
We are developing a novel visualization tool to browse SDS files and quickly identify any errors in their structure or missing information during the curation process to ensure only the highest quality, FAIR data is submitted to repositories using this standard.
#4

David Nickerson

Fri, 09/15/2023 - 05:58

Auckland Bioengineering Institute, University of Auckland, New Zealand
I am strongly in support of the SPARC Data Structure (SDS) becoming an INCF endorsed standard. As vice chair of the Computational Modeling in Biology Network (COMBINE) I am very familiar with the power of endorsement of these kinds of community standards and believe the SDS will benefit greatly from this endorsement - especially in terms of growing beyond the SPARC project.

As the technical lead for the SPARC Data and Resource Center’s MAP Core, I have been responsible for coordinating development of software tooling for mapping SPARC data to anatomical scaffolds and for developing many aspects of the exploration of SPARC data on the SPARC Portal (https://sparc.science). Our endeavours have been significantly enabled by the adoption of the SDS. Our mapping and portal visual exploration tools rely on the rich structured annotations that describe the data contained in SDS datasets.

We have been fortunate to work directly with the developers of the SDS, contributing feature requests and questions needed to support the range of data and knowledge that we need to enable the mapping and data registration workflows as well as the visual exploration on the SPARC Portal. Furthermore, we have pushed the limits of the SDS in adopting this format to store computational modelling data - for example, anatomical organ scaffolds (finite element models) with external data embedded in the scaffold; or compartmental models with associated simulation experiments that can be interactively executed across SPARC resources. These datasets maintain unambiguous provenance, which can be surfaced on the SPARC Portal in a manner that enables users to understand what they are looking at and where it came from. Accurate and comprehensive attribution and citation of datasets is a crucial aspect of FAIR data platforms and tools.

Beyond SPARC, I have been able to take learnings from our exposure to the SDS and apply them to other standardisation efforts I am involved in. In particular, the structured metadata (with recommended terminologies/ontologies to use) has proven very powerful as we look to link computational models to experimental and clinical data.

As noted in the submission, one key omission for a community standard like this is the specification of a governance framework. Going forward, I believe that broader adoption of the SDS will require establishing a formal governance structure and clearly defined process by which decisions on changes to the specification are made.
Competing Interests
I am a member of the SPARC Data and Resource Center.
#5

Thiranja Prasa…

Fri, 09/15/2023 - 20:36

Auckland Bioengineering Institute, University of Auckland, New Zealand
I strongly support the SPARC Data Structure (SDS) becoming an INCF-endorsed standard. I lead the Clinical Translational Technologies Group at the Auckland Bioengineering Institute (ABI), where we are developing a Digital Twin Platform as part of the 12 LABOURS project (https://www.auckland.ac.nz/en/abi/our-research/research-groups-themes/12-Labours.html). This platform aims to provide common infrastructure to support the generation of integrated digital twins for clinical and home-based healthcare applications, along with supporting the demonstration of their efficacy in clinical trials. These efforts are aimed at creating an ecosystem to make data and research outcomes FAIR, enable reproducible science, meet Aotearoa New Zealand’s data sovereignty requirements, support clinical translation of computational physiology workflows and digital twins, and provide a foundation for integrating and supporting research developments across ABI and its institutional, national, and international collaborations.

A key component of this platform involves storing data using the SPARC Dataset Structure (SDS). This has provided a robust mechanism to standardise our data management practices and maximise the reuse and impact of data generated from our research. Over 30 researchers who are part of 12 LABOURS exemplar projects have started to store their data in SDS format in diverse computational physiology applications, including the development of novel biomarkers for pulmonary hypertension, rehabilitation of upper limb disorders, control of organ function by the autonomic nervous system in the uterus and stomach, and supporting breast cancer diagnosis and treatment.

Although we are investigators outside the SPARC community, we have found the SPARC community and the developers of the SDS to be very supportive of our needs and always open to feedback. For example, they have helped us build tools that enable programmatically creating SDS datasets (https://github.com/SPARC-FAIR-Codeathon/sparc-me) and workflow descriptions (https://github.com/SPARC-FAIR-Codeathon/sparc-flow), which we are applying in our platform to maximise reuse of research outcomes and support reproducible science (https://github.com/ABI-CTT-Group/digitaltwins-api).

We strongly support the SDS becoming an INCF-endorsed standard due to the demonstrated benefits we've experienced using it and the exceptional support provided by its developers. As active members of the broader research community, we look forward to contributing to the SDS development roadmap. We also plan to adopt the SDS across our institute of over 300 researchers (https://www.auckland.ac.nz/en/abi/our-research/research-groups-themes.html). We are also starting to work towards extending the application of the SDS as part of the 12 LABOURS data catalogue in the New Zealand Government funded Medtech-iQ Aotearoa initiative (https://www.cmdt.org.nz/medtech-iq-aotearoa) - New Zealand's national innovation hub for medical devices and digital health technologies. This data catalogue will store FAIR descriptions of diverse data in SDS format that hundreds of researchers across New Zealand will contribute.
Competing Interests
None