T-19: Metadata Framework for Sharing and Developing Code Repository for Standard Analyses
Poster Presenter
Hanming Tu
Vice President, IT
Frontage Laboratories, Inc. United States
Objectives
To present a metadata framework for sharing and developing standard scripts through a GitHub repository, an R package developed for sharing the scripts, and the work done by the Standard Analyses and Code Sharing (SACS) working group of the Pharmaceutical Users Software Exchange (PhUSE).
Method
The “Code Repository” project team in the SACS working group was formed as part of an FDA/PhUSE collaboration. The team authored a white paper recommending a set of script metadata and developed an R package to test the concept of using script metadata to identify and execute scripts.
Results
Since the phuse-scripts repository was created on GitHub in 2013, many scripts have been contributed to and hosted in the repository, but users face several difficulties:
• Scripts are hard to find because a) script metadata are not defined, so the metadata files are inconsistent; b) the index page built from the metadata files is not updated promptly.
• The repository is hard to navigate because a) scripts are not well organized; b) the folder structure is deep and complicated.
• Scripts are hard to use because users must a) download them and b) modify them. A user generally has to change the original script to make it work in the local environment.
Script metadata provides information about a script’s purpose, version, execution environment, libraries, data files used, inputs, outputs, review history, ratings, etc. The metadata makes it easy to share, access and execute scripts in the repository. The key deliverables of the project are:
• A white paper that clearly defines the script metadata
• A proof of concept (PoC) that uses the script metadata to access and execute R scripts in the repository
• An R package and a web framework to access, display, download and execute scripts hosted in the repository
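To make the idea of script metadata concrete, the following is a minimal sketch in Python of how a metadata record could drive script lookup and index generation. It is an illustration only: the field names, the `find_scripts` and `build_index` helpers, and the two catalog entries are assumptions made for this example and are not taken from the white paper or from the PhUSE R package.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScriptMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    name: str
    purpose: str
    version: str
    language: str  # execution environment, e.g. "R" or "SAS"
    libraries: List[str] = field(default_factory=list)
    data_files: List[str] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    rating: Optional[float] = None


def find_scripts(catalog, keyword, language=None):
    """Return records whose purpose mentions keyword (case-insensitive),
    optionally restricted to one execution language."""
    keyword = keyword.lower()
    return [
        m for m in catalog
        if keyword in m.purpose.lower()
        and (language is None or m.language.lower() == language.lower())
    ]


def build_index(catalog):
    """Render one index line per script, sorted by script name."""
    return [
        f"{m.name} (v{m.version}, {m.language}): {m.purpose}"
        for m in sorted(catalog, key=lambda m: m.name)
    ]


# Made-up example entries, not actual repository contents.
CATALOG = [
    ScriptMetadata(
        name="demog_table.sas",
        purpose="Demographics summary table",
        version="0.2",
        language="SAS",
        data_files=["adsl.xpt"],
    ),
    ScriptMetadata(
        name="box_plot.R",
        purpose="Box plot of lab measurements over time",
        version="1.0",
        language="R",
        libraries=["ggplot2"],
        data_files=["advs.xpt"],
        outputs=["box_plot.pdf"],
    ),
]

if __name__ == "__main__":
    for line in build_index(CATALOG):
        print(line)
    print([m.name for m in find_scripts(CATALOG, "box plot", language="R")])
```

Because the index is generated from the same records that are searched, regenerating it whenever a metadata file changes would keep the index page and the metadata consistent, which is one of the difficulties listed above.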
Conclusion
Script metadata provides information about a script’s purpose, version, execution environment, libraries, data files used, inputs, outputs, review history, ratings, etc. The metadata makes it easy to share, access and execute scripts in the repository. The PhUSE R package provides a web application framework on which a platform for sharing and accessing the scripts in the repository can be built. Initial tests on R scripts show that the metadata framework and the application can be used to drive the sharing and development of scripts hosted in a repository.