Big Data Portal For Sharing Real-world Bioinformatics Data Sets to the Public Domain

Submission Number: 85
Submission ID: 121
Submission UUID: a6a2c476-84ec-49ef-a27c-bf9a7c461a60
Submission URI: /form/project

Created: Tue, 01/26/2021 - 15:59
Completed: Tue, 01/26/2021 - 16:23
Changed: Thu, 05/05/2022 - 03:49

Remote IP address: 74.78.188.196
Submitted by: Bruce Segee
Language: English

Is draft: No
Webform: Project
Northeast
big-data (4), bioinformatics (277), data-management (260), data-wrangling (6), hpc-storage (171), metadata (264), science-gateway (28), storage (47)
Complete

Project Leader

Rocko Graziano

Project Personnel

Larry Whitsel
Joseph Neumann, Ben Burnett

Project Information

This project aims to facilitate the sharing of large data sets for research and education across Maine as well as across the Open Storage Network. Mount Desert Island Biological Laboratory (MDIBL) intends to make data files and metadata publicly available in exchange for free access. These data are of interest and value to data science faculty at the University of Maine at Augusta, for teaching and research as part of a system-wide data science degree.

The project requires developing a front end and a back end, preferably written in Go and deployed in a container (ideally Docker). The finished system will support uploading, downloading, metadata tagging, and submission of HPC jobs that use the data.

Project Information Subsection

The interface is intended for day-to-day use by researchers at MDIBL, yet flexible enough to incorporate other data sets easily and to be shared with other researchers, educators, and students with minimal effort on the part of the end user.
The ideal student would have skills with Linux, Go, databases, and containers. This is a development project which relies heavily on the use of Python, HTML, and either Django or Flask to create RESTful interfaces to access and maintain the data stores. The right candidate will be comfortable with Python development and HTML/web technologies; experience working with Linux systems would also be useful. You should be willing to work independently to research and deploy technologies.
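
As an illustration of the kind of data-store logic a RESTful interface would sit in front of, the following is a minimal sketch of metadata tagging, not the project's actual design. It uses only the Python standard library. The names DatasetRecord and MetadataStore, the in-memory dictionary, and the file names are all hypothetical; a real deployment would use a database and a web framework such as Django or Flask.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for one shared data file."""
    name: str
    size_bytes: int
    tags: set[str] = field(default_factory=set)


class MetadataStore:
    """In-memory stand-in for the real data store (an assumption)."""

    def __init__(self) -> None:
        self._records: dict[str, DatasetRecord] = {}

    def register(self, name: str, size_bytes: int) -> DatasetRecord:
        # Record that a file has been uploaded; overwrites any older entry.
        record = DatasetRecord(name=name, size_bytes=size_bytes)
        self._records[name] = record
        return record

    def tag(self, name: str, *tags: str) -> None:
        # Normalise tags so that "RNA-Seq" and " rna-seq " are treated alike.
        self._records[name].tags.update(t.strip().lower() for t in tags)

    def find_by_tag(self, tag: str) -> list[str]:
        # Return the names of all files carrying the tag, sorted by name.
        wanted = tag.strip().lower()
        return sorted(n for n, r in self._records.items() if wanted in r.tags)


if __name__ == "__main__":
    store = MetadataStore()
    store.register("sample_a.fastq", 1_024)  # hypothetical file
    store.tag("sample_a.fastq", "RNA-Seq", "zebrafish")
    print(store.find_by_tag("rna-seq"))
```

In a web back end, each method would map onto one endpoint (for example, a POST that registers an upload and a GET that searches by tag), with the dictionary replaced by a persistent store.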
Practical applications
University of Maine, Augusta
Jewett Hall
Augusta, Maine. 04330
NE-University of Maine
02/01/2021
No
Already behind; start date is flexible
01/18/2022
03/15/2022
  • Milestone Title: Working Prototype
    Completion Date Goal: 2022-01-31
  • Milestone Title: Core Features Implemented
    Completion Date Goal: 2022-02-28
  • Milestone Title: Deployment
    Completion Date Goal: 2022-03-25
It is anticipated that several questions related to Go and/or containers will be generated and answered.
The student will gain familiarity and experience with full stack software development.
It is anticipated that at least one method for sharing big data sets will result.
This project will use existing Ceph storage at the University of Maine as well as one or more virtual machines to run the code and act as the web interface. HPC resources will be used to process the data, so some modest HPC use is required during development.

Final Report

From Ben Burnett, Student Participant:
Being a research facilitator for the Northeast Cyberteam had a positive impact on my performance as a student. As a graduate student, a lot of the work I do requires me to be like a horse with blinders on and deeply focus on one subject. Working on a Cyberteam project presented me with the opportunity to step out of my own field of research to learn about and experience other perspectives and cultures within research computing. Learning different perspectives is invaluable as an aspiring problem solver, and I am grateful for the chance to have learned more as a Northeast Cyberteam research facilitator.