“This is essentially a fire hose of data,” said the molecular biologist, adding that the amount of computing power and time needed to collect and analyze such information can be overwhelming.
Why is such analysis important? Because knowledge about these microbial organisms could provide critical information about their health and the health of the oceans. “We wouldn’t have a food chain without these organisms,” Jenkins said.
She is one of many URI scientists using what is called “big data” to study everything from the condition of the oceans and their organisms to the health care system, global financial systems and threats to the nation’s infrastructure.
“We need statisticians and computer scientists to help us see the cause and effect of the environment on these organisms and help us determine the significance of those relationships in a changing climate,” Jenkins said.
Those are just some of the reasons why the University has established the Big Data Collaborative, A Vision for Data-Intensive Discovery. The multi-college, cross-disciplinary effort has begun hiring a cluster of eight new faculty members across numerous disciplines and establishing a high-performance computing core facility and center, a resource for URI faculty and students and for other universities and colleges around the state. In addition to the eight faculty hires, Provost Donald H. DeHayes has agreed to hire a research systems administrator for the core high-performance computing center.
Yang Shen, professor of oceanography, and Ying Zhang, professor of cell and molecular biology, have already provided $650,000 to create a high-performance computing platform with about 1,300 central processing cores. Having multiple processing cores permits the computer to work on different parts of a computation at the same time, or in parallel. This can reduce the time needed to complete the calculation. But because these are not straightforward calculations, expert support is needed to set up high-performance computers and to develop effective strategies for completing the computations correctly and efficiently.
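For readers curious about what working “in parallel” looks like in practice, here is a minimal Python sketch. It is an illustration only and is not the software used on the URI platform. The function names (`split_range`, `sum_of_squares`, `parallel_sum_of_squares`) and the choice of four workers are assumptions made for the example. The idea is to cut one large calculation into pieces, hand each piece to a separate processor core, and then combine the partial answers.

```python
from concurrent.futures import ProcessPoolExecutor


def split_range(n, parts):
    """Split the integers 0..n-1 into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item so sizes stay balanced.
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_of_squares(bounds):
    """Compute one piece of the job: the sum of i*i for start <= i < stop."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=4):
    """Hypothetical helper: run each chunk in its own process, then combine."""
    chunks = split_range(n, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each chunk is computed in a separate worker process.
        partial_sums = pool.map(sum_of_squares, chunks)
        return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

Even in this toy case, someone has to decide how to divide the work and how to merge the results. Real research computations are far less tidy, which is one reason expert support matters.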
“Human resources, such as the new systems administrator, are now needed to provide access to and grow this new and needed resource beyond the two scholars now using it,” Shen said.
“The Big Data initiative recognizes the influence of the almost unfathomable amount of information affecting every aspect of our lives every day, and the way in which it has transformed almost every discipline at the University,” DeHayes said. “This Big Data Collaborative represents a substantial institutional investment, and an absolutely critical investment in the future of learning and discovery at URI. I am grateful to the URI faculty and staff who have come together to crystallize this plan. It will play a huge role in the way in which we educate our students and prepare them for new and exciting jobs and give our faculty the teaching and research tools that will be critical in the classroom, library and research facilities across the campus.”
The other major reason for this commitment is the critical need to educate undergraduate and graduate students across the entire University in the exciting area of data exploration and discovery.
“A new data science major is being developed, but this initiative and resources will also provide needed infrastructure and facilities for existing classes in several majors, including computer science, business, and computer engineering to train students for the emerging workforce in big data and data analytics,” said Joan Peckham, professor and chair of the Department of Computer Science and Statistics, who heads the Big Data Cluster Hire and high performance computing initiatives with Shen.
“Thirty faculty scholars, struggling with issues around big data, wanted to collaborate on big data and high performance computing and their applications to research, education and outreach,” Peckham said. “This group formed the Big Data Collaborative and was awarded the cluster in a competitive process.”
“The Big Data community at URI acknowledges the commonly cited Four V’s of Big Data: velocity, the unprecedented speed at which data is created; veracity, the reliability of the data; volume of data; and variability, the complexity of the data,” Peckham said. “The collaborative defines Big Data as any data effort for which there are insufficient technologies or techniques available to domain experts in dealing with any aspect of a data set in any discipline. Addressing these big data needs will be our focus in the classroom and laboratories at URI.
“Disciplines at the forefront of big data include astronomy, biology and earth sciences,” Peckham said. “At URI, data from satellites and sensors has dominated large data sets at the University for years. As soon as the human genome was sequenced, the field exploded. Scholars had an opportunity to access and archive so much information. They were faced with the big questions. What does it all mean? How can we make sense of the large and complex volumes of data that we can now easily collect?”
For the new URI initiative, one faculty member will be hired in each of the following areas:
• Cybersecurity and Big Data – Computer Science, College of Arts and Sciences
• Biostatistics – College of Pharmacy
• Geology (Big Data) – Graduate School of Oceanography
• Epidemiology and Big Data (Nutrition) – new Academic Health Collaborative
• Human and Animal Microbiome (the microorganisms, and their genetic material, in or on living organisms) – College of the Environment and Life Sciences.
Additionally, there are three core Big Data science positions in:
• Computational Statistics and Machine Learning – Computer Science and Statistics, College of Arts and Sciences
• Computational Biology and Environmental Science – College of the Environment and Life Sciences
• Computer Engineering and Big Data – College of Engineering.
“The new faculty we are hiring to work with data about the human microbiome, cybersecurity, the environment and life sciences, pharmacy, geology, diet and epidemiology, and data management and storage in the library signal the arrival of the data tsunami that is engulfing us and herald a new and exciting beginning for scholar-educators and their students at URI,” Peckham said.
Peckham said the next important goal will be to build a stronger and more collaborative interdisciplinary community of scholars and students at URI.
As for Jenkins and her microbial marine research, the work begins with her team going out on boats and ships to collect samples. But technology is now telling her and many other researchers that they vastly underestimated the diversity and number of microbial organisms in the ocean.
“Biology has become a digital science,” said Jenkins, a member of a NASA Scientific Definition Team interested in pairing the types of data she gets from her samples with ocean data remotely observed from satellites orbiting Earth.
Yang Shen, left, URI professor of oceanography, and Joan Peckham, professor and chair of the Department of Computer Science and Statistics, who heads the Big Data Cluster Hire and high performance computing initiatives with Shen, pose before servers Shen uses in his research on earthquakes and nuclear explosion monitoring.
URI photo by Michael Salerno Photography