Empowering Undergraduate Research and Educational Experience with Data Science

Librarians in the Data & Visualization Services department partnered with researchers in the Department of Materials Science and Engineering to develop interactive data science instructional materials for a grant-funded online educational module.

Overview

In the spring of 2021, members of the Data & Visualization Services (DVS) department partnered with Professor Yara Yingling and her research group in the Department of Materials Science and Engineering to develop instructional materials introducing computer programming and data science methods for an interactive online Materials Informatics educational module. The module targets undergraduate students interested in learning the basics of data science. It is one of the project outcomes supported by the University of North Carolina System Undergraduate Research Program Award entitled "Empowering undergraduate research and educational experience with data science," which also involves students and faculty from NC State and Appalachian State Universities in beta-testing the module. In support of the grant, we developed an instructional outline based on predetermined learning goals and produced a series of Python-based computational notebooks with companion instructional videos that will be incorporated into the final version of the Materials Informatics module.

How We Did It

We first worked with the developer of the project, Dr. Alexey Gulyuk, to establish an outline of the concepts and methods our materials would cover, matched to the predetermined learning goals set by the project group. The final outline included a guide to setting up a local Python environment using Anaconda Navigator, a basic introduction to programming in Python, and data cleaning, analysis, and visualization techniques using the Python data analysis library pandas.
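To give a sense of the kind of pandas workflow the outline describes, here is a minimal sketch of a cleaning and analysis step. It is not taken from the project notebooks: the dataset, its column names, its values, and the helper functions `load_and_clean` and `mean_density_by_category` are all hypothetical and exist only for illustration.

```python
import io

import pandas as pd

# Hypothetical materials table. The column names and values are invented
# placeholders for illustration, not the dataset used in the module.
RAW_CSV = """material,category,density_g_cm3,melting_point_c
Aluminum,metal,2.70,660
Copper,metal,8.96,1085
Copper,metal,8.96,1085
Alumina,ceramic,3.95,2072
Silica,ceramic,2.65,
Polyethylene,polymer,0.95,130
"""


def load_and_clean(csv_text):
    """Read CSV text, drop exact duplicate rows and rows missing a melting point."""
    df = pd.read_csv(io.StringIO(csv_text))
    df = df.drop_duplicates()
    df = df.dropna(subset=["melting_point_c"])
    return df.reset_index(drop=True)


def mean_density_by_category(df):
    """Return the mean density for each material category."""
    return df.groupby("category")["density_g_cm3"].mean()


clean = load_and_clean(RAW_CSV)
summary = mean_density_by_category(clean)
print(summary)
```

With matplotlib installed, a call such as `summary.plot(kind="bar")` would turn the grouped result into a simple bar chart, which covers the visualization step in the same spirit.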

After establishing an outline, we used Jupyter Notebook to create three interactive instructional Python notebooks that include hands-on exercises for students to apply programming and data science techniques with Python. We also worked with Dr. Gulyuk to incorporate a materials informatics dataset that would be relevant to the other content of the overall module. In addition to being included in the Materials Informatics module, these notebooks are open source and are accessible in a GitHub repository. This repository includes instructions for running the notebooks on a local machine and has also been set up with Binder to enable running these notebooks interactively online without local setup.

Following the development of the Python notebooks, members of DVS used Zoom to record the instructional delivery of each notebook, describing each concept while writing and executing the associated Python code. Dr. Gulyuk and undergraduate students involved in beta-testing the module attended these recording sessions to provide feedback and suggest improvements to the content and its delivery. Because of technical issues and suggestions from the project group, several of these sessions were rerecorded. Once we had a complete recording for each notebook, we used Adobe Premiere Pro to correct and edit the videos and improve their production quality. We then played each video back in Zoom to generate a transcript and manually corrected any errors. The final product of this phase is three captioned videos that guide students through the concepts and code covered in each notebook.

Team

  • Walt Gurley
    Former Data Visualization Analyst
  • Claire Cahoon
    Former Libraries Fellow
  • Alexey Gulyuk
    Postdoctoral Researcher at North Carolina State University