Online
Jun 24 and 26, 2025
Jul 8 and 9, 2025
1:00 pm - 4:00 pm
Instructors: Matt Kweskin, Adam Mansur, Richard Naples, Mike Trizna
Helpers: Sue Zwicker
This workshop will introduce tools that can be used to organize, clean, analyze, and plot data. We'll be mostly working with an ecology dataset, but don't let that scare you away! The approaches taught in this course are important across a wide variety of disciplines. Please take a look at the schedule below to learn more about exactly what these lessons will cover, including links to lesson materials.
Who: The course is aimed at graduate students and other researchers. You don't need to have any previous knowledge of the tools that will be presented at the workshop.
Where: This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.
When: Jun 24 and 26, 2025 & Jul 8 and 9, 2025; 1:00 pm - 4:00 pm.
Requirements: Participants must have access to a computer with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).
Accessibility: We are committed to making this workshop accessible to everybody.
We are dedicated to providing a positive and accessible learning environment for all. We do not require participants to provide documentation of disabilities or disclose any unnecessary personal information. However, we do want to help create an inclusive, accessible experience for all participants. We encourage you to share any information that would help make your Carpentries experience accessible. To request an accommodation for this workshop, please fill out the accommodation request form. If you have questions or need assistance with the accommodation form, please email us.
Glosario is a multilingual glossary for computing and data science terms. It helps learners who attend workshops and use our lessons make sense of computational and programming jargon written in English by offering definitions in their native language. Translating data science terms also gives Carpentries Instructors a teaching tool for reducing barriers for their learners.
Workshop Recordings: Carpentries workshops are designed to be interactive rather than lecture-based, with lessons that build upon one another. To foster a positive online learning environment, we strongly recommend that participants join in real time. As a result, workshop recordings are not recommended and may not be available to learners.
Contact: Please email mansura@si.edu for more information.
Roles: To learn more about the roles at the workshop (who will be doing what), refer to our Workshop FAQ.
The Carpentries project comprises the Software Carpentry, Data Carpentry, and Library Carpentry communities of Instructors, Trainers, Maintainers, helpers, and supporters who share a mission to teach foundational computational and data science skills to researchers.
Want to learn more and stay engaged with The Carpentries? Carpentries Clippings is The Carpentries' biweekly newsletter, where we share community news, community job postings, and more. Sign up to receive future editions and read our full archive: https://carpentries.org/newsletter/
Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its target audience is researchers who have little to no prior computational experience, and its lessons are domain specific, building on learners' existing knowledge to enable them to quickly apply skills learned to their own research. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.
For more information on what we teach and why, please see our paper "Good Enough Practices for Scientific Computing".
Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.
We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.
Please be sure to complete these surveys before and after the workshop.
The workshop consists of four afternoon sessions in June and July. Each lesson uses a different tool to organize, clean, analyze, and/or plot data. Participants are welcome to attend the full workshop or just the lessons that appeal to them.
We plan to take 10-minute breaks around 2 PM and 3 PM each day.
Day 1: Jun 24, 2025
Before starting | Pre-workshop survey |
1:00 PM | Introduction |
1:15 PM | Data Organization in Spreadsheets |
4:00 PM | End of day |
End | Post-workshop survey |

Day 2: Jun 26, 2025
Before starting | Pre-workshop survey |
1:00 PM | Introduction |
1:15 PM | Data Cleaning with OpenRefine |
4:00 PM | End of day |
End | Post-workshop survey |

Day 3: Jul 8, 2025
Before starting | Pre-workshop survey |
1:00 PM | Introduction |
1:15 PM | Data Management with Python |
4:00 PM | End of day |

Day 4: Jul 9, 2025
1:00 PM | Review of previous day |
1:30 PM | Plotting Data with Python |
4:00 PM | End of day |
End | Post-workshop survey |