This poster presents an evidence-based design framework for the third iteration of WaterSoftHack, an NSF CyberTraining project developing data science skills in water science through a virtual, hackathon-centered workshop. We offer a transparent "lessons learned" narrative that translates Year 2 evaluation findings directly into actionable design interventions, providing a transferable model for NSF PIs managing intensive cyberinfrastructure training programs.
Year 2 Context
In Summer 2025, 18 Fellows participated in a three-week virtual workshop. Data collection included pre- and post-workshop surveys (11 PRE, 5 POST, 4 matched) and 10 in-depth interviews. We employed a formative, mixed-methods approach using the CIPP and Kirkpatrick frameworks to identify strengths to reinforce and challenges requiring intervention.
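The survey counts above can be expressed as response rates, which helps show why improving evaluation power matters. The following is a minimal sketch, assuming all 18 Fellows were invited to both surveys (the abstract does not state this); the function and variable names are illustrative only.

```python
def response_rate(responses: int, cohort_size: int) -> float:
    """Return the share of the cohort that responded, as a percentage."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return 100.0 * responses / cohort_size


# Counts reported for Year 2; the denominator assumes every Fellow
# was invited to each survey (an assumption, not stated in the abstract).
COHORT_SIZE = 18
counts = {"pre": 11, "post": 5, "matched": 4}

# Response rate per survey wave, rounded to one decimal place.
rates = {name: round(response_rate(n, COHORT_SIZE), 1) for name, n in counts.items()}
print(rates)
```

Under that assumption, roughly three in five Fellows completed the pre-survey, while fewer than one in four provided matched pre/post responses.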
Successes to Reinforce
Year 2 confirmed critical program strengths: high satisfaction (POST mean 4.25, n=8), documented self-efficacy gains across core competencies, and concrete skill development with advanced models (Transformers, LSTMs). Participants credited stable Google Colab notebooks and hands-on exemplars. Most significantly, multiple hackathon teams are actively pursuing publications, demonstrating immediate knowledge transfer to dissertation projects and tangible research outputs.
Critical Challenges Driving Redesign
Two persistent challenges anchor our Year 3 redesign:
Cognitive Overload: The compressed timeline created an overwhelming pace and steep learning curve that hindered deep conceptual understanding for participants with varied backgrounds.
Socio-Technical Conflict: Mismatched prior knowledge led to interpersonal friction during the hackathon phase, exacerbated by perceived gaps in faculty mediation protocols.
Evidence-Based Year 3 Design
Our poster details five concrete revisions directly responsive to Year 2 data:
"Week Zero" Scaffolding: A mandatory two- to three-week preparatory period with curated readings on specific ML models to harmonize baseline knowledge.
Streamlined Curriculum: Week 1 theory focused exclusively on hackathon-relevant models, with expanded guided code walkthroughs explaining parameter choices.
Proactive Team Support: Pre-workshop social events for strategic team formation, plus explicit training on team norms and conflict resolution protocols.
Strengthened Outcome Tracking: Post-workshop writing sprints and structured check-ins to convert hackathon outputs into manuscript submissions.
Enhanced Response Rates: Modest incentives and clearer communications to raise survey response rates and strengthen the statistical power of the evaluation.
Poster Contribution
Our visual presentation features: (1) an evidence-to-action map connecting Year 2 findings to Year 3 design choices; (2) a reproducible evaluation framework for mixed-methods replication; and (3) an adaptable implementation checklist for CyberTraining PIs working with similar domain-anchored, hackathon-based training contexts.
By sharing our iterative refinement process transparently, we contribute to community understanding of best practices for designing effective data science workforce development programs that balance technical skill development with socio-technical support structures.
The full paper will be available to logged-in, registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026.