A Collaborative Approach to Supporting
Research and High Performance Computing
Marcy Vana, PhD
Senior Support Scientist
Bernard Becker Medical Library
Washington University in St. Louis
vanam@wustl.edu
COLLABORATION BEGINS
Researchers approach Bernard Becker Medical
Library (BBML) with a need for genomic data
analysis software that is easy to use (graphical user
interface) in an environment with computing
resources beyond lab computers/laptops
BBML approaches Center for High Performance
Computing (CHPC) with idea for collaboration
COLLABORATION BEGINS
Installation of genomics software on the CHPC
cluster (WUSTL Galaxy instance & Partek Flow)
- web interface for users
- analysis jobs run on the CHPC cluster
- user programming experience not needed
COLLABORATION BEGINS
Installation of genomics software on the CHPC
cluster
- CHPC maintains software
- BBML provides user support and training
- BBML facilitates licensing when applicable
OUTCOMES
Installation of genomics software on the CHPC
cluster
- positive feedback from researchers
- appreciate BBML & CHPC collaboration to
make this type of software available
CHALLENGES
Installation of genomics software on the CHPC
cluster (WUSTL Galaxy instance)
- open source software
- aspects of software challenging to maintain
CHALLENGES
Installation of genomics software on the CHPC
cluster (Partek Flow)
- commercial software
- part of BBML software licensing program
- too expensive for some researchers
COLLABORATION GROWS
Researchers approach BBML with an interest in
learning R
CHPC would like to increase awareness about
resources and make it easier for new users to get
started on the cluster
COLLABORATION GROWS
Develop workshops to introduce the basics of
research computing
- assume no experience with cluster
computing or programming
COLLABORATION GROWS
Develop workshops to introduce the basics of
research computing
- Computing 101 (using the CHPC cluster)
- R
- Python
COLLABORATION GROWS
Computing 101 workshop series
- connecting to a remote system
- moving data back and forth
- basic Unix
- shell scripts & batch scripts
- interacting with the queueing system
COLLABORATION GROWS
R workshop series
- objects, classes, data structures
- basic data manipulation & visualization
- packages
- RStudio
- using R on the CHPC cluster
COLLABORATION GROWS
Python workshop series
- variables, data types, functions
- basic data manipulation & visualization
- NumPy, pandas, Matplotlib
- Jupyter Notebook
- using Python on the CHPC cluster
COLLABORATION GROWS
Develop workshops to introduce the basics of
research computing
- same format for all workshops
- lecture (30 minutes)
- hands-on exercises (90 minutes)
OUTCOMES
Develop workshops to introduce the basics of
research computing
- praise for working to meet unmet need
- increased awareness
- seen in new light
- fulfilling for staff
CHALLENGES
Develop workshops to introduce the basics of
research computing
- experience level of attendees varies widely
BBML CONSIDERATIONS
Teaching research computing workshops is in line
with BBML mission to support research enterprise
on campus
CHPC is a great partner – very collaborative, equal
teaching role for all involved
BBML CONSIDERATIONS
Library handles majority of administrative tasks
associated with workshops (takes time but
provides benefits as well)
- marketing
- registration/waiting lists
- communication with registrants
- post-workshop surveys
COLLABORATION GROWS AGAIN
Join forces with the recently created Institute for
Informatics (I2) to expand content
- genomic data analysis with R/Bioconductor
- manipulating/cleaning EHR data for analysis
COLLABORATORS
BBML
Maze Ndonwi
Robert Engeszer
Paul Schoening
CHPC
Malcolm Tobias
Xing Huang
Institute for Informatics
Aditi Gupta
Madhurima Kaushal
Andrea Krussel
Philip Payne
