Welcome and Introductions
Overview
Teaching: 10 min
Exercises: 0 min
Questions
Instructor introductions
Objectives
Overview of the course
Welcome and Introductions
Welcome to this tutorial, a primer on using the Singularity containerization framework to install customizable applications and to run the resulting containers on Linux-based computer systems.
Software containers and containerization technologies are emerging as valuable research artifacts that facilitate and promote scientific reproducibility. A software container separates the containerized application environment from the host system, acting as an abstracted virtualization layer within the host machine's operating system. One of the primary reasons for the rapid adoption of containerization frameworks is that this layer allows a container to be migrated from one system and executed on another, as long as the underlying containerization framework is present on the target system. Another major factor driving adoption, particularly in computational biology and bioinformatics, is that container developers can bundle the prerequisites and software dependencies their applications require, so that others can easily transfer those applications and run them on their own datasets.
In this tutorial, participants will interact with and build Singularity containers for various computational biology and bioinformatics applications using cloud-hosted resources (e.g., instances in AWS, GCP, or Azure). Each participant is provided with an instance for the duration of the tutorial, configured with a Linux host operating system and the Singularity containerization framework.
Presenters
- William S. Sanders (Shane) is the senior manager of the Cyberinfrastructure group at The Jackson Laboratory, where his team focuses on providing the high performance computing, storage, and other data science resources required to meet the needs of JAX faculty research efforts. Dr. Sanders has a BS in Biochemistry and a BS in Computer Science, and obtained an MS in Computer Science and a PhD in Molecular Biology from Mississippi State University. His research efforts focus on using artificial intelligence and high performance computing to provide insight into biological problems, in a variety of species including chicken, cotton, and rattlesnake.
- Jason S. Macklin is the HPC Systems Engineer at The Jackson Laboratory. He is responsible for the design and architecture of JAX HPC systems and currently manages 7 disparate HPC resources in support of the computational needs of the JAX research efforts. Prior to JAX, Mr. Macklin served as HPC Systems Engineer at 454 Life Sciences and Roche Life Science. Mr. Macklin obtained a BA in English from the University of Connecticut.
- Richard Yanicky is a Systems Analyst at The Jackson Laboratory, supporting the HPC and bioinformatics community through outreach and one-on-one project work. He holds both BS and MS degrees in Applied and Computational Mathematics from the University of Connecticut, in addition to an MS in Applied Statistics from Worcester Polytechnic Institute. He has 20 years of experience in industry and academia supporting corporate drug discovery pipelines and biological research pipelines for companies such as Pfizer, Abbott, and Eli Lilly. Prior to JAX, Richard was most recently at UCSD supporting research in human genetics for short tandem repeats and epigenetics.
- Aaron McDivitt is a Systems Administrator at JAX focusing not only on maintaining the underlying storage and HPC infrastructure, but also educating and assisting the HPC user community with their research needs. In a past life, Aaron was a math and science educator/coach at the middle and high school levels, but transitioned to the IT field in 2013. He received his BA in Education from Cedarville University.
- David McKenzie is a Cyberinfrastructure Architect at The Jackson Laboratory, supporting research and researchers. His primary focuses are research storage, networking, and cloud computing. He received a B.S. in Information Technology from the Rochester Institute of Technology in 2006.
- Kurt Showmaker is a Systems Analyst at The Jackson Laboratory, supporting HPC facilitation and adaptation. Dr. Showmaker's previous work includes serving as associate director of the University of Mississippi Medical Center’s bioinformatics core, where he assisted in supporting clinical and translational research for multiple centers, including the Mississippi Center for Clinical and Translational Research, the Cardiovascular-Renal Research Center, the Mississippi Center of Excellence in Perinatal Research, the Molecular and Genomics Core Facility, and, through support from the University of Southern Mississippi, the Mississippi INBRE. In 2020, through the XSEDE project (a national multi-institution collaborative computing project), he was selected as an XSEDE Campus Champion Fellow for the 2020-2021 class. During his career he has conducted bioinformatic analyses on multiple HPC systems, including Mississippi State University’s HPCC, University of Mississippi’s Mississippi Center for Supercomputing Research, University of Southern Mississippi’s Magnolia Cluster, University of Illinois Urbana-Champaign’s campus cluster, and the Carl R. Woese Institute for Genomic Biology’s Biocluster.
Objectives
- Learn what a software container is and why adoption of containerized applications is increasing.
- Learn which online resources for containers are already available (e.g., DockerHub, Singularity-Hub).
- Learn how to navigate and interact with containers from the Linux command line.
- Learn how to build your own containers both using container definition files and from scratch.
- Learn how to leverage containers for your existing scientific workflows.
Connecting to the Workshop Resources
Please use the instructions below to connect to the Google Cloud virtual machine instances used for this course.
1. Click here to view a list of virtual machines (VMs) for this workshop. Each one is represented by a unique Public IP address.
2. Find your name in the list or place your name in an empty space in the “Workshop User” column. This ensures only one user will be logged in per virtual machine instance.
3. Copy the public IP address associated with your VM.
Mac Users
- Open the Terminal app on your computer and type the following command, pasting in your IP address from the spreadsheet in place of <your_vm_IP>:
ssh student@<your_vm_IP>
- Type “yes” to connect to the server.
- Paste or type in the unique password for your VM (the password field will be blank even though you type or paste characters) and hit the “Enter” key.
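The Mac steps above can be sketched as a short shell snippet. This is only an illustration: the address 203.0.113.10 is a hypothetical placeholder from a reserved documentation range, not a real workshop VM, and the variable names are invented for this example. Replace the address with the public IP you copied from the spreadsheet.

```shell
#!/usr/bin/env bash
# Sketch: assemble the SSH target for a workshop VM.
# VM_IP is a hypothetical placeholder; replace it with your VM's public IP.
VM_USER="student"
VM_IP="203.0.113.10"
SSH_TARGET="${VM_USER}@${VM_IP}"

# Print the command that would be run, so it can be checked before connecting.
echo "ssh ${SSH_TARGET}"

# Uncomment the next line to actually connect:
# ssh "${SSH_TARGET}"

# Once logged in, this command reports the installed Singularity version:
# singularity --version
```

Printing the command first makes it easy to confirm that the username and IP address are correct before typing the password.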
Windows Users
- Open the PuTTY app on your computer. MobaXterm is also a good option if you prefer that program.
- Start an SSH session to the unique Public IP address associated with your VM. Use **student@<your_vm_IP>** in the Hostname box.
- Click “Accept” for the certificate warning.
- If prompted, type student as the login name.
- Type your password (do not copy/paste) from the spreadsheet into the password field. Pasting your password may not work properly.
Click Here to view a PDF version of the instructions above