Docker Containers for Data Science and Reproducible Research
A course tutorial on making your work reproducible using Docker Containers
Description
Get excited!
This course is designed to jump-start using Docker Containers for Data Science and Reproducible Research by reproducing several practical examples.
The course will help you set up a Docker environment on any machine equipped with Docker Engine (Mac, Windows, Linux). It then walks through all the steps needed to create a custom, distributable development environment [RStudio] in a container. Forget about manually updating your development environment! Work as usual, add or develop your research document inside the container, test it, and distribute it as an image! The result will be reproducible independently of the R version, perhaps even after several years...
The same applies to running R programs in a container. We will demonstrate this capability, including testing the container on completely different machines (Mac, Windows, Linux).
Summary of ideas we will cover in this course:
Reproduce and share work on a different infrastructure
Be able to repeat the work after several years
Use RStudio in an isolated environment
Tips for personalizing your work with Docker, including the use of Automated Builds
What is covered by this course?
This course walks through several use cases for Docker Containers in Data Science:
Preparing your computer for using Docker
A working pipeline for developing a Docker image
Building a Docker image to work with RStudio in interactive mode
Building Docker images to run R programs
Using a Docker network to communicate between containers
Building Shiny Server in a Docker container
Walk-through example of developing a Shiny app as an R package and deploying it in a Docker container using the golem framework
More relevant materials may be added to this course in the future (e.g. continuous integration and deployment, docker-compose).
Why take this course and not another?
The added value of this course is a quick overview of Docker functionality, together with practical methods and templates to build on. The focus is on making the learning journey as easy as possible: simply watch the videos and reuse the provided code!
Just start using Docker Containers with your Data Science tools by reproducing this course!
What You Will Learn!
- Use Docker Containers to run R Scripts in a reproducible way
- Create a customized RStudio in a Docker Container [portable, automated updates]
- Build personal Docker Images based on images from verified publishers
- Save Docker Images locally or in the Docker Hub online repository
- Share the results of your work with your colleagues
- Save and document your work with Version Control
- Practical use of Version Control during the development process
- Run containers using Shell/Bat scripts
- Use Auto-builds to update Docker images
- Develop R packages
- Develop a Shiny Application with the golem framework
Who Should Attend!
- Data Scientists who want to add Docker to their toolset
- Anyone who wants to deploy an R script in a Docker Container
- Anyone who wants to use RStudio in a Docker Container
- Anyone curious about Docker for Data Science