Introduction
thoth simplifies containerization by automatically creating a Docker
environment that matches your local R setup. This ensures your analysis
can be reproduced exactly as intended, regardless of where it runs.
Automatic Setup
When creating a new project, thoth handles Docker configuration
automatically:
library(thoth)
create_analytics_project(
"my_analysis",
use_docker = TRUE
)
This creates two key files, a Dockerfile and a docker-compose.yml:
1. Dockerfile
FROM rocker/rstudio:${R_VERSION}
# System dependencies
RUN apt-get update && apt-get install -y \
python3-pip \
&& rm -rf /var/lib/apt/lists/*
# Install DVC
RUN pip3 install dvc
# Project setup
WORKDIR /project
COPY . /project/
# R package installation
RUN R -e 'install.packages("renv")'
RUN R -e 'renv::restore()'
# Set permissions
RUN chown -R rstudio:rstudio /project
CMD ["/init"]
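The Dockerfile above copies the project into the image and then calls renv::restore(), so the build depends on a lockfile being present in the project. A minimal pre-build check might look like the sketch below. Note that check_docker_inputs() is a hypothetical helper rather than a thoth function, and renv.lock is assumed because it is renv's default lockfile name.

```r
# Hypothetical helper (not part of thoth): report whether the files the
# Docker build relies on exist in the project directory.
check_docker_inputs <- function(project_dir = ".") {
  required <- c("Dockerfile", "renv.lock")  # assumed file names
  present <- file.exists(file.path(project_dir, required))
  names(present) <- required
  present
}

# Report anything that is missing before running a build.
status <- check_docker_inputs(".")
if (!all(status)) {
  message("Missing before build: ",
          paste(names(status)[!status], collapse = ", "))
}
```

If renv.lock is reported missing, running renv::snapshot() in the project is the usual way to create it.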
Key Features
1. Version Matching
- Uses the same R version in the image as your local installation
- Keeps results reproducible wherever the container runs
- Installs the required system dependencies when the image is built
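To illustrate version matching, the sketch below shows one way a local R version could be mapped to a rocker/rstudio image tag. It is not taken from thoth's source: rocker_image_tag() is a hypothetical name, and the only assumption is that the image tag is the plain R version string, as in the versioned rocker images.

```r
# Hypothetical helper (not part of thoth): build the rocker/rstudio
# image reference for a given R version.
rocker_image_tag <- function(version = getRversion()) {
  paste0("rocker/rstudio:", as.character(version))
}

# Show which image the current R session would map to.
local_tag <- rocker_image_tag()
cat("Local R maps to image:", local_tag, "\n")
```

Comparing this value with the FROM line of the generated Dockerfile is a quick way to confirm that the image and the local installation agree.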
Best Practices
Project Organization
- Keep Docker files in version control
- Document any customizations
- Include Docker instructions in README
Development Workflow
# 1. Create project with Docker
create_analytics_project("analysis", use_docker = TRUE)
# 2. Start container
system("docker-compose up -d")
# 3. Develop in RStudio Server
# Access at http://localhost:8787
# 4. Track dependencies automatically
# renv records package versions; the image restores them at build time
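Step 2 of the workflow fails with an opaque shell error if docker-compose is not installed. A slightly more defensive version, sketched here with hypothetical helper names (has_docker_compose() and start_container() are not thoth functions), checks for the executable first and reports a non-zero exit status.

```r
# TRUE if a docker-compose executable is found on the PATH.
has_docker_compose <- function() {
  unname(nzchar(Sys.which("docker-compose")))
}

# Start the project's container in the background (detached mode).
start_container <- function(project_dir = ".") {
  if (!has_docker_compose()) {
    stop("docker-compose was not found on the PATH")
  }
  old <- setwd(project_dir)
  on.exit(setwd(old), add = TRUE)
  status <- system2("docker-compose", c("up", "-d"))
  if (status != 0) {
    stop("docker-compose exited with status ", status)
  }
  invisible(TRUE)
}

# Usage (requires Docker and a docker-compose.yml in the project):
# start_container("analysis")
```

On installations that only ship the Compose plugin, the equivalent command is docker compose up -d, so the executable name above would need adjusting.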
Customization
You can customize the Docker setup by modifying:
- Dockerfile: Add tools or system dependencies
# Add custom system packages
RUN apt-get update && apt-get install -y \
your-package-here \
&& rm -rf /var/lib/apt/lists/*
# Add R packages
RUN R -e 'install.packages("your-package")'
- docker-compose.yml: Adjust container settings
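One way to adjust container settings without editing the generated file is a Compose override file, which docker-compose merges with docker-compose.yml automatically. The sketch below writes one from R. The helper write_compose_override() and the service name rstudio are assumptions made for illustration, so check the service name in your generated docker-compose.yml first. PASSWORD is the environment variable the rocker/rstudio image reads for the RStudio Server login.

```r
# Hypothetical helper (not part of thoth): write a minimal Compose
# override file that sets the RStudio Server password.
write_compose_override <- function(path = "docker-compose.override.yml",
                                   service = "rstudio",     # assumed name
                                   password = "change-me") {
  lines <- c(
    "services:",
    paste0("  ", service, ":"),
    "    environment:",
    paste0("      - PASSWORD=", password)
  )
  writeLines(lines, path)
  invisible(path)
}

# Write to a temporary location and inspect the result.
out <- write_compose_override(
  file.path(tempdir(), "docker-compose.override.yml")
)
cat(readLines(out), sep = "\n")
```

To apply it, the file would be written next to docker-compose.yml in the project root and the container restarted.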
Next Steps
- Try the end-to-end example:
vignette("end-to-end-example")
- Learn about DVC integration:
vignette("dvc-tracking")
- Check Docker documentation for advanced features