A simple project I did during my studies to learn Docker and Docker Compose, with a cheatsheet written on top of it.

yanamlnk/docker-cheatsheet

docker-cheatsheet

  1. Project Structure
  2. Docker
    1. Main elements to know
    2. Installation
    3. Dockerfile components
    4. Key Docker commands
    5. Docker Compose Elements
    6. Docker Flags
    7. Volume Types
    8. Restart Policies
  3. Add Docker To Project
    1. Requirements
    2. Poll Dockerfile
    3. Result Dockerfile
    4. Worker Dockerfile
    5. Dockerignore file
    6. Docker Compose File
    7. Redis service
    8. DB Service
    9. Worker, Poll and Result services
    10. Networks
    11. Volumes
    12. .env and .gitignore
  4. Run the project

The project structure

All project files were given by the school (except Docker). Here is a small description:

  • it is a web application with a poll part, where you can vote, and a result part, where you can see the results of the vote
  • poll - written in Python Flask, it pushes each vote to the Redis queue
  • redis - holds the votes until the worker consumes and processes them
  • worker - Java application that consumes the votes and saves them to the PostgreSQL database
  • database - persistently stores the results
  • result - Node.js application that fetches the results from the database and displays them. Uses Socket.io

So the general schema for infrastructure is the following:

  [Poll]                [Result]     <- (frontend)
     |                     |
  [Redis] - [Worker] - [Database]    <- (backend)

So there are 5 microservices that need to be connected and "communicate" with each other. The final project tree will be the following:

.
├── compose.yml
├── poll
│   ├── Dockerfile
│   └── (poll python files)
├── result
│   ├── Dockerfile
│   └── (result JS files)
├── schema.sql
└── worker
    ├── Dockerfile
    └── (worker Java files)

So there will be 3 custom images, poll, result and worker (Redis and PostgreSQL use existing official images), and a compose file that connects all 5 services together.

Docker

A common problem is that the same code works perfectly on one machine and fails on another, because of differences in OS, library versions, etc. This is where Docker comes in: it runs your code in an isolated environment called a container, and since the container comes pre-configured, the code behaves the same everywhere.

Main elements to know:

  1. Dockerfile
  • Text file containing instructions to build a Docker image
  • Series of commands that Docker executes to create an image
  • Instructions that change the filesystem (RUN, COPY, ADD) each create an immutable layer in the image
  • Base component for creating reproducible builds
  2. Docker Image
  • A read-only template containing application code, libraries, dependencies, tools, and other files
  • Like a "snapshot" or blueprint for creating containers
  • Can be stored in registries (like Docker Hub)
  • Built using a Dockerfile
  • Layered architecture (filesystem-changing instructions each add a new layer)
  3. Docker Container
  • A runnable instance of a Docker image
  • Isolated environment with its own filesystem, network interface, and process space
  • Can be started, stopped, moved, and deleted
  • Feels like a lightweight virtual machine, but shares the host's kernel instead of running its own OS
  4. Docker Volume
  • Mechanism for persisting data generated by and used by containers
  • Exists outside the container lifecycle
  • Three types:
    • Named volumes (managed by Docker)
    • Bind mounts (direct link to host filesystem)
    • tmpfs mounts (stored in host memory)
  • Necessary because data written inside a container lives only as long as that container. Once the container is removed, all data generated in it disappears
  5. Docker Network
  • Enables communication between containers
  • Isolates container communications
  • Types:
    • Bridge (default)
    • Host
    • None
    • Custom networks
  6. Docker Registry
  • Storage and distribution system for Docker images
  • Can be public (like Docker Hub) or private
  • Repository for sharing and versioning images
  7. Docker Compose
  • Tool for defining and running multi-container applications
  • Uses a YAML file to configure application services
  • Manages the complete application lifecycle, sets up networks and controls the order in which containers are created (if there are specific dependencies)

Relationships:

Dockerfile -> Docker Image -> Docker Container
                    ^
             Docker Registry

Container <-> Volume (for persistence)
Container <-> Network (for communication)

Installation

Install Docker Desktop, which includes everything necessary plus a UI.

To check if you have Docker and Compose:

  • docker --version
  • docker compose version

Dockerfile components:

FROM         # Base image to build upon
WORKDIR      # Sets working directory for instructions
COPY         # Copies files from the build context into the image
ADD          # Copies files (with extra features like URL support and tar extraction)
RUN          # Executes commands during image build
ENV          # Sets environment variables
EXPOSE       # Documents which ports are intended to be published
CMD          # Default command to run when container starts
ENTRYPOINT   # Main command to run (CMD becomes arguments to this)
VOLUME       # Creates a mount point for external volumes

Key Docker commands:

# Images
docker build -t name:tag .    # Build image from Dockerfile
docker pull image:tag        # Pull image from registry
docker push image:tag        # Push to registry
docker images               # List local images
docker rmi image           # Remove image

# Containers
docker run image           # Create and start container
docker start/stop name    # Start/stop existing container
docker ps                # List running containers
docker ps -a             # List all containers
docker rm container      # Remove container
docker logs container    # View container logs
docker exec -it container bash  # Enter running container

# System
docker system prune      # Clean up unused resources
docker volume ls        # List volumes
docker network ls      # List networks

Docker Compose Elements:

version:        # Compose file format version (obsolete in current Docker Compose, can be omitted)
services:       # Define application services
  webapp:       # Service name
    build:      # Build from Dockerfile
    image:      # Use existing image
    ports:      # Port mapping (host:container)
    volumes:    # Mount volumes
    environment: # Environment variables
    networks:    # Connect to networks
    depends_on:  # Service dependencies
    restart:     # Restart policy

networks:       # Define custom networks
volumes:        # Define named volumes

Docker Flags

-d          # Run in background (detached)
-p          # Port mapping
-v          # Volume mounting
--name      # Assign container name
--network   # Connect to network
-e          # Set environment variables
--rm        # Remove container when it exits
-it         # Interactive terminal

Volume Types

# Named volumes
volumes:
  mydata:

# Bind mounts
volumes:
  - ./host/path:/container/path

# tmpfs mounts (memory only)
tmpfs:
  - /temp

Restart Policies

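restart:
  "no"           # Never restart (the default; written with quotes in YAML)
  always         # Always restart when the container stops
  on-failure     # Restart only when the container exits with a non-zero code
  unless-stopped # Like always, but not after the container was stopped manually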
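
The four policies can be read as a small decision rule. The sketch below is a simplified model in Python, not Docker's implementation: the function name `should_restart` is made up for illustration, and it only covers the case where the container's main process exits on its own. It ignores manual stops and daemon restarts, which is where always and unless-stopped differ.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether a container whose main process just exited is restarted.

    Simplified model: manual stops and Docker daemon restarts are not covered.
    """
    if policy == "no":
        return False                 # never restarted automatically
    if policy == "on-failure":
        return exit_code != 0        # only a non-zero exit counts as a failure
    if policy in ("always", "unless-stopped"):
        return True                  # restarted regardless of the exit code
    raise ValueError(f"unknown restart policy: {policy}")


print(should_restart("on-failure", 0))      # False
print(should_restart("on-failure", 1))      # True
print(should_restart("unless-stopped", 0))  # True
```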

Add Docker to the project

Requirements

There are the following requirements:

  • create 3 images, respecting the specifications described below
  • no ENTRYPOINT
  • no latest versions
  • Poll
    • the image is based on an official Python image
    • the app exposes and runs on port 80
  • Result
    • the image is based on an official Node.js Alpine image
    • the app exposes and runs on port 80
    • the node_modules folder must be excluded from the build context
  • Worker
    • the image is built using a multi-stage build
    • First stage - compilation:
      • is based on maven:3.9.6-eclipse-temurin-21-alpine and is named builder
      • is used to build and package the Worker application using:
        • mvn dependency:resolve from within the folder containing pom.xml
        • then mvn package from within the folder containing the src folder
      • generates a file in the target folder named worker-jar-with-dependencies.jar
    • Second stage - run:
      • is based on eclipse-temurin:21-jre-alpine
      • is the one really running the worker using java -jar worker-jar-with-dependencies.jar
  • Docker images must be as simple and lightweight as possible
  • the name of the Compose file is compose.yml
  • the Compose file should contain:
    • 5 services:
      • poll (builds poll image, redirects port 5000 of the host to the port 80 of the container)
      • redis (uses an existing official image of Redis, opens port 6379)
      • worker (builds worker image)
      • db (uses an existing official image of PostgreSQL, has its database schema created during container first start)
      • result (builds result image, redirects port 5001 of the host to the port 80 of the container)
    • 3 networks: poll-tier, result-tier and back-tier
    • 1 volume: db-data

Poll Dockerfile

# poll/Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
ENV PORT=80
EXPOSE 80
CMD ["python", "app.py"]
  • FROM - uses the official Python image, version 3.11. The slim variant keeps the image size small
  • WORKDIR - creates a directory INSIDE the image (each container has its own filesystem) and sets it as the working directory for the following instructions. It is like running mkdir -p /app && cd /app
  • COPY - copies files from the local directory (first .) into the container's current working directory (second .). The local path is just . because the Dockerfile sits in the poll directory together with all the source code and configs. COPY includes all files except those listed in .dockerignore
  • RUN - executes a command during the image build. Here the command is pip install -r requirements.txt, which installs the libraries listed in requirements.txt
  • ENV - sets an environment variable inside the container. Here the variable PORT is created with the value 80
  • EXPOSE - serves as documentation and does not change how the container behaves. Here it only labels the image with "this container will use port 80". It does not make the port accessible by itself: you need to publish it explicitly, for example with docker run -p 5000:80 your-image
  • CMD - specifies the default command to run when the container starts. Any command given after the image name in docker run replaces it, whereas ENTRYPOINT sets a command that is only replaced with the explicit --entrypoint flag
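
ENV PORT=80 only has an effect if the application reads that variable. The real app.py is not shown here, so the following is just a minimal sketch of how a Python app could pick up the port; the helper name `resolve_port` and the fallback value 5000 are assumptions made for illustration.

```python
import os


def resolve_port(environ=os.environ, default=5000):
    """Return the port to listen on: the PORT variable if set, else a fallback."""
    return int(environ.get("PORT", default))


# Inside the container built from the Dockerfile above, PORT is "80".
print(resolve_port({"PORT": "80"}))  # 80
# Without the variable (for example when run directly on the host), the fallback is used.
print(resolve_port({}))              # 5000
```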

Result Dockerfile

# result/Dockerfile
FROM node:21-alpine

WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .

ENV PORT=80
EXPOSE 80

CMD ["npm", "start"]

Here, everything is the same as in the Poll Dockerfile, with one key difference: there are 2 COPY instructions.

  • The first COPY copies just the package.json files, and the following RUN installs the necessary dependencies
  • The second COPY copies the rest (including package.json a second time, but that only overwrites the file and does not install the dependencies again)
  • Why is it done this way? Docker uses layer caching. If the layer created by COPY package*.json is unchanged and only the code (second COPY) has changed, Docker reuses the cached dependency layer and only copies the rest of the files, saving time. Docker reinstalls dependencies only if a package.json file has changed.
  • both ./ and . refer to the current working directory (/app here):
    • in COPY . . the first . is the local directory with the sources and the second . is the working directory in the image
    • the trailing slash in ./ states explicitly that the destination is a directory, which Docker requires when one COPY has several sources (package*.json can match both package.json and package-lock.json)
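
The caching behaviour described above can be imitated with a toy model in Python. This is only a sketch under simplifying assumptions, not how Docker computes its cache: each step gets a key derived from the previous key, the instruction text and a string standing in for the copied files. All function names and the "deps-v1"/"code-v1" placeholders are made up for illustration, and the steps after COPY . . are left out.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per step; each key also depends on every earlier step."""
    keys, parent = [], ""
    for instruction, inputs in steps:
        key = hashlib.sha256(f"{parent}|{instruction}|{inputs}".encode()).hexdigest()
        keys.append(key)
        parent = key
    return keys


def rebuilt_steps(old_steps, new_steps):
    """List the instructions whose cache key changed between two builds."""
    old_keys, new_keys = layer_keys(old_steps), layer_keys(new_steps)
    return [
        instruction
        for (instruction, _), old, new in zip(new_steps, old_keys, new_keys)
        if old != new
    ]


def result_build(package_files, source_files):
    """Steps of the Result Dockerfile with placeholder file contents."""
    return [
        ("FROM node:21-alpine", ""),
        ("WORKDIR /app", ""),
        ("COPY package*.json ./", package_files),
        ("RUN npm install", ""),
        ("COPY . .", package_files + source_files),
    ]


first = result_build("deps-v1", "code-v1")
print(rebuilt_steps(first, result_build("deps-v1", "code-v2")))
# ['COPY . .']
print(rebuilt_steps(first, result_build("deps-v2", "code-v1")))
# ['COPY package*.json ./', 'RUN npm install', 'COPY . .']
```

A code-only change leaves the npm install step cached, while a change to the package files invalidates it and everything after it.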

Worker Dockerfile

# worker/Dockerfile
FROM maven:3.9.6-eclipse-temurin-21-alpine AS builder
WORKDIR /app
COPY . .
RUN mvn dependency:resolve
RUN mvn package

FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY --from=builder /app/target/worker-jar-with-dependencies.jar .
CMD ["java", "-jar", "worker-jar-with-dependencies.jar"]
  • this Dockerfile is interesting because it is multi-stage, which keeps the final image smaller: the build tools stay in the first stage and never reach the final image.
  • First stage:
    • AS builder names this stage for later reference
    • RUN is used twice: mvn dependency:resolve downloads the dependencies, mvn package creates the JAR file
  • Second stage:
    • COPY --from=builder copies the JAR file out of the stage named builder
Stage 1 (builder)             Stage 2 (final)
+------------------+          +----------------+
| Maven image      |          | JRE image      |
| Source code      |   JAR    |                |
| Builds JAR    -----→-------→ Only JAR file   |
| ~400MB           |          | ~100MB         |
+------------------+          +----------------+

Dockerignore file

  • It tells Docker which files/directories to EXCLUDE during the build process
  • Makes builds faster by copying fewer files
  • Reduces the final image size

In the result folder, I have created a .dockerignore file and added node_modules to it. Thanks to this:

  • Docker skips copying the node_modules directory
  • Dependencies are installed cleanly inside the container (a node_modules folder installed on the host may contain binaries built for a specific OS, which may not work inside the container)
  • Build process is faster and cleaner

Docker Compose File

Now in the root of the project, create compose.yml file:

version: '3.8'

services:
  redis:
    image: redis:alpine
    ports:
      - "6379:6379"
    networks:
      - poll-tier
      - back-tier
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    volumes:
      - db-data:/var/lib/postgresql/data
      - ./schema.sql:/docker-entrypoint-initdb.d/schema.sql
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    networks:
      - back-tier
      - result-tier
    restart: unless-stopped

  worker:
    build: ./worker
    environment:
      - REDIS_HOST=redis 
      - POSTGRES_HOST=db
      - POSTGRES_PORT=${POSTGRES_PORT}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    networks:
      - back-tier
    depends_on:
      - redis
      - db
    restart: unless-stopped
    
  poll:
    build: ./poll
    ports:
      - "5000:80"
    environment:
      - REDIS_HOST=redis 
      - OPTION_A=${OPTION_A}
      - OPTION_B=${OPTION_B}
      - OPTION_C=${OPTION_C}
      - OPTION_D=${OPTION_D}
    networks:
      - poll-tier
    depends_on:
      - redis 
    restart: unless-stopped

  result:
    build: ./result
    ports:
      - "5001:80"
    environment:
      - POSTGRES_HOST=db
      - POSTGRES_PORT=${POSTGRES_PORT}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    networks:
      - result-tier
    depends_on:
      - db
    restart: unless-stopped

networks:
  poll-tier:
  result-tier:
  back-tier:

volumes:
  db-data:
  • version specifies the Compose file format version. Current Docker Compose (v2) follows the Compose Specification, where this field is obsolete: it is ignored with a warning, so it can be removed
  • services defines the application containers. The order of service declarations in the compose.yml file doesn't determine the startup order; the actual startup order is determined by the depends_on configuration.
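
One way to picture how depends_on determines the startup order is a topological sort of the dependency graph. The sketch below is an illustration in Python using the standard graphlib module (Python 3.9+), not what Compose runs internally; the dictionary simply mirrors the depends_on entries from the compose.yml above.

```python
from graphlib import TopologicalSorter

# service -> services it depends on, copied from the compose.yml above
depends_on = {
    "redis": [],
    "db": [],
    "worker": ["redis", "db"],
    "poll": ["redis"],
    "result": ["db"],
}


def startup_order(graph):
    """Return one valid order in which every service comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = startup_order(depends_on)
print(order)  # redis and db always appear before the services that depend on them
```

Several orders are valid (redis and db can be swapped, for example), which matches the point that the position of a service in the file does not matter.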

redis:

redis:
    image: redis:alpine        # Uses pre-built Redis image
    ports:
      - "6379:6379"           # Port mapping (host:container)
    networks:                  # Connected networks
      - poll-tier
      - back-tier
    restart: unless-stopped
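  • regarding ports: the format is host:container. For example, 5000:80 (used by poll) means that the outside world uses localhost:5000 while the app inside the container listens on port 80. When you open localhost:5000 in a browser, Docker forwards that traffic to port 80 in the container. For redis, 6379:6379 publishes the container's port 6379 on the same port of the host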
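
To make the host:container direction explicit, here is a tiny Python sketch that splits such a mapping. The function name is made up for illustration, and it only handles the short "HOST:CONTAINER" form used in this project, not the other forms Compose accepts.

```python
def parse_port_mapping(mapping: str) -> tuple[int, int]:
    """Split a short 'HOST:CONTAINER' mapping into (host_port, container_port)."""
    host_port, container_port = mapping.split(":")
    return int(host_port), int(container_port)


host, container = parse_port_mapping("5000:80")
print(f"localhost:{host} on the host -> port {container} in the container")
# localhost:5000 on the host -> port 80 in the container
```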

db:

db:
    image: postgres:15-alpine
    volumes:                   # Data persistence
      - db-data:/var/lib/postgresql/data         # Named volume
      - ./schema.sql:/docker-entrypoint-initdb.d/schema.sql  # Init script
    environment:              # Environment variables
      - POSTGRES_USER=${POSTGRES_USER}
  • environment creates environment variables in the container. ${POSTGRES_USER} is replaced with the value saved in the .env file.
  • for volumes:
    • db-data:/var/lib/postgresql/data mounts the named volume db-data at /var/lib/postgresql/data inside the container, which is where PostgreSQL keeps its data files
    • ./schema.sql:/docker-entrypoint-initdb.d/schema.sql - ./schema.sql is the source file on the host machine, /docker-entrypoint-initdb.d/schema.sql is the destination path in the container. /docker-entrypoint-initdb.d/ is a special directory in the official PostgreSQL image:
      • The image's startup script automatically executes any .sql files in this directory
      • They only run when the database is first initialized (first startup with an empty data directory)
      • Used to set up the initial database schema, tables, etc.
  • so you don't need to create a user or configure the database yourself, the container will:
    • Use the credentials from your .env file
    • Automatically create the user
    • Run the schema.sql when the container first starts
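
The ${POSTGRES_USER} substitution from .env can be imitated in a few lines of Python. This is a rough sketch, not Compose's parser: it only supports plain KEY=VALUE lines and the simple ${NAME} form, the helper names are made up, and example_user / example_db are placeholder values (the real ones are left blank in this README).

```python
from string import Template


def parse_env(text):
    """Read KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def interpolate(entry, values):
    """Replace ${NAME} placeholders in one environment entry."""
    return Template(entry).substitute(values)


env_file = """
# Database settings
POSTGRES_USER=example_user
POSTGRES_DB=example_db
"""

values = parse_env(env_file)
print(interpolate("POSTGRES_USER=${POSTGRES_USER}", values))
# POSTGRES_USER=example_user
```

In this sketch a variable missing from the file raises a KeyError, which makes a forgotten .env entry easy to spot.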

worker, poll, result

worker:
    build: ./worker           # Build from Dockerfile
    environment:              
      - REDIS_HOST=redis      # Service discovery
      - POSTGRES_HOST=db      # Reference other services
    depends_on:               # Startup order
      - redis
      - db
  • build is the path to the directory that contains the corresponding Dockerfile (the build context)
  • the environment values here (redis and db) are the names of the services. Services on the same network can talk to each other using service names as hostnames. For example, worker can reach Redis using just "redis" as the hostname: Docker's internal DNS automatically resolves service names to the correct container IP addresses. So REDIS_HOST=redis tells the app to look for a host named "redis"
  • depends_on declares the startup order: worker is started after redis and db. In this short form Compose only waits for those containers to be started, not for Redis or PostgreSQL to be ready to accept connections
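
To show where the service names end up, here is a hedged Python sketch of how a service could turn these variables into connection targets. The real worker is a Java application whose code is not shown, so the function name, the URL format and the sample values below are assumptions for illustration; only the variable names and the Redis port 6379 come from the compose.yml.

```python
def connection_targets(environ):
    """Build a Redis (host, port) pair and a PostgreSQL URL from the environment."""
    redis_target = (environ["REDIS_HOST"], 6379)
    postgres_url = (
        f"postgresql://{environ['POSTGRES_USER']}:{environ['POSTGRES_PASSWORD']}"
        f"@{environ['POSTGRES_HOST']}:{environ['POSTGRES_PORT']}/{environ['POSTGRES_DB']}"
    )
    return redis_target, postgres_url


# Placeholder values; inside the worker container they come from compose.yml and .env.
sample_env = {
    "REDIS_HOST": "redis",
    "POSTGRES_HOST": "db",
    "POSTGRES_PORT": "5432",
    "POSTGRES_USER": "example_user",
    "POSTGRES_PASSWORD": "example_password",
    "POSTGRES_DB": "example_db",
}

redis_target, postgres_url = connection_targets(sample_env)
print(redis_target)  # ('redis', 6379)
print(postgres_url)  # postgresql://example_user:example_password@db:5432/example_db
```

The hostnames in both targets are just the service names; no IP address appears anywhere in the configuration.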

networks

networks:
  poll-tier:    # For poll and redis
  result-tier:  # For result and db
  back-tier:    # For worker, redis, and db

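This declares the three networks. A service can only talk to services that share a network with it, so poll can reach redis but not db.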
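
The isolation these networks give can be checked with a small Python model: two services can talk only if they share at least one network. The mapping below is copied from the networks entries of each service in the compose.yml; the function name `can_reach` is made up for illustration.

```python
# service -> networks it is attached to, copied from the compose.yml above
service_networks = {
    "poll": {"poll-tier"},
    "redis": {"poll-tier", "back-tier"},
    "worker": {"back-tier"},
    "db": {"back-tier", "result-tier"},
    "result": {"result-tier"},
}


def can_reach(a, b):
    """Two services can communicate if they share at least one network."""
    return bool(service_networks[a] & service_networks[b])


print(can_reach("poll", "redis"))   # True
print(can_reach("worker", "db"))    # True
print(can_reach("poll", "db"))      # False
print(can_reach("result", "redis")) # False
```

So the two frontends are kept away from each other and from the parts of the backend they do not need.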

volumes

volumes:
  db-data:      # Named volume for database persistence

This declares the named volume that keeps the database files.

.env and .gitignore

In the root, there is also a .env file with the values for all environment variables:

# Database settings
POSTGRES_USER=____
POSTGRES_PASSWORD=____
POSTGRES_DB=____
POSTGRES_PORT=____

# Vote options
OPTION_A=____
OPTION_B=____
OPTION_C=____
OPTION_D=____

And .gitignore lists the files that should not be pushed to Git:

.env
result/node_modules
worker/target

Run the project

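To start the project, open Docker Desktop so that Docker is running, and then run the command docker compose up --build. The --build flag forces the images to be rebuilt, so you only need it after changing a Dockerfile or the source code (on the first launch, missing images are built automatically); otherwise docker compose up is enough. To stop and remove the containers: docker compose down (the db-data volume is kept).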

The application should be accessible at:

  • poll: http://localhost:5000
  • result: http://localhost:5001
