MervCodes

Tech Reviews From A Programmer

How to Optimize Docker Image Size: From 1GB to Under 100MB

I recently inherited a project with a 1.4GB Docker image. For a Node.js API. Let that sink in. CI/CD took forever, deploys were slow, and every new instance during autoscaling took ages to pull the image. I got it down to 82MB in about an hour using the techniques in this guide. It wasn't hard — just required being deliberate about what goes into the image.

Here's everything I've learned about shrinking Docker images without breaking anything.

Why Image Size Actually Matters

Smaller images have real, measurable benefits:

  • Faster builds and deploys. A 100MB image pushes and pulls in a fraction of the time. In autoscaling scenarios, this directly affects how fast new instances come online.
  • Lower costs. Registries, CI runners, and hosts all store and transfer images. Multiply a 900MB savings across hundreds of weekly builds and it adds up fast.
  • Smaller attack surface. Every package inside a container is a potential vulnerability. Fewer components = fewer CVEs to patch.
  • Easier debugging. When a container only has what it needs, there's less noise during incident response.

Start With the Right Base Image

This single change has the biggest impact. Here's a rough size comparison:

Base Image            Approximate Size
ubuntu:24.04          ~78MB
node:22               ~1.1GB
node:22-slim          ~220MB
node:22-alpine        ~140MB
python:3.13           ~1GB
python:3.13-slim      ~150MB
python:3.13-alpine    ~55MB
golang:1.23           ~820MB
alpine:3.20           ~7MB
distroless/static     ~2MB
scratch               0MB

See the pattern? Full images ship a complete Debian userland plus compilers and build tooling. -slim keeps only the minimal packages needed to run the language runtime. -alpine swaps to musl-based Alpine Linux. And scratch is literally nothing.

My rule of thumb: Start with -slim or -alpine. If Alpine gives you compatibility issues (musl vs glibc), fall back to -slim rather than the full image.
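
One way to keep that fallback cheap is to make the base tag a build argument, so switching from Alpine to -slim is a flag rather than a Dockerfile edit. A minimal sketch, assuming a Node.js app whose entry point is src/index.js; BASE_TAG is a name chosen here for illustration:

```dockerfile
# BASE_TAG is a hypothetical build argument. An ARG declared before FROM
# can be referenced in the FROM line.
ARG BASE_TAG=22-alpine
FROM node:${BASE_TAG}
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "src/index.js"]
```

Building with docker build --build-arg BASE_TAG=22-slim . should then give you the glibc-based variant without touching the file.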

Multi-Stage Builds Are the Real Game-Changer

Multi-stage builds let you use a fat image for building and a tiny image for running. Only the compiled output gets copied to the final image.

# Stage 1: Build
FROM golang:1.23 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server .

# Stage 2: Run
FROM alpine:3.20
RUN apk add --no-cache ca-certificates
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]

The build stage has the entire Go toolchain (820MB+). The final image is just Alpine (~7MB) plus your binary. Result: 15-25MB typically.

For static binaries, you can go all the way to scratch:

FROM scratch
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]

That's an image containing literally just your binary. Often under 10MB.
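
One thing to watch for: scratch has no CA certificates and no user database, so a binary that makes outbound HTTPS calls will fail certificate verification unless you copy a CA bundle in. A sketch of one way to handle that, assuming the same Go project layout as above; the UID 65532 is an arbitrary non-root choice, not something the original setup prescribes:

```dockerfile
FROM golang:1.23 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server .

FROM scratch
# CA bundle from the Debian-based builder so outbound TLS can verify certificates
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
# A numeric UID works without /etc/passwd; 65532 is an arbitrary non-root value
USER 65532:65532
ENTRYPOINT ["/server"]
```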

This works for Node.js too. Build in a full image, copy only production node_modules and built output to a slim runtime:

FROM node:22 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
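RUN npm run build
# Drop devDependencies so only production modules are copied to the runtime stage
RUN npm prune --omit=dev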

FROM node:22-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
EXPOSE 3000
CMD ["node", "dist/index.js"]

Clean Up in the Same Layer

Each RUN creates a layer. If you install packages in one layer and clean up in the next, the files still exist in the first layer:

# Bad: cleanup in separate layer saves nothing
RUN apt-get update && apt-get install -y build-essential
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

# Good: single layer, cleanup before commit
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

That --no-install-recommends flag is a hidden gem. On Debian-based images, it stops APT from pulling in recommended (but not strictly required) packages, which can save 100MB or more depending on what you install.
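
The same rule applies when build tools are only needed to compile a dependency: install, use, and remove them within one RUN so they never land in a committed layer. A sketch, assuming a Python app with a requirements.txt containing packages that need a C compiler; main.py is a placeholder entry point:

```dockerfile
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
# Install the compiler, build the wheels, then purge the compiler, all in one layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    pip install --no-cache-dir -r requirements.txt && \
    apt-get purge -y --auto-remove build-essential && \
    rm -rf /var/lib/apt/lists/*
COPY . .
# main.py is a placeholder entry point
CMD ["python", "main.py"]
```

A multi-stage build usually gets the same result with better layer caching, but this pattern is handy when you want to stay in a single stage.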

Use .dockerignore

Without a .dockerignore, Docker sends your entire build directory to the daemon — including node_modules, .git, test files, IDE configs, and everything else.

.git
node_modules
dist
*.md
.env*
.vscode
__pycache__
*.pyc
tests
coverage

This speeds up builds and prevents accidentally baking secrets or junk into your image.

Only Install What You Need

I've seen production containers with vim, curl, wget, and git installed "just in case." Each tool adds size and attack surface.

If you need a tool only during the build, install it in the builder stage. It won't appear in the final image.
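
As an illustration, here is roughly how that looks for a Node.js project with native addons, which need a compiler toolchain at install time but not at runtime. This is a sketch assuming an entry point of src/index.js and dependencies that build through node-gyp:

```dockerfile
FROM node:22-alpine AS builder
# Toolchain that node-gyp needs for native addons; it only exists in this stage
RUN apk add --no-cache python3 make g++
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM node:22-alpine
WORKDIR /app
# Compiled modules come across; python3, make, and g++ do not
COPY --from=builder /app/node_modules ./node_modules
COPY . .
CMD ["node", "src/index.js"]
```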

For Alpine, use --no-cache to skip caching the package index:

RUN apk add --no-cache curl

Language-Specific Tips

Node.js

  • npm ci --omit=dev (or yarn install --production on Yarn 1) to skip dev deps
  • If using a bundler, bundle into a single file and skip node_modules entirely
  • node-prune removes unnecessary files from node_modules
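
To make the bundler tip concrete, here is a sketch using esbuild as the bundler. It assumes esbuild is listed in devDependencies and that src/index.js is the entry point; neither comes from the project described above:

```dockerfile
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Assumes esbuild is a devDependency and src/index.js is the entry point
RUN npx esbuild src/index.js --bundle --platform=node --outfile=dist/index.js

FROM node:22-alpine
WORKDIR /app
# Only the bundle is copied; there is no node_modules in the runtime image
COPY --from=builder /app/dist/index.js ./index.js
CMD ["node", "index.js"]
```

This tends to break down for native addons or packages that load files from disk at runtime, so test the bundled output before relying on it.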

Python

  • pip install --no-cache-dir to skip wheel caching
  • Pin deps in requirements.txt, install only what's needed
  • Use pip install --target with multi-stage builds
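
The pip install --target tip can be sketched like this. The /deps directory and main.py are placeholder names picked for the example:

```dockerfile
FROM python:3.13-slim AS builder
WORKDIR /app
COPY requirements.txt .
# Install into a standalone directory (/deps is an arbitrary choice)
RUN pip install --no-cache-dir --target=/deps -r requirements.txt

FROM python:3.13-slim
WORKDIR /app
COPY --from=builder /deps /deps
# Make the copied packages importable
ENV PYTHONPATH=/deps
COPY . .
# main.py is a placeholder entry point
CMD ["python", "main.py"]
```

If the builder stage needs compilers for some wheels, they stay behind in that stage and only the installed packages move to the runtime image.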

Go

  • CGO_ENABLED=0 for static binaries that run on scratch/distroless
  • -ldflags="-s -w" strips debug info, cutting binary size by 20-30%
  • UPX can compress further (adds startup decompression time)
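
Putting the first two Go tips together, a minimal sketch that assumes the same project layout as the multi-stage example earlier:

```dockerfile
FROM golang:1.23 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# -s drops the symbol table, -w drops DWARF debug info
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /app/server .

FROM scratch
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]
```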

Java

  • jlink creates custom JREs with only the modules you need — cuts JRE from ~300MB to 30-50MB
  • Eclipse Temurin Alpine images as runtime base
  • Strip debug metadata from JARs
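
A rough sketch of the jlink approach on a Temurin Alpine base. The module list (java.base, java.logging) and app.jar are assumptions for illustration; running jdeps --print-module-deps against your own JAR should tell you which modules you actually need:

```dockerfile
FROM eclipse-temurin:21-jdk-alpine AS jre-builder
# jlink --strip-debug relies on objcopy, which comes from binutils
RUN apk add --no-cache binutils
# The module list is an assumption; adjust it to what your app uses
RUN $JAVA_HOME/bin/jlink \
    --add-modules java.base,java.logging \
    --strip-debug \
    --no-man-pages \
    --no-header-files \
    --output /javaruntime

FROM alpine:3.20
ENV JAVA_HOME=/opt/java
ENV PATH="${JAVA_HOME}/bin:${PATH}"
COPY --from=jre-builder /javaruntime $JAVA_HOME
# app.jar is a placeholder for your built artifact
COPY app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
```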

BuildKit Cache Mounts

Docker BuildKit has cache mounts that keep package data cached across builds without including it in the final layer:

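# Official Debian/Ubuntu images delete downloaded .deb files after each install;
# remove that hook so the cache mount has something to keep
RUN rm -f /etc/apt/apt.conf.d/docker-clean
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends build-essential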

Fast builds (cached packages reused) + clean images (cache not in the layer). Best of both worlds.
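
Cache mounts are not limited to APT. A sketch of the same idea for npm, assuming the build runs as root (so npm's cache lives in /root/.npm) and an entry point of src/index.js:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
# /root/.npm is npm's default cache directory when running as root
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
COPY . .
CMD ["node", "src/index.js"]
```

The downloaded tarballs persist between builds on the same builder, but they never become part of an image layer.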

Consider Distroless

Google's distroless images contain only your app and runtime deps. No shell, no package manager, no OS utilities. Small and secure.

FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]

Available for Java, Python, Node.js, Go. The tradeoff: you can't shell into the container for debugging. But for production, that's often acceptable.
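
For a non-Go example, here is a sketch of a Node.js service on a distroless runtime. It assumes a build script that emits dist/index.js. Note that the distroless Node.js image already uses node as its entrypoint, so CMD only names the script:

```dockerfile
FROM node:22 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Build, then drop devDependencies before copying node_modules across
RUN npm run build && npm prune --omit=dev

FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
# The image's entrypoint is already node, so only the script path goes here
CMD ["dist/index.js"]
```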

Measure What You've Got

You can't optimize what you don't measure:

  • docker image history <image> — shows size per layer
  • dive <image> — interactive layer explorer that identifies waste
  • docker scout — analyzes both size and vulnerabilities

I run dive after every optimization pass to find the next target.

A Real Before-and-After

Before (1.1GB):

FROM node:22
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "src/index.js"]

After (85MB):

FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
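# Install everything first: build tooling usually lives in devDependencies
RUN npm ci
COPY . .
# Build, then drop devDependencies and prune leftover files
RUN npm run build && npm prune --omit=dev && npx node-prune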

FROM node:22-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
EXPOSE 3000
CMD ["node", "dist/index.js"]

Alpine base, multi-stage build, production deps only, pruned node_modules. 92% reduction.

FAQ

Does Alpine cause compatibility issues?

Alpine uses musl instead of glibc. Most apps are fine, but some native modules (especially in Node and Python) can fail. If you hit issues, use -slim instead — still much smaller than the full image.

Will smaller images be slower?

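Generally no. Image size affects pull time and storage, not runtime performance. Your app runs the same regardless of how much extra OS tooling is bundled alongside it. The one caveat is the base swap itself: moving to Alpine changes the C library from glibc to musl, which can shift performance for some workloads, so benchmark if that matters to you.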

Can I use scratch for Python or Node.js?

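Not directly, since they need a runtime. Distroless images include just the interpreter and essentials. You can also bundle into standalone executables with pkg or PyInstaller, but those are usually dynamically linked against libc, so unless you produce a fully static build they typically need a minimal base with a C library rather than scratch.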

How do I debug distroless containers?

kubectl debug for ephemeral debug containers in K8s. Or maintain a separate debug variant with an Alpine base for troubleshooting.
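
A debug variant does not have to be a second Dockerfile. The sketch below uses distroless's own :debug tag (which adds a busybox shell) instead of an Alpine base, and switches to it with a build argument. RUNTIME_TAG is a name made up for this example:

```dockerfile
# RUNTIME_TAG is a hypothetical build argument; pass "debug" to get the
# distroless variant that includes a busybox shell.
ARG RUNTIME_TAG=latest

FROM golang:1.23 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server .

FROM gcr.io/distroless/static-debian12:${RUNTIME_TAG}
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]
```

Building with --build-arg RUNTIME_TAG=debug and then running the image with --entrypoint=sh should drop you into a shell, while the default build stays shell-free.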

Do compressed layers make size less important?

Registries transfer compressed layers, but uncompressed size still determines disk usage on every host. Cutting uncompressed size generally cuts compressed size too, though not always in proportion, since some content compresses far better than others.

How often should I rebuild base images?

Monthly minimum for security patches. Pin specific versions for reproducibility, and automate updates with Dependabot or Renovate.

What size should I target?

Rough guidelines: Go services under 20MB, Node/Python under 100-150MB, Java under 200MB. If you're way above these, there's room to optimize.
