I wrote a Dockerfile linter and ran it against 100 popular open-source Dockerfiles from GitHub.
The results? Five mistakes each showed up in more than half of them.
The Methodology
I grabbed Dockerfiles from repos with 1,000+ stars across different languages (Python, Node.js, Go, Java), ran my linter on each one, and categorized every issue.
Mistake #1: Using :latest Tag (73% of Dockerfiles)
# Bad
FROM python:latest
# Good
FROM python:3.11-slim
Why it matters: :latest is a moving target. Your build works today, breaks tomorrow when the base image updates. I've seen production outages from this exact issue.
The fix: Always pin to a specific version. Slim or Alpine variants also cut image size considerably, often to a fraction of the full image.
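Here's a minimal sketch of how a check for this could look in Python. It is not the linter's actual code: the function name and the rules (skip `scratch`, skip digest-pinned images, skip references to earlier build stages) are my assumptions for illustration.

```python
import re

# Hypothetical helper: flag FROM lines whose image has no tag or uses "latest".
FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
ALIAS_RE = re.compile(r"\s+AS\s+(\S+)", re.IGNORECASE)


def unpinned_base_images(dockerfile_text):
    """Return (line_number, image) pairs for base images that are not pinned."""
    findings = []
    stage_names = set()
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group(1)
        # "FROM build" may refer to an earlier "FROM ... AS build" stage.
        is_stage_ref = image.lower() in stage_names
        alias = ALIAS_RE.search(line)
        if alias:
            stage_names.add(alias.group(1).lower())
        if image.lower() == "scratch" or is_stage_ref or "@sha256:" in image:
            continue
        # Look for the tag only after the last "/", so a registry port such as
        # localhost:5000/app is not mistaken for a tag.
        name_part = image.rsplit("/", 1)[-1]
        tag = name_part.split(":", 1)[1] if ":" in name_part else None
        if tag is None or tag == "latest":
            findings.append((number, image))
    return findings
```

An image with no tag at all is treated the same as `:latest`, because that is what Docker pulls when the tag is omitted.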
Mistake #2: Running as Root (68%)
# Bad: runs everything as root
FROM node:20
COPY . /app
# WORKDIR so that node can find /app/server.js
WORKDIR /app
CMD ["node", "server.js"]
# Good: creates and uses a non-root user
FROM node:20-slim
RUN groupadd -r app && useradd -r -g app app
COPY --chown=app:app . /app
WORKDIR /app
USER app
CMD ["node", "server.js"]
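Why it matters: By default, root inside the container is the same UID 0 as root on the host. If an attacker breaks out of the container, for example through a kernel or runtime vulnerability, they land on the host as root. A non-root user also limits what a compromised process can do inside the container.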
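A check for this can be as simple as tracking the last USER instruction in the final build stage. The sketch below is my own illustration, not the linter's real implementation, and it assumes the base image defaults to root (some images already switch to a non-root user, which this would report as a false positive).

```python
def runs_as_root(dockerfile_text):
    """Return True if the final stage never switches away from root."""
    user = None  # no USER instruction: assume the base image default, usually root
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if not parts:
            continue
        keyword = parts[0].upper()
        if keyword == "FROM":
            user = None  # every build stage starts over with its base image's user
        elif keyword == "USER" and len(parts) > 1:
            user = parts[1].split(":", 1)[0]  # drop an optional ":group" suffix
    return user in (None, "root", "0")
```

Only the last USER counts, so a Dockerfile that switches to `app` and then back to `root` is still flagged.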
Mistake #3: No Layer Caching Strategy (61%)
# Bad — reinstalls dependencies every time code changes
FROM python:3.11-slim
COPY . /app
RUN pip install -r /app/requirements.txt
# Good — dependencies cached unless requirements.txt changes
FROM python:3.11-slim
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r /app/requirements.txt
COPY . /app
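Why it matters: Docker caches layers, and a changed layer invalidates every layer after it. If you copy all your code before installing dependencies, changing one line of code invalidates the dependency layer, so every build reinstalls everything. A rebuild that should take seconds can take minutes.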
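To show how the ordering problem can be detected, here's a rough Python sketch. It is an illustration under my own assumptions, not the linter's code: the list of install commands is made up and incomplete, and only the shell form of COPY is handled.

```python
import re

# Assumed list of dependency-install commands; extend it for other package managers.
INSTALL_RE = re.compile(r"\b(pip3? install|npm (ci|install)|yarn install|go mod download)\b")


def copies_source_before_install(dockerfile_text):
    """Return the line number of an install RUN that follows a broad COPY, else None."""
    copied_everything = False
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        parts = line.strip().split()
        if not parts:
            continue
        keyword = parts[0].upper()
        if keyword == "FROM":
            copied_everything = False  # a new stage starts with a clean slate
        elif keyword in ("COPY", "ADD"):
            # Everything between the flags and the last argument is a source path.
            sources = [p for p in parts[1:-1] if not p.startswith("--")]
            if any(src in (".", "./") for src in sources):
                copied_everything = True
        elif keyword == "RUN" and copied_everything and INSTALL_RE.search(line):
            return number
    return None
```

Run against the two Dockerfiles above, it returns line 3 for the bad version and `None` for the good one, because the good version copies only `requirements.txt` before the install step.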
Mistake #4: apt-get Without Cleanup (54%)
# Bad: leaves cache in the image
RUN apt-get update && apt-get install -y curl wget
# Good: clean up in the same layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl wget && \
    rm -rf /var/lib/apt/lists/*
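Why it matters: The package lists downloaded by apt-get update typically add tens of megabytes to your image, and recommended packages you did not ask for can add far more. Since Docker layers are additive, cleaning up in a separate RUN does not shrink the image: the data is already stored in a previous layer.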
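Because the cleanup has to happen in the same RUN, a check needs to look at each RUN as one logical line, including backslash continuations. Here's a minimal sketch of that idea. The function names are hypothetical, and it only looks for the exact `rm -rf /var/lib/apt/lists` pattern.

```python
def join_continuations(dockerfile_text):
    """Yield (first_line_number, logical_line) with backslash continuations merged."""
    buffer, start = [], None
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        if start is None:
            start = number
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            buffer.append(stripped[:-1])
            continue
        buffer.append(stripped)
        yield start, " ".join(part.strip() for part in buffer)
        buffer, start = [], None
    if buffer:  # the file ended on a dangling backslash
        yield start, " ".join(part.strip() for part in buffer)


def apt_without_cleanup(dockerfile_text):
    """Return the starting line numbers of RUN instructions that leave apt lists behind."""
    findings = []
    for number, logical in join_continuations(dockerfile_text):
        if not logical.upper().startswith("RUN "):
            continue
        uses_apt = "apt-get update" in logical or "apt-get install" in logical
        cleans_up = "rm -rf /var/lib/apt/lists" in logical
        if uses_apt and not cleans_up:
            findings.append(number)
    return findings
```

A cleanup in a later RUN does not clear the finding, which matches how layers work: the earlier layer still contains the files.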
Mistake #5: No HEALTHCHECK (82%)
# Bad: Docker has no idea if your app is actually working
CMD ["python", "server.py"]
# Good: Docker can detect an unhealthy container (curl must be installed in the image)
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1
CMD ["python", "server.py"]
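Why it matters: Without a healthcheck, Docker only knows whether the process is running. Your app could be deadlocked, out of memory, or returning 500s, and the container would still show as up. With a HEALTHCHECK, Docker marks the container unhealthy once the configured retries fail. Plain Docker does not restart it on its own, but Swarm and other tooling can act on that status.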
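One catch with the curl-based check: slim Python images do not ship curl. If your app is already Python, a small probe script can do the same job with the standard library. This is a sketch under assumptions: the `/health` path and port 8080 come from the example above, and the function names are my own.

```python
import http.client
import urllib.request


def is_healthy(url, timeout=3.0):
    """Return True if the endpoint answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses such as 500.
        return False


def main(url="http://localhost:8080/health"):
    """Return a process exit code: 0 for healthy, 1 for unhealthy."""
    return 0 if is_healthy(url) else 1
```

Saved as a hypothetical healthcheck.py that ends with `sys.exit(main())`, it could replace the curl command in the HEALTHCHECK line, for example `CMD python /app/healthcheck.py`.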
The Full Results
| Issue | Occurrence | Severity |
|---|---|---|
| No HEALTHCHECK | 82% | Medium |
| Using :latest | 73% | High |
| Running as root | 68% | High |
| No caching strategy | 61% | Medium |
| No apt cleanup | 54% | Medium |
| No .dockerignore | 47% | Medium |
| ADD instead of COPY | 31% | Low |
| Secrets in ENV | 12% | Critical |
Automate It
I open-sourced the linter. Run it on your Dockerfiles:
git clone https://github.com/spinov001-art/dockerfile-linter
cd dockerfile-linter
python linter.py Dockerfile
Output:
🔴 HIGH: Line 1 — Using :latest tag
Fix: Pin to specific version (e.g., python:3.11-slim)
🔴 HIGH: Running as root (no USER instruction)
Fix: Add USER nonroot before CMD
🟡 MEDIUM: Line 8 — apt-get without cleanup
Fix: Add && rm -rf /var/lib/apt/lists/*
Score: 62/100
Add to CI:
- name: Lint Dockerfile
  run: python linter.py Dockerfile --fail-on high
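For CI, the important part is the exit code: a non-zero exit fails the step. Here's a sketch of how a threshold like --fail-on could be turned into one. This is my assumption about the logic, not the linter's actual code. The severity names come from the results table, while the ordering and the 0/1 exit codes are illustrative.

```python
# Severity names mirror the results table; the ranking is an assumption.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def exit_code(finding_severities, fail_on="high"):
    """Return 1 if any finding is at or above the fail_on threshold, else 0."""
    threshold = SEVERITY_ORDER[fail_on.lower()]
    worst = max((SEVERITY_ORDER[s.lower()] for s in finding_severities), default=-1)
    return 1 if worst >= threshold else 0
```

With the sample output above (two HIGH findings and one MEDIUM), `exit_code(["HIGH", "HIGH", "MEDIUM"])` returns 1, so the CI step would fail.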
What Dockerfile mistakes have bitten you in production? I'd love to hear your war stories.
Follow for more DevOps and security content.