I Tried Running WSL 2 on Windows 11 for AI Work. Here's Why I Gave Up.


I have a Windows 11 mini PC (AMD HX370, 96GB RAM) that I use for running embodied AI experiments — MuJoCo, Genesis simulations, and local LLM inference via LM Studio.

My daily driver for automation is an AI assistant (Hermes) running on a Linux server. I talk to it via Telegram. It works great — I say "run a cartpole demo" and it SSHes into my Windows machine and executes the code.

But I wanted more: I wanted the Windows machine to be self-automating. Respond to commands directly, without me acting as the SSH middleman every time.

My AI assistant needs Linux. My machine is Windows 11. The obvious answer: WSL 2.

This is the story of why WSL 2 didn't work, and why I eventually gave up on it.

The Setup: WSL 2 Looks Easy

Installing WSL 2 was straightforward:

wsl --install -d Ubuntu-24.04

Then inside WSL, install my AI assistant framework (Hermes Agent):

curl -fsSL https://hermes-agent.sh/install | bash
hermes setup

Configure Telegram bot token, start the gateway. Telegram replied: "Hermes Ready!"

It worked. I could send a command from my phone; WSL would receive it and execute it on Windows.

For 30 minutes, I thought this was the solution.

The Problem: LM Studio Drops to a Crawl

My Windows machine runs LM Studio — it loads a local LLM (GLM-4.7-Flash) for inference at about 24 tok/s. I use it for local generation tasks.

After WSL + Hermes had been running for a while, inference speed dropped:

24 tok/s → 18 → 12 → sometimes 5 tok/s

The slowdown was inconsistent, which is worse than being consistently slow: you can't plan around it.

The Root Cause: Windows Memory Integrity + VBS

Here's what was actually happening — and it's not what most WSL troubleshooting guides will tell you.

When you enable WSL 2 on Windows 11, you enable Hyper-V. This triggers Windows Device Security features:

  1. Core Isolation / Memory Integrity — a virtualization-based security (VBS) feature
  2. It creates a secure kernel running inside the Hyper-V hypervisor
  3. Every kernel-mode code access goes through hypervisor verification

This is what the security layer looks like:

┌────────────────────────────────────────────────┐
│                Windows 11 Host                 │
│                                                │
│  ┌───────────┐     ┌──────────────────────┐    │
│  │ LM Studio │     │   WSL 2 (Ubuntu)     │    │
│  │ ~24 tok/s │     │  ┌────────────────┐  │    │
│  │           │     │  │ Hermes Agent   │  │    │
│  └───────────┘     │  └────────────────┘  │    │
│        │           └──────────┬───────────┘    │
│        ▼                      ▼                │
│  ┌──────────────────────────────────────────┐  │
│  │     Memory Integrity / VBS               │  │
│  │     (Virtualization-Based Security)      │  │
│  │     Every memory access via hypervisor   │  │
│  └────────────────────┬─────────────────────┘  │
│                       │                        │
│                       ▼                        │
│                Physical Hardware               │
└────────────────────────────────────────────────┘

Memory Integrity is off by default. But here's the trap:

  1. You enable WSL 2 → you need virtualization → you go into BIOS to enable SVM Mode
  2. Windows notices virtualization is active → suggests enabling Memory Integrity for "enhanced security"
  3. You follow the prompt → enable it → reboot
  4. After reboot, LM Studio goes from 24 tok/s → 3-5 tok/s

Why this kills LLM inference:

LLM inference is a memory-bandwidth-bound workload. The model weights are large contiguous blocks loaded into RAM. Inference reads these weights repeatedly at high speed.

VBS + Memory Integrity wrap every memory access in hypervisor validation. For a normal application, this adds maybe 1-2% overhead. But for LLM inference — which is essentially a tight loop of reading giant memory blocks — the per-access validation cost compounds catastrophically.
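To get intuition for the scale of this effect, here is a toy back-of-envelope model. All numbers below are illustrative assumptions, not measurements from this machine: if each generated token streams the full weight set from RAM, token rate is roughly bandwidth divided by model size, and any fixed per-access overhead divides straight into that rate.

```python
# Toy model of memory-bandwidth-bound inference (illustrative numbers,
# not measurements). Each generated token reads roughly every weight once.
model_bytes = 9e9    # assume ~9 GB of quantized weights
bandwidth = 220e9    # assume ~220 GB/s effective DRAM bandwidth

baseline_toks = bandwidth / model_bytes  # upper bound, ~24 tok/s here

# A fixed per-access validation cost acts like a bandwidth divisor:
# 2% is a typical app-level overhead; larger values model a heavy VBS tax.
for overhead in (0.02, 2.0, 6.0):
    print(f"{overhead:>4.0%} overhead -> {baseline_toks / (1 + overhead):5.1f} tok/s")
```

The point isn't the exact numbers; it's that a normal application absorbs the overhead once per syscall or page fault, while inference pays it on a hot loop over tens of gigabytes, per token.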

| Metric | Before (Normal) | After (Memory Integrity On) |
| --- | --- | --- |
| LM Studio speed | ~24 tok/s | ~3-5 tok/s |
| CPU usage | 30-40% | 70-80% |
| System feel | Smooth | Sluggish |
| MuJoCo sim FPS | 60+ | 30-40 |

The Fix (and the Dilemma)

Turn off Memory Integrity:

Windows Security → Device Security → Core Isolation → Memory Integrity → OFF

Reboot. Performance comes back. But now you have:

⚠️ Your device is at risk. Memory integrity is off.

A permanent yellow warning in your security center. And the deeper dilemma:

  • Keep WSL 2 + disable Memory Integrity → WSL works, LM Studio works, but you constantly see a security warning. Plus WSL 2's virtualization layer is still active, even if Memory Integrity is off.
  • Remove WSL 2 entirely → No virtualization layer needed, no security warnings, LM Studio stays at 24 tok/s. But you can't run Linux-only apps.

I chose neither path directly. Instead, I found something I already had.

The Solution: Native Windows Telegram Bot

On my Windows machine, there was already a Telegram bot gateway running natively — OpenClaw Gateway. It runs directly on Windows, no WSL required:

{
  "telegram_bot_token": "...",
  "port": 18789,
  "commands": {
    "run_demo": "python D:/eembodied/examples/demo.py --headless"
  }
}

The Telegram bot T2-Bot responds to commands directly on Windows. It doesn't need Linux, doesn't need WSL, doesn't trigger Hyper-V or Memory Integrity.
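The core idea is just a whitelist of named commands mapped to shell invocations. Here's a minimal Python sketch of that dispatch logic — the config shape mirrors the example above, but the function names are hypothetical, not OpenClaw's actual API:

```python
import json
import shlex
import subprocess

# Minimal sketch of a command gateway's dispatch logic (hypothetical,
# not OpenClaw's real implementation). Config mirrors the JSON above.
CONFIG = json.loads("""
{
  "commands": {
    "echo_test": "echo hello-from-windows"
  }
}
""")

def dispatch(command_name: str) -> str:
    """Run a whitelisted command and return its output; refuse anything else."""
    cmdline = CONFIG["commands"].get(command_name)
    if cmdline is None:
        return f"unknown command: {command_name}"
    result = subprocess.run(shlex.split(cmdline),
                            capture_output=True, text=True, timeout=60)
    return result.stdout.strip() or result.stderr.strip()

print(dispatch("echo_test"))      # runs the whitelisted shell command
print(dispatch("rm_everything"))  # anything unlisted is refused
```

The whitelist is the important design choice: a bot reachable from Telegram should never pass arbitrary message text to a shell.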

Final architecture:

          Telegram
            |
     ┌──────┴───────┐
     |              |
  Hermes          T2-Bot
(Linux Server)  (OpenClaw on Windows)
     |              |
     |      ┌───────┴────────┐
     |      │ T2 (Windows)   │
     |      │ ├ MuJoCo       │
     |      │ ├ Genesis      │
     |      │ ├ LM Studio    │  ← Stable 24 tok/s
     |      │ └ OpenClaw     │
     |      └────────────────┘
     └──────► SSH (backup) ──┘

Now I tell my AI assistant "run a UR5e grasp demo" → Hermes dispatches to T2-Bot → T2 executes natively → results come back. LM Studio stays at 24 tok/s. No WSL involved.

Why WSL 2 Didn't Work for AI Workloads

This experience changed my perspective on WSL 2:

WSL 2 is great for CLI tools (grep, awk, git, ssh). It's not suitable for running actual Linux workloads alongside GPU/memory-intensive AI applications.

The fundamental issues:

  1. Windows security architecture conflicts with WSL 2. Enabling Hyper-V (required by WSL 2) triggers virtualization-based security features. These features destroy performance for memory-bandwidth-intensive workloads like LLM inference.

  2. AMD GPU users are second-class citizens. WSL 2 GPU acceleration only supports NVIDIA CUDA. If you have an AMD iGPU (like my Radeon 890M), you're out of luck.

  3. Resource control is opaque. The Vmmem process is a black box in Task Manager. You can set memory limits via .wslconfig but you can't guarantee real-time performance for critical applications.

  4. The file system tax. Accessing Windows files via /mnt/c/ is 10x slower than native ext4. For AI projects with large datasets, this matters.
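Point 4 is easy to check yourself. Here's a small Python sketch that times metadata-heavy small-file operations; run it once on a native ext4 path and once under /mnt/c/ and compare (your ratio will vary with hardware and WSL version):

```python
import os
import tempfile
import time

def time_small_file_ops(directory: str, n: int = 200) -> float:
    """Time n create/write/delete cycles of tiny files in `directory`.
    Small-file metadata operations are where the /mnt/c bridge hurts most."""
    start = time.perf_counter()
    for i in range(n):
        path = os.path.join(directory, f"bench_{i}.tmp")
        with open(path, "w") as f:
            f.write("x")
        os.remove(path)
    return time.perf_counter() - start

# Inside WSL, compare a native path with a Windows-mounted one, e.g.:
#   time_small_file_ops("/home/me/tmp")  vs.  time_small_file_ops("/mnt/c/Temp")
with tempfile.TemporaryDirectory() as d:
    print(f"{time_small_file_ops(d):.3f}s for 200 small-file cycles")
```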

Windows Has an AI Infrastructure Problem

This isn't just about WSL. Windows as a platform has structural problems for AI development in 2026:

On Linux:

Install NVIDIA driver → nvidia-smi → install CUDA → install PyTorch → run

No "device security", "core isolation", or "virtualization-based security" to fight with. Docker has native GPU passthrough.
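On Linux the whole stack is checkable in a few lines. A small probe, as a sketch (assumes nothing beyond an optional PyTorch install; it reports rather than crashing either way):

```python
# Sanity-check the Linux GPU stack described above.
def gpu_status() -> str:
    try:
        import torch
    except ImportError:
        return "PyTorch not installed yet"
    if torch.cuda.is_available():
        return f"CUDA OK: {torch.cuda.get_device_name(0)}"
    return "PyTorch installed, but no CUDA device visible"

print(gpu_status())
```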

On Windows:

  1. Install NVIDIA driver (if you have AMD GPU, sorry)
  2. Decide between WSL, native, or Docker Desktop (all have pitfalls)
  3. If using WSL: enable virtualization → trigger Hyper-V → trigger Memory Integrity
  4. Performance drops → disable security features → system warns you
  5. Tell yourself "this is fine"

Microsoft talks about AI constantly (Copilot, Azure AI). But the developer experience for running AI workloads on Windows has barely improved. In 2026, I had to:

  • Enter BIOS to enable virtualization
  • Configure Memory Integrity
  • Disable security features to restore performance
  • Accept permanent security warnings
  • Find a third-party native Windows tool (OpenClaw) for automation

This shouldn't be the norm.

What I Would Want from WSL 3

If Microsoft wants WSL to be a serious platform for AI development:

  1. Decouple virtualization from security. WSL should not force-enable VBS/Memory Integrity.
  2. GPU support beyond NVIDIA. AMD and Intel users deserve equal GPU acceleration in WSL.
  3. Granular resource controls. A proper UI panel for CPU/memory allocation, not .wslconfig incantations.
  4. Better networking defaults. NAT mode with automatic port forwarding baked in.
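For anyone who does stay on WSL 2 in the meantime, the existing knob is the .wslconfig file in your Windows user profile. The keys below are real; the values are illustrative for a 96GB machine, not recommendations:

```ini
; %UserProfile%\.wslconfig — caps the WSL 2 VM so Vmmem can't starve the host
[wsl2]
memory=16GB      ; hard cap on the VM's RAM
processors=4     ; vCPUs visible to Linux
swap=0           ; no swap file spillover
```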

Summary

| Step | What happened |
| --- | --- |
| Need | Auto-command the Windows machine from Telegram |
| Attempt | Install the AI assistant (Linux-only) via WSL 2 |
| Blocker 1 | BIOS virtualization disabled by default |
| Blocker 2 | Memory Integrity/VBS nukes LM Studio (24 → 3 tok/s) |
| Pivot | Disable Memory Integrity → restore performance |
| Dilemma | WSL 2 requires Hyper-V; full performance requires disabling security |
| Resolution | Abandon WSL; use a native Windows Telegram bot instead |
| Now | Phone command → T2 executes natively. LM Studio stable at 24 tok/s |

Final lesson: Native approaches beat nested approaches. OpenClaw runs on Windows directly. No intermediate layer. No performance tax. It was always there — I just didn't think to use it first.


Full course repository: bossman-lab/embodied-intelligence

Chinese version of this article: t2-wsl-journey.md
