DEV Community

Steven Hans

🎙️ I built a voice chat app because I wanted to talk to my son while we play online

It all started with something simple: my son and I wanted to communicate while playing online together. I tried the existing options (Discord, TeamSpeak, and others), but something always bothered me: ads, data collection, cluttered interfaces, or mediocre audio quality. I just wanted something secure, private, and free, with good audio quality. So I decided to build it myself.

The result is NoamVC: an open-source, peer-to-peer voice chat application with end-to-end encryption.

What makes it technically different?
🔊 Direct audio, no middlemen
Voice travels directly between participants using native WebRTC: no server processes or stores audio. The server only coordinates the initial connection (signaling).
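
To make that division of labor concrete, here is a minimal sketch of a signaling-only relay. The message shapes and the `SignalingRelay` class are hypothetical and are not NoamVC's actual protocol; the point is only that session descriptions and ICE candidates pass through the server, while audio never does.

```typescript
// Hypothetical signaling message shapes; NoamVC's real protocol may differ.
type SignalMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "ice"; from: string; to: string; candidate: string };

// A connected peer is modeled as an inbox the relay can push messages into.
type Inbox = SignalMessage[];

class SignalingRelay {
  private peers = new Map<string, Inbox>();

  join(peerId: string): Inbox {
    const inbox: Inbox = [];
    this.peers.set(peerId, inbox);
    return inbox;
  }

  // Forward a message to its addressee. The relay only sees connection
  // metadata: once offer, answer, and ICE candidates have been exchanged,
  // audio flows peer to peer and the relay is out of the path.
  relay(msg: SignalMessage): boolean {
    const target = this.peers.get(msg.to);
    if (!target || !this.peers.has(msg.from)) return false;
    target.push(msg);
    return true;
  }
}

const server = new SignalingRelay();
const aliceInbox = server.join("alice");
const bobInbox = server.join("bob");
server.relay({ kind: "offer", from: "alice", to: "bob", sdp: "v=0 ..." });
server.relay({ kind: "answer", from: "bob", to: "alice", sdp: "v=0 ..." });
```

In a browser client, the `sdp` strings would come from `RTCPeerConnection.createOffer()` and `createAnswer()`; they are placeholders here.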

🔒 Multi-layered security
  • E2E encryption with the Insertable Streams API and a 256-bit key derived via PBKDF2 (100K iterations, SHA-256)
  • Signaling signed with HMAC-SHA256, with the secret embedded in Rust and never exposed to JavaScript
  • Anti-replay protection with a 5-minute window
  • Storage in an encrypted IOTA Stronghold vault, with zero data in localStorage
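
The signing and anti-replay items can be sketched roughly as follows. This is an illustration only: NoamVC does this in Rust, and the envelope layout, the `sign`/`verify` helpers, and the demo secret are all assumptions. The sketch uses Node's built-in `crypto` module so that it stays self-contained.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical envelope: the payload plus a timestamp and a signature.
interface SignedEnvelope {
  payload: string;
  timestampMs: number;
  signature: string; // hex-encoded HMAC-SHA256
}

const REPLAY_WINDOW_MS = 5 * 60 * 1000; // the 5-minute window from the post

function computeSignature(secret: string, payload: string, timestampMs: number): Buffer {
  // The timestamp is covered by the MAC so it cannot be refreshed by an attacker.
  return createHmac("sha256", secret).update(`${timestampMs}.${payload}`).digest();
}

function sign(secret: string, payload: string, nowMs: number): SignedEnvelope {
  return {
    payload,
    timestampMs: nowMs,
    signature: computeSignature(secret, payload, nowMs).toString("hex"),
  };
}

function verify(secret: string, env: SignedEnvelope, nowMs: number): boolean {
  // Reject anything outside the replay window, in either direction.
  if (Math.abs(nowMs - env.timestampMs) > REPLAY_WINDOW_MS) return false;
  const expected = computeSignature(secret, env.payload, env.timestampMs);
  const received = Buffer.from(env.signature, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

const secret = "demo-secret"; // placeholder, not a real key
const t0 = 1700000000000;
const env = sign(secret, '{"type":"join","room":"abc"}', t0);
const fresh = verify(secret, env, t0 + 60 * 1000);        // inside the window
const stale = verify(secret, env, t0 + 6 * 60 * 1000);    // outside the window
const tampered = verify(secret, { ...env, payload: "x" }, t0);
```

A timestamp check on its own only bounds how long a captured message stays valid. Whether NoamVC also tracks nonces or message IDs inside the window is not stated in the post.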

🎧 Optimized Opus codec
32 kbps, mono, with Forward Error Correction (FEC), Discontinuous Transmission (DTX), and 20 ms frames. At that bitrate each 20 ms frame carries about 80 bytes of audio, which keeps bandwidth low without sacrificing speech clarity.
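
One common way to request these settings in WebRTC is to edit the Opus `fmtp` line of the SDP before applying it. The sketch below assumes that approach; the post does not say how NoamVC configures the codec, and `tuneOpus` is a hypothetical helper. The parameter names come from the Opus RTP payload format (RFC 7587). The frame duration is not set because 20 ms is already the usual Opus default.

```typescript
// Opus settings from the post, expressed as SDP fmtp keys (RFC 7587).
const OPUS_PARAMS: Record<string, string> = {
  maxaveragebitrate: "32000", // 32 kbps
  stereo: "0",                // mono
  useinbandfec: "1",          // Forward Error Correction
  usedtx: "1",                // Discontinuous Transmission
};

// Rewrite the Opus fmtp line of an SDP blob, keeping existing keys that are
// not overridden. Returns the SDP unchanged if Opus is not offered.
function tuneOpus(sdp: string): string {
  const rtpmap = sdp.match(/a=rtpmap:(\d+) opus\/48000\/2/i);
  if (!rtpmap) return sdp;
  const pt = rtpmap[1]; // dynamic payload type, commonly 111
  const fmtpRe = new RegExp(`a=fmtp:${pt} ([^\\r\\n]*)`);
  const existing = sdp.match(fmtpRe);
  const current: Record<string, string> = {};
  if (existing) {
    for (const pair of existing[1].split(";")) {
      const [key, value] = pair.trim().split("=");
      if (key) current[key] = value ?? "";
    }
  }
  const merged: Record<string, string> = { ...current, ...OPUS_PARAMS };
  const line =
    `a=fmtp:${pt} ` + Object.keys(merged).map((k) => `${k}=${merged[k]}`).join(";");
  return existing
    ? sdp.replace(fmtpRe, line)
    : sdp.replace(rtpmap[0], `${rtpmap[0]}\r\n${line}`);
}

const sampleSdp = [
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "a=fmtp:111 minptime=10;useinbandfec=1",
].join("\r\n");
const tunedSdp = tuneOpus(sampleSdp);

// Payload size per frame: 32,000 bits/s over 20 ms, converted to bytes.
const bytesPerFrame = (32000 * 20) / 8 / 1000;
```

The per-frame figure excludes RTP, SRTP, and UDP/IP headers, so the on-the-wire rate is somewhat higher than 32 kbps.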

🖥️ Native desktop app
Built with Tauri 2 (Rust) for macOS (Apple Silicon + Intel), Windows, and Linux. It's not a heavy Electron app; it's a real native app, lightweight and signed.

⚡ Modern stack
React 19 · TypeScript 5.9 · Vite 7 · Tailwind CSS 4 · NestJS 11 · Zustand · shadcn/ui

🎨 Polished design
Dark theme with reactive animated backgrounds that respond when someone speaks: SVG waves, equalizer bars, floating particles, and pulse rings. Real-time speech detection uses a 256-point FFT analyzer, and a latency indicator shows live RTT.
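
As a rough illustration of the speech detection, the sketch below assumes a Web Audio `AnalyserNode` with `fftSize = 256` (which yields 128 frequency bins) and a simple average-level threshold. The threshold value and the `isSpeaking` helper are hypothetical; the post does not describe the actual detection rule.

```typescript
// With fftSize = 256, an AnalyserNode exposes 128 frequency bins,
// each a byte from 0 (silence) to 255 (full scale).
const FFT_SIZE = 256;
const BIN_COUNT = FFT_SIZE / 2;

// Hypothetical threshold; a real app would tune this per microphone.
const SPEAKING_THRESHOLD = 30;

// Average the bin magnitudes and compare the result with the threshold.
function isSpeaking(bins: Uint8Array, threshold: number = SPEAKING_THRESHOLD): boolean {
  if (bins.length === 0) return false;
  let sum = 0;
  for (let i = 0; i < bins.length; i++) sum += bins[i];
  return sum / bins.length > threshold;
}

// In the browser the bins would come from something like:
//   const analyser = audioContext.createAnalyser();
//   analyser.fftSize = FFT_SIZE;
//   const bins = new Uint8Array(analyser.frequencyBinCount);
//   analyser.getByteFrequencyData(bins);
const silentFrame = new Uint8Array(BIN_COUNT);        // all zeros
const loudFrame = new Uint8Array(BIN_COUNT).fill(90); // synthetic loud frame
const silentResult = isSpeaking(silentFrame);
const loudResult = isSpeaking(loudFrame);
```

Running this check on every animation frame would be enough to drive the visual effects; smoothing over a few frames would avoid flicker on short pauses.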

🌐 Bilingual
Interface in Spanish and English with automatic system language detection.
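
Automatic detection with two supported languages can be as small as the sketch below. `pickLocale` is a hypothetical helper, and falling back to English for other languages is an assumption; the post does not say which default NoamVC uses.

```typescript
type Locale = "es" | "en";

// Map a system language tag such as "es-MX" or "en_US" to a supported locale.
// Anything that is not Spanish falls back to English (an assumption).
function pickLocale(systemLocale: string | undefined): Locale {
  const primary = (systemLocale ?? "").toLowerCase().split(/[-_]/)[0];
  return primary === "es" ? "es" : "en";
}

const fromMexico = pickLocale("es-MX");
const fromFrance = pickLocale("fr-FR");
```

In a browser or Tauri webview, the input would typically be `navigator.language`.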

I really enjoyed building this project, and I learned a ton along the way. I hope you find it useful, and I'd welcome your feedback so I can keep improving it.

🔗 Project: https://noamvc.web.app/

#OpenSource #WebRTC #Tauri #React #Rust #TypeScript #VoiceChat #P2P #Encryption #DevDad #BuildInPublic

Top comments (3)

chovy

Really solid crypto choices here. PBKDF2 with 100K iterations for the key derivation is the right call, and using Insertable Streams for E2EE on WebRTC is still underutilized: most P2P voice apps skip that layer entirely.

One thing I've been thinking about a lot: the classical crypto primitives (ECDH for key exchange, SHA-256, etc.) are going to need post-quantum upgrades eventually. Harvest-now-decrypt-later attacks mean that even voice signaling metadata could be valuable to store and crack later.

I've been building something in the same privacy-first space: qrypt.chat, a text chat focused on quantum-resistant encryption from day one. Different use case (text vs voice) but the same fundamental problem: making strong crypto invisible to users.

Your Tauri + Rust approach is interesting for this too. Rust's memory safety is a genuine security advantage over Electron for anything handling crypto. Nice work shipping this.

Steven Hans

Thanks for the detailed technical analysis! I'm glad the cryptography decisions resonate with someone who clearly understands the space.

On the post-quantum point: totally agree. The "harvest now, decrypt later" scenario is real and something I have on my radar. Today NoamVC uses ECDH + PBKDF2-SHA256 (100K iterations) + Insertable Streams for frame-by-frame E2EE, but the modular architecture allows the key exchange and derivation layer to be swapped out without touching the rest of the audio pipeline. When post-quantum primitives (ML-KEM, formerly Kyber) are more mature and stabilized in WebCrypto, migrating the key exchange will be a surgical change.
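
A swappable layer like the one described here might look like the sketch below. The `KeyDeriver` interface, the salt string, and the room ID are hypothetical; only the PBKDF2 parameters (SHA-256, 100K iterations, 256-bit output) come from the thread. Node's `pbkdf2Sync` stands in for whatever the app actually calls.

```typescript
import { pbkdf2Sync } from "node:crypto";

// The seam: anything that turns shared input into a 32-byte key.
// The rest of the pipeline would depend only on this interface.
interface KeyDeriver {
  deriveKey(secret: string, salt: string): Buffer;
}

const ITERATIONS = 100000; // 100K iterations, as quoted in the thread
const KEY_BYTES = 32;      // 256 bits

const pbkdf2Deriver: KeyDeriver = {
  deriveKey(secret: string, salt: string): Buffer {
    return pbkdf2Sync(secret, salt, ITERATIONS, KEY_BYTES, "sha256");
  },
};

// Callers never name the algorithm, so a different deriver can be
// passed in later without touching them.
function roomKey(deriver: KeyDeriver, roomId: string): Buffer {
  return deriver.deriveKey(roomId, "demo-salt"); // salt is a placeholder
}

const keyA = roomKey(pbkdf2Deriver, "room-1234");
const keyB = roomKey(pbkdf2Deriver, "room-5678");
```

Replacing `pbkdf2Deriver` with another implementation of `KeyDeriver` is the kind of localized change the reply has in mind.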

Since the last post, several important improvements have been implemented:

  • Signaling server rewritten in Rust: migrated from NestJS to Axum + Socketioxide. Same protocol, but now with strict payload validation, sliding window rate limiting (100 events/10s), and a 5-minute anti-replay window, all backed by Rust's memory safety guarantees.
  • Room admission system: room creators now approve or reject each participant before they join. WebRTC is not negotiated until the host explicitly admits them.
  • Storage migrated to SQLite: replaced IOTA Stronghold with tauri-plugin-sql (SQLite). It is more stable and lightweight, and it eliminates a heavy dependency without sacrificing local storage security.
  • Firebase SDK removed from the client: bug reporting now uses the Firestore REST API directly (228 KB → 6 KB bundle), with the API key embedded in the Rust binary via env!() at compile time and never exposed to the frontend.
  • E2EE text chat: messages encrypted with AES-256-GCM via P2P DataChannels, using the same key derived from the room ID.
  • Full CI/CD: GitHub Actions generates signed and notarized builds for macOS (ARM + Intel) and Windows, with automatic in-app updates verified via Ed25519 signatures.
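
For readers unfamiliar with the rate limiting mentioned in the first item, here is a minimal sliding-window sketch using the quoted numbers (100 events per 10 seconds). The real server is written in Rust; this TypeScript class and its names are illustrative assumptions only.

```typescript
// Sliding-window limiter: allow at most `limit` events in any `windowMs` span.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the event if it is allowed, false otherwise.
  allow(nowMs: number): boolean {
    // Drop events that have slid out of the window.
    const cutoff = nowMs - this.windowMs;
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}

const limiter = new SlidingWindowLimiter(100, 10000);
let accepted = 0;
for (let i = 0; i < 150; i++) {
  // 150 events arriving one millisecond apart
  if (limiter.allow(1000 + i)) accepted++;
}
// More than ten seconds later the window has emptied again.
const allowedAfterWindow = limiter.allow(1000 + 10000 + 200);
```

A server would keep one limiter per connection, so a noisy client cannot exhaust the budget of others.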

I completely agree that Rust + Tauri is a real advantage over Electron for systems handling cryptography. It's not just memory safety: the ability to embed secrets in the binary at compile time and keep the entire HMAC-SHA256 signing layer in Rust, inaccessible from JavaScript, is a level of isolation that Electron simply cannot offer.

I'll take a look at qrypt.chat; the quantum-first approach to text is interesting. Different problems, same philosophy: making robust cryptography invisible to the user.

Ankush Banyal

Love this. Building it for your son makes it even more special.

Technically very solid approach with pure P2P WebRTC and E2E encryption. Super clean privacy-first mindset.

If you ever decide to scale beyond small peer groups (larger rooms, recording, moderation, etc.), you might eventually need an SFU-based setup instead of pure mesh. That's where something like Ant Media Server can help handle low-latency real-time audio at scale.

But honestly, for what you built, this is awesome work. Respect.