Stefano Viana
How I Built an AI Trading Bot That Predicts Crypto with 84.6% Accuracy (and Open-Sourced It)

The Problem

I've been trading crypto for years. Like most traders, I was losing more than I was winning. The market moves too fast for human decision-making — by the time you analyze a chart, the opportunity is gone.

So I did what any developer would do: I tried to automate it.

The First Attempt: Tree-Based Models (54% accuracy)

I started with the classic ML approach:

  • XGBoost + LightGBM ensemble
  • 72 features (RSI, MACD, Bollinger Bands, volume ratios, etc.)
  • Walk-forward validation (no data leakage!)

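For reference, the baseline looked roughly like this. A minimal sketch with synthetic stand-in data, not the repo's training code:

import numpy as np
import xgboost as xgb
import lightgbm as lgb

# Synthetic stand-ins for the 72 engineered features and UP/DOWN labels
X_train, y_train = np.random.randn(1000, 72), np.random.randint(0, 2, 1000)
X_test = np.random.randn(200, 72)

xgb_model = xgb.XGBClassifier(n_estimators=500, max_depth=6).fit(X_train, y_train)
lgb_model = lgb.LGBMClassifier(n_estimators=500, max_depth=6).fit(X_train, y_train)

# Ensemble: average the two models' P(UP)
p_up = (xgb_model.predict_proba(X_test)[:, 1] +
        lgb_model.predict_proba(X_test)[:, 1]) / 2
preds = (p_up > 0.5).astype(int)  # 1 = UP, 0 = DOWN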
The result? 54% accuracy on out-of-sample data.

That's barely better than a coin flip. And with trading fees eating 0.1% per trade, 54% actually loses money.

I tried everything:

  • Feature engineering (72 → 100+ features)
  • Hyperparameter tuning (Optuna, 500 trials)
  • CatBoost added to the ensemble
  • Different timeframes (15m, 1h, 4h)

Best I could squeeze out: 60%. Still not enough.

The Breakthrough: LSTM + Attention (84.6%)

Then I rented a GPU on RunPod ($0.44/hour, A40) and tried something different: a Bidirectional LSTM with Multi-Head Attention.

Why LSTM? Because financial markets have temporal dependencies. A candle 30 hours ago affects what happens now. Tree-based models treat each data point independently — they can't learn these sequences.

Architecture

import torch
import torch.nn as nn

class LSTMPredictor(nn.Module):
    def __init__(self, input_size=19, hidden_size=128, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size, hidden_size, num_layers,
            batch_first=True, bidirectional=True, dropout=0.3
        )
        self.attention = nn.MultiheadAttention(
            hidden_size * 2, num_heads=4, batch_first=True
        )
        self.fc = nn.Sequential(
            nn.Linear(hidden_size * 2, 64),
            nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(64, 2)  # UP or DOWN
        )

    def forward(self, x):
        lstm_out, _ = self.lstm(x)                        # (batch, seq, hidden*2)
        attn_out, _ = self.attention(lstm_out, lstm_out, lstm_out)
        return self.fc(attn_out[:, -1])                   # classify from the final timestep

Key decisions:

  • 19 features (not 72 — less is more with LSTM)
  • 64-hour sequences (the model "sees" nearly 3 days of history)
  • Bidirectional (learns patterns forward and backward)
  • Multi-head attention (focuses on the most important timesteps)
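To make the 64-hour sequences concrete, here is a minimal windowing sketch; the helper name and shapes are my illustration, not code from the repo:

import numpy as np

def make_sequences(features, labels, seq_len=64):
    """Slice a (T, 19) feature matrix into (N, seq_len, 19) windows,
    each labeled by the candle that follows its window."""
    X, y = [], []
    for i in range(seq_len, len(features)):
        X.append(features[i - seq_len:i])
        y.append(labels[i])
    return np.stack(X), np.array(y)

# e.g. one year of hourly candles: (8760, 19) -> X of shape (8696, 64, 19)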

Training Tricks That Mattered

  1. Data Augmentation

# Add noise to prevent overfitting
noise = np.random.normal(0, 0.01, features.shape)
augmented = features + noise

# Time shift: offset sequences by 1-3 candles
shift = np.random.randint(1, 4)
shifted = features[shift:]

  2. Snapshot Ensemble

Instead of using the single best model, I saved the top 3 checkpoints and averaged their predictions. This reduces variance significantly:

  • Snapshot 1: 84.6%
  • Snapshot 2: 84.0%
  • Snapshot 3: 84.1%
  • Ensemble: 84.6%
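In code, the snapshot ensemble is just an average over checkpoints. A minimal sketch, assuming each snapshot was saved with torch.save(model.state_dict(), path) and the paths are hypothetical:

import torch

ckpt_paths = ["snapshot1.pt", "snapshot2.pt", "snapshot3.pt"]  # hypothetical paths
models = []
for path in ckpt_paths:
    model = LSTMPredictor()
    model.load_state_dict(torch.load(path))
    model.eval()
    models.append(model)

@torch.no_grad()
def ensemble_predict(x):
    # Average the softmax probabilities across all snapshots
    probs = torch.stack([torch.softmax(m(x), dim=-1) for m in models])
    return probs.mean(dim=0)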

  3. Walk-Forward Validation

This is critical. Most "90% accuracy" claims use random train/test splits, which leak future data into training. Walk-forward validation is temporal:

Window 1: Train [Jan-Jun] → Test [Jul-Aug] = 70.3%
Window 2: Train [Mar-Aug] → Test [Sep-Oct] = 72.2%
Window 3: Train [May-Oct] → Test [Nov-Dec] = 85.0%
Window 4: Train [Jul-Dec] → Test [Jan-Feb] = 83.8%
Average: 77.8%

The model genuinely improved on recent data — it's not just memorizing patterns.
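A generic walk-forward splitter takes only a few lines. This sketch yields index slices; the window lengths below are illustrative counts of hourly candles, not the exact splits above:

def walk_forward_windows(n_samples, train_len, test_len, step):
    """Yield (train, test) slices that only move forward in time,
    so no future candle ever leaks into training."""
    start = 0
    while start + train_len + test_len <= n_samples:
        train = slice(start, start + train_len)
        test = slice(start + train_len, start + train_len + test_len)
        yield train, test
        start += step

# ~6 months train, ~2 months test, sliding 2 months per window
for train_idx, test_idx in walk_forward_windows(8760, 4380, 1460, 1460):
    ...  # fit on features[train_idx], evaluate on features[test_idx]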

The 19 Features

After testing 72+ features, these 19 survived:

┌───────┬───────────────────────────┬───────────────────────────┐
│ # │ Feature │ Why It Works │
├───────┼───────────────────────────┼───────────────────────────┤
│ 1-4 │ Returns (1h, 3h, 5h, 10h) │ Multi-scale momentum │
├───────┼───────────────────────────┼───────────────────────────┤
│ 5-6 │ Volatility (10h, 20h) │ Risk regime detection │
├───────┼───────────────────────────┼───────────────────────────┤
│ 7 │ RSI (14) │ Overbought/oversold │
├───────┼───────────────────────────┼───────────────────────────┤
│ 8 │ MACD │ Trend strength │
├───────┼───────────────────────────┼───────────────────────────┤
│ 9 │ Bollinger Width │ Squeeze detection │
├───────┼───────────────────────────┼───────────────────────────┤
│ 10 │ ATR │ Volatility-adjusted stops │
├───────┼───────────────────────────┼───────────────────────────┤
│ 11 │ Volume Ratio │ Volume breakouts │
├───────┼───────────────────────────┼───────────────────────────┤
│ 12-14 │ EMA distance (9, 21, 50) │ Trend position │
├───────┼───────────────────────────┼───────────────────────────┤
│ 15 │ Candle Body Ratio │ Price action │
├───────┼───────────────────────────┼───────────────────────────┤
│ 16 │ Z-Score │ Mean reversion signal │
├───────┼───────────────────────────┼───────────────────────────┤
│ 17 │ Log Volume │ Normalized volume │
├───────┼───────────────────────────┼───────────────────────────┤
│ 18-19 │ Hour/Day encoding │ Time patterns │
└───────┴───────────────────────────┴───────────────────────────┘
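To show the flavor, here are a few of these features in pandas. The exact periods and the sin/cos time encoding are my assumptions; the full list is in the repo:

import numpy as np
import pandas as pd

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """df: hourly OHLCV with a DatetimeIndex and columns close, volume."""
    out = pd.DataFrame(index=df.index)
    for h in (1, 3, 5, 10):                           # multi-scale momentum
        out[f"ret_{h}h"] = df["close"].pct_change(h)
    out["vol_10h"] = out["ret_1h"].rolling(10).std()  # risk regime
    roll = df["close"].rolling(20)
    out["z_score"] = (df["close"] - roll.mean()) / roll.std()  # mean reversion
    out["log_volume"] = np.log1p(df["volume"])        # normalized volume
    hour = df.index.hour
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24)   # cyclical hour encoding
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24)
    return out.dropna()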

From Model to Live Trading Bot

Having a good model is only half the battle. Turning it into a live trading system required:

Risk Management

  • ATR-based stop-loss (adapts to volatility)
  • Partial close at +1.5% and +3% (lock in profits)
  • Trailing stop (let winners run)
  • Max 3 concurrent positions (diversification)
  • Auto-unstuck (graduated exit from losing trades)
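The ATR stop is what lets the stop-loss adapt to volatility. A sketch, with the 2x multiplier as an assumed parameter:

import pandas as pd

def atr(df: pd.DataFrame, period: int = 14) -> pd.Series:
    """Average True Range from OHLC columns high, low, close."""
    prev_close = df["close"].shift(1)
    true_range = pd.concat([
        df["high"] - df["low"],
        (df["high"] - prev_close).abs(),
        (df["low"] - prev_close).abs(),
    ], axis=1).max(axis=1)
    return true_range.rolling(period).mean()

def atr_stop(entry: float, atr_value: float, side: str, mult: float = 2.0) -> float:
    """Wider stop in volatile regimes, tighter stop in calm ones."""
    return entry - mult * atr_value if side == "long" else entry + mult * atr_value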

Infrastructure

  • Bybit API with limit orders (0.02% maker fee vs 0.055% taker)
  • Exchange-side stop-loss (safety net if bot crashes)
  • PM2 process management with auto-restart
  • PostgreSQL for full state persistence (survives restarts)
  • Telegram notifications (public + private channels)
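Since the stack uses ccxt, placing a maker-only limit order looks roughly like this. The symbol, price, and postOnly param are illustrative, not the bot's exact order flow:

import ccxt

exchange = ccxt.bybit({"apiKey": "YOUR_KEY", "secret": "YOUR_SECRET"})

# Post-only limit order: pays the 0.02% maker fee; the exchange rejects it
# if it would fill immediately as a taker (0.055%)
order = exchange.create_order(
    "BTC/USDT:USDT",   # Bybit linear perpetual in ccxt notation
    "limit",
    "buy",
    0.01,              # amount in BTC
    60000,             # illustrative limit price
    {"postOnly": True},
)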

The Blend

The final system uses both models:

  • 60% tree-based (XGBoost + LightGBM) for baseline
  • 40% LSTM for neural boost
  • Direction-aware blending for SHORT positions
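The blend itself is a weighted average of the two models' P(UP). The direction-aware handling for shorts lives in the repo; this sketch shows only the weighting, with assumed confidence thresholds:

def blend_signal(tree_prob_up: float, lstm_prob_up: float) -> float:
    """60% tree-based baseline, 40% LSTM boost."""
    return 0.6 * tree_prob_up + 0.4 * lstm_prob_up

p_up = blend_signal(0.72, 0.88)  # -> 0.784
# Open LONG above the threshold, SHORT below its mirror, otherwise stay flat
signal = "LONG" if p_up >= 0.80 else "SHORT" if p_up <= 0.20 else "FLAT"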

Results

After deploying the LSTM model:

  • Old model (tree-based only): 54% accuracy, losing money to fees
  • New model (LSTM blend): 84.6% directional accuracy
  • The bot now opens positions with 80-95% confidence signals
  • Pump scanner catches volume spikes (5x+ average) for quick trades
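The pump scanner's core test fits in one function: compare each candle's volume to its rolling average (the window length is an assumption):

import pandas as pd

def volume_spikes(volume: pd.Series, window: int = 20, spike: float = 5.0) -> pd.Series:
    """True where a candle's volume exceeds 5x its recent rolling average."""
    return volume > spike * volume.rolling(window).mean()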

Open Source

The entire codebase is open-source:

GitHub: https://github.com/stefanoviana/deepalpha

You can:

  1. Self-host for free (bring your own API keys)
  2. Use the cloud version at https://deepalphabot.com (free 7-day trial)
  3. Train your own models with the included training scripts

Quick Start

git clone https://github.com/stefanoviana/deepalpha.git
cd deepalpha
pip install -r requirements.txt
cp .env.example .env # Add your Bybit API keys
python deepalpha.py

What I Learned

  1. Walk-forward validation is non-negotiable. My tree models claimed 70.9% with random splits. Real accuracy: 54%.
  2. Sequence models beat tabular models for time series. LSTM sees patterns that XGBoost literally cannot.
  3. Data augmentation prevents overfitting. Adding noise + time shifts doubled the effective training data.
  4. Fewer features = better. Going from 72 to 19 features improved accuracy by removing noise.
  5. The model is 30% of the work. Risk management, infrastructure, and edge cases are the other 70%.

What's Next

  • Reinforcement learning for dynamic position sizing
  • Multi-timeframe fusion (1h + 4h + 1d)
  • Continuous online learning from live trades
  • More exchange support (currently Bybit, Binance, OKX, Gate.io)

If you find this useful, a star on https://github.com/stefanoviana/deepalpha would mean a lot. And if you have questions about the architecture, training process, or deployment — drop a comment below.


DeepAlpha is built with Python, PyTorch, ccxt, FastAPI, and PostgreSQL. It's MIT licensed and free to use.
