The Problem
I've been trading crypto for years. Like most traders, I was losing more than I was winning. The market moves too fast for human decision-making — by the time you analyze a chart, the opportunity is gone.
So I did what any developer would do: I tried to automate it.
The First Attempt: Tree-Based Models (54% accuracy)
I started with the classic ML approach:
- XGBoost + LightGBM ensemble
- 72 features (RSI, MACD, Bollinger Bands, volume ratios, etc.)
- Walk-forward validation (no data leakage!)
The result? 54% accuracy on out-of-sample data.
That's barely better than a coin flip. And with trading fees eating 0.1% per trade, 54% actually loses money.
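A quick back-of-the-envelope check shows why. With symmetric average wins and losses, fees raise the breakeven win rate (the numbers below are illustrative, not my live stats):

```python
def breakeven_winrate(avg_win, avg_loss, fee_per_trade):
    """Minimum directional accuracy needed to profit after fees.
    Solves p * avg_win - (1 - p) * avg_loss - fee = 0 for p."""
    return (avg_loss + fee_per_trade) / (avg_win + avg_loss)

# ±1% average moves, 0.1% fees per round trip
print(breakeven_winrate(0.01, 0.01, 0.001))  # 0.55 — you need 55% just to break even
```

At 54% accuracy you are below that line on every single trade.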
I tried everything:
- Feature engineering (72 → 100+ features)
- Hyperparameter tuning (Optuna, 500 trials)
- CatBoost added to the ensemble
- Different timeframes (15m, 1h, 4h)
Best I could squeeze out: 60%. Still not enough.
The Breakthrough: LSTM + Attention (84.6%)
Then I rented a GPU on RunPod ($0.44/hour, A40) and tried something different: a Bidirectional LSTM with Multi-Head Attention.
Why LSTM? Because financial markets have temporal dependencies. A candle 30 hours ago affects what happens now. Tree-based models treat each data point independently — they can't learn these sequences.
Architecture
```python
import torch.nn as nn

class LSTMPredictor(nn.Module):
    def __init__(self, input_size=19, hidden_size=128, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size, hidden_size, num_layers,
            batch_first=True, bidirectional=True, dropout=0.3
        )
        self.attention = nn.MultiheadAttention(
            hidden_size * 2, num_heads=4, batch_first=True
        )
        self.fc = nn.Sequential(
            nn.Linear(hidden_size * 2, 64),
            nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(64, 2)  # UP or DOWN
        )

    def forward(self, x):  # x: (batch, seq_len, input_size) — a typical wiring
        out, _ = self.lstm(x)                        # (batch, seq, hidden*2)
        attn_out, _ = self.attention(out, out, out)  # self-attention over timesteps
        return self.fc(attn_out[:, -1])              # classify from the last step
```
Key decisions:
- 19 features (not 72 — less is more with LSTM)
- 64-hour sequences (the model "sees" nearly 3 days of history)
- Bidirectional (learns patterns forward and backward)
- Multi-head attention (focuses on the most important timesteps)
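Those 64-hour sequences are built by slicing the hourly feature matrix into overlapping windows — a minimal sketch (the function name and label convention are mine, not from the repo):

```python
import numpy as np

def make_sequences(features, labels, seq_len=64):
    """Slice a (T, 19) feature matrix into (N, 64, 19) overlapping windows.
    Each window is labeled by its final candle."""
    n = len(features) - seq_len + 1
    X = np.stack([features[i:i + seq_len] for i in range(n)])
    y = labels[seq_len - 1:]  # label of each window's last candle
    return X, y
```

With hourly candles, consecutive windows overlap by 63 steps, so the model sees every candle in many temporal contexts.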
Training Tricks That Mattered
- Data Augmentation. Small Gaussian noise plus random time shifts, so the model can't memorize exact sequences:

```python
import numpy as np

# Add noise to prevent overfitting
noise = np.random.normal(0, 0.01, features.shape)
augmented = features + noise

# Time shift: offset sequences by 1-3 candles
shift = np.random.randint(1, 4)
shifted = features[shift:]  # shift labels by the same offset to keep them aligned
```
- Snapshot Ensemble. Instead of using the single best model, I saved the top 3 checkpoints and averaged their predictions. This reduces variance significantly:
- Snapshot 1: 84.6%
- Snapshot 2: 84.0%
- Snapshot 3: 84.1%
Ensemble: 84.6%
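Averaging the checkpoints can be sketched like this (a minimal illustration; the repo's actual checkpoint-loading code may differ):

```python
import torch
import torch.nn.functional as F

def ensemble_predict(models, x):
    """Average softmax probabilities across snapshot checkpoints."""
    for m in models:
        m.eval()                 # disable dropout for inference
    with torch.no_grad():
        probs = [F.softmax(m(x), dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0)
```

Averaging probabilities (rather than picking a majority vote) keeps the confidence score usable downstream for position sizing.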
Walk-Forward Validation
This is critical. Most "90% accuracy" claims use random train/test splits, which leak future data into training. Walk-forward validation is temporal:
Window 1: Train [Jan-Jun] → Test [Jul-Aug] = 70.3%
Window 2: Train [Mar-Aug] → Test [Sep-Oct] = 72.2%
Window 3: Train [May-Oct] → Test [Nov-Dec] = 85.0%
Window 4: Train [Jul-Dec] → Test [Jan-Feb] = 83.8%
Average: 77.8%
The model genuinely improved on recent data — it's not just memorizing patterns.
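The windows above follow a simple sliding scheme, which can be sketched as (window sizes here are illustrative):

```python
import numpy as np

def walk_forward_splits(n, train_size, test_size, step):
    """Yield (train_idx, test_idx) index windows that slide forward in time.
    Every test window sits strictly after its training window."""
    start = 0
    while start + train_size + test_size <= n:
        train = np.arange(start, start + train_size)
        test = np.arange(start + train_size, start + train_size + test_size)
        yield train, test
        start += step
```

Because the test indices always come after the training indices, no future candle can leak into training — the property random splits destroy.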
The 19 Features
After testing 72+ features, these 19 survived:
| # | Feature | Why It Works |
|-------|---------------------------|---------------------------|
| 1-4 | Returns (1h, 3h, 5h, 10h) | Multi-scale momentum |
| 5-6 | Volatility (10h, 20h) | Risk regime detection |
| 7 | RSI (14) | Overbought/oversold |
| 8 | MACD | Trend strength |
| 9 | Bollinger Width | Squeeze detection |
| 10 | ATR | Volatility-adjusted stops |
| 11 | Volume Ratio | Volume breakouts |
| 12-14 | EMA distance (9, 21, 50) | Trend position |
| 15 | Candle Body Ratio | Price action |
| 16 | Z-Score | Mean reversion signal |
| 17 | Log Volume | Normalized volume |
| 18-19 | Hour/Day encoding | Time patterns |
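A few of these features can be computed from an hourly OHLCV DataFrame like so (a sketch; the column names and DatetimeIndex are assumptions, not the repo's exact pipeline):

```python
import numpy as np
import pandas as pd

def add_features(df):
    """Compute a subset of the 19 features from hourly OHLCV.
    Assumes a DatetimeIndex and 'close'/'volume' columns."""
    out = df.copy()
    for h in (1, 3, 5, 10):
        out[f"ret_{h}h"] = out["close"].pct_change(h)      # multi-scale momentum
    for w in (10, 20):
        out[f"vol_{w}h"] = out["ret_1h"].rolling(w).std()  # volatility regime
    roll = out["close"].rolling(20)
    out["zscore"] = (out["close"] - roll.mean()) / roll.std()  # mean reversion
    out["log_volume"] = np.log1p(out["volume"])            # normalized volume
    out["hour_sin"] = np.sin(2 * np.pi * out.index.hour / 24)  # cyclical time
    return out
```

The cyclical encoding matters: hour 23 and hour 0 are adjacent in time, and a sine/cosine pair preserves that, where a raw 0-23 integer would not.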
From Model to Live Trading Bot
Having a good model is only half the battle. Turning it into a live trading system required:
Risk Management
- ATR-based stop-loss (adapts to volatility)
- Partial close at +1.5% and +3% (lock in profits)
- Trailing stop (let winners run)
- Max 3 concurrent positions (diversification)
- Auto-unstuck (graduated exit from losing trades)
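Of these, the ATR-based stop is the simplest to illustrate (the multiplier is an assumed parameter, not the bot's actual setting):

```python
def atr_stop(entry_price, atr, side, multiplier=2.0):
    """Stop-loss placed a volatility-scaled distance from entry:
    wider in volatile regimes, tighter in calm ones."""
    if side == "long":
        return entry_price - multiplier * atr
    return entry_price + multiplier * atr

print(atr_stop(100.0, 2.0, "long"))   # 96.0
print(atr_stop(100.0, 2.0, "short"))  # 104.0
```

A fixed-percentage stop gets whipsawed in high volatility and gives back too much in low volatility; scaling by ATR sidesteps both.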
Infrastructure
- Bybit API with limit orders (0.02% maker fee vs 0.055% taker)
- Exchange-side stop-loss (safety net if bot crashes)
- PM2 process management with auto-restart
- PostgreSQL for full state persistence (survives restarts)
- Telegram notifications (public + private channels)
The Blend
The final system uses both models:
- 60% tree-based (XGBoost + LightGBM) for baseline
- 40% LSTM for neural boost
- Direction-aware blending for SHORT positions
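The blend itself is a weighted average of the two models' UP probabilities (a sketch; the direction-aware adjustment for SHORTs is not reproduced here):

```python
def blend_prob_up(tree_p, lstm_p, w_tree=0.6, w_lstm=0.4):
    """60/40 weighted blend of tree-ensemble and LSTM UP probabilities."""
    return w_tree * tree_p + w_lstm * lstm_p

print(blend_prob_up(0.5, 1.0))  # 0.7
```

Keeping the tree ensemble as the majority weight means a single bad LSTM checkpoint can shift a signal, but never flip it on its own.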
Results
After deploying the LSTM model:
- Old model (tree-based only): 54% accuracy, losing money to fees
- New model (LSTM blend): 84.6% directional accuracy
- The bot now opens positions with 80-95% confidence signals
- Pump scanner catches volume spikes (5x+ average) for quick trades
Open Source
The entire codebase is open-source:
GitHub: https://github.com/stefanoviana/deepalpha
You can:
- Self-host for free (bring your own API keys)
- Use the cloud version at https://deepalphabot.com (free 7-day trial)
- Train your own models with the included training scripts
Quick Start
```bash
git clone https://github.com/stefanoviana/deepalpha.git
cd deepalpha
pip install -r requirements.txt
cp .env.example .env  # Add your Bybit API keys
python deepalpha.py
```
What I Learned
- Walk-forward validation is non-negotiable. My tree models claimed 70.9% with random splits. Real accuracy: 54%.
- Sequence models beat tabular models for time series. LSTM sees patterns that XGBoost literally cannot.
- Data augmentation prevents overfitting. Adding noise + time shifts doubled the effective training data.
- Fewer features = better. Going from 72 to 19 features improved accuracy by removing noise.
- The model is 30% of the work. Risk management, infrastructure, and edge cases are the other 70%.
What's Next
- Reinforcement learning for dynamic position sizing
- Multi-timeframe fusion (1h + 4h + 1d)
- Continuous online learning from live trades
- More exchange support (currently Bybit, Binance, OKX, Gate.io)
If you find this useful, a star on https://github.com/stefanoviana/deepalpha would mean a lot. And if you have questions about the architecture, training process, or deployment — drop a comment below.
DeepAlpha is built with Python, PyTorch, ccxt, FastAPI, and PostgreSQL. It's MIT licensed and free to use.