Reinforcement Learning Trading Bot

Performance Results

The agent was evaluated on data it did not see during training and ended the test period with a gain of about 12.5% on the initial portfolio, which suggests it can capture profitable trends while managing risk.

Return on Test Data: 12.5%
Initial Portfolio: $10,000
Final Value: $11,250.96
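
The return figure follows from the initial and final portfolio values. A minimal check, where the helper name `total_return` is illustrative and not part of the project:

```python
def total_return(initial_value: float, final_value: float) -> float:
    """Fractional return over the whole test period."""
    if initial_value <= 0:
        raise ValueError("initial_value must be positive")
    return (final_value - initial_value) / initial_value


# Values reported for the test run.
pct = total_return(10_000.00, 11_250.96) * 100
print(f"Return on test data: {pct:.2f}%")
```

To two decimals the gain works out to 12.51%, which is reported as 12.5%.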

How It Works

The bot uses deep reinforcement learning, specifically the Proximal Policy Optimization (PPO) algorithm. At each time step it reads a set of technical indicators and chooses one of three actions: Buy, Hold, or Sell.

Observation Space: 50 days of historical price data combined with RSI, MACD, and Bollinger Bands.
Reward Function: Optimized for long-term portfolio growth adjusted for drawdown.
Environment: Custom Gymnasium environment (Gymnasium is the maintained successor to OpenAI Gym) that replays years of historical stock market data.
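
The step logic of such an environment can be sketched without any dependencies. This is an illustration only, not the project's actual code. The action encoding, the all-in/all-out trade sizing, the `Portfolio` class, and the `drawdown_penalty` weight are all assumptions made for the example. In a real Gymnasium environment this logic would sit inside `Env.step`.

```python
from dataclasses import dataclass

# Hypothetical action encoding; the project may use a different one.
HOLD, BUY, SELL = 0, 1, 2


@dataclass
class Portfolio:
    cash: float
    shares: float = 0.0
    peak_value: float = 0.0  # highest portfolio value seen so far

    def value(self, price: float) -> float:
        return self.cash + self.shares * price


def step(portfolio: Portfolio, action: int, price: float,
         next_price: float, drawdown_penalty: float = 0.5) -> float:
    """Trade at `price`, mark to market at `next_price`, return the reward.

    Reward = one-step growth minus a penalty proportional to the
    drawdown from the running peak (an assumed form of
    "growth adjusted for drawdown").
    """
    value_before = portfolio.value(price)

    if action == BUY and portfolio.cash > 0:
        # Assumption: invest all available cash, no fees or slippage.
        portfolio.shares += portfolio.cash / price
        portfolio.cash = 0.0
    elif action == SELL and portfolio.shares > 0:
        # Assumption: liquidate the whole position.
        portfolio.cash += portfolio.shares * price
        portfolio.shares = 0.0

    value_after = portfolio.value(next_price)
    portfolio.peak_value = max(portfolio.peak_value, value_before, value_after)

    growth = value_after / value_before - 1.0
    drawdown = 1.0 - value_after / portfolio.peak_value
    return growth - drawdown_penalty * drawdown


p = Portfolio(cash=10_000.0)
print(step(p, BUY, price=100.0, next_price=110.0))   # price rises after buying
print(step(p, HOLD, price=110.0, next_price=99.0))   # price falls while holding
```

Penalizing drawdown at every step is one way to discourage the agent from riding large losses. The weight controls how strongly risk is traded off against growth.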

Technical Methodology

PPO Algorithm: Stable-Baselines3 implementation, chosen for stable policy updates.
Indicators: Integrated technical analysis signals (SMA/EMA50, RSI, MACD).
Data Source: High-resolution historical data fetched from Yahoo Finance with the yfinance library.
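
For reference, the indicators named above can be computed in a few lines of plain Python. This is a sketch of the standard definitions and not the project's own feature code, which would more likely use Pandas. The function names and the default periods (12/26 for MACD, 14 for RSI) are conventional choices assumed here. The RSI uses simple averages of gains and losses rather than Wilder's smoothing.

```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def ema(prices, span):
    """Exponential moving average, seeded with the first price."""
    alpha = 2.0 / (span + 1)
    out = [float(prices[0])]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def macd_line(prices, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA."""
    return [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]


def rsi(prices, period=14):
    """RSI from simple sums of gains and losses over the last `period` changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    out = [None] * len(prices)
    for i in range(period, len(prices)):
        window = changes[i - period:i]
        gains = sum(c for c in window if c > 0)
        losses = -sum(c for c in window if c < 0)
        total = gains + losses
        # Flat window: no movement, report the neutral value.
        out[i] = 100.0 * gains / total if total > 0 else 50.0
    return out


closes = [float(x) for x in range(1, 31)]  # synthetic, steadily rising prices
print(sma(closes, 5)[-1], ema(closes, 50)[-1], macd_line(closes)[-1], rsi(closes)[-1])
```

Each function returns a list aligned with the input prices, so the outputs can be stacked column-wise into an observation window.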

Stack & Requirements

Python
Stable-Baselines3
Gymnasium
yfinance
Pandas
Matplotlib