Essential Python Techniques for Building Profitable Algorithmic Trading Systems in 2024

Building algorithmic trading systems requires specific techniques that transform raw market data into executable strategies. I want to share methods I use to build these systems, focusing on practical Python code you can apply immediately. Let's start from the beginning.

Getting the Data Right

Everything begins with data. You need clean, reliable price information. I typically use APIs to pull historical data. The key is to structure it for analysis, calculating the basic metrics that form the foundation of any model.

import yfinance as yf
import pandas as pd
import numpy as np
from datetime import datetime

def get_stock_data(symbol, years=1):
    """
    Fetches daily price history and adds return and volatility columns.
    """
    ticker = yf.Ticker(symbol)
    df = ticker.history(period=f"{years}y")
    if df.empty:
        raise ValueError(f"No price data returned for {symbol!r}.")

    # Simple and log daily returns
    df['daily_change'] = df['Close'].pct_change()
    df['log_change'] = np.log(df['Close'] / df['Close'].shift(1))
    # 20-day rolling volatility, annualized with 252 trading days
    df['volatility'] = df['daily_change'].rolling(window=20).std() * np.sqrt(252)

    # Drop the warm-up rows where the rolling window is not yet full
    df = df.dropna(subset=['daily_change', 'log_change', 'volatility'])
    return df

# Let's see it in action
apple_data = get_stock_data("AAPL")
print(f"Fetched {len(apple_data)} days of data for AAPL.")
print(f"Recent volatility: {apple_data['volatility'].iloc[-1]:.4f}")

This gives you a time series. The daily changes and volatility are your first building blocks. Without this clean foundation, everything else will be flawed.
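The function stores both a simple return and a log return, and it can help to see why both are worth keeping. The sketch below uses a few made-up closing prices (not real market data) to show that simple returns compound by multiplication while log returns simply add up, and that both recover the same total return.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, chosen only for illustration
close = pd.Series([100.0, 110.0, 99.0, 108.9])

simple = close.pct_change().dropna()
log = np.log(close / close.shift(1)).dropna()

# Simple returns compound multiplicatively
total_from_simple = (1 + simple).prod() - 1
# Log returns are additive; exponentiate the sum to recover the total return
total_from_log = np.exp(log.sum()) - 1

print(f"Total return via simple returns: {total_from_simple:.4f}")
print(f"Total return via log returns:    {total_from_log:.4f}")
```

Because log returns add across time, they are convenient for multi-period statistics, while simple returns are the natural choice when combining assets inside a portfolio.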

Finding Signals in the Noise

Raw prices are just numbers. Technical indicators help us interpret them. I think of them as lenses, each highlighting a different market characteristic like trend, momentum, or exhaustion.

import pandas_ta as ta  # A great library for indicators

def add_technical_signals(price_data):
    """
    Applies a set of common technical indicators.
    This creates potential trading signals.
    """
    df = price_data.copy()

    # Trend: Simple and Exponential Moving Averages
    df['sma_50'] = ta.sma(df['Close'], length=50)
    df['ema_20'] = ta.ema(df['Close'], length=20)

    # Momentum: Relative Strength Index
    df['rsi'] = ta.rsi(df['Close'], length=14)

    # A classic trend/momentum combo: MACD
    macd_result = ta.macd(df['Close'], fast=12, slow=26, signal=9)
    df['macd_line'] = macd_result['MACD_12_26_9']
    df['macd_signal'] = macd_result['MACDs_12_26_9']

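    # Volatility: Bollinger Bands
    bb_result = ta.bbands(df['Close'], length=20, std=2)
    # Select by prefix: the exact column suffix can differ between pandas_ta versions
    df['bb_upper'] = bb_result.filter(like='BBU_').iloc[:, 0]
    df['bb_lower'] = bb_result.filter(like='BBL_').iloc[:, 0]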

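    # Create a simple composite signal
    df['signal'] = 0  # Default: hold

    # Rule 1: RSI oversold
    df.loc[df['rsi'] < 30, 'signal'] = 1  # Buy signal
    # Rule 2: RSI overbought
    df.loc[df['rsi'] > 70, 'signal'] = -1 # Sell signal
    # Rule 3: MACD crossover, flagged only on the bar where the lines cross.
    # Comparing the lines on every bar would overwrite the RSI rules and
    # leave no "hold" rows at all.
    macd_diff = df['macd_line'] - df['macd_signal']
    prev_diff = macd_diff.shift(1)
    df.loc[(macd_diff > 0) & (prev_diff <= 0), 'signal'] = 1   # Bullish cross
    df.loc[(macd_diff < 0) & (prev_diff >= 0), 'signal'] = -1  # Bearish cross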

    return df

# Apply to our data
apple_with_signals = add_technical_signals(apple_data)
signal_summary = apple_with_signals['signal'].value_counts()
print(f"Signal summary: {dict(signal_summary)}")

Indicators are not crystal balls. They shift the odds rather than predict outcomes. A low RSI suggests a higher probability of a bounce, not a guarantee. Combining several indicators often works better than relying on one.
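To make the RSI less of a black box, here is a minimal sketch of how it can be computed with plain pandas, assuming Wilder-style smoothing (an exponential average with alpha = 1/length). The function name `wilder_rsi` is my own, and its warm-up values may differ slightly from what pandas_ta returns.

```python
import pandas as pd

def wilder_rsi(close: pd.Series, length: int = 14) -> pd.Series:
    """RSI from average gains and losses, smoothed with alpha = 1/length."""
    delta = close.diff()
    gains = delta.clip(lower=0)        # Up moves, zero otherwise
    losses = (-delta).clip(lower=0)    # Down moves as positive numbers
    avg_gain = gains.ewm(alpha=1 / length, adjust=False).mean()
    avg_loss = losses.ewm(alpha=1 / length, adjust=False).mean()
    rs = avg_gain / avg_loss           # Relative strength
    return 100 - 100 / (1 + rs)

# Two extreme, made-up price paths to show the range of the indicator
rising = pd.Series([float(p) for p in range(1, 31)])
falling = pd.Series([float(p) for p in range(30, 0, -1)])

print(f"RSI after a steady rise: {wilder_rsi(rising).iloc[-1]:.1f}")
print(f"RSI after a steady fall: {wilder_rsi(falling).iloc[-1]:.1f}")
```

The indicator only compares the size of recent gains to recent losses, which is why a reading below 30 says "selling has dominated lately" and nothing more.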

Testing Before You Risk Capital

This is the most important step. A strategy that sounds good in your head can fail completely in reality. Backtesting simulates how your strategy would have performed using historical data. The goal is to avoid costly mistakes.

class StrategyTester:
    """
    A simple backtesting engine.
    It tracks a portfolio based on trading signals.
    """
    def __init__(self, starting_cash=10000, trade_cost=5.0):
        self.starting_cash = starting_cash
        self.trade_cost = trade_cost  # Commission per trade

    def run_test(self, data, signal_col='signal'):
        """
        Runs the backtest over the provided data.
        """
        df = data.copy()
        # Initialize portfolio columns as floats so cash and share values fit
        df['position'] = 0.0  # Shares held
        df['cash'] = float(self.starting_cash)
        df['portfolio_value'] = float(self.starting_cash)

        # Note: trading at the same close that produced the signal is optimistic.
        # Shift the signal column by one bar for a stricter test.
        position = 0
        for i in range(1, len(df)):
            current_price = df.iloc[i]['Close']
            current_signal = df.iloc[i][signal_col]
            prev_cash = df.iloc[i-1]['cash']
            new_cash = prev_cash  # Default: no action

            # Decision Logic
            if current_signal == 1 and position == 0:
                # Buy signal, and we're not in a position.
                # Reserve the commission first so the order is always affordable.
                shares_to_buy = (prev_cash - self.trade_cost) // current_price
                if shares_to_buy > 0:
                    position = shares_to_buy
                    new_cash = prev_cash - (shares_to_buy * current_price) - self.trade_cost
            elif current_signal == -1 and position > 0:
                # Sell signal, and we hold shares
                new_cash = prev_cash + (position * current_price) - self.trade_cost
                position = 0

            # Update the DataFrame
            df.loc[df.index[i], 'position'] = position
            df.loc[df.index[i], 'cash'] = new_cash
            df.loc[df.index[i], 'portfolio_value'] = new_cash + (position * current_price)

        self.results = df
        return self.calculate_stats()

    def calculate_stats(self):
        """
        Calculates performance metrics from the test.
        """
        df = self.results
        final_value = df['portfolio_value'].iloc[-1]
        total_return_pct = ((final_value - self.starting_cash) / self.starting_cash) * 100

        # Daily returns and risk
        df['daily_portfolio_return'] = df['portfolio_value'].pct_change()
        avg_daily_return = df['daily_portfolio_return'].mean()
        daily_risk = df['daily_portfolio_return'].std()

        # Simple Sharpe Ratio (assuming no risk-free rate for simplicity)
        if daily_risk > 0:
            sharpe = (avg_daily_return / daily_risk) * np.sqrt(252)
        else:
            sharpe = 0

        # Maximum Drawdown: Worst peak-to-trough decline
        running_max = df['portfolio_value'].cummax()
        drawdown = (df['portfolio_value'] - running_max) / running_max
        max_drawdown_pct = drawdown.min() * 100

        stats = {
            'initial_capital': self.starting_cash,
            'final_value': final_value,
            'total_return_%': total_return_pct,
            'sharpe_ratio': sharpe,
            'max_drawdown_%': max_drawdown_pct,
            # The first diff is NaN; fill it so it is not counted as a trade
            'number_of_trades': int((df['position'].diff().fillna(0) != 0).sum())
        }
        return stats

# Test our RSI/MACD strategy
tester = StrategyTester(starting_cash=100000)
performance = tester.run_test(apple_with_signals)

print("Backtest Results:")
for key, value in performance.items():
    print(f"{key:>20}: {value:.2f}" if isinstance(value, float) else f"{key:>20}: {value}")

A backtest will show you the strategy's historical return, but pay more attention to the Sharpe Ratio and Maximum Drawdown. A high return with a huge drawdown means sleepless nights and the risk of abandoning the strategy at the worst time.
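A small worked example makes the point about drawdown concrete. The two equity curves below are invented for illustration: both end with the same 30% return, yet one of them falls 25% from its peak along the way. The helper name `max_drawdown` is an assumption of this sketch and mirrors the calculation in `calculate_stats`.

```python
import pandas as pd

def max_drawdown(equity: pd.Series) -> float:
    """Worst peak-to-trough decline, as a negative fraction."""
    running_max = equity.cummax()
    drawdown = (equity - running_max) / running_max
    return float(drawdown.min())

# Hypothetical equity curves with identical start and end values
smooth = pd.Series([100.0, 105.0, 112.0, 120.0, 130.0])
bumpy = pd.Series([100.0, 120.0, 90.0, 110.0, 130.0])

smooth_mdd = max_drawdown(smooth)
bumpy_mdd = max_drawdown(bumpy)

for name, curve, mdd in (("smooth", smooth, smooth_mdd), ("bumpy", bumpy, bumpy_mdd)):
    total_return = curve.iloc[-1] / curve.iloc[0] - 1
    print(f"{name:>6}: return {total_return:.0%}, max drawdown {mdd:.0%}")
```

Judged on return alone, the two strategies look identical. Only the drawdown reveals which one you could realistically have stayed with.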

Not Putting All Eggs in One Basket

Trading a single stock is risky. Portfolio optimization helps you allocate capital across multiple assets to get a better return for the level of risk you're willing to take. The core idea is that different assets move differently, and combining them can smooth the ride.

def build_optimal_portfolio(return_series, target_return=None):
    """
    Finds a mix of assets that aims for optimal risk/return.
    Based on classic portfolio theory.
    """
    # Expected returns for each asset
    mean_returns = return_series.mean()
    # How asset returns move together (covariance)
    cov_matrix = return_series.cov()

    num_assets = len(mean_returns)

    # This is the optimization part: minimize risk for a given return
    # using SciPy's SLSQP solver, which handles bounds and equality constraints
    import scipy.optimize as sco

    def portfolio_statistics(weights):
        """Calculate return and volatility for a given weight set."""
        port_return = np.sum(mean_returns * weights)
        port_volatility = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
        return port_return, port_volatility

    def minimize_volatility(target_return):
        """Finds weights that minimize volatility for a target return."""
        constraints = (
            {'type': 'eq', 'fun': lambda x: np.sum(x) - 1},  # Weights sum to 1
            {'type': 'eq', 'fun': lambda x: portfolio_statistics(x)[0] - target_return}
        )
        bounds = tuple((0, 1) for asset in range(num_assets))  # No short selling
        initial_guess = num_assets * [1./num_assets]  # Start with equal weight
        result = sco.minimize(lambda w: portfolio_statistics(w)[1], initial_guess,
                               method='SLSQP', bounds=bounds, constraints=constraints)
        return result.x

    # If no target is given, find the minimum volatility portfolio
    if target_return is None:
        constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
        bounds = tuple((0, 1) for asset in range(num_assets))
        result = sco.minimize(lambda w: portfolio_statistics(w)[1], num_assets * [1./num_assets],
                               method='SLSQP', bounds=bounds, constraints=constraints)
        optimal_weights = result.x
    else:
        optimal_weights = minimize_volatility(target_return)

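    opt_return, opt_volatility = portfolio_statistics(optimal_weights)
    # Inputs are daily, so annualize the ratio (no risk-free rate, as in the backtester)
    sharpe = (opt_return / opt_volatility) * np.sqrt(252)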

    return {
        'weights': {asset: weight for asset, weight in zip(return_series.columns, optimal_weights)},
        'expected_return': opt_return,
        'expected_volatility': opt_volatility,
        'sharpe_ratio': sharpe
    }

# Example: Optimize a tech portfolio
tech_symbols = ['AAPL', 'MSFT', 'GOOGL', 'NVDA', 'ADBE']
all_data = {}
for sym in tech_symbols:
    all_data[sym] = get_stock_data(sym, years=2)['daily_change']

return_dataframe = pd.DataFrame(all_data).dropna()
portfolio_result = build_optimal_portfolio(return_dataframe)

print("\nOptimal Portfolio Allocation:")
for stock, weight in portfolio_result['weights'].items():
    print(f"  {stock}: {weight:.1%}")
print(f"\nExpected Annual Return: {portfolio_result['expected_return']*252:.1%}")
print(f"Expected Annual Volatility: {portfolio_result['expected_volatility']*np.sqrt(252):.1%}")

The optimal mix is rarely what you'd guess. Often, it includes significant weight in assets you might not consider the "best" individually, because they balance the others.
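The balancing effect comes straight from the portfolio variance formula, w'Σw. As a rough illustration, the sketch below takes two hypothetical assets with the same 20% volatility and shows how a 50/50 mix gets calmer as their correlation falls. The function name and all numbers are assumptions made for this example.

```python
import numpy as np

def portfolio_volatility(weights, vols, corr):
    """Volatility of a weighted portfolio: sqrt(w' * Cov * w)."""
    weights = np.asarray(weights, dtype=float)
    vols = np.asarray(vols, dtype=float)
    # Covariance = correlation scaled by each pair of volatilities
    cov = np.outer(vols, vols) * np.asarray(corr, dtype=float)
    return float(np.sqrt(weights @ cov @ weights))

vols = [0.20, 0.20]     # Hypothetical annual volatilities
weights = [0.5, 0.5]    # Equal split between the two assets

for rho in (1.0, 0.5, 0.0):
    corr = [[1.0, rho], [rho, 1.0]]
    vol = portfolio_volatility(weights, vols, corr)
    print(f"correlation {rho:.1f} -> portfolio volatility {vol:.1%}")
```

With perfect correlation nothing is gained, but at zero correlation the same two assets produce a portfolio with roughly 14% volatility instead of 20%. That is why the optimizer can favor an asset that looks unremarkable on its own.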

Protecting What You Have

Making money is one thing. Keeping it is another. Risk management is the set of rules that prevent a single bad trade or a market crash from wiping you out. It's not glamorous, but it's essential.

class TradingRiskController:
    """
    Enforces rules to limit losses.
    """
    def __init__(self, max_capital_per_trade=0.1, max_daily_loss=0.05):
        # Don't risk more than 10% of capital on one idea
        self.max_trade_risk = max_capital_per_trade
        # Stop trading for the day after a 5% loss
        self.daily_loss_limit = max_daily_loss
        self.daily_pnl = 0

    def calculate_position_size(self, account_value, entry_price, stop_loss_price):
        """
        Determines how many shares to buy based on where you'll admit you're wrong.
        """
        risk_per_share = abs(entry_price - stop_loss_price)
        if risk_per_share == 0:
            raise ValueError("Stop-loss price must differ from entry price.")
        # Don't risk more than the allowed % of your account on this trade
        total_capital_at_risk = account_value * self.max_trade_risk
        # How many shares does that risk amount allow?
        shares = int(total_capital_at_risk / risk_per_share)
        # Never size beyond what the account can actually pay for
        affordable = int(account_value // entry_price)
        return min(shares, affordable)

    def check_daily_limit(self, current_account_value, starting_account_value):
        """
        Checks if daily loss limit has been breached.
        """
        daily_return = (current_account_value - starting_account_value) / starting_account_value
        if daily_return < -self.daily_loss_limit:
            return False  # Stop trading
        return True

    def calculate_value_at_risk(self, portfolio_returns, confidence=0.95, horizon=1):
        """
        Estimates the loss that should be exceeded only (1 - confidence) of the time.
        A simple historical method.
        """
        # Sort returns from worst to best
        sorted_returns = np.sort(portfolio_returns)
        # Find the return at the (1-confidence) percentile
        index = int((1 - confidence) * len(sorted_returns))
        var = -sorted_returns[index]  # Make it positive for a loss amount
        # Scale for time horizon (square-root-of-time rule, assumes independent returns)
        var_scaled = var * np.sqrt(horizon)
        return var_scaled

# Using the risk manager
risk = TradingRiskController()
my_account_value = 100000
entry = 150
stop_loss = 145

size = risk.calculate_position_size(my_account_value, entry, stop_loss)
print(f"For a trade entering at ${entry} with a stop at ${stop_loss}:")
print(f"  You can buy {size} shares.")
print(f"  Capital at risk: ${size * (entry-stop_loss):.2f}")

# Estimate portfolio risk
portfolio_returns = return_dataframe.mean(axis=1)  # Equal-weighted returns of our tech stocks
var_estimate = risk.calculate_value_at_risk(portfolio_returns, confidence=0.99, horizon=5)
print(f"\n5-day Value at Risk (99% confidence): {var_estimate:.2%} of portfolio.")

The position size calculation is perhaps the most valuable tool here. It directly links the size of your bet to the point where your trade idea is proven wrong.
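To see that link in numbers, here is a small sketch that keeps the dollar risk fixed and varies only the stop distance. It assumes a hypothetical 1% risk budget on a $100,000 account, and the helper `shares_for_risk` is my own name rather than part of the class above.

```python
def shares_for_risk(account_value, risk_fraction, entry_price, stop_price):
    """Shares such that hitting the stop loses about risk_fraction of the account."""
    risk_per_share = abs(entry_price - stop_price)
    if risk_per_share == 0:
        raise ValueError("Stop price must differ from entry price.")
    risk_budget = account_value * risk_fraction
    by_risk = int(risk_budget // risk_per_share)
    by_cash = int(account_value // entry_price)  # Cannot buy more than cash allows
    return min(by_risk, by_cash)

account = 100_000
entry = 150

# Same dollar risk, three different stop distances
for stop in (148, 145, 140):
    size = shares_for_risk(account, 0.01, entry, stop)
    print(f"stop ${stop}: {size} shares, "
          f"position ${size * entry:,}, loss at stop ${size * (entry - stop):,}")
```

A tighter stop allows a larger position and a wider stop forces a smaller one, while the loss at the stop stays close to the same amount in every case.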

Trading Without Moving the Market

When you want to buy or sell a large amount, doing it all at once can move the price against you. Execution algorithms break a large order into smaller pieces to hide your intentions and get a better average price.

class TradeExecutor:
    """
    Schedules a large order over time.
    """
    def __init__(self, algorithm='TWAP'):
        self.algorithm = algorithm

    def create_schedule(self, total_quantity, intervals, historical_volumes=None):
        """
        Creates an order schedule.
        TWAP: Time-Weighted Average Price. Trades evenly over time.
        VWAP: Volume-Weighted Average Price. Trades in proportion to typical volume.
        """
        schedule = []
        if self.algorithm == 'TWAP':
            # Simple division
            base_order = total_quantity // intervals
            remainder = total_quantity % intervals
            schedule = [base_order] * intervals
            # Distribute the remainder
            for i in range(remainder):
                schedule[i] += 1

        elif self.algorithm == 'VWAP' and historical_volumes is not None:
            # Trade more when the market is typically more active
            total_typical_volume = sum(historical_volumes)
            for vol in historical_volumes:
                percent_of_volume = vol / total_typical_volume
                order_for_interval = int(total_quantity * percent_of_volume)
                schedule.append(order_for_interval)
            # Handle any rounding leftovers
            scheduled_total = sum(schedule)
            difference = total_quantity - scheduled_total
            if difference != 0:
                schedule[-1] += difference
        else:
            raise ValueError("Algorithm not supported or missing volume data.")

        return schedule

# Simulate executing a large order
executor = TradeExecutor(algorithm='VWAP')

# Let's assume a typical hourly volume profile (each value is a fraction of daily volume)
typical_hourly_volume = [0.05, 0.08, 0.10, 0.12, 0.15, 0.13, 0.11, 0.09, 0.07, 0.06, 0.03, 0.02]
# These 12 numbers represent 12 hours of the trading day. They sum to 1.01,
# which is fine because the scheduler normalizes them.

order_schedule = executor.create_schedule(
    total_quantity=100000,  # Sell 100,000 shares
    intervals=12,
    historical_volumes=typical_hourly_volume
)

print("VWAP Execution Schedule (shares per hour):")
for hour, shares in enumerate(order_schedule, 1):
    print(f"  Hour {hour:2d}: {shares:6d} shares")
print(f"Total to execute: {sum(order_schedule)} shares")

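Using VWAP with this illustrative profile, you'd trade very little in the first and last hour and the most around midday. Real equity markets usually show the opposite, U-shaped pattern, with the heaviest volume near the open and the close, so build the profile from the stock's own historical volume. Either way, trading in proportion to volume helps you blend in.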
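One detail in the schedule code is worth a second look: all rounding leftovers land in the final interval. A possible alternative, sketched below, is a largest-remainder allocation that hands the spare shares to the intervals whose exact share was rounded down the most. The function name and the U-shaped volume profile are hypothetical and serve only as an illustration.

```python
def allocate_by_volume(total_quantity, volume_profile):
    """Split an order in proportion to volume, spreading rounding leftovers."""
    total_volume = sum(volume_profile)
    if total_volume <= 0:
        raise ValueError("Volume profile must have a positive total.")
    # Exact (fractional) allocation per interval
    exact = [total_quantity * v / total_volume for v in volume_profile]
    schedule = [int(x) for x in exact]  # Round every interval down
    leftover = total_quantity - sum(schedule)
    # Give one extra share to the intervals with the largest fractional parts
    by_fraction = sorted(range(len(exact)),
                         key=lambda i: exact[i] - schedule[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        schedule[i] += 1
    return schedule

# Hypothetical U-shaped profile: busy open, quiet midday, busy close
u_shaped_profile = [0.20, 0.12, 0.08, 0.06, 0.08, 0.14, 0.32]
schedule = allocate_by_volume(10_001, u_shaped_profile)

for interval, shares in enumerate(schedule, 1):
    print(f"  Interval {interval}: {shares:5d} shares")
print(f"Total scheduled: {sum(schedule)} shares")
```

Each interval ends up within one share of its exact proportional amount, and the total always matches the order size.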

Understanding the Mechanics of Trading

The market isn't a single price. It's a list of orders—the order book. Analyzing this microstructure helps you understand the immediate supply and demand and can improve short-term trading decisions.

class MarketDepthAnalyzer:
    """
    Models the limit order book.
    """
    def __init__(self):
        self.bids = {}  # Price -> Quantity (buy orders)
        self.asks = {}  # Price -> Quantity (sell orders)

    def update(self, new_bids, new_asks):
        """
        Updates the book with new bid/ask data.
        Bids: {price: quantity_wanted}
        Asks: {price: quantity_offered}
        """
        self.bids = new_bids
        self.asks = new_asks

    def get_best_prices(self):
        """Gets the highest bid and lowest ask."""
        if not self.bids or not self.asks:
            return None, None
        best_bid = max(self.bids.keys())
        best_ask = min(self.asks.keys())
        return best_bid, best_ask

    def get_spread(self):
        """Calculates the bid-ask spread."""
        best_bid, best_ask = self.get_best_prices()
        # Compare against None explicitly so a price of 0 is not treated as missing
        if best_bid is not None and best_ask is not None:
            return best_ask - best_bid
        return None

    def get_mid_price(self):
        """Calculates the mid-point between best bid and ask."""
        best_bid, best_ask = self.get_best_prices()
        if best_bid is not None and best_ask is not None:
            return (best_bid + best_ask) / 2
        return None

    def simulate_market_order(self, quantity, side='buy'):
        """
        Simulates filling a market order.
        Shows the cost of immediate execution.
        """
        total_cost = 0
        filled = 0
        orders = []

        if side == 'buy':
            # Buy from the asks, starting with the cheapest
            for price in sorted(self.asks.keys()):
                if filled >= quantity:
                    break
                available = self.asks[price]
                to_take = min(available, quantity - filled)
                cost = to_take * price
                total_cost += cost
                filled += to_take
                orders.append((price, to_take))
        else:  # sell
            # Sell to the bids, starting with the highest
            for price in sorted(self.bids.keys(), reverse=True):
                if filled >= quantity:
                    break
                available = self.bids[price]
                to_take = min(available, quantity - filled)
                cost = to_take * price
                total_cost += cost
                filled += to_take
                orders.append((price, to_take))

        avg_price = total_cost / filled if filled > 0 else 0
        return {'orders': orders, 'filled': filled, 'avg_price': avg_price}

# Example Order Book
analyzer = MarketDepthAnalyzer()

example_bids = {149.90: 500, 149.85: 1000, 149.80: 800}
example_asks = {150.10: 300, 150.15: 700, 150.20: 400}

analyzer.update(example_bids, example_asks)

bb, ba = analyzer.get_best_prices()
print(f"Market: Best Bid = ${bb}, Best Ask = ${ba}")
print(f"Spread: ${analyzer.get_spread():.2f}")
print(f"Mid Price: ${analyzer.get_mid_price():.2f}")

# Simulate a market buy
buy_trade = analyzer.simulate_market_order(quantity=1000, side='buy')
print(f"\nBuy 1000 shares via market order:")
print(f"  Average price paid: ${buy_trade['avg_price']:.2f}")
print(f"  Slippage vs Mid: {((buy_trade['avg_price'] - analyzer.get_mid_price())/analyzer.get_mid_price()*100):.2f}%")

Seeing the order book explains why you might not get the price you see on screen. If the best ask is $150.10 but only for 300 shares, your 1000-share order will eat into higher-priced offers, raising your average cost. This is "slippage."
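Slippage also grows with order size, which a short sweep over the same example asks can show. This is a sketch under the assumption that the book does not change while the order fills. The helper `average_fill_price` is a hypothetical standalone version of the buy side of `simulate_market_order`.

```python
def average_fill_price(asks, quantity):
    """Walk the ask side from cheapest to dearest; return (avg_price, filled)."""
    remaining = quantity
    cost = 0.0
    for price in sorted(asks):
        take = min(asks[price], remaining)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    filled = quantity - remaining
    return (cost / filled if filled else None), filled

asks = {150.10: 300, 150.15: 700, 150.20: 400}  # Same asks as the example book
mid_price = 150.00                               # Mid of 149.90 bid and 150.10 ask

for qty in (100, 300, 1000, 1400, 2000):
    avg, filled = average_fill_price(asks, qty)
    slippage_bps = (avg - mid_price) / mid_price * 10_000
    note = "" if filled == qty else f" (only {filled} available)"
    print(f"{qty:5d} shares: avg ${avg:.3f}, slippage {slippage_bps:.1f} bps{note}")
```

Small orders pay only the half-spread, larger ones climb the book, and an order bigger than the visible depth cannot be filled completely at any displayed price.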

Putting It All Together

These techniques form a pipeline. You start with data, find signals, test them rigorously, determine how to size the position within a diversified portfolio, protect yourself with risk rules, execute the trade carefully, and understand the market mechanics as you do it.

Each step has its own pitfalls. Data can be messy. Indicators can give false signals. Backtests can be overly optimistic. Portfolios need rebalancing. Risk limits will feel restrictive when you're excited. Execution takes patience. The order book is unpredictable.

The code I've provided is a starting point. In practice, you'll need to adapt it—add more robust error handling, connect to live data feeds, integrate with a broker's API for real trading, and log every decision for review.

Remember, the goal isn't to find a magical winning strategy. It's to build a robust, disciplined process that manages risk and learns from the market over time. Start small, test thoroughly, and always respect the risk. The market is a demanding teacher, but a systematic approach is your best guide.
