Methodology - Bellatrix Capital

Pipeline Overview

Our prediction system follows a rigorous end-to-end pipeline that ensures data quality, feature relevance, model robustness, and prediction validation.

Data Ingestion

Real-time market data

Feature Engineering

100+ technical features

Model Inference

Ensemble of 5 ML models

Validation

Performance tracking

Data Sources & Quality

Quality predictions start with quality data. We aggregate historical and real-time price data from multiple exchanges to ensure accuracy and reliability.

Data Collection

Hourly OHLCV (Open, High, Low, Close, Volume) data for 20+ cryptocurrencies
Historical data spanning multiple market cycles for robust training
Real-time price feeds for timely predictions
Fundamental data including FOMC decisions and macro indicators

Quality Assurance

Automated data validation to detect anomalies and gaps
Cross-exchange verification for price accuracy
Outlier detection and handling protocols
Missing data imputation using robust statistical methods
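
As an illustration of the kind of checks listed above, here is a minimal sketch of gap and outlier detection on hourly candles. The function names, the one-hour step, and the median-absolute-deviation rule with a threshold of 5 are assumptions made for this example, not a description of the production checks.

```python
from statistics import median

HOUR = 3600  # seconds between hourly candles (assumed Unix timestamps)


def find_gaps(timestamps, step=HOUR):
    """Return (previous, next) timestamp pairs that are more than one step apart."""
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > step]


def flag_outliers(prices, threshold=5.0):
    """Flag prices whose distance from the median exceeds threshold * MAD.

    MAD is the median absolute deviation; the threshold is illustrative.
    """
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        # All deviations are (almost) identical; nothing can be flagged.
        return [False] * len(prices)
    return [abs(p - med) / mad > threshold for p in prices]


if __name__ == "__main__":
    print(find_gaps([0, 3600, 7200, 14400]))
    print(flag_outliers([100, 101, 99, 100, 250, 102]))
```

A median-based rule is used here because a single bad print barely moves the median, whereas it would inflate a mean and standard deviation.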

Feature Engineering

We engineer over 100 features from raw price data, combining traditional technical indicators with custom-designed predictive features.

Technical Indicators

Trend: Ichimoku Cloud (Tenkan, Kijun, Senkou Span A/B), Moving Averages (SMA, EMA)
Momentum: RSI, MACD, Stochastic Oscillator, Williams %R
Volatility: Bollinger Bands, ATR, Keltner Channels, Standard Deviation
Volume: OBV, Volume Profile, Volume-Price Trend
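
To make two of the indicators above concrete, the sketch below computes RSI with Wilder's smoothing and Bollinger Bands from a list of closing prices. The function names and default periods (14 for RSI, 20 bars and 2 standard deviations for the bands) are common conventions assumed for this example; the parameters actually used in the pipeline are not stated here.

```python
from statistics import mean, pstdev


def rsi(closes, period=14):
    """Relative Strength Index of the latest bar, using Wilder's smoothing."""
    if len(closes) <= period:
        raise ValueError("need more than `period` closes")
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    # Seed with a simple average, then smooth the remaining bars.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def bollinger(closes, period=20, k=2.0):
    """Return (lower, middle, upper) bands over the last `period` closes."""
    if len(closes) < period:
        raise ValueError("need at least `period` closes")
    window = closes[-period:]
    mid = mean(window)
    sd = pstdev(window)  # population standard deviation of the window
    return mid - k * sd, mid, mid + k * sd
```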

Advanced Features

Pattern Detection: Volatility squeeze (BB inside Keltner), momentum expansion signals
Divergence Analysis: RSI divergence, MACD histogram divergence
Periodic Patterns: Day-of-week effects, hour-of-day seasonality
Cross-Asset Features: BTC correlation, sector momentum
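
The volatility squeeze mentioned above (Bollinger Bands sitting inside the Keltner Channel) can be sketched as follows. This is a simplified version under stated assumptions: both envelopes are centered on a simple moving average, the Keltner width is 1.5 times a simple average true range, and the function names are hypothetical. Other definitions use an EMA midline or different multipliers.

```python
from statistics import mean, pstdev


def true_ranges(highs, lows, closes):
    """True range per bar, starting from the second bar."""
    trs = []
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        trs.append(max(highs[i] - lows[i],
                       abs(highs[i] - prev_close),
                       abs(lows[i] - prev_close)))
    return trs


def is_squeeze(highs, lows, closes, period=20, bb_k=2.0, kc_mult=1.5):
    """True when the Bollinger Bands lie strictly inside the Keltner Channel."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 bars")
    window = closes[-period:]
    mid = mean(window)
    sd = pstdev(window)
    atr = mean(true_ranges(highs, lows, closes)[-period:])
    bb_lower, bb_upper = mid - bb_k * sd, mid + bb_k * sd
    kc_lower, kc_upper = mid - kc_mult * atr, mid + kc_mult * atr
    return bb_lower > kc_lower and bb_upper < kc_upper
```

Intuitively, the flag is on when closing prices cluster tightly relative to the typical bar range, which some traders read as a precursor to a larger move.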

Feature Selection

Not all features are created equal. We use importance ranking from gradient boosting models and recursive feature elimination to identify the most predictive features, reducing noise and improving generalization.
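
The recursive elimination loop described above can be sketched independently of any particular model. In this hypothetical example, `importance_fn` stands in for "refit a gradient boosting model on the surviving features and return its importance scores"; the function name, its signature, and the feature names in the usage are all assumptions for illustration.

```python
def recursive_elimination(features, importance_fn, keep):
    """Drop the least important feature, refit, and repeat until `keep` remain.

    importance_fn(selected) must return a dict mapping each selected
    feature name to an importance score (higher means more useful).
    """
    selected = list(features)
    while len(selected) > keep:
        scores = importance_fn(selected)  # stands in for a model refit
        weakest = min(selected, key=lambda f: scores[f])
        selected.remove(weakest)
    return selected


if __name__ == "__main__":
    # Toy importances; a real run would recompute them after every refit.
    fixed = {"rsi": 0.30, "macd": 0.25, "noise_a": 0.01, "atr": 0.20, "noise_b": 0.02}
    print(recursive_elimination(list(fixed), lambda sel: {f: fixed[f] for f in sel}, keep=3))
```

Recomputing importances after each removal matters because correlated features share importance; once one of them is dropped, the scores of the others can change.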

Model Architecture

We deploy an ensemble of five distinct model architectures, each with unique strengths for capturing different patterns in market data.

XGBoost

Our primary model uses extreme gradient boosting with careful hyperparameter tuning. XGBoost excels at capturing non-linear relationships between features and handles missing values gracefully.

LightGBM

Microsoft's gradient boosting framework trains faster and often produces predictions that complement XGBoost's. It is particularly effective on high-cardinality features.

Deep Neural Network (DNN)

A multi-layer feedforward network with dropout regularization captures complex feature interactions that tree-based models might miss.

LSTM (Long Short-Term Memory)

Our sequence model uses a sliding window of historical data to capture temporal dependencies and momentum patterns in price movements.
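
A sliding window turns one long series into many (input sequence, target) pairs for a sequence model. The sketch below shows the idea on a plain list; the function name, window length, and one-step horizon are assumptions for this example, and the actual window size is not stated here.

```python
def make_windows(series, window, horizon=1):
    """Build (inputs, targets) where each input is `window` consecutive values
    and the target is the value `horizon` steps after the window ends."""
    inputs, targets = [], []
    for end in range(window, len(series) - horizon + 1):
        inputs.append(series[end - window:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


if __name__ == "__main__":
    X, y = make_windows([1, 2, 3, 4, 5], window=3)
    print(X, y)
```

Each target lies strictly after its input window, so no future value leaks into the inputs.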

Graph Neural Network (GNN)

An experimental model that learns correlations between assets, capturing how movements in major cryptocurrencies influence altcoins.

Ensemble Strategy

We combine predictions from all five models using a weighted ensemble approach. Weights are dynamically adjusted based on recent performance, giving more influence to models that have been accurate in current market conditions.
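
One simple way to weight models by recent performance is to make each weight proportional to the inverse of that model's recent mean absolute error. The sketch below illustrates that scheme only; the weighting rule, the function names, and the model keys are assumptions for this example rather than the formula used in production.

```python
def performance_weights(recent_errors, eps=1e-9):
    """Map each model to a weight proportional to 1 / (recent mean absolute error).

    recent_errors: dict of model name -> list of recent prediction errors.
    eps avoids division by zero for a model with no recent error.
    """
    inverse = {
        model: 1.0 / (sum(abs(e) for e in errors) / len(errors) + eps)
        for model, errors in recent_errors.items()
    }
    total = sum(inverse.values())
    return {model: value / total for model, value in inverse.items()}


def ensemble_predict(predictions, weights):
    """Weighted average of per-model predictions."""
    return sum(weights[model] * p for model, p in predictions.items())


if __name__ == "__main__":
    w = performance_weights({"xgboost": [0.01, 0.01], "lstm": [0.02, 0.02]})
    print(w, ensemble_predict({"xgboost": 0.03, "lstm": 0.0}, w))
```

Because the weights are normalized to sum to one, the combined prediction always stays within the range of the individual model predictions.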

Training & Validation

Rigorous training and validation procedures ensure our models generalize to unseen data and don't overfit to historical patterns.

Walk-Forward Validation

We use time-series cross-validation with a walk-forward approach. Models are trained on historical data and validated on subsequent periods, mimicking real-world deployment conditions.
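
The walk-forward idea can be sketched as an index generator: each fold trains on data up to a cutoff and tests on the block immediately after it, then the cutoff moves forward. The function name, fold sizes, and the expanding versus rolling option are assumptions for this example.

```python
def walk_forward_splits(n_samples, train_size, test_size, expanding=True):
    """Return a list of (train_indices, test_indices) ranges in time order.

    expanding=True grows the training set from the start of the data;
    expanding=False keeps a rolling window of `train_size` samples.
    """
    splits = []
    train_end = train_size
    while train_end + test_size <= n_samples:
        train_start = 0 if expanding else train_end - train_size
        splits.append((range(train_start, train_end),
                       range(train_end, train_end + test_size)))
        train_end += test_size  # step forward by one test block
    return splits


if __name__ == "__main__":
    for train, test in walk_forward_splits(10, train_size=4, test_size=2):
        print(list(train), list(test))
```

Every test index is later than every training index in the same fold, which is what distinguishes this from shuffled k-fold cross-validation.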

Train/Validation/Test Split

Training: Historical data for model fitting, kept separate from validation and test data to avoid leakage
Validation: Recent historical data for hyperparameter tuning
Test: Holdout data never seen during training for final evaluation

Preventing Overfitting

Early stopping based on validation loss
L2 regularization in neural networks
Tree depth limits and minimum samples per leaf in boosting models
Feature importance pruning to remove noisy predictors
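
Early stopping on validation loss, the first item above, can be sketched as a small helper that scans per-epoch validation losses and reports the epoch to keep. The function name, the patience of 3 epochs, and the `min_delta` parameter are assumptions for this example.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the best epoch, stopping the scan once the loss
    has failed to improve by more than `min_delta` for `patience` epochs."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch


if __name__ == "__main__":
    print(early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6]))
```

In the usage above, training would stop before the final value of 0.6 is ever observed, which is the intended trade-off: a few possibly useful epochs are given up in exchange for not fitting noise.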

Prediction Validation

Every prediction is timestamped and saved before its target time. Once the prediction window closes, we compare each prediction against the actual outcome.

Metrics We Track

Direction Accuracy: Did we correctly predict up/down movement?
Mean Absolute Error: Average magnitude of prediction errors
Hit Rate by Confidence: Accuracy stratified by prediction confidence
Model-Specific Performance: Individual model accuracy tracking
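
The first three metrics above can be sketched in a few lines. This example assumes predictions and outcomes are expressed as signed returns (positive means up) and that confidence is a number between 0 and 1; the function names and the bucket edges of 0.6 and 0.8 are illustrative assumptions.

```python
def direction_accuracy(predicted, actual):
    """Share of predictions whose sign matches the realized move.

    A value of exactly zero is treated as "not up" on both sides.
    """
    hits = sum((p > 0) == (a > 0) for p, a in zip(predicted, actual))
    return hits / len(actual)


def mean_absolute_error(predicted, actual):
    """Average magnitude of the prediction error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def hit_rate_by_confidence(predicted, actual, confidence, edges=(0.6, 0.8)):
    """Direction accuracy per confidence bucket; empty buckets are omitted."""
    labels = ("low", "mid", "high")
    hits = {label: [] for label in labels}
    for p, a, c in zip(predicted, actual, confidence):
        bucket = labels[sum(c >= e for e in edges)]
        hits[bucket].append((p > 0) == (a > 0))
    return {label: sum(h) / len(h) for label, h in hits.items() if h}
```

If higher-confidence buckets do not show higher hit rates, the confidence score is not carrying useful information, which is why the stratified view is worth tracking alongside the overall accuracy.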

Transparency Commitment

All prediction outcomes are published on our Performance page. We believe in full transparency: both successes and failures are documented and analyzed.

Important Disclaimer

Past performance does not guarantee future results. Cryptocurrency markets are highly volatile and unpredictable. Our predictions are probabilistic estimates, not financial advice. Always do your own research and never invest more than you can afford to lose.

Continuous Improvement

Markets evolve, and our models must adapt. We employ a continuous improvement process to maintain prediction quality.

Model Retraining

Weekly model retraining with latest data
Regime detection to identify market condition changes
Automatic performance degradation alerts
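
A performance degradation alert can be as simple as a rolling hit-rate check. The sketch below is one hypothetical way to do it: the class name, the 100-prediction window, and the 0.52 accuracy floor are assumptions for this example, not the thresholds used in production.

```python
from collections import deque


class DegradationMonitor:
    """Raise an alert when rolling direction accuracy falls below a floor."""

    def __init__(self, window=100, min_accuracy=0.52):
        self.hits = deque(maxlen=window)  # keeps only the latest outcomes
        self.min_accuracy = min_accuracy

    def record(self, hit):
        """Store one outcome; return True if an alert should fire."""
        self.hits.append(bool(hit))
        if len(self.hits) < self.hits.maxlen:
            return False  # not enough history to judge yet
        return sum(self.hits) / len(self.hits) < self.min_accuracy


if __name__ == "__main__":
    monitor = DegradationMonitor(window=4, min_accuracy=0.5)
    print([monitor.record(h) for h in [True, True, False, False, False]])
```

Waiting until the window is full avoids alerts driven by a handful of early misses.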

Research & Development

Ongoing feature engineering experimentation
New model architecture testing (transformers, attention mechanisms)
Alternative data source integration (sentiment, on-chain metrics)