Queue Entropy Analysis

Queue-based concurrent system for Shannon entropy analysis of trader behavior patterns. Functional tests and synthetic market simulations pass locally. Performance claims (5M packets/sec) are extrapolated from micro-benchmarks and require dedicated validation.

Summary

Queue Entropy Analysis provides a working C++17 implementation of Shannon entropy calculation and a concurrent pipeline with tested functional correctness on synthetic data. The repository includes unit tests, edge case coverage, and market simulations. Performance numbers reported (5M packets/sec, sub-millisecond latency) are extrapolated from a short synthetic micro-benchmark and should not be treated as proof of sustained production throughput without dedicated benchmarking.

Key Achievements

Verified implementations:

  • Shannon Entropy Calculation: Correct implementation matching theoretical expectations
  • Test Coverage: 23 tests (5 queue edge cases, 6 entropy edge cases, 6 pipeline edge cases, 6 market simulations) all passing on this machine
  • Concurrent Pipeline: Working producer-consumer architecture with backpressure
  • Code Quality: Fixes applied during validation (latency metric calculation, test assertion typo)

Actual performance characteristics:

  • Synthetic micro-benchmark: A 5k-event HFT simulation extrapolates to multi-million ops/sec on this machine; this is not a sustained throughput measurement
  • Queue implementation: Mutex-based (ConcurrentQueue) or hybrid with atomics (OptimizedQueue); NOT lock-free.
  • Testing: All tests use synthetic/simulated data; zero real market data validation

Technical Implementation

Core Concurrent Queue:

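#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class ConcurrentQueue {
public:
    // Enqueue a copy of value and wake one waiting consumer.
    void push(const T& value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push(value);
        m_cond_var.notify_one();
    }

    // Non-blocking pop: returns false immediately if the queue is empty.
    bool try_pop(T& result) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_queue.empty()) return false;
        result = std::move(m_queue.front());
        m_queue.pop();
        return true;
    }

private:
    // Member declarations are not shown in the excerpt; types are inferred from usage.
    std::queue<T> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_cond_var;
};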
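
The excerpt above shows a non-blocking `try_pop` but not how backpressure is applied. One common approach, sketched here on the assumption of a fixed capacity, is to make `push` block while the queue is full. `BoundedQueue` and `run_pipeline` are hypothetical names used for illustration and are not taken from the repository.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical bounded queue: push() blocks while the queue is full,
// which slows producers down to the consumer's pace (backpressure).
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : m_capacity(capacity) {}

    void push(const T& value) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_not_full.wait(lock, [this] { return m_queue.size() < m_capacity; });
        m_queue.push(value);
        m_not_empty.notify_one();
    }

    // Blocking pop: waits until an item is available.
    T pop() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_not_empty.wait(lock, [this] { return !m_queue.empty(); });
        T value = m_queue.front();
        m_queue.pop();
        m_not_full.notify_one();
        return value;
    }

private:
    std::size_t m_capacity;
    std::queue<T> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_not_full;
    std::condition_variable m_not_empty;
};

// Runs one producer and one consumer over the integers 1..item_count
// and returns the sum the consumer observed.
long long run_pipeline(int item_count, std::size_t capacity) {
    BoundedQueue<int> queue(capacity);
    long long sum = 0;  // written only by the consumer, read after join()
    std::thread producer([&] {
        for (int i = 1; i <= item_count; ++i) queue.push(i);
    });
    std::thread consumer([&] {
        for (int i = 0; i < item_count; ++i) sum += queue.pop();
    });
    producer.join();
    consumer.join();
    return sum;
}
```

With a small capacity the producer spends most of its time waiting in `push`, which is the intended effect. The capacity must be greater than zero, otherwise `push` never returns.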

Real-Time Entropy Calculation:

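// Shannon entropy H = -sum(p_i * log2(p_i)) of the observed actions, in bits.
double EntropyCalculator::calculate_entropy(const std::vector<TraderAction>& actions) {
    if (actions.empty()) return 0.0;  // no observations: entropy is defined as 0

    // Count occurrences of each distinct action.
    std::map<TraderAction, std::size_t> counts;
    for (const auto& action : actions) {
        ++counts[action];
    }

    double entropy = 0.0;
    const double total = static_cast<double>(actions.size());
    for (const auto& [action, count] : counts) {
        // Every stored count is at least 1, so p > 0 and log2(p) is finite.
        const double p = static_cast<double>(count) / total;
        entropy -= p * std::log2(p);
    }
    return entropy;
}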
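
To see the bounds of this calculation, the following standalone sketch reimplements the same counting approach as a free function and feeds it two extreme inputs. The `TraderAction` enumerators (`Buy`, `Sell`, `Hold`) and the helper `make_sample` are assumptions made for illustration; the repository's actual definitions are not shown here.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Assumed three-valued action type; the real enum may differ.
enum class TraderAction { Buy, Sell, Hold };

// Shannon entropy in bits of the empirical action distribution.
double shannon_entropy(const std::vector<TraderAction>& actions) {
    if (actions.empty()) return 0.0;
    std::map<TraderAction, std::size_t> counts;
    for (const auto& action : actions) ++counts[action];

    const double total = static_cast<double>(actions.size());
    double entropy = 0.0;
    for (const auto& entry : counts) {
        const double p = static_cast<double>(entry.second) / total;
        entropy -= p * std::log2(p);
    }
    return entropy;
}

// Hypothetical helper: builds a sample with the given number of each action.
std::vector<TraderAction> make_sample(int buys, int sells, int holds) {
    std::vector<TraderAction> sample;
    sample.insert(sample.end(), buys, TraderAction::Buy);
    sample.insert(sample.end(), sells, TraderAction::Sell);
    sample.insert(sample.end(), holds, TraderAction::Hold);
    return sample;
}
```

A sample with equal counts of all three actions, such as `make_sample(10, 10, 10)`, gives log2(3), about 1.585 bits, which is the maximum for three actions. A sample containing a single action, such as `make_sample(0, 30, 0)`, gives 0 bits.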

Validation Results

Test Coverage (on this machine, synthetic data):

  • Edge Case Testing: 17/17 tests passed (queue, entropy, pipeline edge cases)
  • Market Simulation: 6/6 synthetic scenarios executed (bull, bear, crash, normal, HFT, recovery)
  • Mathematical Accuracy: Shannon entropy implementation is correct; calculations match theoretical expectations
  • Performance Note: HFT micro-benchmark extrapolates to multi-million packets/sec but requires dedicated benchmarking for sustained claims

Market Behavior Patterns (synthetic simulations only):

  • Market Crashes (synthetic): Low entropy (0.25-0.40 bits) with strong sell dominance in test data
  • Normal Trading (synthetic): High entropy (1.54 bits) with a balanced action distribution in test data
  • Bull Markets (synthetic): High entropy (1.39 bits) with more buys than sells in test data
  • Bear Markets (synthetic): High entropy (1.27 bits) with more sells than buys in test data
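
For intuition about how skewed a distribution must be to land in these ranges, the sketch below computes entropy directly from a probability vector. The two mixes are hypothetical illustrations in the assumed order {buy, sell, hold}; they are not the distributions used by the repository's simulations.

```cpp
#include <cmath>
#include <vector>

// Entropy in bits of a probability vector.
// Zero-probability entries contribute nothing to the sum.
double entropy_bits(const std::vector<double>& probabilities) {
    double h = 0.0;
    for (double p : probabilities) {
        if (p > 0.0) h -= p * std::log2(p);
    }
    return h;
}

// Hypothetical action mixes, ordered {buy, sell, hold}.
const std::vector<double> kCrashLikeMix = {0.025, 0.95, 0.025};  // 95% sells
const std::vector<double> kBuyHeavyMix  = {0.50, 0.30, 0.20};    // buys lead
```

Under these assumed mixes, `entropy_bits(kCrashLikeMix)` is roughly 0.34 bits, inside the 0.25-0.40 range listed for synthetic crashes, while `entropy_bits(kBuyHeavyMix)` is roughly 1.49 bits. Entropy alone does not say which action dominates, so the buy or sell share has to be reported alongside it.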

Research Applications

Potential future uses (pending real-world validation):

  • Behavioral analysis: Entropy quantifies trading action diversity; currently tested on synthetic data
  • Market pattern recognition: Simulations show low entropy correlates with panic selling, high entropy with diverse behavior
  • Queue performance: The pipeline handles simulated high-frequency workloads; real-world throughput untested
  • Research foundation: Provides a working codebase for entropy-based market analysis experiments

Technical Specifications (accurate)

  • Language: C++17 with standard library features
  • Queue Type: Mutex-based (ConcurrentQueue) or hybrid with atomics (OptimizedQueue). NOT lock-free.
  • Entropy Calculation: Correct Shannon entropy formula H = -Σ(p_i * log2(p_i))
  • Memory Model: Uses std::mutex for synchronization; atomics for some metrics
  • Thread Safety: Thread-safe via mutexes and condition variables
  • Performance Claims: Micro-benchmark extrapolation only; sustained throughput unvalidated
  • Entropy Range: 0.0 to ~1.585 bits (theoretical max for 3 discrete actions)

Current Status

What works:

  • Shannon entropy calculation is mathematically correct
  • All provided unit and simulation tests pass on this machine
  • Concurrent pipeline executes successfully with synthetic workloads
  • Code compiles cleanly with C++17

What is unvalidated:

  • Real market data integration (tests use synthetic data only)
  • Sustained production throughput (micro-benchmark is not a sustained workload)
  • Lock-free performance claims (implementation uses mutexes, not lock-free)
  • Real-world applicability of entropy-volatility correlation

Recommended next steps:

  • Add a proper benchmark harness with sustained load testing
  • Validate on real market data
  • Evaluate or implement a true lock-free MPMC queue if required
  • Profile and optimize under realistic conditions
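
As a starting point for the first step, a sustained measurement could run a producer and a consumer for a fixed wall-clock duration and count completed items, rather than extrapolating from a few thousand events. The following is a minimal sketch under that assumption; `ThroughputResult`, `measure_throughput`, and the bound of 1024 items are hypothetical, and it uses a plain mutex-protected `std::queue` in place of the repository's queue classes.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>

struct ThroughputResult {
    std::uint64_t items_consumed;
    double seconds;
    double items_per_second() const {
        return seconds > 0.0 ? static_cast<double>(items_consumed) / seconds : 0.0;
    }
};

// Runs one producer and one consumer for roughly `duration` and reports
// how many items the consumer actually popped in the measured interval.
ThroughputResult measure_throughput(std::chrono::milliseconds duration) {
    std::queue<std::uint64_t> queue;
    std::mutex mutex;
    std::atomic<bool> stop{false};
    std::uint64_t consumed = 0;  // written only by the consumer under the mutex

    const auto start = std::chrono::steady_clock::now();

    std::thread producer([&] {
        std::uint64_t value = 0;
        while (!stop.load()) {
            std::lock_guard<std::mutex> lock(mutex);
            if (queue.size() < 1024) queue.push(value++);  // crude bound on backlog
        }
    });
    std::thread consumer([&] {
        while (true) {
            std::lock_guard<std::mutex> lock(mutex);
            if (!queue.empty()) {
                queue.pop();
                ++consumed;
            } else if (stop.load()) {
                break;  // stop requested and nothing observed in the queue
            }
        }
    });

    std::this_thread::sleep_for(duration);
    stop.store(true);
    producer.join();
    consumer.join();

    const auto end = std::chrono::steady_clock::now();
    const double seconds = std::chrono::duration<double>(end - start).count();
    return {consumed, seconds};
}
```

A call such as `measure_throughput(std::chrono::seconds(60)).items_per_second()` then reports an observed rate over a full minute. Both threads busy-wait on the lock here, so the numbers reflect contention on this toy queue only; a real harness would also pin threads, warm up, and repeat runs.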

Repository Structure

Queue/
├── include/                    # Header files
├── src/                       # Source files
├── tests/                     # Test suites
├── CMakeLists.txt            # CMake build configuration
├── Makefile                  # Build automation
└── README.md                 # Project documentation

Build and Usage

Quick Start:

make all
make test
make perf

Manual Compilation:

g++ -std=c++17 -I include -pthread -O3 -o market_entropy_analyzer src/*.cpp