================
Chord Detection
================

The chord detection pipeline transforms symbolic note events from Basic Pitch into meaningful chord progressions through multiple processing stages.

Pipeline Overview
=================

.. code-block:: text

    Audio Input → Note Detection → Chord Assembly → Post-Processing → Final Output
         ↓              ↓                ↓                ↓              ↓
      WAV/MP3    Basic Pitch ML   Pitch Classes    Smoothing    Chord Symbols
                   Note Events     Clustering      Confidence      + Timing

Note Detection Stage
====================

Basic Pitch Integration
-----------------------

The pipeline uses Spotify's Basic Pitch model, a compact convolutional architecture designed for automatic music transcription in low-resource settings:

**Model Operation:**

* Operates on a constant-Q transform (CQT) with three bins per semitone
* Forms an approximation of a harmonic CQT by vertically shifting the spectrogram
* Produces three time-frequency maps: onsets, sustained notes, and multi-pitch activity
* Extracts onset peaks and matches them to sustained-note activity
* Removes notes shorter than ~120 ms

**Processing Steps:**

1. Audio preprocessing by Basic Pitch model
2. CQT feature extraction with harmonic alignment
3. Model inference producing time-frequency maps
4. Note event extraction with onset/pitch/duration
5. Post-processing to remove brief notes

**Output Format:**

.. code-block:: python

    {
        "onset": 1.024,      # Start time in seconds
        "offset": 1.536,     # End time in seconds  
        "pitch": 64,         # MIDI note number
        "confidence": 0.92   # Detection confidence
    }
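
The last processing step, dropping very short notes, can be sketched against this event format. This is a minimal illustration rather than Basic Pitch's own code: the helper name ``drop_short_notes`` and its ``min_duration`` parameter are assumptions, and the 0.12 s default simply mirrors the ~120 ms figure quoted above.

.. code-block:: python

    def drop_short_notes(notes, min_duration=0.12):
        """Keep only note events lasting at least min_duration seconds.

        Hypothetical helper: each note is assumed to be a dict with
        "onset" and "offset" keys, as in the output format above.
        """
        return [n for n in notes if n["offset"] - n["onset"] >= min_duration]


    notes = [
        {"onset": 1.024, "offset": 1.536, "pitch": 64, "confidence": 0.92},
        {"onset": 1.600, "offset": 1.650, "pitch": 67, "confidence": 0.40},
    ]
    kept = drop_short_notes(notes)  # Only the first (0.512 s) note survives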

Chord Assembly Algorithm
========================

Once symbolic note events arrive from the server, chord assembly proceeds in four steps: time window clustering, pitch class extraction, root detection, and chord quality detection.

Time Window Clustering
----------------------

Notes whose onsets fall close together are treated as one chord candidate. The steps applied to each window are summarized below:

.. code-block:: python

    # Notes are grouped into time windows based on temporal proximity
    # Root candidates are tested against pitch class templates
    # Chord types from Dictionary are matched against detected pitch classes
    # Median filtering with window size 3 is applied for smoothing
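
Grouping by temporal proximity could look like the sketch below. It assumes note events are dicts in the output format shown earlier; the function name ``group_notes_by_onset`` and the 50 ms ``max_gap`` threshold are illustrative choices, not documented pipeline settings.

.. code-block:: python

    def group_notes_by_onset(notes, max_gap=0.05):
        """Group notes whose onsets lie within max_gap seconds of the previous one.

        Hypothetical helper; max_gap is an illustrative value.
        """
        groups = []
        for note in sorted(notes, key=lambda n: n["onset"]):
            # Compare against the most recent note in the current group
            if groups and note["onset"] - groups[-1][-1]["onset"] <= max_gap:
                groups[-1].append(note)
            else:
                groups.append([note])
        return groups


    notes = [
        {"onset": 0.00, "offset": 0.50, "pitch": 60, "confidence": 0.9},
        {"onset": 0.01, "offset": 0.50, "pitch": 64, "confidence": 0.9},
        {"onset": 0.02, "offset": 0.50, "pitch": 67, "confidence": 0.8},
        {"onset": 1.00, "offset": 1.50, "pitch": 65, "confidence": 0.9},
    ]
    groups = group_notes_by_onset(notes)  # Two groups: [60, 64, 67] and [65]

Chaining on the previous onset lets a slightly rolled or strummed chord stay in one group. A fixed-width window would be an equally reasonable choice.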

Pitch Class Extraction
----------------------

Convert MIDI notes to pitch classes (0-11):

.. code-block:: python

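    def extract_pitch_classes(notes):
        """Accumulate a confidence- and duration-weighted pitch class histogram."""
        pitch_classes = {}
        for note in notes:
            pc = note.pitch % 12  # Fold the MIDI note number into 0-11
            # Note events carry onset/offset rather than a duration field
            duration = note.offset - note.onset
            weight = note.confidence * duration
            pitch_classes[pc] = pitch_classes.get(pc, 0) + weight
        return pitch_classes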

Root Detection
--------------

Identify the most likely root note:

**Algorithm:**

1. **Bass Note Priority**: Lowest note gets extra weight
2. **Pitch Class Histogram**: Count occurrences weighted by confidence
3. **Harmonic Template Matching**: Compare against known chord patterns
4. **Context Awareness**: Consider previous/next chords

.. code-block:: python

    def detect_root(pitch_classes, bass_note=None):
        """Detect chord root from pitch classes."""
        candidates = []
        
        for root in range(12):
            score = 0
            # Score the candidate against a major triad only (simplified)
            if root in pitch_classes:
                score += pitch_classes[root] * 2  # Root weight
            if (root + 4) % 12 in pitch_classes:
                score += pitch_classes[(root + 4) % 12]  # Major third
            if (root + 7) % 12 in pitch_classes:
                score += pitch_classes[(root + 7) % 12]  # Fifth
                
            candidates.append((root, score))
            
        # Bass note bonus
        if bass_note is not None:
            bass_pc = bass_note % 12
            for i, (root, score) in enumerate(candidates):
                if root == bass_pc:
                    candidates[i] = (root, score * 1.5)
                    
        return max(candidates, key=lambda x: x[1])[0]

Chord Quality Detection
-----------------------

Determine chord type from pitch classes:

**Chord Templates:**

.. code-block:: python

    CHORD_TEMPLATES = {
        'major': [0, 4, 7],
        'minor': [0, 3, 7],
        'dim': [0, 3, 6],
        'aug': [0, 4, 8],
        'maj7': [0, 4, 7, 11],
        'dom7': [0, 4, 7, 10],
        'min7': [0, 3, 7, 10],
        'dim7': [0, 3, 6, 9],
        # ... more templates
    }

**Matching Algorithm:**

.. code-block:: python

    def detect_chord_quality(pitch_classes, root):
        """Match pitch classes against chord templates."""
        best_match = None
        best_score = 0
        
        # Transpose pitch classes relative to root
        relative_pcs = transpose_pitch_classes(pitch_classes, -root)
        
        for chord_type, template in CHORD_TEMPLATES.items():
            score = calculate_template_match(relative_pcs, template)
            if score > best_score:
                best_score = score
                best_match = chord_type
                
        return best_match, best_score
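
The matching function relies on two helpers, ``transpose_pitch_classes`` and ``calculate_template_match``, that are not defined in this section. The sketch below shows one plausible way to write them. The scoring rule (share of weight on template tones, scaled by how many template tones are present) is an assumption, not the pipeline's confirmed formula.

.. code-block:: python

    def transpose_pitch_classes(pitch_classes, interval):
        """Shift every pitch class by interval semitones, wrapping modulo 12."""
        return {(pc + interval) % 12: w for pc, w in pitch_classes.items()}


    def calculate_template_match(relative_pcs, template):
        """Score how well weighted pitch classes fit a chord template.

        Assumed rule: fraction of total weight on template tones,
        multiplied by the fraction of template tones actually present.
        """
        total = sum(relative_pcs.values())
        if total == 0:
            return 0.0
        in_template = sum(relative_pcs.get(pc, 0.0) for pc in template)
        coverage = sum(1 for pc in template if pc in relative_pcs) / len(template)
        return (in_template / total) * coverage


    # E, G, B expressed relative to the root E (pitch class 4)
    rel = transpose_pitch_classes({4: 1.0, 7: 0.8, 11: 0.9}, -4)
    minor_score = calculate_template_match(rel, [0, 3, 7])
    major_score = calculate_template_match(rel, [0, 4, 7])

The coverage factor keeps a plain triad from scoring as well on a seventh-chord template as on its own, since the missing seventh lowers the result.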

Post-Processing
===============

Temporal Smoothing
------------------

A median-style filter with window size 3 removes brief fluctuations in chord detection. Because chord labels are categorical, the filter is implemented as a weighted vote over each window rather than a numeric median:

.. code-block:: python

    def smooth_chords(chords, window_size=3):
        """Smooth a chord sequence with a sliding weighted vote.

        Returns one (root, quality) tuple per input chord.
        """
        smoothed = []
        half = window_size // 2

        for i in range(len(chords)):
            # The window is truncated at the sequence boundaries
            window = chords[max(0, i - half):i + half + 1]

            # Each chord votes with weight confidence * duration
            chord_votes = {}
            for chord in window:
                key = (chord.root, chord.quality)
                weight = chord.confidence * chord.duration
                chord_votes[key] = chord_votes.get(key, 0) + weight

            # Select the label with the largest total weight
            smoothed.append(max(chord_votes, key=chord_votes.get))

        return smoothed

Confidence Scoring
------------------

Calculate overall confidence for detected chords:

**Factors:**

1. **Note Detection Confidence**: Average confidence of constituent notes
2. **Template Match Score**: How well pitch classes match chord template
3. **Temporal Stability**: Consistency with neighboring chords
4. **Harmonic Context**: Likelihood given key and progression

.. code-block:: python

    import numpy as np

    def calculate_chord_confidence(chord, notes, context):
        """Calculate confidence score for detected chord."""
        # Note confidence
        note_conf = np.mean([n.confidence for n in notes])
        
        # Template match confidence
        template_conf = chord.template_score
        
        # Temporal stability
        stability = calculate_stability(chord, context.previous, context.next)
        
        # Harmonic likelihood
        harmonic_conf = calculate_harmonic_likelihood(chord, context.key)
        
        # Weighted average
        weights = [0.3, 0.3, 0.2, 0.2]
        scores = [note_conf, template_conf, stability, harmonic_conf]
        
        return np.average(scores, weights=weights)

Chord Filtering
---------------

Remove low-confidence or spurious detections:

.. code-block:: python

    def filter_chords(chords, min_confidence=0.5, min_duration=0.1):
        """Filter out low-quality chord detections."""
        filtered = []
        
        for chord in chords:
            if (chord.confidence >= min_confidence
                    and chord.duration >= min_duration):
                filtered.append(chord)
            elif filtered and chord.duration < min_duration:
                # Extend previous chord
                filtered[-1].end_time = chord.end_time
                
        return filtered

Advanced Techniques
===================

Jazz Chord Extensions
---------------------

Detect extended and altered chords:

* 9ths, 11ths, 13ths
* Altered tensions (♭9, ♯11, etc.)
* Slash chords (inversions)
* Polychords
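
Slash chords, for instance, can be labelled by comparing the bass pitch class with the detected root. The sketch below is illustrative only: ``NOTE_NAMES``, ``QUALITY_SUFFIX``, and ``chord_symbol`` are hypothetical names, and the sharps-only spelling is a simplification.

.. code-block:: python

    # Sharps-only spelling (an assumption; enharmonic spelling is ignored)
    NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
    QUALITY_SUFFIX = {
        "major": "", "minor": "m", "dim": "dim", "aug": "aug",
        "maj7": "maj7", "dom7": "7", "min7": "m7", "dim7": "dim7",
    }


    def chord_symbol(root, quality, bass_note=None):
        """Build a chord symbol, appending "/bass" when the bass is not the root."""
        symbol = NOTE_NAMES[root % 12] + QUALITY_SUFFIX[quality]
        if bass_note is not None and bass_note % 12 != root % 12:
            symbol += "/" + NOTE_NAMES[bass_note % 12]
        return symbol


    first_inversion = chord_symbol(0, "major", bass_note=52)  # "C/E"
    root_position = chord_symbol(7, "dom7", bass_note=43)     # "G7"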

Borrowed Chords
---------------

Identify chords from parallel modes:

* Modal interchange
* Secondary dominants
* Neapolitan chords
* Augmented sixth chords

Voice Leading Analysis
----------------------

Track individual voice movements:

* Smooth voice leading detection
* Parallel motion identification
* Contrary motion analysis

Performance Metrics
===================

The chord detection pipeline includes:

* **Server Processing**: Advanced chord inference with Krumhansl-Schmuckler key detection
* **Three Analysis Modes**: HTTP Server (primary), CoreML Local, Simulation
* **Robust Audio Loading**: Multiple fallback methods to handle various formats
* **Audio Preprocessing**: Padding and format conversion as needed
* **Median Filtering**: Window size 3 for chord smoothing
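
The server's key detection code is not shown here. As a rough illustration of the Krumhansl-Schmuckler approach named above, the sketch below correlates a 12-bin pitch class histogram with the Krumhansl-Kessler major and minor profiles, rotated to each of the 12 tonics, and returns the best match. The function name ``estimate_key`` and the input format are assumptions.

.. code-block:: python

    import numpy as np

    # Krumhansl-Kessler probe-tone profiles, indexed from the tonic
    MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                              2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
    MINOR_PROFILE = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                              2.54, 4.75, 3.98, 2.69, 3.34, 3.17])


    def estimate_key(pc_weights):
        """Return (tonic_pitch_class, mode) with the highest profile correlation.

        pc_weights is a length-12 sequence of accumulated pitch class weights.
        """
        pc_weights = np.asarray(pc_weights, dtype=float)
        best = (None, None, -np.inf)
        for tonic in range(12):
            for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
                # Rotate so the profile's tonic entry lines up with `tonic`
                r = np.corrcoef(pc_weights, np.roll(profile, tonic))[0, 1]
                if r > best[2]:
                    best = (tonic, mode, r)
        return best[0], best[1]


    # C major scale tones with extra weight on C, E, and G
    key = estimate_key([3, 0, 1, 0, 2, 1, 0, 2, 0, 1, 0, 1])  # (0, "major")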

Limitations
===========

Current limitations include:

* Difficulty with dense orchestral arrangements
* Challenges with extreme registers
* Reduced accuracy for chromatic passages
* Lower performance on atonal music

Future Improvements
===================

Planned enhancements:

* Deep learning chord recognition model
* Genre-specific templates
* User-guided correction
* Real-time adaptation
* Microtonal support