System Architecture

High-Level Overview

Amadeus is arranged as independent layers that keep transcription cleanly separated from harmonic analysis. The diagram shows the two deployable tiers, the iOS client and the Python analysis server:

┌─────────────────────────────────────────────────────────┐
│                 iOS Application (Swift)                 │
│ ┌───────────────┐  ┌───────────────┐  ┌───────────────┐ │
│ │   UI Layer    │  │   Managers    │  │    Models     │ │
│ │    SwiftUI    │  │ AudioManager  │  │ChordDetection │ │
│ │     Views     │  │AnalysisManager│  │   Pipeline    │ │
│ └───────────────┘  └───────────────┘  └───────────────┘ │
└─────────────────────────┬───────────────────────────────┘
                          │ HTTPS
                          │ JSON
┌─────────────────────────▼───────────────────────────────┐
│            Python Analysis Server (FastAPI)             │
│ ┌───────────────┐  ┌───────────────┐  ┌───────────────┐ │
│ │   REST API    │  │  Basic Pitch  │  │Chord Inference│ │
│ │   Endpoints   │  │     Model     │  │    Module     │ │
│ └───────────────┘  └───────────────┘  └───────────────┘ │
└─────────────────────────────────────────────────────────┘

Component Details

iOS Application Layer

The iOS application is built with Swift and SwiftUI and follows the MVVM pattern:

Views Layer

MainTabView: Root navigation container
AnalyseView: Audio analysis interface
LibraryView: Music theory reference
LiveView: Real-time detection (future)
ProfileView: User settings and preferences

Managers Layer

AudioManager: Handles playback, seeking, and audio session
AnalysisManager: Coordinates analysis workflow
ChordDetectionPipeline: Local chord assembly and post-processing

Models Layer

ChordDetection: Chord data structure
NoteEvent: Note representation
ChordDictionary: Theory database
ScaleDictionary: Scale definitions

Network Layer

BasicPitchHTTPClient: Server communication
Multipart form data encoding
JSON response parsing

Python Server Layer

The server uses FastAPI for high-performance async operations and integrates Spotify’s Basic Pitch model:

API Layer

/analyze endpoint for audio processing
/health endpoint for status monitoring
CORS middleware for web compatibility

Processing Pipeline

Audio file reception and validation
Format conversion with four fallback methods (librosa, soundfile, ffmpeg, pydub)
Audio padding to a minimum of 3 seconds to prevent crashes on short clips
Basic Pitch model inference (compact convolutional architecture)
Note event extraction (onset time, pitch, duration)
Advanced chord inference with Krumhansl-Schmuckler key detection
Temporal median filtering (window size 3) for stability
JSON response generation with notes, chords, and key
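
Two of the steps above, padding and median filtering, are simple enough to sketch directly. This is a minimal illustration and not the server's actual code: `pad_to_min_duration` and `median_filter` are hypothetical helper names, padding with trailing silence is an assumption, and because the source does not say which signal the window-3 median is applied to, a generic numeric per-frame sequence is assumed.

```python
from statistics import median
from typing import List, Sequence

MIN_DURATION_S = 3.0  # minimum clip length named in the pipeline


def pad_to_min_duration(samples: Sequence[float], sample_rate: int,
                        min_duration_s: float = MIN_DURATION_S) -> List[float]:
    """Append trailing silence so the clip lasts at least min_duration_s."""
    min_samples = int(round(min_duration_s * sample_rate))
    missing = max(0, min_samples - len(samples))
    return list(samples) + [0.0] * missing


def median_filter(values: Sequence[float], window: int = 3) -> List[float]:
    """Sliding median with an odd window; edges repeat the nearest value."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        # Clamp indices so the first and last frames still see a full window.
        neighbourhood = [values[min(max(j, 0), n - 1)]
                         for j in range(i - half, i + half + 1)]
        smoothed.append(median(neighbourhood))
    return smoothed
```

With a window of 3, a single-frame outlier such as the `5` in `[0, 0, 5, 0, 0]` is removed, while a sustained change such as `[1, 1, 9, 9, 9]` passes through unchanged.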

Core Modules

main.py: FastAPI application and endpoints
chord_inference.py: Groups symbolic events into chord segments
Audio preprocessing handled by Basic Pitch model
Model produces time-frequency maps for onsets, sustained notes, and multi-pitch activity
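
As a rough illustration of what "grouping symbolic events into chord segments" can look like, the sketch below slices the note list into fixed windows, matches the sounding pitch classes against major and minor triad templates, and merges adjacent windows that share a label. It is not the contents of `chord_inference.py`: the function names, the 0.5 s window, the scoring rule, and the restriction to triads are all assumptions made for the example, and confidence scoring is left out.

```python
from typing import Dict, List, Set, Tuple

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Intervals above the root; only major and minor triads in this sketch.
TRIADS = {"": (0, 4, 7), "m": (0, 3, 7)}


def best_triad(active: Set[int]) -> Tuple[str, List[int]]:
    """Pick the triad that covers the active pitch classes best."""
    best_name, best_pcs, best_score = "", [], float("-inf")
    for root in range(12):
        for suffix, intervals in TRIADS.items():
            chord = {(root + i) % 12 for i in intervals}
            # Reward matched chord tones, penalise notes outside the chord.
            score = len(chord & active) - 0.5 * len(active - chord)
            if score > best_score:
                best_name, best_pcs, best_score = (
                    NOTE_NAMES[root] + suffix, sorted(chord), score)
    return best_name, best_pcs


def infer_chords(notes: List[Dict], window_s: float = 0.5) -> List[Dict]:
    """Group note events (onset, offset, MIDI pitch) into chord segments."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    segments: List[Dict] = []
    end = max((n["offset"] for n in notes), default=0.0)
    start = 0.0
    while start < end:
        stop = min(start + window_s, end)
        active = {n["pitch"] % 12 for n in notes
                  if n["onset"] < stop and n["offset"] > start}
        if len(active) >= 2:  # a single pitch class is not treated as a chord
            name, pcs = best_triad(active)
            last = segments[-1] if segments else None
            if last and last["chord"] == name and last["offset"] == start:
                last["offset"] = stop  # extend the running segment
            else:
                segments.append({"onset": start, "offset": stop,
                                 "chord": name, "pitch_classes": pcs})
        start = stop
    return segments
```

The output dictionaries deliberately reuse the `onset`, `offset`, `chord`, and `pitch_classes` field names from the server response so the shape is easy to compare.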

Data Flow

Request Flow

1. User selects/records audio
     ↓
2. iOS app encodes audio as WAV
     ↓
3. HTTP POST to /analyze endpoint
     ↓
4. Server processes audio
     ↓
5. Basic Pitch generates note events
     ↓
6. Chord inference from notes
     ↓
7. JSON response to iOS app
     ↓
8. Local smoothing and filtering
     ↓
9. UI updates with results

Data Structures

Audio Upload:

POST /analyze
Content-Type: multipart/form-data

audio_file: <binary WAV data>
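
For reference, the upload above can be reproduced with the Python standard library alone, which is handy for testing the server without the iOS app. This is a sketch under assumptions: `encode_multipart` and `build_analyze_request` are hypothetical helpers, and the `clip.wav` filename and `audio/wav` content type are illustrative. Only the `audio_file` field name and the `/analyze` path come from this document.

```python
import urllib.request
import uuid
from typing import Tuple


def encode_multipart(field_name: str, filename: str, payload: bytes,
                     content_type: str = "audio/wav") -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a single-file form."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_analyze_request(base_url: str, wav_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST /analyze request."""
    body, header = encode_multipart("audio_file", "clip.wav", wav_bytes)
    return urllib.request.Request(
        base_url.rstrip("/") + "/analyze",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` sends it; the response body is the JSON document described next.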

Server Response:

{
  "notes": [
    {
      "onset": 0.0,
      "offset": 0.5,
      "pitch": 60,
      "confidence": 0.95
    }
  ],
  "chords": [
    {
      "onset": 0.0,
      "offset": 2.0,
      "chord": "C",
      "confidence": 0.85,
      "pitch_classes": [0, 4, 7]
    }
  ],
  "key": {
    "key": "C",
    "mode": "major",
    "confidence": 0.75
  }
}
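
The `key` object is described in the processing pipeline as coming from Krumhansl-Schmuckler key detection. The sketch below shows the textbook form of that algorithm: build a duration-weighted pitch-class histogram from the note events and correlate it with the Krumhansl-Kessler major and minor profiles rotated to each of the 12 tonics. The function names are hypothetical, and reporting the clamped correlation as `confidence` is an assumption; the source does not say how the server derives that value.

```python
from math import sqrt
from typing import Dict, List

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Krumhansl-Kessler probe-tone profiles; index 0 is the tonic.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def estimate_key(notes: List[Dict]) -> Dict:
    """Return {"key", "mode", "confidence"} for a list of note events."""
    histogram = [0.0] * 12
    for note in notes:
        # Weight each pitch class by how long it sounds.
        histogram[note["pitch"] % 12] += max(0.0, note["offset"] - note["onset"])
    best = {"key": NOTE_NAMES[0], "mode": "major", "confidence": 0.0}
    best_r = float("-inf")
    for tonic in range(12):
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            # Rotate the profile so its tonic weight lands on this tonic.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            r = pearson(histogram, rotated)
            if r > best_r:
                best_r = r
                best = {"key": NOTE_NAMES[tonic], "mode": mode,
                        "confidence": round(max(0.0, r), 2)}
    return best
```

With no notes at all, every correlation is 0.0 and the function falls back to C major with a confidence of 0.0, so callers may want to treat that case as "unknown".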

Technology Stack

iOS Application

Language: Swift 5.0+
UI Framework: SwiftUI
Minimum iOS: 17.0
Audio: AudioKit 5.6.0, AVFoundation
Music Theory: Tonic 1.0.6
Networking: URLSession
ML: Core ML (on-device model packaged as nmp.mlpackage)

Python Server

Language: Python 3.8+
Web Framework: FastAPI
ML Framework: TensorFlow
Audio Processing: librosa, soundfile
Model: Spotify Basic Pitch
Server: Uvicorn ASGI

Communication Protocol

Protocol: HTTP/HTTPS
Default Server: http://192.168.1.111:8000 (a local network address for development)
Data Format: JSON
File Upload: Multipart form data
Authentication: None (token-based planned)
Analysis Modes: HTTP Server, CoreML Local, Simulation

Deployment Architecture

Development Environment

iOS: Xcode with Swift Package Manager
Server: Python virtual environment
Local testing on same network

Production Environment

iOS: App Store distribution
Server: Cloud deployment (AWS/GCP/Azure)
CDN for static assets
Load balancing for scalability

Security Considerations

HTTPS encryption for all production communication (the default development server uses plain HTTP)
Input validation and sanitisation
File size limits (50 MB)
Rate limiting (planned)
Authentication tokens (planned)
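
The size limit and input sanitisation items could be enforced with a check like the one below before any audio decoding starts. This is only a sketch: `validate_upload` and `sanitise_filename` are hypothetical names, reading 50 MB as 50 × 1024 × 1024 bytes is an assumption, and the real server may validate differently.

```python
import os
import re

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumed interpretation of the 50 MB limit


def sanitise_filename(name: str) -> str:
    """Strip directories and replace characters outside a safe set."""
    base = os.path.basename(name.replace("\\", "/"))
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "upload"


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return a safe filename, or raise ValueError for a rejected upload."""
    if size_bytes <= 0:
        raise ValueError("empty upload")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 50 MB limit")
    return sanitise_filename(filename)
```

Rejecting oversized or empty uploads up front keeps the more expensive conversion and inference steps from running on input that would be discarded anyway.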

Performance Optimisations

Audio compression before upload
Server-side caching of results
Chunked processing for long files
Parallel processing where possible
CDN for model weights distribution

Scalability Design

The architecture supports horizontal scaling:

Stateless server design
Queue-based processing (future)
Microservice separation (future)
Database for result storage (future)
Container orchestration (future)