Wiki: Omega 13 Generated: 2025-12-29
Relevant source files The following files were used as context for generating this wiki page: - [src/omega13/app.py](https://github.com/b08x/omega-13/blob/main/src/omega13/app.py) - [src/omega13/config.py](https://github.com/b08x/omega-13/blob/main/src/omega13/config.py) - [src/omega13/audio.py](https://github.com/b08x/omega-13/blob/main/src/omega13/audio.py) - [src/omega13/session.py](https://github.com/b08x/omega-13/blob/main/src/omega13/session.py) - [src/omega13/transcription.py](https://github.com/b08x/omega-13/blob/main/src/omega13/transcription.py) - [README.md](https://github.com/b08x/omega-13/blob/main/README.md)

Getting Started

1. Introduction

Omega-13 is a retroactive audio recording system designed to capture audio buffers from the past (defaulting to 13 seconds) and process them through a transcription pipeline. The system operates as a Terminal User Interface (TUI) application built on the Textual framework, coordinating between a JACK-based audio engine, a local or containerized Whisper transcription server, and a session management layer. Its primary role is to bridge the gap between continuous audio monitoring and on-demand archival/transcription.

Sources: src/omega13/app.py:#L57-L80, README.md:#L76-L85

2. System Architecture and Initialization

The application initializes by loading persistent configurations and establishing a JACK client. The boot sequence reveals a dependency chain where the UI cannot effectively function without a valid audio backend, yet the system allows the UI to launch even if inputs are unconfigured.

Component Interaction Flow

The following diagram illustrates the startup and input connection sequence:

Sources: src/omega13/app.py:#L125-L150, src/omega13/audio.py:#L22-L40

Core Components

Component Responsibility Key Interaction
ConfigManager Persists user settings (hotkeys, server URLs, paths) in ~/.config/omega13/config.json. Provides the global_hotkey and server_url to the app.
AudioEngine Manages a numpy-based ring buffer and handles real-time audio capture via JACK. Feeds peak levels to the VUMeter UI components.
SessionManager Handles temporary storage in /tmp/omega13 and permanent archival. Triggers metadata syncing when transcriptions are added.
TranscriptionService Interfaces with a whisper-server HTTP API for asynchronous processing. Updates the TranscriptionDisplay upon completion or error.

Sources: src/omega13/config.py:#L14-L40, src/omega13/audio.py:#L18-L55, src/omega13/session.py:#L45-L60, src/omega13/transcription.py:#L45-L65

3. Configuration Mechanisms

The system relies on a JSON configuration file. A notable structural pattern is the hardcoded reliance on a specific versioning scheme (version 2) and a default “retroactive” window of 13 seconds, which is reflected in both the class constants and the project name.

Default Configuration Attributes

Field Default Value Purpose
global_hotkey <ctrl>+<alt>+space Trigger for starting/stopping the capture.
server_url http://localhost:8080 Endpoint for the Whisper transcription API.
save_path Path.cwd() Default directory for permanent session storage.
buffer_duration 13 Duration in seconds of the pre-record ring buffer.

Sources: src/omega13/config.py:#L30-L45, src/omega13/audio.py:#L15-L16

4. Audio Capture and Hotkey Logic

The system uses a “toggle” mechanism to control recording. This is implemented via a CLI flag --toggle that sends signals to a running instance identified by a PID file. This approach bypasses Wayland’s security restrictions on global key-sniffing by delegating hotkey management to the Desktop Environment.

Sources: src/omega13/app.py:#L175-L200, README.md:#L60-L75

5. Transcription and Data Flow

Transcription is an asynchronous process. When a recording ends, the file path is passed to the TranscriptionService. A fucking weird but functional deduplication logic exists in the session management: it compares new transcription segments against the last five entries to prevent overlapping text if the engine captures redundant audio.

Transcription Logic Invariants

  • Cooperative Shutdown: The service uses a threading.Event to ensure threads are not abandoned during app exit.
  • Deduplication: The system joins the last 5 transcriptions (~500 words) to find the longest matching suffix with the new segment’s prefix.
  • Clipboard Integration: If enabled in config.json, the result is automatically pushed to the system clipboard.

Sources: src/omega13/transcription.py:#L66-L85, src/omega13/session.py:#L1-L30, src/omega13/config.py:#L40-L45

Conclusion

“Getting Started” with Omega-13 involves a multi-stage orchestration of JACK audio, system-level signal handling, and HTTP-based inference. The architecture is structurally dependent on external components (JACK and Whisper-Server) being pre-configured, while providing a TUI that acts as the central state coordinator. The most significant structural mechanism is the retroactive ring buffer, which ensures that the 13 seconds of audio preceding a user’s manual trigger are never lost.