Storage & Archival
Bewitch stores metrics in DuckDB with optional data lifecycle management: retention pruning, compaction, and Parquet archival for long-term storage.
DuckDB Storage
The daemon writes metrics using high-performance bulk inserts. The schema is applied automatically on startup. Database writes are decoupled from collection — the API cache is always updated immediately, so the TUI never waits on disk I/O.
WAL checkpointing
DuckDB uses a write-ahead log (WAL) for crash safety. Checkpoints are handled automatically when the WAL exceeds checkpoint_threshold (default 16MB). For additional crash safety, set checkpoint_interval to force periodic checkpoints:
[daemon]
checkpoint_threshold = "16MB" # auto-checkpoint WAL size
checkpoint_interval = "5m" # forced periodic checkpoint
Retention Pruning
When retention is configured, the daemon periodically deletes metrics older than the specified duration.
[daemon]
retention = "30d" # delete data older than 30 days
prune_interval = "1h" # run pruning every hour
Compaction
Compaction performs a full database rebuild to reclaim fragmented space. It can run on a schedule or be triggered manually.
[daemon]
compaction_interval = "7d" # weekly compaction
bewitch compact
# or remotely
bewitch -addr myserver:9119 -token secret compact
During compaction, incoming writes are buffered in memory and flushed on completion. Pruning, compaction, and archiving are mutually exclusive (coordinated via mutex).
Parquet Archival
For long-term storage efficiency, metrics older than archive_threshold can be exported to monthly Parquet files compressed with zstd (~10x smaller than DuckDB).
[daemon]
archive_threshold = "7d"
archive_interval = "6h"
archive_path = "/var/lib/bewitch/archive"
retention = "90d" # also prunes old Parquet filesHow it works
- Data older than archive_threshold is exported to monthly Parquet files
- Exported data is deleted from DuckDB to save space
- Dimension tables are snapshotted to Parquet on each archive run
- History API queries automatically combine DuckDB and Parquet data based on the time range
- Old Parquet files are deleted based on the retention setting
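The archived Parquet files can also be inspected directly with the DuckDB CLI. The exact file names written under archive_path are not specified here, so the file name below is a hypothetical example; list the archive directory to find the real names.
# List the archive directory (the file naming shown below is an assumption)
ls /var/lib/bewitch/archive/
# Query one archived file directly; substitute a real file name from the listing
duckdb -c "SELECT COUNT(*) FROM read_parquet('/var/lib/bewitch/archive/cpu_metrics_2025-01.parquet')"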
Manual archive/unarchive
# Archive old data to Parquet
bewitch archive
# Reload all Parquet data back into DuckDB
bewitch unarchive
unarchive reloads all Parquet data into DuckDB, removes the Parquet files, and resets the archive state. This is useful when changing archival strategy or disabling archival entirely.
Snapshots
Create standalone DuckDB files for offline analysis — complex queries, sharing with colleagues, or use with DBeaver, Jupyter, or the DuckDB CLI.
# Metrics + dimensions only (default)
bewitch snapshot /tmp/metrics.duckdb
# Include alerts, preferences, scheduled jobs
bewitch snapshot -with-system-tables /tmp/backup.duckdb
Snapshots merge the live database and any archived Parquet data into a single self-contained file. Open directly with any DuckDB-compatible tool:
duckdb /tmp/metrics.duckdb "SELECT COUNT(*) FROM cpu_metrics"
Concurrency
API requests are served concurrently with database writes, so the TUI stays responsive during heavy collection. During pruning or compaction, incoming writes are buffered in memory and flushed on completion.
Schema
Schema is applied automatically on startup. Key tables:
- cpu_metrics — per-core CPU usage
- memory_metrics — memory usage
- disk_metrics — disk space and I/O
- network_metrics — network throughput
- temperature_metrics — sensor temperatures
- power_metrics — power consumption
- gpu_metrics — GPU utilization, frequency, power, memory
- process_metrics — process resource usage
- process_info — enriched process metadata
- dimension_values — normalized dimension lookups (mount, device, interface, sensor, zone)
- alert_rules — alert rule definitions
- alerts — fired alerts
- preferences — key-value UI preferences
- archive_state — archival tracking
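To see the exact columns of these tables, open a snapshot (created as shown above) in the DuckDB CLI and use its built-in introspection statements; the /tmp/metrics.duckdb path follows the snapshot example in the previous section.
# List all tables in the snapshot
duckdb /tmp/metrics.duckdb "SHOW TABLES"
# Show the column names and types of a specific table
duckdb /tmp/metrics.duckdb "DESCRIBE cpu_metrics"
# Peek at the normalized dimension lookups
duckdb /tmp/metrics.duckdb "SELECT * FROM dimension_values LIMIT 10"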