Live Infrastructure Monitoring Panel
Overview
This is a purpose-built operational awareness layer embedded directly in the admin dashboard of a production system. Instead of standing up a heavyweight observability stack (Prometheus + Grafana + exporters), the goal was a minimal monitoring panel that runs inside the existing application and needs no separate services. It gives the engineering team immediate visibility into the servers and containers that matter, with no additional infrastructure to maintain.
Feature Breakdown
System Resource Tracking
Using psutil, the panel streams CPU utilisation (per-core and aggregate), RAM pressure, disk I/O, and network throughput to the frontend at configurable intervals. Data is pushed over a WebSocket, so the dashboard updates in real time without repeatedly polling an HTTP endpoint.
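As a rough illustration of the sampling step, the sketch below reads host counters with psutil and turns two cumulative readings into per-second rates. The function names (read_counters, rates, snapshot) and the payload keys are assumptions made for this example, not the panel's actual code, and the WebSocket transport is left out. psutil is imported inside the functions that need it so that the rate helper can be used on its own.

```python
import time


def read_counters() -> dict:
    """Read cumulative disk and network byte counters from the host."""
    import psutil  # third-party: pip install psutil

    net = psutil.net_io_counters()    # may be None with no interfaces
    disk = psutil.disk_io_counters()  # may be None on diskless hosts
    return {
        "ts": time.monotonic(),
        "net_sent": net.bytes_sent if net else 0,
        "net_recv": net.bytes_recv if net else 0,
        "disk_read": disk.read_bytes if disk else 0,
        "disk_write": disk.write_bytes if disk else 0,
    }


def rates(prev: dict, curr: dict) -> dict:
    """Convert two cumulative readings into bytes-per-second rates."""
    elapsed = curr["ts"] - prev["ts"]
    if elapsed <= 0:
        raise ValueError("readings must be taken at increasing times")
    return {
        key: (curr[key] - prev[key]) / elapsed
        for key in curr
        if key != "ts"
    }


def snapshot(prev: dict) -> tuple[dict, dict]:
    """Build one payload for the frontend and return the new baseline."""
    import psutil

    curr = read_counters()
    # interval=None is non-blocking: it compares against the previous call,
    # so the very first reading after startup is not meaningful.
    per_core = psutil.cpu_percent(interval=None, percpu=True)
    payload = {
        "cpu_per_core": per_core,
        "cpu_total": sum(per_core) / len(per_core),  # mean across cores
        "ram_percent": psutil.virtual_memory().percent,
        **rates(prev, curr),
    }
    return payload, curr
```

A sampling loop would call read_counters() once to get a baseline, then call snapshot(baseline) at each interval and send the payload to connected clients.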
Container Monitoring via Docker API
The Docker SDK for Python exposes container state, resource consumption (CPU %, memory, net I/O), and live log tails for every running container. The panel surfaces this alongside host metrics on a unified timeline, making it easy to correlate a CPU spike with a specific container workload.
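One plausible way to gather the container figures is sketched below. It uses the Docker SDK's one-shot stats call and derives CPU % from the difference between the current and previous CPU samples that the Docker stats payload includes. The helper names (cpu_percent, list_container_metrics) and the returned row layout are hypothetical, and the docker package is imported lazily so the CPU helper works without a daemon.

```python
def cpu_percent(stats: dict) -> float:
    """Derive CPU % from one Docker stats payload (current vs previous sample)."""
    cpu = stats.get("cpu_stats", {})
    pre = stats.get("precpu_stats", {})
    cpu_delta = (cpu.get("cpu_usage", {}).get("total_usage", 0)
                 - pre.get("cpu_usage", {}).get("total_usage", 0))
    system_delta = (cpu.get("system_cpu_usage", 0)
                    - pre.get("system_cpu_usage", 0))
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0  # not enough data yet, or container idle
    online = cpu.get("online_cpus", 1)
    return cpu_delta * online * 100.0 / system_delta


def list_container_metrics(tail: int = 20) -> list[dict]:
    """Collect state, CPU %, memory, and a log tail for running containers."""
    import docker  # third-party: pip install docker

    client = docker.from_env()
    rows = []
    for container in client.containers.list():  # running containers by default
        stats = container.stats(stream=False)   # single sample as a dict
        rows.append({
            "name": container.name,
            "status": container.status,
            "cpu_percent": cpu_percent(stats),
            "mem_bytes": stats.get("memory_stats", {}).get("usage", 0),
            "log_tail": container.logs(tail=tail).decode("utf-8", errors="replace"),
        })
    return rows
```

Each stats call is a separate request to the Docker daemon, so with many containers a real collector would probably fetch them concurrently rather than in a simple loop.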
Alert Thresholds
Simple threshold rules (configurable in the dashboard UI) trigger a highlighted visual indicator when any metric crosses a defined boundary. No external alerting system is required; the warnings appear inline in the monitoring view.
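A threshold check of this kind can be very small. The sketch below is illustrative only: the Threshold class, the breached function, and the metric names are assumptions for the example, and it treats a rule as triggered when the value is strictly above its limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """One rule: flag `metric` when its value rises above `limit`."""
    metric: str
    limit: float


def breached(snapshot: dict, rules: list[Threshold]) -> list[str]:
    """Return the names of metrics whose current value exceeds its limit.

    Metrics missing from the snapshot are skipped rather than flagged.
    """
    return [
        rule.metric
        for rule in rules
        if rule.metric in snapshot and snapshot[rule.metric] > rule.limit
    ]


rules = [Threshold("cpu_total", 90.0), Threshold("ram_percent", 80.0)]
flags = breached({"cpu_total": 93.0, "ram_percent": 41.5}, rules)
```

The frontend would then highlight whichever metrics appear in the returned list.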
Minimal Footprint
The monitoring backend runs as a lightweight Flask blueprint mounted on the existing admin app, adding negligible memory overhead. All data is ephemeral: nothing is written to disk, because the panel is a live view, not a metrics database.
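The shape of such a blueprint might look like the following minimal sketch. The URL prefix, the route name, the 300-sample buffer size, and the create_app stand-in for the admin app are all assumptions for illustration. A bounded deque keeps recent samples in memory only, so older readings simply fall off the end.

```python
from collections import deque

from flask import Blueprint, Flask, jsonify

# In-memory ring buffer: holds at most 300 samples, nothing touches disk.
HISTORY = deque(maxlen=300)

monitoring = Blueprint("monitoring", __name__, url_prefix="/admin/monitoring")


@monitoring.route("/recent")
def recent():
    """Return the samples currently held in memory, oldest first."""
    return jsonify(list(HISTORY))


def create_app() -> Flask:
    """Stand-in for the existing admin app that the blueprint is mounted on."""
    app = Flask(__name__)
    app.register_blueprint(monitoring)
    return app
```

Because the buffer lives in the process, a restart clears it, which matches the live-view intent described above.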
Stack
Python · Flask · psutil · Docker SDK for Python · WebSockets · Docker Compose · Linux Server Administration
Outcome
Reduced mean time to detect (MTTD) infrastructure anomalies from minutes to seconds for the ops team. The integrated approach eliminated the cognitive overhead of switching between multiple monitoring tools during incident response.