Monitoring
OnTree includes comprehensive system monitoring to help you track resource usage, identify bottlenecks, and ensure optimal performance of your applications.
Dashboard Overview
The monitoring system displays four key metrics directly on the main dashboard:
- CPU Usage - Processor utilization percentage
- Memory Usage - RAM consumption
- Disk Usage - Storage utilization
- Network Activity - Transfer rates (upload/download)
Each metric card shows:
- Current value with percentage
- 24-hour sparkline graph
- Visual indicators for quick status assessment
Real-Time Monitoring
Update Intervals
Different metrics update at optimized intervals:
- CPU: Every 1 second (high variability)
- Network: Every 1 second (real-time activity)
- Memory: Every 60 seconds (slower changes)
- Disk: Every 60 seconds (gradual changes)
Sparkline Graphs
Each metric displays a miniature graph showing the last 24 hours:
- Time-aware scaling - Accurate time representation
- Gap detection - Shows data collection interruptions
- Auto-scaling - Adjusts to value ranges
- Click to expand - Access detailed views
Detailed Metric Views
Click any sparkline to open detailed analysis:
Time Range Selection
Choose from multiple time windows:
- 1 Hour - Fine-grained recent activity
- 6 Hours - Short-term trends
- 24 Hours - Daily patterns (default)
- 7 Days - Weekly trends
Chart Features
Detailed charts provide:
- Large visualization - 700x400 pixel graphs
- Grid lines - Easy value reading
- Time labels - Clear x-axis timestamps
- Value labels - Precise y-axis measurements
- Filled area - Visual weight for metrics
Understanding Metrics
CPU Usage
Shows total CPU utilization across all cores:
- 0-25%: Light load, good headroom
- 25-50%: Moderate load, healthy
- 50-75%: Heavy load, monitor closely
- 75-100%: Very heavy, may need optimization
Common causes of high CPU:
- Compilation or builds
- Media transcoding
- Database queries
- Runaway processes
Memory Usage
Displays RAM consumption as percentage:
- Includes: Application memory, caches, buffers
- Excludes: Swap usage (monitored separately)
- Healthy range: 50-80% (Linux uses RAM for caching)
Memory optimization tips:
- Set container memory limits
- Monitor for memory leaks
- Use memory-efficient applications
Disk Usage
Shows primary disk utilization:
- Path: Root filesystem (
/
) - Includes: System, applications, data
- Warning threshold: 80% full
- Critical threshold: 90% full
Disk management:
- Regular cleanup of unused containers
- Archive old logs
- Use volume mounts for large data
Network Activity
Displays current transfer rates:
- Download rate: Incoming traffic
- Upload rate: Outgoing traffic
- Units: Automatically scaled (B/s, KB/s, MB/s, GB/s)
- Combined graph: Total network activity
Network patterns:
- Spikes during updates/downloads
- Steady streams for media servers
- Periodic bursts for backups
Performance Optimization
OnTree's monitoring system is optimized for minimal overhead:
Caching
- 5-minute cache for generated graphs
- Intelligent invalidation on new data
- Reduced CPU usage from rendering
Efficient Queries
- Batch data fetching for multiple metrics
- Indexed timestamps for fast retrieval
- Automatic data cleanup after 7 days
Resource Usage
The monitoring system itself uses:
- < 1% CPU for data collection
- < 50MB RAM for caching
- < 100MB disk for 7-day history
Using Monitoring Data
Identifying Issues
Watch for these patterns:
-
CPU Spikes
- Sudden jumps indicate new processes
- Sustained high usage suggests optimization needed
-
Memory Leaks
- Gradually increasing memory without drops
- Eventually leads to system slowdown
-
Disk Fill
- Steady increase toward capacity
- Can cause application failures
-
Network Anomalies
- Unexpected high traffic
- Could indicate issues or attacks
Capacity Planning
Use historical data for planning:
- Peak usage times - Scale resources accordingly
- Growth trends - Predict future needs
- Bottlenecks - Identify limiting factors
Performance Tuning
Monitoring helps optimize applications:
- Set resource limits based on actual usage
- Schedule intensive tasks during low-usage periods
- Identify inefficient applications for replacement
Integration with Apps
Container Metrics
While system monitoring shows host metrics, container-specific monitoring is planned:
- Per-container CPU usage
- Individual memory consumption
- Container network traffic
- Storage per application
Alerts and Notifications
Future features will include:
- Threshold-based alerts
- Email/webhook notifications
- Custom alert rules
- Integration with monitoring stacks
Advanced Monitoring
Exporting Data
Access raw monitoring data:
# SQLite database location
/data/ontree.db
# Query example
SELECT * FROM system_vital_logs
WHERE timestamp > datetime('now', '-1 hour')
ORDER BY timestamp DESC;
Custom Dashboards
Integrate with external tools:
- Prometheus - Export metrics endpoint
- Grafana - Rich visualization
- InfluxDB - Time-series storage
API Access
Programmatic access to metrics:
# Get current CPU usage
curl http://localhost:8080/api/monitoring/cpu
# Get historical data
curl http://localhost:8080/api/monitoring/cpu?range=24h
Configuration
Enable/Disable Monitoring
In config.toml
:
# Monitoring configuration
monitoring_enabled = true
monitoring_interval = 60 # seconds
monitoring_retention = 7 # days
Via environment:
MONITORING_ENABLED=false treeos
Performance Tuning
Adjust monitoring for your needs:
# Reduce frequency for low-end systems
monitoring_interval = 300 # 5 minutes
# Increase retention for long-term analysis
monitoring_retention = 30 # 30 days
# Disable specific metrics
disable_network_monitoring = true
Troubleshooting
No Data Showing
- Check monitoring is enabled in configuration
- Verify data collection in logs
- Check database permissions
- Ensure sufficient disk space
Incorrect Values
- Verify system tools are installed (iostat, vmstat)
- Check container permissions for /proc access
- Compare with system tools for validation
Performance Impact
If monitoring impacts performance:
- Increase collection interval
- Reduce retention period
- Disable unnecessary metrics
- Check for resource constraints
Best Practices
Regular Review
- Daily: Check for anomalies
- Weekly: Review trends
- Monthly: Capacity planning
Baseline Establishment
- Document normal operating ranges
- Note typical peak times
- Track seasonal variations
Proactive Monitoring
- Set up alerts for thresholds
- Plan maintenance windows
- Scale before hitting limits
The monitoring system is your window into OnTree's performance, helping you maintain a healthy, efficient environment for all your applications.