Update memory.pm #6149
        return 'critical';
    }
    # WARNING only if BOTH are WARNING
    if ($exit_prct eq 'warning' && $exit_free eq 'warning') {
In custom_memory_threshold, 'warning' is reachable only if both prct and free are warning; when only one threshold type is configured, warning can never trigger, contradicting the mode’s single-threshold behavior.
✨ AI Reasoning
1) The new logic is intended to support using percentage thresholds alone or free-bytes thresholds alone.
2) The warning branch requires both threshold evaluations to be warning at the same time.
3) If only one threshold type is configured, the other evaluation cannot become warning, so this branch cannot be satisfied.
4) This creates a definite control-flow contradiction with the advertised behavior and makes warning alerts impossible in those valid configurations.
🔧 How do I fix it?
Trace execution paths carefully. Ensure precondition checks happen before using values, validate ranges before checking impossible conditions, and don't check for states that the code has already ruled out.
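One possible way to address the point above, sketched in shell rather than in the module's Perl. The function name combine_status_optional and the 'unset' marker for a threshold type that was not configured are assumptions made for illustration; they are not taken from memory.pm.

```shell
# Hypothetical helper, not the plugin's code.
# $1: percentage state, $2: free-bytes state.
# Each is 'ok', 'warning', 'critical', or 'unset' (threshold type not configured).
combine_status_optional() {
    if [ "$1" = "unset" ] && [ "$2" = "unset" ]; then
        echo "ok"            # nothing configured, nothing to alert on
    elif [ "$1" = "unset" ]; then
        echo "$2"            # only free-bytes thresholds configured
    elif [ "$2" = "unset" ]; then
        echo "$1"            # only percentage thresholds configured
    elif [ "$1" = "critical" ] || [ "$2" = "critical" ]; then
        echo "critical"      # either metric CRITICAL wins
    elif [ "$1" = "warning" ] && [ "$2" = "warning" ]; then
        echo "warning"       # AND logic applies only when both types are configured
    else
        echo "ok"
    fi
}

combine_status_optional warning unset     # prints: warning
combine_status_optional warning ok        # prints: ok
```

With this shape, the AND rule is only applied when both threshold types are present, so a single configured threshold can still raise a warning.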
Description
Enhancement of Linux memory monitoring with combined threshold AND logic
This PR enhances the os::linux::local::mode::memory mode to support simultaneous monitoring of both:
Memory usage percentage (e.g., alert if > 80% used)
Free memory in bytes (e.g., alert if < 1GB free)
Key improvements:
Combined threshold logic (AND): WARNING triggers only when both conditions are met simultaneously; CRITICAL triggers as soon as either condition is critical
Unified counter architecture: Single counter instead of 5 separate counters, reducing code complexity
Consolidated perfdata generation: All metrics (usage bytes, free bytes, usage %, available, buffer, cached, slab) generated in one function
Explicit options: Clear option names (--warning-memory-usage-prct, --warning-memory-usage-free) instead of generic --warning-* wildcards
Better documentation: POD now includes detailed threshold logic explanation
Threshold logic:
IF usage_prct CRITICAL OR free_byte CRITICAL → CRITICAL
IF usage_prct WARNING AND free_byte WARNING → WARNING
OTHERWISE (no CRITICAL and at least one OK) → OK
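The three rules above can be sketched as a small shell function. This is an illustration only: the name combine_status and the lowercase state strings are assumptions, and the actual implementation lives in the Perl function custom_memory_threshold.

```shell
# Hypothetical helper, not the plugin's code.
# $1: state of the usage percentage, $2: state of the free bytes.
# Each is 'ok', 'warning', or 'critical'.
combine_status() {
    if [ "$1" = "critical" ] || [ "$2" = "critical" ]; then
        echo "critical"      # rule 1: either metric CRITICAL
    elif [ "$1" = "warning" ] && [ "$2" = "warning" ]; then
        echo "warning"       # rule 2: both metrics WARNING
    else
        echo "ok"            # rule 3: everything else
    fi
}

# Print the full 3 x 3 decision table.
for prct in ok warning critical; do
    for free in ok warning critical; do
        printf '%-8s + %-8s -> %s\n' "$prct" "$free" "$(combine_status "$prct" "$free")"
    done
done
```

Checking CRITICAL first is what gives it precedence: a WARNING on one metric combined with an OK on the other falls through to the last branch.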
This prevents false WARNING alerts where high percentage usage on large memory systems would trigger even with plenty of free space available. A CRITICAL percentage still alerts on its own.
Fixes #(issue) - Improved memory monitoring precision for mixed workload environments
Type of change
Functionality enhancement or optimization (non-breaking change)
Breaking change (option names changed from wildcard to explicit)
How this pull request can be tested?
Test environment:
Linux server (RHEL/CentOS 7+, Ubuntu, Debian)
NRPE/check_centreon_nrpe3 configured
System with /proc/meminfo available
Test scenarios:
check_centreon_nrpe3 -H -p 8084 -u
-c check_centreon_plugins -a 'os::linux::local::plugin' 'memory'
'--warning-memory-usage-prct="80" --critical-memory-usage-prct="90"'
Expected: Alert if memory usage > 80%
--warning-memory-usage-free="@0:1073741824" --critical-memory-usage-free="@0:536870912"
Expected: Alert if free memory < 1GB (warning) or < 512MB (critical)
Note: Use @0:value syntax for "alert if below" thresholds
--warning-memory-usage-prct="80" --critical-memory-usage-prct="90"
--warning-memory-usage-free="@0:1073741824" --critical-memory-usage-free="@0:536870912"
Expected behavior examples:
Server with 32GB RAM:
85% used (27.2GB) + 4.8GB free → OK (usage WARNING but free OK = OK)
85% used (27.2GB) + 800MB free → WARNING (both WARNING = WARNING)
92% used (29.4GB) + 2GB free → CRITICAL (usage CRITICAL = CRITICAL regardless of free)
Server with 8GB RAM:
85% used (6.8GB) + 1.2GB free → OK (usage WARNING but free OK = OK)
85% used (6.8GB) + 900MB free → WARNING (both WARNING = WARNING)
85% used (6.8GB) + 400MB free → CRITICAL (free CRITICAL = CRITICAL)
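The 8GB cases above can be reproduced with a short shell sketch. It hard-codes the thresholds from the test scenarios (usage above 80 / 90 %, free at or below 1073741824 / 536870912 bytes), takes the used percentage as an integer, and treats MB as MiB; the function name evaluate_memory and these simplifications are assumptions for illustration, not the plugin's behavior.

```shell
GIB=1073741824
MIB=1048576

# Hypothetical helper, not the plugin's code.
# $1: used percentage (integer), $2: free memory in bytes.
evaluate_memory() {
    used_prct="$1"
    free_bytes="$2"

    # Usage percentage: plain thresholds alert when the value is above them.
    prct_state="ok"
    if [ "$used_prct" -gt 90 ]; then
        prct_state="critical"
    elif [ "$used_prct" -gt 80 ]; then
        prct_state="warning"
    fi

    # Free bytes: '@0:N' thresholds alert when the value is at or below N.
    free_state="ok"
    if [ "$free_bytes" -le $((512 * MIB)) ]; then
        free_state="critical"
    elif [ "$free_bytes" -le "$GIB" ]; then
        free_state="warning"
    fi

    # Combine: CRITICAL if either is critical, WARNING only if both warn.
    if [ "$prct_state" = "critical" ] || [ "$free_state" = "critical" ]; then
        echo "CRITICAL"
    elif [ "$prct_state" = "warning" ] && [ "$free_state" = "warning" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

evaluate_memory 85 $((1229 * MIB))   # 85 % used, about 1.2 GB free: prints OK
evaluate_memory 85 $((900 * MIB))    # 85 % used, 900 MB free: prints WARNING
evaluate_memory 85 $((400 * MIB))    # 85 % used, 400 MB free: prints CRITICAL
```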
--swap --warning-swap-prct="50" --critical-swap-prct="80"
Expected: Also monitors swap usage (unchanged behavior)
# On RHEL 7.1+ or CentOS 8+
Expected: Used memory calculation includes slab subtraction (visible in output as "used (-buffers/cache/slab)")
Sample outputs:
OK state:
OK: Ram total: 7.52 GB used (-buffers/cache/slab): 4.50 GB (59.84%) free: 3.02 GB (40.16%) available: 2.80 GB (37.23%) |
'memory.usage.bytes'=4831838208B;;;0;8070512640
'memory.free.bytes'=3238674432B;@0:1073741824;@0:536870912;0;8070512640
'memory.usage.percentage'=59.84%;80;90;0;100
'memory.available.bytes'=3003121664B;;;0;8070512640
'memory.buffer.bytes'=16384B;;;0;
'memory.cached.bytes'=2145386496B;;;0;
'memory.slab.bytes'=138735616B;;;0;
WARNING state (both thresholds met):
WARNING: Ram total: 7.52 GB used (-buffers/cache/slab): 6.38 GB (84.93%) free: 900.00 MB (11.68%) available: 800.00 MB (10.38%)
CRITICAL state (at least one CRITICAL):
CRITICAL: Ram total: 7.52 GB used (-buffers/cache/slab): 6.90 GB (91.80%) free: 600.00 MB (7.79%) available: 500.00 MB (6.49%)
Architecture changes:
Before (5 separate memory counters, plus buffer, cached, and slab):
$self->{maps_counters}->{memory} = [
{ label => 'memory-usage', ... },
{ label => 'memory-usage-free', display_ok => 0, ... },
{ label => 'memory-usage-prct', display_ok => 0, ... },
{ label => 'memory-available', display_ok => 0, ... },
{ label => 'memory-available-prct', display_ok => 0, ... },
{ label => 'buffer', ... },
{ label => 'cached', ... },
{ label => 'slab', ... }
];
After (1 unified counter + custom functions):
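$self->{maps_counters}->{memory} = [
    { label => 'memory-usage',
      # Code references are obtained with $self->can(...); a bare sub name would not compile under strict.
      closure_custom_perfdata => $self->can('custom_memory_perfdata'),
      closure_custom_threshold_check => $self->can('custom_memory_threshold'),
      ...
    }
];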
Benefits:
Cleaner code (less repetition)
Single evaluation point for combined logic
All perfdata generated together ensuring consistency
Easier maintenance
Migration guide (Breaking change):
Old syntax → New syntax:
For percentage thresholds:
# Before (legacy redirect)
--warning="80" --critical="90"
# After (explicit)
--warning-memory-usage-prct="80" --critical-memory-usage-prct="90"
For free memory thresholds:
# Before (wildcard)
--warning-memory-usage-free="1073741824" --critical-memory-usage-free="536870912"
# After (Nagios range format)
--warning-memory-usage-free="@0:1073741824" --critical-memory-usage-free="@0:536870912"
Important: The @0:value syntax means "alert if the value is INSIDE the range [0, value]" (bounds included), i.e., "alert if at or below the threshold"
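To make the difference between the two threshold forms concrete, here is a minimal shell sketch of the range check. It handles only the two forms used in this PR ("N" and "@LOW:HIGH") with integer values; the function name range_alerts is an assumption for illustration, and the plugin relies on its own threshold parser rather than anything like this.

```shell
# Hypothetical helper, not the plugin's code.
# $1: measured value (integer), $2: threshold in the form "N" or "@LOW:HIGH".
# Exit status 0 means "alert", 1 means "no alert".
range_alerts() {
    value="$1"
    range="$2"
    case "$range" in
        @*:*)
            bounds="${range#?}"      # drop the leading '@'
            low="${bounds%%:*}"
            high="${bounds#*:}"
            # '@' form: alert when LOW <= value <= HIGH.
            [ "$value" -ge "$low" ] && [ "$value" -le "$high" ]
            ;;
        *)
            # Plain "N" is the range 0..N: alert when the value is outside it.
            [ "$value" -lt 0 ] || [ "$value" -gt "$range" ]
            ;;
    esac
}

if range_alerts 943718400 "@0:1073741824"; then echo "900 MiB free: alert"; fi
if ! range_alerts 2147483648 "@0:1073741824"; then echo "2 GiB free: no alert"; fi
if range_alerts 85 "80"; then echo "85 % used: alert"; fi
```

Without the leading @, a free-memory threshold of 1073741824 would alert when free memory rises above 1 GiB, which is the opposite of what is wanted; that is why the migration adds @0: to the old values.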
Checklist
I have followed the coding style guidelines provided by Centreon
I have commented my code, especially hard-to-understand areas
Added detailed comments in custom_memory_threshold explaining AND logic
Documented why CRITICAL takes precedence over WARNING
I have rebased my development branch on the base branch (develop)
I have reviewed all help messages in the .pm file
All sentences begin with a capital letter
All sentences end with a period
POD includes dedicated "THRESHOLD LOGIC" section with clear examples
Nagios range syntax explained in option descriptions
I have provided output examples showing the result of this code
Additional notes:
Performance impact: Negligible - same /proc/meminfo read, only threshold evaluation logic changed
Backward compatibility: PARTIALLY compatible
Legacy --warning / --critical redirects still work (backward compatible)
New explicit options preferred for new deployments
Wildcard options (--warning-memory-*) replaced with explicit names
Use cases benefiting from this change:
Large memory servers (64GB+): Avoid false alerts when 80% used but 12GB still free
Database servers: Combine "no more than 90% used" with "at least 2GB free for cache"
Application servers: Ensure both percentage headroom AND absolute free space
Recommended migration path:
Update service templates to use explicit option names
Define both percentage and absolute thresholds based on server RAM size
Test on staging before production deployment