[Nagiosplug-devel] [ nagiosplug-Bugs-1348746 ] check_disk reports incorrect disk free with neg space on BSD

SourceForge.net noreply at sourceforge.net
Sat Dec 8 17:53:16 CET 2007


Bugs item #1348746, was opened at 2005-11-04 19:59
Message generated for change (Comment added) made by dermoth
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=397597&aid=1348746&group_id=29880

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Parsing problem
Group: Release (specify)
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Ted Cabeen (secabeen)
Assigned to: Nobody/Anonymous (nobody)
Summary: check_disk reports incorrect disk free with neg space on BSD

Initial Comment:
With check_disk running on FreeBSD 5-STABLE, when a
disk has negative free space remaining, the amount of
free space goes hugely positive:
DISK CRITICAL - free space: /usr 36028797018963968 MB (1191472380510208%):

Here's a df from the time:
/dev/ad4s1g          3096462   2989082  -140336   105%    /usr

----------------------------------------------------------------------

>Comment By: Thomas Guyot (dermoth)
Date: 2007-12-08 11:53

Message:
Logged In: YES 
user_id=375623
Originator: NO

This is fixed in SVN. The root cause of the problem is in Gnulib which is
why it was so hard to track this problem; I implemented a simple workaround
in check_disk. The credits should go to Matthias as he was kind enough to
upload me a FreeBSD VM to test on.
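
The thread does not include the committed change, so the following is only a
sketch of what a "simple workaround" of this kind could look like, written in
Python for brevity. It assumes the unsigned available-block count has wrapped
around, in which case it ends up larger than the total number of free blocks,
which cannot happen legitimately. The function name and the sample block
counts are hypothetical.

```python
UINT64_MAX = (1 << 64) - 1

def clamp_available(bavail: int, bfree: int) -> int:
    """Return a sane available-block count.

    Assumption: blocks available to non-root users can never exceed the
    total free blocks, so a larger value means a negative count wrapped
    around in an unsigned field. Report 0 in that case.
    """
    if bavail > bfree:
        return 0
    return bavail

# A count of -2304 blocks stored in an unsigned 64-bit field.
wrapped = -2304 & UINT64_MAX

print(clamp_available(wrapped, 15000))  # wrapped value is clamped to 0
print(clamp_available(1200, 15000))     # normal value passes through
```

Clamping to zero loses the size of the overshoot, but it keeps the free-space
figure from going hugely positive, so thresholds trigger as expected.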

You can try on the latest SVN HEAD (which will likely be released next
week) or use the next daily snapshot.

To get the HEAD:

$ svn co http://nagiosplug.svn.sourceforge.net/svnroot/nagiosplug/nagiosplug/trunk/ nagiosplug

Snapshots are there (Make sure it's at least Dec. 9 2007):
http://nagiosplug.sourceforge.net/snapshot/


----------------------------------------------------------------------

Comment By: Matthias Eble (psychotrahe)
Date: 2007-12-03 04:41

Message:
Logged In: YES 
user_id=1694341
Originator: NO

Hello Altpeter,

I'm pretty sure the threshold/argument problem doesn't exist in the
latest versions, including 1.4.10.
The debug output of your command line is commented out in the current
code.

negative freespace:
I currently can't imagine where the negative values come from but IMO they
shouldn't be there.
However, I'll try to find some time to test.

Matthias


----------------------------------------------------------------------

Comment By: Thomas Guyot (dermoth)
Date: 2007-11-30 20:44

Message:
Logged In: YES 
user_id=375623
Originator: NO

Sorry about that. A few of us looked into it a while back and couldn't
find the issue. I can take a second look, but it would help if you first
patch check_disk with the attached check_disk.extra-debug.patch and send me
the full output after running the plugin with -vvv (Please limit it to one
path if possible).

Since I don't have a BSD system to test with, the attached patch will give
me what I need to simulate your system and hopefully reproduce the bug.
File Added: check_disk.extra-debug.patch

----------------------------------------------------------------------

Comment By: Frank Altpeter (altpeter)
Date: 2007-11-30 16:05

Message:
Logged In: YES 
user_id=145970
Originator: NO

Are there any efforts on this topic yet? Can I help somehow in finding out
the reason for this? Personally, I think this should get a fairly high
priority because this bug makes check_disk more or less untrustworthy ...


----------------------------------------------------------------------

Comment By: Frank Altpeter (altpeter)
Date: 2007-11-21 04:35

Message:
Logged In: YES 
user_id=145970
Originator: NO

Hmmm, just detected one more problem with check_disk processing on
FreeBSD:

root@canismajor:~ # /usr/local/libexec/nagios/check_disk -w 10% -c 5% -X devfs -X procfs -X linprocfs -X tmpfs -X union /var
DISK WARNING - free space: /var 498 MB (5% inode=97%);| /var=9419MB;8924;9420;97;9916


root@canismajor:~ # /usr/local/libexec/nagios/check_disk -w 10% -c 5% -X devfs -X procfs -X linprocfs -X tmpfs -X union -vvv
DISK OK - free space: / 264 MB (53% inode=92%); /tmp 177 MB (36%
inode=90%); /usr 6458 MB (65% inode=83%); /var 496 MB (5% inode=97%);
/var/spool 31789 MB (69% inode=95%); /var/spool/mail 72962 MB (54%
inode=87%);
264 of 496 MB (53% inode=92%) free on /dev/amrd0s1a (type ufs mounted on
/) warn:0 crit:0 warn%:10% crit%:5%
177 of 496 MB (36% inode=90%) free on /dev/amrd0s1d (type ufs mounted on
/tmp) warn:0 crit:0 warn%:0% crit%:0%
6458 of 9916 MB (65% inode=83%) free on /dev/amrd0s1f (type ufs mounted on
/usr) warn:0 crit:0 warn%:0% crit%:0%
496 of 9916 MB (5% inode=97%) free on /dev/amrd0s1e (type ufs mounted on
/var) warn:0 crit:0 warn%:0% crit%:0%
31789 of 46096 MB (69% inode=95%) free on /dev/amrd0s1g (type ufs mounted
on /var/spool) warn:0 crit:0 warn%:0% crit%:0%
72962 of 135854 MB (54% inode=87%) free on /dev/amrd1s1d (type ufs mounted
on /var/spool/mail) warn:0 crit:0 warn%:0% crit%:0%| /=231MB;445;470;92;495
/tmp=318MB;495;495;90;495 /usr=3459MB;9916;9916;83;9916
/var=9420MB;9916;9916;97;9916 /var/spool=14307MB;46096;46096;94;46096
/var/spool/mail=62891MB;135853;135853;87;135853



I.e., when checking a mount point directly, the insufficient space gives a
warning, but when checked as part of a summary, the state is OK. It looks
like the warning and critical criteria are only applied to the first
mount point found ...
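
For comparison, here is a small Python sketch of the behaviour one would
expect: the same -w/-c percentages applied to every mount point, with the
overall state being the worst one. The free/total MB figures are taken from
the -vvv output above; the function name and the "less than" comparison are
assumptions for illustration, not the check_disk code.

```python
OK, WARNING, CRITICAL = 0, 1, 2

def state_for(free_mb: int, total_mb: int, warn_pct: float, crit_pct: float) -> int:
    """Classify one filesystem by its percentage of free space."""
    free_pct = 100.0 * free_mb / total_mb
    if free_pct < crit_pct:
        return CRITICAL
    if free_pct < warn_pct:
        return WARNING
    return OK

# (free MB, total MB) per mount point, from the verbose output.
filesystems = {
    "/": (264, 496),
    "/tmp": (177, 496),
    "/usr": (6458, 9916),
    "/var": (496, 9916),
    "/var/spool": (31789, 46096),
    "/var/spool/mail": (72962, 135854),
}

# Apply the same thresholds (-w 10% -c 5%) to every mount point.
states = {path: state_for(free, total, 10, 5)
          for path, (free, total) in filesystems.items()}

# The overall plugin state is the worst individual state.
overall = max(states.values())
print(states["/var"], overall)
```

With these inputs /var comes out as WARNING, so the summary should be
WARNING as well rather than OK.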

----------------------------------------------------------------------

Comment By: Frank Altpeter (altpeter)
Date: 2007-11-20 17:42

Message:
Logged In: YES 
user_id=145970
Originator: NO

A little more input, following a hint from #nagios:


# df -h /tmp
Filesystem       Size    Used   Avail Capacity  Mounted on
/dev/amrd0s1e    496M    461M   -4.5M   101%    /tmp

# check_disk -vvv /tmp | head -1
For /tmp, used_pct=101 free_pct=-1 used_units=460 free_units=1.75922e+13
total_units=495 used_inodes_pct=1 free_inodes_pct=99 fsp.fsu_blocksize=2048
mult=1048576
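
The free_units value in this debug line is consistent with a negative block
count being handled as an unsigned 64-bit number. The Python sketch below is
one reading of the numbers, not code taken from check_disk. It assumes that
the available space is about -2304 blocks of 2048 bytes (-4.5M) and that both
the block count and the byte count wrap modulo 2^64 before being divided by
mult. The function name is hypothetical.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit arithmetic

def wrapped_free_units(avail_blocks: int, blocksize: int = 2048,
                       mult: int = 1048576) -> int:
    """Free space in units of `mult` bytes when a signed block count
    is stored and multiplied as an unsigned 64-bit integer."""
    as_unsigned = avail_blocks & MASK64             # negative count wraps around
    free_bytes = (as_unsigned * blocksize) & MASK64  # product wraps as well
    return free_bytes // mult

# -4.5M expressed in 2048-byte blocks (approximate, df -h rounds the figure).
print(wrapped_free_units(-2304))  # about 1.76e13, just below 2**44
print(wrapped_free_units(512))    # a normal positive count: 1 MB
```

Under these assumptions the result is 2^44 minus a few MB, i.e. roughly
1.759e+13, which is in line with free_units=1.75922e+13 above.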


----------------------------------------------------------------------

Comment By: Holger Weiss (hweiss)
Date: 2007-11-20 17:22

Message:
Logged In: YES 
user_id=759506
Originator: NO

Thanks, we'll have to look into it.

----------------------------------------------------------------------

Comment By: Frank Altpeter (altpeter)
Date: 2007-11-20 17:14

Message:
Logged In: YES 
user_id=145970
Originator: NO

I would like to have this bug reopened. It still exists in nagios-plugins
version 1.4.10, at least on FreeBSD 6.2-RELEASE-p5, as the following test
shows:

Filesystem       Size    Used   Avail Capacity  Mounted on
/dev/amrd0s1e    496M    461M   -4.5M   101%    /tmp

# /usr/local/libexec/nagios/check_disk /tmp
DISK OK - free space: /tmp 17592186044411 MB (-1% inode=99%);|
/tmp=460MB;;;0;495

It would be great to have a fix soon - this is quite bad, since I cannot
trust check_disk anymore with that...


----------------------------------------------------------------------

Comment By: SourceForge Robot (sf-robot)
Date: 2006-11-02 22:20

Message:
Logged In: YES 
user_id=1312539

This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).

----------------------------------------------------------------------

Comment By: Ton Voon (tonvoon)
Date: 2006-10-19 15:49

Message:
Logged In: YES 
user_id=664364

Doesn't look like any updates since I last requested in July. Marking call
into pending.

Ton

----------------------------------------------------------------------

Comment By: Ton Voon (tonvoon)
Date: 2006-07-19 19:18

Message:
Logged In: YES 
user_id=664364

Ted,

Can you try the latest snapshot at http://nagiosplug.sf.net/snapshot.
There have been major changes to check_disk to sync it with coreutils' df
so there shouldn't be sign problems.

If you still have problems, can you tell me what version of df are you
using?

Ton

----------------------------------------------------------------------

Comment By: Ton Voon (tonvoon)
Date: 2005-11-08 07:05

Message:
Logged In: YES 
user_id=664364

From 1.4 onwards, we use the GNU coreutils library to get df data. I don't
know if FreeBSD use their own routines or not, but GNU coreutils should
support it.

Yes, I guess signed integers should fix it. It was an assumption on our
part that values would always be positive.

----------------------------------------------------------------------

Comment By: M. Sean Finney (seanius)
Date: 2005-11-08 06:28

Message:
Logged In: YES 
user_id=226838

hi,

well chalk this up to my having been away from traditional
unix/bsd implementations.  afaict in linux such reserved
space is still taken into calculation of total available
space (ie, you could get ENOSPC before the disk reached 0%).

but anyway, i think the fix is still obvious, that we should
do all scans and assignments as signed integers instead of
unsigned.  if i don't hear any complaints from anyone else
on the plugins team, i'll probably do this at some point
(and hope that it doesn't break something else)

also, having taken a look at the check_disk code, i can't
seem to find any references to the df program... so i guess
if you're using 1.4.2 or later that it's purely within the
internal disk space routines.

----------------------------------------------------------------------

Comment By: Ted Cabeen (secabeen)
Date: 2005-11-07 16:25

Message:
Logged In: YES 
user_id=40466

All modern unix file-systems reserve a portion (5-10%) of
the disk space for use by root only and to speed disk
accesses.  If the root user exceeds the normal disk space
and uses some of the reserve space, the system will
represent the amount of free space as negative.  
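
As a worked example, the df line from the initial report can be checked by
hand. This Python snippet assumes the usual df relationships (free = total -
used, avail = free - reserve, capacity = used / (used + avail)); those
formulas are an assumption here, the numbers are the ones reported for /usr.

```python
# 1K-block figures for /usr from the initial report.
total_kb = 3096462
used_kb = 2989082
avail_kb = -140336

# Blocks that are physically free, including the root-only reserve.
free_kb = total_kb - used_kb

# The reserve is whatever separates "free" from "available to users".
reserved_kb = free_kb - avail_kb
reserve_pct = 100.0 * reserved_kb / total_kb

# Capacity as df shows it: used space relative to user-visible space.
capacity_pct = 100.0 * used_kb / (used_kb + avail_kb)

print(free_kb, reserved_kb)
print(round(reserve_pct), round(capacity_pct))
```

So the disk still has physically free blocks, but root has used part of a
reserve of about 8%, which is why the available figure is negative and the
capacity is above 100%.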

I don't know how check_disk is checking the disk space (df
or internal routines).  Is there an easy way to check?

check_disk (nagios-plugins 1.4.2) 1.57 is the version I'm
running.

----------------------------------------------------------------------

Comment By: M. Sean Finney (seanius)
Date: 2005-11-07 07:56

Message:
Logged In: YES 
user_id=226838

hi,

um, i just have to ask.  how do you have negative free space?  

some other information that would be helpful:
- is check_disk using the df command or internal disk space
routines?
- if df, what df command syntax is check_disk using?
- what version of the plugins are you using?

i believe that the plugin is making an assumption that the
amount of disk space available is unsigned, because, er...
well i'd never heard of negative disk space, anyway :)

----------------------------------------------------------------------
