[Nagiosplug-devel] Suggested alterations to the Performance Protocoll

Voon, Ton Ton.Voon at egg.com
Fri Sep 10 14:45:06 CEST 2004


> -----Original Message-----
> From:	Karl DeBisschop [SMTP:karl at debisschop.net]
> Sent:	10 September 2004 01:36
> To:	Ben Clewett
> Cc:	Yves Mettier; nagiosplug-devel at lists.sourceforge.net
> Subject:	Re: [Nagiosplug-devel] Suggested alterations to the
> Performance Protocoll
> 
> Ben Clewett wrote:
> > 9.  Limit protocol to just numerical data:  No.  But will not be 
> > supported by some storage programs.  (Yves performance parsing engine to
> > allow such programs to do what they like with this data.  Hows this work
> > going?)
> 
> Right - programs that are built for graphing would be expected to ignore 
> such data. But that does not give license to design the guideline such 
> that nothing but graphing can ever be done with the data.
> 
	[Voon, Ton]  [apologies for using Outlook]
	Actually, I disagree on this fundamental point. I think perf data is
**only** about graphing the data - this is why I am against strings and
check_time labels. 

	I've been thinking a lot about this and I think what is required is
the concept of "additional structured data". The status output gives an
overview, the perf data gives the graphing and the "additional structured
data" gives plugin-specific stuff that can be processed in some way.

	I've been playing with BMC Patrol and one thing that is quite clever
is that if it finds CPU usage is high on a server, it returns a list of the
top 5 processes bound on the CPU. So, with the above in mind, I can see this
in future:

	$ check_procs --metric=CPU -c 90%
	CRITICAL: 2 processes over 90% CPU | process=2 | <process><name>httpd</name><cpu>93%</cpu></process><process><name>ora_pmon_WEB</name><cpu>95%</cpu></process>

	$ check_log -c 10 -e "killed"
	CRITICAL: 14 errors in log file /var/adm/messages | errors=14 | <error><time>{unixtime}</time><message>httpd killed - restarted automatically</message></error><error><time>{unixtime2}</time><message>Other message with killed in</message></error>etc,etc

	[Syntax and output not necessarily 100% accurate :)]

	The use of XML is purely because it is easily extensible and people
can write XSLT to parse it any way they want. The | is the separator between
the status overview, the perf data and the extra data. However, there are
limits to the output of plugins which would need to be addressed first.
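
	As a rough sketch of how a consumer might split such a line, assuming
the three-part "status | perf data | extra data" layout proposed above (it is
only a suggestion in this mail, not an existing plugin format). The function
names are made up for illustration, and wrapping the extra data in a synthetic
<extra> root element is my own assumption, needed because the examples above
have no single root element:

```python
import xml.etree.ElementTree as ET


def split_plugin_output(line):
    """Split 'status | perfdata | extra' into exactly three strings.

    Missing sections come back as empty strings. Only the first two
    pipes are treated as separators, so a pipe inside the extra data
    is left alone (a pipe inside the status text would still break this).
    """
    parts = [p.strip() for p in line.split("|", 2)]
    parts += [""] * (3 - len(parts))
    return tuple(parts)


def parse_perfdata(perfdata):
    """Turn whitespace-separated label=value tokens into a dict of strings."""
    result = {}
    for token in perfdata.split():
        label, _, value = token.partition("=")
        result[label] = value
    return result


def parse_extra(extra):
    """Parse the sibling XML elements into a list of {tag: text} dicts."""
    if not extra:
        return []
    # Hypothetical wrapper element: the proposal has no single root.
    root = ET.fromstring("<extra>" + extra + "</extra>")
    return [{child.tag: child.text for child in item} for item in root]


line = (
    "CRITICAL: 2 processes over 90% CPU | process=2 | "
    "<process><name>httpd</name><cpu>93%</cpu></process>"
    "<process><name>ora_pmon_WEB</name><cpu>95%</cpu></process>"
)
status, perfdata, extra = split_plugin_output(line)
print(status)
print(parse_perfdata(perfdata))
print(parse_extra(extra))
```

	A grapher would only ever look at the middle section and could ignore
the rest, which is the separation argued for above.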

	If this is the way to go, then that's great - in the future. I would
like to focus on what is required with perfdata and, for me, that means only
graphable data, so I still disagree with check_time and strings.

	But this is a nice lively debate, so I anxiously await your
comments!

	Ton


