Re: [OCLUG-Tech] oswatcher alternative, collector of top/ps/iostat/vmstat/... info

  • Subject: Re: [OCLUG-Tech] oswatcher alternative, collector of top/ps/iostat/vmstat/... info
  • From: "Brenda J. Butler" <bjb [ at ] sourcerer [ dot ] ca>
  • Date: Sat, 13 Jul 2013 22:55:21 -0400
I'm curious why nagios/munin are overkill.  I think they exactly match
your requirements.

Scheduling the tests and keeping track of the results in a scalable way
can be a bit complicated, but the actual tests are basically plugins.
nagios and munin come with a few built-in tests (basically, the ones
you want to see) and the rest are plugins, probably in separate
packages.
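
To give a feel for how small such a plugin can be, here is a rough
sketch of a munin-style plugin in Python.  It assumes the usual munin
convention: called with "config" it prints the graph metadata, called
with no argument it prints the current value.  The field name "load"
and the graph title are just my choices for illustration, and I'm using
os.getloadavg() as a stand-in for whatever you actually want to sample.

```python
#!/usr/bin/env python3
"""Sketch of a munin-style plugin reporting the 1-minute load average."""
import os
import sys


def config_lines():
    # Graph metadata, printed when the plugin is called with "config".
    # The title and the field name "load" are illustrative choices.
    return [
        "graph_title Load average (sketch)",
        "graph_vlabel load",
        "load.label load",
    ]


def value_lines(one_minute_load):
    # One "<field>.value <number>" line per field declared in config.
    return ["load.value %.2f" % one_minute_load]


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        lines = config_lines()
    else:
        # os.getloadavg() returns the 1, 5 and 15 minute load averages.
        lines = value_lines(os.getloadavg()[0])
    print("\n".join(lines))


if __name__ == "__main__":
    main(sys.argv)
```

Anything more interesting (per-process memory, io counters, ...) would
follow the same shape: one function that describes the fields and one
that prints their current values.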

It's a bit annoying to learn the nagios config language though, I have
to admit.  Munin is far less complicated, but the way it thins out data
as time goes by annoys me.  Then again, that was one of your requirements.
The graphs are a bonus.  You don't have to look at them if you don't
want to.

I haven't looked at zenoss, but will keep an eye open for it.

bjb


On Fri, Jul 12, 2013 at 11:49:23PM -0400, Peter Sjöberg wrote:
> On 07/12/2013 10:28 AM, Brenda J. Butler wrote:
> > 
> > 
> > I don't know oswatcher, but based on your description the following
> > would be useful for you:
> > 
> > 
> > munin (keeps a constant-sized database, which thins out as you look back
> > in time).
> From a 10-second look it seems like overkill, but I will look at it more.
> 
> > 
> > nagios
> Definitely overkill. I use nagios for other things, but what I'm after is
> not monitoring so much as a tool to use after the monitoring has alerted
> that something is bad. At that point I want to know what led up to all
> the memory being used up, or which process consumed all the cpu/io, since
> once the alert happens it often gets resolved with a big shotgun
> like a reboot (like when they accidentally started 40 instances of a
> java app on a server designed for 4) and we are left to tell what
> happened without logs.
> 
> 
> On 07/12/2013 01:36 PM, Jeffrey Moncrieff wrote:
> > You can also try zenoss.
> >
> Will check on that later
> 
> > 
> > In both cases, if there is some test they don't already do, you can
> > write your own and have them use it.
> > 
> Well, google did find https://github.com/stephenlang/scrutiny and that's
> about the closest I've seen to what I'm looking for, but a bit too basic.
> 
> Since after all there's not that much to it, I started writing something
> that I will try out over the weekend. I know one challenge will be
> actually collecting anything when the system is crawling, but
> anything is better than what we have now, which is nothing (besides
> 1-minute sar data, which tends to stop before the system dies).
> 
> /ps
> 



> _______________________________________________
> Linux mailing list
> Linux [ at ] lists [ dot ] oclug [ dot ] on [ dot ] ca
> http://oclug.on.ca/mailman/listinfo/linux
