[ale] Todays trends

Wed Oct 9 07:45:27 EDT 2013

So the issue wasn't Nagios - it was the folks managing it.

Nagios is a great tool and we use it in Production.  As you sort of allude to, it is configurable, so you don't have to use the defaults.  Not only that, it lets you create your own monitors easily.  We've created some very esoteric monitors for various purposes, such as SNMP checks for air handlers, web checks for a variety of scenarios (they're not all as simple as seeing if the web page is responding), and checks that determine which node of a cluster has the actual service running at the moment.  You can use SNMP checks, or you can do full scripting on your various servers by adding NRPE (UNIX/Linux) or NSClient++ (Windows).
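For anyone who hasn't written one, a custom monitor is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).  Below is a minimal sketch of a web check that looks at page content and response time rather than only whether the page answers.  The function names, the expected text, and the 2 and 5 second thresholds are made up for illustration; this is not one of the monitors described above.

```python
import time
import urllib.request

# Exit codes that Nagios plugins use to report state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(status, body, elapsed, expect, warn_secs, crit_secs):
    """Map one HTTP response onto a plugin state and a one-line message."""
    if status != 200:
        return CRITICAL, f"HTTP status {status}"
    if expect not in body:
        # The page answered, but not with the content we rely on.
        return CRITICAL, f"expected text {expect!r} not found"
    if elapsed >= crit_secs:
        return CRITICAL, f"response took {elapsed:.2f}s"
    if elapsed >= warn_secs:
        return WARNING, f"response took {elapsed:.2f}s"
    return OK, f"page healthy in {elapsed:.2f}s"


def check_url(url, expect, warn_secs=2.0, crit_secs=5.0):
    """Fetch url, print the status line, and return the exit code."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except Exception as exc:  # unreachable host, HTTP error, timeout, ...
        print(f"CRITICAL - request failed: {exc}")
        return CRITICAL
    elapsed = time.monotonic() - start
    code, message = evaluate(status, body, elapsed, expect, warn_secs, crit_secs)
    print(f"{LABELS[code]} - {message}")
    return code
```

A wrapper script would finish with something like sys.exit(check_url(sys.argv[1], sys.argv[2])) so that the exit code reaches Nagios (or NRPE, if the check runs on the remote server).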

The benefit of a monitoring system is that you set it up not for your "properly" configured systems but rather for what occurs when they quit behaving "properly".  It is much the same as performance tuning: you set things up the way you expect them to work, then adjust for the reality of what you're actually observing.
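One way to do the "adjust for reality" step is to derive thresholds from what a server has actually been doing instead of from a default.  This is only a rough sketch: the nearest-rank percentile, the 95th/99th percentile choice, and the 20% headroom are assumptions for illustration, not anything Nagios does on its own.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of the observed samples (pct from 0 to 100)."""
    if not samples:
        raise ValueError("need at least one observed sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_thresholds(samples, warn_pct=95, crit_pct=99, headroom=1.2):
    """Suggest (warning, critical) thresholds a bit above normal behavior."""
    warn = percentile(samples, warn_pct) * headroom
    crit = percentile(samples, crit_pct) * headroom
    return warn, crit
```

Feed it a week or so of load averages, queue depths or response times, compare the result with the thresholds you started with, and revisit it when the workload changes.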

The problem you describe happens with any monitoring system in which the people managing the tool don't actually manage it.  At a Fortune 500 company I used to work at, we had a rather pricey monitoring system in place, and it was somewhat annoying because all it really did was page me endlessly while I was in the middle of working on a problem.  Nagios at least allows end users to acknowledge problems once they occur, which:
a) Lets others who see the alert see also that it has been acknowledged.
b) Stops sending out needless emails and pages until the issue clears.
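
To make (a) and (b) concrete, here is a toy model of the behavior.  It is not Nagios's internal logic, and the class and method names are made up for illustration; it only shows the idea that an acknowledged problem stays visible but quiet until the service recovers.

```python
class ServiceAlert:
    """Toy model of one monitored service; not Nagios's real internals."""

    def __init__(self, name):
        self.name = name
        self.ok = True
        self.acknowledged = False  # visible to everyone looking at the alert

    def acknowledge(self):
        # Acknowledging only makes sense while a problem is open.
        if not self.ok:
            self.acknowledged = True

    def record_check(self, ok):
        """Store a check result; return True if a notification should go out."""
        recovered = ok and not self.ok
        self.ok = ok
        if recovered:
            self.acknowledged = False  # the issue cleared, so reset
            return True                # send a single recovery notice
        # Keep notifying about a problem only until someone acknowledges it.
        return (not ok) and not self.acknowledged
```

With this model, a failing check notifies, acknowledge() silences the repeats, and the next passing check sends one recovery notice and clears the acknowledgement.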

It is amusing to me that we have a group here that complains they don't want our Nagios alerts because they say they do their own monitoring, yet I often have to let them know when something is down: I see it in Nagios, and they don't have a clue it is an issue until I tell them.

-----Original Message-----
From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of Jeff Hubbs
Sent: Tuesday, October 08, 2013 5:33 PM
To: Atlanta Linux Enthusiasts
Subject: Re: [ale] Todays trends

My experience with Nagios in production was a bad one but only because the people who were doing it paid no attention to the nominal operating behavior of the servers being monitored and relied on dead-stupid broad-brush configurations that alarmed all the time uselessly.  And they wouldn't revisit their configurations in any sort of reasonable way because, well, that was *work*; you had to *think*!  The moral is that in typical (by which I mean over-servered in the extreme) environments, in order to use Nagios effectively you wind up having to put in the kind of thought and effort that would have best been put into setting up reliable systems in the first place.  It's a bit like a racing team putting half their effort into the efficiency of their tow trucks. :)

A Puppet setup I encountered as my IT career was ending had a similar problem, but it tended to hold everything back; whenever you wanted to advance anything - change to a new version of a Linux distribution (if you're stuck in that kind of goat-rope) or introduce a new distribution - you had to drag the Puppet implementation along collaterally, and it seemed as though people would rather run ancient versions of stuff and suffer those consequences than do the dragging.  If anything, it seemed to me that Puppet and the like get used to facilitate practices that shouldn't be occurring in the first place:  server overproliferation.

On 10/8/13 4:11 PM, JD wrote:
> On 10/08/2013 01:46 PM, Boris Borisov wrote:
>> Back in the mid-2000s, when I was working actively on Linux servers,
>> my major tasks were related to setting up firewalls, LAMP servers,
>> networking, ISP user traffic management (today maybe irrelevant since
>> users have unlimited traffic), Qmail servers etc. Every install had
>> its own hardware except VHOST sites and mail servers.
>>
>> I want to get myself up to date, so please just name a few of the new
>> hot technologies used by Linux administrators.
>>
>> I assume these two are "hot": Virtual servers and cloud computing.
> DevOps, CM tools, monitoring, alarming, alerting.  Nagios, Splunk,
> Cacti, OpenNMS, Puppet, Chef, CFengine, Ansible are a few of the
> tools, but there are many others.
>
> When deploying servers to a public or private cloud, can you scale
> from 5 to 500 servers easily with your deployment tools AND not be
> tied to a single cloud provider?  Are the deployment tools
> another-mouth-to-feed or do they really make things easier, reproducible, more secure, more flexible, idiot-proof?
>
> OpenStack seems to be gaining traction in the Fortune 50 world too.
> They aren't throwing VMware out completely, but they are looking to minimize that technology.
>
>
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://mail.ale.org/mailman/listinfo/ale
> See JOBS, ANNOUNCE and SCHOOLS lists at
> http://mail.ale.org/mailman/listinfo
>
