Naaa, mine is a big use case/implementation; it’s just 3 out of a bagillion devices that this needs doing for. It’s not my personal email, it’s a distro. We could easily do a JSON POST to any manner of transports, but “mail();” was so much simpler.

All we need to know is when it’s down so someone can go kick the hamster wheel; that’s “ping only” monitoring to me.

If you’re looking for graphs / SLA type metrics too, then I’d argue you’re already looking for more than “ping only” monitoring....

I see what you’re saying though; OBS has such a great altogether GUI experience, it’d be nice!

-Ryan

On Mar 3, 2019, at 08:15, Colin Stubbs <cstubbs@gmail.com> wrote:


I know where you're at... but,

0) No refactoring required - this is part of the point of the patch (I would not have even attempted this if it wasn't so easy...) and of my long writeup for Mike, given he hadn't actually looked at JIRA yet. It uses existing code/architecture - a bit flag and device attributes - to mark a device as "don't do any SNMP to this device right now".

1) Observium uses fping...

2) Smokeping is horrible - can provide great value sure, but horrible to use, and again, extra tool. I'm a long way from unfamiliar with it and have both used it many years ago, and considered it for our current environments. Not useful to us.

3) Smokeping is not integrated into Observium if the device is not monitored by SNMP - this is fine when it's just you or a small team of people who understand this, are happy to flick between interfaces, and know how to work with Smokeping. Not great when you have a larger team and you really just need one interface on a monitor in your operations room that tells everyone everything that's up/down/problematic right now. Particularly not fine when your level 1s can't really log in on the CLI and edit config files etc.

4) IP SLAs - a vendor-specific thing. I love them in the right environment. Fantastic tool and capability to have in a network. We have zero Cisco in this situation and don't want to add a device specifically for this.

5) Your description sounds like it's just you in this environment... and if email fails, or you're hit by a bus, no one else knows there's a problem let alone how your alerting solution works and how to maintain it. Also, manual investigation/access to history. By having Observium do the ping we get to leverage the alerting capabilities inherent to it, e.g. email/webhook/PagerDuty/OpsGenie/VictorOps etc etc etc., as well as the webUI for visual alert notification and access to graphs/history.

It's actually a simpler and cleaner solution; unless you value seeing the min/max/average/jitter values that IP SLA or Smokeping can help you see. In which case it's not what you need.

-Colin

Email: cstubbs @ gmail . com


On Sun, 3 Mar 2019 at 22:44, Ryan Huff <ryanhuff@outlook.com> wrote:
I really do understand this Colin, on a deeper level than you might imagine ;). I have no vested interest either way personally, I’m just offering some unsolicited thoughts...

I always looked at this specific topic like asking a 10 lb (4.5 kg or 0.7143 stone) sledgehammer to drive a nail. Sure, it can do it, but it’ll probably be messy and ugly because a sledgehammer was never meant to drive nails, and when you use tools in ways they weren’t really designed for, things can get messy. Just use a hammer instead.

I have done three things to achieve ICMP-only monitoring and also stick within the base code (as a developer, I hate mucking with base code or re-inventing wheels when I want to do different things). ICMP-only monitoring, to me, always just seemed silly (not in purpose) to try and work into an existing and established application that didn’t already do it.

It’s just too easy a problem to solve otherwise to justify the time/expense of refactoring an application, IMHO.

- I use RTT SLAs to monitor next hop devices (which OBS is fabulous at picking up), assuming the device hopped from is SLA capable and OBS monitored.

- I use Fping (Smokeping) integrated with OBS. Using the Smokeping web wrapper makes a great RTT only tool, and you also have the OBS integration for when it makes sense to use on OBS devices.

- I have a small (very small) bash script that reads a text file of IP addresses and launches one thread per device to run continuous pings. Not a very scalable script, but like you, I only need it for a very limited number of cases. The script uses sendmail to email me if it can’t reach something.
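
A minimal Python sketch of the kind of script described in the last bullet, not the actual bash script: it reads one IP per line, starts one thread per host, and reports state changes. The function names, the `ops@example.com` recipient, and the Linux iputils `ping` flags are assumptions made for illustration.

```python
import subprocess
import threading
import time


def load_hosts(text):
    """Return the non-empty, non-comment entries of a hosts file (one IP per line)."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def system_ping(host):
    """Send one ICMP echo with the system ping (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sendmail_notify(host, is_up, recipient="ops@example.com"):
    """Email a state change via the local sendmail binary (recipient is hypothetical)."""
    state = "UP" if is_up else "DOWN"
    message = f"To: {recipient}\nSubject: {host} is {state}\n\n{host} is {state}\n"
    subprocess.run(["/usr/sbin/sendmail", "-t"], input=message.encode(), check=False)


def watch(host, ping, notify, interval=60.0, max_checks=None):
    """Ping one host in a loop and call notify(host, is_up) on each state change."""
    was_up = True  # assume up at start so the first failure is reported
    checks = 0
    while max_checks is None or checks < max_checks:
        is_up = ping(host)
        if is_up != was_up:
            notify(host, is_up)
            was_up = is_up
        checks += 1
        time.sleep(interval)


def start_watchers(hosts, ping, notify, interval=60.0, max_checks=None):
    """Launch one daemon thread per host and return the thread objects."""
    threads = []
    for host in hosts:
        t = threading.Thread(
            target=watch,
            args=(host, ping, notify, interval, max_checks),
            daemon=True,
        )
        t.start()
        threads.append(t)
    return threads
```

In real use this would be started with something like `start_watchers(load_hosts(open("hosts.txt").read()), system_ping, sendmail_notify)` and the main thread kept alive; passing `ping` and `notify` in as callables keeps the loop testable without touching the network.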

Thanks,

Ryan

On Mar 3, 2019, at 06:35, Colin Stubbs via observium <observium@observium.org> wrote:

Hi Mike,

They're in the JIRA ticket, but sure, here's some inserted to this email.

I'm aware of Adam's and your preference to reject or simply ignore the regular requests for this kind of feature, and of the confusion it causes new users considering Observium for use in their organisations when they find out they can't do the most basic of monitoring to something using Observium.

A great many of your users would gain a lot of value from being able to add a few ping only hosts.

Why build, manage and integrate (to other tools) another ping based monitoring system when all you need is to monitor a handful of devices with ping only; simply because those devices fail to offer SNMP for some reason?

Anything that does have a working SNMP service can and should be monitored with SNMP to gain greater insight into what's happening on it... but not everything does.

No one in their right mind should attempt to use Observium as a system which monitors massive numbers of DNS names or IPs with ping only; there are better systems out there for that. But there are so many situations in which an Observium install can and should be able to monitor some devices ping only.

I've associated the ticket with the two previous JIRA requests for this feature; which Adam rejected.

They were just feature requests as I recall; no code proposed. Totally understandable you guys didn't want to work on the feature when it's not really what you feel Observium is for.

Rejecting this now, when you have just been gifted working code that modifies things fairly minimally, simply leverages the ping that's already being sent to each and every device (that ping is not disabled for), and uses an existing method to mark devices so that further SNMP based polling does not occur... would make no sense.

That said; I totally understand if you want me to fix/add/improve things before considering it further - simply communicate what's necessary and I'll do that.

So, for anyone else on the list, who wants the feature; now is your best chance to add your voice and explain why this kind of feature is of use to you. Because you're probably not going to get it otherwise.

We've been using this internally since early 2017... I just got around to updating Observium to the latest and greatest (reasons, terrible terrible reasons...), and had to update my patch.

But, I'd really prefer not to have to maintain a patch, and I definitely don't want to have to publish it so others can use it and cause Observium Ltd further dramas because of it.

How many hosts do we monitor ping only?

Six. A grand total of six. Everything else has SNMP in some form; and we want as much info as we can extract from them via it - which is why we love Observium. Nothing else compares for SNMP based monitoring IMHO.

Most of those ping only devices are next hop service provider devices just outside our network, for which they won't give us any kind of SNMP access. Pinging them offers a small (very small!) degree of insight into link latency/loss from us to them (manual comparison I must admit), and/or potential device problems for the thing that's currently responding for that IP.

Hey look... a perfect use case for this feature...

The others are devices that we use but which simply don't offer SNMP because vendor X didn't bother to add net-snmp to their standard build for CentOS/Ubuntu/whatever weird and wonderful Linux distro they built them on, when they really should have. We don't care much; we just need to know they're on the network in some way and want a long track record of response time to them.

Hey look... a perfect use case for this feature...

Not surprisingly, we don't want to have to operate and maintain a separate system for six IPs. Alerting can be flappy because Observium/fping is only sending a single ICMP echo - we're OK with that and don't even really want more pings, which would lead to average values anyway - and we've worked around flaps with delays and automatic suppression of duplicate alerts in our alert management and escalation tool.
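
A rough Python sketch of that kind of flap handling, assuming a simple delay plus duplicate suppression. It is not the logic of any particular alert management tool; the `FlapFilter` name and the five-minute poll spacing in the usage below are assumptions for illustration.

```python
class FlapFilter:
    """Delay "down" alerts and suppress duplicates for a single host."""

    def __init__(self, delay_seconds):
        self.delay_seconds = delay_seconds
        self.down_since = None  # time of the first failed poll in this outage
        self.alerted = False    # True once a "down" alert has been raised

    def observe(self, timestamp, is_up):
        """Feed one poll result; return "down", "recovery", or None."""
        if is_up:
            self.down_since = None
            if self.alerted:
                self.alerted = False
                return "recovery"
            return None  # a short flap that never reached the delay
        if self.down_since is None:
            self.down_since = timestamp
        if not self.alerted and timestamp - self.down_since >= self.delay_seconds:
            self.alerted = True
            return "down"
        return None  # still inside the delay, or a duplicate "down"


# Polls 300 seconds apart: one single-poll flap, then a real outage.
polls = [(0, True), (300, False), (600, True), (900, False),
         (1200, False), (1500, False), (1800, False), (2100, True)]
flap_filter = FlapFilter(delay_seconds=600)
alerts = [flap_filter.observe(t, up) for t, up in polls]
```

With a 600 second delay, the single missed echo at 300 raises nothing, the outage starting at 900 raises one "down" at 1500, and the repeat failure at 1800 is suppressed until the recovery at 2100.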

The six are monitored by IP; the DNS hostname test below was simply to double check there was no issue with this and a DNS name. Just a hostname that I've never managed to forget and use a lot to ping test connectivity out.

Also, I just realised that the "Skip ping" option is no longer automatically hidden in device settings when "Skip SNMP" is checked; I'll add an updated diff in JIRA shortly after I look at the JS for that again.

<Capture.PNG>

<observium_ping_only_screens3.PNG>

<observium_ping_only_screens4.png>

<observium_ping_only_screens5.PNG> 


On Sun, 3 Mar 2019 at 18:07, Mike Stupalov <mike@observium.org> wrote:
Ohh my dear..

What is the real benefit of using Observium for ping_only hosts?

Please show (screenshots) how these devices are displayed on your install.

Colin Stubbs via observium wrote on 03/03/2019 07:43:

For the benefit of anyone on the list who doesn't use JIRA... and also so that others who support (want/need) this feature can comment.

https://jira.observium.org/browse/OBS-2925

Attached patch defines OBS_SNMP_SKIP flag and uses snmp_skip device attribute, similar to OBS_PING_SKIP and ping_skip, in order to have hosts that are ping only.
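
As an illustration of the bit flag plus device attribute idea only: the sketch below is Python rather than Observium's PHP, and the flag values and the `plan_poll` helper are hypothetical, not taken from the patch or from Observium's definitions.

```python
from enum import IntFlag


class PollFlags(IntFlag):
    # Illustrative values only, not Observium's actual constants.
    PING_SKIP = 1
    SNMP_SKIP = 2


def flags_for_device(attribs):
    """Build a flag set from per-device attributes such as ping_skip / snmp_skip."""
    flags = PollFlags(0)
    if attribs.get("ping_skip"):
        flags |= PollFlags.PING_SKIP
    if attribs.get("snmp_skip"):
        flags |= PollFlags.SNMP_SKIP
    return flags


def plan_poll(attribs):
    """Decide which checks to run for a device, given its attributes."""
    flags = flags_for_device(attribs)
    return {
        "ping": not (flags & PollFlags.PING_SKIP),
        "snmp": not (flags & PollFlags.SNMP_SKIP),
    }


ping_only = plan_poll({"snmp_skip": 1})  # ping runs, SNMP is skipped
normal = plan_poll({})                   # both checks run
```

The attributes are assumed to be integers here; if they arrive as strings from a database, they would need converting before the truthiness checks.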

Ping only hosts can still have the Observium Unix Agent installed (tested), and other poller modules such as IPMI enabled (untested).

Tested:

  1. Add/remove/rename ping only hosts via CLI
  2. Add/remove ping only hosts via webUI
  3. View/interact with ping only hosts via webUI - SNMP specific features/menus etc are hidden while skip SNMP is enabled
  4. Alerting - device_status equals 0 && device_status_type equals ping - will trigger alerts for host down/recovery events
  5. Shifting a previously SNMP contactable host to ping only by ticking the skip SNMP box - old SNMP graphs/etc are maintained and remain available - untick the skip SNMP box and SNMP polling begins again
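
The alert condition in item 4 can be read as a simple predicate. The Python sketch below only mirrors that logic; it is not Observium's alert checker syntax, and the sample device records (including the "ok" and "snmp" status types) are made up for illustration.

```python
def is_ping_down(device):
    """True when device_status equals 0 and device_status_type equals ping."""
    return (
        device.get("device_status") == 0
        and device.get("device_status_type") == "ping"
    )


# Hypothetical device records, using documentation-range addresses.
devices = [
    {"hostname": "192.0.2.10", "device_status": 1, "device_status_type": "ok"},
    {"hostname": "192.0.2.11", "device_status": 0, "device_status_type": "ping"},
    {"hostname": "192.0.2.12", "device_status": 0, "device_status_type": "snmp"},
]
down_hosts = [d["hostname"] for d in devices if is_ping_down(d)]
```

Only the second record matches, because both parts of the condition must hold.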

Things I know kind of don't work right now:

  1. Location override - poller/discovery doesn't seem to perform geocoding and whatever else is happening there

Things that could be improved:

  1. The Unix Agent poller module etc is enabled by default for all hosts. For ping only hosts, perhaps it should be disabled by default? That would improve performance by reducing the number of processes that hang while the 10s default connect timeout happens.

Totally untested:

  1. Use of autodiscovery SNMP skip - should work in theory, unsure if those parts of the patch should actually be used though. Some people out there may actually want to add anything that does respond to ping and can be found thru adjacency and routing protocol info etc??

 

Patch generated from recent trunk, touches files as below,

[root@desktop observium]# diff -r -u observium-trunk root | grep -v ^Only > ping_only_hosts.diff
[root@desktop observium]# cd root
[root@desktop root]# svn status
M add_device.php
M html/pages/addhost.inc.php
M html/pages/device/edit/device.inc.php
M html/pages/device/edit.inc.php
M html/pages/device/graphs.inc.php
M html/pages/device/perf.inc.php
M includes/config-variables.inc.php
M includes/defaults.inc.php
M includes/definitions.inc.php
M includes/discovery/functions.inc.php
M includes/functions.inc.php
M includes/polling/functions.inc.php
M poller.php
M rename_device.php
[root@desktop root]# svn info | grep ^Revision
Revision: 9704
[root@desktop root]
[root@desktop observium]# mv ping_only_hosts.diff ping_only_hosts_r9704.diff


-Colin

Email: cstubbs @ gmail . com
_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium

--
Mike Stupalov
Observium Limited, http://observium.org