> This would probably not be rule-based, as the margin for error is ridiculously huge.
> Fixing a device to a remote poller would have to be a very deliberate thing.
> The most likely method would be to fix a device to the instance from which it was added.
Right, we could choose the nearest poller, or load-balance across the poller resources.
But we don't have the same number of interfaces and BGP sessions on every router.
If one poller gets only the big routers and the others only the small routers, too bad :-)

>   I'm still not 100% convinced that this is useful for the majority of instances though, as remote SNMP is over 9000 times less retarded than remote mysql and remote rrdcached. It may be useful for behind-firewall installs, though.
Exactly, but it's not infinitely scalable either. And buying 3-5 "good servers" is cheaper than buying one big supercomputer :-)
We're looking for a great tool like Observium that also scales. I don't want to change from Observium to another tool, because I like your tool!

The heart of the problem is storage. Shouldn't the latency to a remote rrdcached be negligible if the servers are in the same rack, on the same switch?
Someone else suggested (off-list) testing CEPH (as a replacement for NFS), but I have never tried CEPH before.
IMHO, if we use NFS/CEPH storage, we can't run rrdcached on each poller; we need a centralised rrdcached.
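If so, pointing every poller at the central box should be a single line on each poller (a sketch, assuming Observium's standard rrdcached option in config.php and a hypothetical host name):

  $config['rrdcached'] = "rrd-storage.example.com:42217";  // central rrdcached, default port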

Johann

2016-11-18 5:09 GMT+01:00 Adam Armstrong <adama@memetic.org>:
The only parts of this which make sense is remote rrdcached interaction and fixing devices to poller wrapper instances.

This would probably not be rule-based, as the margin for error is ridiculously huge. Fixing a device to a remote poller would have to be a very deliberate thing. The most likely method would be to fix a device to the instance from which it was added.

I'm still not 100% convinced that this is useful for the majority of instances though, as remote SNMP is over 9000 times less retarded than remote mysql and remote rrdcached. It may be useful for behind-firewall installs, though.

Adam.


On 18 Nov 2016, at 03:58, Jesper Frank Nemholt <jfn@dassic.com> wrote:
A few things on my wish list related to this :

Add another option to poller-wrapper.py so that it polls only selected devices, based either on location (maybe add a custom field called data center or computer room, so as not to mess up the geo-location feature), on device type (server, network, storage etc.), or on FQDN subdomain (maybe regex-based).

In addition to this, put the RRD files into a subdirectory hierarchy based on the same options (here I'd see location as the most important). Doing this would also make it easier to split the RRD files across different storage volumes.
This could be as simple as an optional append-path option on each device: if it is filled out, it is appended to the RRD path; if not, the normal /opt/observium/rrd applies.
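For example, a device with an append-path of "dc1" might end up like this (purely hypothetical names, since the option doesn't exist today):

  /opt/observium/rrd/dc1/router1.example.com/port-1.rrd      (append-path set)
  /opt/observium/rrd/switch7.example.com/port-1.rrd          (no append-path, default)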

With this, one could have one poller (or more) per data center, let it store locally at first and then rsync/unison back to a master system, or alternatively, as you mention, use rrdcached as a central catch-all with decentralized pollers.
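For the rsync variant, one job per data center along these lines would do (a sketch; the paths and host name are hypothetical):

  rsync -a --delete /opt/observium/rrd/dc1/ master.example.com:/opt/observium/rrd/dc1/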


/Jesper

On Thu, Nov 17, 2016 at 4:11 AM Markus Klock <markus@best-practice.se> wrote:
No need to modify the script; it's already supported by poller-wrapper.py:

usage: poller-wrapper.py [-h] [-s] [-i [INSTANCES]] [-n [NUMBER]]

You use the -i and -n arguments of poller-wrapper. If you have 5 poller VMs, you set -i 5 and -n 0 on the first VM, -i 5 and -n 1 on the second VM, and so on.
VM number 1 will then poll 1/5 of all your devices.
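Concretely, the invocations would look something like this (assuming -i is the total instance count and -n the zero-based instance number, per the usage line above):

  ./poller-wrapper.py -i 5 -n 0   # on VM 1
  ./poller-wrapper.py -i 5 -n 1   # on VM 2
  ./poller-wrapper.py -i 5 -n 4   # on VM 5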

The main problem so far is writing RRD data during polling. This can already be done by using NFS and mounting the same RRD storage from all the poller VMs, but it will not scale that well, as NFS does not perform well under heavy I/O with small writes.
This is where we would want rrdcached. We can then have a server with a lot of RAM and an SSD as dedicated RRD storage, running rrdcached.
All the poller VMs then send their RRD updates in a small TCP stream to the RRD storage box, which caches the writes in RAM and flushes them to SSD in big sequential writes.
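For reference, such a central box could be started with something like this (the options are real rrdcached ones, but the timings and paths are just assumptions to tune):

  rrdcached -l 0.0.0.0:42217 -p /var/run/rrdcached.pid \
            -b /opt/observium/rrd -B \
            -j /var/lib/rrdcached/journal \
            -w 1800 -z 900 -f 3600

-w/-z batch the updates and spread the flushes out, -j journals the cache so a crash doesn't lose queued writes, and -B restricts updates to the base directory. Each poller VM then talks to that host:port instead of touching the RRD files directly.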

/Markus

2016-11-17 12:40 GMT+01:00 Johann Mallet <johann.mallet@zayo.com>:
Interesting. How do you manage the distributed pollers?
Did you modify the poller script?

LibreNMS has a "group" feature for distributed polling, but it seems that doesn't exist in Observium :-/

2016-11-16 19:50 GMT+01:00 Markus Klock <markus@best-practice.se>:
I'm actually working on building a distributed Observium with separate web frontend, database, RRD storage and pollers, to be able to scale polling very far.
To be fully efficient, some improvements have to be made to the Observium rrdcached code (so that all RRD operations can go via rrdcached), but this is in the works.
With these improvements I think it's possible to scale Observium very, very far when it comes to polling devices.

/Markus

2016-11-16 18:56 GMT+01:00 Johann Mallet <johann.mallet@zayo.com>:
Another question: has anyone already deployed Observium in "distributed" mode?
Because if the tool works with 150k interfaces, we will scale up by integrating all our network equipment.
That represents between 300k and 500k interfaces (physical + virtual + subinterfaces etc.) :-)

I don't think a single server can do the job here ;-)

Johann

2016-11-16 18:40 GMT+01:00 Johann Mallet <johann.mallet@zayo.com>:
We have no problems with RRDcached on 3 deployments :-)

Good question: > 300 devices.
So yes, we can poll many devices at the same time.

> If the switch is slow you will have a problem.
And the latency between the poller server and the device matters too :-)

Based on your feedback:
I think that with really good hardware (big CPUs with many cores, RAID SSD and a fast network interface), we can monitor 150k+ interfaces with Observium.
Many cores for many pollers running at the same time, and good SSDs for the I/O of that many RRD files.

Thank you all :-)

Johann

2016-11-12 13:33 GMT+01:00 Markus Klock <markus@best-practice.se>:

If you have a decent SSD for RRD storage you will probably not need rrdcached. I monitor close to 100k interfaces without rrdcached. No iowait on the SSD at all :)
But there are a lot of ifs regarding monitoring 150k interfaces. One parameter is how many devices the 150k interfaces are divided across. As a single device can only be polled by one poller, Nexus switches for example can be horrible, as a single switch can have 1000+ ports. If the switch is slow you will have a problem.
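To put rough numbers on that (just an illustration): 150k interfaces spread over 300 devices averages 500 ports per device, and with 5-minute polling every device's walk has to finish well inside 300 seconds. Adding pollers doesn't help with one slow 1000+ port chassis, because all of its ports are still walked by a single poller.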


On 12 Nov 2016 at 12:54, "Youssef BENGELLOUN - ZAHR" <ybzahr@prodware.fr> wrote:

That's very helpful ;-)

Why would you say that?

Y.



On 12 Nov 2016, at 12:27, Adam Armstrong <adama@memetic.org> wrote:

When you decide you hate your life and want to add extra pain.

Adam.




Youssef BENGELLOUN - ZAHR - Consultant Expert
Prodware France
T : +33 979 999 000 - F : +33 988 814 001 - ybzahr@prodware.fr

Web : prodware.fr

On 12 Nov 2016, at 11:02, Youssef BENGELLOUN - ZAHR <ybzahr@prodware.fr> wrote:
Hello,

I don't claim that we have the same number of interfaces to poll, but we are keen on performance and found entries about the rrdcached capability.

In your opinion, what is the tipping point to reach before considering the rrdcached capability?

Best regards.



On 12/11/2016 11:56, "observium on behalf of Steffen Klemer" <observium-bounces@observium.org on behalf of steffen.klemer@gwdg.de> wrote:

And don't forget about rrdcached. I only recently became aware of it and it's fabulous (if you are alright with losing some amount of data in case of a server failure).

On Thu, 10.11.2016 at 17:17, "Adam Armstrong" <adama@memetic.org> wrote:

SSD or ramdisk storage, many cores for the poller. Exact numbers may vary ;)

adam.

On 10/11/2016 17:06:41, Youssef BENGELLOUN - ZAHR <ybzahr@prodware.fr> wrote:

Dear Johann,

You should take a look at the recommended system sizing in the online documentation.

Best regards.



From: observium <observium-bounces@observium.org> on behalf of Johann Mallet <johann.mallet@zayo.com>
Reply-To: Observium Network Observation System <observium@observium.org>
Date: Thursday, 10 November 2016 17:52
To: Observium Network Observation System <observium@observium.org>
Subject: [Observium] Monitor 147k interfaces with Observium?


Hello,


We must monitor 147,000 (logical) interfaces with Observium.

Has anyone already built an architecture with this many network ports?

Can Observium monitor 150k network ports without failing?

What are the hardware recommendations? :-)


Thanks!


Johann


Youssef BENGELLOUN - ZAHR - Consultant Expert
Prodware France
T : +33 979 999 000 - F : +33 988 814 001 - ybzahr@prodware.fr
Web : prodware.fr




Regards,
/Steffen

--
Steffen Klemer
E-Mail: Steffen.Klemer@gwdg.de
Tel: +49 551 201 2170











_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium