This would probably not be rule based, as the margin for error is
ridiculously huge.
Fixing a device to a remote poller would have to be a very deliberate
thing.
The most likely method would be to fix a device to the instance from
which it was added.
Right, we can choose the nearest poller or load-balance across the pollers, because our routers do not all have the same number of interfaces and BGP sessions. If one poller gets only the big routers and the others only the small ones, too bad :-)
I'm still not 100% convinced that this is useful for the majority of
instances though, as remote SNMP is far less painful than remote MySQL and remote rrdcached. It may be useful for behind-firewall installs, though.
Exactly, but a single server is not infinitely scalable, and buying 3-5 "good servers" is cheaper than buying one huge supercomputer :-) We are looking for a great tool like Observium that also scales. I don't want to switch from Observium to another tool, because I like your tool!
The core of the problem is storage. Latency to a remote rrdcached should be negligible if the servers are in the same rack and on the same switch, shouldn't it? Someone else suggested (off list) testing Ceph as a replacement for NFS, but I have never tried Ceph. IMHO, if we use NFS/Ceph storage, we can't run rrdcached on each poller; we need a centralised rrdcached.
Johann
2016-11-18 5:09 GMT+01:00 Adam Armstrong adama@memetic.org:
The only parts of this which make sense are remote rrdcached interaction and fixing devices to poller-wrapper instances.
This would probably not be rule based, as the margin for error is ridiculously huge. Fixing a device to a remote poller would have to be a very deliberate thing. The most likely method would be to fix a device to the instance from which it was added.
I'm still not 100% convinced that this is useful for the majority of instances though, as remote SNMP is far less painful than remote MySQL and remote rrdcached. It may be useful for behind-firewall installs, though.
Adam.
On 18 Nov 2016, at 03:58, Jesper Frank Nemholt jfn@dassic.com wrote:
A few things on my wish list related to this :
Add another option to poller-wrapper.py so that it polls only selected devices, based either on location (maybe add a custom field called data center or computer room, so as not to mess up the geo-location feature), on device type (server, network, storage etc.) or on FQDN subdomain (maybe regex based).
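To make the wish above concrete, here is a minimal Python sketch of that kind of device selection. Nothing in it is existing Observium code: the `Device` fields, the `select_devices` helper and the example hostnames are all hypothetical and only illustrate filtering on location, device type or an FQDN regex.

```python
import re
from dataclasses import dataclass


@dataclass
class Device:
    # Hypothetical record; a real install would read these from the database.
    hostname: str
    location: str
    dev_type: str


def select_devices(devices, location=None, dev_type=None, fqdn_regex=None):
    """Return the devices that match every filter that is set (None = any)."""
    pattern = re.compile(fqdn_regex) if fqdn_regex else None
    selected = []
    for dev in devices:
        if location is not None and dev.location != location:
            continue
        if dev_type is not None and dev.dev_type != dev_type:
            continue
        if pattern is not None and not pattern.search(dev.hostname):
            continue
        selected.append(dev)
    return selected


# Made-up inventory, for illustration only.
DEVICES = [
    Device("core1.dc1.example.net", "dc1", "network"),
    Device("san1.dc1.example.net", "dc1", "storage"),
    Device("core1.dc2.example.net", "dc2", "network"),
]

if __name__ == "__main__":
    # A poller sitting in dc1 would only pick up the dc1 subdomain.
    for dev in select_devices(DEVICES, fqdn_regex=r"\.dc1\.example\.net$"):
        print(dev.hostname)
```

The filters are combined with AND, so a poller could be pinned to, say, only the network devices of one data center.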
In addition to this, put the RRD files into a subdirectory hierarchy based upon the same options (here I'd see location as the most important). Doing this would also make it easier to split the RRD files across different storage volumes. This could be done as simply as an optional append-path option on each device: if filled out, it is added to the RRD path, and if not, the normal /opt/observium/rrd applies.
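The append-path idea could look roughly like the Python sketch below. It is only an illustration: `rrd_dir` and its `append_path` parameter are hypothetical names, and the assumption that each device gets its own directory named after its hostname under the base path is mine, not something stated in this thread.

```python
import posixpath

RRD_BASE = "/opt/observium/rrd"  # default base path mentioned above


def rrd_dir(hostname, append_path=""):
    """Build the RRD directory for a device (hypothetical helper).

    An empty append_path keeps the normal layout; otherwise its components
    are inserted between the base path and the per-device directory.
    """
    parts = [p for p in append_path.strip("/").split("/") if p]
    if any(p in (".", "..") for p in parts):
        raise ValueError("append_path must not contain '.' or '..'")
    return posixpath.join(RRD_BASE, *parts, hostname)


if __name__ == "__main__":
    print(rrd_dir("sw1.example.net"))
    print(rrd_dir("sw1.example.net", "dc1"))
```

With a layout like this, /opt/observium/rrd/dc1 could be its own mount point on a separate storage volume.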
With this, one could have a poller (or more) per data center and let it store locally initially and then use rsync/unison back to a master system, or alternatively as you mention use rrdcached as a central catch all with decentralized pollers.
/Jesper
On Thu, Nov 17, 2016 at 4:11 AM Markus Klock markus@best-practice.se wrote:
No need to modify the script, it's already supported by poller-wrapper.py
usage: poller-wrapper.py [-h] [-s] [-i [INSTANCES]] [-n [NUMBER]]
You use the -i and -n arguments of poller-wrapper.py. If you have 5 poller VMs, you set -i 5 and -n 0 on the first VM, -i 5 and -n 1 on the second VM, and so on up to -n 4. VM number 1 will then poll one fifth of all your devices.
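To illustrate the idea of splitting devices between wrapper instances, here is a small Python sketch. It assumes that -i is the total number of instances and -n a zero-based instance number, and it uses a simple modulo split over the sorted device IDs; that split is an assumption for illustration and not necessarily the logic poller-wrapper.py uses internally. `devices_for_instance` is a hypothetical helper.

```python
def devices_for_instance(device_ids, instances, number):
    """Return the share of device IDs handled by one wrapper instance.

    instances: total number of wrapper instances (assumed meaning of -i)
    number:    zero-based index of this instance (assumed meaning of -n)
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    if not 0 <= number < instances:
        raise ValueError("number must be between 0 and instances-1")
    ordered = sorted(device_ids)
    # Every instances-th device, starting at this instance's offset.
    return [dev for pos, dev in enumerate(ordered) if pos % instances == number]


if __name__ == "__main__":
    all_ids = list(range(1, 11))  # ten made-up device IDs
    for n in range(5):
        print(n, devices_for_instance(all_ids, 5, n))
```

Each device lands in exactly one share, so no two VMs poll the same device as long as all of them use the same instance count.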
The main remaining problem is writing RRD data during polling. This can already be done by mounting the same RRD storage over NFS on all the poller VMs, but it will not scale well, as NFS does not perform well for heavy I/O with small writes. This is where we would want rrdcached. We can then have a dedicated RRD-storage server with a lot of RAM and an SSD, running rrdcached. All the poller VMs will then send their RRD writes as a small TCP stream to the RRD-storage box, which will cache the writes in RAM and flush them to the SSD in big sequential writes.
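For readers who have not looked at rrdcached on the wire: it speaks a line-based text protocol over a Unix or TCP socket, and a write is a single UPDATE line. The Python sketch below shows roughly what a poller-side write to a central rrdcached could look like. It is not Observium code; the helper names, the host name and the RRD file name are made up, and the port is only what I believe to be rrdcached's default.

```python
import socket


def build_update(rrd_file, timestamp, values):
    """Format one rrdcached UPDATE command line."""
    fields = ":".join(str(v) for v in values)
    return f"UPDATE {rrd_file} {timestamp}:{fields}\n"


def parse_status(line):
    """Split a status line such as '0 errors, enqueued 1 value(s).'

    A negative code means the daemon reported an error.
    """
    code, _, message = line.strip().partition(" ")
    return int(code), message


def send_update(host, port, rrd_file, timestamp, values, timeout=5.0):
    """Send one UPDATE to a remote rrdcached and return its status message."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_update(rrd_file, timestamp, values).encode("ascii"))
        with sock.makefile("r", encoding="ascii") as reply:
            code, message = parse_status(reply.readline())
    if code < 0:
        raise RuntimeError(f"rrdcached error: {message}")
    return message


if __name__ == "__main__":
    # Hypothetical storage box and RRD file; only the command is printed here.
    print(build_update("sw1.example.net/port-1.rrd", 1479300000, [1234, 5678]), end="")
    # send_update("rrd-storage.example.net", 42217,
    #             "sw1.example.net/port-1.rrd", 1479300000, [1234, 5678])
```

The daemon only queues the values and writes them to disk later, which is where the big sequential writes come from.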
/Markus
2016-11-17 12:40 GMT+01:00 Johann Mallet johann.mallet@zayo.com:
Interesting. How do you manage the distributed pollers? Did you modify the poller script?
LibreNMS has a "group" feature for distributed polling, but it doesn't seem to exist in Observium :-/
2016-11-16 19:50 GMT+01:00 Markus Klock markus@best-practice.se:
I'm actually working on building a distributed Observium with separate web front end, database, RRD storage and pollers, to be able to scale polling very far. To be fully efficient, some improvements have to be made to the Observium rrdcached code (to be able to do all RRD operations via rrdcached), but this is in the works. With these improvements I think it's possible to scale Observium very, very far when it comes to polling devices.
/Markus
2016-11-16 18:56 GMT+01:00 Johann Mallet johann.mallet@zayo.com:
Another question: has anyone already deployed Observium in "distributed" mode? If the tool works with 150k interfaces, we will scale up by integrating all our network equipment. That represents between 300k and 500k interfaces (physical + virtual + subinterfaces, etc.) :-)
I don't think a single server can do the job here ;-)
Johann
2016-11-16 18:40 GMT+01:00 Johann Mallet johann.mallet@zayo.com:
We have had no problems with rrdcached on 3 deployments :-)
Good question: more than 300 devices. So yes, we can poll many devices at the same time.
If the switch is slow you will have a problem.
And so will the latency between the poller server and the device :-)
Based on your feedback, I think that with really good hardware (a fast CPU with many cores, RAID SSD and a good network interface) we can monitor 150k+ interfaces with Observium: many cores to run many pollers at the same time, and fast SSDs for the I/O on that many RRD files.
Thank you all :-)
Johann
2016-11-12 13:33 GMT+01:00 Markus Klock markus@best-practice.se:
If you have a decent SSD for RRD storage you will probably not need rrdcached. I monitor close to 100k interfaces without rrdcached. No IOwait on the SSD at all :) But there are a lot of ifs regarding monitoring 150k interfaces. One parameter is how many devices the 150k interfaces are divided across. As a single device can only be polled by one poller, Nexus switches, for example, can be horrible, as a single switch can have 1000+ ports. If the switch is slow you will have a problem.
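The point about one slow device can be put into numbers. Since a device is only ever polled by one poller process, a full polling run can never finish faster than its slowest device, however many cores you add. The Python sketch below computes that lower bound; the per-device timings and the worker count are invented for illustration.

```python
def cycle_lower_bound(device_seconds, workers):
    """Lower bound, in seconds, for one full polling run.

    A device is handled by a single poller process, so the run cannot be
    shorter than the slowest device, nor than the total work divided by
    the number of parallel workers.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not device_seconds:
        return 0.0
    return max(max(device_seconds), sum(device_seconds) / workers)


if __name__ == "__main__":
    # Made-up numbers: 99 devices at 10 s each plus one slow 1000-port switch.
    timings = [10.0] * 99 + [400.0]
    print(cycle_lower_bound(timings, 16))
```

With these made-up numbers the 400-second switch sets the floor on its own, and adding more cores does not help; with a 5-minute polling interval that device would already overrun.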
On 12 Nov 2016, at 12:54 PM, "Youssef BENGELLOUN - ZAHR" <ybzahr@prodware.fr> wrote:
That's very helpful ;-)
Why would you say that ?
Y.
On 12 Nov 2016, at 12:27, Adam Armstrong adama@memetic.org wrote:
When you decide you hate your life and want to add extra pain.
Adam.
*Youssef BENGELLOUN - ZAHR* - Consultant Expert Prodware France T : +33 979 999 000 - F : +33 988 814 001 - ybzahr@prodware.fr
Web : prodware.fr http://www.prodware.fr
http://twitter.com/Prodware/ http://www.facebook.com/Prodware/ https://www.linkedin.com/company/prodwarefrance https://www.youtube.com/c/ProdwareFrance http://www.viadeo.com/fr/company/prodware http://www.prodware.fr/social-network/
On 12 Nov 2016, at 11:02, Youssef BENGELLOUN - ZAHR ybzahr@prodware.fr wrote:
Hello,
I don't claim that we have the same number of interfaces to poll, but we are keen on performance and found entries about the rrdcached capability.
In your opinion, what is the tipping point at which one should consider using rrdcached?
Best regards.
On 12/11/2016 11:56, "observium on behalf of Steffen Klemer" <observium-bounces@observium.org on behalf of steffen.klemer@gwdg.de> wrote:
And don't forget about rrdcached. I only recently became aware of it and it's fabulous (if you are all right with losing some amount of data in case of server failure).
On Thu, 10.11.2016 at 17:17, "Adam Armstrong" adama@memetic.org wrote:
SSD or ramdisk storage, many cores for poller. Exact numbers may vary ;)
adam.
On 10/11/2016 17:06:41, Youssef BENGELLOUN - ZAHR ybzahr@prodware.fr wrote:
Dear Johann,
You should take a look at the online documentation with recommended system sizing.
Best regards.
From: observium <observium-bounces@observium.org> on behalf of Johann Mallet <johann.mallet@zayo.com>
Reply-To: Observium Network Observation System <observium@observium.org>
Date: Thursday 10 November 2016 17:52
To: Observium Network Observation System <observium@observium.org>
Subject: [Observium] Monitor 147k interfaces with Observium?
Hello,
We must monitor 147,000 (logical) interfaces with Observium.
Has anyone already built an architecture with this many network ports?
Can Observium monitor 150k network ports without failing?
What are the hardware recommendations? :-)
Thanks !
Johann
Kind regards, /Steffen
-- Steffen Klemer E-Mail: Steffen.Klemer@gwdg.de Tel: +49 551 201 2170
GWDG - Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen Am Faßberg 11, 37077 Göttingen
Service-Hotline: Tel: +49 551 201-1523 E-Mail: support@gwdg.de
Contact: Tel: 0551 201-1510 Fax: 0551 201-2150 E-Mail: gwdg@gwdg.de WWW: https://www.gwdg.de
Managing Director: Prof. Dr. Ramin Yahyapour Chairman of the Supervisory Board: Prof. Dr. Christian Griesinger Registered office: Göttingen Court of registration: Göttingen, Commercial Register No. B 598
Certified to ISO 9001
observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium