Performance

Morten Guldager

11 Nov 2014

1:55 p.m.

'Aloha!

I'm in the process of evaluating observium for my organisation's needs. We have some instances running already, but my task is to do the evaluation in a more structured way. I have some questions which I will keep in different posts to keep the threads clean.

Performance:

We are looking at a network currently consisting of 4000 devices with close to 100'000 ports. These devices are well known to Observium, and a subset of them got auto-discovered just fine. But how about performance? Will it require vast amounts of computing power? Also, our network grows quite rapidly and might be 50% bigger 12 months from now.
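For a rough sense of the numbers in the paragraph above, here is a minimal Python sketch of the 12-month projection. It assumes the 50% growth applies equally to devices and ports, which the post does not state; `project` is a hypothetical helper name.

```python
# Figures taken from the post; applying the same growth rate to
# devices and ports is an assumption made for illustration.
devices_now = 4000
ports_now = 100_000
growth = 0.50  # "might be 50% bigger 12 months ahead"


def project(count: int, rate: float) -> int:
    """Return the count after one period of growth at the given rate."""
    return round(count * (1 + rate))


devices_next = project(devices_now, growth)
ports_next = project(ports_now, growth)
print(devices_next, ports_next)  # 6000 150000
```

Any sizing estimate for this network should probably be run against the projected figures rather than today's.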

I found an old thread from July 2013 where Joe Hoh describes a complex multi-server setup to scale Observium to something approx. 2-3 times my current needs. Will I have to go through similar "struggles" to get it working? Or has Observium changed enough to make it scale differently than it did 1.5 years ago?

Pointers to information regarding scaling observium are most welcome.

-- /Morten %-)


Spencer Gaw

11 Nov

4:22 p.m.

I'm not sure how current this information is but it may answer some of your questions: http://www.observium.org/wiki/Hardware_Scaling

Regards,


observium mailing list observium@observium.org http://postman.memetic.org/cgi-bin/mailman/listinfo/observium

Morten Guldager

9:11 p.m.

Yeah, I read that page too. But I'm uncertain how linearly Observium scales. 10 cores will be doable, but how much RAM will it take then? I guess I will have to use rrdcached to keep the disk I/O at a manageable level. The server guys will probably complain if I suck every available I/O operation out of their SAN.

/Morten


Lane Eckley

9:32 p.m.

You are going to have a rough time working out the requirements for Observium, as a lot of factors will influence them. A lot of it will come down to the devices you are monitoring, how fast they respond to SNMP (some devices are significantly faster than others), the volume of syslog messages, and whether you are running other plugins (smokeping, etc.).

(The speed of the SNMP response determines how long the poller is stuck on a single device, and hence how much CPU it uses.)

In this day and age, if you are fretting over I/O from your SAN, you are doing it all wrong. With SSD prices having dropped as much as they have, running a SAN on older tech such as SAS and SATA is just asking for a disaster and terrible density.

That said, grab yourself a nice dual E5-2630 V3, 64-128GB of RAM and a nice set of SSDs [i.e. 4-8 256GB Samsung 850 Pros (or larger; avoid the 128GB model as you will take a performance hit)], set up Ubuntu with software RAID (you get to use TRIM that way) and you will be off and running like a bat out of hell for less than 5K, on a system that will handle a huge volume of servers/devices/ports.

Once you begin to saturate the above machine, you bring on a second one of similar specs (maybe more RAM, fewer drives) and move MySQL to another host to offload all that workload.

Once you outgrow the above, you will likely need to upgrade CPU on the Observium machine in which case you either buy another server or simply drop in something like an E5-2650 V3 into the existing machine(s).

The initial Dual E5-2630 V3 will take you a long way though, very long way.

Good luck!

-Lane


Adam Armstrong

10:23 p.m.

Don't use a SAN. Observium is the perfect storm of worst-case use for SANs. It does lots of tiny writes all over the disk, and it will eat up the performance of your SAN far quicker than its sticker price might indicate. You're far better off with a few SSDs or even a RAM disk, if you can fit it in.
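To put a rough number on "lots of tiny writes", here is a minimal Python sketch. It assumes every RRD file is written once per polling cycle, that the cycle is 300 seconds, and that there is exactly one RRD file per port. All three are illustrative assumptions, not figures from Observium; a real installation keeps more files per device.

```python
def rrd_updates_per_second(rrd_files: int, poll_interval_s: int = 300) -> float:
    """Average RRD file updates per second if each file is written once per cycle."""
    return rrd_files / poll_interval_s


# Hypothetical lower bound: one RRD file per port and nothing else.
ports = 100_000
print(round(rrd_updates_per_second(ports), 1))  # 333.3
```

Even this lower bound means a few hundred small, scattered writes per second, sustained around the clock.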

The ports page doesn't use as much RAM as it once did, so that requirement isn't there anymore. Mostly what you need to do is keep up I/O throughput and CPU throughput to handle enough parallel threads to poll all of your devices quickly enough.

I would aim to run without rrdcached, and only look at using it if you need to. It adds additional CPU and latency to the equation, which is not usually desired.

One of the major problems of modern servers, IMO, is that the single-core clock speeds are relatively slow. For web UI performance, you want the fastest single-core speed you can get. For poller performance, you want as many cores as you can efficiently spread your poller load over. 4,000 devices might require more than 12 cores, especially if they're only 2GHz cores.
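The poller sizing can be sketched as simple arithmetic. This is a back-of-the-envelope Python example, assuming a 300-second polling cycle and an average of 25 seconds per device; both numbers are illustrative, and `poller_workers_needed` is a hypothetical helper, not part of Observium.

```python
import math


def poller_workers_needed(devices: int, avg_poll_s: float, interval_s: int = 300) -> int:
    """Minimum number of parallel poller processes so every device is polled once per cycle."""
    return math.ceil(devices * avg_poll_s / interval_s)


# 4000 devices at an assumed 25 s each within an assumed 300 s cycle.
print(poller_workers_needed(4000, 25))  # 334
```

Parallel workers are not the same as cores: a poller process spends much of its time waiting on SNMP responses, so several can share one core. How many depends on the devices, which is why the estimate is hard to pin down.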

Don't try to run Observium on a VM. The VM I/O overhead is a pain, and you'll ruin the host system for any other application. You want a high-core, high-memory, high-io dedicated server.

Something like:

http://www.ebay.co.uk/itm/Refurbished-HP-ProLiant-DL585-G2-Web-Server-4-x-Qu...

Put a couple of SSDs in that and it /should/ suffice. Though, you might want faster cores, and you might want 256GB of RAM, so you can keep the RRDs in RAM.

It's difficult to gauge performance requirements on that scale because it depends upon how the devices behave and what's monitor(able/ed) on them.

Oh, and split MySQL off onto a separate server with fewer, faster cores. It's not worth doing this with the web gui because of the latency involved in dealing with RRDs over the network, but it's definitely worth doing with MySQL.

adam.


Adriaan Smuts

11:18 p.m.

WOW, this is extremely good feedback Adam. I wish I had something like this when we started with Observium two years ago! ☺

I second the SSDs; they make a huge performance difference.

@Adam/devs, my suggestion: a while back you sent out the usage stats (http://www.observium.org/wiki/Usage_Statistics). Maybe you should start keeping track of what hardware people use to run Observium, similar to the tracking of which devices people monitor via Observium. I know this is not as simple and easy as it sounds, but it would be very useful for questions like this.

Regards, Adriaan Smuts

Systems Administrator - Windows


Nikolay Shopik

12 Nov

12:01 a.m.

If you plan to use SSDs, make sure to leave 25% free space on them, or you can get lower performance than expected.


Lane Eckley

12:46 a.m.

Overprovisioning SSDs isn't a major requirement anymore, especially if you use a software RAID solution where you can utilize TRIM (hence why I suggested Ubuntu + software RAID as opposed to, say, an LSI RAID card).

Overprovisioning used to add a good amount of life to an SSD, but with all the new garbage collection solutions built into SSDs, as well as the ability to use TRIM with RAID, it's much less of an issue.

We have large hypervisors for VM hosting using SSDs without any overprovisioning, and we see a good 2+ years out of the SSD array before degradation really begins. Those are primarily Intel 520/530s and Samsung 830/840s too.

On the SSD front, there is no really good reason to drop $1500/2500 on an Intel S3500/3700 or similar unless you have a serious concern about the SSDs lasting 4+ years. We have very heavy MySQL servers pounding standard Intel 520/530s and still get a good 12-18 months out of them before we see a real slowdown.

Save the money, get cheaper drives and just plan to replace them (not that you will actually need to, but plan for it) if you have a super busy server.

-Lane

Nikolay Shopik

8:26 a.m.

I'm not talking about extending lifetime, but about lower write performance when you have less than 10% of the space available.
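A free-space check along these lines is easy to script. Here is a minimal Python sketch using the standard library's `shutil.disk_usage`; the 25% default threshold is the figure suggested earlier in this thread, and the helper names are hypothetical. Note that filesystem free space is only a proxy for what the SSD controller sees, and only where TRIM is working.

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def below_threshold(path: str, min_free: float = 0.25) -> bool:
    """True if the filesystem holding `path` has less free space than `min_free`."""
    return free_fraction(path) < min_free


# "." stands in for wherever the RRD files live on the SSD volume.
print(f"free: {free_fraction('.'):.0%}, below 25%: {below_threshold('.')}")
```

Running this from cron against the RRD volume, with the threshold set to 0.10 or 0.25 depending on which advice you follow, would give early warning before write performance drops.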


Josh Hopper

11 Nov

11:45 p.m.

I have mine on an EMC. Does just fine.

-- Sincerely, Joshua Hopper, A+ CE Network Administrator 420 3rd Ave NW Hickory NC 28601 Office: 828-449-1839x2160 | Cell: 828-855-7565


Protect Plus Air Holdings, LLC and Affiliates, Protect Plus, LLC and Affiliates, Imagine One Resources, LLC and Affiliates, Protect Plus Surfaces, LLC and Affiliates Confidentiality Notice: This e-mail is intended only for the addressee named above. It contains information that is privileged, confidential or otherwise protected from use and disclosure. If you are not the intended recipient, you are hereby notified that any review, disclosure, copying, or dissemination of this transmission, or taking of any action in reliance on its contents, or other use is strictly prohibited. If you have received this transmission in error, please reply to the sender listed above immediately and permanently delete this message from your inbox. Thank you for your cooperation.

Colin Stubbs

12 Nov

1:13 a.m.

My 2 cents... SAN and virtualisation should still be considered, it just depends on your budget and your requirements, along with experience/skill/knowledge and the ability to obtain what you don't have quickly.

The other 98 cents,

If you're running Observium and it's not revenue supporting in a tangible way or you're simply doing a proof of concept with full load, sure; a dedicated second hand box from eBay is probably the way to go.

Morten sounds like he's looking at running Observium in service provider land - 4000+ devices, 100,000+ ports, with a high potential for growth - it's likely going to be revenue supporting in a tangible way, either because support staff are going to be relying on it to troubleshoot things, or because customers will be logging into it directly.

Single box == single point of failure. Either due to motherboard/backplane/chassis, or due to physical location. Either way if it becomes critical to you in some way you really need to consider spending more time/money ensuring that you can keep it running as much as possible, and can recover from any failures quickly and with minimal impact.

Virtualisation with SAN based storage is an option that should be considered.

re "Don't use a SAN" - performance won't be an issue on a modern SAN with pure SSD or tiered storage options, a good storage plan/layout (if applicable), and multiple high speed 8/10Gbps+ interfaces. However, yes, it will be a waste of money if you're just going to run a dedicated Linux box anyway. It's easy enough these days to provide dedicated spindles and SSD then tier them yourself using the Linux based options available, one of which is now natively within LVM.

re "VM I/O overhead is a pain" - Correctly sizing resource allocations and segmenting VM operation to appropriate hosts can be a pain when you first deal with the concept. But VM storage I/O overhead is negligible to the point of not existing on all modern hypervisors.

The reason I/O appears to be painful is generally contention for CPU resources, exacerbated by high vCPU allocations to one or more VMs on the host, which impacts everything in the VM including requests for I/O operations.

VMware has a whole swath of useful doco regarding this. Hit Google. Take a look at what the recommendations are for workloads like SQL Server, Oracle and Exchange.

If you need access to lots of physical CPU, yes, you will need to allocate more than just 1 or 2 vCPUs. In that case, due to how scheduling and locking of pCPUs works, the best practice is actually to run VMs with a high vCPU allocation on hosts without any contention, e.g. run 4 x 16 vCPU VMs on a 64 pCPU system with no other VMs.
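The "no contention" condition in the example above can be expressed as a ratio of allocated vCPUs to physical CPUs. This is a minimal Python sketch with a hypothetical helper name; it ignores hyperthreading and the hypervisor's own overhead.

```python
def vcpu_to_pcpu_ratio(vm_vcpus: list[int], host_pcpus: int) -> float:
    """Ratio of vCPUs allocated across all VMs to physical CPUs on the host."""
    return sum(vm_vcpus) / host_pcpus


# The example from the post: 4 VMs with 16 vCPUs each on a 64 pCPU host.
ratio = vcpu_to_pcpu_ratio([16, 16, 16, 16], 64)
print(ratio)  # 1.0
```

A ratio at or below 1.0 means every vCPU can be backed by its own physical CPU; adding a fifth 16 vCPU VM to the same host would push it to 1.25 and reintroduce scheduling contention.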

That ups your cost to run each VM of course, basically to "cost of cheap physical box + virtualization software cost/licenses/support fees".... the total is likely to be more than what you could get away with by buying a bunch of pizza boxes, to start with, but you do get all the processing capability they need, along with all the benefits of an abstraction layer between the software & O/S, and the hardware it's running on, e.g. in VMware land...

1. GUI tools

2. Snapshots and rollback, which can be integrated as part of a change control process

3. Flexibility to move/migrate the system between storage, possibly while running but dependent upon shared storage and licensing options.

4. Flexibility to move/migrate the system between hosts, possibly while running but dependent upon shared storage and licensing options.

5. Flexibility to replicate the entire VM to another location, e.g. your own Disaster Recovery site, or cloud based storage where the VM can then be started up on someone else's cloud computing platform only when needed. Dependent upon VMware licensing options, possibly SAN features/licensing if you have to do block level replication there.

6. A whole swath of VM-based backup software that can integrate with snapshots along with doing dedupe and compression. Can be part of 3. Other products required.

7. CLI tools and APIs to automate a whole lot of 2, 3, 4, 5 and 6.

8. Optional extra GUI tools so you can skip straight past doing the things in 7 yourself. Dependent upon licensing options and products purchased.

It might be good to start a discussion involving people who are actually running Observium within VM's? Some documented pointers on the website with links to reference documentation/architectures for large virtualised environments would not hurt.

-Colin


Alex Povolotsky

11 Nov

9:58 p.m.

I'm working on a poller for Observium that is about 20 times better; the current one is awful. MySQL should not be a bottleneck, I guess.


Spencer Gaw

10:12 p.m.

Oh great, this guy again...

Regards,


Josh Hopper

10:15 p.m.

Lol

Tip: Put Observium on a VM and when it gets slow... 1) Power Off VM 2) Open settings 3) Add more resources 4) Save 5) Turn on VM 6) Profit.

-- Sincerely, Joshua Hopper, A+ CE Network Administrator

420 3rd Ave NW Hickory NC 28601 Office: 828-449-1839x2160 | Cell: 828-855-7565


Raphael Mazelier

12 Nov

11:09 a.m.

Observium is a great tool with a nice GUI, and it does plenty of things... But the poller is "naive".

For example, I have a couple of stacked EX3300s that I cannot poll with Observium (granted, SNMP is slow on this equipment).

Think about that: in Observium, polling this equipment takes more than 5 minutes (so it is unusable); in Cacti with Spine, it takes 20 seconds.

Doing an snmpbulkwalk on all OIDs is not always the best solution....
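
As a rough illustration of why walking every OID can be costly on a slow agent, the sketch below counts GETBULK round trips for a walk. It is a simplified model, not a measurement: the function names, the OID counts, and the per-request latency are all assumptions chosen for illustration.

```python
def bulkwalk_requests(num_oids: int, max_repetitions: int) -> int:
    """Approximate number of GETBULK requests needed to walk a subtree.

    Simplified model: every response carries max_repetitions values, and the
    walk stops at the first response that reaches past the end of the subtree.
    """
    if num_oids < 0 or max_repetitions < 1:
        raise ValueError("num_oids must be >= 0 and max_repetitions >= 1")
    return num_oids // max_repetitions + 1


def estimated_walk_seconds(num_oids: int, max_repetitions: int,
                           seconds_per_request: float) -> float:
    """Estimated wall-clock time when requests are issued one after another."""
    return bulkwalk_requests(num_oids, max_repetitions) * seconds_per_request


if __name__ == "__main__":
    # Hypothetical numbers: a slow agent that answers one request in 150 ms.
    full = estimated_walk_seconds(20_000, 10, 0.15)    # walk everything
    subset = estimated_walk_seconds(1_500, 10, 0.15)   # only the needed tables
    print(f"full walk: {full:.0f} s, targeted walk: {subset:.0f} s")
```

Under this model the poll time grows linearly with the number of OIDs requested, so on an agent with slow responses, the amount of data asked for matters at least as much as the speed of the poller itself.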

On 11/11/14 22:12, Spencer Gaw wrote:

...

Oh great, this guy again...

Regards,

SG

On 11/11/2014 1:58 PM, Alex Povolotsky wrote:

...
I'm working on a poller for Observium that is about 20 times better; the current one is awful. MySQL should not be a bottleneck, I guess.

Simon Schmitz

11:40 a.m.

We have Juniper EX3300 and EX4200 switches, and the polling time is 25 seconds.

http://www.observium.org/wiki/Hardware_Scaling
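
For sizing purposes, the figures in this thread can be turned into a back-of-the-envelope estimate of how many devices must be polled in parallel. The sketch assumes a 5-minute polling interval and treats 25 seconds as the average per-device poll time; both are assumptions, and a real network will have a mix of faster and slower devices. The function name is hypothetical.

```python
def parallel_polls_needed(devices: int, seconds_per_device: int,
                          interval_seconds: int) -> int:
    """Minimum number of concurrent polls so every device is polled once per interval."""
    if devices < 0 or seconds_per_device < 0 or interval_seconds < 1:
        raise ValueError("counts must be non-negative and the interval positive")
    total_work = devices * seconds_per_device
    return -(-total_work // interval_seconds)  # ceiling division on integers


if __name__ == "__main__":
    interval = 300  # assumed 5-minute polling interval
    for devices in (4000, 6000):  # today, and after 50% growth
        needed = parallel_polls_needed(devices, 25, interval)
        print(f"{devices} devices -> {needed} concurrent polls")
```

This says nothing about CPU, disk, or database load; it only bounds the concurrency needed if each poll really takes that long.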


Pekka.Panula＠sofor.fi

11:48 a.m.

I have an EXP2200 series cluster (two switches) with a polling time of 160 seconds, and an EXP4550 cluster (four switches) with a polling time of nearly 500 seconds.... I have even disabled polling of that four-switch cluster because Observium is so slow at polling it.

Simon Schmitz

11:50 a.m.

Cluster? You mean virtual chassis?


Pekka.Panula＠sofor.fi

12:46 p.m.

Yes, virtual chassis feature.

Adam Armstrong

13 Nov 13 Nov

12:01 a.m.

Virtual chassis "features" are universal fail. A blight on the industry.

A pox upon all their houses!

Adam.


krause＠rus.uni-stuttgart.de

12 Nov 12 Nov

12:55 p.m.

Hi Raphael,

...

But the poller is "naive".

...

Sure is. I've been wondering whether (especially in a VM environment) it would be possible to use multiple VMs writing to a shared storage backend (with a shared web server reading from there). That would at least allow better scalability in a virtual environment with respect to hypervisor scheduling.

Anyone working on this (i.e. making the poller also understand "instances" of a pool like the discovery.php)?

Best, Kilian
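
The idea of several poller instances sharing one pool of devices can be sketched as a modulo split over device IDs. This is only an illustration of the partitioning idea: the function name and the numeric device IDs are assumptions, and nothing here is taken from Observium's actual code.

```python
from typing import Iterable, List


def devices_for_instance(device_ids: Iterable[int], instance_count: int,
                         instance_number: int) -> List[int]:
    """Return the device IDs that one poller instance is responsible for.

    Each device belongs to exactly one instance, chosen by
    device_id modulo instance_count (hypothetical scheme).
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    if not 0 <= instance_number < instance_count:
        raise ValueError("instance_number must be in [0, instance_count)")
    return [d for d in device_ids if d % instance_count == instance_number]


if __name__ == "__main__":
    all_devices = range(1, 11)  # hypothetical device IDs 1..10
    for n in range(3):          # three poller VMs
        print(f"instance {n}: {devices_for_instance(all_devices, 3, n)}")
```

Because the split depends only on the device ID and the instance count, each VM can compute its own share without coordinating with the others; only the storage backend has to be shared.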

Tom Laermans

1:52 p.m.

On 12/11/2014 12:55, krause@rus.uni-stuttgart.de wrote:

...

Hi Raphael,

...
But the poller is "naive".

...

sure is. I've been wondering whether (especially in a VM environment) it wouldn't be possible to use multiple VMs writing to a shared storage backend (and have a shared web server reading from there). That would at least allow a better scalability in a virtual environment w.r.t. hypervisor scheduling.

Anyone working on this (i.e. making the poller also understand "instances" of a pool like the discovery.php)?

Not anymore; the poller has supported this since before discovery did.

This is totally unrelated to the discussion though.

Tom

Adam Armstrong

13 Nov 13 Nov

6:20 p.m.

Stacking switches is pretty naive.

adam.


Tom Laermans

6:27 p.m.

I have stacked switches, and it works fine. But implementations vary.

It takes 21 seconds for a stack of four 52-port PowerConnects (!), including sensors, CPU, memory, etc. I don't see a problem here: even though it needs counters and such from the other devices, there is nothing special as far as SNMP is concerned; it just works.

Tom


Raphael Mazelier

6:34 p.m.

On 13/11/14 18:20, Adam Armstrong wrote:

...

Stacking switches is pretty naive.

adam.

Juniper Virtual Chassis works relatively well with few members. I have a dozen VCs in production with no problems. But in a VC with a lot of members, SNMP becomes slow. OK. Still, this is no excuse not to build a better poller...

-- Raphael Mazelier

Bryde, Ole

14 Nov 14 Nov

7:17 a.m.

Really? How come? Please elaborate on that.

Ole Bryde

-----Original Message----- From: observium [mailto:observium-bounces@observium.org] On Behalf Of Adam Armstrong Sent: 13 November 2014 18:21 To: Observium Network Observation System Subject: Re: [Observium] Performance

Stacking switches is pretty naive.

adam.

