Poller performance
Hi! I've been running Observium on SLES11 for almost a year. This was on an 8 vCPU + 2GB RAM VMware guest with a 30GB RRD footprint.
After adding about 320 new hosts (all of them network gear) the RRD database grew to 130GB and the little machine got peaks of 3k IOPS. The I/O wait was rather high and the machine would suddenly stop responding (hypervisor stats showed 0% CPU usage).
So, I've migrated the installation to a physical machine running Solaris 11 (I know, it's not supported) and without any tuning (mysql/zfs are candidates) I'm seeing polling of 41 devices in about an hour. Is this usual/expected?
--
./poller.php 0/24 February 13, 2013, 18:27 - 27 devices polled in 4048 secs
./poller.php 7/24 February 13, 2013, 18:27 - 27 devices polled in 4075 secs
./poller.php 0/24 February 13, 2013, 18:28 - 27 devices polled in 3803 secs
./poller.php 22/24 February 13, 2013, 18:28 - 27 devices polled in 4404 secs
./poller.php 3/24 February 13, 2013, 18:28 - 28 devices polled in 4417 secs
--
IOPS climbed to 5k, comments?
Regards,
ZFS is quite possibly the least optimal filesystem for running RRDs from. You want a simple filesystem without much overhead on operations, like EXT with journalling and atime turned off.
Also, it's possible that RRD is poorly optimised on solaris and uses slow system calls?
You say 41 devices in an hour, but your paste suggests 468 devices in 73 minutes. What is your I/O subsystem? Disk? SSD? How many? A spindled disk will not behave well with more than 4-8 concurrent processes; an SSD might behave better.
You should also look at the new poller wrapper script, which might help you squeeze a bit more out of it once you've sized it correctly.
One must ask why, if you have a dedicated machine, are you not just running it on Ubuntu, like we recommend?
adam.
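For reference, Adam's "EXT with journalling and atime turned off" advice is roughly this on a Linux box; the device and mount point below are made-up examples, not taken from the thread:
--
# /etc/fstab entry for a dedicated RRD volume (hypothetical device/path);
# noatime/nodiratime avoid an inode write on every RRD read
/dev/sdb1  /data/observium/rrd  ext4  defaults,noatime,nodiratime  0 2

# apply to an already-mounted filesystem without a reboot
mount -o remount,noatime,nodiratime /data/observium/rrd
--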
Hey, some people are into pain @_@
But seriously, ZFS is pretty cool. (as long as it doesn't eat your data)
-m
Why not RAM? Build a box w/256G or more, and keep them all in tmpfs during operation. (ditch Solaris. It was cool once, but...)
Copy them up into tmpfs on boot, then cron an rsync to disk between polling or even occasionally, depending on pain tolerance.
-C
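A sketch of the tmpfs approach being suggested here; the paths, size and schedule are invented for illustration, with the on-disk copy assumed to live at /data/rrd-persist:
--
# mount a tmpfs big enough for the RRD set plus headroom
mount -t tmpfs -o size=180g tmpfs /data/observium/rrd

# on boot, seed the tmpfs from the persistent copy
rsync -a /data/rrd-persist/ /data/observium/rrd/

# cron entry: flush tmpfs back to disk every 2 hours
# 0 */2 * * * root rsync -a /data/observium/rrd/ /data/rrd-persist/
--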
Hmm, well, I could only steal a machine with 64GB of RAM this time, but ZFS + SSD cache + regular spindles sounds cleaner than an rsync of 150GB every 2 hours..
Regards,
Ciro Iriarte http://cyruspy.wordpress.com --
Why the SSD cache? Why not just use the SSD.
Having the filesystem layer dealing with the caching is likely to add even more overhead, and you're also relying on the cache to actually cache the things you are accessing. I would first ditch Solaris and see if the performance changes.
How long does it take to run /one/ instance of the poller?
adam.
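One way to measure that, assuming the poller.php switches of that era (the 0/24 form matches the cron lines in the logs; the -h form and the hostname are illustrative):
--
# time one poller process over its 1/24th share of devices
time ./poller.php 0/24

# or time a single device to separate SNMP latency from disk I/O
time ./poller.php -h core-switch01.example.net
--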
2013/2/14 Adam Armstrong adama@memetic.org
Why the SSD cache? Why not just use the SSD.
"SSD disk size" < "rrd directory size" mostly, and I/O to a mid range storage shouldn't hurt either...
Having the filesystem layer dealing with the caching is likely to add even more overhead, you're also relying on the cache to actually cache the things you are accessing.
I can't imagine anything with more insight into the access pattern than the filesystem itself... ZFS has a second-level read cache (L2ARC) and a write cache (the ZIL, which mostly helps with synchronous writes). I'm not saying it will make a difference in this case, but I would like to gather stats in case I finally add those disks...
Last time I checked, disk svctime was not high, but system CPU time was; maybe network delay?...
I would first ditch Solaris and see if the performance changes
Right now I don't have spare time to do it all again from scratch, sadly.
How long does it take to run /one/ instance of the poller?
I'll check the logs again in the morning once I get to the office.
Regards,
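For context, attaching the two local SSDs mentioned above as L2ARC and ZIL would be roughly the following; the pool and device names are placeholders, and whether it actually helps here is exactly the open question:
--
# add one SSD as a second-level read cache (L2ARC)
zpool add rrdpool cache c0t2d0

# add the other as a separate log device for the ZIL,
# which mainly helps synchronous writes
zpool add rrdpool log c0t3d0
--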
From the logs, using 24 processes, these are some of the executions:
--
./poller.php 23/24 February 13, 2013, 19:20 - 27 devices polled in 2713 secs
./poller.php 4/24 February 13, 2013, 19:20 - 28 devices polled in 2718 secs
./poller.php 3/24 February 13, 2013, 19:20 - 28 devices polled in 2753 secs
./poller.php 20/24 February 13, 2013, 19:21 - 27 devices polled in 2789 secs
./poller.php 18/24 February 13, 2013, 19:22 - 27 devices polled in 3016 secs
./poller.php 6/24 February 13, 2013, 19:22 - 28 devices polled in 2861 secs
./poller.php 12/24 February 13, 2013, 19:22 - 27 devices polled in 2869 secs
./poller.php 5/24 February 13, 2013, 19:23 - 28 devices polled in 2909 secs
./poller.php 8/24 February 13, 2013, 19:23 - 27 devices polled in 2633 secs
./poller.php 16/24 February 13, 2013, 19:24 - 27 devices polled in 2963 secs
./poller.php 21/24 February 13, 2013, 19:24 - 27 devices polled in 2965 secs
--
The weird thing is, not all the processes are listed in the log the same number of times...
Running Job's wrapper right now with 32 threads, so far I've seen polling times from 9 to 700 seconds. I'll report back when it finishes.
Regards,
Well, hit the send button too soon:
--
INFO: poller-wrapper polled 696 devices in 806 seconds with 32 workers
WARNING: the process took more than 5 minutes to finish, you need faster hardware or more threads
INFO: in sequential style polling the elapsed time would have been: 25251 seconds
WARNING: device 25 is taking too long: 501 seconds
WARNING: device 427 is taking too long: 437 seconds
WARNING: device 509 is taking too long: 501 seconds
WARNING: device 605 is taking too long: 437 seconds
WARNING: device 624 is taking too long: 390 seconds
WARNING: device 625 is taking too long: 302 seconds
WARNING: device 653 is taking too long: 405 seconds
WARNING: device 697 is taking too long: 393 seconds
ERROR: Some devices are taking more than 300 seconds, the script cannot recommend you what to do.

real 13m27,87s
user 43m1,94s
sys 13m46,06s
--
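For anyone following along, the wrapper replaces the per-process "poller.php n/24" cron entries with a single line; a sketch assuming a stock /opt/observium install and poller-wrapper.py taking the thread count as its only argument:
--
# /etc/cron.d/observium -- one wrapper run every 5 minutes, 32 threads
*/5 * * * * root /opt/observium/poller-wrapper.py 32 >> /dev/null 2>&1
--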
Well, just a quick follow-up. Had some weird issues with Solaris 11, the most annoying being MySQL core dumping. Went back to SLES11 for lack of time to troubleshoot this (and went from 10k ports to just 1k; the ISP gear is left out for the time being).
Regards,
Hi Ciro,
Hm, MySQL being broken could contribute to the slow polling times, as we fetch and update data there many times per device. I'm still curious how performance could be so shitty, so if you have time to look into Solaris further I'd be glad to hear if you find the issue :)
Tom
Hi Tom, instead of going back to the original Linux VM, I reinstalled the new hardware with Linux and moved the original database to it (IT Infrastructure). I also created two more parallel installations (workstation access network, ISP distribution network) to be able to test scalability later without killing the first group (mine).
It's annoying to handle three databases for user creation, but it's also the only way to group machines.
For the time being I won't fiddle with Solaris + Observium (too much work backlog).
Regards, CI.-
Ever heard of redis? It's an in-memory object store with on-disk persistence - it's essentially doing the same thing.
Kinda chuckling to myself... you've got 150 gig of RRD data, so apparently a boatload of devices, and a Solaris box with 64G of RAM, which we know wasn't cheap, but you can't afford a Xeon box with 256G?
Could it be you're just in love with zfs, and this is your excuse to ask her out?
On Thu, 2013-02-14 at 01:19 -0300, Ciro Iriarte wrote:
... sounds cleaner than rsync of 150GB every 2 hours..
2013/2/14 christopher barry cbarry@rjmetrics.com
Ever heard of redis? It's an in memory object store with on-disk persistence - it's essentially doing the same thing
Hmm, nope, sounds interesting but apparently it's not a general purpose solution...
Ref: http://redis.io/topics/faq
Kinda chuckling to myself... you've got 150 Gig of RRD data, so apparently a boatload of devices,
Yep, the original devices were mostly Linux servers and a few switches (about 30GB); the ~100GB increment came from adding more network devices... (too many ports?)
a Solaris box with 64G of ram, we know that wasn't cheap, but you can't afford a Xeon box with 256G?
Well, this wasn't bought for data collection specifically. It's in fact a Xeon machine, not SPARC, in case you were thinking about that...
Could it be you're just in love with zfs, and this is your excuse to ask her out?
Hmm, "in love" is too much, but I'm curious and was a nice opportunity to ask her out.. :)
Regards, CI.-
Ciro Iriarte wrote:
8 vCPU + 2GB RAM VMware guest with a 30GB RRD footprint. After adding about 320 new hosts (all of them network gear) the RRD database went to 130GB and the little machine got peaks of 3k IOPS. The I/O wait was rather high and the machine would suddenly stop responding
I'd get two 240 GB SSDs, configure them as a ZFS mirror pool and dedicate it to RRD. Don't forget to turn off atime, and let the pool use 4kB sectors (ashift=12), otherwise the SSDs won't be happy.
Mark
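On OpenZFS/ZFS-on-Linux that recipe would look roughly like this; the pool and device names are invented, and on Solaris/Illumos ashift can't simply be forced at create time (see Mark's follow-up below):
--
# mirrored SSD pool with 4kB sectors forced at creation time
zpool create -o ashift=12 rrdpool mirror /dev/sdc /dev/sdd

# turn atime off for the whole pool (inherited by child datasets)
zfs set atime=off rrdpool

# verify the ashift actually in use
zdb -C rrdpool | grep ashift
--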
Thanks! Disabling atime in a minute; the 240GB disks will wait a little longer.. Can the 4kB sector size be changed after zpool creation? I assume that goes for SLOG/L2ARC too, right?
Regards,
No, the sector size (ashift) is fixed at the time a new top-level virtual device (vdev) is created (zpool create, or zpool add), and can't be changed later. How to force it depends on the OS: with FreeBSD a gnop trick can be used if a disk lies about its physical sector size, with Illumos you edit sd.conf; I don't know about Solaris. See: http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks
And make sure partitions (if using them) are 4kB aligned on AF disks (including SSD).
Mark
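The FreeBSD gnop trick Mark mentions is roughly this (disk and pool names are examples; only newly created vdevs pick up the forced sector size):
--
# create a 4kB-sector nop device on top of the real disk
gnop create -S 4096 /dev/ada1

# build the pool on the .nop device so ZFS chooses ashift=12
zpool create rrdpool /dev/ada1.nop

# drop the nop layer; the pool imports fine on the bare disk afterwards
zpool export rrdpool
gnop destroy /dev/ada1.nop
zpool import rrdpool
--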
Hi Adam,
2013/2/13 Adam Armstrong adama@memetic.org
ZFS is quite possibly the least optimal filesystem for running RRDs from. You want a simple filesystem without much overhead on operations, like EXT with journalling and atime turned off.
noatime is the first item on my to-do list. I was expecting to leave it working for a while before touching anything (just to have a baseline).
Also, it's possible that RRD is poorly optimised on solaris and uses slow system calls?
Hmm, how to tell?
You say 41 devices in an hour, but your paste suggests 468 devices in 73 minutes.
Well, that line didn't make it into the mail. I was just taking one poller process into account. I thought an execution time of about an hour is not nice when you intend to collect every 5 minutes... :)
What is your I/O subsystem? disk? ssd? how many?
Right now the BL460c machine is connected with 2 x 4Gb interfaces to our SAN. The LUN provided is presented through a Hitachi USP-VM (hitting its cache) but uses an AMS2500 disk group (about 45 disks if I'm not wrong, shared with other workloads). I also have 2 local SSD disks available that I'm considering using as ZIL and/or L2ARC (if/when they're needed).
A spindled disk will not behave well with more than 4-8 concurrent processes, an SSD might behave better.
You should also look at the new poller wrapper script, which might help you squeeze a bit more out of it, once you've sized it correctly
I'll search the archives for that...
One must ask why, if you have a dedicated machine, are you not just running it on Ubuntu, like we recommend?
Hmm, well, I'm really a happy SLES user, and as long as I can run an app on it, I'll choose it every time :D. The migration to Solaris was based on the possibility of using ZFS caching to SSD disks in the same zpool; also, I was looking forward to trying its new release.. I would just like to have references from other large installations. Maybe I'm hitting a PHP limitation, maybe I need more parallel processes, maybe there's a freaking DNS problem, not sure right now, but I'm sure I shouldn't need more hardware :D
Regards, CI.-
participants (6)
- Adam Armstrong
- christopher barry
- Ciro Iriarte
- Mark Martinec
- Morgan McLean
- Tom Laermans