2013/2/14 Ciro Iriarte <cyruspy@gmail.com>
2013/2/14 Adam Armstrong <adama@memetic.org>

On Thu, 14 Feb 2013 01:19:20 -0300, Ciro Iriarte <cyruspy@gmail.com>
wrote:
> 2013/2/14 christopher barry <cbarry@rjmetrics.com>
>
>>
>> Why not RAM? Build a box w/256G or more, and keep them all in tmpfs
>> during operation. (ditch Solaris. It was cool once, but...)
>>
>> Copy them up into tmpfs on boot, then cron an rsync to disk between
>> polling or even occasionally, depending on pain tolerance.
>>
>> -C
>>
>>
> Hmm, well, I could only steal a machine with 64GB of RAM this time, but ZFS
> + SSD cache + regular spindles sounds cleaner than rsync of 150GB every 2
> hours..

Why the SSD cache? Why not just use the SSD?

Having the filesystem layer deal with the caching is likely to add even
more overhead, and you're also relying on the cache to actually hold the
things you are accessing.
I would first ditch Solaris and see if the performance changes.

How long does it take to run /one/ instance of the poller?

adam.
_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium

From the logs, using 24 processes, these are some of the executions:

--
./poller.php 23/24 February 13, 2013, 19:20 - 27 devices polled in 2713. secs
./poller.php 4/24 February 13, 2013, 19:20 - 28 devices polled in 2718. secs
./poller.php 3/24 February 13, 2013, 19:20 - 28 devices polled in 2753. secs
./poller.php 20/24 February 13, 2013, 19:21 - 27 devices polled in 2789. secs
./poller.php 18/24 February 13, 2013, 19:22 - 27 devices polled in 3016. secs
./poller.php 6/24 February 13, 2013, 19:22 - 28 devices polled in 2861. secs
./poller.php 12/24 February 13, 2013, 19:22 - 27 devices polled in 2869. secs
./poller.php 5/24 February 13, 2013, 19:23 - 28 devices polled in 2909. secs
./poller.php 8/24 February 13, 2013, 19:23 - 27 devices polled in 2633. secs
./poller.php 16/24 February 13, 2013, 19:24 - 27 devices polled in 2963. secs
./poller.php 21/24 February 13, 2013, 19:24 - 27 devices polled in 2965. secs
--
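
For a rough per-device figure from log lines like the ones above, here is a minimal Python sketch. The regular expression and the function name per_device_seconds are my own and not part of Observium; it assumes every line ends in "N devices polled in M. secs" exactly as shown.

```python
import re

# Hypothetical pattern matching the tail of each poller.php log line,
# e.g. "- 27 devices polled in 2713. secs" (the stray dot is optional).
LINE_RE = re.compile(r"- (\d+) devices polled in (\d+)\.? secs")


def per_device_seconds(log_text):
    """Return (devices, seconds, seconds_per_device) for each matching line."""
    rows = []
    for match in LINE_RE.finditer(log_text):
        devices = int(match.group(1))
        secs = int(match.group(2))
        rows.append((devices, secs, secs / devices))
    return rows


sample = """\
./poller.php 23/24 February 13, 2013, 19:20 - 27 devices polled in 2713. secs
./poller.php 4/24 February 13, 2013, 19:20 - 28 devices polled in 2718. secs
"""

rows = per_device_seconds(sample)
for devices, secs, avg in rows:
    print(f"{devices} devices in {secs}s -> {avg:.1f}s per device")
```

On the two sample lines this works out to roughly 100 seconds per device for each process.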

The weird thing is, not all the processes are listed in the log the same number of times...

Running Job's wrapper right now with 32 threads, so far I've seen polling times from 9 to 700 seconds. I'll report back when it finishes.

Regards,

--

Well, I hit the send button too soon:

--
INFO: poller-wrapper polled 696 devices in 806 seconds with 32 workers
WARNING: the process took more than 5 minutes to finish, you need faster hardware or more threads
INFO: in sequential style polling the elapsed time would have been: 25251 seconds
WARNING: device 25 is taking too long: 501 seconds
WARNING: device 427 is taking too long: 437 seconds
WARNING: device 509 is taking too long: 501 seconds
WARNING: device 605 is taking too long: 437 seconds
WARNING: device 624 is taking too long: 390 seconds
WARNING: device 625 is taking too long: 302 seconds
WARNING: device 653 is taking too long: 405 seconds
WARNING: device 697 is taking too long: 393 seconds
ERROR: Some devices are taking more than 300 seconds, the script cannot recommend you what to do.

real    13m27,87s
user    43m1,94s
sys     13m46,06s
--
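
To put the wrapper summary in perspective, a small Python sketch of the arithmetic. The function name wrapper_summary and the 300-second target parameter are my own; the input numbers are taken from the INFO lines above, and the "ideal workers" figure assumes the work could be split perfectly evenly, which the slow devices in the warnings prevent.

```python
import math


def wrapper_summary(devices, wall_seconds, sequential_seconds, workers,
                    target_seconds=300):
    """Derive simple ratios from the poller-wrapper INFO lines."""
    speedup = sequential_seconds / wall_seconds
    return {
        # Mean time spent on one device if polled one after another.
        "avg_per_device": sequential_seconds / devices,
        # How much faster the parallel run was than sequential polling.
        "speedup": speedup,
        # Fraction of the worker pool that was effectively kept busy.
        "efficiency": speedup / workers,
        # Lower bound on workers needed to fit the target interval,
        # assuming perfectly even distribution (an idealisation).
        "ideal_workers": math.ceil(sequential_seconds / target_seconds),
    }


stats = wrapper_summary(devices=696, wall_seconds=806,
                        sequential_seconds=25251, workers=32)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

The 32 workers are almost fully used, so more threads should help up to a point, but any single device that needs more than 300 seconds on its own (such as devices 25 and 509 at 501 seconds) will still overrun the interval regardless of thread count.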


--
Ciro Iriarte
http://cyruspy.wordpress.com
--