2013/2/14 Ciro Iriarte cyruspy@gmail.com
2013/2/14 Adam Armstrong adama@memetic.org
On Thu, 14 Feb 2013 01:19:20 -0300, Ciro Iriarte cyruspy@gmail.com wrote:
2013/2/14 christopher barry cbarry@rjmetrics.com
Why not RAM? Build a box w/256G or more, and keep them all in tmpfs during operation. (ditch Solaris. It was cool once, but...)
Copy them up into tmpfs on boot, then cron an rsync to disk between polling or even occasionally, depending on pain tolerance.
-C
Hmm, well, I could only steal a machine with 64GB of RAM this time, but ZFS with an SSD cache + regular spindles sounds cleaner than an rsync of 150GB every 2 hours..
Why the SSD cache? Why not just use the SSD?
Having the filesystem layer deal with the caching is likely to add even more overhead, and you're also relying on the cache to actually hold the things you are accessing. I would first ditch Solaris and see if the performance changes.
How long does it take to run /one/ instance of the poller?
adam.
_______________________________________________
observium mailing list
observium@observium.org
http://postman.memetic.org/cgi-bin/mailman/listinfo/observium
From the logs, using 24 processes, these are some of the runs:
--
./poller.php 23/24 February 13, 2013, 19:20 - 27 devices polled in 2713. secs
./poller.php 4/24 February 13, 2013, 19:20 - 28 devices polled in 2718. secs
./poller.php 3/24 February 13, 2013, 19:20 - 28 devices polled in 2753. secs
./poller.php 20/24 February 13, 2013, 19:21 - 27 devices polled in 2789. secs
./poller.php 18/24 February 13, 2013, 19:22 - 27 devices polled in 3016. secs
./poller.php 6/24 February 13, 2013, 19:22 - 28 devices polled in 2861. secs
./poller.php 12/24 February 13, 2013, 19:22 - 27 devices polled in 2869. secs
./poller.php 5/24 February 13, 2013, 19:23 - 28 devices polled in 2909. secs
./poller.php 8/24 February 13, 2013, 19:23 - 27 devices polled in 2633. secs
./poller.php 16/24 February 13, 2013, 19:24 - 27 devices polled in 2963. secs
./poller.php 21/24 February 13, 2013, 19:24 - 27 devices polled in 2965. secs
--
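To put a number on those runs, here is a rough sketch that parses the log lines above and works out the average time per device. The regex and the names (`LOG`, `PATTERN`, `parse`) are my own assumptions for illustration, not part of Observium, and the truncated decimals ("2713. secs") are read as whole seconds.

```python
import re

# Log lines copied from the poller output quoted above.
LOG = """\
./poller.php 23/24 February 13, 2013, 19:20 - 27 devices polled in 2713. secs
./poller.php 4/24 February 13, 2013, 19:20 - 28 devices polled in 2718. secs
./poller.php 3/24 February 13, 2013, 19:20 - 28 devices polled in 2753. secs
./poller.php 20/24 February 13, 2013, 19:21 - 27 devices polled in 2789. secs
./poller.php 18/24 February 13, 2013, 19:22 - 27 devices polled in 3016. secs
./poller.php 6/24 February 13, 2013, 19:22 - 28 devices polled in 2861. secs
./poller.php 12/24 February 13, 2013, 19:22 - 27 devices polled in 2869. secs
./poller.php 5/24 February 13, 2013, 19:23 - 28 devices polled in 2909. secs
./poller.php 8/24 February 13, 2013, 19:23 - 27 devices polled in 2633. secs
./poller.php 16/24 February 13, 2013, 19:24 - 27 devices polled in 2963. secs
./poller.php 21/24 February 13, 2013, 19:24 - 27 devices polled in 2965. secs
"""

# Hypothetical pattern: instance/total, device count, whole seconds.
PATTERN = re.compile(
    r"poller\.php (\d+)/(\d+) .* - (\d+) devices polled in (\d+)\.?\d* secs"
)


def parse(log):
    """Return a list of (instance, devices, seconds) tuples."""
    runs = []
    for line in log.splitlines():
        match = PATTERN.search(line)
        if match:
            instance, _total, devices, secs = map(int, match.groups())
            runs.append((instance, devices, secs))
    return runs


runs = parse(LOG)
total_devices = sum(devices for _, devices, _ in runs)
total_secs = sum(secs for _, _, secs in runs)
per_device = total_secs / total_devices

print(f"{len(runs)} runs, {total_devices} devices, {total_secs} s in total")
print(f"average per device: {per_device:.1f} s")
print(f"fastest run: {min(s for _, _, s in runs)} s, "
      f"slowest run: {max(s for _, _, s in runs)} s")
```

On this sample each process spends on the order of 100 seconds per device, and every process takes far longer than a 5 minute polling interval.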
The weird thing is, not all the processes are listed in the log the same number of times...
Running Job's wrapper right now with 32 threads, so far I've seen polling times from 9 to 700 seconds. I'll report back when it finishes.
Regards,
--
Ciro Iriarte
http://cyruspy.wordpress.com
--
Well, hit the send button too soon:
--
INFO: poller-wrapper polled 696 devices in 806 seconds with 32 workers
WARNING: the process took more than 5 minutes to finish, you need faster hardware or more threads
INFO: in sequential style polling the elapsed time would have been: 25251 seconds
WARNING: device 25 is taking too long: 501 seconds
WARNING: device 427 is taking too long: 437 seconds
WARNING: device 509 is taking too long: 501 seconds
WARNING: device 605 is taking too long: 437 seconds
WARNING: device 624 is taking too long: 390 seconds
WARNING: device 625 is taking too long: 302 seconds
WARNING: device 653 is taking too long: 405 seconds
WARNING: device 697 is taking too long: 393 seconds
ERROR: Some devices are taking more than 300 seconds, the script cannot recommend you what to do.

real 13m27,87s
user 43m1,94s
sys 13m46,06s
--
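A back-of-the-envelope check on those wrapper numbers, as a sketch only: every input is copied from the output above, the variable names are mine, and the "ideal" figures assume perfectly even scheduling across workers, which a real run will not achieve.

```python
import math

# Figures reported by poller-wrapper above.
workers = 32
wall_secs = 806
sequential_secs = 25251
slow_devices = [501, 437, 501, 437, 390, 302, 405, 393]
target_secs = 300  # the 5 minute polling interval

# Figures from the time(1) output, converted to seconds.
real = 13 * 60 + 27.87
cpu = (43 * 60 + 1.94) + (13 * 60 + 46.06)  # user + sys

# Best case if the work were split perfectly evenly over the workers.
ideal_wall = sequential_secs / workers
efficiency = ideal_wall / wall_secs

# Workers needed to fit the same total work into the target, again
# assuming an even split and no extra contention (an optimistic assumption).
min_workers = math.ceil(sequential_secs / target_secs)

# No amount of parallelism gets a cycle below the slowest single device.
floor_secs = max(slow_devices)

print(f"ideal wall time with {workers} workers: {ideal_wall:.0f} s "
      f"(measured {wall_secs} s, {efficiency:.0%} of ideal)")
print(f"workers needed for {target_secs} s at best: {min_workers}")
print(f"slowest device alone: {floor_secs} s")
print(f"average CPUs busy: {cpu / real:.1f}")
```

Read this way, the 32 workers were busy almost the whole time, so more threads should help roughly linearly if the box can take it, but devices 25 and 509 at 501 seconds each keep the cycle above 5 minutes until they are dealt with separately.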