poller.php burns all available CPU in the last community edition
After upgrading a somewhat outdated Observium CE 0.13.10.4585 to CE 0.15.6.6430 (under PHP 5.5.28), poller.php started burning excessive CPU: eight pollers running on an 8-CPU machine consumed 100% of the available CPU time instead of a few percent as before. They still managed to get their job done, but only barely within the five-minute slot, and often missed it.
While watching the behaviour for a day and learning PHP profiling along the way, I arrived at an explanation and a solution. Apologies if this has been reported before; I couldn't find any such report.
The problem is in the function external_exec() in includes/common.inc.php, which was rewritten at some point between these two versions and now uses a common loop that waits with stream_select() on the two pipes (stdout and stderr) of a spawned snmp*walk process.
The documentation for stream_select() helps in understanding the issue:
http://php.net/manual/en/function.stream-select.php
read: The streams listed in the read array will be watched to see if characters become available for reading (more precisely, to see if a read will not block - in particular, a stream resource is also ready on end-of-file, in which case an fread() will return a zero length string).
So what happens is that when a spawned snmp* command closes its stderr but not yet its stdout, external_exec() goes into a spin:
feof($pipes[1]) is false but that stream is not yet ready for reading, while feof($pipes[2]) is true, so stream_select() returns right away and reports $pipes[2] as ready. fread($pipes[2]) is then called, returns an empty string (because stderr is at EOF), and the whole loop busily spins around until the snmp command happens to terminate.
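The effect is easy to reproduce outside of Observium. Here is a minimal, self-contained sketch (my own test code, not Observium's; the shell command and the 8192-byte read size are arbitrary) that mimics the pre-patch loop against a child which closes stderr right away but keeps stdout open for a few seconds:

```php
<?php
// The child closes its stderr immediately (exec 2>&-) but keeps stdout open
// for ~3 seconds, which is exactly the situation that makes the loop spin.
$descriptors = array(1 => array('pipe', 'w'), 2 => array('pipe', 'w'));
$process = proc_open('exec 2>&-; sleep 3; echo done', $descriptors, $pipes);

$iterations = 0;
// Same shape as the loop in external_exec(): always select on both pipes.
while (feof($pipes[1]) === FALSE || feof($pipes[2]) === FALSE)
{
  $read   = array($pipes[1], $pipes[2]);
  $write  = NULL;
  $except = NULL;
  stream_select($read, $write, $except, 5);
  foreach ($read as $pipe) { fread($pipe, 8192); }
  $iterations++;
}
proc_close($process);
// On an affected system this prints a huge number instead of a handful,
// because select() keeps reporting the EOF'd stderr pipe as readable.
echo "loop iterations: $iterations\n";
```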
Here is a fix:
    --- includes/common.inc.php.ori  2015-09-04 20:29:39.561244285 +0200
    +++ includes/common.inc.php      2015-09-04 20:30:07.617242602 +0200
    @@ -618,9 +618,13 @@
       {
         $start = microtime(TRUE);
    -    //while ($status['running'] !== FALSE)
    -    while (feof($pipes[1]) === FALSE || feof($pipes[2]) === FALSE)
    +    while (TRUE)
         {
    +      $read = array();
    +      if (!feof($pipes[1])) { $read[] = $pipes[1]; }
    +      if (!feof($pipes[2])) { $read[] = $pipes[2]; }
    +      if (empty($read)) { break; }
    +
           stream_select(
    -        $read = array($pipes[1], $pipes[2]),
    +        $read,
             $write = NULL,
             $except = NULL,
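For readers who find the diff awkward to follow in mail form, here is a self-contained sketch of how the patched loop behaves. The wrapper function read_both_pipes() and the 5-second timeout are my own inventions for illustration; only the loop body mirrors the hunk above, and the rest of external_exec() is omitted:

```php
<?php
// Sketch: only pipes that have not yet reached EOF are handed to
// stream_select(), and the loop ends once both stdout and stderr are done.
function read_both_pipes(array $pipes)
{
  $stdout = '';
  $stderr = '';
  while (TRUE)
  {
    $read = array();
    if (!feof($pipes[1])) { $read[] = $pipes[1]; }   // child's stdout
    if (!feof($pipes[2])) { $read[] = $pipes[2]; }   // child's stderr
    if (empty($read)) { break; }                     // both pipes at EOF: done

    $write  = NULL;
    $except = NULL;
    stream_select($read, $write, $except, 5);        // timeout value assumed

    foreach ($read as $pipe)
    {
      if ($pipe === $pipes[1])      { $stdout .= fread($pipe, 8192); }
      else if ($pipe === $pipes[2]) { $stderr .= fread($pipe, 8192); }
    }
  }
  return array($stdout, $stderr);
}

// Example: the problematic case (stderr closed early) no longer spins.
$process = proc_open('exec 2>&-; sleep 3; echo done',
                     array(1 => array('pipe', 'w'), 2 => array('pipe', 'w')), $pipes);
list($out, $err) = read_both_pipes($pipes);
proc_close($process);
echo $out;   // prints "done"
```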
Regards Mark
Nice catch, and nice research.
Mike will probably want to test and commit this himself!
Adam.
I noticed the high CPU usage a few weeks ago, but I haven't had time to track it down. Thanks for the hard work! I've incorporated your patch and will let the poller run a while to see how it goes.
-- Cameron Moore
I see no appreciable decrease in poller usage after applying this modification, so perhaps it was fixing something that had already been fixed since the last CE release.
adam.
I seem to remember we fairly quickly released a new CE due to some rrd process issue? Mike! Where are you! :)
That patch did not work for me. It looked promising at first, with the poller CPU usage cut by 30%. However, I let it run overnight and came in to a server with a load average of 300 and tons of poller processes still running (cron spawns a new poller every 5 minutes).
-- Cameron Moore
Manager of Systems & Networks
Hardin-Simmons University, Technology Services
Ph: (325) 670-1506  Fx: (325) 670-1570
I’ve submitted my solution to the poller’s high CPU usage: http://jira.observium.org/browse/OBSERVIUM-1439
-- Cameron Moore
Manager of Systems & Networks
Hardin-Simmons University, Technology Services
Ph: (325) 670-1506  Fx: (325) 670-1570
Committed as r6969 for those on the Professional Edition. Adam says it doesn’t improve much for him, but for me it was a significant improvement. Anyone else see reduced CPU usage of the poller after this commit?
-- Cameron Moore
Manager of Systems & Networks
Hardin-Simmons University, Technology Services
Ph: (325) 670-1506  Fx: (325) 670-1570
Hello,
Just in case: I'm running Observium CE 0.15.6.6430 (upgraded from 0.13.10.4585 right after the update became available) and I do not have any such issues with CPU usage. It is the same as it was before the upgrade (graph attached).
Here are some statistics in case they might help:
    Observium  CE 0.15.6.6430 (CPU: Intel Core2 Duo E8400)
    OS         Linux 3.2.0-4-amd64 [amd64] (Debian 7.8)
    Apache     2.2.22 (Debian)
    PHP        5.4.41-0+deb7u1
    Python     Python 2.7.3
    MySQL      5.5.41-0+wheezy1
    SNMP       NET-SNMP 5.4.3
    RRDtool    1.4.7
    Statistics
    DB size            1.33GB
    RRD size           3.39GB
    Devices            48
    Ports              949
    IPv4 Addresses     189
    IPv4 Networks      86
    IPv6 Addresses     52
    IPv6 Networks      39
    Processors         643
    Memory             149
    Storage            339
    Disk I/O           164
    HR-MIB             1813
    Entity-MIB         3
    Syslog Entries     0
    Eventlog Entries   28107
    Sensors            162
Thanks, Eugene
2015-09-08 Eugene Nechai wrote:
Just in case: I'm running Observium CE 0.15.6.6430 (upgraded from 0.13.10.4585 right after the update became available) and I do not have any such issues with CPU usage. It is the same as it was before the upgrade (graph attached).
I haven't seen the logging overhead that Cameron has been fixing, probably because the poller in CE 0.15.6.6430 does not colorize the text in its printout. Nor does xdebug with KCacheGrind show any such remaining hotspots in a poller.php run.
Note that the select-on-EOF problem I reported at the head of this topic is unrelated to what Cameron has been chasing. It is quite possible that it does not affect all installations, or that EOF on the two pipes is often (but not always) seen close together in time so the spinning stays short; or perhaps it has already been fixed in the Professional Edition.
If anyone wants to try it, here is a hackish patch against CE 0.15.6.6430 which prints an 'EXCESSIVE IDLE CYCLES' line in the output of a poller.php run whenever such cycles occur:
    --- includes/common.inc.php.ori  2015-09-04 20:29:39.561244285 +0200
    +++ includes/common.inc.php      2015-09-08 17:59:13.600049587 +0200
    @@ -618,7 +618,9 @@
       {
         $start = microtime(TRUE);
    +    $idle_cycles = 0;
         //while ($status['running'] !== FALSE)
         while (feof($pipes[1]) === FALSE || feof($pipes[2]) === FALSE)
         {
    +      $any_progress = 0;
           stream_select(
             $read = array($pipes[1], $pipes[2]),
    @@ -634,9 +636,13 @@
             if ($pipe === $pipes[1])
             {
    -          $stdout .= fread($pipe, 8192);
    +          $str = fread($pipe, 8192);
    +          if (strlen($str) > 0) { $any_progress = 1; }
    +          $stdout .= $str;
             }
             else if ($pipe === $pipes[2])
             {
    -          $stderr .= fread($pipe, 8192);
    +          $str = fread($pipe, 8192);
    +          if (strlen($str) > 0) { $any_progress = 1; }
    +          $stderr .= $str;
             }
           }
    @@ -670,4 +676,8 @@
           }
         }
    +    if (!$any_progress) { $idle_cycles++; }
    +  }
    +  if ($idle_cycles > 2) {
    +    printf("EXCESSIVE IDLE CYCLES: %s\n", $idle_cycles);
       }
       if ($status['running'])
In my case there are numerous. The further apart in time the EOF conditions on the two pipes are, the worse the idle CPU spinning gets:
    $ ./poller.php -i 8 -n 1 | fgrep 'EXCESSIVE IDLE CYCLES'
    EXCESSIVE IDLE CYCLES: 456
    EXCESSIVE IDLE CYCLES: 2075
    EXCESSIVE IDLE CYCLES: 2800
    EXCESSIVE IDLE CYCLES: 1661
    EXCESSIVE IDLE CYCLES: 1301
    EXCESSIVE IDLE CYCLES: 295
    EXCESSIVE IDLE CYCLES: 2457
    EXCESSIVE IDLE CYCLES: 4217
    EXCESSIVE IDLE CYCLES: 14354
    EXCESSIVE IDLE CYCLES: 732
    EXCESSIVE IDLE CYCLES: 11438
    EXCESSIVE IDLE CYCLES: 4784
    EXCESSIVE IDLE CYCLES: 10939
    EXCESSIVE IDLE CYCLES: 589
    ...
    EXCESSIVE IDLE CYCLES: 4782
    EXCESSIVE IDLE CYCLES: 84
    EXCESSIVE IDLE CYCLES: 518
    EXCESSIVE IDLE CYCLES: 2796
    EXCESSIVE IDLE CYCLES: 2260
    EXCESSIVE IDLE CYCLES: 3591
    EXCESSIVE IDLE CYCLES: 7266
    EXCESSIVE IDLE CYCLES: 13707
    EXCESSIVE IDLE CYCLES: 1509
    EXCESSIVE IDLE CYCLES: 4420
    EXCESSIVE IDLE CYCLES: 2103
    EXCESSIVE IDLE CYCLES: 15664
    ...
    EXCESSIVE IDLE CYCLES: 1495
    EXCESSIVE IDLE CYCLES: 3592
    EXCESSIVE IDLE CYCLES: 68420
    EXCESSIVE IDLE CYCLES: 5430
    EXCESSIVE IDLE CYCLES: 5003
    EXCESSIVE IDLE CYCLES: 191203
    EXCESSIVE IDLE CYCLES: 148976
    EXCESSIVE IDLE CYCLES: 20664
    EXCESSIVE IDLE CYCLES: 46755
    EXCESSIVE IDLE CYCLES: 6071
    EXCESSIVE IDLE CYCLES: 109397
    ...etc
A PHP select on a pipe which is already at EOF is futile (it always fires as ready), so it needs to be avoided even if a particular installation does not run into problems.
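To illustrate the point in isolation, here is a tiny sketch (again my own test code, with an arbitrary shell command) showing that stream_select() keeps reporting an EOF'd pipe as readable, regardless of the timeout:

```php
<?php
// The child closes stderr immediately; the read end of $pipes[2] is then at EOF.
$process = proc_open('exec 2>&-; sleep 2', array(2 => array('pipe', 'w')), $pipes);

fread($pipes[2], 8192);      // returns '' and sets the EOF flag
var_dump(feof($pipes[2]));   // bool(true)

$read   = array($pipes[2]);
$write  = NULL;
$except = NULL;
// Despite the 10-second timeout this returns int(1) immediately, every time:
// an EOF'd pipe is always "ready", so selecting on it can never block.
var_dump(stream_select($read, $write, $except, 10));

proc_close($process);
```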
Mark
Yes, we're chasing two separate issues. However, I did apply the "excessive cycles" patch you provided, and I see no excessive cycles. It looks like that issue has been fixed in the latest Professional Edition.
-- Cameron Moore
Manager of Systems & Networks
Hardin-Simmons University, Technology Services
Ph: (325) 670-1506  Fx: (325) 670-1570
Yes, I think that was fixed some time ago by Mike.
New CE is not too far out!
Adam.
Regarding whether anyone else sees any improvement: I see a notable decrease in CPU usage on Monday morning after I did an svn up :) http://best-practice.se/dump/cpu_usage.PNG
/Markus
Yes, excellent work. Thanks for your time.
thanks
Peter Hine
Senior Technical Support Engineer (Servers)
FCoA ITS
peter.hine@familycourt.gov.au
participants (8)
- Adam Armstrong
- Adam Armstrong
- Eugene Nechai
- Mark Martinec
- Markus Klock
- Moore, Cameron
- Peter.Hine@familycourt.gov.au
- Tom Laermans