Hi everyone,
Yesterday I committed a housekeeping script to clean up after Observium.
If you monitor a decent number of devices and have been running Observium for a long time, you're sure to have a gigantic database of eventlog, syslog and performance timing information. Some of you have resorted to periodically truncating tables - fear no more, housekeeping.php is here!
You can specify either -a for all modules, or any combination of -s, -e, -r, -p and -t for specific cleanups. Apart from that, most of these do nothing until you set the corresponding options in the configuration.
What it does:
-s Cleans up syslog data, based on the age of the entries.
Configured by $config['housekeeping']['syslog']['age'] = 0; // Maximum age of syslog entries in seconds, e.g. 7*86400 for 7 days; 0 to disable.
-e Cleans up eventlog data, based on the age of the entries.
Configured by $config['housekeeping']['eventlog']['age'] = 0; // Maximum age of eventlog entries in seconds, e.g. 30*86400 for 30 days; 0 to disable.
-r Cleans up RRD files. There are two options here:
Cleanup of RRD files whose modification time is older than a certain age. This catches ports that are gone, sensors that are gone, RRD files erroneously created due to bugs or hardware changes, etc.
Configured by $config['housekeeping']['rrd']['age'] = 0; // Maximum age of unused rrd files in seconds before automatically purging, e.g. 90*86400 for 90 days; 0 to disable.
Cleanup of invalid RRD files. Requires $config['file'] to be set to the path of the "file" magic utility. The script runs over all .rrd files to check whether "file" identifies them as RRDTool databases. They can sometimes get corrupted (0 bytes, half a file) when your disk/ramdrive is full. Once such a file exists, it is not recreated even though it is invalid, and RRDtool consequently refuses to add data to it. This script deletes any .rrd file not identified as an RRDTool database.
Configured by $config['housekeeping']['rrd']['invalid'] = TRUE; // Delete .rrd files that are not valid RRD files (e.g. created with a full disk)
Note: this currently defaults to TRUE. Handle with care.
-p Cleans up deleted ports. You can do this manually by using the purge button in the web interface, but one can probably safely assume that a port which has not returned after 30 days really is gone. A shorter period could be useful in your environment; mine is set to 7 days.
Configured by $config['housekeeping']['deleted_ports']['age'] = 0; // Maximum age of deleted ports in seconds before automatically purging, e.g. 30*86400 for 30 days; 0 to disable.
-t Cleans up performance timing information. Observium keeps performance times per module, per device and per run. This quickly accumulates into a giant number of rows in the database.
Configured by $config['housekeeping']['timing']['age'] = 0; // Maximum age of timing (discovery and poll time) entries in seconds, e.g. 7*86400 for 7 days; 0 to disable.
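To pull the options above together, here is a rough sketch of what the housekeeping part of a config.php could look like. The retention periods are only examples (the 7 days for deleted ports is what I use myself, the rest mirror the examples above), and the path given for $config['file'] is an assumption about where the "file" utility lives on your system, so check it before relying on it. It is meant to be added to your existing config.php, hence no opening PHP tag.

```php
// Housekeeping settings (example values, adjust to your environment).
// All ages are in seconds; 86400 seconds = 1 day; 0 disables that cleanup.

// -s: remove syslog entries older than 7 days (7*86400 = 604800 seconds).
$config['housekeeping']['syslog']['age']        = 7 * 86400;

// -e: remove eventlog entries older than 30 days.
$config['housekeeping']['eventlog']['age']      = 30 * 86400;

// -r: remove .rrd files not modified for 90 days.
$config['housekeeping']['rrd']['age']           = 90 * 86400;

// -r: delete .rrd files that "file" does not identify as RRDTool databases.
// Set to FALSE here as a cautious choice; the default is currently TRUE.
$config['housekeeping']['rrd']['invalid']       = FALSE;

// Path to the "file" magic utility, needed for the invalid RRD check.
// ASSUMPTION: /usr/bin/file is a common location, verify it on your system.
$config['file']                                 = '/usr/bin/file';

// -p: purge deleted ports after 7 days.
$config['housekeeping']['deleted_ports']['age'] = 7 * 86400;

// -t: remove discovery and poll timing entries older than 7 days.
$config['housekeeping']['timing']['age']        = 7 * 86400;
```

With something like this in place, -a should act on every module, since none of the ages is left at 0.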
Housekeeping logs into housekeeping.log in the log_dir. (*)
I recommend putting this into cron, daily or weekly (yes, even though the age is set in seconds, I don't recommend running it every minute ;), but only after you've run it manually once. It'll take quite a while the first time to clean up large databases. The "check for invalid RRD files" is probably quite I/O heavy.
I'll be documenting this on the wiki soonish.
(*) log_dir was also added as a new configuration directive; it defaults to ${installdir}/logs, which will be created automatically for you if possible.
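If you want the log somewhere else, overriding the directive would presumably look like the line below. This is only a sketch: the array form $config['log_dir'] is assumed by analogy with the other options in this mail, and the path is a made-up example.

```php
// ASSUMPTION: log_dir is set like the other $config options.
// The path is a hypothetical example; housekeeping.log would then be written there.
$config['log_dir'] = '/var/log/observium';
```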
Tom