Hi everyone,
Yesterday I committed a housekeeping script to clean up after Observium.
If you monitor a large number of devices and Observium has been running
for a long time, you're sure to have a gigantic database of event log,
syslog and performance timing information. Some of you have resorted to
periodically truncating tables - fear no more, housekeeping.php is here!
You can specify either -a for all modules, or any combination of -s, -e,
-r, -p and -t for specific ones.
Apart from that, most of these modules do nothing until you set the
matching options in the configuration.
What it does:
-s Cleans up syslog data, based on the age of the entries.
Configured by $config['housekeeping']['syslog']['age'] = 0; // Maximum
age of syslog entries in seconds; 0 to disable
e.g. 7*86400 for 7 days.
-e Cleans up eventlog data, based on the age of the entries.
Configured by $config['housekeeping']['eventlog']['age'] = 0; // Maximum
age of eventlog entries in seconds; 0 to disable
e.g. 30*86400 for 30 days.
-r Cleans up RRD files. There are 2 options here:
Cleanup of RRD files whose modification time is older than a certain
age: ports that are gone, sensors that are gone, RRD files erroneously
created due to bugs or hardware changes, etc.
Configured by $config['housekeeping']['rrd']['age'] = 0; // Maximum age
of unused rrd files in seconds before automatically purging; 0 to disable
e.g. 90*86400 for 90 days.
Cleanup of invalid RRD files. Requires $config['file'] to be set
correctly to the path of the "file" magic utility.
Runs over all .rrd files to see if they are identified as RRDTool
databases. Files can sometimes get corrupted (0 bytes, half a file) when
your disk/ramdrive is full. Once such a file exists, it is not recreated
even if it is invalid, so RRDtool refuses to add data to it. This script
deletes any .rrd file not identified as RRDTool.
Configured by $config['housekeeping']['rrd']['invalid'] = TRUE; //
Delete .rrd files that are not valid RRD files (e.g. created with a full
disk)
Note: this defaults to true currently. Handle with care.
-p Cleans up deleted ports. You can do this manually by using the purge
button in the web interface, but one could probably safely assume that
if a port has not returned after 30 days, it really is gone. A shorter
period could be useful in your environment; mine is set to 7 days.
Configured by $config['housekeeping']['deleted_ports']['age'] = 0; //
Maximum age of deleted ports in seconds before automatically purging; 0
to disable
e.g. 30*86400 for 30 days.
-t Cleans up performance timing information. Observium keeps performance
times per module, per device and per run. This quickly accumulates into
a giant number of rows in the database.
Configured by $config['housekeeping']['timing']['age'] = 0; // Maximum
age of timing (discovery and poll time) entries in seconds; 0 to disable
e.g. 7*86400 for 7 days.
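As a rough sketch, the settings above could be collected in config.php
like this. The ages are simply the example values mentioned in this
mail, not recommendations, and the path given for the "file" utility is
an assumption about a typical system; check where it lives on yours.

```php
<?php
// Sketch of a config.php fragment. Drop the opening tag above when pasting
// into an existing config.php. All ages are in seconds; 0 disables a module.

$config['housekeeping']['syslog']['age']        = 7 * 86400;  // keep 7 days of syslog
$config['housekeeping']['eventlog']['age']      = 30 * 86400; // keep 30 days of eventlog
$config['housekeeping']['rrd']['age']           = 90 * 86400; // purge RRDs untouched for 90 days
$config['housekeeping']['deleted_ports']['age'] = 30 * 86400; // purge ports deleted 30+ days ago
$config['housekeeping']['timing']['age']        = 7 * 86400;  // keep 7 days of perf times

// Invalid-RRD cleanup currently defaults to TRUE; shown explicitly here.
$config['housekeeping']['rrd']['invalid']       = TRUE;

// Assumed location of the "file" magic utility; adjust for your system.
$config['file']                                 = '/usr/bin/file';
```

With something like this in place, running housekeeping.php with -a
should cover all five modules in one go.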
Housekeeping logs into housekeeping.log in the log_dir. (*)
I recommend putting this into cron, daily or weekly (yes, even though
the age is set in seconds, I don't recommend running it every minute ;),
but only after you've run it manually once. It'll take quite a while the
first time to clean up large databases.
The "check for invalid RRD files" is probably quite I/O heavy.
I'll be documenting this on the wiki soonish.
(*) log_dir was also added as a new configuration directive; it defaults
to ${installdir}/logs which will be created automatically for you if
possible.
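If the default location doesn't suit you, the directive can presumably
be overridden like any other setting. A minimal sketch, assuming the key
is spelled $config['log_dir'] like the other options, and using a
made-up path:

```php
<?php
// Assumption: the directive is set as $config['log_dir'].
// The path below is hypothetical; pick a directory the Observium user can write to.
$config['log_dir'] = '/var/log/observium';
```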
Tom