Hi,
 
Just noticed that the CBQoS stuff has gone active recently (which is very cool) - currently we get all this via (very) painful integration with Cacti.
 
On looking closer I’ve found what I think is an issue, but stand ready to be corrected. What I think is happening is that Observium is treating packets that the Cisco counted as “Exceeded” as “Dropped”, when in reality they were just re-marked and transmitted normally.
 
As an example, here is the graph for gi3/1 (class-default) on a 6500 chassis:
 
 
Here is the output of “show policy-map int gi3/1”:
 
GigabitEthernet3/1
 
  Service-policy input: <name>
 
    class-map: <Class A> (match-any)
      Match: protocol arp
      Match: access-group name <ACL A1>
      Match: access-group name <ACL A2>
      police :
        32000 bps 102400 limit 102400 extended limit
      Earl in slot 3 :
        1377162657 bytes
        5 minute offered rate 2280 bps
        aggregate-forwarded 1377162657 bytes action: transmit
        exceeded 0 bytes action: drop                                       <- Zero drops
        aggregate-forward 2520 bps exceed 0 bps
 
    class-map: class-default (match-any)
      Match: any
      police :
        40000000 bps 3200000 limit 3200000 extended limit 1000000000 pir-bps
      Earl in slot 3 :
        10562039041987 bytes
        5 minute offered rate 11748848 bps
        aggregate-forwarded 10562039041987 bytes action: set-dscp-transmit
        exceeded 297846881624 bytes action: policed-dscp-transmit
        violated 0 bytes action: drop                                             <- Zero drops
        aggregate-forward 13842064 bps exceed 0 bps violate 0 bps
 
 
So in summary, there have never been any actual drops on the port, only re-marking of the DSCP (incrementing the exceeded counter) and then a transmit.
 
Since the re-marked packets are counted under “exceeded”, I think Observium is treating them as dropped rather than just exceeded. This also throws off the pre-policy and post-policy counters, because it implies that traffic didn’t make it through when in reality it all did. We use re-marking extensively, so all of our CBQoS graphs show incorrect drop and post-policy information.
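
To illustrate the accounting I would expect, here is a rough Python sketch. It is not Observium's actual code, and the function name and regex are made up for this example. It reads the CLI counter lines from the policer output above and only counts bytes as dropped when the configured action for that counter is literally "drop":

```python
import re

# Matches counter lines such as "exceeded 0 bytes action: drop".
# Hypothetical pattern, written against the CLI output quoted in this mail.
COUNTER_RE = re.compile(
    r"^\s*(aggregate-forwarded|exceeded|violated)\s+(\d+)\s+bytes\s+action:\s+(\S+)"
)


def dropped_bytes(policer_output: str) -> int:
    """Sum the byte counters whose configured action is 'drop'."""
    total = 0
    for line in policer_output.splitlines():
        m = COUNTER_RE.match(line)
        # Re-marking actions (e.g. policed-dscp-transmit) still transmit,
        # so they are deliberately not counted as drops.
        if m and m.group(3) == "drop":
            total += int(m.group(2))
    return total


# Counter lines for class-default, copied from the output above.
CLASS_DEFAULT = """\
        aggregate-forwarded 10562039041987 bytes action: set-dscp-transmit
        exceeded 297846881624 bytes action: policed-dscp-transmit
        violated 0 bytes action: drop
"""

print(dropped_bytes(CLASS_DEFAULT))  # prints 0
```

With that logic the 297846881624 "exceeded" bytes are left out of the drop total, which matches what the switch actually did with them.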
 
Either that or we’ve got a bug/misconfiguration (or I just don’t get it), so maybe someone else can confirm? Let me know what debug info is needed if it helps :)
 
Cheers!
 
 
Robert Williams
Custodian Data Centre
Email: Robert@CustodianDC.com
http://www.CustodianDC.com