Problem Description:
Refinery’s EMAThroughputSampler is not adhering to the specified GoalThroughputPerSec limit. The problem persists even after reducing the GoalThroughputPerSec value. In addition, the behaviour is observed with both single and multiple Refinery instances.
Cause:
In cases like the above, the cause is usually that the cardinality of the key built from the field list is too high, preventing Refinery from reaching the desired sample rate. Refinery manages throughput with the EMAThroughputSampler by calculating the sample rate needed to achieve the desired throughput, but it always keeps at least one sample for every key. If the cardinality of the key is too high, then any EMA sampler will be unable to achieve the desired rate.
One way to think about this is to divide the keyspace size by the AdjustmentInterval (the default is 15s). The result is the smallest number of traces per second that could possibly be sent. The throughput sampler measures events per second, however, so you need to multiply that figure by the average number of spans in a trace. For example, if the average trace has 20 spans and the cardinality of the keyspace is 900, then 900 / 15 * 20 = 1200 events per second is the minimum that will be sent, and if the keys aren't evenly distributed it will be more.
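The arithmetic above can be sketched as a small calculation. This is only a back-of-the-envelope estimate, not Refinery's own code: the function names and parameters are hypothetical, and the second helper simply rearranges the same formula to estimate how large the keyspace can be before a given goal becomes unreachable.

```python
def min_events_per_sec(keyspace_size, adjustment_interval_sec=15.0, avg_spans_per_trace=1.0):
    """Estimate the lowest events/sec the sampler can emit.

    Assumes at least one trace per key is kept in every adjustment
    interval and that keys are evenly distributed (the best case).
    """
    if adjustment_interval_sec <= 0:
        raise ValueError("adjustment_interval_sec must be positive")
    min_traces_per_sec = keyspace_size / adjustment_interval_sec
    return min_traces_per_sec * avg_spans_per_trace


def max_keyspace_for_goal(goal_events_per_sec, adjustment_interval_sec=15.0, avg_spans_per_trace=1.0):
    """Rearranged form: the largest keyspace for which the goal is still reachable."""
    if avg_spans_per_trace <= 0:
        raise ValueError("avg_spans_per_trace must be positive")
    return goal_events_per_sec * adjustment_interval_sec / avg_spans_per_trace


# Example from the text: 900 keys, 15s interval, 20 spans per trace.
floor = min_events_per_sec(900, 15, 20)
print(floor)  # 1200.0

# With the same interval and trace size, a goal of 500 events/sec
# would need a keyspace of roughly 375 keys or fewer.
limit = max_keyspace_for_goal(500, 15, 20)
print(limit)  # 375.0
```

If GoalThroughputPerSec is set below the first figure, lowering it further will have no visible effect, which matches the symptom described above.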
Investigate with Honeycomb:
To see if that's what's happening, you can look into the Refinery metrics and VISUALIZE the metric AVG(emathroughput_keyspace_size). This will show the cardinality of the keyspace — usually something in the range of 50-300 is most effective. It's possible to go higher but only if the input volume is correspondingly high.
Other Factors:
UseTraceLength can massively increase keyspace cardinality unless trace lengths are extremely consistent, so it should also be a concern if goal throughput is high.
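To illustrate why, here is a simplified sketch of how adding the trace length to the key multiplies the number of distinct keys. It is not Refinery's actual key-building code: the trace records, field names, and the keyspace_size function are made up for the example, and the only assumption carried over is that UseTraceLength makes the span count part of the key.

```python
def keyspace_size(traces, fields, use_trace_length=False):
    """Count distinct sampler keys for a batch of traces.

    Each key is the tuple of the selected field values; when
    use_trace_length is True, the span count is appended to the key.
    """
    keys = set()
    for trace in traces:
        key = tuple(trace["fields"].get(name) for name in fields)
        if use_trace_length:
            key += (trace["span_count"],)
        keys.add(key)
    return len(keys)


# Hypothetical traces: three share the same field values but differ in length.
traces = [
    {"fields": {"http.method": "GET", "status": 200}, "span_count": 12},
    {"fields": {"http.method": "GET", "status": 200}, "span_count": 17},
    {"fields": {"http.method": "GET", "status": 200}, "span_count": 23},
    {"fields": {"http.method": "POST", "status": 500}, "span_count": 8},
]
fields = ["http.method", "status"]

print(keyspace_size(traces, fields))                         # 2
print(keyspace_size(traces, fields, use_trace_length=True))  # 4
```

With real traffic, where span counts vary widely, the same effect can push the keyspace well past the range suggested above.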