Problem statement: My trigger didn’t fire when I expected it to; looking at my events, it appears it should have alerted me.
Cause: We see this often in support cases, and a common cause is event latency: the events that should have fired the trigger had not yet reached Honeycomb when the trigger query covering their timestamps ran. Event latency is the difference between an event's timestamp and when it actually reached Honeycomb.
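To see why a late event can slip past a trigger, here is a rough Python sketch. It is a simplified model and not Honeycomb's actual evaluation logic: it assumes each run looks back over a fixed window and only sees events already ingested at run time. The function name, the dates, and the 5-minute frequency and duration are all hypothetical.

```python
from datetime import datetime, timedelta

def runs_that_see_event(event_ts, ingest_ts, first_run, frequency, duration, n_runs):
    """Return the run times that would include the event.

    Simplified model (an assumption, not Honeycomb's implementation):
    a run sees an event only if the event's timestamp falls inside the
    run's look-back window AND the event was ingested before the run.
    """
    seen = []
    for i in range(n_runs):
        run_at = first_run + i * frequency
        window_start = run_at - duration
        in_window = window_start <= event_ts < run_at
        already_ingested = ingest_ts <= run_at
        if in_window and already_ingested:
            seen.append(run_at)
    return seen

# Hypothetical event: timestamped 07:50:00, ingested 5 min 50 s later.
event_ts = datetime(2024, 1, 1, 7, 50, 0)
ingest_ts = event_ts + timedelta(minutes=5, seconds=50)
first_run = datetime(2024, 1, 1, 7, 50, 0)
five_minutes = timedelta(minutes=5)

# Trigger runs every 5 minutes with a 5-minute duration.
seen_late = runs_that_see_event(event_ts, ingest_ts, first_run, five_minutes, five_minutes, 4)
# Same event with no latency, for comparison.
seen_on_time = runs_that_see_event(event_ts, event_ts, first_run, five_minutes, five_minutes, 4)

print("late event seen by:", seen_late)
print("on-time event seen by:", seen_on_time)
```

In this model the late event is in the 07:55 run's window but has not arrived yet, and by the 08:00 run its timestamp has fallen outside the window, so no run ever counts it.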
How to investigate with Honeycomb:
I was curious whether there was a way to see the latency on a specific event for future troubleshooting, and it turns out there is! You can use a derived column to calculate the latency of a given event in seconds. The expression subtracts the event's end time (its timestamp plus its duration, converted from milliseconds to seconds) from the time Honeycomb ingested it. In your derived column, you can use the INGEST_TIMESTAMP() function like so:
SUB( INGEST_TIMESTAMP(), SUM( EVENT_TIMESTAMP(), DIV( $duration_ms, 1000 ) ) )
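To make the arithmetic in that expression concrete, here is the same calculation as a small Python sketch. The function name and the sample values are hypothetical, and it assumes both timestamps are Unix times in seconds, as the expression above implies.

```python
from datetime import datetime, timezone

def event_latency_seconds(event_timestamp, ingest_timestamp, duration_ms=0.0):
    """Mirror of the derived column: ingest time minus the event's end time.

    The event's end time is its timestamp plus its duration, with the
    duration converted from milliseconds to seconds.
    """
    event_end = event_timestamp.timestamp() + duration_ms / 1000
    return ingest_timestamp.timestamp() - event_end

# Hypothetical values: an event stamped 07:50:00 that took 250 ms
# and reached Honeycomb at 07:55:50.
event_ts = datetime(2024, 1, 1, 7, 50, 0, tzinfo=timezone.utc)
ingest_ts = datetime(2024, 1, 1, 7, 55, 50, tzinfo=timezone.utc)

latency = event_latency_seconds(event_ts, ingest_ts, duration_ms=250)
print(f"event latency: {latency:.2f} s")
```

With these sample values the latency comes out just under 350 seconds, which is the "almost 6 minutes" kind of delay that can push an event past a trigger run.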
Here is an example of an event_latency derived column on a specific event. Although the timestamp on the event says 07:50:00, it was delayed by almost 6 minutes on its way to Honeycomb:
Taking event latency into account when creating a trigger:
On the create trigger screen, you will see an Event Latency History graph next to the duration field. This graph describes the maximum and average amount of delay between the timestamp on the event and when it reached Honeycomb. You can use this data to help choose a duration that captures all your events, even if they are delayed.
For example, if the average event latency is 2 minutes and you want to run your trigger every 5 minutes, choose a 7-minute duration so that delayed events are still captured by the trigger. Note that if your traces span a long time frame, you may see high latency in this chart even though the traces are arriving as soon as they are complete.
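The rule of thumb above is simple addition, sketched here in Python. The helper names are hypothetical, and the coverage check uses a simplified worst-case model (an event arrives just after a run, so it waits a full frequency interval for the next one) rather than Honeycomb's actual scheduling.

```python
from datetime import timedelta

def suggested_duration(frequency, expected_latency):
    """Duration = how often the trigger runs + the latency you want to absorb."""
    return frequency + expected_latency

def covers_delay(frequency, duration, delay):
    """Worst case in this simplified model: the event arrives just after a run,
    so the next run is a full `frequency` later and must still look back far
    enough to include the event's original timestamp."""
    return duration >= frequency + delay

frequency = timedelta(minutes=5)
average_latency = timedelta(minutes=2)

duration = suggested_duration(frequency, average_latency)
print("suggested duration:", duration)

# A 2-minute delay is covered; a delay of almost 6 minutes is not.
print(covers_delay(frequency, duration, timedelta(minutes=2)))
print(covers_delay(frequency, duration, timedelta(minutes=5, seconds=50)))
```

If the Event Latency History graph shows a maximum well above the average, you could pass that larger value as the expected latency, at the cost of a longer duration.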