Description:
On September 15, 2023, our customer experience team identified an issue where Insights dashboards loaded but did not display data.
Upon investigation, our engineering team determined that the root cause was several ETL functions exceeding their allowed run time.
Type of Event:
Insights dashboards loading but not displaying data.
Services/Modules Impacted:
Insights (including Direct Connect and Editor).
Remediation:
The ETL functions were migrated from the SQL server to Insights' Snowflake server, improving ETL speed by 30-50%.
Timeline (AEST):
12th September
15th September
19th September
23rd September
Total Duration of Event:
~7 days and 16 hours.
Root Cause Analysis:
The scalar functions used for Insights' ETL (hourly for live data clients, nightly for all other clients) were exceeding the allowed run time because they ran on the SQL server rather than on Insights' Snowflake server. Once they were moved to the Snowflake server, the ETL completed within the timeout.
Preventative Action:
We will optimize the data extraction method that Insights uses and improve proactive ETL monitoring to prevent timeout issues in the future.
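To illustrate the kind of proactive check this monitoring could include, the following is a minimal sketch and not the actual implementation. It flags ETL runs whose duration is approaching the timeout so they can be investigated before they fail. The `EtlRun` type, the `runs_near_timeout` function, the job names, the 60-minute timeout, and the 80% warning ratio are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class EtlRun:
    job_name: str        # hypothetical job identifier
    duration: timedelta  # wall-clock time the run took


def runs_near_timeout(runs, timeout, warn_ratio=0.8):
    """Return the runs whose duration reached warn_ratio of the timeout."""
    threshold = timeout * warn_ratio
    return [run for run in runs if run.duration >= threshold]


# Illustrative values only; real durations and limits would come from ETL logs.
runs = [
    EtlRun("hourly_live_clients", timedelta(minutes=52)),
    EtlRun("nightly_all_clients", timedelta(minutes=20)),
]
flagged = runs_near_timeout(runs, timeout=timedelta(minutes=60))
for run in flagged:
    print(f"WARNING: {run.job_name} took {run.duration}, close to the timeout")
```

Warning at a fraction of the limit, rather than only on failure, gives time to act while dashboards are still displaying data.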