Credit Karma’s goal is to save Americans time and money. Through analysis of more than 50 million members’ finances, Credit Karma researches and recommends credit cards, loans and insurance based on each individual’s specific credit profile, drastically simplifying some of the most confusing and tedious yet important tasks in personal finance.
The company started by providing free credit scores to recreate the financial industry around people instead of banks. It continues to expand its completely free offerings including its Credit Score Simulator, credit monitoring and friendly, personalized information to help each person understand and make the most of their individual situation.
As self-described “data nerds,” the Credit Karma team spends a lot of time crunching the data from its website and mobile apps to ensure that its 50 million members are getting the most out of its solution. All told, Credit Karma has hundreds of thousands of business and technical metrics to monitor to keep the business running smoothly, and multiple teams within Credit Karma have used various self-developed tools to do so.
Yet the company faced a delay of at least 24 hours, and sometimes much longer, before it could identify important business incidents. In one case described by Credit Karma’s Senior Product Manager Pedro Silva, the revenue for a specific page had decreased by 50% over the course of three days before the team was able to identify the issue. It then took additional time to determine the root cause, which was hard to find: it turned out to be a technical update to the front end of the system made several days earlier.
“If the entire service were down, we would know pretty quickly, of course,” Silva explained, “but for errors impacting a specific page, offer, browser, platform or feature – all of which affect revenue and customer satisfaction – we would not know for at least a day or two using other analytics tools, and Anodot finds them right away.”
Silva and the Credit Karma team sought a solution that would quickly identify when a metric increases or decreases and then provide enough information to help pinpoint the root cause, so that the issue can be resolved before business suffers and revenue is lost.
After testing multiple solutions and considering in-house development, Credit Karma selected Anodot for its business incident detection.
“To test Anodot, we streamed a subset of six months of historical data to see if Anodot would find the same anomalies we had found manually, and it did,” Silva said. “It was clear very quickly that Anodot provides a ton of value to both business and technical teams.”
Credit Karma had considered further developing its in-house solution to eliminate business incident detection latency. However, the company determined that it made more sense to use Anodot, since it provided exactly the solution they needed in a user-friendly interface, freeing the in-house data scientists to bring more value to the company.
Silva explained that the other analytics solutions they tested were not feasible for them: “The other solutions on the market require manual setting of thresholds for business incident detection, which is not scalable for a company like us, with the large number and complexity of metrics we need to track. Anodot sets itself apart with automatic anomaly detection, rather than manually setting thresholds. We also found the Anodot user interface to be more intuitive and easier to use than the other solutions we tried.”
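To make the scaling problem Silva describes concrete, here is a minimal sketch contrasting a hand-set threshold with a baseline learned from each metric's own recent history. It is not Anodot's algorithm, which this case study does not describe; the rolling median/MAD rule, the function name `rolling_anomalies`, its parameters, and the revenue figures are all illustrative assumptions.

```python
from statistics import median

def rolling_anomalies(values, window=7, k=4.0):
    """Flag indices whose value deviates from the trailing-window median
    by more than k times that window's median absolute deviation (MAD).
    Illustrative rule only; not the method used by any specific product."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        center = median(history)
        mad = median([abs(v - center) for v in history])
        # Guard against a zero MAD when the history is perfectly flat.
        scale = mad if mad > 0 else 1e-9
        if abs(values[i] - center) / scale > k:
            flagged.append(i)
    return flagged

# Hypothetical daily revenue for two pages on very different scales.
# Both drop by roughly 50% on day 8.
page_a = [1000, 1020, 990, 1010, 1005, 995, 1015, 1000, 500, 510]
page_b = [40, 42, 39, 41, 40, 38, 41, 40, 20, 21]

STATIC_THRESHOLD = 600  # hand-tuned for page_a only

static_a = [i for i, v in enumerate(page_a) if v < STATIC_THRESHOLD]
static_b = [i for i, v in enumerate(page_b) if v < STATIC_THRESHOLD]

print(static_a)                   # [8, 9]: the threshold works for page_a
print(static_b)                   # every day is flagged, so the alert is useless
print(rolling_anomalies(page_a))  # [8, 9]
print(rolling_anomalies(page_b))  # [8, 9]
```

The single threshold tuned for one page misfires on the other, so each metric would need its own hand-maintained value. The self-calibrating rule catches the drop on both pages with no per-metric tuning, which is the property that matters when the metric count runs into the hundreds of thousands.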
In the selection process, Silva stressed that the proof-of-concept phase was straightforward and easy to implement with Anodot. “Anodot has been so easy to work with,” he remarked. “One of our engineers submitted a bug report and Anodot resolved it overnight. We have had a really pleasant experience setting the system up and getting everyone on board.”
Today, using Anodot, Credit Karma identifies several relevant and actionable incidents each day, that is, issues that have a business or technical impact. Examples of such incidents that Anodot identified rapidly:
The benefit of using Anodot is that these types of incidents and many others are detected in real time, and can be dealt with quickly and appropriately.
Three Credit Karma teams are currently relying on Anodot’s business incident detection:
The eventual goal is to roll out access to Anodot to all the teams in the company.