
Anomaly Detection – A Novel Approach


One of the harder things to do in monitoring system health, or even brand health, is detecting anomalies – “events” that may be out of the ordinary. It gets harder when your data fluctuates frequently or when you’re trying to build a model that can be applied to dramatically different datasets. This topic has been discussed many times and contested with different theories, mathematics, and approaches, all aimed at avoiding “alert fatigue”. Of course, I had to try and do it differently! Let’s talk about the approach I’ve been testing out.

TL;DR – I’m testing out a model that looks at the velocity vector moving average and the derivative moving average. By looking at 3 time series data points of the derivative in the past and extrapolating 3 increments into the future, paired with the velocity vector, we get a good idea of when an anomaly may be happening.

I’ve explored many different approaches, including sophisticated machine learning methods. However, one afternoon I had a thought about looking at the problem a different way, borrowing methods from day trading, physics, and calculus. The approach is simple enough: look at the change in slope against the moving average. The reality is that there’s a lot more to getting it to work. And now, for the deconstruction…

Acceleration Moving Average

The first portion of this theory is to take a look at the acceleration moving average. This is often used in day trading as an indicator of a dramatic shift in direction that outpaces prior accelerations. In this portion of the formula, we use the standard acceleration formula:

a = ∆v/∆t

For each time series increment, we store the calculated value of a. From there, we compare it against the moving average. Internally, we have tested a 14 day moving average on a 10 minute time series: for each 10 minute increment, we compare the current acceleration against the moving average. However, as you can imagine, this can fluctuate quite dramatically and cause alerts to be sent that shouldn’t be. The risk of looking at this in isolation is that you set a static threshold – i.e. if the current acceleration is greater than the acceleration moving average by 20%, send an alert. Where this really breaks down is when you get multiple spikes over the course of a day, with each subsequent spike being smaller in volume (but still notable). Since the moving average increases to account for the most recent spike, you lose out on the subsequent spikes. A code sketch of this check follows, and an example chart is below.
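
As a rough illustration, here is a minimal sketch of that static-threshold check in Python (my own code, not what runs internally; the function names are hypothetical, the 20% threshold mirrors the example above, and 2,016 points corresponds to a 14 day window of 10 minute buckets):

```python
import numpy as np

def acceleration(volume, dt=1.0):
    """Approximate acceleration as the change in velocity (dv/dt),
    where velocity is the change in volume per time step."""
    velocity = np.diff(np.asarray(volume, dtype=float)) / dt
    return np.diff(velocity) / dt

def static_threshold_alerts(volume, window=2016, threshold=0.20):
    """Flag points where the current acceleration exceeds the trailing
    moving average by more than `threshold` (e.g. 20%).
    window=2016 ~= 14 days of 10-minute buckets."""
    accel = np.abs(acceleration(volume))   # magnitude of acceleration (an assumption)
    alerts = []
    for i in range(window, len(accel)):
        ma = accel[i - window:i].mean()
        if ma > 0 and accel[i] > ma * (1 + threshold):
            alerts.append(i)
    return alerts
```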

[Figure: Twitter volume over time, with a large spike around 12/1/15 and smaller spikes around 12/15/15 and 12/18/15, plotted against the acceleration moving average]

If you look at the large red line right around 12/1/15, you’ll notice that if we were to use a moving average, the moving average line would be pulled up dramatically. This causes the subsequent events around 12/15/15 and 12/18/15 to be missed. While the acceleration moving average is a novel approach, we’ve found that it isn’t as useful as we’d like. It is often led astray by wild fluctuations in volume and has a high propensity to trigger alerts that aren’t actually needed – such as the above. This led us to look at a different approach.

Velocity Vector

Vectors allow us to quantify an object’s direction and magnitude. When looking at an anomaly, we want to understand its direction of movement on an x,y axis and pair that with the magnitude of the volume. We could arguably get rid of the acceleration moving average at this point, as the two effectively become the same thing once we look at the moving average. Now, the velocity vector gives us a bit of real-time understanding of what is happening to our volume. See the example below.

[Figure: velocity vectors plotted against Twitter volume over time]
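
As a sketch of how the velocity vector might be computed at each step – the angle-plus-magnitude representation and the function names here are my own illustration, not a description of our internal implementation:

```python
import numpy as np

def velocity_vectors(volume, dt=1.0):
    """For each time step, describe the move as a vector in the (time, volume)
    plane: its direction (angle of the change) and its magnitude (length)."""
    dv = np.diff(np.asarray(volume, dtype=float))
    direction = np.arctan2(dv, dt)   # direction of movement on the x,y axis
    magnitude = np.hypot(dt, dv)     # size of the move
    return direction, magnitude

def above_moving_average(series, window=2016):
    """Compare the most recent value against its trailing moving average
    (2,016 points ~= 14 days of 10-minute buckets)."""
    series = np.asarray(series, dtype=float)
    return series[-1] > series[-window - 1:-1].mean()
```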

When analyzing Twitter volume, the data can be sporadic. Even when comparing the velocity vector moving average against the current value, we still find that alarms are triggered more frequently than we’d like. This is primarily because the data isn’t smoothed out. Meaning, we get snapshots of volume at different time frames as whole numbers, such as 10, 50, 34, etc. This makes it hard to discern the significance of a change in the vector portion of the velocity vector. This brings us to the third portion of the formula.

Fourier Smoothing

Since Twitter volume data comes in as chunks of whole numbers, our vectors change dramatically, which renders the prior work useless. Velocity vectors appear to really only be useful when the data is smoothed out between the actual time series counts. For example, if we have the two data points 1 and 5, we’d actually want to fill in the difference with 1.1, 1.2, 1.3, 1.4, and so on. In an interesting way, Twitter volume data can sometimes look like audio signal data, in the sense that it can be incredibly choppy. In order to smooth it out, we can use Fourier smoothing to create a nice-looking dataset as the Twitter volume counts come in. Below is an example of Fourier smoothing, where we take discrete values of temperature by day and smooth out the data using this technique.

[Figure: Fourier smoothing applied to discrete daily temperature readings]
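
One common way to do this kind of smoothing is a low-pass filter in the Fourier domain. A minimal sketch (the keep_fraction value is purely illustrative):

```python
import numpy as np

def fourier_smooth(values, keep_fraction=0.1):
    """Smooth a choppy series by keeping only the lowest-frequency Fourier
    components and transforming back to the time domain."""
    values = np.asarray(values, dtype=float)
    spectrum = np.fft.rfft(values)
    cutoff = max(1, int(len(spectrum) * keep_fraction))
    spectrum[cutoff:] = 0            # drop the high-frequency (choppy) components
    return np.fft.irfft(spectrum, n=len(values))
```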

Now when we look at the velocity vector moving average, the value becomes much more stable and doesn’t change nearly as much as it did when no smoothing was applied. If we look at the velocity vector in 10 minute increments against a 14 day moving average, we get some nice insight into the different fluctuations happening. However, we’re still only looking at the current state, and we still don’t have a good way of letting the machine tell us not only when to trigger something, but when something might happen. To solve the predictive portion of that problem, we looked to derivatives.

Derivatives

Since Fourier smoothing of the dataset gives us nice, smooth curves, we can easily calculate the derivative at any data point at any given time. In our environment, we have tested looking at the derivative at each 10 minute increment. Since the derivative gives us a tangent line that theoretically extends into both the past and the future, we look up to 3 time series increments into the future and the past. From there, we calculate the change along the y axis of that tangent line. See the example below.

[Figure: a tangent line to a curve, illustrating the derivative at a single point]
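
A minimal sketch of that tangent-line extrapolation – the central-difference estimate and the helper name are my own illustration, and steps=3 matches the 3-increment look-ahead described above:

```python
import numpy as np

def derivative_slope_change(smoothed, i, steps=3, dt=1.0):
    """Estimate the derivative at point i of the smoothed series, extend the
    tangent line `steps` increments into the past and future, and return the
    total change along the y axis over that span."""
    smoothed = np.asarray(smoothed, dtype=float)
    slope = (smoothed[i + 1] - smoothed[i - 1]) / (2 * dt)  # central difference
    y_future = smoothed[i] + slope * steps * dt
    y_past = smoothed[i] - slope * steps * dt
    return y_future - y_past
```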

By doing this, we can predict what the change in the derivative will be up to 30 minutes before we get to that point. This is key because we’re looking specifically at the slope of an extrapolated derivative. But how do we know when an anomaly may happen? We look at the moving average of the change in derivative slope over the past 14 days. If the current change in slope exceeds the moving average, we likely have an anomaly on our hands. However, we have found this to be a bit too sensitive on its own, which led us to create a combination of both the velocity vector moving average and the derivative slope moving average.

By combining both, we force a decision to be made. If the velocity vector is within the moving average but the derivative slope isn’t, it is most likely not an anomaly, and vice versa. What I did find, though, is that if both the derivative slope and the velocity vector exceed the moving average, it’s a strong indication that an anomaly is happening or will happen. I’ve also tried pairing this with a dynamic threshold of 1 standard deviation above the moving average. Adding this creates a system that only pulls out the most extreme anomalies. In further tests, I’ll probably experiment with using units of standard deviation as a way to create a more or less sensitive alerting system – almost like a user-driven knob or refinement method.
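
Here is a minimal sketch of that combined rule, with n_sigma acting as the user-driven sensitivity knob mentioned above (again, illustrative code rather than the production implementation):

```python
import numpy as np

def is_anomaly(velocity_history, slope_change_history, n_sigma=1.0):
    """Flag an anomaly only when BOTH the current velocity vector magnitude
    and the current derivative slope change exceed their trailing moving
    averages by at least n_sigma standard deviations. Each history is the
    trailing 14-day window with the current value appended last."""
    def exceeds(history):
        history = np.asarray(history, dtype=float)
        past, current = history[:-1], history[-1]
        return current > past.mean() + n_sigma * past.std()
    return exceeds(velocity_history) and exceeds(slope_change_history)
```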


Is this a finished approach? Absolutely not. There are a lot of challenges in getting this to work properly enough to meet some sort of statistical rigor. That said, I’ve been encouraged by the early results when looking at real examples of events happening with our customers in the Twittersphere. So far, I’ve seen a decent amount of success predicting when an anomaly may be happening. There are other methods we could use to refine the model, such as tracking an F-score (which combines precision and recall) to better measure prediction accuracy.

New vs. Upgrade Features: How to manage the process


Being a product manager has its perks, such as being able to create products from scratch. There really is no better feeling than seeing something go live for the first time and seeing the excitement from your users. When you hear that customers love your product, you get that euphoric feeling of “Yes! We did it!” However, that is not all of product management. In fact, it’s probably a much smaller subset than most people think. The less glamorous portion of product management is upgrading the old bits of your product.

If you’re a software company, there is a good chance you’ve accrued tech debt and made decisions to forgo development on portions of your product. These are the “nice to have/RFP checkbox” features that you create to get into deals but neglect to keep up to date. Oftentimes this comes back to bite you in the ass later, but it’s a known tradeoff. A common challenge many product managers face (myself included) is delivering new features while updating old ones. For example, if you’re creating a new product from scratch on a new front end architecture, you’re going to want to port that over to the rest of the product to reduce tech debt. However, doing that is a challenge because a large portion of the old platform is built on legacy code. Without rewriting the entire platform at once, how do you go about making incremental updates while delivering new features to stay competitive?

There’s no silver bullet answer to this question. But since this is a blog post, I do have a potential answer! The method I’ve used in the past is to change the product incrementally over time, then make the full switch once most of the platform is complete. Obvious answer, right? It’s much more challenging to actually do in practice.

For example: let’s assume we have 10 pages that each have a unique visualization on them. The visualizations use a myriad of different visual libraries to render the data we’re serving up. Additionally, the framework we’re on is a legacy one, and we want to decouple the front end from the back end using APIs. The wrong way to go about the upgrade is to try to build it all at once. By committing all of your resources to the switch, you leave no room for new feature development that may be critical to closing deals.

There are 3 key items I look for when tackling an upgrade:

  • What area of the product has the highest customer pain point?
  • What area of the product has the most customer usage?
  • What area of the product has the highest reusability?

We want to look at each of these questions individually because they each carry their own weight when prioritizing what gets rebuilt and when. The item that scores highest across these 3 questions is the one that should be rebuilt first. By creating a list of these items, it becomes much easier to build an upgrade path that allows new development on the new framework to be sprinkled into the roadmap. Upgrading a platform is a process and should rarely be an abrupt switch. Some may disagree here, but the reasoning is that not all items in a platform hold equal value to the customer – especially in enterprise. This means we want to be extremely targeted about where we spend our time in order to create the biggest positive impact.

In my experience, the best way to prove out the upgrade path is to scope the development down to a thin, vertical slice of the workflow. Meaning, when you come out of your prototype or MVP, your customers should be able to use the product end to end for the specific item you upgraded. The key here is to build out this thin, vertical solution as fast as possible in order to get it in front of customers sooner. This shows customers that you are actively upgrading the areas of the product that need help, and it also validates your direction on the upgrades. From there, you can start to move either horizontally or vertically with the upgrades. Horizontal means taking a component you built and reusing it in different areas of the platform to keep consistency. Vertical means taking the same type of thin workflow solution and replicating it across different parts of your platform.

Since the work is broken down into short bursts and compacted into bite-sized developments, you can continue delivering new features that adhere to the new design or engineering efforts. This means that for every feature you introduce, you should be using the newly upgraded code or design. By doing this, you’re doubling the speed at which you deliver the new experience, getting your customers to a happier spot.

In my opinion, there’s never a “right” way to do platform-wide updates, primarily because it will always be painful (whether for you or your customers). There will be hurdles, there will be confused customers, and there will be short-term needs to support a larger code base. However, the method above helps reduce the risk and pain significantly by compartmentalizing the breadth of impact into smaller chunks while simultaneously providing more rapid feedback on your upgrades.

The Commoditization of Data: Where real value is moving to


I’ve been doing a lot of consulting lately for large groups in the personalization and analytics space. There seems to be a common trend among many of the questions I’m asked: How do all of these providers differentiate? Where is the value for enterprises looking to deploy the new software coming into the space?

In all fairness, it’s not an easy question to answer. There are a lot of moving parts and competitive pressures forcing enterprises (and software vendors) to think differently about how true value is delivered to the end user. On one hand, you have a highly competitive marketing landscape at the very top, all competing for the same business. These are analytics providers that typically collect torrents of data about your users, and they are often channel-focused. On the other hand, you have IaaS providers, primarily Amazon AWS, who are completely commoditizing this space. Amazon makes it easy for in-house teams to create and deploy custom analytics applications. Further, they’re starting to offer Business Intelligence tools and Machine Learning capabilities that are simple to deploy. This is putting big pressure on the software vendors to differentiate themselves.

With the marketing software market expected to grow by around 17% by 2019, to over $56B in total market size, the need to provide real value will come much more rapidly than many vendors expect. Data collection, management, and querying are basically table stakes for enterprises evaluating these vendors. The real shift is toward actionable insight, predictive analytics, and utilizing machine learning to understand users at an implicit, behavioral level.

To get a better perspective on where value is shifting, let’s break down the following image.

[Figure: the marketing software value stack – data collection & management at the bottom, then analytics, then segmentation & user targeting, with insights and autonomous reactions at the top]

At the very bottom, we have “Data Collection & Management”. I’m using this as a catch-all term for anything around database systems: data collection, management, transformation, etc. While there are many criteria for a solid data practice, we can safely say that this space is very commoditized at this point. Many analytics vendors are collecting the same or very similar data, and enterprises no longer want to wait until after the fact to decide what to do next (especially with digital marketing strategies). With Amazon making this layer of the stack so easy, it is no longer a competitive advantage or a value proposition for these software providers. Up the stack we go.

Pure analytics providers are being eaten away by Amazon or startup disruptors. Vendors like Mixpanel and Swrve are pushing heavily against incumbents like Omniture to price more competitively and offer better-differentiated value. Since Google Analytics is a robust offering for free, there’s pressure to provide something differentiated. Additionally, current software technology makes it easy to build visualizations (think d3.js), transform data, or pull data in from different sources, making this part of the stack, like the former, commoditized. This isn’t to say that you can get by without the analytics or data parts of the stack – you absolutely have to have both in order to move into the areas where real value is built. Up the chain we go.

Segmentation and user targeting is getting closer to the point of commoditization but isn’t there yet. We’re still seeing new forms of targeting drive value and decision making when enterprises evaluate vendors. Oftentimes these newer vendors give enterprises the ability to track and target customers across channels rather than within a single channel. This has been a big push by many of the leading research firms, such as Forrester and Gartner, who have termed it the “Unified Customer View”. In my opinion, this is most of the market today, and many of the vendors in this space are, in some form or fashion, able to accomplish it. There are additional features that add to the value proposition, such as CMS integrations, triggered events, and automated segmentation. However, the challenge is that the brunt of the work is still put on the marketer. The reality is that marketers are having to do more with less, and time is not on their side. This portion of the tech stack is the current focus of commoditization, as machine learning moves toward being able to understand and react to user behavior.

This brings us up to the present day and the future, which is the meat of where I believe the value is moving.


The future is always unclear, but here is my prediction: software that automatically surfaces up general insights, unique insights, predictive insights, and autonomous reactions is going to be king. To break it down, let’s look at each of these items.

General Insights

In my eyes, general insights are the insights into an audience that users of a software vendor would look at on a daily basis. This might be something like audience health (DAU:MAU or audience return frequency), which helps keep a pulse on the audience; a quick sketch of the DAU:MAU metric is below. In the future, I anticipate that vendors will automatically surface up a suite of insights, based on vertical, that users get as more of a “state of the union” dashboard, with the obvious ability to add their own automated metrics via a nice query builder of sorts.
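
As a minimal illustration of one such general metric, here is the DAU:MAU “stickiness” ratio (the numbers in the usage example are made up):

```python
def dau_mau_stickiness(daily_active_users, monthly_active_users):
    """DAU:MAU ratio: the share of monthly active users who also
    show up on a given day - a simple pulse on audience health."""
    return daily_active_users / monthly_active_users

# e.g. 15,000 daily actives out of 60,000 monthly actives -> 0.25
print(dau_mau_stickiness(15_000, 60_000))
```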

Unique Insights

Users don’t always know where to look for insights, nor do they always know the right questions to ask. Since marketers can’t constantly look at all of the crazy data points they’re collecting and the correlations between them, future software vendors will have to move here in order to meet these demands. These types of insights would come from the machine crunching sophisticated machine learning models each night in order to surface up insights the model believes the user should know about. You can think of this as “show me what I don’t know”. An example might be something like: “Your broadcast marketing campaign to all users had 20% higher engagement with French male users in the Active segment of your audience”. The software surfaces up the things the marketer doesn’t even know to think about.

Predictive Insights

As the name suggests, these are insights based on predictive modeling. Since these vendors already collect troves of data on user activity and behavior, there’s a world where machine learning models can predict the performance of campaigns, when to send them, when a user may be about to churn, and so on. These are insights that help the marketer be proactive about their approach when engaging with users or handling changes within their digital ecosystem. There are lots of vendors playing in this space right now as point solutions, but very few (if any) have a buttoned-up product that is making a significant difference yet. The reason is that it is hard to build a generalized machine learning model that can find covariance between the different data points collected in a way that works across different verticals. This isn’t to say it’s impossible, but it will take more time to get to a really good spot here. This is where the marketer gets exceptionally high value, because it moves them from the reactive analytics world to the proactive, “autonomous” world.

Autonomous Reactions

Building on the previous item, autonomous reactions are predominantly about automating many of the mundane tasks the marketer has to do today – for example, setting up an automated trigger to email a user when they perform a specific event on a channel. In the future, high-value world, software vendors will build machine learning models along the lines of current developments in artificial intelligence. The system will know when to reach out to users, and with what type of messaging, based on variable inputs at each point of the user lifecycle. The user can pinpoint where positive or negative behaviors may occur and assign value weightings to those data points, but it is the machine learning model that optimizes the user journey. This is along the lines of factorial experimentation with machine learning models that build on notions such as Markov Chain Monte Carlo (MCMC) simulation. This helps the marketer focus on the things that are much harder to automate, such as their acquisition and churn reduction/retention strategies.


With the above points in play, the only way this is possible is for these sophisticated machine learning models to understand both explicit and implicit user behavior. Today, we have a lot of explicit behavior, since it is easily tracked (session count, time on page, etc.). The real value is in understanding implicit user behavior. Implicit behavior is extremely valuable because you can tie it to personas, which help dictate how you market to those personas. For example, say I’m an unknown user on a travel site looking for a vacation somewhere with sandy beaches, sun, and warm weather. I start looking at different beach destinations in tropical regions, search for different locations, input different fields about what temperature range I want, and so on. I interact with the site on a more intimate basis. In the background, a model is crunching an analysis of who I am, which can inform the broader system, specifically the CMS, about what types of content I may like.

The reason this is so fundamentally important is that we can understand who our users are without explicitly knowing who they are on an authenticated basis. Additionally, we can serve up very specific content based on a granular understanding of what the user is interested in. This is exceptionally powerful because you’re able to achieve a deeply personalized connection with your user. A great example of this is Spotify’s “Discover Weekly” engine. People have described this weekly curated list of songs, based on who you are, as “creepy awesome”. What is effectively happening is that Spotify is crunching hundreds of intimate data points on your listening habits and building a playlist each week customized for you, and you only. Some examples of these data points:

  • Genres of songs frequently played
  • Sub-Genres of songs frequently played
  • Did user skip within first 30 seconds?
  • What type of track is it? (high energy, relaxing, etc.)

It’s a beautiful example of personalization that is damn near perfect. Much like the Discover Weekly engine, software vendors in the marketing space are going to need to get to the level where they can curate and deliver content at an extremely granular and personal level. As the stack moves upwards in value due to commoditization, the next battle will be fought in the world of insights, personalization, and proactive engagement. It’s an exciting time to be watching and building products in this line of work. I welcome the day when brands know me well enough to know when to engage with me, how to engage, in what form, at what frequency, in what context, with what content, and so much more.


Agree? Disagree? Let us hear your thoughts on the next generation of marketing software!