During my time at Localytics there was a drastic movement towards getting deeper insight into the customer lifecycle (acquisition to engagement). It’s the holy grail for any marketer to understand where a user is in their journey, where they might go next, and when they may fall off. This led me to construct a theory around how we look at user data so that we can better understand what forces push a user in one direction or pull them in another. The theory breaks down into two major components: tracking user behavior in three dimensions instead of two, and utilizing deep learning networks to provide insight into where a user’s behavior is trending.
Many (if not all) of the current platforms on the market that offer some level of personalization or optimization view a single marketing channel in a two-dimensional space. What I mean by this is that, for example, we would look at something like conversion rate vs. time on a website. We could even get a bit more technical and say conversion rate by a unique user over time to get more granular. Very few platforms provide a holistic (barf, buzzword) user-centric view of how individual users are interacting with each of the individual channels.
As enterprises shift dramatically towards what everyone is calling “digital transformation”, a specific trend is surfacing: brands need to be wherever their users are, across all distribution mediums. What this means is that you’re not mobile first, web first, or social first, but rather taking the stance that your users will engage with your brand on one of many different channels, and on those channels in many different forms. For example, a user engaging with the “social” channel could mean Twitter, Pinterest, Facebook, or some random forum, with their action being a like, a comment, or a share.
If we go back to viewing marketing channels as two-dimensional spaces, we can start to get some idea of what this looks like on the X and Y axes. For example:
- X-Axis = Time
- Y-Axis = Conversion Rate
- Graph = Individual User Level
Today, you would do this for many different channels and then try to discern what is actually happening. I’ve seen this manifest in Excel spreadsheets where the rows look something like:
- Table = Campaign
- Time Series = Week over Week on 6 month basis
- Mobile App = 3.5% conversion rate on “X” event/trigger
- Website = 3.2 minute avg. session length
- Twitter = 2 #’s or @’s referencing “X” brand
- Email = .03% open rate
| Week | Campaign ID | Mobile App CR% | Website Avg. Session | Twitter Interactions | Email Open Rate | Push CR% |
|---|---|---|---|---|---|---|
| This Week | 1995803 | 3.5% | 3.2 min | 2 | .03% | 10% |
| Last Week | 1994068 | 3.2% | 3 min | 1 | 1.5% | 9% |
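Even in this flat form, the first thing that gets computed is the week-over-week delta. As a trivial sketch (the metric names are my own stand-ins for the table columns above), that looks like:

```python
# Hypothetical snapshot rows mirroring the example table.
this_week = {"mobile_app_cr": 3.5, "website_session_min": 3.2,
             "twitter_interactions": 2, "email_open_rate": 0.03, "push_cr": 10.0}
last_week = {"mobile_app_cr": 3.2, "website_session_min": 3.0,
             "twitter_interactions": 1, "email_open_rate": 1.5, "push_cr": 9.0}

# Week-over-week deltas: the raw numbers marketers end up surfacing as KPIs.
deltas = {k: round(this_week[k] - last_week[k], 2) for k in this_week}
# e.g. deltas["mobile_app_cr"] -> 0.3
```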
A listing of metrics like this raises questions such as: Was this campaign successful? Did our messaging turn off our users or increase retention? Did any users churn week over week? What if some users under a campaign show signs of churning but, in reality, are just not engaging with your emails? Were there outside factors influencing these campaigns, such as weather or holidays? Are the results I’m seeing statistically significant enough to trust?
Marketers get stuck with a ton of open-ended questions and no clue what any of it means. It’s complete data overload, and what they end up doing is surfacing the deltas between the weeks as their KPIs, cherry-picking the best, etc. It’s not a good situation overall.
The theory goes like this: when computing aggregate metrics or trying to understand multi-channel marketing efforts, viewing users in three dimensions instead of two surfaces much deeper insights. This manifests as four quadrants that generalize the overall user behavior that can occur on any channel:
- Highly Engaged (top right)
- Engaged but Not Responding (bottom right)
- Responding but Not Engaged (top left)
- Not Engaged and Not Responding (bottom left)
We want to keep these as generalized quadrants because there needs to be ample room for interpretation based on channel, vertical, or business context. In my experiments, it’s easiest to contain the graph on a scale of 0-100. We’ll come back to this later.
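To make the quadrants concrete, here’s a minimal sketch of how a position on that 0-100 scale could map to a quadrant label. The axis names are assumptions borrowed from the app example later on, and the center point is taken to be (50, 50):

```python
def quadrant(app_engagement: float, marketing_engagement: float) -> str:
    """Map a user's position on the 0-100 chart to one of the four
    generalized behavior quadrants. Axis names are my own choice;
    the chart's center is (50, 50)."""
    if app_engagement >= 50 and marketing_engagement >= 50:
        return "Highly Engaged"                   # top right
    if app_engagement >= 50:
        return "Engaged but Not Responding"       # bottom right
    if marketing_engagement >= 50:
        return "Responding but Not Engaged"       # top left
    return "Not Engaged and Not Responding"       # bottom left
```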
Here’s the hypothesis that we’ll work from, viewing the above theory from an app perspective:
If we think of audiences/users in 3 dimensions with more aggregate-type metrics, we can provide a more prescriptive insight into audiences/users based on many summary metrics. Additionally, we can surface audience/user movements automatically without the need for customer input.
We have our hypothesis, but now we need to add general titles for the X and Y axes. This looks something like:
On our X-Axis we have App Engagement, which could be an aggregated metric based on many different individual values. On the Y-Axis we have Marketing Engagement, which could be attributed to all the different mediums a user might interact with when on their mobile device. We also have our 4 quadrants, which help us position users. For the above graph, I swapped out “Not Engaged and Not Responding” with “Risk of Churn”, which can be viewed as synonymous.
In this world, new users would be placed onto the chart in the very center and, based on their actions, start to form a vector for their behavior. So once you start to have users on the chart, it may look something like:
With many analytics or marketing platforms, there are APIs that allow you to pull data on a scheduled basis. In our scenario here, we would want to pull data from many different data sources within our ecosystem into a nightly or weekly snapshot. This could be a super flattened JSON file that stores performance data for campaigns, segments, audiences, or individual users. We don’t want to pull very specific data, such as when a user engaged with a campaign, but rather aggregate or composite metrics (i.e. 7-day retention).
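As a sketch of what that snapshot job might look like (the API clients here are hypothetical stand-ins and the metric names are my own), the core is just merging composite metrics per user into one flat structure:

```python
from datetime import date

def build_snapshot(sources):
    """Merge aggregate metrics from several channel APIs into one
    flattened snapshot keyed by user. Each source is a callable
    returning {user_id: {metric: value}} of composite metrics only
    (e.g. 7-day retention), never raw event-level data."""
    snapshot = {"date": date.today().isoformat(), "users": {}}
    for fetch in sources:
        for user_id, metrics in fetch().items():
            snapshot["users"].setdefault(user_id, {}).update(metrics)
    return snapshot

# Hypothetical stand-ins for real channel API clients:
app_api = lambda: {"u1": {"retention_7d": 0.42, "sessions_7d": 5}}
email_api = lambda: {"u1": {"open_rate_7d": 0.031}}

snapshot = build_snapshot([app_api, email_api])
```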
This nightly or weekly snapshot of composite metrics allows us to identify anomalies, user behavior shifts over time, and assign a “vector” of movement. From there, the vectors of movement can be charted to look something like this:
Now we’re getting somewhere interesting. On each of the axes you can see different types of metrics that may contribute to the aggregate or composite metrics. Since we now have day-over-day or week-over-week deltas of user behavior, we can plot vectors. Different vector deltas may have different positive, neutral, or negative indicators associated with them. In the above example, we have 2 green users who are in the “Risk of Churn” section, but they may have both exhibited a significant change in user behavior based on the aggregated metrics. These changes may signify that the user is being recovered and coming back towards a healthier state with regards to App Engagement and Marketing Engagement.
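A minimal sketch of turning two snapshot positions into one of these movement vectors (coordinates are on the 0-100 chart, and the positive/neutral/negative indicator is a deliberate oversimplification of my own):

```python
def behavior_vector(prev, curr):
    """Movement vector between two snapshot positions
    (app engagement, marketing engagement) on the 0-100 chart.
    Heuristic: net movement toward the top right reads as positive,
    toward the bottom left as negative."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    magnitude = (dx ** 2 + dy ** 2) ** 0.5
    if dx + dy > 0:
        indicator = "positive"
    elif dx + dy < 0:
        indicator = "negative"
    else:
        indicator = "neutral"
    return (dx, dy), magnitude, indicator
```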
By doing this, we can identify anomalies, find thresholds where we may want to intervene with outbound marketing outreach, predict where an audience/user may be moving based on their vector, and get a better overall view into where users may be in their lifecycle journey. Up until now, however, we’ve viewed this in 2 dimensions. If we add a 3rd dimension to our chart, it becomes much more interesting. Let’s assign an App Goal as the Z-Axis, which could be something like “Increase App Engagement”, where the metric treats any increase in engagement as a positive influence. This starts to look like the following:
Now that we’re viewing this in 3D, we have the opportunity to see where a specific user sits on each of the axes that we care about. For example, we may notice that the blue user in the top right has about a 7.5 on Marketing Engagement, 7 on App Engagement, and 7 on App Goal completion. This user would be considered a safe user and is stable in what we’d consider the “Highly Engaged” quadrant. However, on the bottom with our turquoise user, we’re seeing trouble. The user is low on App Goals, very low on Marketing Engagement, but has decent App Engagement. If we had a vector assigned to this, we could see the direction in which the user is heading to determine whether this is a user who is moving into the “Risk of Churn” quadrant.
How A Deep Learning Neural Network Fits
Here’s where we will go down the rabbit hole. In the above image, our users were defined as “spheres” visually. What if we actually viewed them mathematically as spheres? Follow me on this one.
A user has a sphere around them that defines the behavior quadrants and, in any direction, has a global maximum value of 1. The user initially starts out at a neutral value of 0 (the center of the sphere). When the user performs an action, such as having 3 sessions in an app in 1 week, we see that as a positive behavior and attribute it to “Highly Engaged”. On our graph above, the user started at the center of the graph (5, 5, 5) with their behavior plotted at (0, 0, 0) inside the sphere. Now, with the new positive session count attributed towards “Highly Engaged”, we weight the sphere in that direction in order to produce its vector movement.
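Here’s a rough sketch of that sphere mechanic, under my own assumptions about how an event’s direction and weight are chosen:

```python
import math

def apply_event(position, direction, weight):
    """Nudge a user's behavior point inside the unit sphere.
    `direction` is a unit vector toward a quadrant (e.g. toward
    "Highly Engaged") and `weight` in (0, 1] is how strongly the
    event pulls; the result is clamped to the sphere's global
    maximum radius of 1."""
    moved = [p + weight * d for p, d in zip(position, direction)]
    norm = math.sqrt(sum(c * c for c in moved))
    if norm > 1.0:
        moved = [c / norm for c in moved]
    return moved

# A user starts neutral at the sphere's center (0, 0, 0):
pos = [0.0, 0.0, 0.0]
# 3 sessions in a week reads as a positive pull toward "Highly Engaged";
# here that quadrant is assumed to lie along the normalized (1, 1, 0).
toward_highly_engaged = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]
pos = apply_event(pos, toward_highly_engaged, 0.2)
```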
Metaphorically, this is similar to having a sheet that has our above quadrants on it that is pulled tight. You drop a weighted ball in the “Highly Engaged” quadrant and the sheet is pulled in that direction. If you wrap that sheet into a sphere, you are doing the same thing except that each drop (or throw) of the ball is pushing the entire sphere in the direction of the quadrant (in a 3 dimensional space).
Credit to Michael Nielsen and his blog on Neural Networks and Deep Learning.
We do this through the use of a deep learning neural network. This is described superbly and in detail by Michael Nielsen on his blog, which I highly recommend reading. In essence, the neural network ingests many different data sources and is trained via a mechanism called gradient descent. Each “perceptron” or “neuron” applies a sigmoid function, with its output being a value between 0 and 1. As the output values come through to our “user behavior sphere”, they weight the sphere in the direction of the quadrants that the user’s behavior is attributing towards, moving the “user behavior sphere” into that quadrant.
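To ground the terminology, here’s a single sigmoid neuron in the form Nielsen describes. The weights and inputs here are placeholders; in practice the weights would be learned via gradient descent:

```python
import math

def sigmoid(z: float) -> float:
    """Squash any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """A single sigmoid neuron: the weighted sum of (normalized)
    channel metrics plus a bias, squashed into (0, 1). The output
    can be read as the strength of the pull toward one quadrant."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)
```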
For example, positive App Engagement and Marketing Engagement push the user sphere towards the “Highly Engaged” quadrant, while a day-over-day or week-over-week reduction in interaction on either channel conversely weights the user sphere down towards “Risk of Churn”.
With all of the different channels a user may be interacting with a brand on, a neural network can ingest this data, decipher it, and provide the proper vector for the user with respect to our 3d visualization.
Bring It Back Together
If we climb out of the rabbit hole and out of the app-specific example, we can start to see that a 3-dimensional view of users can be beneficial for synthesizing and expressing user behavior across many channels as one unified view. It allows us to do interesting things with the concepts of explicit vs. implicit user behavior and to plot that behavior in a way that makes sense to marketers as well as machines. To be clear, I don’t think it’s beneficial to surface the above graph visualizations to marketers directly. What I do think is that, behind the scenes, we view the user behavior in this format mathematically but surface at the dashboard level only the interesting movements, such as “X number of users are potentially moving from one quadrant to another”.
This theory could be useful for complex ecosystems that have disparate metrics and data sources. However, it is important to note that I believe this is only useful when the metrics are in an aggregated format. Doing this on an individual-metric basis would dilute the value, since we wouldn’t be taking in the full picture of the user or audience across all channels.
I still have many questions personally around how this can be applied to different business models, given that metrics like engagement mean completely different things to different verticals (media vs. travel). Additionally, this may only be useful at an audience level instead of a user level, due to the graph’s nature of simplifying and abstracting away the metrics.
We currently do similar types of thinking around users when it comes to radar graphs. We assign a metric and value ranges to each of the points on a radar graph and track user behavior as they progress towards “who they are”. This is especially useful when we’re trying to understand what persona a user may fit into. However, the drawback is that this is still 2-dimensional thinking and has flaws, such as not being able to effectively measure goal completion or aggregate metrics (7-day engagement, weekly conversion rate, etc.). There’s a world where we could add a “depth” factor to a radar graph, but that is a theory for another day.
Radar Chart example from Chris Zhou
I have more thoughts around how this can be morphed into interesting things like open graphs for further automation among the entire marketing ecosystem. For example, if we have a user who is starting to become low on certain marketing engagement efforts, could the system automatically create a retargeted marketing campaign on Facebook as a method of exploitation test to further optimize the system?
A very open area that I need to consider more is specifically around downward/negative weighting. In reality, users don’t necessarily perform events that show they are likely to churn, with some exceptions like opting out of push notifications. Users often elect to “churn” simply by no longer using a channel. This is problematic for this model, as we don’t have a proper way of giving the graph empirical data to weight it down. My hypothesis here is that we use a time-decay model that looks at the deltas of values day over day or week over week to identify shifts in engagement specifically. This decay model would take into account metrics like % difference in engagement, session-length decrease, and number of days/weeks of downward interaction. The output would be a weighted value pulling the user sphere towards an unfavorable quadrant.
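A minimal sketch of that time-decay hypothesis, modeling only the inactivity axis (the half-life is an assumed tuning knob, not an empirical value):

```python
def decay_weight(weeks_inactive: float, half_life_weeks: float = 2.0) -> float:
    """Exponential time decay for silent churn: the longer a user's
    engagement deltas stay flat or negative, the stronger the pull
    toward the unfavorable quadrant. Returns a value in [0, 1) to
    weight the user sphere downward."""
    return 1.0 - 0.5 ** (weeks_inactive / half_life_weeks)
```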
Another area of thought exploration is around doing further clustering within the quadrants so that marketers can target certain sections of them. For example, a marketer may see that one of their users in the “Highly Engaged” quadrant is starting to move towards the “Engaged but Not Responding” quadrant. If we view the graph on a scale of 0-100, the marketer may want to intervene when the user has moved into the region of:
- X-Axis = 60-65
- Y-Axis = 55-65
- Z-Axis = 50-60
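That sub-region check is simple to sketch; the default ranges below mirror the example values, which are otherwise arbitrary:

```python
def in_intervention_region(x, y, z,
                           x_range=(60, 65),
                           y_range=(55, 65),
                           z_range=(50, 60)):
    """True when a user's (x, y, z) position on the 0-100 chart falls
    inside a marketer-defined sub-region worth intervening on."""
    return (x_range[0] <= x <= x_range[1]
            and y_range[0] <= y <= y_range[1]
            and z_range[0] <= z <= z_range[1])
```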
In an ideal world, either the system or the user would set these quadrant sub-parameters and, when they log into their dashboard each day, get a notification showing how many users have a “drift vector” indicating they may move quadrants. This would allow the marketer to get ahead of the changes more quickly.
Yet another area of contention is the quadrant naming. I think there’s a good argument to be had over whether the quadrant naming, nomenclature, and conventions make sense.
It’s a crazy theory but, based on the work I’ve done in my past with different teams, there is some validity in how marketers view users and how we view data. More thoughts to come.
Love the idea? Think I’m crazy and full of shit? Leave a comment and tell me what you think!