Visualizing High Performing Enterprises

In my previous posts, I talked through some of the interesting findings from our research at Workfront, where we analyzed 600 million hours of work performed across a wide range of verticals. We found some interesting stats, like the average number of people on a project being 5.4, or that 76% of work created and completed comes from a process.

We took it a step further to better understand what it looks like to be a high-performing enterprise. Our mission at Workfront is to help companies operate better and to make sure people know their work matters. To do that, we need to see what a high performer looks like. We know what the numbers look like, but those only took a couple of variables into account.

Our work primarily focused on accurately predicting whether a project would complete on time. We extracted a handful of features that we had high confidence in, created a training data set, then ran the numbers through an exploratory machine learning pipeline. This included models such as neural nets and SVMs, which helped us figure out which model could consistently deliver predictable completion dates. It’s worth noting that we had to do a lot of data normalization and transformation to make the numbers work out. For example, it’s hard to visualize a process that takes only 5 days to complete next to a process that takes 500 days on the same linear scale.
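To give a flavor of the kind of transformation involved, here is a minimal sketch of one common approach: log-scaling durations so that short and long processes become comparable on a single axis. The helper name is hypothetical; this is an illustration of the general technique, not Workfront's actual pipeline.

```python
import math

def normalize_durations(durations_days):
    """Log-scale durations so a 5-day process and a 500-day process
    can sit on the same axis (hypothetical helper, for illustration)."""
    return [math.log10(d) for d in durations_days]

# On a linear axis, 5 and 500 are two orders of magnitude apart;
# after the log transform they are just two units apart.
scaled = normalize_durations([5, 50, 500])
```

The same idea applies to any heavily right-skewed feature (minutes logged, task counts), which is typical of work-management data.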

What we found was like any good data exploration effort: the data tells a compelling and actionable story. A best-fit line with a slope of roughly one — equal on the x- and y-axes — in our visualization represents a fairly optimal work process. When the line skewed higher on the y-axis and lower on the x-axis, companies were dramatically underestimating the effort required to get work done in their processes. Conversely, when the line skewed lower on the y-axis and higher on the x-axis, the organization was dramatically overestimating the complexity of their work and could take on more work should they learn to optimize their process.

1 – This means that users had high accuracy between their planned and actual minutes, and our model was able to accurately predict project duration. This is the ideal state.

2 – This means that users underestimate how long projects will take. They take on too much work, rebase their delivery dates, or have poor scoping practices. This is a non-optimized process.

3 – This means that users overestimate work efforts and have not gone back to optimize their processes. The org “looks” busy but can actually take on a lot more work. This is also a non-optimized process.
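The three regimes above can be sketched with a simple slope check: fit actual minutes against planned minutes and see which side of 1.0 the slope lands on. The function names and the tolerance threshold are illustrative assumptions, not the model we actually shipped.

```python
def fit_slope(planned, actual):
    """Least-squares slope through the origin of actual vs. planned minutes."""
    return sum(p * a for p, a in zip(planned, actual)) / sum(p * p for p in planned)

def classify(planned, actual, tol=0.15):
    """Map the slope onto the three regimes described above.
    The 15% tolerance band is an illustrative choice."""
    s = fit_slope(planned, actual)
    if abs(s - 1.0) <= tol:
        return "optimal"  # regime 1: plan tracks actual
    # regime 2: actuals run high (underestimating effort)
    # regime 3: actuals run low (overestimating effort)
    return "underestimating" if s > 1.0 else "overestimating"

# A team that plans 60 minutes but burns ~90 lands in regime 2:
print(classify([60, 120, 240], [90, 180, 355]))  # → underestimating
```

Plotting planned on x and actual on y, "optimal" is the diagonal; the other two regimes are the skews above and below it described in the list.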

Below are a couple of high performing and predictable organizations.

When we visualize organizations whose delivery dates we can readily predict, an interesting pattern emerges. The tighter the scatter plot, the more consistent and predictable the organization. Curiously, though not necessarily surprisingly, we also found that these organizations completed an order of magnitude more work than organizations with a wider scatter plot.
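One simple way to put a number on "tightness" is the spread of prediction errors: how far actual completion dates land from predicted ones. This is a hypothetical metric for illustration, not the measure we used internally.

```python
import statistics

def scatter_tightness(errors_days):
    """Population standard deviation of prediction errors
    (actual minus predicted completion, in days).
    Smaller value = tighter scatter = more predictable org."""
    return statistics.pstdev(errors_days)

tight_org = [1, -2, 0, 2, -1]       # predictions miss by a day or two
loose_org = [30, -45, 10, 60, -25]  # predictions miss by weeks

assert scatter_tightness(tight_org) < scatter_tightness(loose_org)
```

A single scalar like this makes it easy to rank organizations by predictability before ever drawing the scatter plot.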

Don’t you love it when you find little gems like that in your data? 😉

Below are a couple of low performing and unpredictable organizations.

You can easily see that the data points have a much wider distribution, which means we weren’t able to get accurate project completion date predictions. It’s also easy to see that significantly fewer work items are being completed.

Now, one flaw with tying the amount of work an organization completes to a process is that not all work is equal in complexity and size. That said, I would still argue this is a good leading indicator, and there is decent evidence that these work items are close in size. We found that the top 10% of processes, which generated 80% of the work, had an average of 12 tasks per process. Qualitatively, we also know that many organizations use their processes for fairly similar use cases (e.g., a marketing campaign, a product prototype, etc.).

We’re doing a lot of fun data explorations like this to better understand user behavior and how to contour our software so it helps users work more efficiently. It’s a journey through the data that requires a lot of time and a love of the labor, but we continue to find fascinating insights that drive our product roadmap into new areas.