Improving Revenue Modeling: Cohort Analysis at E-Comm and SaaS Companies

Improving Revenue Modeling: Cohort Analysis at E-Comm and SaaS Companies

Large companies have done an outstanding job building the infrastructure to collect data, but they are still learning how to effectively integrate and use it.

Paper by George Heiler, [email protected], (602) 620 3429
Anaplan Model Builder and Head of Delivery, Anavate Partners

Not an E-commerce or Software-as-a-Service professional? Read on if you would like to learn about some of the Financial Planning they do and let me know how it relates or contrasts to your organization. 


At Anavate Partners, we deliver exceptional services to midsize and large enterprises implementing Anaplan, a transformational planning tool. For the purpose of injecting new Connected Planning experts into the ever-expanding ecosystem and for our own talent pipeline, we uniquely cultivate new consultants through our captive educational platform called Skill Factory. Along the way, we have developed a few highly effective and efficient Anaplan use cases for Revenue or Topline modeling in the Software-as-a-Service (SaaS) and E-Commerce (E-comm) segments. This write-up is more technical than it is big picture and should provide value to detail-oriented professionals in any space who are evaluating Anaplan, rethinking their revenue forecast methodology or reviewing Enterprise Performance Management (EPM) and planning technologies. If you would like to browse some of the many proven Anaplan use-cases, head to

For the sake of simplicity and my sanity, let us dub SaaS buyers “customers” and E-commerce buyers “users.” The reason for my semantic distinction is that customer growth at a SaaS is tracked and driven by dedicated account segmentation, sales funnels, and pipelines, while user growth at an E-comm is generally tracked and driven at a more summary level by marketing, ad spend, and campaigns yielding new user volumes. Other than enterprise relationships, you generally wouldn’t see a direct-to-consumer (DTC) E-comm involve Account Executives and the like… a new user is typically a number among a summary of new platform users. Of course, “freemium” lightweight SaaS products can blur these lines, but this is not the topic for today.

What is a Cohort Analysis?

A cohort analysis is a handy way to slice and dice data on your customers and users, and to then leverage that data to predict the future. Predicting the future is not possible, but some get closer than others using cohort analyses and this has made cohort modeling an attractive exercise for Finance teams at SaaS and E-Comm enterprises. It is also common in educational institutions tracking and predicting enrollment and medical or research institutions. A “cohort” is simply a group of users or customers defined by a commonality and then tracked and analyzed as a group; usually the defining factor is the month in which they all joined the platform or signed the contract. By tracking cohorts you unlock insights into various trends based on time, seasonality and more discussed in this paper. See below two basic grids showing the way a cohort analysis often looks, including the way the data is shown across dimensions, axes and time:

Exhibit A:

Exhibit B:

Exhibit A and Exhibit B show the exact same data points for Cohorts and Monthly time units, with Exhibit A showing Time on the Y-axis and Cohorts on the X, and Exhibit B the reverse. Both show the same forecasted data, highlighted in green. Before delving into the details, I would note the simple ability to pivot the same information multiple different ways allows analysts to deduce new insights and track new trends. The bottom line in Exhibit A shows a sum for a single cohort across all months, the bottom line in Exhibit B shows the inverse: a sum for a single month across all cohorts. This is one reason pivot tables commonly indicate that a model has outgrown Excel and longs to be rebuilt in Anaplan. In a couple of paragraphs, under the Dimensionality header, I will discuss the (X,Y) intersection from the Exhibits above and how it can represent a whole host of dimensions and data points. 

The Data

Great data integration is the gatekeeper to successful Anaplan implementations. Often a large chunk of the quick-win return on investment (ROI) is cutting down on hours and weeks spent copying and pasting data between many different systems. This doesn’t affect upper-management directly, but it is a massive toll on analysts and forces hasty work and an infinite opportunity for error and miscalculation. Many of the planning models we replace have raw data organized such that a single typo in the name of a row would break the entire formula or such that simply resorting a table in the wrong direction produces incorrect numbers in a financial model’s output. 

By the final phase of an implementation, we import and refresh data regularly and automatically into Anaplan where it is then transformed and dimensionalized with no room for error. There are many smooth ways to integrate data across dozens and even hundreds of enterprise apps and systems. The source systems for data in Revenue planning vary widely, and will be familiar to most business users. They include systems like Salesforce, SAP, Oracle, Sage Intacct, Netsuite, Amazon Redshift, Looker, ADP, Informatica, MuleSoft, Microsoft Dynamics and many more. Automatic exports to reporting tools (“reporting” being used lightly) for executive leadership and boards often use Anaplan’s native visualization User Experience tools or they include live, bi-directional connections to Tableau, Excel and PowerPoint. The bulk of the upside still remains if you let Anaplan do the heavy lifting, modeling, scenario analysis, systematizing and more and prefer to let lighter-weight tools turn Anaplan’s output into pretty charts and graphs. 


Dimensionality is likely the most important part of modern planning and forecasting. Companies and analysis are complex and become more complex all the time. For example, consider the FP&A Manager or Head of Finance at a retail E-comm company suddenly asking to see the massive financial model and revenue forecast broken into 2 new categories of users: Men and Women. He or she wants to see every single Profit and Loss (P&L) line broken into Men, Women and Total, and to run this new dimension through the same methodology as before. Do you immediately picture analysts scrambling and working overtime to rewrite formulas and Excel workbooks, then triple-checking their work, sending it back a week later and then realizing they forgot to adjust a few formulas? Do you perhaps picture an IT ticket to build a new Finance reporting module or dashboard in your bloated ERP or calling expensive consultants to spend months rebuilding an implementation? I picture a junior Business Analyst simply logging into Anaplan and spending a few minutes adding these new dimensions to the model, mapping the raw data and watching everything, including formulas, time, methodology and reporting dashboards adjust to reflect the new change. In most instances, an analyst could make this change “real time” in the same meeting in which the manager asks for it.

You can imagine the compounding implications of the two worlds described above. To expand on Dimensionality, below are some of the types of dimensions we commonly see in Cohort Analysis models: 

  • Geography
  • Organizational Units
  • User Metadata (E-comm): Gender, Ad/Marketing Channel Source, etc.
  • Customer Profiles (SaaS): Deal Source, Land vs Expand, Account Executive, etc.
  • Shopping Information (E-comm): Product Family or Type, Price Buckets or Tiers, etc.
  • Deal Metadata (SaaS): Segment, Industry, Stage, Account, Opportunity Type, etc.
  • Forecast Versions or Scenarios: Base Case, Aggressive Spend, Conservative, etc.
  • Shipping Classification or Type (E-comm)

Remember, each of these dimensions is usually multi-layered. A geography can easily subdivide into Regions, Countries and then Territories. An organization may track their P&L across only 3 or 4 parent departments but their General Ledger (GL) may have accounts rolling into dozens of departments that each roll into summary departments; for example Development may roll into Engineering which rolls up to Research and Development which rolls up to IT. You can imagine how a module showing a few data points with a formula can turn into millions, or billions, of data points as you add new layers of dimensionality to the same analysis. Anaplan’s proprietary in-memory engine makes this swift and easy from a web browser or even a cell phone.

I should note: Most large and midsize enterprises already do a great job collecting all of this data across many dimensions. Corporate IT has been busy building a beefy data infrastructure. It is the business teams who are still learning how to really use and connect all this data across many dimensions, and we are here to help.

What are We Actually Analyzing

Regarding a Cohort Analysis, there are a few common calculations we model across the dimensions listed above, we call these “line items.” Line items can be thought of as the data points and cells, whether formulaic or hardcoded, that exist at the intersection of dimensions. For example, if you track Users on Geography and Gender, you would have blank cells for modeling at the intersections of Germany and Women, All Countries and All Genders, and United States and Men, and so forth. We write formulas to pull in actual past data and forecast future data for items like: 

  • Number of Active Users or Customers
  • Gross Merchandising Volume (GMV) or Gross Revenue or Gross Sales
  • GMV or GR divided by Customer or User Count – Per Capita
  • Seasonality, Retention, Growth Rates and other Factors for all of the above
  • Number of New Deals
  • Annual Contract Values
  • Net and Gross Additions

These models can become very complex and large, and Anaplan provides the perfect environment to create, host, deploy, manage and report them. As you can imagine, simply adding a new dimension, say “Scenarios,” and then adding 3 scenarios, would triple the number of calculations and data points in the model. In this manner, Anaplan scales up instantly while Excel crashes and SAP and others costs millions and take too long to adjust.

Parameters and Assumptions

Parameters and assumptions make up the human or machine generated secret sauce to using modeling to make predictions and forecasts. I will abstain from diving into detailed forecast methodology, as this would bore readers even further. Instead, similar to other sections of this paper, I will give a list of examples of what I call “parameters” or “assumptions.” I define these as the inputs that drive forecasts; they can either be input by a relevant finance, sales or operations leader, or they themselves can be estimated and input by models and calculations with the option to have a human-input override. A few examples of parameters that are common in the E-comm and SaaS spaces are the following:

  • Max/Min Retention % (E-comm)
  • Expansion Rate (SaaS)
  • Renewal Rate (SaaS)
  • Win Rate (SaaS)
  • User Deterioration (E-comm)
  • Spend per User Factors (E-comm)
  • Average Deal Size (SaaS)
  • Regional and Segment Splits (SaaS)
  • Marketing Spend
  • Smoothing Factors
  • Growth Factors
  • Impairments
  • Manual Overrides and Adjustments

Just like line Items, parameters are usually input across multiple dimensions. You may have a different retention rate assumption for Male and Female users, United States and Europe or Base Case and Bull Case scenarios. You could have different Win Rate assumptions across industries, segments or Account Executives. You could have relevant analysts and managers input these assumptions or you could build formulaic models to generate them based on past data. Anaplan is designed from the ground up to accommodate this type of dynamic and multidimensional analysis and decision making.

What’s Wrong with Both: The Process, Not the Logic

Most of the E-comm and SaaS Anaplan implementations we complete are replacing models built by brilliant analysts, usually with great talent in Excel. There’s usually a certain sort of wistful admiration for the old models and their many quirks when we are done rebuilding them in Anaplan. The forecast logic itself, if anything, is often overdone and not wanting for more complexity, and we don’t hurry into modifying it at the outset unless we agree with the client on a full transformation – not just a process revamp – from the start. The key is to pick up massive efficiencies and to free up some time for analysts to actually analyze instead of running around chasing data, spreadsheets, formulas and workbooks. 

Rather than the forecasting and modeling methodology and logic, the biggest problem E-comm and SaaS analysts and leadership have when forecasting and planning their Revenue is a matter of tools. Here is a litany of only some current problems caused by the most common culprit, Excel:

  • The final model is a combination of 3 or 4 other models, such as OpEx, Pipeline forecasting, revenue “actuals” (or past) data, headcount forecasting and others
  • Heaps of data tables, structured references, pivot tables, named ranges, INDEX(MATCH()) formulas and SUMIF() formulas indicate Excel’s lack of dimensionality has been stretched too thin and it is time to upgrade
  • Formulas that break when someone makes a typo or someone simply resorts a table
  • Data copy-and-paste from multiple different systems
  • Upon month-end, rolling actuals data and formulas forward and reformatting sometimes thousands of rows across dozens of sheets
  • Virtually un-auditable Visual Basic (VBA) “macros” that break and don’t work across different computers – and can’t be fixed until the analyst who knows them helps
  • Dozens and dozens of versions of the model – who has the latest? Which one got #REF’d into oblivion?
  • Scenario and what-if analysis: If it’s even possible, it consumes your analysts’ time and energy like crazy – and the rush to finish often returns inaccurate information
  • On nearly every model I’ve rebuilt, I find fat-finger errors such as an extra comma in an AVERAGE() formula, that throw off the final forecasted line items by 5%, sometimes 10% or 15%
  • Dispersed spreadsheets due to role-based privacy and security, especially in workforce planning, that lead to intermediary models that inflame all the other issues mentioned
  • Frustrated analysts and frustrated decision makers

Many enterprises rely on structured systems like ERP modules or addons and finance point-solutions to do their planning and forecasting, and those usually come with fewer accuracy problems at the expense of being mostly worthless. The sheer time and money to institute a new methodology into the ERP or edge system usually leaves Finance teams with two options: Complacent acceptance of underleveraged financial planning processes or exports to Excel for more advanced logic, modeling and formulation. And how long, in today’s world, should we really expect a planning model to remain useful with 0 changes to its structure? My opinion is 2 to 4 years, depending on industry. Point solutions may relieve some of the pressure of an overcooked process or data flow in the short run, but any leadership thinking on a 10 or even 5 year horizon should know that adding another software to the mosaic of enterprise softwares, with no consolidation across business units in sight, is not a great bet. The odds are that the future will arrive and force a rebuild using a Connected Planning tool, and then everyone can hope the failed or deserted implementations from a few years before don’t get brought up in meetings. 


This paper is just a brief glimpse into a single use case. The Anaplan platform is adaptable and scalable to, literally, hundreds of use cases across Finance, Sales, Supply Chain and even HR and IT. It is my sincere belief that any Finance, Supply Chain or Ops group in a company at a state of technological and financial maturity which includes large IT systems like ERPs, CRMs, HCMs, relational data lakes and warehouses, ETLs and other enterprise tools should at least reach out and view a few demos. Anaplan customers do many implementations and, one by one, fill in the frustrating gaps and manual processes that are birthed from the large technology implementations across different organizational units – and they do it relatively cheaply and quickly. Even if you’re on the IT team, you will begin to appreciate consolidating analysis and planning tools as the use cases begin to mount, and you can appreciate the fact that an Anaplan deployment is maintained by business end users and not the IT team.

Feel free to reach out at [email protected] to schedule a demo, or shoot me an email or give me a call just to talk about Anaplan and this exciting, transformational ecosystem.

Scroll to Top
Scroll to Top