Defining an Engagement Score for the Media / Publishing Sector
In the media industry, the word "engagement" has been systematically diluted by overuse across many contexts, without anyone defining concretely what it means. To some it is Average Page Views / Visit, and to others it is Total Engaged Minutes (whatever that means). When I started my role as Product Data Analyst, the success metrics of various A/B or multivariate tests would involve vague goals like "increase engagement by 5%". Frustrated by the absence of any robust definition, I embarked upon a project to answer this simple-looking question: "How do you define engagement for the media or publishing sector?"
Given a set of independent variables or predictors (which may turn out to be correlated as we keep exploring), e.g. Pageviews/time, Frequency: sessions/time, Recency: time since last visit etc., we may partition the problem into two levels: individual user level and aggregate level (please refer to The Brief, esp. the Normalization section).
Focusing our attention on user-level scoring, we need to identify a subset (or all) of these variables with sufficient explanatory power for the target variables (ARPU, conversions). Given appropriate data (scaled, de-duped, user-level data; this is probably the hardest part of the project), a series of exploratory scatter plots of various pairs of (i-th predictor, target variable), the correlation matrix of the predictors and a simple multivariate linear regression (MVR) should be enough to propel us to the desired outcome. For example, let’s take the much-discussed engagement metric of the Financial Times.
This can be expressed as a simple multivariate regression problem by taking logarithms on both sides.
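As a hedged illustration of the log step, assume a generic multiplicative score in volume V, frequency F and recency R, with an unknown scale k and unknown exponents a, b, c. These symbols are assumptions made for illustration, not the published FT formula.

```latex
E = k \, V^{a} F^{b} R^{c}
\quad\Longrightarrow\quad
\log E = \log k + a \log V + b \log F + c \log R
```

The right-hand side is linear in log k, a, b and c, which is what lets an ordinary linear regression on the logged variables estimate them.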
With the right kind of data, this can easily be modeled with MVR. The “trick” of taking logarithms before modeling via MVR becomes obvious when one explores the relationship between Volume and the target (dependent) variables during exploratory analysis.
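A minimal sketch of the log-then-regress idea, in plain Python with no external libraries. The data is synthetic and made up for illustration. The exponents 1, 0.5 and -1 echo the approximate coefficients quoted for the FT example, but assigning them to volume, frequency and recency in that order is an assumption, as are all function and variable names.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

# Synthetic users (made up): volume, frequency, recency and a target value
# generated as 2 * V^1 * F^0.5 * R^-1 with multiplicative noise.
random.seed(0)
rows = []
for _ in range(500):
    v = random.uniform(1, 50)
    f = random.uniform(1, 30)
    r = random.uniform(1, 60)
    noise = math.exp(random.gauss(0, 0.05))
    rows.append((v, f, r, 2.0 * v ** 1.0 * f ** 0.5 * r ** -1.0 * noise))

# Regress log(target) on an intercept plus the logged predictors.
X = [[1.0, math.log(v), math.log(f), math.log(r)] for v, f, r, _ in rows]
y = [math.log(t) for _, _, _, t in rows]
intercept, a, b, c = ols(X, y)
print(round(a, 2), round(b, 2), round(c, 2))
```

The fitted a, b and c are the estimated exponents of the multiplicative model; on real data they would of course not be known in advance.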
Outline of Steps
Gather user-level data (let’s worry about aggregate-level scoring later). This will involve merging multiple datasets, de-duping, and a fair bit of data wrangling and data enrichment.
Study exploratory scatter plots of the various predictors against the target variables. R (statistical software) has commands for drawing such a scatter-plot matrix, e.g. pairs() in base R or the older plotmatrix in ggplot2. Also study the correlation matrix of the predictors.
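The correlation matrix mentioned above can be sketched in a few lines of plain Python. The predictor names and values below are hypothetical, chosen only to show the mechanics.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-user predictors (one value per user).
predictors = {
    "pageviews_per_week": [12, 30, 7, 45, 22, 3, 18, 60],
    "sessions_per_week": [4, 9, 3, 14, 8, 1, 6, 20],
    "days_since_last_visit": [10, 2, 15, 1, 3, 30, 5, 1],
}

names = list(predictors)
corr = {a: {b: pearson(predictors[a], predictors[b]) for b in names} for a in names}

# Print one row per predictor.
for a in names:
    print(a.ljust(22), "  ".join(f"{corr[a][b]:+.2f}" for b in names))
```

Pairs with a correlation close to +1 or -1 carry largely the same information, which is worth knowing before both are fed into a regression.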
Perform suitable data transformations (e.g. log) and scaling or normalization (z-scoring etc.).
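A small sketch of the two transformations named above, assuming a heavy-tailed count such as page views. The sample values are hypothetical. Using log(1 + x) rather than log(x) is a choice made here so that zero counts stay defined.

```python
import math
import statistics

def log_transform(values):
    """Apply log(1 + x) so that zero counts remain defined."""
    return [math.log1p(v) for v in values]

def z_score(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical page-view counts with one very heavy user.
pageviews = [0, 3, 7, 12, 18, 22, 30, 45, 60, 400]

raw_z = z_score(pageviews)                    # outlier dominates the scale
scaled = z_score(log_transform(pageviews))    # log first, then z-score

print(round(max(raw_z), 2), round(max(scaled), 2))
```

Logging first pulls the heavy user closer to the rest of the audience, so the z-scores are less dominated by a single extreme value.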
Loosely regress the target variable(s) on the predictors (a fancy way of saying: run MVR). The candidate variables being considered (the last two, value and conversions, being the targets) are -
- Page views/time
- Average active time on article pages
- Article views/time
- Frequency: sessions/time
- Recency: time since last visit
- Referral channel (SEO, direct etc.)
- Interactions (commenting, sharing, bespoke measures for flagship product features, e.g. adding articles to a shortlist)
- User type/status: subscriber, registered or visitor
- Value (ARPU, subs revenue, advertising revenue)
- Conversions (visitor>member, member>subs, visitor>subs)
Using a combination of t-statistics and the F-statistic (and some qualitative judgement), select a subset of “strong” predictors, leaving out the weaker ones. Fix any issues with scaling if we haven’t done it appropriately earlier.
Regress again with the reduced subset of predictors. Capture the coefficients. In the FT example above, this would approximately be (1, 0.5, -1).
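The select-then-refit steps can be sketched as follows, in plain Python on synthetic data. Only t-statistics are computed here (the F-statistic is left out for brevity), and the |t| > 2 cut-off is a common rule of thumb assumed for illustration, not a threshold from the project. All names and data are hypothetical.

```python
import math
import random

def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def ols_with_t(X, y):
    """Return (coefficients, t-statistics); X must include the intercept column."""
    n, p = len(X), len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    inv = invert(XtX)
    beta = [sum(inv[i][j] * Xty[j] for j in range(p)) for i in range(p)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    sigma2 = rss / (n - p)  # residual variance estimate
    t = [b / math.sqrt(sigma2 * inv[i][i]) for i, b in enumerate(beta)]
    return beta, t

# Synthetic data: two real effects plus one feature with no effect at all.
random.seed(1)
names = ["intercept", "log_volume", "log_frequency", "noise_feature"]
X, y = [], []
for _ in range(300):
    lv, lf, junk = random.gauss(2, 1), random.gauss(1, 0.5), random.gauss(0, 1)
    X.append([1.0, lv, lf, junk])
    y.append(0.3 + 1.0 * lv + 0.5 * lf + random.gauss(0, 0.3))

beta, t = ols_with_t(X, y)

# Keep the intercept and every predictor with |t| > 2, then refit.
keep = [i for i in range(len(names)) if i == 0 or abs(t[i]) > 2.0]
beta_reduced, _ = ols_with_t([[r[i] for i in keep] for r in X], y)
for i, b in zip(keep, beta_reduced):
    print(names[i], round(b, 3))
```

A feature with no real effect will usually, though not always, fall below the threshold, which is one reason the qualitative judgement mentioned above still matters.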
Testing: The engagement scoring model needs ongoing management and improvement as our understanding of our audience improves. But some basic testing needs to be carried out -
Predictive Accuracy: Use cross-validation to check the predictive accuracy of the scoring model. A standard quantitative measure such as mean squared error (MSE) would be good.
Descriptive Accuracy: For example, assess questions like - What is the differential improvement of the engagement score with a small increase in, let’s say, commenting?
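Both checks can be sketched on a toy model. The example below assumes, purely for illustration, a score that depends linearly on a single hypothetical predictor (comments per week). It estimates held-out MSE with k-fold cross-validation and then reads the fitted slope as the differential improvement per extra comment.

```python
import random

def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def k_fold_mse(xs, ys, k=5):
    """Average held-out mean squared error over k folds."""
    idx = list(range(len(xs)))
    random.Random(0).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    fold_mse = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        fold_mse.append(sum(errors) / len(errors))
    return sum(fold_mse) / k

# Synthetic data (made up): score = 10 + 0.8 * comments + unit-variance noise.
rng = random.Random(42)
comments = [rng.uniform(0, 20) for _ in range(200)]
score = [10 + 0.8 * c + rng.gauss(0, 1.0) for c in comments]

cv_mse = k_fold_mse(comments, score)            # predictive accuracy
intercept, slope = fit_line(comments, score)    # descriptive accuracy
print("cross-validated MSE:", round(cv_mse, 3))
print("score change per extra comment:", round(slope, 3))
```

With several predictors the same reading applies coefficient by coefficient, and in a log-log model each coefficient is instead the approximate percentage change in the score for a 1% change in that predictor.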
Rationale: Why would this be a preferred approach?
We can fit various exotic statistical models to the data. However, simple MVR is preferred for a couple of reasons -
Simplicity: MVR is simple, and various nonlinear processes can be modeled by linear regression with sufficient accuracy, often after a suitable transformation such as the logarithm used above.
Descriptive Power: Engagement scoring is not purely a predictive modelling task. We are not just concerned with predicting value given the predictors (in that case, a Support Vector Machine would be a good starting point). Instead, we are deeply concerned with how each of the predictors influences (through direct and interaction effects) the overall score. Additionally, the score has to make sense qualitatively. The plot below explains it beautifully -
FYI, MVR falls under Least Squares in the plot. The key attribute we are looking for here is interpretability. As model complexity (flexibility) increases, interpretability decreases in step with the increase in predictive power.