Estimating Incremental Acquisition of Content Launches in A Subscription Service
Abstract
Subscription services face a difficult problem when estimating the causal impact of content
launches on acquisition. Customers buy subscriptions, not individual pieces of content, and once
subscribed they may consume many pieces of content in addition to the one(s) that drew them
to the service. In this paper, we propose a scalable methodology to estimate the incremental
acquisition impact of content launches in a subscription business model when randomized
experimentation is not feasible. Our approach uses simple assumptions to transform the problem
into an equivalent question: what is the expected consumption rate for new subscribers who
did not join due to the content launch? We estimate this counterfactual rate using the
consumption rate of new subscribers who joined just prior to launch, while making adjustments for
variation related to subscriber attributes, the in-product experience, and seasonality. We then
compare our counterfactual consumption to the actual rate in order to back out an acquisition
estimate. Our methodology provides top-line impact estimates at the content / day / region
grain. Additionally, to enable subscriber-level attribution, we present an algorithm that assigns
specific individual accounts to add up to the top-line estimate. Subscriber-level attribution is
derived by solving an optimization problem to minimize the number of subscribers attributed
to more than one piece of content, while maximizing the average propensity to be incremental
for subscribers attributed to each piece of content. Finally, in the absence of definitive ground
truth, we present several validation methods which can be used to assess the plausibility of
impact estimates generated by these methods.
Keywords: Observational Causal Inference, Machine Learning, Incremental Attribution
1 Introduction
In a subscription business model, a customer pays a recurring fee, typically monthly or yearly, for
unlimited access to content offered by the service. Examples include subscription video on demand
companies such as Netflix and Disney+, music streaming services like Apple Music or Spotify, e-
books or magazine subscription services such as Kindle Unlimited, and subscription-based fitness
businesses such as Peloton. In the context of these examples, a piece of content can be considered
as a new film, TV show, song, album, e-book, audiobook, or class.
Flow of new content on the service is an important growth lever for subscription services.
For example, video-on-demand companies release new movies and shows every week. Content
brings value to a service in two fundamental ways: it may cause current members to remain
on the service longer, or it may cause new members to join the service. In this paper,
we present a modeling innovation designed to improve estimation of the latter impact: the causal
effect of content launches on new member acquisition. Such estimation can help inform content
prioritization in programming and marketing, particularly in lower-penetrated markets where non-
member content preferences may not be well understood.
Estimating the acquisition impact of content launches in subscription services is challenging,
mainly due to three factors. First, most existing attribution models are focused on the problem
of marketing attribution. A customer purchases a product and the firm wishes to attribute that
purchase to some combination of previous events, such as ad impressions. The firm uses cookies or
other forms of tracking to observe the timing and nature of previous events to which the customer
was exposed. The firm also observes previous events for potential customers who did not purchase.
Then the attribution is done via training a model on data with binary “convert” / “don’t convert”
target variables. However, in the context of a service where subscribers must join in order to
consume content, all observed outcomes are “convert.” For example, marketers can observe ad
exposure for both converters and non-converters, but subscription services cannot easily observe
the behavior of non-subscribers, i.e. what specific content they would have consumed if they had
decided to subscribe. Hence, launch attribution faces a severe data censoring problem which makes
the marketing attribution literature challenging to adapt to our context.
Second, randomized experiments tend to be impractical for determining the incremental impact
of content launches, since subscription services generally launch each piece of content only once,
and after being launched, it can be accessed and consumed by all subscribers. Inability to use
randomized experiments makes causal identification and validation challenging.
Third, subscription services face the additional hurdle that, once a member has subscribed, all
content becomes costless to consume. This means that subscribers may quickly consume a large
amount of content, including many pieces of content that were unrelated to their decision to join,
making it difficult to pick out which particular content, if any, was pivotal to that decision.
This paper introduces the acquisition impact model (AIM), a model that uses observational
data to estimate the number of sign ups caused by a piece of content at its launch. AIM combines
the intuition behind first touch attribution – that members quickly consume the content that has
caused them to sign up – with a new methodology that compares cohorts of new members based
on their sign up date relative to the content launch date. Specifically, we use the pre-launch sign
up cohort as a covariate-adjusted control group for the post-launch cohort, and compare their
consumption behavior in order to tease out the number of incremental sign ups that exist due
to the new content launch. Under the assumptions that a) all incremental sign ups quickly start
consuming the content they signed up for, and b) the underlying preferences of new subscribers
who are non-incremental for the newly launched content are smooth around the launch date, our
method delivers plausibly-causal estimates.
The contributions of the paper are to (1) present a scalable model to estimate the number of sign
ups caused by new content at its launch, (2) propose a framework to isolate specific subscribers
that signed up for the new content, and (3) propose several ways to validate the methodology using
external sources of truth.
The paper is organized as follows. Section 2 shows how, under plausible assumptions, estimating
incremental subscribers is equivalent to estimating the non-incremental consumption rate,
then shows how we might adjust the stream rate to account for differences in covariates between
populations. Section 3 describes a methodology that can be used to distinguish subscribers who signed
up for the new content from those who streamed the content due to seasonality factors, higher user
activity, or product promotion. Section 4 proposes several methodologies we could use to validate
the estimates. Section 5 reviews the attribution literature, and Section 6 concludes.
2 Estimation Methodology
This section sets up AIM and shows that if we assume incremental subscribers always consume the
content that causes them to subscribe upon signing up, we can solve for the number of incremental
new subscribers using the non-incremental consumption rate. We then discuss covariates that may
influence the probability of consumption among non-incremental subscribers, and how we can alter
covariate distributions to get adjusted consumption rates.
2.1 Setup
Suppose we observe a population Nt of new subscribers on a particular day t:
$$N_t = N_{\text{Catalog}} + \sum_{j \in A} N^{+}_{jt}. \qquad (2.1)$$
$N_t$ is the sum of all subscribers who joined the subscription service specifically because of content
$j$, or because of the subscription service as a whole, which we indicate by catalog-driven sign ups.
A subscriber who joined due to content $j$ is “incremental for content $j$” while a subscriber who
joined for other reasons is “non-incremental for content $j$”. The sum of all incremental subscribers
for $j$ on day $t$ is $N^{+}_{jt}$, while the sum of all non-incremental subscribers is $N^{-}_{jt}$. We wish to estimate
$N^{+}_{jt}$ but observe only $N_t$.
Information on the consumption behavior of subscribers can help to isolate $N^{+}_{jt}$. Suppose the
probability that a given subscriber $i$ signing up on day $t$ consumes content $j$ can be defined by the
function $f(\cdot)$:
$$p_{ijt} = f(X_{ijt})$$
where pijt is the probability of a binary consumption outcome calculated over an appropriate time
window, and Xijt is a set of covariates capturing the underlying preference of subscriber i toward
content j, how content j is shown to subscriber i in the product, the age of subscription i at the
time content j becomes available to i, and subscriber activity over the appropriate time window
defined for the consumption of content $j$. The set of subscribers who consume content $j$ is denoted
$S_{jt}$, and the sets of incremental and non-incremental subscribers for content $j$ are denoted by $S^{+}_{jt}$ and
$S^{-}_{jt}$, respectively.
Assumption 1. If subscriber i is incremental to content j, then pijt = 1.
The reasoning is that for content to be pivotal to the decision to join, the subscriber must value it
and be aware of it before joining. Salient, valued content is likely to be consumed. This assumption
helps identify the data elements needed to estimate $N^{+}_{jt}$. Define the number of consumers of content
$j$ who sign up on day $t$ by $S_{jt}$, then because $p_{ijt} = 1$ for incremental subscribers, we have:
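$$
\begin{aligned}
S_{jt} &= \sum_{i \in S_{jt}} p_{ijt} \\
S_{jt} &= \sum_{i \in S^{+}_{jt}} p_{ijt} + \sum_{i \in S^{-}_{jt}} p_{ijt} \\
S_{jt} &= N^{+}_{jt} + \sum_{i \in N^{-}_{jt}} p_{ijt} \\
S_{jt} &= N^{+}_{jt} + N^{-}_{jt} \times \bar{p}_{ijt} \\
S_{jt} &= N^{+}_{jt} + \left(N_t - N^{+}_{jt}\right) \times \bar{p}_{ijt} \\
N^{+}_{jt} &= \frac{S_{jt} - N_t \times \bar{p}_{ijt}}{1 - \bar{p}_{ijt}}. \qquad (2.2)
\end{aligned}
$$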
In other words, given Assumption 1 we can recover $N^{+}_{jt}$ from observable data $N_t$ and $S_{jt}$ by estimating
the average consumption probability for non-incremental subscribers $N^{-}_{jt}$, i.e. $\bar{p}_{ijt}$. Because
we don’t know which subscribers are non-incremental we can’t directly observe $\bar{p}_{ijt}$, which means
we cannot directly solve for $N^{+}_{jt}$. However, we can make inferences about $\bar{p}_{ijt}$ using information
from other populations of non-incremental subscribers.
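As a minimal sketch of Equation (2.2), the function below maps the observed counts and an assumed non-incremental consumption rate to an incremental sign up estimate. The function name, argument names, and the input values are hypothetical and chosen only for illustration.

```python
def incremental_signups(consumers: float, new_subscribers: float, p_bar: float) -> float:
    """Equation (2.2): N_plus = (S - N * p_bar) / (1 - p_bar).

    consumers       -- S_jt, new subscribers on day t who consumed content j
    new_subscribers -- N_t, all new subscribers on day t
    p_bar           -- assumed average consumption rate of non-incremental subscribers
    """
    if not 0.0 <= p_bar < 1.0:
        raise ValueError("p_bar must lie in [0, 1)")
    return (consumers - new_subscribers * p_bar) / (1.0 - p_bar)


# Hypothetical inputs, not taken from the paper.
estimate = incremental_signups(consumers=120, new_subscribers=400, p_bar=0.25)
print(round(estimate, 2))
```

The estimate can be checked against the derivation: the incremental subscribers plus the non-incremental subscribers times the baseline rate should reproduce the observed number of consumers.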
Assumption 2. For the cohort $B_j$ of subscribers joining in the $T_0$-day period prior to the launch of content $j$,
$B^{+}_j = 0$ and $B^{-}_j = B_j$.
We can use $B_j$ in several ways. First, if $B_j$ and $S^{-}_{jt}$ have the same distribution of covariates, i.e.
$X_{ijt}$, then we can simply use the average stream rate for subscribers in $B_j$ to estimate $\bar{p}_{ijt}$:
$$\bar{p}_{ijt} = \frac{\sum_{i \in B_j} p_{ijt}}{|B_j|}$$
and then use (2.2) to estimate $N^{+}_{jt}$.
However, the assumption that $B_j$ and $S^{-}_{jt}$ have the same distribution of covariates $X_{ijt}$ is unlikely
to hold in practice. When pre-launch and post-launch cohorts differ in important ways other than
the presence of incremental subscribers, we can make adjustments to account for these differences.
For instance, pre-launch subscriptions are necessarily older than post-launch subscriptions. If
subscription age affects the probability of consumption, we can learn this relationship and use it to
adjust for age differences between the cohorts. Similarly, if there are differences between cohorts
in how content is presented in the product, or in subscriber activity upon sign up, we can adjust
for those differences as well.
As before, there is a relationship between pijt and various covariates such as subscriber age,
subscriber activity, content promotion, and underlying preference given by
pijt = f (Xijt ).
Unlike preference toward content j – which is not directly observable but which for non-
incremental subscribers is assumed to be continuous across the content launch boundary – age,
subscriber activity, and promotion are observable to us. We can adjust for them together by
estimating a model that explains the relationship between these factors and pijt :
We can train such a model using the pre-launch cohort on the pool of all available content launches.
Model (2.3) provides an adjusted consumption probability for $B_j$ that simulates the probability of
consumption if $B_j$ had the same joint distribution of age, subscriber activity, and promotion as $S^{-}_{jt}$.
Finally, we substitute this adjusted consumption rate into (2.2) to solve for $N^{+}_{jt}$.
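To illustrate the idea of re-weighting the pre-launch cohort to a different covariate mix, the sketch below uses simple post-stratification over discrete strata. This is a stand-in technique chosen for illustration, not the model referred to as (2.3). It also assumes that the covariate mix of the whole post-launch cohort is an acceptable proxy for that of the non-incremental subscribers. All names and data are hypothetical.

```python
from collections import Counter, defaultdict


def adjusted_baseline_rate(pre_launch, post_launch_strata):
    """Re-weight pre-launch consumption rates to the post-launch covariate mix.

    pre_launch         -- list of (stratum, consumed) pairs, consumed being 0 or 1
    post_launch_strata -- list of stratum labels, one per post-launch subscriber
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for stratum, consumed in pre_launch:
        totals[stratum] += 1
        hits[stratum] += consumed

    weights = Counter(post_launch_strata)
    missing = set(weights) - set(totals)
    if missing:
        raise ValueError(f"no pre-launch data for strata: {sorted(missing)}")

    n_post = sum(weights.values())
    # Weighted average of per-stratum pre-launch rates, using post-launch shares.
    return sum(weights[s] / n_post * hits[s] / totals[s] for s in weights)


# Hypothetical cohorts stratified by activity level.
pre = [("low", 1), ("low", 0), ("low", 0), ("low", 0),
       ("high", 1), ("high", 1), ("high", 0), ("high", 0)]
post = ["low", "low", "low", "high"]

unadjusted = sum(c for _, c in pre) / len(pre)
adjusted = adjusted_baseline_rate(pre, post)
print(unadjusted, adjusted)
```

In this toy data the post-launch cohort skews toward low-activity subscribers, so the adjusted baseline rate is lower than the raw pre-launch average. A regression or other learned model over continuous covariates would play the same role.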
Theoretically, we could extend the post-launch window indefinitely in order to capture more
long-tail incremental subscribers. The limiting factors are the comparability of new subscribers
to an increasingly-distant pre-launch cohort and the fact that measurement error in the tail may
overwhelm the trickle of true incremental subscribers.
2.3 Example
To be concrete, imagine a streaming music service that releases a new album A. We wish to estimate
how many new subscribers joined the service specifically because of the launch of A. We define a
subscriber “consuming” A as streaming more than a certain percentage of the album within a
certain window of the album becoming available to that subscriber. For simplicity imagine that
promotion of A is the same for all subscribers, and that the only reason different cohorts of new
subscribers may differ in their consumption rate of A is the presence or absence of incremental
subscribers who specifically joined the service in order to stream A.
For new subscribers who joined prior to the launch of A, the first opportunity to stream A was
the day of A’s launch. For new subscribers who joined post-launch, the first opportunity to stream
A was the first day of their new subscription. Figure 1 charts the consumption rate for subscribers
who joined on different dates. We observe the consumption rate spike for subscribers joining post-
launch. In our toy example, because covariates are assumed to be balanced for cohorts across the
date of launch, this increase in consumption must derive from the presence of subscribers who
joined because of A and stream A with probability 1, rather than baseline probability pijt ≈ 0.2.
Figure 2 plots the number of new subscribers each day. On launch day 1000 new subscribers
joined the service, of whom 500 streamed the new album. Using the pre-launch cohort, we estimate
that the streaming rate among non-incremental new subscribers is 20%.
We can then use Equation (2.2) to solve for $N^{+}_{jt}$:
$$N^{+}_{jt} = \frac{S_{jt} - N_t \times \bar{p}_{ijt}}{1 - \bar{p}_{ijt}}$$
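As a worked check, the toy numbers stated above ($S_{jt} = 500$, $N_t = 1000$, $\bar{p}_{ijt} = 0.2$) can be substituted into Equation (2.2). The arithmetic below is our own substitution, shown only to make the example concrete:

```latex
N^{+}_{jt} = \frac{S_{jt} - N_t \times \bar{p}_{ijt}}{1 - \bar{p}_{ijt}}
           = \frac{500 - 1000 \times 0.2}{1 - 0.2}
           = \frac{300}{0.8}
           = 375
```

Under these inputs, 375 of the 1000 launch-day sign ups would be attributed to album A. The remaining 625 non-incremental subscribers, streaming at the 20% baseline rate, account for the other 125 streams.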
Figure 1: Simulated non-incremental streaming rate as a function of join date in a toy example
Figure 2: Simulated new subscriber count as a function of join date in a toy example
Figure 3: Estimated incremental acquisition for Album A
3 Subscriber-Level Attribution
The previous section explained how the total number of sign ups (“acquisition impact”) caused by
a content launch can be estimated by forming a synthetic control. However, this process only yields
an aggregate acquisition count – it does not tell us which particular subscribers, among those who
consumed the content, were most likely to have signed up because of the launch. Subscriber-level
attribution is necessary to answer any questions about the attributes, preferences, and subsequent
activity of those who are drawn in by the content. For instance, such attribution is necessary to
study users who signed up for specific content but later churned due to deficiency in similar content
that might have retained them.
In this section, we describe a framework for subscriber-level attribution. We start by discussing
an important property of the incremental subscribers.
Lemma 3. Let the non-incremental consumption probability of subscriber $i$ and content $j$ estimated
by (2.3) be $\hat{p}_{ijt}$, then we have:
$$\frac{\partial P(\text{subscriber } i \text{ being incremental to content } j)}{\partial \hat{p}_{ijt}} \le 0.$$
Figure 4: (a) Each subscriber consumes a different number of contents upon signing up. Numbers
reported in the graph are 1 − p̂ijt for a hypothetical example. (b) Ranked order incremental
likelihood attribution example for Content 1.
The proof is in Appendix 8.1. Lemma 3 states that for subscribers who consume content j, the
probability of being incremental to content j is decreasing in the baseline probability of consumption
p̂ijt . This helps us to find a rank order to better identify the possible incremental subscribers for
each piece of content, e.g. we can use 1 − p̂ijt to rank subscribers according to their probability of
being incremental for j.
Note that we explicitly incorporate subscriber level features when estimating p̂ijt , such as subscriber
activity. For example, if p̂ijt is increasing with respect to subscriber activity, then it is more
plausible to attribute low-activity subscribers as incremental subscribers of content j, i.e. new sign
ups with higher activity (who consume many pieces of content) will have a lower likelihood of being
incremental for any given piece of content that they consume.
After estimating the $1 - \hat{p}_{ijt}$ for subscriber-content pairs (Figure 4-a), we can limit the number
of incremental subscribers being attributed to each piece of content by $N^{+}_{jt}$. There are various ways
in which we can rank different subscribers for each piece of content. For example, if the aggregate
model estimates two incremental acquisitions for content 1 on a particular day, and there are three
subscribers consuming content 1 upon signing up, we can choose the two subscribers with the
highest affinity as the incremental subscribers for content 1 (Figure 4-b). Another possibility is
to decay 1 − p̂ijt proportionally to the order that each piece of content was consumed by each
subscriber or rank order of 1 − p̂ijt for each subscriber. More specifically, if a subscriber consumes
more than one newly launched content upon signing up, we can reward content that was consumed
sooner after the sign up event, or we can penalize contents with lower 1 − p̂ijt probability.
The main advantage of the above approaches is simplicity. However, these approaches attribute
subscribers independently across pieces of content, and don’t take into account the fact that a
subscriber might be already attributed to one piece of content while attributing it to another one.
For this reason, subscribers may be subject to multiple attribution.
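The simple rank-order rule described above can be sketched as follows. The scores stand for 1 − p̂ijt, the quotas stand for the aggregate estimates (assumed here to be rounded to integers), and all names and values are hypothetical rather than taken from Figure 4.

```python
def rank_order_attribution(scores, quotas):
    """Pick, for each content, the subscribers with the highest 1 - p_hat.

    scores -- {content: {subscriber: 1 - p_hat}} for subscribers who consumed it
    quotas -- {content: number of incremental subscribers to attribute}
    """
    attributed = {}
    for content, by_subscriber in scores.items():
        # Highest score first; ties broken by subscriber id for determinism.
        ranked = sorted(by_subscriber, key=lambda s: (-by_subscriber[s], s))
        attributed[content] = ranked[: quotas.get(content, 0)]
    return attributed


def multiply_attributed(attributed):
    """Subscribers attributed to more than one piece of content."""
    counts = {}
    for subscribers in attributed.values():
        for s in subscribers:
            counts[s] = counts.get(s, 0) + 1
    return sorted(s for s, n in counts.items() if n > 1)


# Hypothetical scores for two pieces of content.
scores = {
    "content_1": {"A": 0.9, "B": 0.6, "C": 0.8},
    "content_2": {"A": 0.7, "D": 0.5},
}
quotas = {"content_1": 2, "content_2": 1}

result = rank_order_attribution(scores, quotas)
print(result)
print(multiply_attributed(result))
```

Because each piece of content is ranked independently, subscriber A is selected for both pieces of content in this toy case, which is the multiple-attribution issue noted above.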
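$$\min_{x, y} \; \sum_{i} y_i - \lambda \sum_{i,j} (1 - \hat{p}_{ijt})\, x_{ij} \qquad (3.4)$$
subject to:
$$\sum_{i \in S_{jt}} x_{ij} = N^{+}_{jt}, \quad \forall j \qquad (3.5)$$
$$\sum_{j \in T_i} x_{ij} \le 1 + M y_i, \quad \forall i \qquad (3.6)$$
$$\sum_{j \in T_i} x_{ij} \ge 2 - M (1 - y_i), \quad \forall i \qquad (3.7)$$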
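To make the trade-off in (3.4) concrete, the sketch below enumerates every feasible assignment on a tiny instance instead of calling a mixed-integer solver. It rests on our reading of the symbols, which is an assumption: $x_{ij} = 1$ when subscriber $i$ is attributed to content $j$, $y_i = 1$ when subscriber $i$ is attributed to two or more pieces of content, $T_i$ is the set of content consumed by $i$, and $M$ is a large constant. Since $y_i$ is then determined by $x$, the enumeration computes it directly. All data are hypothetical.

```python
from itertools import combinations, product


def solve_attribution(scores, quotas, lam):
    """Brute-force the objective (3.4) subject to the quotas (3.5).

    scores -- {content: {subscriber: 1 - p_hat}} for subscribers who consumed it
    quotas -- {content: exact number of subscribers to attribute}
    lam    -- weight on the total attributed score
    """
    contents = sorted(scores)
    # Every way to pick exactly `quota` consumers for each content.
    choices = [list(combinations(sorted(scores[c]), quotas[c])) for c in contents]

    best_value, best_assignment = None, None
    for picks in product(*choices):
        load = {}
        for subscribers in picks:
            for s in subscribers:
                load[s] = load.get(s, 0) + 1
        multi = sum(1 for n in load.values() if n >= 2)  # plays the role of sum_i y_i
        gain = sum(scores[c][s] for c, subs in zip(contents, picks) for s in subs)
        value = multi - lam * gain
        if best_value is None or value < best_value:
            best_value, best_assignment = value, dict(zip(contents, picks))
    return best_value, best_assignment


scores = {
    "content_1": {"A": 0.9, "B": 0.6, "C": 0.8},
    "content_2": {"A": 0.7, "D": 0.5},
}
quotas = {"content_1": 2, "content_2": 1}

_, low_lambda = solve_attribution(scores, quotas, lam=1.0)
_, high_lambda = solve_attribution(scores, quotas, lam=10.0)
print(low_lambda)
print(high_lambda)
```

With a small λ the search avoids attributing subscriber A twice and assigns D to the second piece of content. With a large λ the score term dominates and A is attributed to both. Enumeration is only practical for toy sizes, so a real instance would need a proper integer programming solver.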
Figure 5 (b) plots the trade-off between the two conflicting objectives defined in (3.4). We simulate
the consumption behavior of 10K new subscribers and assume that each of them has access to more
than 1K pieces of content on a subscription service. We randomly choose a set of content that will
be consumed by each subscriber on the sign up date and generate pijt for each subscriber / content
pair based on a uniform distribution. In this example, 60% of subscribers consume more than one
piece of content after the sign up event (Figure 5 (a)). One can observe a diminishing return in the
reduction of 1:1 allocation. In particular, the allocation solution suggests a marginal improvement
over 1:1 allocation for solutions with less than 70.5% of average 1 − p̂ijt .
4 Validation Methods
The gold standard for validating model output is to compare it against causal estimates generated
by a randomized experiment. However, even when randomized experiments are infeasible we can
still make progress on validation. In this section, we describe several non-experimental methods
that can be used to tune the model and build confidence in its output. We recommend using an
ensemble approach to validation, as over-reliance on any one approach may lead to overfitting.
Figure 5: (a) Frequency of the number of contents consumed by a subscriber upon signing up in
the simulated example. (b) Trade-off between the reduction in percentage of 1:1 allocation and
average baseline probability of subscribers attributed across contents.
4.3 Other Methods
For the biggest content launches, there may be a clearly-visible spike in the aggregate number of
sign ups. We can use such situations to evaluate how much of the sign up spike is captured by
the model, and whether daily estimates are consistent with the aggregate spike. Close alignment
between the model and the spike for the biggest pieces of content can build confidence in estimates
for smaller pieces of content.
We can also evaluate how the model behaves when external events impact the aggregate sign up
pattern. For example, when COVID-19 hit, many streaming platforms saw a surge in acquisition.
A well-tuned model will not attribute the rise in sign ups to whichever content happened to launch
at the time, but instead will pass the sign up spike into the residual.
Another source of truth that may be available in certain circumstances is content-based acquisition
marketing experiments. Suppose a subscription service runs a campaign for a specific piece of
content targeted at non-members, and exposure to the campaign is randomized. The randomized
campaign will provide us with a causal lift estimate. For a sufficiently powered experiment, we can
use AIM to estimate the acquisition impact separately for members of the treatment and control
groups of the experiment. This will yield distinct impact estimates for each group. The difference
between these group estimates, if the model has performed well, will equal the experimental lift
estimate multiplied by the size of the treatment group.
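The comparison described above can be sketched as a small consistency check. We assume here that the lift is expressed as an absolute increase in sign up probability per targeted non-member and that the two arms are of equal size. Neither assumption is stated in the text, and the function name and numbers are hypothetical.

```python
def experiment_consistency(aim_treatment, aim_control, lift_per_user, treatment_size):
    """Compare the AIM gap between arms with the experimental lift.

    Returns (model_gap, experimental_gap, discrepancy), where
    model_gap        = AIM estimate for treatment minus AIM estimate for control
    experimental_gap = lift per targeted user times treatment group size
    """
    model_gap = aim_treatment - aim_control
    experimental_gap = lift_per_user * treatment_size
    return model_gap, experimental_gap, model_gap - experimental_gap


# Hypothetical numbers for illustration only.
model_gap, experimental_gap, discrepancy = experiment_consistency(
    aim_treatment=1300, aim_control=1000, lift_per_user=0.002, treatment_size=150_000
)
print(model_gap, experimental_gap, discrepancy)
```

A discrepancy near zero relative to the experiment's confidence interval would be consistent with the model performing well. A large one would suggest revisiting the control cohort definition or other model inputs.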
5 Literature Review
The majority of prior attribution research focuses on the problem of marketing attribution. A
customer purchases a product and the firm wishes to attribute that purchase to some combination
of previous events, such as ad impressions. The firm uses cookies or other forms of tracking to
observe the timing and nature of previous events to which the customer was exposed. These
observations may be incomplete, and may not include the totality of previous events that may
have influenced the action. The firm also observes previous events for potential customers who did
not purchase. The simplest form of attribution model is rules-based. “Last touch” attribution,
a form of rules-based attribution in which the entire purchase is attributed to the most recent
touchpoint, gained early popularity not because it was correct, but because it was easy to track
on the basis of the referring URL. Berman (2018) shows that last touch attribution incentivizes
inefficient oversupply of ad impressions, due to competition among advertisers [3].
An alternative approach is to observe a corpus of data on touchpoints and conversions, then
train a model to determine how much weight to assign each touchpoint. Shao and Li (2011) [12]
develop a multi-touch attribution model of this type. They employ a bagged logistic regression
approach in which they train sub-models on subsets of the data, validate misclassification rates
against holdout data, then aggregate up into a final model. The bagged approach gives their final
model more stable coefficients, which they value for the sake of interpretability by advertisers.
Abhishek et al. (2015) [1] take a different approach, training a Hidden Markov Model (HMM)
to simulate the customer’s journey through the conversion funnel, moving from low initial state
(“unawareness”) to high final state (“purchase”). The HMM allows them to model how the type
and timing of ad exposure alters the transition probabilities between states, which they then use to
attribute credit to ads. The authors attempt to use an IV method to account for endogeneity, but
ultimately abandon it. Li and Kannan (2014) [8] take a related approach by building a hierarchical
Bayesian decision model with discrete steps. After estimating the model on hospitality industry
data, the authors validate the estimates using a one-week period during which, unlike the training
data, paid search was completely turned off. They find that the actual conversion drop (-6.6%) is
smaller than the predicted conversion drop (-7.8%), suggesting their model did not fully account
for substitution across channels. Another data-driven approach is Media Mix Modeling (MMM).
MMM tends to use historical data on spend levels and outcomes that is aggregated by time and
geography, and runs estimations on these aggregates to tease out relationships. Chan and Perry
(2017) [5] write about the challenges of MMM, particularly its inability to account for selection
bias and endogenous variation in spend. Chen et al. (2018) [6] examine one method to correct for
selection bias due to ad targeting in MMM by employing detailed search query data, and show this
method leads to improvement in estimates as benchmarked by external experimental data.
Data-driven models tend to be an improvement over rules-based models, in that their attribution
is rooted in empirical evidence. However, lack of exogenous variation means they fall short of
capturing the true causal effect of their touchpoints. Ultimately, attribution is a question of causation.
The correct value to attribute to an ad is the lift in conversion probability caused by showing
the ad, versus the counterfactual of not showing it. For this reason, Randomized Controlled Trials
(RCTs) have become a cornerstone of marketing science at Netflix and elsewhere. Sapp et al.
[11] proposed a methodology to simulate a complex causal relationship, to test methods for causal
marketing attribution in the absence of RCTs. The Digital Advertising System Simulation (DASS)
is a framework for generating simulated customer browsing and ad-viewing histories. DASS is a
non-stationary Markov model with three parts: 1) a user path model, 2) an ad serving model, and
3) an ad impact model. Singh et al. [13] applied DASS simulation to evaluate five observational
estimators: first touch, last touch, linear, matched-pairs data-driven attribution (MP-DDA), and the
“upstream data-driven attribution” (MUDDA) method developed in [10] and extended in [7]. They
find that across a variety of scenarios, MUDDA comes closest to ground truth. MUDDA is an
estimator that allows for ad impressions to affect not only conversion probabilities, but also subsequent
customer behavior such as search intensity or website visits. Matching on subsequent behavior in a
context where such behavior is influenced by the treatment, as is done by the MP-DDA estimator,
constitutes a form of post-treatment selection bias which is avoided in MUDDA.
6 Conclusion
Understanding acquisition drivers is an extremely important problem for subscription services, as
it can inform content buying decisions and prioritization in awareness markets, and improve
demand-creating marketing and adaptive decisioning. One force that can significantly drive
acquisition is the launch of new content on the service. Yet, the majority of existing literature
is limited to marketing attribution research which cannot be directly applied to the subscription
domain. In this work, we propose a methodology to better assess the acquisition value of content in
a subscription services business. Our model estimates the number of incremental sign ups caused
by a content launch. Our methodology combines the intuition behind first-touch attribution
(incremental subscribers consume a content quickly), with adjustments for product promotion,
subscriber activity and seasonality. Specifically, our methodology performs cohort analysis to isolate
incremental sign ups, using pre-launch new member cohorts as the control group.
In theory, our methodology can be extended to physical platforms such as gyms / clubs that
also offer membership oriented access for a fee. However, we think that the scale and frequency of
content launches make our proposed methodology more appropriate to digital platforms.
We introduce a complementary mixed-integer linear programming model to identify specific
subscribers who signed up for a content. Our model minimizes the number of subscribers attributed
to more than one content, while maximizing the average incremental likelihood of subscribers being
attributed across different contents. This attribution enables subscription services to better
understand the characteristics of subscribers who join for specific contents, to evaluate sign-up
dynamics at the content and country grain, and to gain deeper insights about content preferences for
new members, including their streaming journey on the subscription service.
Finally, we propose several methods that could be utilized to validate the estimates and calibrate
the model. We cannot directly validate our methodology as we never observe how many
people join for a specific content. However, we can use external sources of truth to validate our
proposed methodology to estimate the incremental subscribers. To validate the estimates, we can
measure the correlation between the estimates and external sources of truth such as estimates of
the subscriber impact of very large contents, and estimates based on acquisition marketing experiments.
If estimates obtained from our methodology are more strongly correlated with these sources than currently
existing proxies (e.g. first-touch attribution), this indicates our method is an improvement over
existing methodologies. We can also use external sources of truth to calibrate the estimates, either
by tweaking the definition of control cohort or other inputs to the model.
7 Acknowledgments
The authors wish to thank the many colleagues from Netflix whose contributions enhanced this
work, with special thanks to Manping Wang, Steve McBride and Minwoo Choi. We would also like
to thank Meghana Bhatt, Natali Ruchansky, Yves Raimond, Ashish Rastogi, and Phil Hebda for
their thoughtful review and guidance.
References
[1] Vibhanshu Abhishek, Peter Fader, and Kartik Hosanagar. Media exposure through the funnel:
A model of multi-stage attribution. Available at SSRN 2158421, 2012.
[2] Egon Balas. Disjunctive programming and a hierarchy of relaxations for discrete optimization
problems. SIAM Journal on Algebraic Discrete Methods, 6(3):466–486, 1985.
[3] Ron Berman. Beyond the last touch: Attribution in online advertising. Marketing Science,
37(5):771–792, 2018.
[4] Kay H. Brodersen, Fabian Gallusser, Jim Koehler, Nicolas Remy, and Steven L. Scott. Inferring
causal impact using bayesian structural time-series models. Annals of Applied Statistics, 9:247–
274, 2015.
[5] David Chan and Mike Perry. Challenges and opportunities in media mix modeling. 2017.
[6] Aiyou Chen, David Chan, Mike Perry, Yuxue Jin, Yunting Sun, Yueqing Wang, and
Jim Koehler. Bias correction for paid search in media mix modeling. arXiv preprint
arXiv:1807.03292, 2018.
[7] Joseph Kelly, Jon Vaver, and Jim Koehler. A causal framework for digital attribution.
Technical report, Google LLC, 2018.
[9] Lyle Ramshaw and Robert E Tarjan. On minimum-cost assignments in unbalanced bipartite
graphs. HP Labs, Palo Alto, CA, USA, Tech. Rep. HPL-2012-40R1, 2012.
[10] Stephanie Sapp and Jon Vaver. Toward improving digital attribution model accuracy.
Technical report, Google Inc., 2016.
[11] Stephanie Sapp, Jon Vaver, Minghui Shi, and Neil Bathia. Dass: Digital advertising system
simulation. Technical report, Google Inc., 2016.
[12] Xuhui Shao and Lexin Li. Data-driven multi-touch attribution models. In Proceedings of the
17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages
258–264, 2011.
[13] Kyra Singh, Jon Vaver, Richard E. Little, and Rachel Fan. Attribution model evaluation.
Technical report, Google LLC, 2018.
8 Appendices
8.1 Proof of Lemma 3
Proof. Let’s denote $A_{ij}$ as the event of subscriber $i$ being incremental to content $j$, $B_{ij}$ as the event
of subscriber $i$ streaming content $j$ upon signing up, and $A^{c}_{ij}$ and $B^{c}_{ij}$ as the complements of $A_{ij}$ and
$B_{ij}$, respectively. Based on Assumption 1, we can write:
$$P(B_{ij} \mid A_{ij}) = 1,$$
therefore:
$$P(A_{ij}) = P(A_{ij} \cap B_{ij}).$$
Using the pre-launch cohort, we can estimate $P(B_{ij} \mid A^{c}_{ij})$ using (2.3) as $\hat{p}_{ijt}$. We can write $P(B_{ij} \mid A^{c}_{ij})$
as
$$P(B_{ij} \mid A^{c}_{ij}) = \frac{P(B_{ij} \cap A^{c}_{ij})}{P(A^{c}_{ij})} = \frac{P(B_{ij}) - P(B_{ij} \cap A_{ij})}{1 - P(A_{ij})} = \frac{P(B_{ij}) - P(A_{ij})}{1 - P(A_{ij})}.$$
Solving for $P(A_{ij})$ and replacing $P(B_{ij} \mid A^{c}_{ij})$ with $\hat{p}_{ijt}$, we get
$$P(A_{ij}) = \frac{P(B_{ij}) - \hat{p}_{ijt}}{1 - \hat{p}_{ijt}},$$
which implies
$$\frac{\partial P(A_{ij})}{\partial \hat{p}_{ijt}} \le 0.$$
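The final implication can be checked by differentiating the expression for $P(A_{ij})$ directly. The step below is our own filling-in of the omitted calculus, treating $P(B_{ij})$ as fixed with respect to $\hat{p}_{ijt}$ and assuming $\hat{p}_{ijt} < 1$:

```latex
\frac{\partial}{\partial \hat{p}_{ijt}}
  \left[ \frac{P(B_{ij}) - \hat{p}_{ijt}}{1 - \hat{p}_{ijt}} \right]
= \frac{-(1 - \hat{p}_{ijt}) + \left(P(B_{ij}) - \hat{p}_{ijt}\right)}{(1 - \hat{p}_{ijt})^{2}}
= \frac{P(B_{ij}) - 1}{(1 - \hat{p}_{ijt})^{2}}
\le 0
```

The inequality holds because $P(B_{ij}) \le 1$ and the denominator is positive.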