SSRN Id4269703
April 2023
1 Data
Our main analysis uses data on patent content, citations, and attributes. Our main
sample covers all utility patents granted by the U.S. Patent Office (USPTO) from 1976
through 2021. This allows for at least a 20-year follow-up history, extending through the
patent’s expiration.
Our sample of patents comes from the USPTO’s Patent Examination Research Database
(PatEx) dataset. In our main analyses, we study the quality of the patents through the
lens of patent abstracts, as they provide a clear and concise text-only summary of the
core contribution of the patent. Importantly, this is the key text input into the C-BERT
model. Using the abstract of patents presents several key advantages over using the full
body. First, use of abstracts alleviates concerns about differences in the quality of the
figures contained within the patents that could substitute for the quality of the writing.
Second, abstracts are a good proxy for the contents of a patent as well as what inventors
and examiners review. Third, from a practical standpoint, using the full text of the
patents is computationally prohibitive. Even with access to a high-powered computing
cluster, using abstracts in our setting takes several days to complete.
In robustness tests, however, we reproduce our analysis using the full text of the
patents, and substituting the LONGFORMER embedding for BERT, as BERT does not
scale for long texts. All our results remain qualitatively similar (we intend to include
these results in more detail in a future draft).
Importantly, there is a key difference between patent citation counts and actual patent
quality. While historically forward citations have been used as a proxy for patent quality,
the key point of our analysis is to determine whether this measure is systematically biased
downwards for female-authored patents. We therefore distinguish between patent
citations (the easily observed outcome for a patent) and patent quality, for which we
mediate using the text of the patent. Patent forward citation counts are obtained from
USPTO data.
Of course, patent forward citations are highly skewed in their distribution, with only
a small fraction of patents receiving the bulk of all citations.
Our main treatment variable is the gender of the lead inventor (first author).5 Inventors,
however, do not disclose their gender when applying for a patent. Because of this, we
must infer gender from third-party sources (Graham, Marco, and Miller, 2018). To dis-
ambiguate the gender of the inventor, we implement a name disambiguation algorithm
similar to that of Desai (2019). We use the first name of the lead inventor to identify the
gender of the inventor (Tzioumis, 2018).
Starting with the PatentView data, we obtain the first names of each inventor of each
patent. For patents with multiple inventors, we rely on the name of the first inventor due
to that person’s prominence. Next we classify the gender of patent inventors using state-
level data on the frequency of names obtained from the Social Security Administration
(SSA) (Comenetz, 2016). We assign a gender when the percentage of names in the state
belonging to that gender is above 70%.6 If the first name does not match the SSA dataset,
our second step uses a similar process but utilizing a cross-country dataset from the
World Intellectual Property Organization (WIPO) (Martinez, Raffo, Saito, et al., 2016).
We drop patents when there is no distinct gender determination for the lead inventor.
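The two-step assignment rule above can be sketched as follows. This is a hypothetical simplification: the actual procedure uses state-level SSA frequencies and the WIPO cross-country table, while the names, shares, and the `infer_gender` helper here are purely illustrative. The 70% confidence cutoff matches the rule in the text.

```python
# Toy name-frequency tables: first name -> share of bearers who are female.
SSA_FREQ = {"mary": 0.99, "james": 0.01, "taylor": 0.55}   # stand-in for SSA data
WIPO_FREQ = {"taylor": 0.50, "andrea": 0.72}               # stand-in for WIPO data

THRESHOLD = 0.70  # confidence cutoff from the text

def classify(name, freq):
    share_female = freq.get(name.lower())
    if share_female is None:
        return None                      # name not covered by this source
    if share_female >= THRESHOLD:
        return "female"
    if share_female <= 1 - THRESHOLD:
        return "male"
    return None                          # ambiguous: below the confidence cutoff

def infer_gender(first_name):
    # Step 1: SSA frequencies; step 2: fall back to the WIPO dataset.
    for table in (SSA_FREQ, WIPO_FREQ):
        g = classify(first_name, table)
        if g is not None:
            return g
    return None  # patent dropped: no distinct gender determination
```

For instance, a name that is ambiguous in both sources (such as "Taylor" in this toy table) returns `None` and the patent is dropped, while a name missing from the first source can still be resolved by the second.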
One challenge is that our sample shows that women are underrepresented as inven-
tors on patents (Hunt et al., 2013). As a result, we need to balance our sample across
patents with lead inventors from each gender. To do this, we use all patents with a fe-
male lead inventor and extract a random subsample of patents with male lead inventors
of the same size. We then estimate a propensity model using a one-layer logit-linear specification.
5 In further robustness, we consider single-author patents and the gender of the entire team.
6 We take a conservative approach and apply a high confidence threshold to reduce Type I errors when
identifying males and females.
Typically, patent applications include a list of related patents and supporting material.
Citations to patents may be added in two ways. First, inventors cite precedent patents in
their applications. Second, examiners will identify additional citations that are missing
from the patent and request that these be included (Farre-Mensa, Liu, and Nickerson,
2022). Starting in 2001, and more clearly since 2003, the USPTO discloses whether the
citation originated from the examiner or the inventor. For the analysis of the source of a
citation, we construct separate citation counts that record only the citations explicitly
added by examiners and, respectively, only those added by inventors.
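As a rough sketch of these source-specific counts, assuming hypothetical `(cited_patent, source)` records rather than the actual USPTO schema:

```python
# Split forward-citation counts by whether the citing reference was added by
# the inventor or the examiner (the post-2001 flag described in the text).
from collections import Counter

citations = [  # (cited_patent, source) pairs gathered from later applications
    ("P1", "inventor"), ("P1", "examiner"), ("P1", "examiner"),
    ("P2", "inventor"),
]

by_source = {"inventor": Counter(), "examiner": Counter()}
for cited, source in citations:
    by_source[source][cited] += 1

# In this toy example, P1 receives 1 inventor-added and 2 examiner-added
# forward citations, and P2 receives 1 inventor-added forward citation.
```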
When an inventor files a patent application with the USPTO, the application is assigned
a USPC class and subclass based on its field of technology. The application is then
assigned to an “art unit” composed of several examiners who specialize in that particular
technology class and subclass. We use the art unit to which the patent is assigned as one of our fixed effects.
7 All our results remain qualitatively similar in nature and stronger in magnitude if we do not exclude
patents whose author gender can be clearly identified from the text content alone.
2 Empirical Strategy
Our analysis presents both methodological and computational challenges. First, we must
represent complex and often subtle differences in the text of the patents in a parsimo-
nious and computationally useful form. Second, we need to relate that text to forward citations.
8 Note, the NBER patent categories are truncated at the end of our sample.
There are a variety of possible approaches to transform text into numerical form. Here
we use a Bidirectional Encoder Representations from Transformers (BERT) approach to
transform the text of each patent into a high-dimensional numerical vector. Developed
by Google (Devlin, Chang, Lee, and Toutanova, 2018), BERT has become the leading ap-
proach in many commercial applications, including Google’s search platform. BERT con-
structs embedding vectors that are numerical representations of the text, which preserve
both the meaning of individual words and the underlying context of each word.9 The
BERT encoder module (Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, and
Polosukhin, 2017) produces a 768-dimensional embedding vector that represents the
text of each patent’s abstract. We describe the encoder
architecture in detail in Appendix A.
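To fix ideas on the shape of these objects, a schematic numpy sketch follows. It stands in for the pretrained encoder and does not reproduce BERT; the random token vectors are placeholders for the encoder's contextual outputs, and mean pooling over tokens is one common way to obtain a single abstract-level vector.

```python
# Schematic shape of a BERT-style representation: each token of the abstract
# gets a 768-dimensional contextual vector, which can be pooled into a single
# fixed-length vector per abstract.
import numpy as np

rng = np.random.default_rng(0)
n_tokens, dim = 120, 768          # tokenized abstract length, BERT hidden size

token_vectors = rng.normal(size=(n_tokens, dim))  # stand-in for encoder output
abstract_embedding = token_vectors.mean(axis=0)   # one vector per abstract

assert abstract_embedding.shape == (dim,)
```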
Q_i(t, Z_i) = E(Y_i(t) | Z_i),        g(Z_i) = P(T_i = 1 | Z_i) = (g ∘ f)(W_i),

where Y_i(0) and Y_i(1) denote the potential outcomes of the i-th patent. In our case,
these potential outcomes are the number of forward citations. Given these mappings
represented by neural networks, we can then estimate the average treatment effect (ATE)
and the average treatment effect on the treated (ATT) using the following equations for
a set of N patents.
ATE = (1/N) ∑_{i=1}^{N} [E(Y_i(1) | Z_i) − E(Y_i(0) | Z_i)] = (1/N) ∑_{i=1}^{N} [Q_i(1, Z_i) − Q_i(0, Z_i)]

ATT = (∑_{i=1}^{N} T_i)^{-1} ∑_{i=1}^{N} T_i [E(Y_i(1) | Z_i) − E(Y_i(0) | Z_i)] = (∑_{i=1}^{N} T_i)^{-1} ∑_{i=1}^{N} T_i [Q_i(1, Z_i) − Q_i(0, Z_i)]
The resulting output of our C-BERT model is the actual outcome and a counterfactual
outcome. In our application, this is the number of citations and the estimated number
of citations the opposite gender would have received.
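Given fitted predictions from the two citation networks, the ATE/ATT aggregation above reduces to simple averages. A minimal numpy sketch with made-up predictions:

```python
# Q1 and Q0 stand in for the fitted conditional-outcome predictions
# Q_i(1, Z_i) and Q_i(0, Z_i); T is the treatment indicator
# (1 = female lead inventor). Values are illustrative only.
import numpy as np

Q1 = np.array([4.0, 6.0, 5.0, 7.0])      # predicted citations if treated
Q0 = np.array([5.0, 8.0, 6.0, 9.0])      # predicted citations if untreated
T  = np.array([1, 0, 1, 0])

ate = np.mean(Q1 - Q0)                   # average treatment effect
att = np.sum(T * (Q1 - Q0)) / np.sum(T)  # average effect on the treated

# Here ate = -1.5 and att = -1.0.
```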
There are three assumptions that the econometrician must consider when applying C-
BERT.
The first necessary condition is that the text of the documents must render the effect
identifiable. Said differently, the effect that the econometrician is measuring must be
measurable directly from the text. Similar to an exclusion restriction within other iden-
tification strategies, this cannot be formally tested. Instead, this condition must be in-
spected and potentially falsified by considering other channels. In the context of this
paper, the quality of the patent should be measurable by the content (text) of the patent
itself. Patent examiners read the text of the proposals to evaluate the novelty of patents
prior to granting a patent. As a result, this necessary condition is likely satisfied in our
context.
The second necessary condition is that the embedding method extracts semantically
meaningful text information relevant to the prediction of both treatment, T, and out-
come, Y. In our setting, this means that the embedding, a lower-dimensional representation
of the text, is sufficient to capture information about both the gender of the lead inventor and the quality of the patent.
To assess the quality of our embedding representations, we consider synthetic tests to
measure the accuracy of our model. To do this, we first compute the synthetic outcomes
of all of the patents across the full dataset. To do so, we draw a uniformly random
768-dimensional vector with values between 0 and 1 and take its dot product with each
patent’s 768-dimensional embedding. The resulting values are the synthetic outcomes
for female-led patents; for male-led patents, we add a known scalar. In this approach,
we know the true treatment effect by construction.
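A minimal numpy sketch of this synthetic construction follows. The sample size, seed, and scalar shift are illustrative, and the final check is the oracle version: the true effect is known by construction, so any estimator applied to these data can be benchmarked against it.

```python
# Synthetic-outcome test: outcomes are a random linear function of the
# 768-dimensional embedding, plus a known scalar shift for one group.
import numpy as np

rng = np.random.default_rng(42)
n, dim, true_effect = 1000, 768, 3.0

embeddings = rng.normal(size=(n, dim))     # stand-in patent embeddings
weights = rng.uniform(0.0, 1.0, size=dim)  # random linear transformation
treated = rng.integers(0, 2, size=n)       # group that receives the shift

base = embeddings @ weights                # synthetic outcome, untreated group
outcome = base + true_effect * treated     # add the known scalar for the other

# Oracle recovery of the group shift; an estimator applied to (outcome,
# treated, embeddings) should come close to `true_effect`.
recovered = (outcome - base)[treated == 1].mean()
assert abs(recovered - true_effect) < 1e-9
```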
Our third and final necessary condition is that the conditional outcome and propensity
score models be consistent. That is, the treatment and control groups should have com-
mon support. To address this, as discussed above, we follow the procedure of Veitch,
Sridhar, and Blei (2020) and drop the patents with either below 3% treatment propensity
or above 97% treatment propensity. In our study, the treatment is the female gender in-
dicator of the lead inventor. Therefore a treatment propensity of at most 3% implies that
this patent, as defined by the embedding of the text, almost certainly has a male lead
inventor. On the other hand, a treatment propensity of at least 97% implies this patent
almost certainly has a female lead inventor. This procedure preserves over 80% of our
data after dropping the propensity score outliers. Importantly, our results remain robust,
suggesting that the conditional outcome and propensity score models are consistent.
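The trimming rule reduces to a simple filter on the estimated propensities; a toy numpy sketch (the propensity values are illustrative):

```python
# Veitch et al. (2020)-style trimming: drop observations whose estimated
# treatment propensity lies outside [0.03, 0.97].
import numpy as np

propensity = np.array([0.01, 0.10, 0.50, 0.90, 0.99])
keep = (propensity >= 0.03) & (propensity <= 0.97)

trimmed = propensity[keep]        # common-support sample
retained_share = keep.mean()      # fraction of the data preserved

# Here retained_share = 0.6; in the paper, over 80% of the data survive.
```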
Do forward citation counts for patents differ across the gender of the lead inventor? We
begin by examining the differences in forward citation counts by gender using simple
regression analysis. We then apply the C-BERT model to calculate counterfactuals, and
assess the causal effect of gender on citation counts.
Y_i = β_1 I(FemaleInventor_i) + δ_{Customer×Examiner} + δ_{ArtUnit} + δ_{GrantYear} + ε_i,    (1)

where patent and year are represented by i and t, respectively. Y_i is our outcome of interest.
Our specification includes fixed effects for customer-examiner pair (δ_{Customer×Examiner}),
art unit (δ_{ArtUnit}), and year of grant (δ_{GrantYear}). All standard errors in this paper, unless
otherwise noted, are double-clustered by patent issue year and customer. β_1 is our coefficient
of interest; a positive value would indicate that women receive more citations than men, and vice versa.
The estimates are presented in Table 2. Panel A presents the results for the full
sample of patents (extensive margin), while Panel B presents the estimates for those
patents which receive at least one forward citation (intensive margin).11 Both panels
suggest that female lead inventors receive between 0.8 and 4 fewer citations than males,
depending on the specification and controls included. Given that the expected selection
effect from prior literature might predict higher quality patents, and thus higher citation
counts, for female-authored patents, the patterns from this simple analysis raise questions.
Either the female-authored patents being approved are of lower quality, or the citation
counts themselves understate their quality.
10 To address skewness and to show the distribution more clearly, we take the natural logarithm of
citations and present a histogram in Panel A of Figure IA1.
11 Most patents do not receive any forward citations; in general, female lead-authored patents appear
to be less likely to receive any citations than those with a male lead author.
To explore this, we turn to our C-BERT model. As a reminder, C-BERT first trains
two mappings, one using only patents from male inventors and a second for female
inventors. Armed with our two mappings, we pass the male patents through the female
mapping, and vice versa. From this, we can estimate the counterfactual number of
citations a patent would have received had its lead author been of the opposite gender,
\widehat{ForwardCitation}_i. We plot the histogram of predicted citations from C-BERT by gender. We then define

Delta_i = ForwardCitation_i − \widehat{ForwardCitation}_i,    (2)

the difference between a patent’s actual citations and the citations predicted had its lead
inventor been of the opposite gender. A positive delta implies that a patent has
received more citations than the quality-adjusted number suggested by the opposite-gender model.
We plot the difference between actual and model implied citations in Figure 4 for
patents with at least one citation. We observe that Delta_i appears to be negative on
average for female lead inventors (plotted in red), with a mean and median of -2.69 and
12 To address skewness and to show the distribution more clearly, we take the natural logarithm of citations
Next, we explore whether these patterns of undercitation are uniform across a variety of
dimensions of heterogeneity in patent characteristics.
First, a reasonable question is whether the underciting of female lead inventor patents
uncovered in our main models holds across all technology categories or whether there is
variation across fields. We next explore this heterogeneity. Specifically, we estimate the
following model.
To ease interpretation, Figure 5 presents the sum of the female lead indicator and
the interaction coefficients (Female Lead Inventor × Subcategory) graphically. The raw
estimates are presented in the Internet Appendix in Table IA4. Column (1) of Table IA4
shows the estimates employing the raw citation counts as the dependent variable, while
column (2) uses Delta as a dependent variable. The finer category classification exhibits
An interesting question is whether the patterns we see across technology fields relate in
some way to whether women are patenting in an established field versus in an emerging
field of technology. It is possible that newer fields may not present as many barriers to
entry or pre-existing biases for female inventors and researchers, given the lack of an
established history of research and researchers, and that we may expect undercitation
patterns to be larger or concentrated in more established fields. On the other hand, the
underlying forces that lead to undercitation for patents with female first authors may
be unrelated to the nature of the field, and relate to gender norms or perceptions more
generally, in which case we would not expect to see a difference.
To explore these issues further, we denote a category as an “emerging field” if the
art unit first appeared within five years of the patent being granted. We then re-run our
models, adding an indicator for an emerging field as well as an interaction between that
indicator and the indicator for a female lead inventor. Our coefficient of interest is the
interaction between the indicator for female inventors and emerging fields.
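A sketch of the emerging-field flag follows, under the assumption that "within five years" means at most five years after the art unit's first observed grant (the field names and years are illustrative):

```python
# Flag a patent as belonging to an "emerging field" if its art unit first
# appeared within five years of the patent's grant year.
first_seen = {}   # art unit -> earliest grant year observed
patents = [("AU1", 1990), ("AU1", 1993), ("AU1", 1999), ("AU2", 2001)]

for unit, year in patents:
    first_seen[unit] = min(first_seen.get(unit, year), year)

emerging = [
    (unit, year, (year - first_seen[unit]) <= 5)
    for unit, year in patents
]
# AU1's 1993 patent is emerging (3 years after the unit first appears);
# its 1999 patent is not (9 years after).
```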
The estimates are presented in Table 6. While our main result is still apparent, with
patents with female first authors exhibiting an estimated 3.3 to 3.7 fewer citations than
would be predicted if the first author had been male (depending on specification), across
all specifications, we cannot reject the null that there is no additional citation difference
for patents in emerging fields.
A natural question is whether the undercitation we observe above is present from the
outset or whether it primarily materializes or diminishes later in the life of the patent.
On the one hand, undercitation may be present from the outset but diminish over time
as inventors and examiners become more familiar with the patent and its quality. Alter-
natively, the bias may increase and become more pronounced over time, potentially in-
dicating a self-reinforcing effect that could be harder to overcome. Examining the timing
of the bias in citations can provide valuable insights into the nature of the undercitation
of female inventors and inform potential interventions to address this issue.
To investigate the timing of undercitation, we create separate samples of forward
citations based on the number of years that have passed since a given patent was granted.
Specifically, we divide the post-grant period into four subperiods: [0-1) years post
grant, [1-5) years, [5-10) years, and [10-20] years. For each subperiod and each patent,
we collect the forward citations the patent receives during that window. For each
subperiod, we re-run our C-BERT methodology to estimate the delta in citations after
mediating for patent quality. This allows us to examine when undercitation occurs
relative to the time of the patent grant.
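The subperiod assignment can be sketched as a simple bucketing function over the citation lag in years:

```python
# Assign a forward citation to one of the four post-grant subperiods.
def subperiod(years_since_grant):
    if 0 <= years_since_grant < 1:
        return "[0-1)"
    if years_since_grant < 5:
        return "[1-5)"
    if years_since_grant < 10:
        return "[5-10)"
    if years_since_grant <= 20:
        return "[10-20]"
    return None  # beyond the 20-year follow-up window

buckets = {}
for lag in [0.5, 2, 7, 15, 25]:
    buckets.setdefault(subperiod(lag), []).append(lag)
```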
The estimates are presented in Table 7. Column (1) presents estimates from forward
citations to patents received in the first year after patent grant, column (2) presents
estimates for forward citations received in years 2 to 5 after patent grant, column (3)
for citations received in years 6 to 10, and column (4) years 11-20. In each column, the
dependent variable is the Delta estimated from C-BERT using only forward citations
received during that subperiod (by necessity, the number of observations is smaller in
later subperiods as fewer of the patents in our sample will yet have histories of that
length). As can be seen from the estimates in the table, the undercitation for patents
The estimates we present in the prior analyses suggest that across fields, patents with
female first authors are consistently undercited relative to what would be expected for
the same patent had its first author been male. A natural question is whether these
patterns vary over time within the sample, as gender norms and female participation
in the workforce more generally and in science and engineering more specifically have
been changing over time.
Of course, older patents tend to naturally receive more citations. Without adjust-
ments to our initial methodology, our findings may incorrectly suggest a decrease in
bias over time, when in reality, it is simply a reflection that newer patents receive fewer
citations on average. To accurately study the evolution of bias, we restrict our measure
Y_i = β_1 I(FemaleInventor_i) + ∑_{j=1977}^{2011} β_j I(FemaleInventor_i) × I(GrantYear = j) + δ_{GrantYear} + δ_{ArtUnit} + δ_{Customer×Examiner} + ε_i,    (5)
where β_1 estimates the average undercitation of females across the entire sample, and
the set of coefficients β_j estimates the marginal bias in each patent grant year, with 1976
as the omitted comparison year. Standard errors in this specification are clustered by
grant year.
Figure 6 plots the interaction coefficient for each year from Equation 5.
The time-invariant estimate for the coefficient on the female lead inventor variable is -0.34;
the interaction coefficients presented in the figure are additive to that number. From
the figure, we observe clearly that the average undercitation of patents with female lead
authors has become more pronounced over time. In comparison to patents from 1976,
those from the late 1970s and early 1980s seem to have been only modestly additionally
undercited. However, starting in the 1990s, the additional undercitation of female-led
patents rises to around 2 citations per patent. Thus, despite a decrease in disparities and
representation of women in the workplace and in science and engineering professions,
So far, we have presented causal evidence that patents with female lead inventors receive
fewer citations than the equivalent patents with male lead inventors. Next we explore
the source of the under-citation: whether it is driven by inventors or examiners, and the
role of their gender.
To set the stage for this analysis, we first discuss how a citation is added to a patent.
When applying for a patent, applicants cite supporting patents on whose inventions the
current patent builds. If, however, the patent examiner deems that there are
additional relevant citations that have not been included by the inventor, the examiner
will also add these to the patent application. As a result, the documented undercita-
tion of patents with female lead inventors may stem from the original inventor-added
citations, additional examiner-added citations, or a combination of both.
To explore the source of the under-citation, we first need to know which citations
in a patent are attributable to the inventor versus the examiner. Starting in 2001, and
more comprehensively starting in 2003, asterisks were added to the USPTO citation
data to identify examiner-added citations. Using this detail, we construct a
new subsample starting from 2003 aggregating forward-citations into four categories: (i)
forward citations added (in a future patent) by male lead inventors, (ii) forward citations
added by female lead inventors, (iii) forward citations added by male examiners,
and (iv) forward citations added by female examiners. Using these groups, we can
then decompose the sources of under-citation of female lead-inventor patents.
We begin our analysis by studying examiner-added citations. For a given patent, we
take all forward citations that occur due to being added to a future patent application
by an examiner. We then break these into forward citations added by female examiners and those added by male examiners.
6 Robustness Tests
One potential explanation for our findings is that the way in which we classify patents
by gender may spuriously influence the results. In our baseline method, we used the
name of the first inventor to assign author gender to patents with multiple inventors.
The first name on the patent is likely the most salient, as it is the first name observed
when reading the patent. Of course, it is possible that examiners and inventors may
consider all inventors and not just the first author when attributing gender.
To address the possibility that our definition of author gender spuriously produces
the patterns we observe, we show robustness of the estimates to a number of alterna-
tive approaches to attributing author gender to a patent. First, we limit our sample to
patents with only one author and re-run our C-BERT model, comparing female
sole-authored patents to male sole-authored patents. This shuts down concerns that the
gender of non-lead co-inventors, rather than that of the lead inventor, drives our results.
The results presented up to this point utilize patent abstracts and the BERT embedding.
A natural concern is that the patent abstracts do not have enough content to fully pick up
patent quality for mediation purposes. For robustness, we repeat our analyses replacing
the patent abstract texts with the full patent texts, and utilizing the LONGFORMER
embedding instead of BERT. We use the LONGFORMER embedding instead because
BERT has difficulties handling longer text lengths.
Table IA6 reports the main results using full patent texts, for the earlier subperiod of
our sample, 1986-1994. We observe qualitatively similar, statistically significant, under-
citation results (roughly 2 citations). In future versions of the paper, we intend to extend
this analysis to the full sample period and for all of our estimations.
13 The neural network that measures the propensity for a given patent to have been written by a male
versus a female assures that what we are picking up is quality as indicated by text content as opposed to
writing style.
14 Note, this only grows our sample by roughly 2% because teams of all female inventors that can be
A standard concern with these types of models is overfitting to the training data. In
our setting, we train two different models by completing multiple passes of our training
dataset through our algorithm; each complete pass is an epoch. While numerous passes of the data help im-
prove the predictive probability of the neural networks, we could have overfit our model
to the data. If so, this would result in relatively poor out-of-sample performance. In the
context of our paper, this would result in incorrect or biased out-of-sample predictions
of the number of citations.
We address this concern by studying the loss function, as presented in Figure IA5, to
ensure a reasonable number of training iterations. Plotting the mean square error (MSE)
per batch against the number of passes of the training dataset, we find two key pieces
of evidence that suggest we have not overfit the model. First, as we increase the number
of epochs, the MSE tends to decrease. Second, we find diminishing improvements to
the error rate as we approach 20 epochs. Taken together, these findings suggest that our
model is unlikely to be overfitted and, as a result, that the model is appropriate and that
reasonable counterfactual citations are predicted from our neural networks.
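The two diagnostics above can be sketched as simple checks on a per-epoch MSE series. The loss values here are illustrative, not the paper's actual training curve:

```python
# Two checks on the training curve: per-epoch MSE trends downward, and
# improvements flatten near the final epochs (diminishing returns rather
# than a rising loss, which would suggest a problem).
mse_by_epoch = [9.0, 6.0, 4.5, 3.8, 3.4, 3.2, 3.1, 3.05, 3.02, 3.01]

decreasing = all(b <= a for a, b in zip(mse_by_epoch, mse_by_epoch[1:]))

# Relative improvement over the last few epochs:
final_gain = (mse_by_epoch[-4] - mse_by_epoch[-1]) / mse_by_epoch[-4]

# Here `decreasing` is True and `final_gain` is under 3%, consistent with a
# model trained to convergence without an obvious overfitting signature.
```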
We next consider the relationship between the economic importance of a patent, as evalu-
ated by public markets at the time of issue, and forward citations. As we have previously
discussed, undercitation of female-authored patents tends to persist over time and be-
come more pronounced as the years go on. In contrast, the economic value of a patent,
as assessed by public markets, is forward-looking and can be determined at the time of
issuance. An interesting question is whether these forward-looking market estimates of
a patent’s economic value relate more closely to actual forward citations, or to the pre-
dicted number of forward citations we obtain out of C-BERT, which adjusts for author
Y_i = β_1 I(FemaleInventor_i) + β_2 ForwardCitation_i + β_3 \widehat{ForwardCitation}_i    (6)

The estimates are presented in Table 10. Only the coefficient on \widehat{ForwardCitation}_i
loads significantly, suggesting that the market does not undervalue female-authored
patents relative to what the same patent would be worth had it been authored by a
male lead inventor. The estimates provide further support for the notion that actual
measures of forward citations are biased by the gender of the inventor.
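To illustrate the logic of equation (6), the following numpy sketch runs the regression on simulated data in which market value is driven by the counterfactual (quality-adjusted) citation count; all variables and parameters here are simulated for illustration, not the paper's data. With data generated this way, only the coefficient on the counterfactual term should load.

```python
# Regress a market-value proxy on the female indicator, actual forward
# citations, and the counterfactual citation count, via ordinary least squares.
import numpy as np

rng = np.random.default_rng(1)
n = 500
female = rng.integers(0, 2, size=n).astype(float)
fwd_hat = rng.poisson(5, size=n).astype(float)          # counterfactual citations
fwd = fwd_hat - 2.0 * female + rng.normal(size=n)       # actual: undercited for women
value = 1.5 * fwd_hat + rng.normal(scale=0.1, size=n)   # market prices quality

X = np.column_stack([np.ones(n), female, fwd, fwd_hat])
beta, *_ = np.linalg.lstsq(X, value, rcond=None)
# With value driven by fwd_hat alone, only beta[3] should load (near 1.5),
# while the coefficients on female and actual citations stay near zero.
```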
We provide causal evidence that patents with female lead inventors are undercited rel-
ative to what they would have received if their patent had a male lead inventor. Our
approach uses new tools in machine learning to disentangle quality from forward cita-
tions, allowing us to show that the most commonly used measure for patent quality,
forward citations, is systematically biased downward for patents with female lead inventors.
Gavrilova, Evelina, and Steffen Juranek, 2021, Female inventors: The drivers of the gen-
der patenting gap, Working paper.
Gentzkow, Matthew, Bryan Kelly, and Matt Taddy, 2019a, Text as data, Journal of Economic
Literature 57, 535–74.
Gentzkow, Matthew, Jesse M Shapiro, and Matt Taddy, 2019b, Measuring group differ-
ences in high-dimensional choices: method and application to congressional speech,
Econometrica 87, 1307–1340.
Goldstein, Itay, Chester S Spatt, and Mao Ye, 2021, Big data in finance, Review of Financial
Studies 34, 3213–3225.
Graham, Stuart JH, Alan C Marco, and Richard Miller, 2018, The USPTO patent examina-
tion research dataset: A window on patent processing, Journal of Economics & Manage-
ment Strategy 27, 554–578.
Hall, Bronwyn H, Adam Jaffe, and Manuel Trajtenberg, 2005, Market value and patent
citations, RAND Journal of Economics 16–38.
Hanley, Kathleen Weiss, and Gerard Hoberg, 2019, Dynamic interpretation of emerging
risks in the financial sector, Review of Financial Studies 32, 4543–4603.
Hansen, Stephen, Michael McMahon, and Andrea Prat, 2018, Transparency and delib-
eration within the FOMC: a computational linguistics approach, Quarterly Journal of
Economics 133, 801–870.
Hengel, Erin, and Euyoung Moon, 2020, Gender and equality at top economics journals,
Working paper.
Hirschey, Mark, and Vernon J Richardson, 2004, Are scientific indicators of patent quality
useful to investors?, Journal of Empirical Finance 11, 91–107.
Hirshleifer, David, Po-Hsuan Hsu, and Dongmei Li, 2013, Innovative efficiency and stock
returns, Journal of Financial Economics 107, 632–654.
Hunt, Jennifer, Jean-Philippe Garant, Hannah Herman, and David J Munroe, 2013, Why
are women underrepresented amongst patentees?, Research Policy 42, 831–843.
Jensen, Kyle, Balázs Kovács, and Olav Sorenson, 2018, Gender differences in obtaining
and maintaining patent rights, Nature Biotechnology 36, 307–309.
Jha, Manish, Hongyi Liu, and Asaf Manela, 2022, Does finance benefit society? a lan-
guage embedding approach, Working paper.
Koffi, Marlène, 2021, Innovative ideas and gender inequality, Working paper.
Koffi, Marlène, and Matt Marx, 2021, Cassatts in the attic, Working paper.
Kogan, Leonid, Dimitris Papanikolaou, Amit Seru, and Noah Stoffman, 2017, Techno-
logical innovation, resource allocation, and growth, Quarterly Journal of Economics 132,
665–712.
Li, Kai, Feng Mai, Rui Shen, and Xinyan Yan, 2021, Measuring corporate culture using
machine learning, Review of Financial Studies 34, 3265–3315.
Loughran, Tim, and Bill McDonald, 2016, Textual analysis in accounting and finance: A
survey, Journal of Accounting Research 54, 1187–1230.
Martinez, Gema Lax, Julio Raffo, Kaori Saito, et al., 2016, Identifying the gender of PCT
inventors, volume 33 (WIPO).
Moser, Petra, 2005, How do patent laws influence innovation? evidence from nineteenth-
century world’s fairs, American Economic Review 95, 1214–1236.
Moser, Petra, 2013, Patents and innovation: evidence from economic history, Journal of
Economic Perspectives 27, 23–44.
Oster, Emily, 2019, Unobservable selection and coefficient stability: Theory and evidence,
Journal of Business & Economic Statistics 37, 187–204.
Reshef, Oren, Abhay Aneja, and Gauri Subramani, 2021, Persistence and the gender
innovation gap: evidence from the US Patent and Trademark Office, in Academy of Man-
agement Proceedings, volume 2021, 11626, Academy of Management Briarcliff Manor,
NY 10510.
Rouen, Ethan, Kunal Sachdeva, and Aaron Yoon, 2022, The evolution of ESG reports and
the role of voluntary standards, Technical report.
Routledge, Bryan R, Stefano Sacchetto, and Noah A Smith, 2017, Predicting merger
targets and acquirers from text, Working paper.
Sarsons, Heather, 2017, Recognition for group work: Gender differences in academia,
American Economic Review 107, 141–45.
Sarsons, Heather, Klarita Gërxhani, Ernesto Reuben, and Arthur Schram, 2021, Gender
differences in recognition for group work, Journal of Political Economy 129, 101–147.
Shao, Yifan, Haoru Li, Jinghang Gu, Longhua Qian, and Guodong Zhou, 2021, Extraction
of causal relations based on SBEL and BERT model, Database 2021.
Tzioumis, Konstantinos, 2018, Demographic aspects of first names, Scientific Data 5, 1–9.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N
Gomez, Łukasz Kaiser, and Illia Polosukhin, 2017, Attention is all you need, Advances
in Neural Information Processing Systems 30.
Veitch, Victor, Dhanya Sridhar, and David Blei, 2020, Adapting text embeddings for
causal inference, in Conference on Uncertainty in Artificial Intelligence, 919–928, PMLR.
Citation is the outcome of interest, Gender is the treatment, and Text is the sequence of words. Panel A
depicts the average treatment effect, with the assumption that Text carries sufficient information to adjust
for confounding (common cause) between outcome and treatment. Panel B depicts the natural direct effect
(NDE), where the text is a mediator of the treatment on outcome.
[Figure: C-BERT estimation diagram. A BERT embedding of the patent text feeds the male citation network, the female citation network, and the treatment propensity network; their outputs combine into the counterfactual number of citations.]
The figure illustrates the estimation procedure of C-BERT once the neural networks are trained. The light
blue block at the very top describes the input used for estimation. The green blocks are the four neural
networks trained using the patent data. The blue block describes the decision rule used for counterfactual
estimation. Finally, the red block is the output that combines the outputs of the citation estimation net-
works and the propensity score estimation network.
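As a rough sketch of the decision rule the figure describes, the trained networks can be combined as follows. All function and argument names here are hypothetical placeholders for illustration, not the paper's code: the two citation networks predict citations under each gender from the text embedding, and the propensity network predicts the treatment from the same embedding.

```python
def counterfactual_citations(embedding, observed_gender,
                             cite_if_male, cite_if_female, propensity):
    """Combine the trained networks for one patent embedding.

    Returns the predicted citations under the observed gender and under
    the counterfactual (flipped) gender, plus the propensity score.
    All callables are stand-ins for the trained networks.
    """
    y_male = cite_if_male(embedding)      # citations predicted if male-authored
    y_female = cite_if_female(embedding)  # citations predicted if female-authored
    p_female = propensity(embedding)      # P(treatment = female | text)

    if observed_gender == "female":
        observed, counterfactual = y_female, y_male
    else:
        observed, counterfactual = y_male, y_female

    return {"observed": observed,
            "counterfactual": counterfactual,
            "propensity_female": p_female}
```

With toy stand-in networks, flipping the treatment simply swaps which citation network supplies the counterfactual prediction.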
[Figure: Panel A — density of Forward Citations (0–100); Panel B — density of Expected Forward Citations (0–100); red = Female, blue = Male.]
This figure illustrates the distribution of forward citations. Panel A uses forward citations observed in the
data, while Panel B uses the expected number of forward citations as implied by the model. The horizontal
axis counts the number of citations while the vertical axis measures the percent of the distribution. Red
bars correspond to females, blue bars correspond to males, and purple bars correspond to the overlapping
region. The distribution is truncated at 100 for ease of interpretation. The natural logarithm transformation
of these distributions is presented in Figure IA1.
[Figure: density of Bias in Forward Citations; red = Female, blue = Male.]
This figure illustrates the difference between forward citations and expected forward citations, as defined by Equation 2. The horizontal axis counts
the additional number of citations that a patent should have received after adjusting. Red bars correspond to females, blue bars correspond to
males, and purple bars correspond to the overlapping region. The distribution is truncated between −30 and 30.
[Figure: Bias in Citation (horizontal axis, −20 to 5) by NBER subcategory (vertical axis), from Motors & Engines + Parts through Amusement Devices.]
This figure illustrates the coefficients of Equation 5. For ease of interpretation, each point corresponds to
the linear combination of the baseline result for females and the interaction terms, presented in Table IA4.
Whiskers correspond to a 95% confidence interval. Coefficients are sorted by patent category and then
by the magnitude of the estimate. Colors correspond to the patent category as defined by the NBER,
where pink observations correspond to mechanical (Mech), purple observations correspond to electrical
(Elec), blue observations correspond to drugs and medical (Drgs&Med), light green observations correspond
to computers and communication (Cmp&Comm), dark green observations correspond to chemical (Chemical),
and yellow observations correspond to other (Other) categories. The red dotted line is plotted at the zero
intercept, representing no effect.
[Figure 6 plot: delta in forward citations (vertical axis, ticks at −1 and −2) by patent grant year, 1976–2011 (horizontal axis).]
FIGURE 6: EVOLUTION OF DELTA OVER TIME
This figure illustrates the evolution of citations over time. The horizontal axis corresponds to the year a
patent was granted. The vertical axis corresponds to the delta in forward citations, with negative numbers
corresponding to undercitation. Forward citations are computed within the first ten years the patent was
granted. Each point represents an estimate from a separate regression, with error bars corresponding to a
95% confidence interval. All estimates include customer, examiner, and examiner art unit fixed effects.
[Figure: delta in examiner-added and inventor-added citations by gender; vertical axis ticks at −0.5 and −1.0; groups: Female Inventor, Male Inventor, Female Examiner, Male Examiner; legend: Female, Male.]
This figure illustrates the delta in examiner-added and inventor-added citations, by gender. The red columns
correspond to female-added citations, and the blue columns correspond to male-added citations. Error
bars correspond to the 95% confidence interval. Estimates correspond to column (4) of Table 8 and Table 9.
This table provides summary statistics on patents and citations. The sample covers patents issued from
1976-01-01 through 2021-12-31. Panel A presents a two-way table of forward citations by gender. Panel
B presents a two-way table of patents in the top decile by gender. Panel C presents a two-way table of
patents by their cooperative patent classification (CPC). ***, **, * denote significance at the 1%, 5%, and
10% level, respectively. Data Source: USPTO.
This table reports estimates of Equation 1 and studies the number of forward citations by the gender of
the lead inventor. The sample includes both patents that received a citation and those that did not receive
a citation. Panel A uses the sample of all patents while Panel B uses patents with a positive number of
forward citations. The sample covers patents issued from 1976-01-01 through 2021-12-31. Standard errors
are clustered at the patent customer and patent issue year level. ***, **, * denote significance at the 1%,
5%, and 10% level, respectively. Data source: USPTO.
Intercept 14.953∗∗∗
(4.497)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 907,996 907,996 907,996 907,996
Adjusted R2 0.002 0.122 0.199 0.375
Intercept 18.091∗∗∗
(2.952)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 527,348 527,348 527,348 527,348
Adjusted R2 0.001 0.095 0.182 0.354
This table reports estimates of Equation 2 and studies the number of forward citations by the gender of
the lead inventor. The sample includes both patents that received a citation and those that did not receive
a citation. Panel A uses the sample of all patents while Panel B uses patents with a positive number of
forward citations. The sample covers patents issued from 1976-01-01 through 2021-12-31. Standard errors
are clustered at the patent customer and patent issue year level. ***, **, * denote significance at the 1%,
5%, and 10% level, respectively. Data source: USPTO.
Intercept 0.235∗∗∗
(0.085)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 907,996 907,996 907,996 907,996
Adjusted R2 0.006 0.017 0.051 0.298
Intercept 1.097∗∗∗
(0.191)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 527,348 527,348 527,348 527,348
Adjusted R2 0.006 0.011 0.062 0.328
This table studies patents that receive forward citations in the top decile. Panel A documents the rela-
tionship between a patent’s lead inventor’s gender and the propensity to receive citations placing them
in the top decile. Panel B documents the relationship between a patent’s lead inventor’s gender and the
model’s prediction a patent would be in the top decile of citations. The sample covers patents issued from
1976-01-01 through 2021-12-31. Standard errors are clustered at the patent customer and patent issue year
level. ***, **, * denote significance at the 1%, 5%, and 10% level, respectively. Data source: USPTO.
Intercept 0.106∗∗∗
(0.022)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 527,348 527,348 527,348 527,348
Adjusted R2 0.001 0.107 0.146 0.185
Panel B: Counterfactual
Flipped to Top Decile
(1) (2) (3) (4)
Lead Inventor Female 0.017∗∗∗ 0.018∗∗∗ 0.018∗∗∗ 0.017∗∗∗
(0.004) (0.004) (0.004) (0.006)
Intercept 0.010∗∗∗
(0.002)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 527,348 527,348 527,348 527,348
Adjusted R2 0.004 0.011 0.016 0.066
This table estimates the difference in citations by NBER Category. Column (1) uses the number of forward
citations as its dependent variable, while Column (2) uses the difference in forward citations, as defined
by Equation 2. Estimates include interactions for the patent category based on NBER Categories. All
specifications include NBER Category, Examiner × Customer, and Patent Issue Year fixed effects. The sample
covers patents issued from 1976-01-01 through 2021-12-31. Standard errors are clustered at the patent
customer and patent issue year level. ***, **, * denote significance at the 1%, 5%, and 10% level, respectively.
Data source: USPTO.
Dependent variable:
Forward Citations Delta in Forward Citations
(1) (2)
Female Lead Inventor −0.077 −3.154∗∗∗
(0.395) (0.337)
This table studies the citations to new fields of innovation. New Field takes the value of one if the art unit
first appeared within five years of the patent being granted. The dependent variable is the difference in
the observed number and the expected number of citations for a patent, as defined by Equation 2. The
sample covers patents issued from 1976-01-01 through 2021-12-31. Standard errors are clustered at the
patent customer and patent issue year level. ***, **, * denote significance at the 1%, 5%, and 10% level,
respectively. Data source: USPTO.
Intercept 18.119∗∗∗
(3.486)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 527,348 527,348 527,348 527,348
Adjusted R2 0.001 0.095 0.182 0.354
Intercept 1.102∗∗∗
(0.183)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 527,348 527,348 527,348 527,348
Adjusted R2 0.006 0.011 0.062 0.328
This table estimates Equation 1 and studies the difference in forward citations by the number of years after
the patent was granted. Columns (1)–(4) study the difference in forward citations 0-1, 2-5, 6-10, and 11-20
years after they are granted, respectively. All specifications use Art Unit, Examiner × Customer, and Patent
Grant Year fixed effects. The sample covers patents issued from 1976-01-01 through 2021-12-31. Standard
errors are clustered at the patent customer and patent issue year level. ***, **, * denote significance at the
1%, 5%, and 10% level, respectively. Data source: USPTO.
This table studies the source of examiner-added citations for male inventors. The dependent variable is
the difference in forward citations. Panel A uses the difference in forward citations that were added by
female lead examiners as its dependent variable. Panel B uses the difference in forward citations that
were added by male lead examiners as its dependent variable. The sample covers patents issued from
1976-01-01 through 2021-12-31. Note that the source of citations is only available from 2001 onward.
Standard errors are clustered at the patent customer and patent issue year level. ***, **, * denote
significance at the 1%, 5%, and 10% level, respectively. Data source: USPTO.
Intercept 0.131∗∗∗
(0.038)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 66,757 66,757 66,757 66,757
Adjusted R2 0.0001 0.017 0.062 0.101
Intercept 0.060
(0.065)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 180,397 180,397 180,397 180,397
Adjusted R2 −0.00000 0.010 0.046 0.219
This table studies the source of inventor-added citations. The dependent variable is the difference in for-
ward citations. Panel A uses the difference in forward citations that were added by female lead inventors
as its dependent variable. Panel B uses the difference in forward citations that were added by male lead
inventors as its dependent variable. The sample covers patents issued from 1976-01-01 through 2021-12-31.
Note that the source of citations is only available from 2001 onward. Standard errors are clustered at
the patent customer and patent issue year level. ***, **, * denote significance at the 1%, 5%, and 10% level,
respectively. Data source: USPTO.
Intercept 0.609∗∗∗
(0.063)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 29,984 29,984 29,984 29,984
Adjusted R2 0.004 0.019 0.075 0.223
Intercept 1.083∗∗∗
(0.171)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 282,370 282,370 282,370 282,370
Adjusted R2 0.001 0.010 0.091 0.191
This table studies the relationship between the measures of citations and the market-implied value of
patents. The dependent variable in both panels is the log value of innovation, deflated to 1982 (million)
dollars using the CPI, as calculated in Kogan et al. (2017). The sample covers patents issued from 1976-01-
01 through 2021-12-31. Standard errors are clustered at the patent customer and patent issue year level.
***, **, * denote significance at the 1%, 5%, and 10% level, respectively. Data source: USPTO.
log(dollar value)
(1) (2) (3) (4)
Female Lead Inventor 0.083∗∗∗ −0.014 −0.020 −0.050
(0.029) (0.033) (0.029) (0.041)
Forward Citation 0.004∗∗∗ 0.004∗∗∗ 0.002∗∗∗ 0.003∗∗∗
(0.001) (0.0003) (0.001) (0.001)
Intercept 0.716∗∗∗
(0.138)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 202,865 202,865 202,865 202,865
Adjusted R2 0.008 0.127 0.438 0.321
segmentation embedding E^S_W, which labels tokens with the sentence they belong to; and
a positional embedding E^P_W, which represents the relative distances between each pair of
tokens (a "token" is a word, or a part of a word if the word is long). A linear combination
of these three embeddings then goes into the encoder.
The first step of the encoder is a multi-headed attention layer. Its mechanism can be
described as follows. Let E_W denote the input embedding of the encoder. For a given
token W_i in sentence W, the embedding is denoted E^i_W. The attention layer calculates
the projection of E^i_W onto all token embeddings, including itself, using a dot product.
The final output of the single-headed attention layer for each token embedding is a
weighted average of all token embeddings, where the weights are the cosine projection
coefficients of the current token embedding onto each token embedding. A multi-headed
attention layer is analogous to a forest of single-headed attention layers. To construct a k-
headed attention layer from a (p × k)-dimensional token embedding, we randomly split the
(p × k)-dimensional embedding of each token into k groups of p-dimensional embeddings.
We then build a single-headed attention layer on each subset of the token embeddings.
Finally, we take a weighted average of the outputs of the k heads.
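The mechanism above can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: the dot-product weights are normalized with a softmax so each output row is a weighted average, the head weights are taken to be equal, and the learned query/key/value projections used in practice are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax, so each row of weights sums to one."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def single_head_attention(E):
    """E: (n_tokens, p) token embeddings.

    Each token's score against every token (itself included) is a dot
    product; each output row is a weighted average of all rows of E.
    """
    scores = E @ E.T                    # projection of each token onto every token
    weights = softmax(scores, axis=-1)  # normalize into averaging weights
    return weights @ E

def multi_head_attention(E, k):
    """Split a (p*k)-dimensional embedding into k groups of p dimensions,
    run a single-headed layer on each group, and average the k head
    outputs (equal weights assumed here; in practice they are learned)."""
    heads = np.split(E, k, axis=1)                       # k blocks of (n_tokens, p)
    outputs = [single_head_attention(h) for h in heads]  # one head per block
    return np.mean(outputs, axis=0)                      # combine the k heads
```

For an input of shape (n_tokens, p·k), each head operates on an (n_tokens, p) slice, so the combined output has shape (n_tokens, p).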
The output of this multi-headed attention layer is then passed through a normalization
layer with a residual connection. The residual connection is achieved by passing the
input of the multi-headed attention layer directly to the normalization layer, alongside
the output of the attention layer. This residual connection allows gradients to flow
directly from the input of the multi-headed attention layer to the next layer without
passing through the attention layer itself. After the normalization layer, the output is
passed through a feed-forward layer, which converts it back to the same shape as the
input of the encoder module. This allows us to stack multiple encoder modules, with
each encoder's output serving as the next encoder's input. The reason we stack encoders
is that the first encoder learns the contextual relationship between pairs of tokens, the
second encoder learns the relationship between pairs of pairs of tokens, and so forth. For
the following discussions in this paper, we use the word "embedding" to mean the output
embedding of the encoder at the text level.
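One encoder module as just described can be sketched as follows, again as a minimal illustration rather than the actual implementation: the attention function is passed in as an argument, the normalization omits the learned scale and shift parameters used in practice, and the ReLU feed-forward form is an assumption.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each token's embedding to zero mean and unit variance
    (learned gain/bias parameters omitted for simplicity)."""
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def encoder_block(E, attention, W1, b1, W2, b2):
    """One encoder module: attention -> add & normalize (residual) ->
    feed-forward -> add & normalize. Output matches the input shape,
    so blocks can be stacked."""
    h = layer_norm(E + attention(E))             # residual around attention
    ff = np.maximum(h @ W1 + b1, 0.0) @ W2 + b2  # position-wise feed-forward (ReLU)
    return layer_norm(h + ff)                    # residual around feed-forward
```

Because the output shape equals the input shape, stacking is simply `encoder_block(encoder_block(E, ...), ...)`, matching the description of each encoder consuming the previous encoder's output.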
The pre-trained BERT model uses the encoder architecture to train for two tasks:
[Figure: Panel A — density of log(Forward Citation); Panel B — density of log(Expected Forward Citation); red = Female, blue = Male.]
This figure illustrates the transformation from forward citations to expected forward citations. Panel
A uses the natural logarithm of forward citations while Panel B uses the natural logarithm of forward
citations expected from our model. The vertical axis in both panels measures the percent of the distribu-
tion. Red bars correspond to females, blue bars correspond to males, and purple bars correspond to the
overlapping region.
[Figure: C-BERT training diagram; blocks include the Loss function and Parameter Optimization.]
The figure illustrates the training procedure of C-BERT. The light blue block at the top describes the
input to the model. The green blocks are the four neural networks that are trained using the patent data.
The purple block denotes the loss function of the model, which is a weighted average of the losses of all
four networks. Finally, the red block denotes the optimization algorithm that allows the model to take a
step toward fitting the training data.
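The weighted-average objective the caption describes can be written down directly; the network names and weights below are illustrative assumptions, not values from the paper.

```python
def combined_loss(losses, weights):
    """Weighted average of per-network losses.

    losses:  dict mapping a network name to its current loss value
             (e.g. the two citation networks, the propensity network,
             and the shared embedding head -- names are hypothetical).
    weights: dict with the same keys giving each network's weight.
    """
    total_weight = sum(weights.values())
    return sum(weights[name] * losses[name] for name in losses) / total_weight
```

Each optimization step would evaluate this combined loss on a batch and take a gradient step on the shared parameters.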
[Figure: encoder module diagram; layers include multi-headed attention (Attention 1–3), a residual connection, and a Normalization Layer.]
This figure illustrates the structure of the encoder module. The light blue block at the bottom describes
the input. The yellow blocks are the layers within the encoder, and the red block is the output.
[Figure: density plot for the model's quality of fit (vertical axis up to 1.50%).]
The figure evaluates the quality of fit of our neural network. The black dashed line is centered at the mean
difference in the data (-0.008). We are unable to reject the null that the true difference in means is equal to
zero (p=0.2693).
[Figure: training loss by epoch; horizontal axis: Epochs 1–20.]
This figure illustrates the loss function of the C-BERT model. The horizontal axis corresponds to the
number of complete passes of the training dataset through the algorithm (epochs). The vertical axis
corresponds to the loss function, measured as mean squared error per batch.
This table reports the difference in writing style between males and females. The sample covers patents
issued from 1976-01-01 through 2021-12-31 that contain at least 120 words. ***, **, * denote significance at the
1%, 5%, and 10% level, respectively. Data source: USPTO, Google Patents.
This table uses all patent observations without applying C-BERT. The sample covers patents issued from
1976-01-01 through 2021-12-31. Standard errors are clustered at the patent customer and patent issue year
level. ***, **, * denote significance at the 1%, 5%, and 10% level, respectively. Data source: USPTO.
Forward Citations
(1) (2) (3) (4)
Lead Female Inventor −3.621∗∗∗ −1.023∗∗∗ −0.784∗∗∗ −0.591∗∗
(0.276) (0.310) (0.240) (0.256)
Intercept 15.185∗∗∗
(4.520)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 6,312,796 6,312,796 6,312,796 6,312,796
Adjusted R2 0.0004 0.126 0.206 0.318
This table estimates the difference in citations by CPC Section. Column (1) uses the number of forward
citations as its dependent variable, while Column (2) uses the difference in forward citations, as defined by
Equation 2. Estimates include interactions for the patent category based on CPC Section. All specifications
include CPC Section, Examiner × Customer, and Patent Issue Year fixed effects. The sample covers patents
issued from 1976-01-01 through 2021-12-31. Standard errors are clustered at the patent customer and
patent issue year level. ***, **, * denote significance at the 1%, 5%, and 10% level, respectively. Data
source: USPTO.
Dependent variable:
Forward Citations Delta in Forward Citations
(1) (2)
Lead Female Inventor −1.192∗ −3.537∗∗∗
(0.685) (0.821)
This table estimates the difference in citations by NBER subcategories. Column (1) uses the actual num-
ber of citations as its dependent variable, while Column (2) uses the Delta in citations, as defined by
Equation 2. Estimates include interactions for the patent subcategory based on NBER classifications. All
specifications include Patent Subcategory, Examiner × Customer, and Patent Issue Year fixed effects. Standard
errors are clustered at the patent customer and patent issue year level. ***, **, * denote significance at the
1%, 5%, and 10% level, respectively. Data source: USPTO.
Dependent variable:
Forward Citations Delta in Forward Citations
(1) (2)
Female Lead Inventor 0.499 −1.915∗∗∗
(0.470) (0.440)
Electronic business methods and software× Female Lead Inventor −8.866∗∗ −8.702∗
(3.623) (4.926)
Mat. Proc & Handling× Female Lead Inventor 0.259 −0.341
(0.665) (0.427)
This table establishes robustness of our baseline specification in Panel B of Table 3. Panel A uses single-
author patents while Panel B uses both single-author patents and patents where all inventors share the
same gender. The sample covers patents issued from 1976-01-01 through 2021-12-31. Standard errors are
clustered at the patent customer and patent issue year level. ***, **, * denote significance at the 1%, 5%,
and 10% level, respectively. Data source: USPTO.
Intercept 1.456∗∗∗
(0.261)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 124,280 124,280 124,280 124,280
Adjusted R2 0.002 0.015 0.138 0.421
Intercept 1.725∗∗∗
(0.239)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 129,298 129,298 129,298 129,298
Adjusted R2 0.003 0.017 0.211 0.466
This table replaces the BERT model with Longformer to study the full text of patents. The sample covers
patent numbers 4,800,000 through 5,299,999, granted roughly from 1986 to 1994. Standard
errors are clustered at the patent customer and patent issue year level. ***, **, * denote significance at the
1%, 5%, and 10% level, respectively. Data source: USPTO.
Intercept 4.767∗∗∗
(0.471)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 37,020 37,020 37,020 37,020
Adjusted R2 0.001 0.020 0.255 0.292
This table presents a placebo test that randomizes the gender of patents and re-runs our C-BERT
approach, to establish that the effects are not an artifact of C-BERT. Panel A uses the number of forward
citations as the dependent variable, while Panel B uses bias computed using C-BERT. The sample covers
patents issued from 1976-01-01 through 2021-12-31. Standard errors are clustered at the patent customer
and patent issue year level. ***, **, * denote significance at the 1%, 5%, and 10% level, respectively. Data
source: USPTO.
Intercept 19.918∗∗∗
(3.146)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 471,461 471,461 471,461 471,461
Adjusted R2 −0.00000 0.109 0.202 0.304
Intercept −3.626∗∗∗
(0.537)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 471,461 471,461 471,461 471,461
Adjusted R2 −0.00000 0.025 0.051 0.221
This table replaces the BERT model with SciBERT. The sample covers patents issued from 1976-01-01
through 2021-12-31. Standard errors are clustered at the patent customer and patent issue year level. ***,
**, * denote significance at the 1%, 5%, and 10% level, respectively. Data source: USPTO.
Intercept 0.680∗∗∗
(0.141)
Customer FE No No Yes No
Examiner FE No No Yes No
Examiner x Customer FE No No No Yes
Art Unit FE No Yes Yes Yes
Patent Issue Year FE No Yes Yes Yes
Observations 602,974 602,974 602,974 602,974
Adjusted R2 0.012 0.033 0.064 0.195