Prediction of patent grant and interpreting the key determinants: an application of interpretable machine learning approach

Published in Scientometrics

Abstract

Patents are valuable intellectual property only when granted by the government, and failing to receive an official grant means disclosing valuable technologies and information that would otherwise be kept as commercial secrets. Yet a typical patent application process takes years to complete, and the outcome is uncertain. This study implements machine learning models to predict patent examination outcomes based on early information disclosed at patent publication, and interprets how these models make predictions, highlighting the key determinants of patent grant and delineating the relationships between patent features and the examination outcome. The predictive models that integrate patent-level variables with textual information achieve the best predictive performance, with a 0.854 ROC-AUC and a 77% accuracy rate. A number of interpretable machine learning methods are applied. The permutation-based feature importance metric identifies key determinants such as applicants’ prior experience, page length, backward citations, claim counts, and patent family size. SHAP (SHapley Additive exPlanations), a local interpretability method, describes the marginal contributions of key predictors to the model prediction using two actual patent examples. Our study provides several valuable findings with important theoretical insights and practical applications. Specifically, we show that patent-level information can serve as a predictor of examination outcomes and that the relationships between the predictors and outcome variables are complex. Knowledge accumulation and technology complexity positively affect the likelihood of patent grant, albeit with a curvilinear relationship. At lower levels, both factors significantly increase the chance of a grant, but beyond a certain threshold, the marginal effect becomes less pronounced.
Additionally, prior experience, patent family size, and engagement with the patent agency have a monotonic and positive relationship with grant likelihood, whereas the impact of patent scope on patent grants remains uncertain. While a narrower and more specific patent claim is associated with a higher grant rate, the number of claims increases it. Moreover, technology range, inventor team size, and examination duration have little effect on patent grant results. From a practical standpoint, the accurate prediction of patent grants has significant potential applications. For instance, it could help firms better prioritize resources on patent applications with high grant potential to secure the final grant, as failure means a waste of R&D effort and disclosure of technology without IP protection. Additionally, patent examiners could utilize our predictive results as prior knowledge to enhance their judgment and accelerate the examination process.


Notes

  1. We have summarized these influencing factors for patent grant in Table 5 in the Appendix.

  2. To better present the two streams of related literature, we tabulate the related studies in Table 6 in the Appendix.

  3. A paid membership on incopat.com allows quick downloading of large amounts of patent data with numerous patent indicators. CNIPA offers a web portal for individual patent queries but not for bulk downloads of patent data.

  4. Only invention patents follow the three major stages, namely application, publication, and final decision. In comparison, utility and design patents are published once and only once, when they are officially granted, which means the rejected ones are not available to the public. That is why we use invention patent applications instead of the other types.

  5. Independent claims are ones that do not depend on any other claim. The first patent claim is always independent by law.

  6. In this section, we only detail the tuning and fitting process for the XGBoost model as an illustration, since most machine learning models are tuned and trained in similar ways.

References

  • Ahn, J. M., Mortara, L., & Minshall, T. (2018). Dynamic capabilities and economic crises: Has openness enhanced a firm’s performance in an economic downturn? Industrial and Corporate Change, 27(1), 49–63.

  • Asadi, M., Ebrahimi, N., Kharazmi, O., & Soofi, E. S. (2018). Mixture models, Bayes Fisher information, and divergence measures. IEEE Transactions on Information Theory, 65(4), 2316–2321.

  • Bekkers, R., Martinelli, A., & Tamagni, F. (2020). The impact of including standards-related documentation in patent prior art: Evidence from an EPO policy change. Research Policy, 49(7), 104007.

  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

  • Carmona, P., Climent, F., & Momparler, A. (2019). Predicting failure in the U.S. banking sector: An extreme gradient boosting approach. International Review of Economics & Finance, 61, 304–323.

  • Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 785–794).

  • Cho, J. H., Lee, J., & Sohn, S. Y. (2021). Predicting future technological convergence patterns based on machine learning using link prediction. Scientometrics, 126(7), 5413–5429.

  • Choi, Y., Park, S., & Lee, S. (2021). Identifying emerging technologies to envision a future innovation ecosystem: A machine learning approach to patent data. Scientometrics, 126(7), 5431–5476.

  • Choudhury, P., & Haas, M. R. (2018). Scope versus speed: Team diversity, leader experience, and patenting outcomes for firms. Strategic Management Journal, 39(4), 977–1002.

  • Chung, P., & Sohn, S. Y. (2020). Early detection of valuable patents using a deep learning model: Case of semiconductor industry. Technological Forecasting and Social Change, 158, 120146.

  • Climent, F., Momparler, A., & Carmona, P. (2019). Anticipating bank distress in the Eurozone: An extreme gradient boosting approach. Journal of Business Research, 101, 885–896.

  • de Rassenfosse, G., Palangkaraya, A., & Webster, E. (2016). Why do patents facilitate trade in technology? Testing the disclosure and appropriation effects. Research Policy, 45(7), 1326–1336.

  • de Rassenfosse, G., Palangkaraya, A., & Raiteri, E. (2020). Technology protectionism and the patent system: Strategic technologies in China. The Journal of Industrial Economics.

  • de Rassenfosse, G., Palangkaraya, A., & Hosseini, R. (2020). Discrimination against foreigners in the US patent system. Journal of International Business Policy, 3(4), 349–366.

  • Denisko, D., & Hoffman, M. M. (2018). Classification and interaction in random forests. Proceedings of the National Academy of Sciences, 115(8), 1690–1692.

  • Drivas, K., & Kaplanis, I. (2020). The role of international collaborations in securing the patent grant. Journal of Informetrics, 14(4), 101093.

  • Faems, D., Van Looy, B., & Debackere, K. (2005). Interorganizational collaboration and innovation: Toward a portfolio approach. Journal of Product Innovation Management, 22(3), 238–250.

  • Fisher, A., Rudin, C., & Dominici, F. (2019). All models are wrong, but many are useful: Learning a variable’s importance by studying an entire class of prediction models simultaneously. Journal of Machine Learning Research, 20(177), 1–81.

  • Frakes, M. D., & Wasserman, M. F. (2017). Is the time allocated to review patent applications inducing examiners to grant invalid patents? Evidence from microlevel application data. Review of Economics and Statistics, 99(3), 550–563.

  • Frakes, M. D., & Wasserman, M. F. (2021). Knowledge spillovers, peer effects, and telecommuting: Evidence from the U.S. patent office. Journal of Public Economics, 198, 104425.

  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232.

  • Friedman, J. H. (2017). The elements of statistical learning: Data mining, inference, and prediction. London: Springer.

  • Gans, J. S., Hsu, D. H., & Stern, S. (2008). The impact of uncertain intellectual property rights on the market for ideas: Evidence from patent grant delays. Management Science, 54(5), 982–997.

  • Ghoddusi, H., Creamer, G. G., & Rafizadeh, N. (2019). Machine learning in energy economics and finance: A review. Energy Economics, 81, 709–727.

  • Guellec, D., & van Pottelsberghe, B. (2000). Applications, grants and the value of patent. Economics Letters, 69(1), 109–114.

  • Hall, B. H., & Harhoff, D. (2012). Recent research on the economics of patents. Annual Review of Economics, 4(1), 541–565.

  • Hall, B. H., & Trajtenberg, J. M. (2005). Market value and patent citations. Rand Journal of Economics, 36(1), 16–38.

  • Han, S., Huang, H., Huang, X., Li, Y., Ruihua, Y., & Zhang, J. (2022). Core patent forecasting based on graph neural networks with an application in stock markets. Technology Analysis & Strategic Management, 34, 1–15.

  • Harhoff, D., & Wagner, S. (2009). The duration of patent examination at the European patent office. Management Science, 55(12), 1969–1984.

  • Harhoff, D., Scherer, F. M., & Vopel, K. (2003). Citations, family size, opposition and the value of patent rights. Research Policy, 32(8), 1343–1363.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. London: Springer.

  • Higham, K., De Rassenfosse, G., & Jaffe, A. B. (2021). Patent quality: Towards a systematic framework for analysis and measurement. Research Policy, 50(4), 104215.

  • Hur, W., & Junbyoung, O. (2021). A man is known by the company he keeps?: A structural relationship between backward citation and forward citation of patents. Research Policy, 50(1), 104117.

  • Jaffe, A. B., Trajtenberg, M., & Henderson, R. (1993). Geographic localization of knowledge spillovers as evidenced by patent citations. The Quarterly Journal of Economics, 108(3), 577–598.

  • Juranek, S., & Otneim, H. (2021). Using machine learning to predict patent lawsuits. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3871701

  • Katchanov, Y. L., Markova, Y. V., & Shmatko, N. A. (2019). The distinction machine: Physics journals from the perspective of the Kolmogorov–Smirnov statistic. Journal of Informetrics, 13(4), 100982.

  • Kim, D., Seo, D., Cho, S., & Kang, P. (2019). Multi-co-training for document classification using various document representations: TF-IDF, LDA, and Doc2Vec. Information Sciences, 477, 15–29.

  • Kim, J., Lee, G., Lee, S., & Lee, C. (2022). Towards expert-machine collaborations for technology valuation: An interpretable machine learning approach. Technological Forecasting and Social Change, 183, 121940.

  • Kim, Y. K., & Oh, J. B. (2017). Examination workloads, grant decision bias and examination quality of patent office. Research Policy, 46(5), 1005–1019.

  • Klincewicz, K., & Szumiał, S. (2022). Successful patenting–not only how, but with whom: The importance of patent attorneys. Scientometrics, 127(9), 5111–5137.

  • Kong, D., Yang, J., & Li, L. (2020). Early identification of technological convergence in numerical control machine tool: A deep learning approach. Scientometrics, 125(3), 1983–2009.

  • Kuhn, J. M., & Thompson, N. C. (2019). How to measure and draw causal inferences with patent scope. International Journal of the Economics of Business, 26(1), 5–38.

  • Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. New York: Springer.

  • Kwon, U., & Geum, Y. (2020). Identification of promising inventions considering the quality of knowledge accumulation: A machine learning approach. Scientometrics, 125(3), 1877–1897.

  • Kyebambe, M. N., Cheng, G., Huang, Y., He, C., & Zhang, Z. (2017). Forecasting emerging technologies: A supervised learning approach through patent analysis. Technological Forecasting and Social Change, 125, 236–244.

  • Lee, C., Kwon, O., Kim, M., & Kwon, D. (2018). Early identification of emerging technologies: A machine learning approach using multiple patent indicators. Technological Forecasting and Social Change, 127, 291–303.

  • Lemley, M. A., & Sampat, B. (2012). Examiner characteristics and patent office outcomes. Review of Economics and Statistics, 94(3), 817–827.

  • Lerner, J. (1994). The importance of patent scope: An empirical analysis. The RAND Journal of Economics, 25(2), 319.

  • Li, K., Cursio, J. D., Sun, Y., & Zhu, Z. (2019). Determinants of price fluctuations in the electricity market: A study with PCA and NARDL models. Economic Research-Ekonomska Istraživanja, 32(1), 2404–2421.

  • Li, P., Mao, K., Yuecong, X., Li, Q., & Zhang, J. (2020). Bag-of-concepts representation for document classification based on automatic knowledge acquisition from probabilistic knowledge base. Knowledge-Based Systems, 193, 105436.

  • Liegsalz, J., & Wagner, S. (2013). Patent examination at the State Intellectual Property Office in China. Research Policy, 42(2), 552–563.

  • Loyola-Gonzalez, O. (2019). Black-box vs. white-box: Understanding their advantages and weaknesses from a practical point of view. IEEE Access, 7, 154096–154113.

  • Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st international conference on neural information processing systems (pp. 4768–4777).

  • Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., & Lee, S.-I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56–67.

  • Mann, R. J., & Sager, T. W. (2007). Patents, venture capital, and software start-ups. Research Policy, 36(2), 193–208.

  • Marco, A. C., Sarnoff, J. D., & deGrazia, C. A. W. (2019). Patent claims and patent scope. Research Policy, 48(9), 103790.

  • Molnar, C. (2020). Interpretable machine learning. Lulu.com.

  • Moser, P., Ohmstedt, J., & Rhode, P. W. (2017). Patent citations—An analysis of quality differences and citing practices in hybrid corn. Management Science, mnsc.2016.2688.

  • Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31(2), 87–106.

  • Novelli, E. (2015). An examination of the antecedents and implications of patent scope. Research Policy, 44(2), 493–507.

  • Ponta, L., Puliga, G., Oneto, L., & Manzini, R. (2020). Identifying the determinants of innovation capability with machine learning and patents. IEEE Transactions on Engineering Management. https://doi.org/10.1109/TEM.2020.3004237

  • Sampat, B., & Williams, H. L. (2019). How do patents affect follow-on innovation? Evidence from the human genome. American Economic Review, 109(1), 203–236.

  • Schoenmakers, W., & Duysters, G. (2010). The technological origins of radical inventions. Research Policy, 39(8), 1051–1059.

  • Schuster, W. M., Evan Davis, R., Schley, K., & Ravenscraft, J. (2020). An empirical study of patent grant rates as a function of race and gender. American Business Law Journal, 57, 39.

  • Sun, Z., & Wright, B. D. (2022). Citations backward and forward: Insights into the patent examiner’s role. Research Policy, 51(7), 104517.

  • Tong, T. W., Zhang, K., He, Z.-L., & Zhang, Y. (2018). What determines the duration of patent examination in China? An outcome-specific duration analysis of invention patent applications at SIPO. Research Policy, 47(3), 583–591.

  • Useche, D. (2014). Are patents signals for the IPO market? An EU-US comparison for the software industry. Research Policy, 43(8), 1299–1311.

  • van Zeebroeck, N., de la Potterie, B. P., & Guellec, D. (2009). Claiming more: The increased voluminosity of patent applications and its determinants. Research Policy, 38(6), 1006–1020.

  • Wang, X., Yang, X., Jian, D., Wang, X., Li, J., & Tang, X. (2021). A deep learning approach for identifying biomedical breakthrough discoveries using context analysis. Scientometrics, 126(7), 5531–5549.

  • Webster, E., Jensen, P. H., & Palangkaraya, A. (2014). Patent examination outcomes and the national treatment principle. The RAND Journal of Economics, 45(2), 449–469.

  • Winter, E. (2002). The Shapley value. In Handbook of game theory with economic applications (Vol. 3, pp. 2025–2054). London: North-Holland.

  • Xie, Y., & Giles, D. E. (2011). A survival analysis of the approval of US patent applications. Applied Economics, 43(11), 1375–1384.

  • Yang, D. (2008). Pendency and grant ratios of invention patents: A comparative study of the US and China. Research Policy, 37(6–7), 1035–1046.

  • Zhang, G., & Tang, C. (2017). How could firm’s internal R&D collaboration bring more innovation? Technological Forecasting and Social Change. https://doi.org/10.1016/j.techfore.2017.07.007

  • Zhang, Y., Ma, F., & Wang, Y. (2019). Forecasting crude oil prices with a large set of predictors: Can LASSO select powerful predictors? Journal of Empirical Finance, 54, 97–117.

  • Zhao, L. (2022). On the grant rate of Patent Cooperation Treaty applications: Theory and evidence. Economic Modelling, 117, 106051.

  • Zhao, Q., & Hastie, T. (2021). Causal interpretations of black-box models. Journal of Business & Economic Statistics, 39(1), 272–281.

  • Zhou, Y., Dong, F., Liu, Y., Li, Z., JunFei, D., & Zhang, L. (2020). Forecasting emerging technologies using data augmentation and deep learning. Scientometrics, 123(1), 1–29.

  • Zhu, K., Malhotra, S., & Li, Y. (2022). Technological diversity of patent applications and decision pendency. Research Policy, 51(1), 104364.


Funding

The authors are grateful for financial support from Philosophy and Social Science Foundation of Zhejiang Province (Grant No. 22NDQN246YB), National Natural Science Foundation of China (Grant No. 72204222), the Characteristic and Preponderant Discipline of Key Construction Universities in Zhejiang Province (Zhejiang Gongshang University-Statistics), Zhejiang Gongshang University “Digital+” Disciplinary Construction Management Project (Grant No. SZJ2022B001), and the Tailong Finance School.

Author information


Corresponding author

Correspondence to He Ni.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix


Table of literature review

Table 5 Summary of influencing factors to patent grant
Table 6 Literature review summary

Feature selection

While machine learning models are good at handling high-dimensional input variables, non-informative predictors add uncertainty to the predictions and reduce the overall effectiveness of the model (Kuhn & Johnson, 2013). Hence, the logic behind feature selection is that removing non-informative features can improve prediction performance. We measure informativeness using the difference between a variable's distributions in the granted and rejected groups. The intuition is that the more the distributions of a focal variable differ between the granted and the rejected, the more informative that variable is. Thus, for each feature we calculate the difference between the pair, treating that difference as proportional to the usefulness of the feature.

Fig. 6 The difference between the distributions of the granted and the rejected

The distribution distances are illustrated in Fig. 6. The blue density indicates the granted sample, while the yellow density indicates the rejected sample. The extent to which the two distributions differ is evidence of the informativeness of the focal predictor.

We apply two well-established approaches to measure the distributional difference.

Kolmogorov–Smirnov test (KS test)

The KS test is a form of minimum distance estimation used to compare two datasets. The KS statistic quantifies the distance between the empirical distribution functions of two groups of samples. The two-sample KS test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both the location and the shape of the empirical cumulative distribution functions. The test statistic is the maximum distance between the two empirical cumulative distribution curves:

$$\begin{aligned} D_n = \max _{1 \le x \le N} |F_a(Y_x) - F_b(Y_x)| \end{aligned}$$
(3)

where \(F_a\) and \(F_b\) are the empirical cumulative distribution functions of the granted and rejected samples, evaluated over the pooled observations. The statistic captures the greatest separation between \(F_a\) and \(F_b\); when \(F_a \ne F_b\), larger values of \(D_n\) are expected.
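As a sketch of this computation (in Python rather than the authors' R workflow, with simulated `granted`/`rejected` arrays standing in for a real feature), the two-sample statistic can be evaluated directly from the empirical CDFs over the pooled sample points:

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    grid = np.sort(np.concatenate([a, b]))           # pooled evaluation points
    f_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    f_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(f_a - f_b)))

# Hypothetical feature values for the two examination outcomes
rng = np.random.default_rng(0)
granted = rng.normal(0.5, 1.0, 500)
rejected = rng.normal(0.0, 1.0, 500)

d = ks_statistic(granted, rejected)   # larger d suggests a more informative feature
```

Identical samples give a statistic of 0, fully separated samples give 1, matching the intuition that the distance is proportional to informativeness.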

Jensen–Shannon divergence

JS divergence is based on Kullback–Leibler (KL) divergence, a measure of statistical distance sometimes referred to as relative entropy. KL divergence calculates a score that measures how much one probability distribution diverges from another and can be used as a metric quantifying the difference between two probability distributions. If \(F_a\) and \(F_b\) represent the probability distributions of a discrete random variable, the KL divergence is calculated as a summation:

$$\begin{aligned} KL(F_a \,||\, F_b) = \sum _x F_a(x)\log \Big (\frac{F_a(x)}{F_b(x)} \Big ) \end{aligned}$$
(4)

The intuition for the KL divergence score is that when the probability of an event under \(F_a\) is large but its probability under \(F_b\) is small, there is a large divergence. When the probability under \(F_a\) is small and the probability under \(F_b\) is large, there is also a large divergence, but not as large as in the first case.

JS divergence extends KL divergence into a symmetric score and distance measure between two probability distributions. It is calculated as follows:

$$\begin{aligned} JS(F_a || F_b) = \frac{KL(F_a ||F_M)}{2} + \frac{KL(F_b||F_M)}{2} \end{aligned}$$
(5)

where \(F_M\) is calculated as:

$$\begin{aligned} F_M = \frac{F_a + F_b}{2} \end{aligned}$$
(6)

Based on both the KS and the JS measurements of the distributional difference between the granted and the rejected, we select the relatively more informative features as predictors for our machine learning models.

The permutation feature importance

Permutation feature importance measures the increase in the prediction error of the model after a feature's values are permuted. The idea is intuitive: if shuffling a feature's values increases the model error more, then the model relied on that feature for prediction to a greater degree. A feature is hence "unimportant" if shuffling its values leaves the model error unchanged, because the model evidently ignored the feature for prediction. This method was initially introduced by Breiman (2001) for random forests and extended to a model-agnostic version by Fisher et al. (2019). The permutation feature importance algorithm we use is based on Fisher et al. (2019):

Input: trained model \({\hat{f}}\), feature matrix X, target vector y, error measure \(L(y, {\hat{f}})\).

  1. Estimate the original model error \(e_{orig} = L(y, {\hat{f}}(X))\) (e.g., R-squared or mean squared error).

  2. For each feature \(j \in \{1,\ldots , p\}\):

    • Generate feature matrix \(X_{perm}\) by permuting feature j in the data X. This breaks the association between feature j and the true outcome y.

    • Estimate the error \(e_{perm} = L(y, {\hat{f}}(X_{perm}))\) based on the predictions on the permuted data.

    • Calculate the permutation feature importance as the quotient \(FI_j = \frac{e_{perm}}{e_{orig}}\) or the difference \(FI_j = e_{perm} - e_{orig}\).

  3. Sort features by descending \(FI_j\).

In practice, we use the \(vi\_permute\) function in the R package vip to generate feature importance for XGBoost, random forest, SVM, LASSO, and Logit.
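The difference version of the algorithm is easy to implement model-agnostically. The following is a minimal Python sketch (not the authors' vip-based R code); the "trained" model and data are synthetic, constructed so that feature 0 matters most, feature 1 weakly, and feature 2 not at all:

```python
import numpy as np

def permutation_importance(model_fn, X, y, loss_fn, rng):
    """Model-agnostic permutation importance: error increase per shuffled column."""
    e_orig = loss_fn(y, model_fn(X))
    importances = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        rng.shuffle(X_perm[:, j])            # break the link between feature j and y
        e_perm = loss_fn(y, model_fn(X_perm))
        importances.append(e_perm - e_orig)  # difference version of FI_j
    return np.array(importances)

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 3))
y = 3 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=1000)

model = lambda X: 3 * X[:, 0] + 0.1 * X[:, 1]        # stand-in for a fitted model
mse = lambda y, yhat: float(np.mean((y - yhat) ** 2))
fi = permutation_importance(model, X, y, mse, rng)
```

By construction, `fi[0] > fi[1] > fi[2]`, and the importance of the unused feature is exactly zero, since shuffling it cannot change the predictions.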

The partial dependence plots

The SHAP dependence plot shows a fitted curve for a scatter plot of SHAP values against actual feature values, illustrating whether the relationship between the two is linear, monotonic, or more complex.

Fig. 7 The partial dependence plots of major variables

The more traditional way to visualize this interrelation is the Partial Dependence Plot (PDP) proposed by Friedman (2001), which illustrates how each variable affects the chance of patent grant. The SHAP dependence plot and the PDP differ in two ways. First, the y-axis of the SHAP dependence plot is the SHAP value (or marginal contribution), while the y-axis of the PDP is the average prediction. Thus, the y-axis of the SHAP dependence plot is centered on zero, while that of the PDP is centered on 0.46 (the average grant ratio). Second, the SHAP dependence plot presents more information, such as the distribution of individual samples. Yet the PDP is a more widely accepted and applied tool for model interpretation, and we apply it as a robustness check on our results.

Given the output function f(x) of a machine learning algorithm, the partial dependence of f on variable \(X_s\) is defined as (letting c denote the complement set of s):

$$\begin{aligned} f^s(x_s) = E_{X_c}[f(x_s, X_c)] = \int f(x_s, x_c)dP(x_c) \end{aligned}$$
(7)

where the PDP \(f^s\) is the expectation of f over the marginal distribution of all variables except \(x_s\). In practice, the PDP is estimated by averaging over the training data with \(x_s\) held fixed:

$$\begin{aligned} {\bar{f}}^s(x_s) = \frac{1}{n} \sum _{i=1}^{n} f(x_s, x_c^{(i)}) \end{aligned}$$
(8)

PDP plots indicate the direct causal effect of a variable s on the outcome variable if the back-door condition is satisfied (Zhao & Hastie, 2021). We admit that some of the variables are likely endogenous or descendants of other variables, in the terminology of the directed acyclic graph (DAG). For instance, the variable duration, the time between application and publication, can be affected by other confounding variables. Though these relationships may not be interpreted causally, many of the strong correlations are nonetheless worth pointing out.

Figure 7 illustrates 12 PDP plots for the major predictors. To keep the plots comparable, we limit the y-axis to the range 0.4 to 0.6 and set the x-axis according to the distribution of each predictor. Note that from Panel (a) to (f), the PDP generates relational patterns between the predictors and the output variable similar to those in Fig. 5. Backward citation, patent length, number of claims, length of the first claim, grant rate of the previous year, and patent family size all show a similar pattern of positive correlation with the outcome variable. For backward citation, patent length, and the number of claims, there exists a threshold effect: beyond 4 for citing, 20 for pages, and 10 for \(right\_no\), the marginal benefits of these variables stay positive and constant. Meanwhile, \(nchar\_right\), \(app\_rate2010\), and nfamily all show a monotonically increasing relationship with the outcome variable. Hence, the PDP confirms the relationships between the predictors and outcome variables suggested and discussed in the main paper.

Patent length, or page count, increases the likelihood of a patent grant. There are notable outliers with one or two pages, and these are patents with an above-average grant rate; they also appear as four blue dots in the upper left corner of Panel (b) of Fig. 5. These are possible data anomalies, but they do not interfere with the model interpretation.

Besides the six major variables, we also explore other less important yet noteworthy predictors. The agent success rate in the previous year, \(agent\_rate2010\), increases the likelihood of patent grant, suggesting that relying on a high-quality patent agent could help the examination outcome. Agent application load, \(agent\_freq\), initially increases but eventually decreases the grant rate. Application frequency in the same year, \(app\_freq\), could lower the outcome variable because too many applications can be distracting. Panel (i) shows that the duration between application and publication has a complex relationship with the outcome variable. Panel (k) shows that the number of inventors has little effect on the likelihood of patent grant, with only a slight increase for larger inventor teams. Panel (l) shows that the length of the abstract, \(nchar\_ab\), increases the grant likelihood, possibly because it indicates technology complexity, similar to patent length.

It is also interesting to note that Panels (m) and (o) show forward citation and the number of IPC classes assigned have little predictive power for whether a patent is granted.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


Cite this article

Yao, L., Ni, H. Prediction of patent grant and interpreting the key determinants: an application of interpretable machine learning approach. Scientometrics 128, 4933–4969 (2023). https://doi.org/10.1007/s11192-023-04736-z

