Prediction of patent grant and interpreting the key determinants: an application of interpretable machine learning approach

Published in Scientometrics

Abstract

Patents are valuable intellectual property only when granted by the government, and failing to receive an official grant means disclosing valuable technologies and information that would otherwise be kept as commercial secrets. Yet a typical patent application process takes years to complete, and the outcome is uncertain. This study implements machine learning models to predict patent examination outcomes based on early information disclosed at patent publication, and interprets how these models make predictions, highlighting the key determinants of patent grant and delineating the relationships between patent features and the examination outcome. The predictive models that integrate patent-level variables with textual information achieve the best predictive performance, with a 0.854 ROC-AUC and a 77% accuracy rate. A number of interpretable machine learning methods are applied. The permutation-based feature importance metric identifies key determinants such as applicants’ prior experience, page length, backward citations, claim counts, and patent family size. SHAP (SHapley Additive exPlanations), a local interpretability method, describes the marginal contributions of key predictors to the model prediction using two actual patent examples. Our study provides several valuable findings with important theoretical insights and practical applications. Specifically, we show that patent-level information can serve as a predictor of examination outcomes and that the relationships between the predictors and outcome variables are complex. Knowledge accumulation and technology complexity positively affect the likelihood of patent grant, albeit with a curvilinear relationship. At lower levels, both factors significantly increase the chance of a grant, but beyond a certain threshold, the marginal effect becomes less pronounced.
Additionally, prior experience, patent family size, and engagement with the patent agency have a monotonic and positive relationship with grant likelihood, whereas the impact of patent scope on patent grants remains uncertain. While a narrower and more specific patent claim is associated with a higher grant rate, the number of claims increases it. Moreover, technology range, inventor team size, and examination duration have little effect on patent grant results. From a practical standpoint, the accurate prediction of patent grants has significant potential applications. For instance, it could help firms better prioritize resources on patent applications with high grant potential to secure the final grant, as failure means a waste of R&D effort and disclosure of technology without IP protection. Additionally, patent examiners could utilize our predictive results as prior knowledge to enhance their judgment and accelerate the examination process.


Notes

  1. We have summarized these influencing factors for patent grant in Table 5 in the Appendix.

  2. To better present the two streams of related literature, we tabulate the related studies in Table 6 in the Appendix.

  3. A paid membership on incopat.com allows quick downloading of large amounts of patent data with numerous patent indicators. CNIPA offers a web portal for individual patent queries but not for bulk downloads of patent data.

  4. Only invention patents follow the three major stages, namely application, publication, and final decision. In comparison, utility and design patents are published once and only once, when they are officially granted, which means the rejected ones are not available to the public. That is why we use invention patent applications instead of the other types.

  5. Independent claims are ones that do not depend on any other claim. The first patent claim is always independent by law.

  6. In this section, we only detail the tuning and fitting process for the XGBoost model as an illustration, since most machine learning models are tuned and trained in similar ways.

References

  • Ahn, J. M., Mortara, L., & Minshall, T. (2018). Dynamic capabilities and economic crises: Has openness enhanced a firm’s performance in an economic downturn? Industrial and Corporate Change, 27(1), 49–63.

  • Asadi, M., Ebrahimi, N., Kharazmi, O., & Soofi, E. S. (2018). Mixture models, Bayes Fisher information, and divergence measures. IEEE Transactions on Information Theory, 65(4), 2316–2321.

  • Bekkers, R., Martinelli, A., & Tamagni, F. (2020). The impact of including standards-related documentation in patent prior art: Evidence from an EPO policy change. Research Policy, 49(7), 104007.

  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

  • Carmona, P., Climent, F., & Momparler, A. (2019). Predicting failure in the U.S. banking sector: An extreme gradient boosting approach. International Review of Economics & Finance, 61, 304–323.

  • Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 785–794).

  • Cho, J. H., Lee, J., & Sohn, S. Y. (2021). Predicting future technological convergence patterns based on machine learning using link prediction. Scientometrics, 126(7), 5413–5429.

  • Choi, Y., Park, S., & Lee, S. (2021). Identifying emerging technologies to envision a future innovation ecosystem: A machine learning approach to patent data. Scientometrics, 126(7), 5431–5476.

  • Choudhury, P., & Haas, M. R. (2018). Scope versus speed: Team diversity, leader experience, and patenting outcomes for firms. Strategic Management Journal, 39(4), 977–1002.

  • Chung, P., & Sohn, S. Y. (2020). Early detection of valuable patents using a deep learning model: Case of semiconductor industry. Technological Forecasting and Social Change, 158, 120146.

  • Climent, F., Momparler, A., & Carmona, P. (2019). Anticipating bank distress in the Eurozone: An extreme gradient boosting approach. Journal of Business Research, 101, 885–896.

  • de Rassenfosse, G., Palangkaraya, A., & Webster, E. (2016). Why do patents facilitate trade in technology? Testing the disclosure and appropriation effects. Research Policy, 45(7), 1326–1336.

  • de Rassenfosse, G., Palangkaraya, A., & Raiteri, E. (2020). Technology protectionism and the patent system: Strategic technologies in China. The Journal of Industrial Economics.

  • de Rassenfosse, G., Palangkaraya, A., & Hosseini, R. (2020). Discrimination against foreigners in the US patent system. Journal of International Business Policy, 3(4), 349–366.

  • Denisko, D., & Hoffman, M. M. (2018). Classification and interaction in random forests. Proceedings of the National Academy of Sciences, 115(8), 1690–1692.

  • Drivas, K., & Kaplanis, I. (2020). The role of international collaborations in securing the patent grant. Journal of Informetrics, 14(4), 101093.

  • Faems, D., Van Looy, B., & Debackere, K. (2005). Interorganizational collaboration and innovation: Toward a portfolio approach. Journal of Product Innovation Management, 22(3), 238–250.

  • Fisher, A., Rudin, C., & Dominici, F. (2019). All models are wrong, but many are useful: Learning a variable’s importance by studying an entire class of prediction models simultaneously. Journal of Machine Learning Research, 20(177), 1–81.

  • Frakes, M. D., & Wasserman, M. F. (2017). Is the time allocated to review patent applications inducing examiners to grant invalid patents? Evidence from microlevel application data. Review of Economics and Statistics, 99(3), 550–563.

  • Frakes, M. D., & Wasserman, M. F. (2021). Knowledge spillovers, peer effects, and telecommuting: Evidence from the U.S. patent office. Journal of Public Economics, 198, 104425.

  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232.

  • Friedman, J. H. (2017). The elements of statistical learning: Data mining, inference, and prediction. London: Springer.

  • Gans, J. S., Hsu, D. H., & Stern, S. (2008). The impact of uncertain intellectual property rights on the market for ideas: Evidence from patent grant delays. Management Science, 54(5), 982–997.

  • Ghoddusi, H., Creamer, G. G., & Rafizadeh, N. (2019). Machine learning in energy economics and finance: A review. Energy Economics, 81, 709–727.

  • Guellec, D., & van Pottelsberghe, B. (2000). Applications, grants and the value of patent. Economics Letters, 69(1), 109–114.

  • Hall, B. H., & Harhoff, D. (2012). Recent research on the economics of patents. Annual Review of Economics, 4(1), 541–565.

  • Hall, B. H., & Trajtenberg, J. M. (2005). Market value and patent citations. Rand Journal of Economics, 36(1), 16–38.

  • Han, S., Huang, H., Huang, X., Li, Y., Ruihua, Y., & Zhang, J. (2022). Core patent forecasting based on graph neural networks with an application in stock markets. Technology Analysis & Strategic Management, 34, 1–15.

  • Harhoff, D., & Wagner, S. (2009). The duration of patent examination at the European patent office. Management Science, 55(12), 1969–1984.

  • Harhoff, D., Scherer, F. M., & Vopel, K. (2003). Citations, family size, opposition and the value of patent rights. Research Policy, 32(8), 1343–1363.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. London: Springer.

  • Higham, K., De Rassenfosse, G., & Jaffe, A. B. (2021). Patent quality: Towards a systematic framework for analysis and measurement. Research Policy, 50(4), 104215.

  • Hur, W., & Junbyoung, O. (2021). A man is known by the company he keeps?: A structural relationship between backward citation and forward citation of patents. Research Policy, 50(1), 104117.

  • Jaffe, A. B., Trajtenberg, M., & Henderson, R. (1993). Geographic localization of knowledge spillovers as evidenced by patent citations. The Quarterly Journal of Economics, 108(3), 577–598.

  • Juranek, S., & Otneim, H. (2021). Using machine learning to predict patent lawsuits. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3871701

  • Katchanov, Y. L., Markova, Y. V., & Shmatko, N. A. (2019). The distinction machine: Physics journals from the perspective of the Kolmogorov–Smirnov statistic. Journal of Informetrics, 13(4), 100982.

  • Kim, D., Seo, D., Cho, S., & Kang, P. (2019). Multi-co-training for document classification using various document representations: TF-IDF, LDA, and Doc2Vec. Information Sciences, 477, 15–29.

  • Kim, J., Lee, G., Lee, S., & Lee, C. (2022). Towards expert-machine collaborations for technology valuation: An interpretable machine learning approach. Technological Forecasting and Social Change, 183, 121940.

  • Kim, Y. K., & Oh, J. B. (2017). Examination workloads, grant decision bias and examination quality of patent office. Research Policy, 46(5), 1005–1019.

  • Klincewicz, K., & Szumiał, S. (2022). Successful patenting–not only how, but with whom: The importance of patent attorneys. Scientometrics, 127(9), 5111–5137.

  • Kong, D., Yang, J., & Li, L. (2020). Early identification of technological convergence in numerical control machine tool: A deep learning approach. Scientometrics, 125(3), 1983–2009.

  • Kuhn, J. M., & Thompson, N. C. (2019). How to measure and draw causal inferences with patent scope. International Journal of the Economics of Business, 26(1), 5–38.

  • Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. New York: Springer.

  • Kwon, U., & Geum, Y. (2020). Identification of promising inventions considering the quality of knowledge accumulation: A machine learning approach. Scientometrics, 125(3), 1877–1897.

  • Kyebambe, M. N., Cheng, G., Huang, Y., He, C., & Zhang, Z. (2017). Forecasting emerging technologies: A supervised learning approach through patent analysis. Technological Forecasting and Social Change, 125, 236–244.

  • Lee, C., Kwon, O., Kim, M., & Kwon, D. (2018). Early identification of emerging technologies: A machine learning approach using multiple patent indicators. Technological Forecasting and Social Change, 127, 291–303.

  • Lemley, M. A., & Sampat, B. (2012). Examiner characteristics and patent office outcomes. Review of Economics and Statistics, 94(3), 817–827.

  • Lerner, J. (1994). The importance of patent scope: An empirical analysis. The RAND Journal of Economics, 25(2), 319.

  • Li, K., Cursio, J. D., Sun, Y., & Zhu, Z. (2019). Determinants of price fluctuations in the electricity market: A study with PCA and NARDL models. Economic Research-Ekonomska Istraživanja, 32(1), 2404–2421.

  • Li, P., Mao, K., Yuecong, X., Li, Q., & Zhang, J. (2020). Bag-of-concepts representation for document classification based on automatic knowledge acquisition from probabilistic knowledge base. Knowledge-Based Systems, 193, 105436.

  • Liegsalz, J., & Wagner, S. (2013). Patent examination at the State Intellectual Property Office in China. Research Policy, 42(2), 552–563.

  • Loyola-Gonzalez, O. (2019). Black-box vs. white-box: Understanding their advantages and weaknesses from a practical point of view. IEEE Access, 7, 154096–154113.

  • Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st international conference on neural information processing systems (pp. 4768–4777).

  • Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., & Lee, S.-I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56–67.

  • Mann, R. J., & Sager, T. W. (2007). Patents, venture capital, and software start-ups. Research Policy, 36(2), 193–208.

  • Marco, A. C., Sarnoff, J. D., & deGrazia, C. A. W. (2019). Patent claims and patent scope. Research Policy, 48(9), 103790.

  • Molnar, C. (2020). Interpretable machine learning. Lulu.com.

  • Moser, P., Ohmstedt, J., & Rhode, P. W. (2017). Patent citations—An analysis of quality differences and citing practices in hybrid corn. Management Science, mnsc.2016.2688.

  • Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31(2), 87–106.

  • Novelli, E. (2015). An examination of the antecedents and implications of patent scope. Research Policy, 44(2), 493–507.

  • Ponta, L., Puliga, G., Oneto, L., & Manzini, R. (2020). Identifying the determinants of innovation capability with machine learning and patents. IEEE Transactions on Engineering Management. https://doi.org/10.1109/TEM.2020.3004237

  • Sampat, B., & Williams, H. L. (2019). How do patents affect follow-on innovation? Evidence from the human genome. American Economic Review, 109(1), 203–236.

  • Schoenmakers, W., & Duysters, G. (2010). The technological origins of radical inventions. Research Policy, 39(8), 1051–1059.

  • Schuster, W. M., Evan Davis, R., Schley, K., & Ravenscraft, J. (2020). An empirical study of patent grant rates as a function of race and gender. American Business Law Journal, 57, 39.

  • Sun, Z., & Wright, B. D. (2022). Citations backward and forward: Insights into the patent examiner’s role. Research Policy, 51(7), 104517.

  • Tong, T. W., Zhang, K., He, Z.-L., & Zhang, Y. (2018). What determines the duration of patent examination in China? An outcome-specific duration analysis of invention patent applications at SIPO. Research Policy, 47(3), 583–591.

  • Useche, D. (2014). Are patents signals for the IPO market? An EU-US comparison for the software industry. Research Policy, 43(8), 1299–1311.

  • van Zeebroeck, N., de la Potterie, B. P., & Guellec, D. (2009). Claiming more: The increased voluminosity of patent applications and its determinants. Research Policy, 38(6), 1006–1020.

  • Wang, X., Yang, X., Jian, D., Wang, X., Li, J., & Tang, X. (2021). A deep learning approach for identifying biomedical breakthrough discoveries using context analysis. Scientometrics, 126(7), 5531–5549.

  • Webster, E., Jensen, P. H., & Palangkaraya, A. (2014). Patent examination outcomes and the national treatment principle. The RAND Journal of Economics, 45(2), 449–469.

  • Winter, E. (2002). The Shapley value. In Handbook of game theory with economic applications (Vol. 3, pp. 2025–2054). London: North-Holland.

  • Xie, Y., & Giles, D. E. (2011). A survival analysis of the approval of US patent applications. Applied Economics, 43(11), 1375–1384.

  • Yang, D. (2008). Pendency and grant ratios of invention patents: A comparative study of the US and China. Research Policy, 37(6–7), 1035–1046.

  • Zhang, G., & Tang, C. (2017). How could firm’s internal R&D collaboration bring more innovation? Technological Forecasting and Social Change. https://doi.org/10.1016/j.techfore.2017.07.007

  • Zhang, Y., Ma, F., & Wang, Y. (2019). Forecasting crude oil prices with a large set of predictors: Can LASSO select powerful predictors? Journal of Empirical Finance, 54, 97–117.

  • Zhao, L. (2022). On the grant rate of Patent Cooperation Treaty applications: Theory and evidence. Economic Modelling, 117, 106051.

  • Zhao, Q., & Hastie, T. (2021). Causal interpretations of black-box models. Journal of Business & Economic Statistics, 39(1), 272–281.

  • Zhou, Y., Dong, F., Liu, Y., Li, Z., JunFei, D., & Zhang, L. (2020). Forecasting emerging technologies using data augmentation and deep learning. Scientometrics, 123(1), 1–29.

  • Zhu, K., Malhotra, S., & Li, Y. (2022). Technological diversity of patent applications and decision pendency. Research Policy, 51(1), 104364.


Funding

The authors are grateful for financial support from Philosophy and Social Science Foundation of Zhejiang Province (Grant No. 22NDQN246YB), National Natural Science Foundation of China (Grant No. 72204222), the Characteristic and Preponderant Discipline of Key Construction Universities in Zhejiang Province (Zhejiang Gongshang University-Statistics), Zhejiang Gongshang University “Digital+” Disciplinary Construction Management Project (Grant No. SZJ2022B001), and the Tailong Finance School.

Author information


Corresponding author

Correspondence to He Ni.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix


Table of literature review

Table 5 Summary of influencing factors to patent grant
Table 6 Literature review summary

Feature selection

While machine learning models are good at handling high-dimensional input variables, non-informative predictors add uncertainty to the predictions and reduce the overall effectiveness of the model (Kuhn & Johnson, 2013). Hence, the logic behind feature selection is that removing non-informative features can improve prediction performance. We measure informativeness using the difference between a variable's distributions in the granted and rejected groups. The intuition is that the more the distributions of a focal variable differ between the granted and the rejected, the more informative that variable is. Thus, for each feature we calculate the difference between the pair, treating that difference as proportional to the usefulness of the feature.

Fig. 6 The difference between the distributions of the granted and the rejected

The distribution distances are illustrated in Fig. 6. The blue density indicates the granted sample, while the yellow density indicates the rejected sample. The extent to which the two distributions differ is evidence of the informativeness of the focal predictor.

We apply two well-established approaches to measure the distributional difference.

Kolmogorov–Smirnov test (KS test)

The KS test is a form of minimum distance estimation used to compare two datasets. The KS statistic quantifies the distance between the empirical distribution functions of two groups of samples. The two-sample KS test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both the location and the shape of the empirical cumulative distribution functions. The test statistic is the maximum distance between the two empirical cumulative distribution curves:

$$\begin{aligned} D_n = \max _{1 \le x \le N} |F_a(Y_x) - F_b(Y_x)| \end{aligned}$$
(3)

where \(F_a\) and \(F_b\) are the empirical cumulative distribution functions of the granted and rejected samples, evaluated over the pooled observations. The statistic captures the greatest separation between \(F_a\) and \(F_b\); when \(F_a \ne F_b\), larger values of \(D_n\) are expected.
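As a sketch of this computation (in Python rather than the authors' R workflow, with simulated `granted`/`rejected` arrays standing in for a real feature), the two-sample statistic can be evaluated directly from the empirical CDFs over the pooled sample points:

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    grid = np.sort(np.concatenate([a, b]))           # pooled evaluation points
    f_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    f_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(f_a - f_b)))

# Hypothetical feature values for the two examination outcomes
rng = np.random.default_rng(0)
granted = rng.normal(0.5, 1.0, 500)
rejected = rng.normal(0.0, 1.0, 500)

d = ks_statistic(granted, rejected)   # larger d suggests a more informative feature
```

Identical samples give a statistic of 0, fully separated samples give 1, matching the intuition that the distance is proportional to informativeness.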

Jensen–Shannon divergence

JS divergence is based on Kullback–Leibler (KL) divergence, a measure of statistical distance sometimes referred to as relative entropy. KL divergence calculates a score that measures how much one probability distribution diverges from another and can be used as a metric quantifying the difference between two probability distributions. If \(F_a\) and \(F_b\) represent the probability distributions of a discrete random variable, the KL divergence is calculated as a summation:

$$\begin{aligned} KL(F_a \,||\, F_b) = \sum _x F_a(x)\log \Big (\frac{F_a(x)}{F_b(x)} \Big ) \end{aligned}$$
(4)

The intuition for the KL divergence score is that when the probability of an event under \(F_a\) is large but its probability under \(F_b\) is small, there is a large divergence. When the probability under \(F_a\) is small and the probability under \(F_b\) is large, there is also a large divergence, but not as large as in the first case.

JS divergence extends KL divergence into a symmetric score and distance measure between two probability distributions. It is calculated as follows:

$$\begin{aligned} JS(F_a || F_b) = \frac{KL(F_a ||F_M)}{2} + \frac{KL(F_b||F_M)}{2} \end{aligned}$$
(5)

where \(F_M\) is calculated as:

$$\begin{aligned} F_M = \frac{F_a + F_b}{2} \end{aligned}$$
(6)

Based on both the KS and the JS measurements of the distributional difference between the granted and the rejected, we select the relatively more informative features as predictors for our machine learning models.

The permutation feature importance

Permutation feature importance measures the increase in the prediction error of the model after a feature's values are permuted. The idea is intuitive: if shuffling a feature's values increases the model error more, then the model relied on that feature for prediction to a greater degree. A feature is hence "unimportant" if shuffling its values leaves the model error unchanged, because the model evidently ignored the feature for prediction. This method was initially introduced by Breiman (2001) for random forests and extended to a model-agnostic version by Fisher et al. (2019). The permutation feature importance algorithm we use is based on Fisher et al. (2019):

Input: trained model \({\hat{f}}\), feature matrix X, target vector y, error measure \(L(y, {\hat{f}})\).

  1. Estimate the original model error \(e_{orig} = L(y, {\hat{f}}(X))\) (e.g., R-squared or mean squared error).

  2. For each feature \(j \in \{1,\ldots , p\}\):

    • Generate feature matrix \(X_{perm}\) by permuting feature j in the data X. This breaks the association between feature j and the true outcome y.

    • Estimate the error \(e_{perm} = L(y, {\hat{f}}(X_{perm}))\) based on the predictions on the permuted data.

    • Calculate the permutation feature importance as the quotient \(FI_j = \frac{e_{perm}}{e_{orig}}\) or the difference \(FI_j = e_{perm} - e_{orig}\).

  3. Sort features by descending \(FI_j\).

In practice, we use the \(vi\_permute\) function in the R package vip to generate feature importance for XGBoost, random forest, SVM, LASSO, and Logit.
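The difference version of the algorithm is easy to implement model-agnostically. The following is a minimal Python sketch (not the authors' vip-based R code); the "trained" model and data are synthetic, constructed so that feature 0 matters most, feature 1 weakly, and feature 2 not at all:

```python
import numpy as np

def permutation_importance(model_fn, X, y, loss_fn, rng):
    """Model-agnostic permutation importance: error increase per shuffled column."""
    e_orig = loss_fn(y, model_fn(X))
    importances = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        rng.shuffle(X_perm[:, j])            # break the link between feature j and y
        e_perm = loss_fn(y, model_fn(X_perm))
        importances.append(e_perm - e_orig)  # difference version of FI_j
    return np.array(importances)

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 3))
y = 3 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=1000)

model = lambda X: 3 * X[:, 0] + 0.1 * X[:, 1]        # stand-in for a fitted model
mse = lambda y, yhat: float(np.mean((y - yhat) ** 2))
fi = permutation_importance(model, X, y, mse, rng)
```

By construction, `fi[0] > fi[1] > fi[2]`, and the importance of the unused feature is exactly zero, since shuffling it cannot change the predictions.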

The partial dependence plots

The SHAP dependence plot shows a fitted curve for a scatter plot of SHAP values against actual feature values, illustrating whether the relationship between the two is linear, monotonic, or more complex.

Fig. 7 The partial dependence plots of major variables

The more traditional way to visualize this interrelation is the Partial Dependence Plot (PDP) proposed by Friedman (2001), which illustrates how each variable affects the chance of patent grant. The SHAP dependence plot and the PDP differ in two ways. First, the y-axis of the SHAP dependence plot is the SHAP value (or marginal contribution), while the y-axis of the PDP is the average prediction. Thus, the y-axis of the SHAP dependence plot is centered on zero, while that of the PDP is centered on 0.46 (the average grant ratio). Second, the SHAP dependence plot presents more information, such as the distribution of individual samples. Yet the PDP is a more widely accepted and applied tool for model interpretation, and we apply it as a robustness check on our results.

Given the output function f(x) of a machine learning algorithm, the partial dependence of f on variable \(X_s\) is defined as (letting c denote the complement set of s):

$$\begin{aligned} f^s(x_s) = E_{X_c}[f(x_s, X_c)] = \int f(x_s, x_c)dP(x_c) \end{aligned}$$
(7)

where the PDP \(f^s\) is the expectation of f over the marginal distribution of all variables except \(x_s\). In practice, the PDP is estimated by averaging over the training data with \(x_s\) held fixed:

$$\begin{aligned} {\bar{f}}^s(x_s) = \frac{1}{n} \sum _{i=1}^{n} f(x_s, x_c^{(i)}) \end{aligned}$$
(8)

PDP plots indicate the direct causal effect of a variable s on the outcome variable if the back-door condition is satisfied (Zhao & Hastie, 2021). We admit that some of the variables are likely endogenous or descendants of other variables, in the terminology of the directed acyclic graph (DAG). For instance, the variable duration, the time between application and publication, can be affected by other confounding variables. Though these relationships may not be interpreted causally, many of the strong correlations are nonetheless worth pointing out.

Figure 7 illustrates 12 PDP plots for the major predictors. To keep the plots comparable, we limit the y-axis to the range 0.4 to 0.6 and set the x-axis according to the distribution of each predictor. Note that from Panel (a) to (f), the PDP generates relational patterns between the predictors and the output variable similar to those in Fig. 5. Backward citation, patent length, number of claims, length of the first claim, grant rate of the previous year, and patent family size all show a similar pattern of positive correlation with the outcome variable. For backward citation, patent length, and the number of claims, there exists a threshold effect: beyond 4 for citing, 20 for pages, and 10 for \(right\_no\), the marginal benefits of these variables stay positive and constant. Meanwhile, \(nchar\_right\), \(app\_rate2010\), and nfamily all show a monotonically increasing relationship with the outcome variable. Hence, the PDP confirms the relationships between the predictors and outcome variables suggested and discussed in the main paper.

Patent length, or page count, increases the likelihood of a patent grant. There are notable outliers with one or two pages, and these are patents with an above-average grant rate; they also appear as four blue dots in the upper left corner of Panel (b) of Fig. 5. These are possible data anomalies, but they do not interfere with the model interpretation.

Besides the six major variables, we also explore other less important yet noteworthy predictors. The agent success rate in the previous year, \(agent\_rate2010\), increases the likelihood of patent grant, suggesting that relying on a high-quality patent agent could help the examination outcome. Agent application load, \(agent\_freq\), initially increases but eventually decreases the grant rate. Application frequency in the same year, \(app\_freq\), could lower the outcome variable because too many applications can be distracting. Panel (i) shows that the duration between application and publication has a complex relationship with the outcome variable. Panel (k) shows that the number of inventors has little effect on the likelihood of patent grant, with only a slight increase for larger inventor teams. Panel (l) shows that the length of the abstract, \(nchar\_ab\), increases the grant likelihood, possibly because it indicates technology complexity, similar to patent length.

It is also interesting to note that Panels (m) and (o) show forward citation and the number of IPC classes assigned have little predictive power for whether a patent is granted.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


Cite this article

Yao, L., Ni, H. Prediction of patent grant and interpreting the key determinants: an application of interpretable machine learning approach. Scientometrics 128, 4933–4969 (2023). https://doi.org/10.1007/s11192-023-04736-z

