Abstract
Log anomaly detection is an effective approach for identifying threats. Autoencoder-based detection methods address the imbalance between positive and negative samples and have been widely adopted in practice. However, most existing methods require a sliding window to fit the autoencoder’s base network, which leads to information confusion and reduced resilience. Furthermore, detection results can be of little value when a single log comprises numerous unbalanced log records. In response, we propose TAElog, a novel framework that employs a transformer-based autoencoder to extract precise information from logs without sliding windows. TAElog also incorporates a new loss calculation that computes both high-dimensional metrics and divergence information, improving detection performance in complex situations with diverse and unbalanced log records. Moreover, our framework includes a preprocessing stage that increases compatibility between textual and numerical logs. To verify the effectiveness of TAElog, we evaluate its performance against other methods on both textual and numerical logs. Additionally, we assess various preprocessing and loss computation approaches to determine the optimal configuration of our method. Experimental results demonstrate that TAElog not only achieves higher accuracy but also processes logs faster.
Acknowledgements
We thank the anonymous reviewers for their helpful suggestions. This work is supported by the National Key Research and Development Program of China (Grant No. 2020YFB1806504 and Grant No. 2021YFF0307203) and the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDC02040100). This work is also supported by the Program of the Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, and the Program of the Beijing Key Laboratory of Network Security and Protection Technology.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhao, C. et al. (2024). TAElog: A Novel Transformer AutoEncoder-Based Log Anomaly Detection Method. In: Ge, C., Yung, M. (eds) Information Security and Cryptology. Inscrypt 2023. Lecture Notes in Computer Science, vol 14527. Springer, Singapore. https://doi.org/10.1007/978-981-97-0945-8_3
DOI: https://doi.org/10.1007/978-981-97-0945-8_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0944-1
Online ISBN: 978-981-97-0945-8
eBook Packages: Computer Science, Computer Science (R0)