Abstract
Value reuse improves a processor’s performance by dynamically caching the results of previous instructions and reusing those results to bypass the execution of future instructions that have the same opcode and input operands. However, continually replacing the least recently used entries could eventually fill the value reuse table with instructions that are not frequently executed. Furthermore, the complex hardware that replaces entries and updates the table may necessitate an increase in the clock period. We propose instruction precomputation to address these issues by profiling programs to determine the opcodes and input operands that have the highest frequencies of execution. These instructions are then loaded into the precomputation table before the program executes. During program execution, the precomputation table is used in the same way as the value reuse table, except that the precomputation table does not dynamically replace any entries. For a 2K-entry precomputation table implemented on a 4-way issue machine, this approach produced an average speedup of 11.0%. By comparison, a 2K-entry value reuse table produced an average speedup of 6.7%. With the same number of table entries, instruction precomputation outperforms value reuse, especially for smaller tables, while using less area and having a lower access time.
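The contrast between the two tables can be illustrated with a small software model. The sketch below is not the authors' hardware design or simulator; it is a toy under stated assumptions: instructions are modeled as `(opcode, a, b)` tuples, the value reuse table uses LRU replacement, and profiling runs on the same trace that is later executed. All function names (`execute`, `build_precomputation_table`, `precomputation_hits`, `value_reuse_hits`) are hypothetical.

```python
import operator
from collections import Counter, OrderedDict

# Hypothetical two-operation ALU used only for this illustration.
OPS = {"add": operator.add, "mul": operator.mul}

def execute(opcode, a, b):
    """Compute the result of one modeled instruction."""
    return OPS[opcode](a, b)

def build_precomputation_table(profile_trace, size):
    """Profile a trace and keep the `size` most frequent (opcode, a, b)
    tuples, each stored with its precomputed result."""
    counts = Counter(profile_trace)
    return {key: execute(*key) for key, _ in counts.most_common(size)}

def precomputation_hits(table, trace):
    """Count lookups that hit the static table (entries are never replaced)."""
    return sum(1 for instr in trace if instr in table)

def value_reuse_hits(trace, size):
    """Count hits in a dynamically filled table with LRU replacement."""
    table = OrderedDict()
    hits = 0
    for instr in trace:
        if instr in table:
            hits += 1
            table.move_to_end(instr)       # mark as most recently used
        else:
            if len(table) >= size:
                table.popitem(last=False)  # evict the least recently used entry
            table[instr] = execute(*instr)
    return hits

# One frequently executed instruction interleaved with one-off instructions.
hot = ("add", 1, 2)
trace = []
for i in range(10):
    trace += [hot, ("mul", i, 3), ("add", i, 100)]

table = build_precomputation_table(trace, size=2)
static_hits = precomputation_hits(table, trace)
dynamic_hits = value_reuse_hits(trace, size=2)
```

In this contrived trace the one-off instructions keep evicting the frequently executed one from the small LRU table, while the profiled table retains it for the whole run. The sketch only counts table hits; it says nothing about the speedups, area, or access time reported above.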
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yi, J.J., Sendag, R., Lilja, D.J. (2002). Increasing Instruction-Level Parallelism with Instruction Precomputation. In: Monien, B., Feldmann, R. (eds) Euro-Par 2002 Parallel Processing. Euro-Par 2002. Lecture Notes in Computer Science, vol 2400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45706-2_65
DOI: https://doi.org/10.1007/3-540-45706-2_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44049-9
Online ISBN: 978-3-540-45706-0
eBook Packages: Springer Book Archive