N. Malathi, E. Swetha, Design a Processor Based on VLIW Architecture for Executing Multi-Scalar/Vector Instructions, ASIO Journal of Engineering & Technological Perspective Research (ASIO-JETPR), 2016, 2(1): 01-08.
dids/doi No.: 01.2016-19818151
dids link: http://dids.info/didslink/05.2016-51384112/
Exploiting parallelism is one of the most important methods for achieving high performance. This paper proposes a new processor architecture for data-parallel applications that combines the VLIW and vector processing paradigms. It uses the VLIW architecture to process multiple independent scalar instructions concurrently on parallel execution units. Data parallelism is expressed by a vector ISA and processed on the same parallel execution units of the VLIW architecture. The proposed processor, called VecLIW, has a register file of 64x32-bit registers in the decode stage for storing scalar/vector data. VecLIW can issue up to four scalar/vector operations in each cycle, processing a set of operands in parallel and producing up to four results. It loads/stores 128-bit scalar/vector data from/to the data cache. Four 32-bit results can be written back into the VecLIW register file. The complete design of the proposed VecLIW processor is implemented in Verilog, targeting the Xilinx Virtex-5 FPGA (XC5VLX110T-3FF1136 device).
Keywords: VLIW architecture; vector processing; data-level parallelism; FPGA/Verilog implementation
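The issue model described in the abstract can be illustrated with a small behavioral sketch in Python. It is not the authors' Verilog design: the operation names, the `(op, rd, rs, rt)` slot layout, the word-addressed toy memory, and the mapping of vector elements onto consecutive registers are all assumptions made for illustration. Only the 64x32-bit register file, the four-wide issue, and the 128-bit load width come from the abstract.

```python
MASK32 = 0xFFFFFFFF
NUM_REGS = 64      # 64 x 32-bit registers, as stated in the abstract
ISSUE_WIDTH = 4    # up to four scalar/vector operations per cycle

# Hypothetical operation set; the abstract does not list the ISA.
OPS = {
    "add": lambda a, b: (a + b) & MASK32,
    "sub": lambda a, b: (a - b) & MASK32,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
}

def issue_bundle(regs, bundle):
    """Execute one bundle of up to four (op, rd, rs, rt) slots in one cycle."""
    if len(bundle) > ISSUE_WIDTH:
        raise ValueError("a bundle holds at most four operations")
    # Read every operand before writing anything, so all slots see the
    # register state from the start of the cycle.
    results = [(rd, OPS[op](regs[rs], regs[rt])) for op, rd, rs, rt in bundle]
    for rd, value in results:
        regs[rd] = value
    return regs

def vector_add(regs, rd, rs, rt):
    """Assumed mapping: a 4-element vector add is one bundle of four
    scalar adds over consecutive registers."""
    slots = [("add", rd + i, rs + i, rt + i) for i in range(ISSUE_WIDTH)]
    return issue_bundle(regs, slots)

def load128(regs, rd, memory, addr):
    """Model a 128-bit load as four consecutive 32-bit words written to
    four consecutive registers (memory is word-addressed here)."""
    for i in range(4):
        regs[rd + i] = memory[addr + i] & MASK32
    return regs

regs = [0] * NUM_REGS
memory = [1, 2, 3, 4, 10, 20, 30, 40]
load128(regs, 0, memory, 0)   # r0..r3 <- first 128 bits
load128(regs, 4, memory, 4)   # r4..r7 <- next 128 bits
vector_add(regs, 8, 0, 4)     # r8..r11 <- r0..r3 + r4..r7
print(regs[8:12])             # [11, 22, 33, 44]
```

The same `issue_bundle` function also accepts four unrelated scalar operations, which mirrors the abstract's point that scalar and vector work share one set of parallel execution units.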