High-performance and low-power VLIW cores for numerical computations. (23rd January 2006)
- Record Type:
- Journal Article
- Title:
- High-performance and low-power VLIW cores for numerical computations. (23rd January 2006)
- Main Title:
- High-performance and low-power VLIW cores for numerical computations
- Authors:
- Pericas, Miquel
Ayguade, Eduard
Zalamea, Javier
Llosa, Josep
Valero, Mateo
- Abstract:
- Issue logic is among the worst scaling structures in a modern microprocessor. Increasing the issue width increments the processor area in an exponential way. Bigger processors will have inherently larger wire delays. In this scenario, technology scaling will yield smaller performance improvements as the wire delays do not decrease. Instead, they start to dominate the clock cycle. In order to offer higher performance the wire problem needs to be tackled.
This paper discusses two methods which attempt to move the wire problem out of the critical path. The first method is the clustering technique, which directly approaches the wire problem by combining several smaller execution cores in the processor backend to perform the computations. Each core has a smaller issue width and a much smaller area. The second technique we study is the widening technique. This technique consists in reducing the issue width of the processor, but giving the instructions SIMD capabilities. The parallelism here is small (normally two to four) and does not resemble multimedia or vector extensions. Wide processors use wide functional units that compute the same operation on multiple words. The rationale behind this idea is that by reducing the issue width (but not the computational bandwidth), we are also reducing the issue logic circuitry and the complexity of structures such as the register file and the cache memory. When compared with a centralised core with 128 registers, 8 FPUs and 4 memory ports, our approach, using an equivalent amount of hardware units, is able to achieve speedups up to 1.7. …
- Is Part Of:
- International journal of high performance computing and networking. Volume 1:Number 4(2004)
- Journal:
- International journal of high performance computing and networking
- Issue:
- Volume 1:Number 4(2004)
- Issue Display:
- Volume 1, Issue 4 (2004)
- Year:
- 2004
- Volume:
- 1
- Issue:
- 4
- Issue Sort Value:
- 2004-0001-0004-0000
- Page Start:
- 171
- Page End:
- 179
- Publication Date:
- 2006-01-23
- Subjects:
- ILP -- VLIW cores -- clustering -- FPU widening -- floating point units -- modulo scheduling -- energy-delay -- numerical computations -- high performance computing -- issue logic -- issue width reduction
High performance computing -- Periodicals
Computer networks -- Periodicals
High performance computing
Periodicals
004.05
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijhpcn
http://www.metapress.com/openurl.asp?genre=journal&issn=1740-0562
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1740-0562
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 8690.xml