What is software loop unrolling, and why does it help to speed up execution?

What will be an ideal response?


Loops are normally used to process vectors, tables, and arrays iteratively; for example, adding 1 to every
element of an array. Consider the following ARM code.


Loop    LDR   r0,[r1,#4]!     ;Repeat: get element
        ADD   r0,r0,#1        ;Increment element
        STR   r0,[r1]         ;Save element
        SUBS  r2,r2,#1        ;Decrement loop counter (preset in r2)
        BNE   Loop            ;Repeat until all done


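In this case, a trip round the loop involves a read, an increment, a store, and a loop check (a counter
decrement and a branch). Loop unrolling replicates the loop body so that one trip round the loop does the work
of several iterations. In the following, the loop is unrolled to perform two iterations per trip.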


Loop    LDR   r0,[r1,#4]!     ;Repeat: get element 1
        LDR   r10,[r11,#4]!   ;Get element 2 (through the second pointer, r11)
        ADD   r0,r0,#1        ;Increment element 1
        ADD   r10,r10,#1      ;Increment element 2
        STR   r0,[r1]         ;Save element 1
        STR   r10,[r11]       ;Save element 2
        SUBS  r2,r2,#1        ;Decrement loop counter (preset in r2; one count per pair of elements)
        BNE   Loop            ;Repeat until all done
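
The same transformation can be sketched at source level. The C below is only an illustration, not the code the assembly above was derived from: the function names `increment_rolled` and `increment_unrolled2` are hypothetical, and the unrolled version walks adjacent elements with a single index rather than the two pointers used in the assembly. It also handles an odd element count, which the assembly does not address.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled form: one element per trip round the loop. */
static void increment_rolled(int *a, size_t n) {
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        a[i] += 1;
    }
}

/* Unrolled by two: each trip updates two elements that do not depend on
   each other, so the counter update and branch are paid once per pair. */
static void increment_unrolled2(int *a, size_t n) {
    assert(a != NULL || n == 0);
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        a[i]     += 1;   /* iteration 1 of the pair */
        a[i + 1] += 1;   /* iteration 2 of the pair */
    }
    if (i < n) {         /* one element left over when n is odd */
        a[i] += 1;
    }
}
```

Both functions leave the array in the same state; only the number of loop checks differs. An optimizing compiler may well apply this transformation on its own, so hand unrolling in C is mainly useful for explaining the idea.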


Note that, assuming one instruction per cycle, the original loop performs one iteration in 5 cycles and the
unrolled loop performs two iterations in 8 cycles; that is, a single iteration takes 4 cycles, demonstrating a
speedup. Now, if the processor were three-way superscalar, we couldn’t perform the first three operations of the
original loop in parallel because of the data dependency: the add needs the loaded value and the store needs the
result of the add. The best we could do is to put the loop count decrement and branch in parallel with earlier
instructions; that is,

Loop    LDR r0,[r1,#4]!
        ADD r0,r0,#1       SUBS r2,r2,#1
        STR r0,[r1]        BNE Loop

This gives us three cycles per iteration. If we do the same to the unrolled loop, we get

Loop    LDR r0,[r1,#4]!    LDR r10,[r11,#4]!
        ADD r0,r0,#1       ADD r10,r10,#1       SUBS r2,r2,#1
        STR r0,[r1]        STR r10,[r11]        BNE Loop


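Now we have two iterations in three cycles, or 1.5 cycles per iteration.
So loop unrolling permits greater use of superscalar processing resources: instructions from different
iterations do not depend on one another, so running several iterations together fills issue slots that the data
dependency within a single iteration would leave empty, and the loop overhead is shared between iterations.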
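
The cycle counts quoted in this answer can be reproduced with a small model. This is a rough sketch under idealized assumptions that go beyond the answer: every instruction takes one cycle, any mix of instructions can issue together, and the only constraints are the issue width and the three-step load, add, store chain. The function names and constants are hypothetical, and the superscalar figure is a lower bound rather than a guaranteed schedule.

```c
#include <assert.h>

enum {
    BODY_INSTRS     = 3,  /* LDR, ADD, STR per element */
    OVERHEAD_INSTRS = 2,  /* SUBS, BNE per trip round the loop */
    CHAIN_LENGTH    = 3   /* LDR -> ADD -> STR must run in sequence */
};

/* Scalar processor: one instruction per cycle. */
static int scalar_cycles_per_trip(int unroll) {
    assert(unroll >= 1);
    return BODY_INSTRS * unroll + OVERHEAD_INSTRS;
}

/* Idealized superscalar processor: a trip can be no shorter than the
   dependency chain, nor shorter than the instruction count divided by
   the issue width (rounded up). */
static int superscalar_cycles_per_trip(int unroll, int width) {
    assert(unroll >= 1 && width >= 1);
    int instrs = BODY_INSTRS * unroll + OVERHEAD_INSTRS;
    int issue_bound = (instrs + width - 1) / width;
    return issue_bound > CHAIN_LENGTH ? issue_bound : CHAIN_LENGTH;
}

/* Average cost of one element when a trip handles `unroll` elements. */
static double cycles_per_element(int cycles_per_trip, int unroll) {
    assert(unroll >= 1);
    return (double)cycles_per_trip / unroll;
}
```

With an unroll factor of 1 the model gives 5 cycles per trip on the scalar machine and 3 on the three-way machine; with a factor of 2 it gives 8 and 3, that is, 4 and 1.5 cycles per element, matching the figures in the answer.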

Computer Science & Information Technology
