Thread: logic confusion, not sure how i got 12, but its not a choice .:Resolve:.

  1. #1

    Thread Starter
    Hyperactive Member voidflux's Avatar
    Join Date
    Jun 2003
    Location
    Brockway, PA
    Posts
    290

    logic confusion, not sure how i got 12, but its not a choice .:Resolve:.

    Hello everyone, I'm not sure how they got their answer here. It seems simple, but I ended up getting 12 for $t3. Here is the question:
    If we run the code below on a pipelined 5-stage datapath, how many cycles are needed? Assume there is no data forwarding hardware in the pipeline.
    After running, what is the data stored in $t3?
    $t1 = 2
    $t0 = 4 at the beginning.

    Code:
    add $t2, $t1, $t0 
    xor $t3, $t2, $t1 
    sub $t4, $t0, $t1 
    j label 
    add $t3,$t3,$t3 
    ... 
    label: or $t3, $t1, $t0
    A. $t3 = 8 , 11 cycles
    B. $t3 = 4 , 10 cycles
    C. $t3 = 6 , 10 cycles
    D. $t3 = 6 , 11 cycles

    I got 10 cycles and the answer to be $t3 = 12.

    Here is my reasoning.

    first line:
    add $t2, $t1, $t0 #$t2 = 6
    xor $t3, $t2, $t1 # $t3 = 6 xor 2 would be 0110 xor 0010 = 0100 = 4
    sub $t4, $t0, $t1 # $t4 = 2
    j label #jump to label
    add $t3,$t3,$t3
    ....
    label: or $t3, $t1, $t0 # t3 = 2 or 4 = 0010 or 0100 = 0110 = 6

    but now you must add $t3 = $t3 + $t3 = 6 + 6 = 12

    I don't see where i'm misunderstanding...


    Thanks!
    Last edited by voidflux; Dec 21st, 2006 at 02:29 AM.
    C¤ry Sanchez
    Computer Science/Engineering
    @ Penn State
    IBM.zSeries Intern
    Mandriva 2007

  2. #2
    Member
    Join Date
    Jan 2004
    Posts
    37

    Re: logic confusion, not sure how i got 12, but its not a choice, xor and or'ing

    Why do you do the t3+t3? It's skipped in the code... So don't do it on paper either.
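
    To check the register values on paper, here is a minimal Python sketch that steps through the snippet. The `run` function, the tuple encoding of instructions, and the label index are made up for illustration, and the elided "..." between the skipped add and the label is modelled as containing nothing.

```python
def run(program, labels, regs):
    """Execute (op, dst, src1, src2) tuples; ("j", name) jumps to labels[name]."""
    ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "xor": lambda a, b: a ^ b,
        "or": lambda a, b: a | b,
    }
    pc = 0
    while pc < len(program):
        instr = program[pc]
        if instr[0] == "j":
            # Unconditional jump: continue at the label, skipping what lies between.
            pc = labels[instr[1]]
            continue
        op, dst, src1, src2 = instr
        regs[dst] = ops[op](regs[src1], regs[src2])
        pc += 1
    return regs


program = [
    ("add", "t2", "t1", "t0"),
    ("xor", "t3", "t2", "t1"),
    ("sub", "t4", "t0", "t1"),
    ("j", "label"),
    ("add", "t3", "t3", "t3"),  # never reached: the jump goes past it
    ("or", "t3", "t1", "t0"),   # label
]
labels = {"label": 5}  # index of the `or` in the list above (assumption of this sketch)

final = run(program, labels, {"t0": 4, "t1": 2})
print(final["t3"])  # 6
```

    The add after the jump never runs, so $t3 ends up as 2 | 4 = 6 and not 12.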

  3. #3

    Thread Starter
    Hyperactive Member voidflux's Avatar
    Join Date
    Jun 2003
    Location
    Brockway, PA
    Posts
    290

    Re: logic confusion, not sure how i got 12, but its not a choice, xor and or'ing

    Oh, my bad, I was thinking j was a jal. Thank you!

  4. #4
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: logic confusion, not sure how i got 12, but its not a choice, xor and or'ing

    It's there so that the instruction cache doesn't happen to start decoding the actual jump target.

    This is not answerable without knowing a lot more about your particular architecture than I do. Do you take into account jump prediction? (Unconditional jumps to a fixed target are easy to predict.) How many cycles does an instruction need until the result is available? Does the CPU do instruction reordering?
    And most importantly, what does each stage do? The operand fetching must wait for the results to be available.
    Hmm, let's see. 5 stages would be something like this:
    1) Fetch instruction.
    2) Decode instruction.
    3) Fetch operands. -> Must wait for dependencies to write.
    4) Execute.
    5) Write result.

    Is an optimizing assembler going over this? Because if it is, the program comes down to
    $t4 = $t0 - $t1
    $t3 = $t1 | $t0

    Let's assume not. There aren't many optimizing assemblers. Let's also assume that there is no instruction reordering.

    1) add $t2, $t1, $t0

    As we start, the pipeline is empty. This instruction therefore counts for 5 cycles, timesteps 1 through 5.

    2) xor $t3, $t2, $t1

    The instruction gets fetched directly after 1, on timestep 2, and decoded on timestep 3. On timestep 4, it wants to fetch operands, but $t2 is not computed yet. The pipeline stalls until step 6, where the operands can finally be fetched. The instruction finishes on step 8.

    3) sub $t4, $t0, $t1

    This operation is independent of any previously computed values. It finishes as early as possible, given the pipeline stall, namely cycle 9. Note that if this instruction had been before the second, it would have finished on cycle 6 and the second would still have finished on 8. That's why instruction reordering is important.
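
    The timing for the first three instructions can be sketched in Python. This is only a rough model of the five stages listed above (fetch, decode, fetch operands, execute, write), assuming in-order issue, no forwarding, and that an operand can be fetched no earlier than the cycle after its producer writes it. The function name and the (dst, srcs) encoding are made up for this sketch.

```python
def finish_cycles(instrs):
    """Return the write-back cycle of each (dst, srcs) instruction, 1-based."""
    written = {}             # register -> cycle in which its new value is written
    prev_operand_fetch = 0   # operand-fetch cycle of the previous instruction
    result = []
    for index, (dst, srcs) in enumerate(instrs):
        fetch = index + 1                     # ideal fetch cycle, one per cycle
        earliest = fetch + 2                  # fetch, decode, then operand fetch
        behind_prev = prev_operand_fetch + 1  # in-order: cannot overtake
        # Without forwarding, wait until the cycle after each producer has written.
        deps_ready = max((written[s] + 1 for s in srcs if s in written), default=0)
        operand_fetch = max(earliest, behind_prev, deps_ready)
        finish = operand_fetch + 2            # execute, then write
        written[dst] = finish
        prev_operand_fetch = operand_fetch
        result.append(finish)
    return result


original = [
    ("t2", ("t1", "t0")),  # add $t2, $t1, $t0
    ("t3", ("t2", "t1")),  # xor $t3, $t2, $t1
    ("t4", ("t0", "t1")),  # sub $t4, $t0, $t1
]
reordered = [original[0], original[2], original[1]]  # sub moved before xor

original_finish = finish_cycles(original)
reordered_finish = finish_cycles(reordered)
print(original_finish)   # [5, 8, 9]
print(reordered_finish)  # [5, 6, 8]
```

    In the reordered case the sub finishes on cycle 6 and the xor still finishes on cycle 8, so the three instructions are done one cycle earlier.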

    4) j label

    OK, here it gets difficult. The instruction itself is independent, but there are several considerations. For one, this instruction doesn't write anything. It could execute in just 4 cycles.
    On the other hand, it's a jump. Does jump prediction work? If so, it will effectively be a no-op taking two cycles (prediction doesn't kick in until decoding). Otherwise it will flush the pipeline. Let's consider both scenarios.
    Scenario 1: the jump is predicted correctly. This means the instruction itself finishes its fourth part, the actual jump, on cycle 9, the same moment as instruction 3 has written its value back.

    5a) or $t3, $t1, $t0

    The pipeline then stalls for a moment: instruction 4 was fetched on cycle 6, cycle 7 fetched the add, cycle 8 something different while the jump prediction kicked in. Cycle 9 starts fetching this instruction, and because there are no dependencies, it finishes on cycle 13, ending the execution of the snippet.

    5b) or $t3, $t1, $t0

    If the jump prediction didn't work, the instruction isn't fetched until the jump is completed on cycle 9, meaning it is fetched in 10 and finishes on 14.

    One notable thing here is that jump prediction isn't worth much on short pipelines. The P4 pipeline had over 20 stages; now that was a bad hit if it got flushed.

    Another notable thing is that I don't see any of the offered solutions as correct. But then, I don't know the architecture well enough. It may be that it can directly transfer results - that would allow the second instruction to execute 2 cycles earlier, speeding the whole thing up that much and reaching the offered goal of 11 cycles. It could also be that the jump offset is coded directly into the jump, meaning it doesn't need to fetch data and can execute in three cycles. Note, however, that due to the stalling problem, the latter wouldn't actually gain you any speed in the predicted case; it would only make the non-predicted case as fast as the predicted one, removing the need for jump-prediction.
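
    The cycle totals discussed above can be put into a few lines of Python. This is just arithmetic on the numbers in this post, not a real pipeline simulator; the function name and parameters are invented for the sketch. It assumes the jump is fetched on cycle 4 plus the stall, that the `or` is fetched 3 cycles after the jump when prediction works and 4 cycles after it otherwise, and that the `or` then needs 4 more cycles to finish.

```python
def total_cycles(stall_cycles, jump_predicted):
    """Cycle on which the final `or` writes its result, under the assumptions above."""
    jump_fetch = 4 + stall_cycles                 # 4th instruction, delayed by the stall
    or_fetch = jump_fetch + (3 if jump_predicted else 4)
    return or_fetch + 4                           # remaining four stages of the `or`


no_forwarding_predicted = total_cycles(2, True)
no_forwarding_unpredicted = total_cycles(2, False)
forwarding_predicted = total_cycles(0, True)

print(no_forwarding_predicted)    # 13
print(no_forwarding_unpredicted)  # 14
print(forwarding_predicted)       # 11
```

    Removing the two stall cycles is what brings the predicted case from 13 down to the offered 11.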

    As for the final value of $t3, as I said, it boils down to just one line, or $t3, $t1, $t0, which is 6.
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  5. #5

    Thread Starter
    Hyperactive Member voidflux's Avatar
    Join Date
    Jun 2003
    Location
    Brockway, PA
    Posts
    290

    Re: logic confusion, not sure how i got 12, but its not a choice, [RESOLVED]

    You know your stuff. Are you part computer science major and part computer engineering?

    From my understanding, the correct answer is 12 cycles with $t3 = 6, so none of the listed answers is correct.

    Here is my reasoning: if there is no data forwarding, then you must stall. You have a data hazard when you hit
    add $t2, $t1, $t0
    xor $t3, $t2, $t1

    because register $t2 has not yet been written back by the first instruction.
    So you have to do 2 stalls; this will allow the register to be free again.
    Now you can continue and do the xor line; once you're done with that you can do the subtract.
    So the add will take 5 cycles
    xor will take 8
    sub will take 9
    now jump will take 10
    you must do a stall once you encounter the jump on this arch (MIPS32)
    once you do the stall you can go to the jump target, which is the or.
    or will take: 12 cycles.

    So a total of 12 cycles
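
    The same count can be written as a short Python sketch. It assumes a 5-stage pipeline where each instruction completes one cycle after the previous one, plus any stall cycles (bubbles) inserted in front of it: 2 before the xor for the data hazard and 1 before the or for the jump. The function name and the bubble list are made up for illustration.

```python
def completion_cycles(bubbles, stages=5):
    """Completion cycle of each instruction, given the bubbles inserted before it."""
    done = []
    cycle = stages - 1           # the first instruction completes on cycle `stages`
    for bubble in bubbles:
        cycle += 1 + bubble      # one cycle per instruction plus its stall cycles
        done.append(cycle)
    return done


#          add xor sub  j  or
bubbles = [0,  2,  0,  0,  1]
cycles = completion_cycles(bubbles)
print(cycles)      # [5, 8, 9, 10, 12]
print(cycles[-1])  # 12
```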
    Last edited by voidflux; Dec 21st, 2006 at 02:28 AM.
