According to Peter Gerwinski peter@agnes.dida.physik.uni-essen.de:
According to PredatorZeta:
Just a little clarification... :)) No. Surely not on 80x86 machines. Look at this (it is for the Pentium):
1)    push (mem)        2 cycles   NOT pairable

2) a) mov (mem), reg    1 cycle    pairable
      AGI stall (1 cycle + cuts off the pairing system)
   b) push reg          1 cycle    pairable
   TOTAL: 3 cycles
Well, that is 2 (not pairable) versus 2 1/2 (not pairable) cycles... However, with manual re-ordering it could take only 1 cycle... 8-))
Hope this clarifies...
Partially. As far as I understand the above, it's
push (mem)        2 cycles   not pairable
mov (mem), reg    1 cycle    pairable
push reg          1 cycle    pairable
so both versions have the same speed, but the second one is pairable (-; whatever that means ...
Only theoretically... :)) This is because between the two instructions there is an AGI stall (Address Generation Interlock), a stall generated by the Pentium hardware itself. (Why? Because there is a read after write!!) 8-< This blocks ALL pairing and costs 1 cycle. Then:
1 cycle for the MOV, pairable (BUT IT DOESN'T PAIR!!)
1 cycle for the AGI (blocks the whole pipeline!!)
1 cycle for the PUSH, pairable
= 2 cycles (not pairable) + 1 pairable = 2 1/2 cycles
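For anyone who wants to check the arithmetic in the tally above, here is a rough sketch in Python. It only replays the cycle figures quoted in this thread; nothing here is measured. The "a pairable cycle counts as half" rule is my reading of how the 2 1/2 figure was reached, and the names (PUSH_MEM, MOV_THEN_PUSH, raw_cycles, effective_cycles) are made up for the illustration.

```python
from fractions import Fraction

# Each step is (label, cycles, actually_pairs).
# The cycle counts are the ones quoted in this thread, not measurements.
PUSH_MEM = [
    ("push (mem)", 2, False),
]
MOV_THEN_PUSH = [
    ("mov (mem), reg", 1, False),  # nominally pairable, but the stall stops it pairing
    ("AGI stall", 1, False),       # stall cycle claimed in the thread
    ("push reg", 1, True),         # free to pair with a following instruction
]

def raw_cycles(seq):
    """Plain sum of the cycles, ignoring pairing."""
    return sum(cycles for _, cycles, _ in seq)

def effective_cycles(seq):
    """Assumption: a cycle that can be shared with another instruction counts as half."""
    return sum(
        Fraction(cycles, 2) if pairs else Fraction(cycles)
        for _, cycles, pairs in seq
    )

if __name__ == "__main__":
    for name, seq in (("push (mem)", PUSH_MEM), ("mov + push", MOV_THEN_PUSH)):
        print(name, "raw:", raw_cycles(seq), "effective:", effective_cycles(seq))
```

Under these assumptions the raw sum reproduces the "TOTAL: 3 cycles" figure, and the half-cycle rule reproduces the 2 versus 2 1/2 comparison.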
I guess it means that it is executed in parallel with a floating point operation, right?).
Yez, but PUSH (MEM) can do this too...
Cya PrZ