

Pipelined max calculation



#1 Diego

    Member

  • Members
  • PipPip
  • 11 posts

Posted 01 December 2009 - 06:17 AM

Hi,

I'm trying to calculate a maximum in a pipelined loop in this way:

while (co_stream_read(datain, &input, sizeof(co_uint256)) == co_err_none) {
#pragma CO PIPELINE
    // Some operations to calculate m5
    if (m5 > max)
        max = m5;
    // ...
}

The problem is that I get a pipeline rate of 2, because the comparison in the if must wait in case max was updated by the previous iteration.
Is there any trick to get a rate of 1?

Thank you.






#2 etrexel

    Advanced Member

  • Impulse Staff
  • PipPipPip
  • 260 posts

Posted 01 December 2009 - 07:17 PM

Hi,
What version of CoDeveloper are you running? With the code above as-is, I see the pipeline come out with latency=2 and rate=1 (v3.60.a8). Perhaps the code before the comparison is causing the lower rate because of conditional logic, in which case balancing the if-statements (always having an else) or restructuring the code to avoid conditional assignments may help.
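
As a rough illustration of the "always have an else" suggestion, here is a minimal sketch in plain C rather than Impulse C, so it can be tried on a desktop compiler. The names update_max_balanced and running_max are hypothetical, and whether this form changes the rate that CoDeveloper reports is an assumption to check against the pipeline report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Balanced update: both branches assign the result, so the next
   value of max always comes from a single two-way choice. */
static uint32_t update_max_balanced(uint32_t max, uint32_t m5)
{
    uint32_t next;
    if (m5 > max)
        next = m5;
    else
        next = max;
    return next;
}

/* Writes the running maximum of in[0..i] to out[i]. */
static void running_max(const uint32_t *in, uint32_t *out, size_t n)
{
    uint32_t max = 0;
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        max = update_max_balanced(max, in[i]);
        out[i] = max;
    }
}
```

Inside the pipelined loop, the same idea would be the if/else pair (or the equivalent max = (m5 > max) ? m5 : max;) in place of the one-armed if.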

Best Regards,
Ed
Ed Trexel
Impulse Accelerated Technologies, Inc.

#3 Diego

    Member

  • Members
  • PipPip
  • 11 posts

Posted 02 December 2009 - 02:31 AM

Hi, thank you for your quick answer.
I'm using v3.60.a8 and the problem seems to be floating point operations.

My actual code is
CODE
        while ( co_stream_read(input, &nSample, sizeof(co_int64)) == co_err_none ) {
        #pragma CO PIPELINE
            IF_SIM(samplesread++;)
            r = nSample;
            u = nSample >> 32;
            m5 = r * u;
            if (m5 > max)
                max = m5;
            salida = max;
    
            co_stream_write(output, &salida, sizeof(co_int64));
            IF_SIM(sampleswritten++;)
        }


With r, u, m5 and max declared as float, I get:
Block #1 pipeline:
| Latency: 26
| Rate: 2
| Max. Unit Delay: 32
| Effective Rate: 64

However, if declared as co_uint32, the result is
| Latency: 2
| Rate: 1
| Max. Unit Delay: 65
| Effective Rate: 65

Thank you.

#4 etrexel

    Advanced Member

  • Impulse Staff
  • PipPipPip
  • 260 posts

Posted 02 December 2009 - 04:21 PM

Hi,
That makes more sense, because floating-point operations are multi-cycle and 'max' is a recursive variable: each iteration's comparison needs the value of max produced by the previous iteration. Does the math require a floating-point multiply, or could it be done in fixed point? Fixed point would help with both the rate and FPGA resource usage. I also suspect that the ">> 32" may be implemented as a floating-point division, given that latency=26.
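
To make the fixed-point idea concrete, here is a minimal sketch in plain C (not Impulse C). It assumes each 32-bit half of the 64-bit sample is an unsigned Q16.16 value, which is purely an illustrative choice; the names fixed_mul and running_max_fixed are hypothetical. The halves are split with integer masks and shifts before any arithmetic, and the product is kept at full precision as Q32.32, so the multiply and the comparison against max are both integer operations. Whether this reaches rate 1 in CoDeveloper has not been verified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Product of two unsigned Q16.16 values, kept at full precision
   as an unsigned Q32.32 value (cannot overflow 64 bits). */
static uint64_t fixed_mul(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/* For each sample: split into halves, multiply, update the running
   maximum, and write the maximum so far to out[i]. */
static void running_max_fixed(const uint64_t *samples, uint64_t *out, size_t n)
{
    uint64_t max = 0;
    size_t i;
    assert(samples != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        /* Integer split: low half and high half of the sample. */
        uint32_t r = (uint32_t)(samples[i] & 0xFFFFFFFFu);
        uint32_t u = (uint32_t)(samples[i] >> 32);
        uint64_t m5 = fixed_mul(r, u);
        /* Integer comparison on the loop-carried variable. */
        max = (m5 > max) ? m5 : max;
        out[i] = max;
    }
}
```

For example, a sample whose high half is 0x00020000 (2.0) and whose low half is 0x00018000 (1.5) gives the product 3.0, stored as 3 << 32 in Q32.32.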

Best Regards,
Ed
Ed Trexel
Impulse Accelerated Technologies, Inc.




