

Large, Efficient Local Memory Arrays?



#1 Jonathan


Posted 08 June 2009 - 01:36 PM

The FPGA I'm using has about 4 Mb of block RAM distributed among 232 blocks. Each block can transfer up to 36 bits per access, and can be read and written simultaneously.

I would like to declare a 1,024-item array of 224-bit structures and pull the data from the blocks as time-efficiently as possible. That is, I don't want to simply declare an array of 1,024 224-bit items, for fear that each instance of the structure would be stored contiguously in a single block, so that indexing it would take several sequential reads. The structure also contains several variables wider than 36 bits, so even if Impulse C broke it up by data type, it would still take 2 cycles to pull those from block memory.

So my question is: how efficiently does Impulse C handle this if I just declare the array directly?

#2 RalphBodenner


Posted 10 June 2009 - 10:36 AM

Hi Jonathan,

The synthesis/mapping tools from the FPGA device vendors will construct RAM modules from the hard resources in the device for a (nearly) arbitrarily wide/deep Impulse C array. Ask for a 224-bit-wide array with 1024 elements and Xilinx ISE, for example, will stitch together block RAMs as needed to give you a 224-bit data bus.
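As a rough illustration of that stitching, the sketch below estimates how many block RAMs a wide array might consume. It is only back-of-the-envelope arithmetic in plain C, not anything the tools expose. The function names are hypothetical, and the block geometry is an assumption: 4 Mb spread over 232 blocks is roughly 18 Kb per block, taken here to be organized as 512 words of 36 bits. The synthesis tools may choose a different organization and arrive at a different count.

```c
#include <assert.h>

/* Ceiling division for positive integers. */
static unsigned ceil_div(unsigned a, unsigned b) {
    assert(b > 0);
    return (a + b - 1) / b;
}

/*
 * Estimate the number of block RAMs needed for an array that is
 * width_bits wide and depth elements deep, when each block holds
 * block_depth words of block_width bits (assumed geometry).
 * Blocks are placed side by side to reach the width and stacked
 * to reach the depth.
 */
unsigned blocks_needed(unsigned width_bits, unsigned depth,
                       unsigned block_width, unsigned block_depth) {
    return ceil_div(width_bits, block_width) * ceil_div(depth, block_depth);
}
```

Under those assumptions, `blocks_needed(224, 1024, 36, 512)` gives 7 blocks across times 2 blocks deep, or 14 of the 232 blocks, with all 224 bits of one element available in a single read.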

This RAM will still be limited by having two ports: one for reading, another for reading or writing. So you can only get two elements at a time from a given Impulse C array. To work around this limitation of the FPGA hardware, you can use multiple C arrays, from each of which you can read two elements per clock cycle. Alternatively, you could construct an array of great width, say (1024/2)*224 bits, and thereby read two huge unsigned integers from it each clock, which you can then shift/mask to marshal into 224-bit structures; most of that work could be done in parallel with the array reads. This often works, but the synthesis tools are not always able to create a datapath wide enough.
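The multiple-array workaround could be sketched roughly as follows, in plain C rather than Impulse C specifically. Everything named here is hypothetical: the `rec224_t` stand-in (seven 32-bit words in place of a true 224-bit type), the choice of four banks, and the helper functions. Whether a given compiler actually schedules the four reads in one cycle is an assumption, not something this code guarantees.

```c
#include <assert.h>
#include <stdint.h>

#define N_ITEMS    1024
#define N_BANKS    4                      /* hypothetical bank count */
#define BANK_DEPTH (N_ITEMS / N_BANKS)

/* Stand-in for one 224-bit record: 7 x 32 bits. */
typedef struct { uint32_t w[7]; } rec224_t;

/* Separate arrays, so each one may be mapped to its own RAM. */
static rec224_t bank0[BANK_DEPTH];
static rec224_t bank1[BANK_DEPTH];
static rec224_t bank2[BANK_DEPTH];
static rec224_t bank3[BANK_DEPTH];

/* Element i lives in bank (i % N_BANKS) at row (i / N_BANKS). */
void rec_write(unsigned i, rec224_t v) {
    assert(i < N_ITEMS);
    unsigned row = i / N_BANKS;
    switch (i % N_BANKS) {
    case 0:  bank0[row] = v; break;
    case 1:  bank1[row] = v; break;
    case 2:  bank2[row] = v; break;
    default: bank3[row] = v; break;
    }
}

/*
 * Fetch four consecutive elements starting at an index that is a
 * multiple of N_BANKS. Each read touches a different array, so no
 * two reads compete for the same RAM port.
 */
void rec_read4(unsigned base, rec224_t out[N_BANKS]) {
    assert(base % N_BANKS == 0 && base + N_BANKS <= N_ITEMS);
    unsigned row = base / N_BANKS;
    out[0] = bank0[row];
    out[1] = bank1[row];
    out[2] = bank2[row];
    out[3] = bank3[row];
}
```

Interleaving by the low bits of the index suits sequential access, where neighbouring elements are wanted together. A different access pattern would call for a different mapping of index to bank.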

Impulse C supports integer types of arbitrary width. See co_math.h for examples of how you can define such types, a co_uint128 for instance. Handling these nonstandard C types in desktop simulation can be tricky, but the types are fully supported in hardware compilation.

Regards,
Ralph

Ralph Bodenner
Impulse Accelerated Technologies, Inc.




