Storing array in FPGA - memory

I am trying to implement a simple multiplier. I have a text file, from in which there are two columns. I am multiplying column 1 to column 2. Here is code in Verilog:
module File_read(
input clk
);
reg [21:0] captured_data[0:10];
reg [21:0] a[0:8];
reg [21:0] b[0:8];
reg [43:0] product[0:5];
`define NULL 0
integer n=0;
integer i=0;
initial
$readmemh("abc.txt",captured_data);
always #(posedge clk) begin
product[i]<=captured_data[n]*captured_data[n+1];
n<=n+2;
i<=i+1;
end
endmodule
I have Xilinx Spartan®-6 LX45 FPGA board. And it offers 128M bit ddr2 ram and 16Mbyte x4 SPI Flash for configuration & data storage.
Now I want to store my file into FPGA board into memory. So how can I do this? Do I have to use IP core to access memory or by any other source?
P.S: This is my first time, I am storing anything on FPGA.
Regards!
Awais

First of all don't use DDR or Flash memory, unless you really need them. Your FPGA has plenty of BlockRAMs to store several thousand arguments for your multiplier.
One easy way is to instantiate 2 BlockRAMs and load them at compile time with data from a file. Xilinx offers tools like data2mem to achieve this.
Alternatively, you can use Ethernet or a UART connection to send the test data to your design.
Edit 1 - How to instantiate BlockRAM
Solution 1: A generic VHDL description.
type T_RAM is array(LINES - 1 downto 0) of std_logic_vector(BITS-1 downto 0);
signal ram : T_RAM;
begin
process (Clock)
begin
if rising_edge(Clock) then
if (WriteEnable = '1') then
ram(to_integer(WriteAddress)) <= d;
end if;
q <= ram(to_integer(ReadAddress));
end if;
end process;
Solution 2: The IPCore generator has a wizard to create BlockRAMs and assign external files.
Solution 3: Manually instantiate a BlockRAM macro. Each FPGA family comes with a HDL library guide of supported macros. For example the Virtex-5 has a RAMB36 macro on page 311.
The usage of BlockRAMs with data2MEM and *.bmm (BlockRAM memory map) files is described here.

Related

using circular memory to make the 1 second sound delay in verilog

module delay(
input [11:0] data_in,
input delay_clk, //here i will use a 20kHz clk
output reg [11:0] data_out
);
reg[11:0]memory[0:20000];
reg[15:0]write_index;//i
reg[15:0]read_index;//j
initial begin
write_index = 16'b0000000000000000;
read_index = 16'b0100111000100000;
end
always#(posedge delay_clk) begin
read_index = read_index+1;
memory [write_index] <= data_in;
data_out <= memory[read_index];
end
endmodule
I want to make a 1 second delay by using the circular memory.
I generate bitstream and program it to FGPA but there is no sound come out.
So how can i improve this verilog codes?
You don't increment write_index anywhere (like you do with read_index), it will cause this hardware to overwrite 1 memory cell over and over again. I don't think this is what you wanted, it beats the idea of using a memory array.
If you intend to store read_index in some kind of a memory element use <= assignment like the ones for memory and data_out.
Your write_index and read_index are a 16 bit variables. You use them to iterate over a 20k cells memory. They will overflow at some point. Is this intentional? I mean why 16bits and is it ok for those indexes to point at e.g. address 41k?

Can we synthesis a simple generic memory?

I'm trying to synthesis with simple generic memory model within design compiler.
but I do find that some error messages as the below,
and I used the simple generic memory model as the below
module RAM_generic
(clk,
enb,
wr_din,
wr_addr,
wr_en,
rd_addr,
rd_dout);
parameter AddrWidth = 1;
parameter DataWidth = 1;
input clk;
input enb;
input signed [DataWidth - 1:0] wr_din;
input [AddrWidth - 1:0] wr_addr;
input wr_en;
input [AddrWidth - 1:0] rd_addr;
output signed [DataWidth - 1:0] rd_dout;
reg [DataWidth - 1:0] ram [2**AddrWidth - 1:0];
reg [DataWidth - 1:0] data_int;
always #(posedge clk)
begin
if (enb == 1'b1) begin
if (wr_en == 1'b1) begin
ram[wr_addr] <= wr_din;
end
data_int <= ram[rd_addr];
end
end
assign rd_dout = data_int;
endmodule
I want to know Can't we synthesis a simple generic memory? If yes, What am I supposed to do to synthesis the generic memory synthesis error?
Yes you can.
In FPGA's a single or dual ported memory will be mapped on the internal memory structures. (At least if you use the right syntax! Look for the FPGA application notes how to do that)
In an ASIC it will be made from registers. I needed a small triple ported memory (Two read and a write port all simultaneous) a few years back and it came out fine. Most FIFO's have a memory in them and 90% of them are made from registers.
Your code is missing 'endmodule'. I don't spot any other obvious errors.
Some tips:
Using ((1 << AddrWidth) -1) will also work in old fashion Verilog.
I would not use a default width/depth of 1 for a memory. You get [0:0] constructs which work, but why should you if e.g. 8x8 is more likely to be used.
A generic memory should not be signed. By convention a generic memory is unsigned.
Parameters are by convention all uppercase. (At least in every firm I worked it was)

Re-configurable Memory Instance in verilog with DATA-IN and DATA-OUT are passed as parameter

How can I make a memory module in which DATA bus width are passed as parameter to each instances and my design re-configure itself according to the parameter? For example, assuming I have byte addressable memory and DATA-IN bus width is 32 bit (4 bytes written in each cycle) and DATA-OUT is 16 bits (2 bytes read each cycle). For other instance DATA-IN is 64 bits and DATA-OUT is 16 bits. For all such instances my design should work.
What I have tried is to generate write pointer values according to design parameters, e.g. DATA-IN 32 bit, write pointer will increment 4 every cycle while writing. For 64 bit -increment will be by 8 and so on.
Problem is: how to make 4 or 8 or 16 bytes to be written in single cycle according to parameters passed to instance?
//Something as following I want to implement. This memory instance can be considered as internal memory of FIFO having different datawidth for reading and writing in case you think of an application of such memory
module mem#(parameter DIN=16, parameter DOUT=8, parameter ADDR=4,parameter BYTE=8)
(
input [DIN-1:0] din,
output [DOUT-1:0] dout,
input wen,ren,clk
);
localparam DEPTH = (1<<ADDR);
reg [BYTE-1:0] mem [0:DEPTH-1];
reg wpointer=5'b00000;
reg rpointer=5'b00000;
reg [BYTE-1:0] tmp [0:DIN/BYTE-1];
function [ADDR:0] ptr;
input [4:0] index;
integer i;
begin
for(i=0;i<DIN/BYTE;i=i+1) begin
mem[index] = din[(BYTE*(i+1)-1):BYTE*(i)]; // something like this I want to implement, I know this line is not allowed in verilog, but is there any alternative to this?
index=index+1;
end
ptr=index;
end
endfunction
always #(posedge clk) begin
if(wen==1)
wpointer <= wptr(wpointer);
end
always #(posedge clk) begin
if(ren==1)
rpointer <= ptr(rpointer);
end
endmodule
din[(BYTE*(i+1)-1):BYTE*(i)] will not compile in Verilog because the MSB and LSB select bits are both variables. Verilog requires a known range. +: is for part-select (also known as a slice) allows a variable select index and a constant range value. It was introduced in IEEE Std 1364-2001 § 4.2.1. You can also read more about it in IEEE Std 1800-2012 § 11.5.1, or refer to previously asked questions: What is `+:` and `-:`? and Indexing vectors and arrays with +:.
din[BYTE*i +: BYTE] should work for you, alternatively you can use din[BYTE*(i+1)-1 -: BYTE].
Also, you should use non-blocking assignments (<=) to mem. In your code read and write can happen at the same time. With blocking there is a race condition between when accessing the same byte. It may synthesize, but your RTL and gate simulation may generated different results. I also strongly advice agent using functions for assigning memory. Functions in synthesizable code without nasty surprises need to self contained without references on anything outside of the function and any internal variables are always reset to a static constant at the start of the function.
With the guidelines mentioned above, I'd recommend recoding to something like the below. This is a template to start with, not a free lunch. I left out the out-of-range index compensation for you to figure out on your own.
...
localparam DEPTH = (1<<ADDR);
reg [BYTE-1:0] mem [0:DEPTH-1];
reg [ADDR-1:0] wpointer, rpointer;
integer i;
initial begin // init values for pointers (FPGA, not ASIC)
wpointer = {ADDR{1'b0}};
rpointer = {ADDR{1'b0}};
end
always #(posedge clk) begin
if (ren==1) begin
for(i=0; i < DOUT/BYTE; i=i+1) begin
dout[BYTE*i +: BYTE] <= mem[rpointer+i];
end
rpointer <= rpointer + (DOUT/BYTE);
end
if (wen==1) begin
for(i=0; i < DIN/BYTE; i=i+1) begin
mem[wpointer+i] <= din[BYTE*i +: BYTE];
end
wpointer <= wpointer + (DIN/BYTE);
end
end

Using Non-zero indexed Memory in Quartus (Verilog)

I am writing a memory system for a basic 16-bit educational CPU and am running into issues with Quartus Synthesis of my module. Specifically, I have broken down the address space into a few different parts and one of them (which is a ROM) is not synthesizing properly. (NOTE: I am synthesizing for a DE2-115 Altera board, QuartusII 12.1, SystermVerilog code)
So, in order to make the memory-mapped VRAM (a memory module that is dual-ported to allow VGA output while the CPU writes colors) more usable, I included a small ROM in the address space that includes code (assembled functions) that allow you to write characters into memory, ie a print_char function. This ROM is located in memory at a specific address, so in order to simplify the SV, I implemented the ROM (and all the memory modules) like so:
module printROM
(input rd_cond_code_t re_L,
input wr_cond_code_t we_L,
input logic [15:0] memAddr,
input logic clock,
input logic reset_L,
inout wire [15:0] memBus,
output mem_status_t memStatus);
reg [15:0] rom[`VROM_ADDR_LO:`VROM_ADDR_HI];
reg [15:0] dataOut;
/* The ROM */
always #(posedge clock) begin
if (re_L == MEM_RD) begin
dataOut <= rom[memAddr];
end
end
/* Load the rom data */
initial $readmemh("printROM.hex", rom);
/* Drive the line */
tridrive #(.WIDTH(16)) romOut(.data(dataOut),
.bus(memBus),
.en_L(re_L));
/* Manage asserting completion of a read or a write */
always_ff #(posedge clock, negedge reset_L) begin
if (~reset_L) begin
memStatus <= MEM_NOT_DONE;
end
else begin
if ((we_L == MEM_WR) | (re_L == MEM_RD)) begin
memStatus <= MEM_DONE;
end
else begin
memStatus <= MEM_NOT_DONE;
end
end
end
endmodule // printROM
Where VROM_ADDR_LO and VROM_ADDR_HI are two macros defining the addresses of this ROM, namely 'hEB00 and 'hEFFF. Thus, when an address in that range is read/written to by the CPU, this module is able to properly index into the rom memory. This technique works fine in VCS simulation.
However, when I go to synthesize this module, Quartus correctly implies a ROM but has issues initializing it. I get this error:
Error (113012): Address at line 11 exceeds the specified depth (1280) in the Memory Initialization File "psx18240.ram0_printROM_38938673.hdl.mif"
It looks like Quartus is converting the .hex file I give as the ROM code (ie printROM.hex) and using the CPU visible addresses (ie starting at 'hEB00) for the generated .mif file even though the size of rom is obviously too small. Does Quartus not support this syntax or am I doing something wrong?
It would appear that Quartus does not like what you're doing. Probably better would be if you only want a rom with 1280 entries, then you should just create one from 0:1279, and address it using that range (this is what your logic would have to synthesize into anyway).
reg [15:0] rom[0:`VROM_ADDR_HI-`VROM_ADDR_LO];
assign romAddr = (memAddr - `VROM_ADDR_LO);
always #(posedge clock) begin
if (re_L == MEM_RD) begin
dataOut <= rom[romAddr];
end
end
Go for ROM Megafunctions in mega wizard plugin manager in quartus. Its is a GUI based ipcore generater. There you can paramterize all the options even including the hex file. But you have to follow Intel hex file format. This also available in file->new.
For hdl based instantiation
http://quartushelp.altera.com/12.1/mergedProjects/hdl/mega/mega_file_lpm_rom.htm
User guide.
https://www.google.co.in/url?sa=t&source=web&rct=j&ei=nVAGVKvSDcu9ugTMooCQDQ&url=http://www.altera.com/literature/ug/ug_ram_rom.pdf&cd=1&ved=0CBsQFjAA&usg=AFQjCNE2ZXM1gIsMZ5BvkKTHanX1E7vamg

Why isn't this VHDL inferring BRAM in XST?

I have an array of vectors that I want to be stored in Block RAM on a Virtex-5 using ISE 13.4. It is 32Kb which should fit in 1 BRAM but it is all being stored in logic. My system uses an AMBA APB bus so I check for a select line and an enable line. Please help me understand why this code isn't inferring a BRAM. Note: this is a dummy example which is simpler to understand and should help me with my other code.
architecture Behavioral of top is
type memory_array is array (63 downto 0) of std_logic_vector(31 downto 0);
signal memory : memory_array;
attribute ram_style: string;
attribute ram_style of memory : signal is "block";
begin
process(Clk)
begin
if(rising_edge(Clk)) then
if(Sel and Wr_en and Enable) = '1' then
memory(to_integer(Paddr(5 downto 0))) <= Data_in;
elsif(Sel and not Wr_en and Enable) = '1' then
Data_out <= memory(to_integer(Paddr(5 downto 0)));
end if;
end if;
end process;
end Behavioral;
I declare the ram_style of the array as block but the XST report says: WARNING:Xst:3211 - Cannot use block RAM resources for signal <Mram_memory>. Please check that the RAM contents is read synchronously.
It appears that the problem lies in a read_enable condition, but the Virtex 5 User Guide makes it sound like there is an enable and a write_enable on the BRAM hard blocks. I could drive the output all the time, but I don't want to and that would waste power. Any other ideas?
Your logic may not match how your device's BRAM works (there are various limitations depending on the device). Usually, the data_out is updated on every clock cycle the RAM is enabled for, not just "when not writing" - try this:
process(Clk)
begin
if(rising_edge(Clk)) then
if(Sel and Enable) = '1' then
Data_out <= memory(to_integer(Paddr(5 downto 0)));
if wr_en = '1' then
memory(to_integer(Paddr(5 downto 0))) <= Data_in;
end if;
end if;
end if;
end process;
I moved the Data_out assignment "upwards" to make it clear that it gets the "old" value - that's the default behaviour of the BRAM, although other styles can also be set up.
Alternatively, the tools may be being confused by the sel and enable and write all in a single if statement - this is because they are mainly "template matching" rather than "function matching" when inferring BRAM. You may find that simply splitting out an "enable if" and a "write if" (as I did above) whilst keeping the rest of the functionality the same is sufficient to make the synthesiser do what is required.
If you are using Xilinx's XST then you can read all about inferring RAMs in the docs (page 204 onwards of my XST user guide - the chapter is called "RAM HDL Coding techniques")
Use the appropriate macro for the BRAM block on your device? I found that to work much better than relying on the synthesis tool not beeing stupid.
I tried many different combinations and here is the only one I got to work:
en_BRAM <= Sel and Enable;
process(Clk)
begin
if(rising_edge(Clk)) then
if(en_BRAM = '1')then
if(Wr_en = '1') then
icap_memory(to_integer(Paddr(5 downto 0))) <= Data_in;
else
Data_out <= icap_memory(to_integer(Paddr(5 downto 0)));
end if;
end if;
end if;
end process;
So I think the enable needs to be on the whole RAM and it can only be 1 signal. Then the write enable can also only be 1 signal and the read has to be only an else statement (not if/elsif). This instantiates a BRAM according to XST in ISE 13.3 on Windows 7 64-bit.

Resources