Can we synthesize a simple generic memory? - memory

I'm trying to synthesize a simple generic memory model with Design Compiler,
but I get some error messages.
This is the simple generic memory model I used:
module RAM_generic
(clk,
enb,
wr_din,
wr_addr,
wr_en,
rd_addr,
rd_dout);
parameter AddrWidth = 1;
parameter DataWidth = 1;
input clk;
input enb;
input signed [DataWidth - 1:0] wr_din;
input [AddrWidth - 1:0] wr_addr;
input wr_en;
input [AddrWidth - 1:0] rd_addr;
output signed [DataWidth - 1:0] rd_dout;
reg [DataWidth - 1:0] ram [2**AddrWidth - 1:0];
reg [DataWidth - 1:0] data_int;
always @(posedge clk)
begin
if (enb == 1'b1) begin
if (wr_en == 1'b1) begin
ram[wr_addr] <= wr_din;
end
data_int <= ram[rd_addr];
end
end
assign rd_dout = data_int;
endmodule
I want to know: can't we synthesize a simple generic memory? If we can, what am I supposed to do to resolve the synthesis error?

Yes you can.
In FPGAs, a single- or dual-ported memory will be mapped onto the internal memory structures. (At least if you use the right syntax! Look in the FPGA application notes for how to do that.)
In an ASIC it will be made from registers. I needed a small triple-ported memory (two read ports and one write port, all simultaneous) a few years back and it came out fine. Most FIFOs have a memory in them, and 90% of them are made from registers.
Your code is missing 'endmodule'. I don't spot any other obvious errors.
Some tips:
Using ((1 << AddrWidth) - 1) will also work in old-fashioned Verilog (Verilog-1995, which has no ** operator).
I would not use a default width/depth of 1 for a memory. You get [0:0] constructs, which work, but why should you when e.g. 8x8 is more likely to be used?
A generic memory should not be signed. By convention a generic memory is unsigned.
Parameters are by convention all uppercase. (At least they were at every firm I worked at.)

Related

Fastest way of storing non-adjacent d registers with NEON intrinsics

I am porting 32bit NEON asm code to NEON intrinsics, and I am wondering if this code can be written in a concise way using intrinsics:
vst4.32 {d0[0], d2[0], d4[0], d6[0]}, [%[v1]]!
1) The previous code operates on q registers, but when it comes to storage, instead of using q0, q1, q2 and q3, it has to recreate vectors which have each part in one of the d registers, e.g. v1[0] = d0[0], v1[1] = d2[0] ... v2[0] = d0[1], v2[1] = d2[1] ... v3[0] = d1[0], v3[1] = d3[0] ... etc.
This operation is a one-liner in asm, but with intrinsics I don't know if I can do that without first splitting high and low bits and building a new float32x4x4_t variable to feed to vst4_f32.
Is that possible?
2) I'm not entirely sure of what [%[v1]]! does (yes, I googled quite a bit): it should be a reference to a variable named v1 and the exclamation mark will do writeback, which should mean the pointer is increased by the same amount that was written by the instruction on the same line.
Correct? Any way of replicating that with intrinsics?
After some more investigation I found this specific instruction to store a specific lane of an array of 4 vectors, so no need to split into high and low bits variables:
float32x4x4_t u = { q0, q1, q2, q3 };
vst4q_lane_f32(v1, u, 0);
v1 += 4;
Writeback is just an increased pointer, as @charlesbaylis wrote.
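To make the lane store and the writeback concrete, here is a portable scalar sketch of what the snippet above does. It is only a model and does not use NEON: `vec4x4_model` and `store_lane_model` are hypothetical names standing in for `float32x4x4_t` and `vst4q_lane_f32`, and the returned pointer plays the role of the `!` post-increment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for float32x4x4_t: four vectors of four lanes.
 * val[r][lane] models lane `lane` of vector r (q0..q3 in the question). */
typedef struct {
    float val[4][4];
} vec4x4_model;

/* Scalar model of vst4q_lane_f32(dst, u, lane) followed by dst += 4:
 * writes the selected lane of each of the four vectors to four
 * consecutive floats and returns the advanced pointer. */
static float *store_lane_model(float *dst, const vec4x4_model *u, int lane)
{
    assert(lane >= 0 && lane < 4);
    for (size_t r = 0; r < 4; ++r) {
        dst[r] = u->val[r][lane];
    }
    return dst + 4; /* models the "!" writeback */
}
```

Calling it once per lane, feeding the returned pointer back in, fills the output four floats at a time, just as repeated `vst4.32 {...}, [ptr]!` instructions would.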
In principle, a sufficiently smart compiler could use the instruction you want for the vst4_f32 intrinsic, but in practice, no compiler is that good.
To get the post-index writeback, you can write
vst4_f32(ptr, v);
ptr += 4;
Some compilers will recognise this. GCC 5.1 (when released) will do this in at least some cases.
[Edit: misread the question, vst4q_lane_f32 does map to the required instruction perfectly]
It seems to be inline assembly.
Anyway, the answers are:
1) No
2) Yes

Storing array in FPGA

I am trying to implement a simple multiplier. I have a text file with two columns, and I am multiplying column 1 by column 2. Here is the code in Verilog:
module File_read(
input clk
);
reg [21:0] captured_data[0:10];
reg [21:0] a[0:8];
reg [21:0] b[0:8];
reg [43:0] product[0:5];
`define NULL 0
integer n=0;
integer i=0;
initial
$readmemh("abc.txt",captured_data);
always @(posedge clk) begin
product[i]<=captured_data[n]*captured_data[n+1];
n<=n+2;
i<=i+1;
end
endmodule
I have a Xilinx Spartan®-6 LX45 FPGA board. It offers 128 Mbit of DDR2 RAM and a 16 Mbyte x4 SPI Flash for configuration and data storage.
Now I want to store my file in memory on the FPGA board. How can I do this? Do I have to use an IP core to access the memory, or is there another way?
P.S.: This is my first time storing anything on an FPGA.
Regards!
Awais
First of all, don't use DDR or Flash memory unless you really need them. Your FPGA has plenty of BlockRAMs to store several thousand arguments for your multiplier.
One easy way is to instantiate 2 BlockRAMs and load them at compile time with data from a file. Xilinx offers tools like data2mem to achieve this.
Alternatively, you can use Ethernet or a UART connection to send the test data to your design.
Edit 1 - How to instantiate BlockRAM
Solution 1: A generic VHDL description.
type T_RAM is array(LINES - 1 downto 0) of std_logic_vector(BITS-1 downto 0);
signal ram : T_RAM;
begin
process (Clock)
begin
if rising_edge(Clock) then
if (WriteEnable = '1') then
ram(to_integer(WriteAddress)) <= d;
end if;
q <= ram(to_integer(ReadAddress));
end if;
end process;
Solution 2: The IPCore generator has a wizard to create BlockRAMs and assign external files.
Solution 3: Manually instantiate a BlockRAM macro. Each FPGA family comes with a HDL library guide of supported macros. For example the Virtex-5 has a RAMB36 macro on page 311.
The usage of BlockRAMs with data2MEM and *.bmm (BlockRAM memory map) files is described here.

Re-configurable Memory Instance in verilog with DATA-IN and DATA-OUT are passed as parameter

How can I make a memory module in which the DATA bus widths are passed as parameters to each instance, so that my design reconfigures itself according to the parameters? For example, assume I have a byte-addressable memory where the DATA-IN bus width is 32 bits (4 bytes written each cycle) and DATA-OUT is 16 bits (2 bytes read each cycle). For another instance DATA-IN is 64 bits and DATA-OUT is 16 bits. My design should work for all such instances.
What I have tried is to generate write pointer values according to the design parameters, e.g. for a 32-bit DATA-IN the write pointer increments by 4 every cycle while writing; for 64 bits the increment is 8, and so on.
The problem is: how do I get 4, 8 or 16 bytes written in a single cycle according to the parameters passed to the instance?
//Something as following I want to implement. This memory instance can be considered as internal memory of FIFO having different datawidth for reading and writing in case you think of an application of such memory
module mem#(parameter DIN=16, parameter DOUT=8, parameter ADDR=4,parameter BYTE=8)
(
input [DIN-1:0] din,
output [DOUT-1:0] dout,
input wen,ren,clk
);
localparam DEPTH = (1<<ADDR);
reg [BYTE-1:0] mem [0:DEPTH-1];
reg wpointer=5'b00000;
reg rpointer=5'b00000;
reg [BYTE-1:0] tmp [0:DIN/BYTE-1];
function [ADDR:0] ptr;
input [4:0] index;
integer i;
begin
for(i=0;i<DIN/BYTE;i=i+1) begin
mem[index] = din[(BYTE*(i+1)-1):BYTE*(i)]; // something like this I want to implement, I know this line is not allowed in verilog, but is there any alternative to this?
index=index+1;
end
ptr=index;
end
endfunction
always @(posedge clk) begin
if(wen==1)
wpointer <= wptr(wpointer);
end
always @(posedge clk) begin
if(ren==1)
rpointer <= ptr(rpointer);
end
endmodule
din[(BYTE*(i+1)-1):BYTE*(i)] will not compile in Verilog because the MSB and LSB select bits are both variables. Verilog requires a known range. The +: indexed part-select (also known as a slice) allows a variable select index with a constant width. It was introduced in IEEE Std 1364-2001 § 4.2.1. You can also read more about it in IEEE Std 1800-2012 § 11.5.1, or refer to previously asked questions: What is `+:` and `-:`? and Indexing vectors and arrays with +:.
din[BYTE*i +: BYTE] should work for you, alternatively you can use din[BYTE*(i+1)-1 -: BYTE].
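For readers who think more easily in software terms, the two indexed part-selects can be modelled as a shift and a mask. The following C sketch is only an illustration of the bit arithmetic on a value of up to 64 bits; `part_select_up` and `part_select_down` are hypothetical helper names, not anything from a Verilog tool.

```c
#include <assert.h>
#include <stdint.h>

/* Models the Verilog indexed part-select  din[lsb +: width]:
 * `width` bits starting at bit `lsb` and counting upwards. */
static uint64_t part_select_up(uint64_t din, unsigned lsb, unsigned width)
{
    assert(width >= 1 && width <= 64 && lsb <= 64 - width);
    uint64_t mask = (width == 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << width) - 1);
    return (din >> lsb) & mask;
}

/* Models  din[msb -: width]:
 * `width` bits ending at bit `msb` and counting downwards. */
static uint64_t part_select_down(uint64_t din, unsigned msb, unsigned width)
{
    assert(width >= 1 && msb < 64 && width <= msb + 1);
    return part_select_up(din, msb + 1 - width, width);
}
```

With BYTE = 8, `part_select_up(din, 8*i, 8)` and `part_select_down(din, 8*(i+1)-1, 8)` pick out the same byte i, which is why the two Verilog forms above are interchangeable.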
Also, you should use non-blocking assignments (<=) to mem. In your code a read and a write can happen at the same time. With blocking assignments there is a race condition when both access the same byte. It may synthesize, but your RTL and gate simulations may generate different results. I also strongly advise against using functions for assigning memory. Functions in synthesizable code, to avoid nasty surprises, need to be self-contained, with no references to anything outside the function, and any internal variables should always be reset to a static constant at the start of the function.
With the guidelines mentioned above, I'd recommend recoding to something like the below. This is a template to start with, not a free lunch. I left out the out-of-range index compensation for you to figure out on your own.
...
localparam DEPTH = (1<<ADDR);
reg [BYTE-1:0] mem [0:DEPTH-1];
reg [ADDR-1:0] wpointer, rpointer;
integer i;
initial begin // init values for pointers (FPGA, not ASIC)
wpointer = {ADDR{1'b0}};
rpointer = {ADDR{1'b0}};
end
always @(posedge clk) begin
if (ren==1) begin
for(i=0; i < DOUT/BYTE; i=i+1) begin
dout[BYTE*i +: BYTE] <= mem[rpointer+i];
end
rpointer <= rpointer + (DOUT/BYTE);
end
if (wen==1) begin
for(i=0; i < DIN/BYTE; i=i+1) begin
mem[wpointer+i] <= din[BYTE*i +: BYTE];
end
wpointer <= wpointer + (DIN/BYTE);
end
end

Using Non-zero indexed Memory in Quartus (Verilog)

I am writing a memory system for a basic 16-bit educational CPU and am running into issues with Quartus synthesis of my module. Specifically, I have broken down the address space into a few different parts and one of them (which is a ROM) is not synthesizing properly. (NOTE: I am synthesizing for a DE2-115 Altera board, Quartus II 12.1, SystemVerilog code)
So, in order to make the memory-mapped VRAM (a memory module that is dual-ported to allow VGA output while the CPU writes colors) more usable, I included a small ROM in the address space that includes code (assembled functions) that allow you to write characters into memory, ie a print_char function. This ROM is located in memory at a specific address, so in order to simplify the SV, I implemented the ROM (and all the memory modules) like so:
module printROM
(input rd_cond_code_t re_L,
input wr_cond_code_t we_L,
input logic [15:0] memAddr,
input logic clock,
input logic reset_L,
inout wire [15:0] memBus,
output mem_status_t memStatus);
reg [15:0] rom[`VROM_ADDR_LO:`VROM_ADDR_HI];
reg [15:0] dataOut;
/* The ROM */
always @(posedge clock) begin
if (re_L == MEM_RD) begin
dataOut <= rom[memAddr];
end
end
/* Load the rom data */
initial $readmemh("printROM.hex", rom);
/* Drive the line */
tridrive #(.WIDTH(16)) romOut(.data(dataOut),
.bus(memBus),
.en_L(re_L));
/* Manage asserting completion of a read or a write */
always_ff @(posedge clock, negedge reset_L) begin
if (~reset_L) begin
memStatus <= MEM_NOT_DONE;
end
else begin
if ((we_L == MEM_WR) | (re_L == MEM_RD)) begin
memStatus <= MEM_DONE;
end
else begin
memStatus <= MEM_NOT_DONE;
end
end
end
endmodule // printROM
Where VROM_ADDR_LO and VROM_ADDR_HI are two macros defining the addresses of this ROM, namely 'hEB00 and 'hEFFF. Thus, when an address in that range is read/written to by the CPU, this module is able to properly index into the rom memory. This technique works fine in VCS simulation.
However, when I go to synthesize this module, Quartus correctly implies a ROM but has issues initializing it. I get this error:
Error (113012): Address at line 11 exceeds the specified depth (1280) in the Memory Initialization File "psx18240.ram0_printROM_38938673.hdl.mif"
It looks like Quartus is converting the .hex file I give as the ROM code (ie printROM.hex) and using the CPU visible addresses (ie starting at 'hEB00) for the generated .mif file even though the size of rom is obviously too small. Does Quartus not support this syntax or am I doing something wrong?
It would appear that Quartus does not like what you're doing. If you only want a ROM with 1280 entries, it would probably be better to create one indexed 0:1279 and address it using that range (this is what your logic would have to synthesize into anyway).
reg [15:0] rom[0:`VROM_ADDR_HI-`VROM_ADDR_LO];
wire [15:0] romAddr; // declared explicitly; an implicit net would be only 1 bit wide
assign romAddr = (memAddr - `VROM_ADDR_LO);
always @(posedge clock) begin
if (re_L == MEM_RD) begin
dataOut <= rom[romAddr];
end
end
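The arithmetic behind this rebasing can be checked with a small C model. It uses the two address values quoted in the question ('hEB00 and 'hEFFF); `VROM_DEPTH` and `rom_index` are hypothetical names introduced only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define VROM_ADDR_LO 0xEB00u /* first CPU-visible ROM address (from the question) */
#define VROM_ADDR_HI 0xEFFFu /* last CPU-visible ROM address (from the question) */
#define VROM_DEPTH   (VROM_ADDR_HI - VROM_ADDR_LO + 1u) /* 0x500 words */

/* Maps a CPU-visible address inside the ROM window to a zero-based
 * index, mirroring  romAddr = memAddr - `VROM_ADDR_LO. */
static uint16_t rom_index(uint16_t mem_addr)
{
    assert(mem_addr >= VROM_ADDR_LO && mem_addr <= VROM_ADDR_HI);
    return (uint16_t)(mem_addr - VROM_ADDR_LO);
}
```

The window is 0x500 = 1280 words deep, which matches the depth reported in the Quartus error, so the zero-based indices run from 0 to 1279.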
Go for the ROM megafunctions in the MegaWizard Plug-In Manager in Quartus. It is a GUI-based IP core generator. There you can parameterize all the options, including the hex file. But you have to follow the Intel HEX file format. This is also available under File -> New.
For hdl based instantiation
http://quartushelp.altera.com/12.1/mergedProjects/hdl/mega/mega_file_lpm_rom.htm
User guide.
http://www.altera.com/literature/ug/ug_ram_rom.pdf

compilers - Instruction Selection for type declarations in AST

I'm learning compilers and creating a code generator for a simple language that deals with two types: characters and integers.
After the user input has been scanned by the scanner and then parsed by the parser, I get an AST representation of the input. I have made a code generation for an even simpler language which only processes expressions with integers, operators and variables.
However with this new language I sometimes get a subtree for a type declaration, like this:
(IS TYPE (x) (INT))
which says x is of type INT.
Should there be a case in my code generator which deals with these type declarations? Or is this simply for the semantic analyzer to type check, so I should just assume the types have been checked and ignore this part of the tree and simply assign the value for x?
Both situations are possible. You need to describe more about your language to see whether you really need to add that feature to your code generator, or can skip it as unnecessary and avoid extra work on this difficult and interesting topic of designing a programming language.
Is your "code generator" a program that receives as input code in a programming language (maybe a small one) and outputs code in another programming language (maybe a small one)?
This tool is usually called a "translator".
Is your "code generator" a program that receives as input a programming language and outputs an assembler- or bytecode-like programming language?
This tool is usually called a "compiler".
Note: "pile" is a synonym for "stack".
Usually an A.S.T. stores the type of an operation or function call. Example, in C:
...
int a = 3;
int b = 5;
float c = (float)(a * b);
...
The last line generates an A.S.T. similar to this (the A.S.T. for the other lines is skipped):
= (no type) [root]
+-- (float) c
+-- (float) (cast operation)
    +-- (int) ()
        +-- (int) *
            +-- (int) a
            +-- (int) b
Note that the "(float)" cast is like an operator or a function,
similar to your question.
Good Luck.
If this is a declaration
(IS TYPE (x) (INT))
then x should be laid out in memory. In the case of C, local automatic variables are allocated on the stack. To allocate the needed amount of stack you must know the sizes of all local variables, and the sizes come from the types.
If this variable is stored in a register, you should select a register of the needed size (think of x86 with AL, AX, EAX, RAX: the same register at different sizes), if your target has such registers.
Also, the type is needed when there is an ambiguous operation in the AST that can operate on different data sizes (e.g. char, short, int, or 8-bit, 16-bit, 32-bit, etc.). And for some assemblers, the size of the data is encoded into the instruction itself, so the codegen should remember the sizes of variables.
Or, if the type of the operation was not recorded in the AST, the ADD:
(ADD (x) (y))
may mean either an int or a float addition (ADD or FADD instructions), so the types of x and y are needed in codegen to select the right variant.
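As a rough illustration of that last point, here is a minimal C sketch of a code generator choosing between ADD and FADD from declared types. Everything in it is an assumption made for the example: the `Type` enum, the `Symbol` table (imagined as filled in from declarations such as (IS TYPE (x) (INT))), and the choice to reject mixed operands rather than insert a conversion.

```c
#include <assert.h>
#include <string.h>

typedef enum { TY_CHAR, TY_INT, TY_FLOAT } Type;

/* Hypothetical symbol table entry recorded when a declaration is seen. */
typedef struct {
    const char *name;
    Type type;
} Symbol;

static const Symbol *lookup(const Symbol *table, size_t n, const char *name)
{
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Picks the mnemonic for (ADD (lhs) (rhs)) from the operand types.
 * Returns NULL when the types differ, i.e. when a conversion would
 * have to be emitted first (not handled in this sketch). */
static const char *select_add(const Symbol *table, size_t n,
                              const char *lhs, const char *rhs)
{
    const Symbol *a = lookup(table, n, lhs);
    const Symbol *b = lookup(table, n, rhs);
    assert(a != NULL && b != NULL); /* both operands must be declared */
    if (a->type != b->type) {
        return NULL;
    }
    return (a->type == TY_FLOAT) ? "FADD" : "ADD";
}
```

The point of the design is that the declaration subtree produces no instructions of its own; it only updates the table that later instruction selection consults.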
