I am writing a memory system for a basic 16-bit educational CPU and am running into issues with Quartus synthesis of my module. Specifically, I have broken the address space down into a few different parts, and one of them (a ROM) is not synthesizing properly. (NOTE: I am synthesizing for an Altera DE2-115 board with Quartus II 12.1; the code is SystemVerilog.)
So, in order to make the memory-mapped VRAM (a memory module that is dual-ported to allow VGA output while the CPU writes colors) more usable, I included a small ROM in the address space that contains code (assembled functions) for writing characters into memory, i.e. a print_char function. This ROM is located in memory at a specific address, so in order to simplify the SV, I implemented the ROM (and all the memory modules) like so:
module printROM
(input rd_cond_code_t re_L,
input wr_cond_code_t we_L,
input logic [15:0] memAddr,
input logic clock,
input logic reset_L,
inout wire [15:0] memBus,
output mem_status_t memStatus);
reg [15:0] rom[`VROM_ADDR_LO:`VROM_ADDR_HI];
reg [15:0] dataOut;
/* The ROM */
always @(posedge clock) begin
if (re_L == MEM_RD) begin
dataOut <= rom[memAddr];
end
end
/* Load the rom data */
initial $readmemh("printROM.hex", rom);
/* Drive the line */
tridrive #(.WIDTH(16)) romOut(.data(dataOut),
.bus(memBus),
.en_L(re_L));
/* Manage asserting completion of a read or a write */
always_ff @(posedge clock, negedge reset_L) begin
if (~reset_L) begin
memStatus <= MEM_NOT_DONE;
end
else begin
if ((we_L == MEM_WR) | (re_L == MEM_RD)) begin
memStatus <= MEM_DONE;
end
else begin
memStatus <= MEM_NOT_DONE;
end
end
end
endmodule // printROM
Where VROM_ADDR_LO and VROM_ADDR_HI are two macros defining the addresses of this ROM, namely 'hEB00 and 'hEFFF. Thus, when an address in that range is read/written to by the CPU, this module is able to properly index into the rom memory. This technique works fine in VCS simulation.
However, when I go to synthesize this module, Quartus correctly implies a ROM but has issues initializing it. I get this error:
Error (113012): Address at line 11 exceeds the specified depth (1280) in the Memory Initialization File "psx18240.ram0_printROM_38938673.hdl.mif"
It looks like Quartus is converting the .hex file I give as the ROM code (i.e. printROM.hex) and using the CPU-visible addresses (i.e. starting at 'hEB00) for the generated .mif file, even though the size of rom is obviously too small for that. Does Quartus not support this syntax, or am I doing something wrong?
It would appear that Quartus does not like what you're doing. If you only want a ROM with 1280 entries, it would probably be better to declare it as 0:1279 and address it using that range (this is what your logic would have to synthesize into anyway).
reg [15:0] rom[0:`VROM_ADDR_HI-`VROM_ADDR_LO];
wire [15:0] romAddr; // ROM-local address, starting at 0
assign romAddr = (memAddr - `VROM_ADDR_LO);
always @(posedge clock) begin
if (re_L == MEM_RD) begin
dataOut <= rom[romAddr];
end
end
Use the ROM megafunctions in the MegaWizard Plug-In Manager in Quartus. It is a GUI-based IP core generator in which you can parameterize all the options, including the hex file, but the file has to follow the Intel HEX format. It is also available via File -> New.
For HDL-based instantiation:
http://quartushelp.altera.com/12.1/mergedProjects/hdl/mega/mega_file_lpm_rom.htm
User guide.
https://www.google.co.in/url?sa=t&source=web&rct=j&ei=nVAGVKvSDcu9ugTMooCQDQ&url=http://www.altera.com/literature/ug/ug_ram_rom.pdf&cd=1&ved=0CBsQFjAA&usg=AFQjCNE2ZXM1gIsMZ5BvkKTHanX1E7vamg
Related
module delay(
input [11:0] data_in,
input delay_clk, //here i will use a 20kHz clk
output reg [11:0] data_out
);
reg[11:0]memory[0:20000];
reg[15:0]write_index;//i
reg[15:0]read_index;//j
initial begin
write_index = 16'b0000000000000000;
read_index = 16'b0100111000100000;
end
always @(posedge delay_clk) begin
read_index = read_index+1;
memory [write_index] <= data_in;
data_out <= memory[read_index];
end
endmodule
I want to create a 1-second delay using a circular memory.
I generated the bitstream and programmed it to the FPGA, but no sound comes out.
How can I improve this Verilog code?
You don't increment write_index anywhere (as you do with read_index), so this hardware will overwrite the same memory cell over and over again. I don't think this is what you wanted; it defeats the purpose of using a memory array.
If you intend to store read_index in some kind of memory element, use the <= assignment, like the ones for memory and data_out.
Your write_index and read_index are 16-bit variables, and you use them to iterate over a memory of about 20k cells. They will overflow at some point. Is this intentional? I mean, why 16 bits, and is it OK for those indexes to point at, e.g., address 41k?
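The points above can be put together in a short sketch. This is not the asker's design fixed line by line; it swaps the two separate indexes for a single wrapping pointer that reads the oldest sample and then overwrites it with the newest one. The module name delay_line and the parameters DEPTH and WIDTH are made up for illustration, and $clog2 assumes a Verilog-2005 capable tool.

```verilog
// Sketch of a circular delay line: data_out is data_in from DEPTH clocks ago.
// Names (delay_line, DEPTH, WIDTH) are illustrative assumptions.
module delay_line #(
    parameter DEPTH = 20000,  // 20000 samples at 20 kHz is 1 second
    parameter WIDTH = 12
)(
    input  wire             delay_clk,
    input  wire [WIDTH-1:0] data_in,
    output reg  [WIDTH-1:0] data_out
);
    localparam AW = $clog2(DEPTH);       // pointer width needed for 0..DEPTH-1

    reg [WIDTH-1:0] memory [0:DEPTH-1];
    reg [AW-1:0]    index = {AW{1'b0}};  // initial value (FPGA, not ASIC)

    always @(posedge delay_clk) begin
        data_out      <= memory[index];  // oldest sample, stored DEPTH clocks ago
        memory[index] <= data_in;        // replace it with the newest sample
        // wrap explicitly so the pointer never leaves the array
        index <= (index == DEPTH-1) ? {AW{1'b0}} : index + 1'b1;
    end
endmodule
```

Until the buffer has been filled once, the output is whatever the memory held at power-up (X in simulation), so the first DEPTH samples should be ignored.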
I'm trying to synthesize a simple generic memory model with Design Compiler,
but I get some error messages as below,
and the simple generic memory model I used is as follows:
module RAM_generic
(clk,
enb,
wr_din,
wr_addr,
wr_en,
rd_addr,
rd_dout);
parameter AddrWidth = 1;
parameter DataWidth = 1;
input clk;
input enb;
input signed [DataWidth - 1:0] wr_din;
input [AddrWidth - 1:0] wr_addr;
input wr_en;
input [AddrWidth - 1:0] rd_addr;
output signed [DataWidth - 1:0] rd_dout;
reg [DataWidth - 1:0] ram [2**AddrWidth - 1:0];
reg [DataWidth - 1:0] data_int;
always @(posedge clk)
begin
if (enb == 1'b1) begin
if (wr_en == 1'b1) begin
ram[wr_addr] <= wr_din;
end
data_int <= ram[rd_addr];
end
end
assign rd_dout = data_int;
endmodule
I want to know: can't we synthesize a simple generic memory? If we can, what am I supposed to do to fix the synthesis error?
Yes you can.
In FPGAs, a single- or dual-ported memory will be mapped onto the internal memory structures. (At least if you use the right syntax! Look in the FPGA application notes for how to do that.)
In an ASIC it will be made from registers. I needed a small triple-ported memory (two read ports and one write port, all simultaneous) a few years back and it came out fine. Most FIFOs have a memory in them, and 90% of them are made from registers.
Your code is missing 'endmodule'. I don't spot any other obvious errors.
Some tips:
Using ((1 << AddrWidth) - 1) will also work in old-fashioned Verilog, which predates the ** operator.
I would not use a default width/depth of 1 for a memory. You get [0:0] constructs, which work, but why should you when e.g. 8x8 is more likely to be used?
A generic memory should not be signed. By convention a generic memory is unsigned.
Parameters are by convention all uppercase. (At least they were in every firm I worked at.)
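Applying these tips to the module from the question might look like the sketch below. The renamed module and parameters (ram_generic, ADDR_WIDTH, DATA_WIDTH) and the 8x8 default are illustrative choices, not something a particular tool requires.

```verilog
// Sketch of the generic memory with the tips applied:
// uppercase parameters, unsigned data, a more useful default size,
// and a depth computed with a shift instead of **.
module ram_generic #(
    parameter ADDR_WIDTH = 3,   // 8 entries by default
    parameter DATA_WIDTH = 8
)(
    input  wire                  clk,
    input  wire                  enb,
    input  wire                  wr_en,
    input  wire [ADDR_WIDTH-1:0] wr_addr,
    input  wire [DATA_WIDTH-1:0] wr_din,
    input  wire [ADDR_WIDTH-1:0] rd_addr,
    output reg  [DATA_WIDTH-1:0] rd_dout
);
    localparam DEPTH = (1 << ADDR_WIDTH);

    reg [DATA_WIDTH-1:0] ram [0:DEPTH-1];

    always @(posedge clk) begin
        if (enb) begin
            if (wr_en)
                ram[wr_addr] <= wr_din;
            rd_dout <= ram[rd_addr];   // registered read; returns the old data
                                       // when rd_addr equals wr_addr
        end
    end
endmodule
```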
I am trying to implement a simple multiplier. I have a text file with two columns, and I am multiplying column 1 by column 2. Here is the code in Verilog:
module File_read(
input clk
);
reg [21:0] captured_data[0:10];
reg [21:0] a[0:8];
reg [21:0] b[0:8];
reg [43:0] product[0:5];
`define NULL 0
integer n=0;
integer i=0;
initial
$readmemh("abc.txt",captured_data);
always @(posedge clk) begin
product[i]<=captured_data[n]*captured_data[n+1];
n<=n+2;
i<=i+1;
end
endmodule
I have a Xilinx Spartan®-6 LX45 FPGA board. It offers 128 Mbit of DDR2 RAM and a 16 MByte x4 SPI flash for configuration and data storage.
Now I want to store my file in memory on the FPGA board. How can I do this? Do I have to use an IP core to access the memory, or is there another way?
P.S.: This is my first time storing anything on an FPGA.
Regards!
Awais
First of all, don't use DDR or flash memory unless you really need them. Your FPGA has plenty of BlockRAMs to store several thousand arguments for your multiplier.
One easy way is to instantiate 2 BlockRAMs and load them at compile time with data from a file. Xilinx offers tools like data2mem to achieve this.
Alternatively, you can use Ethernet or a UART connection to send the test data to your design.
Edit 1 - How to instantiate BlockRAM
Solution 1: A generic VHDL description.
type T_RAM is array(LINES - 1 downto 0) of std_logic_vector(BITS-1 downto 0);
signal ram : T_RAM;
begin
process (Clock)
begin
if rising_edge(Clock) then
if (WriteEnable = '1') then
ram(to_integer(WriteAddress)) <= d;
end if;
q <= ram(to_integer(ReadAddress));
end if;
end process;
Solution 2: The IPCore generator has a wizard to create BlockRAMs and assign external files.
Solution 3: Manually instantiate a BlockRAM macro. Each FPGA family comes with an HDL library guide of supported macros. For example, the Virtex-5 has a RAMB36 macro on page 311.
The usage of BlockRAMs with data2MEM and *.bmm (BlockRAM memory map) files is described here.
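Since the question is written in Verilog, here is a rough Verilog sketch of the "two memories loaded at compile time" idea. It uses inferred memories initialized with $readmemh rather than data2mem, on the assumption that the synthesis tool accepts $readmemh in an initial block for memory initialization. The module name and the file names a.hex and b.hex (one column of the original text file each) are hypothetical.

```verilog
// Sketch: two small memories preloaded from files, feeding a multiplier.
// a.hex / b.hex are hypothetical files, one hex operand per line.
// DEPTH is assumed to be a power of two so the address wraps naturally.
module file_mult #(
    parameter DEPTH = 8
)(
    input  wire        clk,
    output reg  [43:0] product
);
    localparam AW = $clog2(DEPTH);

    reg [21:0] a_mem [0:DEPTH-1];
    reg [21:0] b_mem [0:DEPTH-1];
    reg [21:0] a_q, b_q;            // registered memory outputs
    reg [AW-1:0] addr = {AW{1'b0}};

    initial begin
        $readmemh("a.hex", a_mem);
        $readmemh("b.hex", b_mem);
    end

    always @(posedge clk) begin
        a_q     <= a_mem[addr];     // synchronous reads
        b_q     <= b_mem[addr];
        product <= a_q * b_q;       // 22 x 22 bits fits in 44 bits
        addr    <= addr + 1'b1;
    end
endmodule
```

The product for a given address appears two clock edges after that address is presented, because both the memory read and the multiplier output are registered.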
How can I make a memory module in which the DATA bus widths are passed as parameters to each instance, so that the design reconfigures itself according to the parameters? For example, assume I have a byte-addressable memory where the DATA-IN bus width is 32 bits (4 bytes written each cycle) and DATA-OUT is 16 bits (2 bytes read each cycle). For another instance, DATA-IN is 64 bits and DATA-OUT is 16 bits. My design should work for all such instances.
What I have tried is to generate write pointer values according to the design parameters: e.g. for a 32-bit DATA-IN, the write pointer increments by 4 every cycle while writing; for 64 bits, the increment is 8, and so on.
The problem is: how do I write 4, 8, or 16 bytes in a single cycle according to the parameters passed to the instance?
//Something as following I want to implement. This memory instance can be considered as internal memory of FIFO having different datawidth for reading and writing in case you think of an application of such memory
module mem#(parameter DIN=16, parameter DOUT=8, parameter ADDR=4,parameter BYTE=8)
(
input [DIN-1:0] din,
output [DOUT-1:0] dout,
input wen,ren,clk
);
localparam DEPTH = (1<<ADDR);
reg [BYTE-1:0] mem [0:DEPTH-1];
reg wpointer=5'b00000;
reg rpointer=5'b00000;
reg [BYTE-1:0] tmp [0:DIN/BYTE-1];
function [ADDR:0] ptr;
input [4:0] index;
integer i;
begin
for(i=0;i<DIN/BYTE;i=i+1) begin
mem[index] = din[(BYTE*(i+1)-1):BYTE*(i)]; // something like this I want to implement, I know this line is not allowed in verilog, but is there any alternative to this?
index=index+1;
end
ptr=index;
end
endfunction
always @(posedge clk) begin
if(wen==1)
wpointer <= wptr(wpointer);
end
always @(posedge clk) begin
if(ren==1)
rpointer <= ptr(rpointer);
end
endmodule
din[(BYTE*(i+1)-1):BYTE*(i)] will not compile in Verilog because the MSB and LSB select bits are both variables. Verilog requires a known range. The indexed part-select operator +: (a part-select is also known as a slice) allows a variable select index and a constant range value. It was introduced in IEEE Std 1364-2001 § 4.2.1. You can also read more about it in IEEE Std 1800-2012 § 11.5.1, or refer to previously asked questions: What is `+:` and `-:`? and Indexing vectors and arrays with +:.
din[BYTE*i +: BYTE] should work for you, alternatively you can use din[BYTE*(i+1)-1 -: BYTE].
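As a small standalone illustration of the indexed part-select, the sketch below reverses the byte order of a bus inside a loop. The module name byte_reverse and the port name swapped are made up for this example.

```verilog
// Sketch: indexed part-selects with a loop variable as the base.
// The width after +: must be constant; the base may vary.
module byte_reverse #(
    parameter BYTE = 8,
    parameter DIN  = 32
)(
    input  wire [DIN-1:0] din,
    output reg  [DIN-1:0] swapped   // bytes of din in reverse order
);
    localparam NBYTES = DIN / BYTE;
    integer i;

    always @* begin
        for (i = 0; i < NBYTES; i = i + 1)
            // byte i of din goes to byte (NBYTES-1-i) of swapped
            swapped[BYTE*(NBYTES-1-i) +: BYTE] = din[BYTE*i +: BYTE];
    end
endmodule
```

With the defaults, din = 32'hAABBCCDD gives swapped = 32'hDDCCBBAA.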
Also, you should use non-blocking assignments (<=) to mem. In your code, a read and a write can happen at the same time. With blocking assignments there is a race condition when both access the same byte. It may synthesize, but your RTL and gate simulations may generate different results. I also strongly advise against using functions for assigning memory. To be synthesizable without nasty surprises, a function needs to be self-contained, with no references to anything outside the function, and with any internal variables always reset to a static constant at the start of the function.
With the guidelines mentioned above, I'd recommend recoding to something like the below. This is a template to start with, not a free lunch. I left out the out-of-range index compensation for you to figure out on your own.
...
localparam DEPTH = (1<<ADDR);
reg [BYTE-1:0] mem [0:DEPTH-1];
reg [ADDR-1:0] wpointer, rpointer;
integer i;
initial begin // init values for pointers (FPGA, not ASIC)
wpointer = {ADDR{1'b0}};
rpointer = {ADDR{1'b0}};
end
always @(posedge clk) begin
if (ren==1) begin
for(i=0; i < DOUT/BYTE; i=i+1) begin
dout[BYTE*i +: BYTE] <= mem[rpointer+i];
end
rpointer <= rpointer + (DOUT/BYTE);
end
if (wen==1) begin
for(i=0; i < DIN/BYTE; i=i+1) begin
mem[wpointer+i] <= din[BYTE*i +: BYTE];
end
wpointer <= wpointer + (DIN/BYTE);
end
end
I've got a question regarding the exploit_notesearch program.
This program is only used to create a command string that we finally call with the system() function to exploit the notesearch program, which contains a buffer overflow vulnerability.
The commandstr looks like this:
./notesearch Nop-block|shellcode|repeated ret(will jump in nop block).
Now the actual question:
The ret address is calculated in the exploit_notesearch program by the line:
ret = (unsigned int) &i-offset;
So why can we use the address of the variable i, which sits near the bottom of the main stack frame of the exploit_notesearch program, to calculate the ret address that will be saved in an overflowing buffer in the notesearch program itself (so in a completely different stack frame) and that has to contain an address in the NOP block (which is in the same buffer)?
that will be saved in an overflowing buffer in the notesearch program itself ,so in an completely different stackframe
As long as the system uses virtual memory, another process will be created by system() for the vulnerable program, and assuming that there is no stack randomization,
both processes will have almost identical values of esp (as well as offset) when their main() functions start, given that the exploit was compiled on the attacked machine (i.e. with the vulnerable notesearch).
The address of variable i was chosen just to give an idea about where the frame base is. We could use this instead:
unsigned long sp(void) // This is just a little function
{ __asm__("movl %esp, %eax");} // used to return the stack pointer
int main(){
esp = sp();
ret = esp - offset;
//the rest part of main()
}
Because the variable i will be located at a relatively constant distance from esp, we can use &i instead of esp; it doesn't matter much.
It would be much more difficult to get an approximate value for ret if the system did not use virtual memory.
The stack is allocated in a first-in, last-out fashion. The variable i is located somewhere near the top; let's assume it is at 0x200 and that the return address is located at a lower address, 0x180. In order to determine where to put the return address and yet leave some space for the shellcode, the attacker must get the difference, which is 0x200 - 0x180 = 0x80 (128). So he will break that down as follows, ++, the return address is 4 bytes, so we have only 48 bytes left before reaching the segmentation. That is how it is calculated, and the location of i gives an approximate reference point.