How to read a byte of the Serial Presence Detect (SPD) data from the DIMM after 255(FF) bytes? - bios

I have got SMBus Base Address Register,
and program the SMBus Transmit Slave Address Register with the DIMM SMBus address, SMBBASE 04h.
Then program the SMBus Host Command Register with the DIMM’s SPD data offset to be read, SMBBASE 03h.
But the Host Command Register (HCMD)—Offset 3h is Size: 8 bits(255/FF),
So How can I read the after 255 bytes?
For example:
DDR4 Serial Presence Detect (SPD) Table:
Byte 320 : Module Manufacturer ID Code
I need to read Byte 320.
My code like this
unsigned ReadByte(unsigned SMBase_addr,unsigned i)
{
unsigned val;
outportb(SMBase_addr,0x1e);
outportb(SMBase_addr 0x04,0xa7);
outportb(SMBase_addr 0x03,i);
outportb(SMBase_addr 0x02,0x48);
while((inportb(SMBase_addr))&0x01){
delay(10);
}
val=inportb(SMBase_addr 0x05);
return val;
}
for(i=0;i<383;i )
{
data=ReadByte(SMBase_addr,i);
printf("%4x",data);
}
and I change
outportb(SMBase_addr 0x03,i);
to
outportw(SMBase_addr 0x03,i);
Host Status Register return 0x44, Device Error (DERR).

At least in Linux PC,
You need to write SMBus address 0x37 first to reach page 1. (Let write 0 to SMbus addr 32)
Than ALL of your DDR4 RAM SPD switches to Page 1.
Just use regular functions for writing & reading required addresses.
Than switch to page 0 by writing SMBus address 0x36 after.

Trying to read DDR4 SPD?
They have 2 pages of 256 bytes each, and you need a dummy write to a special predefined address 0x6E to switch all SPD chips to page 1 (where your byte 320 is located), and than write to 0x6C switch them back to page 0 (to prevent an SPD read failure during next boot).
Read this datasheet on page 12 for more info.

Related

Capturing all traffic in Wireshark from a specific MAC OUI?

I would like to capture all wifi traffic from a specific device manufacturer using Wireshark/Tshark/TCPDump/etc. I want to use a CAPTURE filter, not a display filter. Basically, I want to capture all packets from the MAC address 11:22:33:xx:xx:xx and nothing else. Or, put another way, the first 3 octets or OUI of the MAC address using Berkeley Packet Filtering Syntax. Anyone have a preferred method?
Per this post, use syntax like ether[A:B] in your capture filter where
A = start byte location in ethernet frame, starting at 0
B = number of bytes, must be 1, 2, or 4
So to match 3 bytes, you have to have 2 comparisons: Match 2 bytes and 1 byte separately.
If you only want about packets coming from this OUI (per question):
tcpdump 'ether[0:2] == 0x1122 && ether[2:1] == 0x33'
If you want all packets going to/from this OUI:
tcpdump 'ether[0:2] == 0x1122 && ether[2:1] == 0x33 \
|| ether[6:2] == 0x1122 && ether[8:1] == 0x33'
The first 12 bytes (0-11) of the ethernet header consist of the destination and then source mac addresses. So to select both sets of 3 bytes 0-2 and 6-8, select 2 bytes at 0, 1 byte at 2, 2 bytes at 6 and 1 byte at 8.
You should also be able to use this with tshark as long as you preface this with the -f capture filter flag.

SP605 Spartan 6 DDR3 addressing

the following post is quite long, but since I have had trouble making the SP605 board properly interact with the DDR3 for over a month now, hopefully this will be useful to others in the same situation as I find myself in. I am pretty certain it's a simple configuration or conceptual error, but I would be more than happy to have this resolved soon.
=== SCENARIO ===
I have created a USB-UART interface to communicate with the FPGA and control the DDR3. Using the IP generator in ISE, I generated a MIG wrapper and then I designed the memory interface controller. However, I have referenced manuals ug388 and ug416, but I have not been able to have the DDR3 behave as expected.
=== PROBLEM STATEMENT ===
Playing around with the burst lengths for write and read commands, I am able to get data back from the DDR3, yet the addressing scheme does not seem to be correct as data is duplicated in addresses 0 and 1, 2 and 3, 4 and 5, and so forth. Also, whenever I write into address 0, for example, nothing changes. Then, when I write into address 1, both addresses 0 and 1 are updated with the data value I just sent. It seems I am "losing" half of the memory space due to this coupled effect.
=== DDR3 IP CONFIGURATION ===
The setup for the DDR3 using the IP generator – considering the SP605 board scenario – is listed below. In sum, I activated the DDR3 Bank 3 and configured Port0 to be 32-bit bidirectional.
Memory selection:
Enable AXI interface: unchecked
Use extended MCB performance range: unchecked
Memory type for bank 3: DDR3 SDRAM
Memory type for bank 1: none
Options for C3 – DDR3 SDRAM
Frequency: 400 MHz
Memory part: MTJ41J64M16XX-187E
Memory options for C3 – DDR3 SDRAM
Output driver impedance control: RZQ/6
RTT (nominal) – ODT: RZQ/4
Auto self refresh: enabled
Port configuration for C3 – DDR3 SDRAM
Two 32-bit bi-directional and four 32-bit unidirectional ports
Port0: checked
Port1: unchecked
Port2: unchecked
Port3: unchecked
Port4: unchecked
Port5: unchecked
Memory address mapping selection: row-bank-column
FPGA options for C3 – DDR3 SDRAM
Memory interface pin termination: Calibrated input termination
Select RZQ pin location: R7
Select ZIO pin location: W4
Debug signals for memory controller: disable
System clock: differential
=== DATA STRUCTURE ===
From Matlab, I send in a 64-bit command which should write or read the DDR3 based on the address and data provided in this command.
wire [00:00] cmd_instruction = usb_data[63:63]; // ‘0’ = write; ‘1’ = read
wire [27:00] cmd_address = usb_data[62:37]; // 26-bit address
wire [31:00] cmd_data = usb_data[31:00]; // 32-bit data
In ug388, the following can be extracted:
Page 20: The address is 26 bits wide.
C_MEM_ADDR_WIDTH = 13
C_MEM_BANKADDR_WIDTH = 3
C_MEM_NUM_COL_BITS = 10
C_P0_DATA_PORT_SIZE = 32 // 32-bit data ports
C_P0_MASK_SIZE = 4 // 4 bytes = 32 bits (1 mask bit = 1 entire data byte)
Pages 26-27: Command data structure.
pX_cmd_addr[29:0]: 30-bit address, however the last two bits should = “00” since every word (32 bits) is formed by 4 bytes.
pX_cmd_bl[5:0]: Burst length of 1 is obtained by setting this signal to 0.
pX_cmd_instr[2:0]: The only command instructions used are write=”000” and read=”001”.
Page 28: Write data structure.
pX_wr_mask[PX_MASKSIZE-1:0]: 4-bit mask is set to “0000” so that all 4 bytes are always written into the memory.
=== SIGNAL ASSIGNMENTS ===
Using all this information, I assigned my signals in the following manner:
assign p0_mcb_cmd_instr = {2'b00, cmd_instruction};
assign p0_mcb_cmd_addr = {2’d0, cmd_address, 2'd0};
assign p0_mcb_cmd_bl = 6'd0;
assign p0_mcb_wr_data = cmd_data;
assign p0_mcb_wr_mask = 4'd0;
localparam C3_MEM_BURST_LEN = 8;
=== CONCLUSIONS ===
Based on the configuration, does anyone know what the expected behavior of my controller should be?
If any additional information is necessary for clarification, please let me know.
Thanks a lot,
Bruno.

Process memory limit and address space for UNIX and Linux and Windows

what is the maximum amount of memory for a single process in UNIX and Linux and windows? how to calculate that? How much user address space and kernel address space for 4 GB of RAM?
How much user address space and kernel address space for 4 GB of RAM?
The address space of a process is divided into two parts,
User space: On standard 32 bit x86_64 architecture,the maximum addressable memory is 4GB, out of which addresses from 0x00000000 to 0xbfffffff = (3GB) meant for code, data segments. This region can be addressed when user process executing either in user or kernel mode.
Kernel space: Similarly, addresses from 0xc0000000 to 0xffffffff = (1GB) are meant for virtual address space of the kernel and can only addressed when the process executes in kernel mode.
This particular address space split on x86 is determined by the value of PAGE_OFFSET. Referring to Linux 3.11.1v page_32_types.h and page_64_types.h, page offset is defined as below,
#define __PAGE_OFFSET _AC(CONFIG_PAGE_OFFSET, UL)
Where Kconfig defines a default value of default 0xC0000000 also with other address split options available.
Similarly for 64 bit,
#define __PAGE_OFFSET _AC(0xffff880000000000, UL).
On 64 bit architecture 3G/1G split doesn't hold anymore due to huge address space. As per the source latest Linux version has given above offset as offset.
When I see my 64 bit x86_64 architecture, a 32 bit process can have entire 4GB of user address space and kernel will hold address range above 4GB. Interestingly on modern 64 bit x86_64 CPU's not all address lines are enabled(or the address bus is not large enough) to provide us 2^64 = 16 exabytes of virtual address space. Perhaps AMD64/x86 architectures has 48/42 lower bits enabled respectively resulting to 2^48 = 256TB / 2^42= 4TB of address space. Now this definitely improves performance with large amount of RAM, at the same time question arises how it is efficiently managed with the OS limitations.
In Linux there's a way to find out the limit of address space you can have.
Using the rlimit structure.
struct rlimit {
rlim_t cur; //current limit
rlim_t max; //ceiling for cur.
}
rlim_t is a unsigned long type.
and you can have something like:
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
//Bytes To GigaBytes
static inline unsigned long btogb(unsigned long bytes) {
return bytes / (1024 * 1024 * 1024);
}
//Bytes To ExaBytes
static inline double btoeb(double bytes) {
return bytes / (1024.00 * 1024.00 * 1024.00 * 1024.00 * 1024.00 * 1024.00);
}
int main() {
printf("\n");
struct rlimit rlim_addr_space;
rlim_t addr_space;
/*
* Here we call to getrlimit(), with RLIMIT_AS (Address Space) and
* a pointer to our instance of rlimit struct.
*/
int retval = getrlimit(RLIMIT_AS, &rlim_addr_space);
// Get limit returns 0 if succeded, let's check that.
if(!retval) {
addr_space = rlim_addr_space.rlim_cur;
fprintf(stdout, "Current address_space: %lu Bytes, or %lu GB, or %f EB\n", addr_space, btogb(addr_space), btoeb((double)addr_space));
} else {
fprintf(stderr, "Coundn\'t get address space current limit.");
return 1;
}
return 0;
}
I ran this on my computer and... prrrrrrrrrrrrrrrrr tsk!
Output: Current address_space: 18446744073709551615 Bytes, or 17179869183 GB, or 16.000000 EB
I have 16 ExaBytes of max address space available on my Linux x86_64.
Here's getrlimit()'s definition
it also lists the other constants you can pass to getrlimits() and introduces getrlimit()s sister setrlimit(). There is when the max member of rlimit becomes really important, you should always check you don't exceed this value so the kernel don't punch your face, drink your coffee and steal your papers.
PD: please excuse my sorry excuse of a drum roll ^_^
On Linux systems, see man ulimit
(UPDATED)
It says:
The ulimit builtin is used to set the resource usage limits of the
shell and any processes spawned by it. If a new limit value is
omitted, the current value of the limit of the resource is printed.
ulimit -a prints out all current values with switch options, other switches, e.g. ulimit -n prints out no. of max. open files.
Unfortunatelly, "max memory size" tells "unlimited", which means that it is not limited by system administrator.
You can view the memory size by
cat /proc/meminfo
Which results something like:
MemTotal: 4048744 kB
MemFree: 465504 kB
Buffers: 316192 kB
Cached: 1306740 kB
SwapCached: 508 kB
Active: 1744884 kB
(...)
So, if ulimit says "unlimited", the MemFree is all yours. Almost.
Don't forget that malloc() (and new operator, which calls malloc()) is a STDLIB function, so if you call malloc(100) 10 times, there will be lot of "slack", follow link to learn why.

Reassembling packets in a Lua Wireshark Dissector

I'm trying to write a dissector for the Safari Remote Debug protocol which is based on bplists and have been reasonably successful (current code is here: https://github.com/andydavies/bplist-dissector).
I'm running into difficultly with reassembling packets though.
Normally the protocol sends a packet with 4 bytes containing the length of the next packet, then the packet with the bplist in.
Unfortunately some packets from the iOS simulator don't follow this convention and the four bytes are either tagged onto the front of the bplist packet, or onto the end of the previous bplist packet, or the data is multiple bplists.
I've tried reassembling them using desegment_len and desegment_offset as follows:
function p_bplist.dissector(buf, pkt, root)
-- length of data packet
local dataPacketLength = tonumber(buf(0, 4):uint())
local desiredPacketLength = dataPacketLength + 4
-- if not enough data indicate how much more we need
if desiredPacketLen > buf:len() then
pkt.desegment_len = dataPacketLength
pkt.desegment_offset = 0
return
end
-- have more than needed so set offset for next dissection
if buf:len() > desiredPacketLength then
pkt.desegment_len = DESEGMENT_ONE_MORE_SEGMENT
pkt.desegment_offset = desiredPacketLength
end
-- copy data needed
buffer = buf:range(4, dataPacketLen)
...
What I'm attempting to do here is always force the size bytes to be the first four bytes of a packet to be dissected but it doesn't work I still see a 4 bytes packet, followed by a x byte packet.
I can think of other ways of managing the extra four bytes on the front, but the protocol contains a lookup table thats 32 bytes from the end of the packet so need a way of accurately splicing the packet into bplists.
Here's an example cap: http://www.cloudshark.org/captures/2a826ee6045b #338 is an example of a packet where the bplist size is at the start of the data and there are multiple plists in the data.
Am I doing this right (looking other questions on SO, and examples around the web I seem to be) or is there a better way?
TCP Dissector packet-tcp.c has tcp_dissect_pdus(), which
Loop for dissecting PDUs within a TCP stream; assumes that a PDU
consists of a fixed-length chunk of data that contains enough information
to determine the length of the PDU, followed by rest of the PDU.
There is no such function in lua api, but it is a good example how to do it.
One more example. I used this a year ago for tests:
local slicer = Proto("slicer","Slicer")
function slicer.dissector(tvb, pinfo, tree)
local offset = pinfo.desegment_offset or 0
local len = get_len() -- for tests i used a constant, but can be taken from tvb
while true do
local nxtpdu = offset + len
if nxtpdu > tvb:len() then
pinfo.desegment_len = nxtpdu - tvb:len()
pinfo.desegment_offset = offset
return
end
tree:add(slicer, tvb(offset, len))
offset = nxtpdu
if nxtpdu == tvb:len() then
return
end
end
end
local tcp_table = DissectorTable.get("tcp.port")
tcp_table:add(2506, slicer)

How to manual set minimal value for dynamic buffer in Netty 3.2.6? For example 2048 bytes

I need to receive full packet from other IP (navigation device) by TCP/IP.
The device has to send 966 bytes periodically (over one minute), for example.
In my case first received buffer has length 256 bytes (first piece of packet), the second is 710 bytes (last piece of packet), the third is full packet (966 bytes).
How to manual set minimal value for first received buffer length?
This is piece of my code:
Executor bossExecutors = Executors.newCachedThreadPool();
Executor workerExecutors = Executors.newCachedThreadPool();
NioServerSocketChannelFactory channelsFactory =
new NioServerSocketChannelFactory(bossExecutors, workerExecutors);
ServerBootstrap bootstrap = new ServerBootstrap(channelsFactory);
ChannelPipelineFactory pipelineFactory = new NettyServerPipelineFactory(this.HWController);
bootstrap.setPipelineFactory(pipelineFactory);
bootstrap.setOption("child.tcpNoDelay", true);
bootstrap.setOption("child.keepAlive", true);
bootstrap.setOption("child.receiveBufferSizePredictorFactory",
new FixedReceiveBufferSizePredictorFactory(2048)
);
bootstrap.bind(new InetSocketAddress(this.port));
No matter what receiveBufferSizePredictorFactory you specify, you will see a message is split into multiple MessageEvents. It's because TCP/IP is not a message-oriented protocol but a stream-oriented one. Please read the user guide that explains how to write a proper decoder that deals with this common issue.

Resources