Sending 256 bits via SPI from MCU through FPGA to the DUT - memory

I have designed a program in which the FPGA receives 32 bits from the MCU (over SPI), stores them in a 32-bit register, and then sends them to the DUT.
Now I am wondering: are there any limitations when we use registers?
For example, I now need to send and receive 256 bits of data between the MCU and the FPGA (over SPI).
Can I simply store them in a single register with a length of 256 bits, or should I split the 256 bits and store them in several registers?
Also, does the width always need to be a multiple of 32, 64, 128, ...? Or can I receive, for example, only 40 bits from the MCU?
So mainly, I want to know what kind of limitations we have when we receive data and store it in registers in the FPGA.
Thank you in advance.

The SPI protocol commonly works with 8-bit values, but it is not limited in general. So, yes, you can send and receive values of 256 bits width.
However, specific implementations in hardware or software put additional limits on the width. Common SPI hardware blocks in MCUs use a specific word width, for example 8 bits.
In that case you can send multiple words to transfer 256 bits. The synchronization could be done via the select signal, for example.
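As a rough illustration of the multi-word transfer, here is a minimal Python model (not FPGA or MCU code). The 8-bit word width, the most-significant-word-first order, and the helper names `split_into_words` and `reassemble` are assumptions chosen for the sketch, not something the SPI protocol prescribes.

```python
WORD_BITS = 8     # assumed word width of the MCU's SPI peripheral
TOTAL_BITS = 256  # payload width from the question


def split_into_words(value: int) -> list[int]:
    """Split a 256-bit integer into SPI words, most significant word first."""
    count = TOTAL_BITS // WORD_BITS
    mask = (1 << WORD_BITS) - 1
    return [(value >> (WORD_BITS * (count - 1 - i))) & mask for i in range(count)]


def reassemble(words: list[int]) -> int:
    """Model the receiver: shift every word into one wide register
    while the select signal stays active."""
    register = 0
    for word in words:
        register = (register << WORD_BITS) | word
    return register


payload = int.from_bytes(bytes(range(32)), "big")  # arbitrary 256-bit test pattern
words = split_into_words(payload)                  # 32 words of 8 bits each
restored = reassemble(words)                       # equals payload again
```

Keeping the select line active for the whole burst is what tells the receiver that all 32 words belong to the same 256-bit value.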
If the width you need is not an integer multiple of the transferred word, you can transfer more bits and ignore the padding. For example, if your MCU's SPI hardware can only transfer 32-bit words and your requirement is 40 bits, transfer 2 words of 32 bits (64 bits in total) and ignore 24 of those bits. Which bits are ignored is up to your specification.
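The 40-bit case can be sketched the same way. This Python model assumes the payload sits in the low 40 bits of the 64-bit frame and the upper 24 bits are padding; that placement, and the names `pack` and `unpack`, are choices made for the example only.

```python
WORD_BITS = 32
PAYLOAD_BITS = 40
FRAME_BITS = 2 * WORD_BITS            # two 32-bit words on the wire
PAD_BITS = FRAME_BITS - PAYLOAD_BITS  # 24 bits that carry no information
WORD_MASK = (1 << WORD_BITS) - 1
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1


def pack(payload: int) -> list[int]:
    """Place a 40-bit payload in the low bits of a 64-bit frame (assumed layout)."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 40 bits")
    return [(payload >> WORD_BITS) & WORD_MASK, payload & WORD_MASK]


def unpack(words: list[int]) -> int:
    """Rebuild the frame and drop the 24 padding bits, whatever they contain."""
    frame = (words[0] << WORD_BITS) | words[1]
    return frame & PAYLOAD_MASK


sent = pack(0xABCDEF0123)                  # [0x000000AB, 0xCDEF0123]
received = unpack(sent)                    # 0xABCDEF0123
noisy = unpack([0xFFFFFFAB, 0xCDEF0123])   # padding bits set, same payload
```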
How you realize this in your FPGA is also up to you, and it depends on your "protocol layer" above SPI. If your DUT needs to be accessed with 256 bits, the FPGA will hold 256 bits of register storage. But whether you organize it as multiple registers of the transferred width or as a single register is your choice.
Bottom line: The SPI protocol is a quite simple protocol, mainly a clocked synchronous serial transfer with select lines, potentially full-duplex. You can use it with great freedom to build your own protocol on top of it.

Related

Reading a 32-bit ADC result with a 16-bit SPI interface

I need to measure very small ripple on a power supply output that represents about 0.0017% of the output voltage. As a result, I need high resolution ADC to do this. I am using the F2837xD board which has at most 12-bits of resolution for single-ended signals, and because of the differential mode configuration I cannot use the 16-bit differential ADCs, and even if I could, not all the bits would be usable and the resolution still isn't good enough.
I plan to use an external ADC with at least 20 bits of resolution and to communicate with it over the SPI interface that is on the board. The issue is that the maximum word length of the SPI register is 16 bits.
Is there a way to read 20 bits or more accurately with a 16-bit SPI interface, whether in software or with additional hardware placed after the external ADC? How much of the accuracy gained from using an external high-resolution ADC would actually be retained in these circumstances?
See page 2223 of https://www.ti.com/lit/ug/spruhm8i/spruhm8i.pdf?ts=1631716470368&ref_url=https%253A%252F%252Fwww.ti.com%252Ftool%252FTMDSDOCK28379D%253FkeyMatch%253DTMDSDOCK28379D for the SPI interface.

Data memory and Instructions on PIC18F4321

We are studying the PIC18F4321, and at some point my professor drew the following diagram on the board:
He made it look like instructions (such as ADDLW 0X02, MOVWF 0X24, etc.) take two addresses in data memory, because each memory address in the PIC18F4321 only holds a byte and instructions are 16 bits wide.
But in the datasheet of the PIC18F4321, I cannot find where it says that these 16-bit instructions are ever stored in data memory. Before he said that, I had in mind that data memory was for storing register values, not full instructions. On the other hand, I know that there is also program memory, but program memory is not 8 bits wide, which makes his drawing even more confusing.
1) Are 16 bits instructions ever stored in Data Memory?
2) One way I found of trying to explain the picture is that perhaps the memory in question is not necessarily 8 bits wide, it is just that every address can only take 8 bits. So <8> would be simply stating how many bits you can hold in that address. Would this be a reasonable explanation?
1) Are 16 bits instructions ever stored in Data Memory?
No. Data memory is not used for storing instructions - you cannot execute any code from data memory. All instructions are stored in program memory, which consists of 16 bit instruction words. The datasheet details the format and layout of the different instructions. Some instructions are single word, some require multiple words. The program memory is addressed by a 21 bit program counter, which encompasses a 2Mbyte space although for the PIC18F4321 there is just 8Kbytes of program memory, which equates to 4096 single-word instructions.
Data memory consists of 8-bit bytes, addressed by a 12-bit bus, which allows up to 4096 bytes of data memory, although the PIC18F4321 has just 512 bytes of data memory, split into two banks of 256 bytes. This data memory contains the SFRs (special function registers) and the general purpose registers (GPRs) that you use in your application.
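The sizes quoted above follow directly from the address widths. A short Python check of the arithmetic, using only the figures given in this answer (the constant names are made up for the example):

```python
PC_BITS = 21              # program counter width
DATA_ADDR_BITS = 12       # data memory address width
INSTRUCTION_BYTES = 2     # one 16-bit instruction word
PROGRAM_BYTES = 8 * 1024  # program memory implemented on the PIC18F4321
DATA_BYTES = 512          # data memory implemented on the PIC18F4321
BANK_BYTES = 256          # size of one data memory bank

program_space = 2 ** PC_BITS                               # 2097152 bytes = 2 Mbytes
data_space = 2 ** DATA_ADDR_BITS                           # 4096 bytes
single_word_instructions = PROGRAM_BYTES // INSTRUCTION_BYTES  # 4096
banks = DATA_BYTES // BANK_BYTES                           # 2
```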
All of this is explained in greater detail in the datasheet for this device, specifically in Section 5.
The way that program memory is addressed by the program counter (PC) enforces the 16-bit instruction word alignment by forcing the least significant bit of the PC to zero, which forces access in multiples of two bytes. Quoting from the datasheet:
The PC addresses bytes in the program memory. To prevent the PC from
becoming misaligned with word instructions, the Least Significant bit
of PCL is fixed to a value of ‘0’. The PC increments by 2 to address
sequential instructions in the program memory.
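The quoted behaviour can be modelled in a few lines of Python. This is a simplified sketch: it ignores branches, calls, and two-word instructions, and the function names are invented for the illustration.

```python
PC_BITS = 21
PC_MASK = (1 << PC_BITS) - 1


def next_pc(pc: int) -> int:
    """Advance to the next sequential instruction: add 2 and keep bit 0 at zero."""
    return ((pc + 2) & PC_MASK) & ~1


def word_index(pc: int) -> int:
    """Which 16-bit instruction word a byte address in program memory refers to."""
    return pc >> 1


pc = 0
trace = []
for _ in range(4):
    trace.append((pc, word_index(pc)))  # (byte address, instruction number)
    pc = next_pc(pc)
# trace == [(0, 0), (2, 1), (4, 2), (6, 3)]
```

Each instruction occupies two byte addresses in program memory, which may be what the drawing on the board was trying to convey.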
I suggest that you thoroughly read Section 5 of the linked datasheet and see if you have any remaining doubts. It contains a lot of detail, but it is well described even though it will take more than one reading to understand it completely.

How long does it take to set up an I/O controller on PCIe bus

Say I have an InfiniBand or similar PCIe device and a fast Intel Core CPU and I want to send e.g. 8 bytes of user data over the IB link. Say also that there is no device driver or other kernel: we're keeping this simple and just writing directly to the hardware. Finally, say that the IB hardware has previously been configured properly for the context, so it's just waiting for something to do.
Q: How many CPU cycles will it take the local CPU to tell the hardware where the data is and that it should start sending it?
More info: I want to get an estimate of the cost of using PCIe communication services compared to CPU-local services (e.g. using a coprocessor). What I am expecting is that there will be a number of writes to registers on the PCIe bus, for example setting up an address and length of a packet, and possibly some reads and writes of status and/or control registers. I expect each of these will take several hundred CPU cycles, so I would expect the overall setup to take on the order of 1000 to 2000 CPU cycles. Would I be right?
I am just looking for a ballpark answer...
Your ballpark number is correct.
If you want to send an 8-byte payload using an RDMA write, first you will write the request descriptor to the NIC using programmed I/O (PIO), and then the NIC will fetch the payload using a PCIe DMA read. I'd expect both the PIO and the DMA read to take between 200 and 500 nanoseconds each, although the PIO should be faster.
You can get rid of the DMA read and save some latency by putting the payload inside the request descriptor.
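To compare these latencies with the cycle counts in the question, they can be converted at an assumed core clock. The 3 GHz figure below is an assumption (the question only says "fast Intel Core CPU"), and the sketch gives both steps the same 200 to 500 ns range even though the PIO is expected to be the faster of the two.

```python
CLOCK_GHZ = 3.0  # assumed core clock, not stated in the question


def ns_to_cycles(nanoseconds: float, clock_ghz: float = CLOCK_GHZ) -> float:
    """One nanosecond at N GHz corresponds to N clock cycles."""
    return nanoseconds * clock_ghz


step_ns = (200, 500)  # latency range quoted for a single PIO write or DMA read

one_step = tuple(ns_to_cycles(ns) for ns in step_ns)  # (600.0, 1500.0) cycles
both_steps = tuple(2 * c for c in one_step)           # (1200.0, 3000.0) cycles
```

Under these assumptions, inlining the payload in the descriptor leaves roughly the single-step range, while descriptor plus DMA read lands near or somewhat above the 1000 to 2000 cycle estimate.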

32-bit PC, size of pointer

For 4G of RAM, there are 4 * 1024 * 1024 * 1024 * 8 = 2^(32+3) bits. My question is how a 32-bit PC can access 4G of memory. What I can think of is "a byte is the storage unit; one cannot store data in a single bit". Is this correct?
Another question is: in such a PC, does a pointer always have a size of 32 bits? It seems reasonable to me, because we have 2^32 storage units to store the data. But in this answer and the next, with their remarks, this is said to be wrong. If it is wrong, why?
Individual bits are accessed by reading the address of the byte containing it, modifying the byte and writing back if necessary.
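A minimal Python sketch of that read-modify-write sequence, using a `bytearray` as a stand-in for byte-addressable memory (the helper names and the bit numbering, bit 0 being the least significant bit of byte 0, are assumptions for the example):

```python
memory = bytearray(4)  # four bytes of byte-addressable "RAM", all zero


def read_bit(mem: bytearray, bit_index: int) -> int:
    """Read the byte that contains the bit, then isolate the bit."""
    byte_addr, bit_pos = divmod(bit_index, 8)
    return (mem[byte_addr] >> bit_pos) & 1


def write_bit(mem: bytearray, bit_index: int, value: int) -> None:
    """Read the containing byte, modify one bit, write the byte back."""
    byte_addr, bit_pos = divmod(bit_index, 8)
    byte = mem[byte_addr]                # read
    if value:
        byte |= 1 << bit_pos             # set the bit
    else:
        byte &= ~(1 << bit_pos) & 0xFF   # clear the bit
    mem[byte_addr] = byte                # write back


write_bit(memory, 10, 1)         # bit 10 lives in byte 1, bit position 2
bit_value = read_bit(memory, 10)
```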
In some architectures the smallest addressable unit is a double word, in which case no single byte can be accessed "as is". Theoretically one could design an architecture that would address 16 GB of memory with 32 bits of unique addresses. Similar things happened years ago, when the addressable units of a hard drive were limited to a mere 2^28 sectors of 512 bytes or so.
It's not completely wrong to say that PCs have 32-bit pointers. That's just somewhat old information, as the newer models are internally 64-bit systems and can access, depending on the OS, up to 2^48 bytes of memory. Currently most existing PCs are 32-bit and nothing can be done about it.
Well, StuartLC reminded me about paging. Even in current 32-bit systems, one can use 48 bits of addressing using the old-style segment registers. (Can't remember if there was a restriction of the segment registers' low three bits being zero...) But anyway, that would allow 2^45 bytes of individual addresses, out of which just a small fraction could ever be in main memory simultaneously. If an OS supporting that addressing mode were developed, then probably the full 64 bits would be allocated for the pointer, just like it is today with 64-bit processors.
My question is how a 32-bit PC can access 4G of memory
You may be confusing address bus (addressable memory) and the size of the processor registers. This superuser post details the differences.
Paging is a technique commonly used to allow memory to be addressed beyond the size of the OS's capabilities, e.g. see PAE
does a pointer always have size 32 bit
No, not necessarily - e.g. on 16 bit DOS and Windows, and also pointers could be relative to a segment.
Can one not store data in a single bit?
Yes, you can, e.g. in C, bit packing in structs can be done, albeit at the cost of performance and portability.
Today performance is more important, and compilers will typically try and align data to its machine word size, for performance reasons.
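The trade-off between bit packing and alignment can be illustrated with Python's `ctypes`, which mirrors C struct layout on the host platform. The structure and field names here are hypothetical, and the exact size of the aligned structure depends on the platform's ABI.

```python
import ctypes


class PackedFlags(ctypes.Structure):
    """Three bit fields sharing one byte, like a C struct with bit-fields."""
    _fields_ = [
        ("ready", ctypes.c_uint8, 1),  # 1 bit
        ("mode", ctypes.c_uint8, 3),   # 3 bits
        ("count", ctypes.c_uint8, 4),  # 4 bits
    ]


class AlignedRecord(ctypes.Structure):
    """A byte followed by a 32-bit word; padding is normally inserted
    so that the word starts on its natural alignment boundary."""
    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("value", ctypes.c_uint32),
    ]


flags = PackedFlags(ready=1, mode=5, count=9)
packed_size = ctypes.sizeof(PackedFlags)     # 1 byte for all three fields
aligned_size = ctypes.sizeof(AlignedRecord)  # typically 8 rather than 5, due to padding
```

The packed layout saves space, but every field access needs shifting and masking, whereas the padded layout trades a few wasted bytes for plain aligned loads and stores.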

Memory data bus decomposition

Say we have a 32-bit wide memory bus to a shared memory in a network switch. Now I want to make the storing of packets as parallel as possible. I put a DMA engine after each input port, so the switch controller will not be blocked until a packet is stored completely. Assume one packet from each input port is 8 bits. Could the memory bus then be decomposed into four 8-bit sub-buses, so that each DMA engine could write its 8-bit wide packet to the corresponding memory address in parallel (ignoring conflicts for now)?
Sorry for such a weird question, and for not quite knowing about the computer organization and architecture.
