Reading a 32-bit ADC result with a 16-bit SPI interface - signal-processing

I need to measure a very small ripple on a power supply output, about 0.0017% of the output voltage, so I need a high-resolution ADC to do this. I am using the F2837xD board, which offers at most 12 bits of resolution for single-ended signals. Because of my differential-mode configuration I cannot use the 16-bit differential ADCs, and even if I could, not all of the bits would be usable and the resolution still wouldn't be good enough.
I plan to use an external ADC with at least 20 bits of resolution and to communicate with it over the SPI interface that is on the board. The issue is that the maximum word length of the SPI register is 16 bits.
Is there a way to accurately read 20 or more bits with a 16-bit SPI interface, whether in software or with additional hardware after the external ADC? How much of the accuracy gained from using an external high-resolution ADC would actually be retained in these circumstances?
https://www.ti.com/lit/ug/spruhm8i/spruhm8i.pdf?ts=1631716470368&ref_url=https%253A%252F%252Fwww.ti.com%252Ftool%252FTMDSDOCK28379D%253FkeyMatch%253DTMDSDOCK28379D pg 2223 for the SPI interface.
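What I have in mind is something like the sketch below, with placeholder SPI calls (not the actual C2000 driverlib API), assuming the external ADC clocks its result out MSB-first across two back-to-back 16-bit frames while chip-select stays low:

#include <stdint.h>

/* Placeholder for whatever blocking 16-bit full-duplex transfer the SPI driver provides. */
extern uint16_t spi_xfer16(uint16_t tx);

/* Read a 24-bit conversion result as two 16-bit SPI frames.
 * Assumed framing: the first frame carries bits 23..8, the second frame
 * carries bits 7..0 in its upper byte followed by don't-care padding. */
static int32_t adc_read24(void)
{
    uint16_t hi  = spi_xfer16(0x0000);               /* bits 23..8              */
    uint16_t lo  = spi_xfer16(0x0000);               /* bits 7..0, then padding */
    uint32_t raw = ((uint32_t)hi << 8) | (lo >> 8);

    if (raw & 0x800000UL)                            /* sign-extend bipolar codes */
        raw |= 0xFF000000UL;
    return (int32_t)raw;
}

My understanding is that the stitching itself should not cost resolution; the question is how much of it survives noise, layout and the voltage reference.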

Related

Sending 256 bits via spi from MCU through FPGA to the DUT

I have designed a program that receives 32 bits from the MCU (via the SPI protocol); the FPGA takes these 32 bits, stores them in a 32-bit register, and then sends them to the DUT.
Now I am wondering: are there any limitations when we use registers?
For example, I now need to send and receive 256 bits of data between the MCU and the FPGA (via the SPI protocol).
Can I simply store them in a single 256-bit register, or should I split them across several smaller registers?
Also, does the width always need to be a multiple of 32, 64, 128, and so on, or can I receive, for example, only 40 bits from the MCU?
So mainly, I want to know what kind of limitations we have when we receive data over SPI and store it in registers in the FPGA.
Thank you in advance.
The SPI protocol commonly works with 8-bit values, but it is not limited to that in general. So, yes, you can send and receive values 256 bits wide.
However, specific implementations in hardware or software put additional limits on the width. Common SPI hardware blocks in MCUs use a fixed word width, for example 8 bits.
In that case you transfer multiple words to move the 256 bits. The synchronization could be done via the select signal, for example.
If you have a width that is not an integer multiple of the transferred word, you can transfer more and ignore the padding bits. For example, if your MCU's SPI hardware can only transfer 32-bit words and your requirement is 40 bits, transfer two 32-bit words and ignore 24 of the 64 bits. Which bits are ignored is up to your specification.
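As a rough illustration (the transfer call is a placeholder, not any specific vendor API), packing a 40-bit payload into two 32-bit SPI words and discarding the padding could look like this on the MCU side:

#include <stdint.h>

/* Placeholder for the MCU driver's blocking 32-bit SPI transfer. */
extern void spi_xfer32(uint32_t tx);

/* Send a 40-bit value as two 32-bit frames: the first frame carries bits 39..8,
 * the second carries bits 7..0 in its top byte followed by 24 padding bits,
 * which the FPGA ignores by convention. */
static void send40(uint64_t value)
{
    spi_xfer32((uint32_t)(value >> 8));               /* bits 39..8          */
    spi_xfer32((uint32_t)(value & 0xFFu) << 24);      /* bits 7..0 + padding */
}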
How you realize this in your FPGA is also up to you, and it depends on your "protocol layer" above SPI. If your DUT needs to be accessed with 256 bits, the FPGA will use 256-bit registers. But whether you organize them as multiple registers of the transferred width or as a single register is your choice.
Bottom line: The SPI protocol is a quite simple protocol, mainly a clocked synchronous serial transfer with select lines, potentially full-duplex. You can use it with great freedom to build your own protocol on top of it.

ESP8266 programming without SDK

There are limitations in the ESP SDK libraries (which are not public), for example the length of received packets when in promiscuous mode (112 bytes max).
I tried reaching out to them to get some input and directions, but they seem to be replying nonsense.
Is it possible to program the chip without the SDK, and thus make my own SDK and forget about their limitations?
The processor core on the esp8266 is an 'Xtensa'. The processor core, or let's just call it the processor, is what we program with C, C++ or assembler. The processor's instruction set is made public by the company that designed it (Tensilica, now part of Cadence), and once you have the instruction set, you can program it directly or build a compiler and have complete freedom with the processor.
The processor core is not the complete product for us end consumers. Companies like Espressif buy the intellectual-property rights to a processor-core design and build an end product by putting peripherals like SPI, I2C, UART and, in the esp8266's case, the wifi transceiver, around the processor core.
These peripherals are controlled digitally and report back to the processor digitally, so the processor can interface with them, but their action can be either digital or analog. For UART, SPI, I2C etc., Espressif has provided us with a datasheet that describes all the registers that can be used to control each peripheral. It is something like: write what you want to transfer to memory address X, then set bit Y at memory address Z to begin the transfer. For SPI, for example, there are registers to control speed, polarity, phase etc. for a transfer. Once you know how to control a peripheral at that low level, you can write high-level drivers; Espressif does provide those too, but you can write your own.
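To make the "write address X, then set bit Y at address Z" idea concrete, here is a generic sketch of register-level peripheral control; the addresses and bit position are invented for illustration and do not correspond to real esp8266 SPI registers:

#include <stdint.h>

/* Hypothetical memory-mapped registers and start bit, for illustration only. */
#define SPI_DATA_REG    (*(volatile uint32_t *)0x60000100u)  /* "address X" */
#define SPI_CTRL_REG    (*(volatile uint32_t *)0x60000104u)  /* "address Z" */
#define SPI_CTRL_START  (1u << 18)                           /* "bit Y"     */

static void spi_send_word(uint32_t word)
{
    SPI_DATA_REG = word;                     /* load the data to transfer     */
    SPI_CTRL_REG |= SPI_CTRL_START;          /* kick off the transfer         */
    while (SPI_CTRL_REG & SPI_CTRL_START)    /* wait for hardware to clear it */
        ;
}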
For Wifi, Espressif hasn't released how the peripheral can be interfaced with, so we have to depend on the compiled binaries that Espressif ships. You can use 'objdump -t' on 'lib/lib80211.a' to get at least the names of the routines that the Wifi driver provides. You can call these routines from C or assembler code and go a little lower than Espressif intended, but to go any lower would require reverse engineering, manually understanding the low-level code in the compiled driver, and nobody is going to take on that risk and time drain.

How do I calculate memory bandwidth on a given (Linux) system, from the shell?

I want to write a shell script/command which uses commonly-available binaries, the /sys filesystem or other facilities to calculate the theoretical maximum bandwidth of the RAM available on a given machine.
Notes:
I don't care about latency, just bandwidth.
I'm not interested in the effects of caching (e.g. the CPU's last-level cache), but in the bandwidth of reading from RAM proper.
If it helps, you may assume a "vanilla" Intel platform, and that all memory DIMMs are identical; but I would rather you not make this assumption.
If it helps, you may rely on root privileges (e.g. using sudo)
@einpoklum, you should have a look at the Performance Counter Monitor available at https://github.com/opcm/pcm. It will give you the measurements that you need. I do not know if it supports kernel 2.6.32.
Alternatively you should also check Intel's EMON tool which promises support for kernels as far back as 2.6.32. The user guide is listed at https://software.intel.com/en-us/download/emon-user-guide, which implies that it is available for download somewhere on Intel's software forums.
I'm not aware of any standalone tool that does it, but for Intel chips only, if you know the "ARK URL" for the chip, you could get the maximum bandwidth using a combination of a tool to query ARK, like curl, and something to parse the returned HTML, like xmllint --html --xpath.
For example, for my i7-6700HQ, the following works:
curl -s 'https://ark.intel.com/products/88967/Intel-Core-i7-6700HQ-Processor-6M-Cache-up-to-3_50-GHz' | \
xmllint --html --xpath '//li[@class="MaxMemoryBandwidth"]/span[@class="value"]/span/text()' - 2>/dev/null
This returns 34.1 GB/s which is the maximum theoretical bandwidth of my chip.
The primary difficulty is determining the ARK URL, which doesn't correspond in an obvious way to the CPU brand string. One solution would be to find the CPU model on an index page like this one, and follow the link.
This gives you the maximum theoretical bandwidth, which can be calculated as (number of memory channels) x (transfer width) x (data rate). The data rate is the number of transfers per unit time, and is usually the figure given in the name of the memory type, e.g., DDR4-2133 has a data rate of 2133 million transfers per second. Alternatively, you can calculate it as the product of the bus speed (1067 MHz in this case) and the data rate multiplier (2 for DDR technologies).
For my CPU, this calculation gives 2 memory channels * 8 bytes/transfer * 2133 million transfers/second = 34.128 GB/s, consistent with the ARK figure.
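A trivial sketch of that arithmetic with my system's numbers plugged in (the constants are just the example figures above):

#include <stdio.h>

int main(void)
{
    /* theoretical peak = channels x bytes per transfer x transfers per second */
    const double channels       = 2;        /* dual-channel  */
    const double bytes_per_xfer = 8;        /* 64-bit DDR bus */
    const double xfers_per_sec  = 2133e6;   /* DDR4-2133     */

    printf("%.3f GB/s\n", channels * bytes_per_xfer * xfers_per_sec / 1e9);  /* 34.128 GB/s */
    return 0;
}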
Note that theoretical maximum as reported by ARK might be lower or higher than the theoretical maximum on your particular system for various reasons, including:
Fewer memory channels populated than the maximum number of channels. For example, if I only populated one channel on my dual channel system, theoretical bandwidth would be cut in half.
Not using the maximum speed supported RAM. My CPU supports several RAM types (DDR4-2133, LPDDR3-1866, DDR3L-1600) with varying speeds. The ARK figure assumes you use the fastest possible supported RAM, which is true in my case, but may not be true on other systems.
Over or under-clocking of the memory bus, relative to the nominal speed.
Once you get the correct theoretical figure, you won't actually reach this figure in practice, due to various factors including the following:
Inability to saturate the memory interface from one or more cores due to limited concurrency for outstanding requests, as described in the section "Latency Bound Platforms" in this answer.
Hidden doubling of bandwidth consumption implied by writes, which need to read the cache line before writing it.
Various low-level factors relating the DRAM interface that prevents 100% utilization such as the cost to open pages, the read/write turnaround time, refresh cycles, and so on.
Still, using enough cores and non-temporal stores, you can often get very close to the theoretical bandwidth, often 90% or more.

x86 protected mode memory management

I'm a newbie to the x86 CPU.
I have read the material about protected-mode memory management in x86.
The material is the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1.
I believe I understand the many steps the CPU goes through when accessing memory:
the selector register is an index into the segment descriptor table, the descriptor-table entry gives the base of the segment, and the linear address is the sum of the segment base and the 32-bit offset.
But what I am confused about is this: it seems to me that the CPU cannot know which memory address it will access until all of the steps above are finished. If the CPU wants to access a specific memory address, it must know the selector value and the offset. But my question is, how does it know them? Isn't the only information the CPU has the memory address it wants to access?
How does the CPU already know the inputs (selector value, offset) when it only knows the output (memory address)?
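To show where my mental model is, here is a toy sketch of the translation as I understand it, simplified and ignoring limits, privilege checks and paging (the descriptor layout is a stand-in, not the real 8-byte format):

#include <stdint.h>

/* Simplified stand-in for a segment descriptor: only the base is modelled. */
typedef struct { uint32_t base; } descriptor_t;

descriptor_t gdt[8192];                       /* descriptor table in memory */

/* The instruction supplies the INPUTS: a selector (e.g. the value held in DS)
 * and a 32-bit offset (e.g. computed from [EBX+8]); the linear address is
 * the OUTPUT the CPU derives from them. */
uint32_t linear_address(uint16_t selector, uint32_t offset)
{
    uint16_t index = selector >> 3;           /* bits 15..3 index the table  */
    uint32_t base  = gdt[index].base;         /* in hardware this base is cached
                                                 in the hidden part of the
                                                 segment register             */
    return base + offset;                     /* limit/privilege checks omitted */
}

So my confusion is: the selector and offset come from the instruction and the registers it names, yet the only thing I "see" is the final address.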
... by:
microprocessor Real Time Clocks or Timer Chips, whose periodic function is called the 'clock signal';
the Memory Controller Hub;
the Advanced Configuration and Power Interface (ACPI);
ROM, a non-volatile memory inside chips (the real-mode memory map).
The Local Descriptor Table (LDT) is a memory table used in the x86 architecture in protected mode and containing memory segment descriptors: start in linear memory, size, executability, writability, access privilege, actual presence in memory, etc.
The interrupt descriptor table (IDT) is a data structure used by the x86 architecture to implement an interrupt vector table. The IDT is used by the processor to determine the correct response to interrupts and exceptions.
The Intel 8259 is a Programmable Interrupt Controller (PIC) designed for the Intel 8085 and Intel 8086 microprocessors. The initial part was the 8259; a later 'A' suffix version was upward compatible and usable with the 8086 or 8088 processor. The 8259 combines multiple interrupt input sources into a single interrupt output to the host microprocessor, extending the interrupt levels available in a system beyond the one or two levels found on the processor chip.
You are also missing real mode.
Look also at the DOS Protected Mode Interface & the Virtual Control Program Interface.
How does the timer chip control the reset line of the CPU?
See also OSCILLATOR CIRCUIT WITH SIGNAL BUFFERING AND START-UP CIRCUITRY from Google Patents,
and the real-time clock.
The CPU 'starts' executing code stored in ROM on the motherboard at address FFFF0h.
The routine tests the central hardware and searches for the video ROM.
...
So it is not really the CPU that 'starts'; it is the power supply line that 'starts' things.
The power supply signal is sent to the motherboard, where it is received by the processor timer chip that controls the reset line to the processor.
How does the BIOS detect RAM? See also serial presence detect and power-on self-test (POST).
BIOS is a 16-bit program running in real mode
The BIOS begins its POST when the CPU is reset. The first memory location the CPU tries to execute is known as the reset vector. In the case of a hard reboot, the northbridge will direct this code fetch (request) to the BIOS located on the system flash memory. For a warm boot, the BIOS will be located in the proper place in RAM and the northbridge will direct the reset vector call to the RAM
What is this reset vector?
The reset vector is the default location a central processing unit will go to find the first instruction it will execute after a reset.
The reset vector is a pointer or address, where the CPU should always begin as soon as it is able to execute instructions. The address is in a section of non-volatile memory initialized to contain instructions to start the operation of the CPU, as the first step in the process of booting the system containing the CPU.
The reset vector for the 8086 processor is at physical address FFFF0h (16 bytes below 1 MB). The value of the CS register at reset is FFFFh and the value of the IP register at reset is 0000h to form the segmented address FFFFh:0000h, which maps to physical address FFFF0h.
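The segmented-to-physical arithmetic there is just a shift and an add; as a quick check:

#include <stdint.h>
#include <stdio.h>

/* Real-mode address formation: physical = segment * 16 + offset. */
static uint32_t real_mode_phys(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

int main(void)
{
    printf("%05Xh\n", (unsigned)real_mode_phys(0xFFFF, 0x0000));  /* prints FFFF0h */
    return 0;
}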
About northbridge
A northbridge or host bridge is one of the two chips in the core logic chipset architecture on a PC motherboard, the other being the southbridge. Unlike the southbridge, the northbridge is connected directly to the CPU via the front-side bus (FSB).
Sources:
"80386 Programmer's Reference Manual" (PDF). Intel. 1990. Section 10.1 Processor State After Reset
"80386 Programmer's Reference Manual" (PDF). Intel. 1990. Section 10.2.3 First Instruction,

Realistic data rate over PCI bus using DMA?

What is the realistic data transfer rate over a 32-bit/33 MHz PCI bus? We need to transfer 32K 32-bit samples from a PCI card to an Intel CPU running Windows. I would think the block would transfer in 1 msec, but it is taking 40 msec. The PCI board has a PLX PCI-9056. We are accessing card memory through a virtual address, but our CPU is bricked-out, which makes me think the data rate is being held up by CPU involvement. If we go to DMA, will we transfer in closer to 1 msec? The reason I have my doubts is that the PXI SDK User Manual states:
"BAR space memory read/write is generally slow in relative terms. Reads are typically only 2-4MB/s."
You should check whether you can enable burst mode and continuous burst, so that multiple DWords can be transmitted without new address cycles. This makes things much faster. The PLX PCI9056 supports this option, but it must be set up accordingly by software.
We have data rates up to 90 MB/s with DMA Master Transfer on our custom designed frame grabber card.
