Difference in operations of Bus Mastering and Third Party DMA


After some digging, my understanding of DMA is the following:
1) A bus-mastering DMA controller resides on the device card.
a) It does cycle stealing and takes control of the system bus to perform transfers. Please correct me if this is wrong.
b) Hard-disk controllers and NICs are examples.
c) I know of the BR (bus request) and BG (bus grant) signals used to take control of the system bus to access memory.
2) A third-party DMA controller stays on the system board.
a) It performs burst-mode transfers (I am not very sure about this, though).
b) It uses channels as scatter-gather lists for transfers.
c) I do not know which devices use it. Please help.
d) I have no idea how the DMA controller accesses memory in this case.
Another important thing I want to understand: what requirement makes a device maker choose 1) or 2)?
Also, please let me know whether my understanding as described above is correct.


ESP32: Best way to store data frequently?

I'm developing a C++ application on the ESP32-DevKitC board that reads acceleration from an accelerometer. The goal is to store the accelerometer data until storage is full, then send all the data over WiFi and start over. The micro also goes into deep-sleep mode whenever possible.
I'm currently using the ESP32 NVS library, which is very well documented and pretty easy to use. The downside is that the library uses flash memory, so a lot of writes will end up wearing out the flash.
I know that Espressif also offers other storage libraries (FAT, SPIFFS, etc.) but, as far as I know (correct me if I'm wrong), they all use flash.
Is there any way to do what I want without using flash storage?
Clarifications
Using flash memory is not the problem in itself; degrading it is.
Storage has to be non-volatile, or at least must not be erased when the micro goes into deep-sleep mode.
I'm not using any Arduino library.

That's a great question that I wish more people would ask.
ESP32s use NOR flash storage, which is usually rated for between 10,000 and 100,000 write cycles (100,000 seems to be the standard these days). Flash can't write single bytes; instead it writes a "page" of bytes, which I believe is 256 bytes. So each 256-byte page is rated for at least 100,000 cycles. When a device is rated for 100,000 cycles it's likely to be usable for at least 10 times that, but the manufacturer is not going to make any promises beyond the 100,000.
SPIFFS (and LittleFS, now used on the ESP8266 Arduino Core) perform "wear leveling", to minimize the number of times a particular page is written. So if you modify the same section of a file repeatedly, it will automatically be written to different pages of flash. FAT is not designed to work well with flash storage; I would avoid it.
Whether SPIFFS with wear leveling will be adequate for your needs depends on your needed lifetime of the device versus how much data you'll be writing and how frequently.
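A back-of-the-envelope estimate can help with that judgment. The sketch below is only a rough model: it assumes ideal wear leveling (writes spread perfectly over the whole partition) and ignores filesystem overhead and write amplification, so real lifetimes will be shorter. The function name, partition size, sample size and sample rate are all made-up example values, not figures from the question.

```python
def flash_lifetime_days(partition_bytes, cycles_per_block, bytes_per_day):
    """Days until every block has used its rated cycles, assuming ideal wear leveling."""
    total_writable_bytes = partition_bytes * cycles_per_block
    return total_writable_bytes / bytes_per_day


# Example numbers (assumptions, not measurements)
partition = 1 * 1024 * 1024   # 1 MiB flash partition
cycles = 100_000              # rated cycles quoted in the answer above
sample_bytes = 6              # 3 axes x 16-bit accelerometer sample
samples_per_second = 100
bytes_per_day = sample_bytes * samples_per_second * 86_400

days = flash_lifetime_days(partition, cycles, bytes_per_day)
print(f"{days:.0f} days (about {days / 365:.1f} years)")
```

Plugging in your own sample rate and partition size gives an upper bound; if that bound is already close to the lifetime you need, flash is probably the wrong place for the data.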
NVS may perform some level of wear leveling, to an extent I'm unsure about. In a forum post, two Espressif employees both confirm that NVS does some form of wear leveling. NVS is best used to persist things like configuration information that doesn't change frequently. It's not a great choice for storing information that's updated often.
You mentioned that the data just needs to survive deep sleep. If that's the case, your best option (if it's large enough) is to use the ESP32's RTC static RAM. This chunk of memory will survive restarts and deep sleep mode, but will lose its state if power is interrupted. It's real RAM so you won't wear it out by writing to it frequently, and it doesn't cost a lot of energy to write to. The catch is there's only 8KB of it.
If the 8KB of RTC RAM isn't enough and you're writing too much data too frequently to trust that SPIFFS will be okay, your best bet would be an SD card. The ESP32 can talk to an SD card adapter. SD cards use NAND flash behind a built-in controller that does its own wear leveling across a much larger capacity, so in practice they can absorb far more writes than the ESP32's small NOR flash (which is why these kinds of cards are usable for filesystems in devices like Raspberry Pis).
Writing to flash also takes much more energy than writing to regular RAM. If your device is going to be battery powered, the RTC RAM is also a better choice than SPIFFS or an SD card from a power savings perspective.
Finally, if you use the RTC RAM I'd recommend starting to send its contents over WiFi before it's full, as bringing up WiFi and transmitting the data could easily take long enough that you run out of space for some samples. Using it as a ring buffer and starting the transmit process when you hit a high-water mark, rather than when the buffer is full, would probably be your best bet.
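The ring-buffer idea can be sketched as follows. This is a Python model of the logic only, not ESP32 code: the class name, capacity and high-water values are hypothetical, and on the real device the storage would be a fixed-size array placed in RTC memory (for example with ESP-IDF's RTC_DATA_ATTR) so that it survives deep sleep.

```python
class SampleRing:
    """Fixed-capacity ring buffer that signals when a high-water mark is reached."""

    def __init__(self, capacity, high_water):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.high_water = high_water
        self.head = 0    # next slot to write
        self.tail = 0    # next slot to read
        self.count = 0

    def push(self, sample):
        """Store a sample; return True when it is time to start transmitting."""
        if self.count == self.capacity:
            # Buffer full: drop the oldest sample to make room
            self.tail = (self.tail + 1) % self.capacity
            self.count -= 1
        self.buf[self.head] = sample
        self.head = (self.head + 1) % self.capacity
        self.count += 1
        return self.count >= self.high_water

    def pop(self):
        """Remove and return the oldest sample, or None if the buffer is empty."""
        if self.count == 0:
            return None
        sample = self.buf[self.tail]
        self.tail = (self.tail + 1) % self.capacity
        self.count -= 1
        return sample


ring = SampleRing(capacity=8, high_water=6)
flags = [ring.push(i) for i in range(6)]
print(flags)
```

With these example values the sixth push reports that the mark was reached, which leaves two free slots for samples that arrive while WiFi is coming up. How much headroom is really needed depends on the sample rate and how long the connection takes.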

I know I'm late with this answer, but you can buy ESP32 modules with 4-8 MB of external RAM. External RAM is really fast (at least much faster than the flash; it uses an SPI interface to communicate) and you can fit a lot of sensor readings in there.
I'm using an ESP32_WROVER_E module with 8 MB of external RAM (4 MB is usable with normal function calls) and 16 MB of flash.
The module I'm using is listed on TME's site.

ESP8266 programming without SDK

There are limitations in the ESP SDK libraries (which are not public), for example the maximum length of a received packet (112 bytes) in promiscuous mode.
I tried contacting Espressif for input and directions, but their replies have not made sense.
Is it possible to program the chip without the SDK, i.e. write my own SDK and be free of those limitations?

The processor core on the ESP8266 is an 'xtensa'. The processor core, or let's just call it the processor, is what we program in C, C++ or assembler. The processor's instruction set is made public by the company behind it (Tensilica, now part of Cadence), and once you have the instruction set you can program the processor directly or write a compiler and have complete freedom with it.
The processor core is not a complete product for us end consumers. Companies like Espressif license the core's design and build an end product by putting peripherals such as SPI, I2C, UART and, in the ESP8266's case, the WiFi transceiver around the processor core.
These peripherals are controlled digitally, and output to the processor digitally, so the processor can interface with them, but their action can be either digital or analog. For UART, SPI, I2C etc., Espressif has provided a datasheet that documents all the registers that can be used to control each peripheral. It's something like: write what you want to transfer to memory address X, then set bit Y at memory address Z to begin the transfer. For SPI, for example, there are registers to control speed, polarity, phase etc. for a transfer. Once you know how to control a peripheral at the low level, you can write high-level drivers. Espressif provides those too, but you can write your own.
For WiFi, Espressif hasn't released how the peripheral can be interfaced with, so we have to depend on the compiled binaries that Espressif ships. You can use 'objdump -t' on 'lib/lib80211.a' to get at least the names of the routines that the WiFi driver provides. You can call these routines from C or assembler code and go a little lower than Espressif intended, but going any lower would require reverse engineering, i.e. manually working out the low-level code in the compiled driver, and nobody's going to take on that risk and time drain.
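To make the "write to X, then set bit Y at Z" pattern concrete, here is a small Python simulation. Everything in it is hypothetical: the addresses, the bit position and the FakePeripheral class are invented for illustration and do not come from any Espressif datasheet. On real hardware the two writes would be volatile pointer accesses in C.

```python
# Hypothetical register map, invented for this illustration only
TX_DATA_REG = 0x1000       # "X": holds the value to transfer
CTRL_REG = 0x1004          # "Z": control register
CTRL_START_BIT = 1 << 0    # "Y": setting this bit starts the transfer


class FakePeripheral:
    """Stand-in for memory-mapped hardware; records every byte it 'sends'."""

    def __init__(self):
        self.regs = {TX_DATA_REG: 0, CTRL_REG: 0}
        self.sent = []

    def write(self, addr, value):
        self.regs[addr] = value & 0xFFFFFFFF
        if addr == CTRL_REG and value & CTRL_START_BIT:
            self.sent.append(self.regs[TX_DATA_REG] & 0xFF)
            # This model clears the start bit once the transfer is done
            self.regs[CTRL_REG] &= ~CTRL_START_BIT

    def read(self, addr):
        return self.regs[addr]


def send_byte(dev, byte):
    """A tiny 'driver': load the data register, then set the start bit."""
    dev.write(TX_DATA_REG, byte)
    dev.write(CTRL_REG, dev.read(CTRL_REG) | CTRL_START_BIT)


dev = FakePeripheral()
for b in b"hi":
    send_byte(dev, b)
print(dev.sent)
```

The send_byte function is the whole "driver" here; a real one would also poll a status bit or wait for an interrupt before loading the next byte.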

What does it mean to have a dual channel in an Axi GPIO?

I am learning about MicroBlaze processors and I don't really understand what "dual channel" means when using the GPIO functions.

This simply means that you have a second, independent GPIO on the same peripheral.
It's like having 2 different GPIO peripherals, but without the burden of allocating another one (with the associated duplication of bus attachment logic, etc.).
The Xilinx GPIO peripherals have always been like this, from the OPB bus ones, through the PLB bus ones, to the newest AXI bus peripherals.
You could have answered this yourself by reading the peripheral documentation. (Hint: in the chapter "Register Space", page 10, you see a second set of registers, named "GPIO2_*", which are available only when "dual channel" is enabled.)
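As a rough illustration of that second register set, here is a small Python model. The offsets (0x00, 0x04, 0x08, 0x0C) and the meaning of the TRI bits (1 = input, 0 = output) are as I recall them from the AXI GPIO documentation, so check them against the version you are using. The AxiGpioModel class itself is hypothetical and only mimics the register file.

```python
# Register offsets as I recall them from the AXI GPIO documentation
GPIO_DATA, GPIO_TRI = 0x00, 0x04       # channel 1
GPIO2_DATA, GPIO2_TRI = 0x08, 0x0C     # channel 2, present only in dual-channel mode


class AxiGpioModel:
    """Toy register file: each channel has its own data and direction register."""

    def __init__(self, dual_channel):
        # In this model TRI starts as all inputs (1 = input, 0 = output)
        self.regs = {GPIO_DATA: 0, GPIO_TRI: 0xFFFFFFFF}
        if dual_channel:
            self.regs.update({GPIO2_DATA: 0, GPIO2_TRI: 0xFFFFFFFF})

    def write(self, offset, value):
        if offset not in self.regs:
            raise ValueError(f"register 0x{offset:02X} not present")
        self.regs[offset] = value & 0xFFFFFFFF

    def read(self, offset):
        return self.regs[offset]


gpio = AxiGpioModel(dual_channel=True)
gpio.write(GPIO_TRI, 0x00000000)    # channel 1: all pins are outputs (e.g. LEDs)
gpio.write(GPIO_DATA, 0xA5)         # drive a pattern on channel 1
# Channel 2 is untouched and still all inputs (e.g. switches)
print(hex(gpio.read(GPIO_DATA)), hex(gpio.read(GPIO2_TRI)))
```

In the Xilinx standalone driver the same split shows up as the channel argument (1 or 2) of calls such as XGpio_DiscreteWrite().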

Opening a camera in multiple programs in OpenCV

How can I open a single webcam in multiple programs written with OpenCV simultaneously? By the way, I have attached 3 webcams and all of them work fine in any single OpenCV program, so why can't two programs use the same one simultaneously?
Is this a restriction, or is there a workaround?

Yes, this restriction is intentional
Why? The conceptual view is related to the hardware control layer. The operating system assumes there are some peripherals that can be used on demand, but it keeps their context of use non-shareable.
As an example, take a USB mouse. While it can be used within several processes, the architects reasoned that one mouse should not feed events into more than one (window-)framed context (yes, right, a.k.a. a process).
Some other peripherals are both an event sensor and an event collector. USB cameras, for example, can receive and process signals from the operating system to re-adjust their physical state (pan-tilt-zoom, for example).
This makes the 1:1 relation assumption more obvious, even though we may sometimes wish it were not hardcoded. On the other hand, what would the poor device do if one process instructed it to go left and another to go right at the same time?
Similarly, who would be happy if one USB mouse sent its motion and other MMI-interaction events to all currently running processes? At the very least, drag-and-drop UI navigation would become a funny lottery.
Yes, there is a workaround
The simplest "just-enough" scenario includes aCameraControllerPROCESS, which keeps the 1:1 relation with the USB device and is responsible for visual data acquisition. Here nanoseconds matter, so do not spend any CPU_CLK on anything other than moving bytes into a buffer. All other processing should be left to aVisualDataViewConsumerPROCESS, where OpenCV may (and will) spend tens or hundreds of microseconds on its own.
In addition, the controller process has a communication / service-provisioning part, which lets other distributed processes access the acquired visual data in parallel (in a non-blocking, concurrent manner, which is a must for distributed soft-real-time processing).
If one's architecture requires more features or alternatives, this layered approach allows one to add them while keeping both the control and performance overheads within acceptable envelopes.
A good implementation can remain quite smart, fast, low-latency and slim (resource-wise) if one uses ZeroMQ with its zero-copy inproc: transport class. One gives up a little of the almost-zero latency as the cost of the trick, but gains the immense power of a robust, massively distributable design that may scale out beyond one, two, a dozen, thousands ... you name it ... of hosts for free.
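The single-owner, many-consumers layout described above can be sketched with the Python standard library. This is a simplification under stated assumptions: it uses threads and in-process queues as a stand-in for separate processes and a real transport such as ZeroMQ, and a counter stands in for the camera so the example runs without hardware. FrameBroadcaster and the viewer names are hypothetical.

```python
import queue
import threading


class FrameBroadcaster:
    """One owner reads the device; any number of consumers receive each frame."""

    def __init__(self, read_frame):
        self.read_frame = read_frame    # callable that returns the next frame
        self.subscribers = []
        self.lock = threading.Lock()

    def subscribe(self, maxsize=4):
        q = queue.Queue(maxsize=maxsize)
        with self.lock:
            self.subscribers.append(q)
        return q

    def run(self, n_frames):
        for _ in range(n_frames):
            frame = self.read_frame()
            with self.lock:
                targets = list(self.subscribers)
            for q in targets:
                try:
                    q.put_nowait(frame)    # never block the capture loop
                except queue.Full:
                    pass                   # slow consumer: it misses this frame


# Stand-in for a camera: a counter instead of a real capture call
counter = iter(range(1000))
broadcaster = FrameBroadcaster(lambda: next(counter))
viewer_a = broadcaster.subscribe()
viewer_b = broadcaster.subscribe()

worker = threading.Thread(target=broadcaster.run, args=(3,))
worker.start()
worker.join()

frames_a = [viewer_a.get() for _ in range(3)]
frames_b = [viewer_b.get() for _ in range(3)]
print(frames_a, frames_b)
```

The capture loop only reads and hands frames off; a consumer that falls behind loses frames instead of stalling the camera owner. With a real camera the read_frame callable would wrap the capture object that the owner alone holds open.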

Using DMA transfers with Cyclone V Avalon-MM for PCIe

Is it possible to do DMA transfers with the IP core «Cyclone V Avalon-MM for PCIe» provided by Altera in Qsys (Quartus 14.0)?
Altera provides an IP core named «Cyclone V Avalon-MM DMA for PCIe» for DMA transfers, but this IP core does not support PCIe Gen1 with a x1 lane.
The demo design (ep_g1x1) for «Cyclone V Avalon-MM for PCIe» includes a DMA block that is connected to the Avalon-MM TX bus of the PCIe IP core.
So I'm wondering whether it's possible to write data from this DMA block to the root complex (host), because I can't find out how to do that.

From my brief skim of the material, it should be possible to issue DMA reads or writes from an RC to your Cyclone V (EP) using the IP core you're interested in.
I've done DMA reads and writes on a Stratix V, although that was in a non-Qsys design using just the PCIe core HIP block (with custom TLP encoding and decoding logic). This block just seems to be a wrapper around their PCIe HIP block that also handles the transaction layer for you.
The first step will be to get your RC to issue PCIe DMA read or write requests. In the case of a read, you'll want to send a memory read request with a length greater than 1 DWORD, which the endpoint answers with a completion with data (CplD). I would suggest dedicating an entire BAR to map the memory space you want to DMA from on the FPGA, to keep your address targeting simple.
On the FPGA side, I would suggest using Signal Tap and probing the Rxm* interface signals on the core. This way you can see the exact timing of the DMA read request that comes out of the core. My guess is that the RXMRead_<n>_o signal will go high to indicate the start of the request, at which point you'll have to decode RxmAddress_<n>_o and RXMBurstCount_<n>_o and pass them to some glue logic that will fetch the requested data from the FPGA's memory. Once you're ready to send back the data, assert RXMReadDataValid_<n>_i for each valid word being sent.
I'm guessing that the «Cyclone V Avalon-MM DMA for PCIe» core that you referenced takes care of that 'glue' logic for you and allows you to connect straight to an SDRAM controller on your Qsys bus. Altera doesn't usually encrypt their megafunction code, so if your SystemVerilog is strong, it might be worth digging through their generated files to see whether you can reuse that bit of code in some way.
As for core settings, the only thing that I saw that you need to look out for is making sure the Single DW Completer setting is turned OFF. Otherwise the core will abort any requests it receives with a length greater than 1 DWORD.
Hope that helped somewhat.

I finally managed to make DMA requests with the «Cyclone V Avalon-MM for PCIe» Altera IP core. So yes, it's possible.
On my system, the root complex (RC) is an i.MX6 running Linux, so most of the tricks are in fact on the Linux side.
In the Linux driver, a page must be requested with a dma_alloc_coherent() call, and the address of this page must be written to the CRA registers named ADDR_MAP_LO0 and ADDR_MAP_HI0.
On my system, memory pages are 4k in size, so I had to configure the «address translation settings» of the PCIe hard IP with 4k pages to match.
Once that was done, I simply connected the DMA controller provided by Qsys to the TX Avalon-MM slave port of the PCIe IP.
Telling the DMA to write data to this port will automatically generate TLPs from the FPGA that write into the i.MX6 RAM.
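The address handling in the steps above can be sketched as plain arithmetic. This is only an illustration in Python, not driver code: the function name and the example bus address are made up, and in a real driver the address would be the dma_addr_t returned by dma_alloc_coherent() and the two halves would be written to the CRA registers through the mapped BAR. The IP's user guide should also be checked for any flag bits carried in the low register, which this sketch does not model.

```python
PAGE_SIZE = 4096    # the 4k translation page size mentioned above


def split_dma_address(bus_addr, page_size=PAGE_SIZE):
    """Return the (low, high) 32-bit halves of a page-aligned 64-bit bus address."""
    if bus_addr % page_size:
        raise ValueError("DMA buffer must be aligned to the translation page size")
    lo = bus_addr & 0xFFFFFFFF
    hi = (bus_addr >> 32) & 0xFFFFFFFF
    return lo, hi


# Made-up example address, standing in for what the DMA API would return
lo, hi = split_dma_address(0x0000_0001_3F20_1000)
print(f"ADDR_MAP_LO0 = 0x{lo:08X}, ADDR_MAP_HI0 = 0x{hi:08X}")
```

The alignment check reflects why the page sizes had to match: the translation entry replaces the upper address bits, so the buffer has to start on a translation-page boundary.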
