U-Boot: controlling SDRAM refresh rate

My goal is to find the piece of code in Denx U-Boot which controls (sets) the refresh rate for the external memory (SDRAM/DDR).
In particular, I have a Tiny4412 evaluation board, which features a Samsung Exynos processor. At the moment I am digging through the U-Boot code base to find the code that initializes the external memory and sets the refresh rate.
Do you know which code is in charge of this task, and the file that implements it?
Furthermore, I was curious whether it is possible to set the refresh rate after the boot process has finished (e.g. using a kernel module)?

Your board is not in mainline U-Boot. The Exynos4412 chip is only supported on the Odroid boards, not on the board you are using.
It is NOT good practice to modify the refresh rate after the bootloader has set it.
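For orientation: on Exynos-class SoCs the refresh interval lives in a register of the DRAM memory controller (DMC) and is programmed during low-level init, so the Exynos DMC setup code in mainline (look for dmc_init_ddr3.c under the Exynos architecture directory, used by the Exynos5 boards) is a reasonable place to start reading. Below is a minimal sketch of the kind of code involved; the base address, register offset and helper are illustrative placeholders, and the real values must come from the Exynos4412 user manual:

    /* Illustrative only: how a DMC auto-refresh interval is typically set.
     * DMC0_BASE and DMC_TIMINGAREF are placeholders; take the real base
     * address and register offset from the Exynos4412 user manual. */
    #include <stdint.h>

    #define DMC0_BASE      0x10600000u  /* placeholder DMC base address */
    #define DMC_TIMINGAREF 0x30u        /* placeholder refresh-interval register */

    static inline void dmc_writel(uint32_t val, uintptr_t addr)
    {
        *(volatile uint32_t *)addr = val;
    }

    void dmc_set_refresh_interval(uint32_t t_refi_cycles)
    {
        /* t_refi_cycles is the refresh period in memory-clock cycles,
         * usually derived from DDR's 7.8 us average refresh interval. */
        dmc_writel(t_refi_cycles, DMC0_BASE + DMC_TIMINGAREF);
    }

As for doing it after boot: the register stays writable from a kernel module via a mapped physical address, but as said above, changing it on a live system risks corrupting DRAM contents if the interval ends up too long.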

Related

Multiple identical boards: Windows Device Manager reports a resource conflict, error code 12

I inserted multiple identical boards into the system. The PCIe device is implemented with a Xilinx IP core. After each FPGA is programmed, I manually refresh Device Manager to check whether the device and driver are working properly.
My confusion is that this approach only seems to work with two boards at a time. After the third board is programmed and Device Manager is refreshed, the system reports insufficient resources: "This device cannot find enough free resources that it can use (Code 12)".
I tried disabling the other two boards, but the device still reported a conflict. I don't know how to find out which resources are conflicting.
My boards have 2 BARs (BAR0: 2KB, BAR1: 16MB) and 1 IRQ.
I did a few experiments and it seems to be caused by a conflict over memory resources; the conflicting party was the AMD integrated graphics card that came with the motherboard.
1st, 2nd, 3rd all indicate my board number.
The 3rd board is not recognized because of the conflict.
After I shut the machine down, I plugged in all three boards and powered it on again. During startup the system suddenly lost power and restarted, and after that all three boards worked normally. At that point I found that the memory address of the graphics card had changed.
I want to know how to resolve the conflict. Should I modify my driver code or the FPGA configuration?
PCIe enumeration should resolve memory allocation issues, however there are a couple of implementation issues to be aware of. Case in point: I have used Xilinx XDMA cores with a 64-bit BAR 2GB in size and literally bricked a DELL XPS motherboard, yet I have done the same with an IBM system and it just worked. The point here is that enumeration can be a firmware-, hardware- or OS-driven event. If you are rescanning from Device Manager, that sounds OS driven, but when I toasted the XPS board, that was some kind of firmware issue related to BAR size that resulted in a permanent failure. 16MB isn't big and should not be a problem, but I would recommend going with the Xilinx defaults first and showing reliability there; I think that's one BAR at 1MB. I have run 3x 64-bit BARs at 1MB apiece without issue. Keep it simple, show reliability, then move up. This will help isolate whether it is system flakiness or not.
I have seen some systems use firmware-based enumeration that comes up really fast, before the FPGA has configured, in which case there is no PCIe target to identify. If you frequently find that your FPGA is not detected on power-up but is detected on a rescan, this can be a symptom. Resolving it is a bit of a pain: we ended up using partial reconfiguration, starting with the PCIe interface and then reconfiguring to load the remaining image. Let's hope it isn't this problem.
The next thing to be aware of is your reset mechanism within the FPGA. You probably hooked the PCIe IP reset to the bus reset, which is great; however, I have in the past also hooked that reset to internal PLL locked signals that may not be up. For troubleshooting purposes, keep that reset simple and get rid of everything else, to show that the PCIe IP by itself is reliable first.
You have to be careful here too. If you strip things down, make sure it is clean. If you ignore the PLL lock and try to use a Xilinx driver, such as the XDMA driver, it has a routine where it tries to identify the XDMA with data transactions, looking for the DMA BAR one BAR at a time. When it does this, the transaction it attempts may go out on the AXI bus if the BAR isn't the XDMA control BAR. If the AXI bus isn't out of reset or clocked when this happens, you will lock up the AXI bus; I have locked up a Linux box this way on several occasions. AXI requires that transactions complete, otherwise it just sits there waiting.
BTW, on a Linux box, you can look at the enumeration output in the kernel log. I'm not sure if Windows shows you the same thing or not. But this can be helpful if you see that the device was initially probed but then something invalid was detected in the config register, versus not being seen at all.
So, a couple of things to look at.
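If you want to check from inside your own driver what enumeration actually assigned, a minimal Linux sketch like the one below logs each BAR as the kernel sees it. The vendor ID 0x10ee is Xilinx; matching PCI_ANY_ID for the device ID is just for illustration:

    /* Sketch: log the BAR assignments the firmware/OS gave the device. */
    #include <linux/module.h>
    #include <linux/pci.h>

    static const struct pci_device_id ids[] = {
        { PCI_DEVICE(0x10ee, PCI_ANY_ID) },  /* Xilinx vendor ID, any device */
        { 0 }
    };
    MODULE_DEVICE_TABLE(pci, ids);

    static int bar_probe(struct pci_dev *pdev, const struct pci_device_id *id)
    {
        int bar, ret;

        ret = pci_enable_device(pdev);
        if (ret)
            return ret;

        for (bar = 0; bar < 6; bar++) {      /* standard BARs 0..5 */
            if (!pci_resource_len(pdev, bar))
                continue;                    /* BAR not implemented */
            dev_info(&pdev->dev, "BAR%d: %pR\n", bar, &pdev->resource[bar]);
        }
        return 0;
    }

    static void bar_remove(struct pci_dev *pdev)
    {
        pci_disable_device(pdev);
    }

    static struct pci_driver bar_driver = {
        .name     = "bar_probe",
        .id_table = ids,
        .probe    = bar_probe,
        .remove   = bar_remove,
    };
    module_pci_driver(bar_driver);
    MODULE_LICENSE("GPL");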

ESP32: Best way to store data frequently?

I'm developing a C++ application on the ESP32-DevKitC board where I sense acceleration from an accelerometer. The application's goal is to store the accelerometer data until storage is full, then send all the data over WiFi and start again. The micro also goes into deep-sleep mode whenever possible.
I'm currently using the ESP32 NVS library, which is very well documented and pretty easy to use. The downside is that the library uses Flash memory, so a lot of writes will end up degrading it.
I know that Espressif also offers some other storage libraries (FAT, SPIFFS, etc.) but, as far as I know (correct me if I'm wrong), they all use the Flash as well.
Is there any other way of doing what I want without using the Flash storage?
Clarifications
Using Flash memory is not the problem in itself; degrading it is.
Storage has to be non-volatile, or at least must not be erased when the micro goes into deep-sleep mode.
I'm not using any Arduino library.
That's a great question that I wish more people would ask.
ESP32s use NOR flash storage, which is usually rated for between 10,000 and 100,000 write cycles (100,000 seems to be the standard these days). Flash can't write single bytes; instead it writes a "page" of bytes, which I believe is 256 bytes. So each 256-byte page is rated for at least 100,000 cycles. When a device is rated for 100,000 cycles it's likely to be usable for at least 10 times that, but the manufacturer is not going to make any promises beyond the 100,000.
SPIFFS (and LittleFS, now used on the ESP8266 Arduino Core) perform "wear leveling", to minimize the number of times a particular page is written. So if you modify the same section of a file repeatedly, it will automatically be written to different pages of flash. FAT is not designed to work well with flash storage; I would avoid it.
Whether SPIFFS with wear leveling will be adequate for your needs depends on your needed lifetime of the device versus how much data you'll be writing and how frequently.
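If SPIFFS does fit your write budget, mounting it from ESP-IDF is only a few lines. A sketch using the standard esp_vfs_spiffs_register() API; the base path and partition label must match your project's partition table:

    /* Sketch: mount SPIFFS under /spiffs with ESP-IDF. */
    #include "esp_spiffs.h"
    #include "esp_log.h"

    static const char *TAG = "storage";

    void mount_spiffs(void)
    {
        esp_vfs_spiffs_conf_t conf = {
            .base_path = "/spiffs",
            .partition_label = NULL,        /* use the first SPIFFS partition */
            .max_files = 5,
            .format_if_mount_failed = true,
        };

        esp_err_t err = esp_vfs_spiffs_register(&conf);
        if (err != ESP_OK)
            ESP_LOGE(TAG, "SPIFFS mount failed: %s", esp_err_to_name(err));
    }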
NVS may perform some level of wear levelling, though to what extent I'm unsure. Here, in a forum post with two ESP employees, both confirm that NVS does do some form of wear levelling. NVS is best used to persist things like configuration information that doesn't change frequently; it's not a great choice for storing information that's updated often.
You mentioned that the data just needs to survive deep sleep. If that's the case, your best option (if it's large enough) is to use the ESP32's RTC static RAM. This chunk of memory will survive restarts and deep sleep mode, but will lose its state if power is interrupted. It's real RAM so you won't wear it out by writing to it frequently, and it doesn't cost a lot of energy to write to. The catch is there's only 8KB of it.
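In ESP-IDF, placing data in that region is just an attribute on the declaration. A minimal sketch, with a buffer sized to stay well under the 8KB budget:

    /* Sketch: a sample buffer in RTC slow memory, which survives deep sleep
     * (but not a power cut). RTC_DATA_ATTR comes from esp_attr.h. */
    #include <stdint.h>
    #include "esp_attr.h"

    #define MAX_SAMPLES 1024  /* 1024 x 2 bytes = 2KB of the ~8KB RTC RAM */

    RTC_DATA_ATTR static int16_t  samples[MAX_SAMPLES];
    RTC_DATA_ATTR static uint32_t sample_count;

    void record_sample(int16_t accel)
    {
        if (sample_count < MAX_SAMPLES)
            samples[sample_count++] = accel;  /* plain RAM write, no wear */
    }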
If the 8KB of RTC RAM isn't enough and you're writing too much data too frequently to trust that SPIFFS will be okay, your best bet would be an SD card. The ESP32 can talk to an SD card adapter. SD cards use NAND flash, which has a much greater lifespan than NOR and can be safely overwritten many more times (which is why these kinds of cards are usable for filesystems in devices like Raspberry Pis).
Writing to flash also takes much more energy than writing to regular RAM. If your device is going to be battery powered, the RTC RAM is also a better choice than SPIFFS or an SD card from a power savings perspective.
Finally, if you use the RTC RAM I'd recommend starting to write it out over WiFi before it's full, as bringing up WiFi and transmitting the data could easily take long enough that you'd run out of space for some samples. Using it as a ring buffer and starting the transmit process when you hit a high-water mark, rather than when the buffer is full, would probably be your best bet; see the sketch below.
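Building on the RTC RAM sketch above, the high-water-mark idea might look like this (start_wifi_flush() is a hypothetical stand-in for your transmit path):

    #define HIGH_WATER ((MAX_SAMPLES * 3) / 4)  /* start flushing at 75% full */

    extern void start_wifi_flush(void);  /* hypothetical: your transmit code */

    void record_sample_ring(int16_t accel)
    {
        samples[sample_count % MAX_SAMPLES] = accel;
        sample_count++;

        /* Kick off WiFi while there is still room for incoming samples. */
        if (sample_count % MAX_SAMPLES == HIGH_WATER)
            start_wifi_flush();
    }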
I know I'm late with this answer, but you can buy ESP32 modules with external RAM, even 4-8MB. External RAM is really fast (at least much faster than the flash; it uses an SPI interface to communicate) and you can fit a lot of sensor readings in there.
I'm using an ESP32-WROVER-E module with 8MB of external RAM (4MB is usable with normal function calls) and 16MB of flash.
Here is a link to the module I'm using at TME's site.

Is there any means of doing RAM testing on a multi-core system while another application is using the target memory domain?

I am trying to implement a RAM test like the one at this URL (http://www.esacademy.com/en/library/technical-articles-and-documents/miscellaneous/software-based-memory-testing.html) on a dual-core microcontroller.
The RAM test has to be able to run in the middle of another process.
I thought of implementing it by disabling interrupts, but that is not appropriate.
As a precondition, my RAM test is supposed to back the data up to another domain before testing and to put the data back at its initial addresses afterwards, so other drivers can use the same data as usual after the RAM test.
If I use interrupt disabling, it does not work on a dual core: both cores access the same RAM domain, and disabling interrupts does not stop the other core's processing, so data inconsistencies occur.
Could you give me your ideas?
Somewhat by definition, if you are running code on that RAM you are not testing that RAM; if you want to do a memory test you need to be off the RAM under test.
But that depends on what your definition of "test" is. If it is a test of the memory itself, you can't be running on it: you would not be testing some of the memory, so you would not be testing the memory. (That looks like what your link is about; note that remote links are bad in SO questions and answers, as they are not assumed to remain active.)
You can't test one half and then the other half either; that way you are not testing the address bus completely.
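For readers without the link: the article's approach is the classic three-stage test, namely data bus (walking ones), address bus (power-of-two offsets), then the full device. A sketch of the first stage, in the article's style:

    #include <stdint.h>

    typedef volatile uint32_t datum;   /* width of the data bus under test */

    /* Walking-ones data bus test: returns 0 on success, else the failing
     * pattern. One address suffices, since each bit is exercised alone. */
    uint32_t mem_test_data_bus(datum *address)
    {
        uint32_t pattern;

        for (pattern = 1; pattern != 0; pattern <<= 1) {
            *address = pattern;          /* write the test pattern */
            if (*address != pattern)     /* read it back immediately */
                return pattern;
        }
        return 0;
    }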
If this is a performance test then ideally you want to be off of it and have the test run completely from cache. Multi-core helps for a targeted test, as you can push the interface a little harder, though it is difficult to max it out with a general-purpose processor, multi-core or not.
Otherwise, if you just want to exercise a fraction, then allocate a fraction and test it in whatever way you wish. It's not really a memory test, though.
It sounds from your requirements like you are not really interested in a full memory test, so do as much as you can to make your boss happy.
Actually memory-testing a system is very much specific to that system: how you approach it, how you solve it. You want that code (and stack) not to be on that RAM. Ideally the chip/system design includes a fast internal SRAM that you can use for board bring-up, design verification, and possibly manufacture test (though manufacture test should be testing the solder and board, not all the bits in the RAM; there are ways to do that too). If there is no internal SRAM, then the designers had to provide some other way to bring the system up. If you can run from flash with the cache on, and can map that out of the way of the DRAM address space, then you can test the DRAM (or external RAM) that way: no stack, just the CPU registers, basically assembly language.

DSP on Beaglebone

I have a Beaglebone running Ubuntu. We want to continuously sample from 3 on-board A/D converters at 100KS/s, and on every window of samples we will run a cross-correlation DSP algorithm. Once we find a correlation value above a threshold, we will send the value to a PC.
My concern is the process scheduling in Ubuntu. If our process gets swapped out and an ATD sample becomes available during this time, the process will miss the sample. We need to ensure that our process will capture every sample and save it in memory.
With this being said, is there a way to trigger interrupts on the Beaglebone so that if an ATD sample is ready, the sample will be saved in the memory of our program even if the program does not have the processor at the time?
Thanks!
You might be able to trigger the EDMA or use the PRUSS. Probably best to ask on beagleboard#googlegroups.com. There isn't a DSP per se on the BeagleBone.
This is not exactly an answer to your question, but hopefully it explains how the process works. Since you didn't mention what hardware you are running for AD conversion, maybe this is the best that can be done:
With audio hardware, which faces the same problem, the solution comes from the hardware and the drivers working together: whenever the hardware has filled up enough of the buffer it signals the driver (via an interrupt or some similar mechanism). In some cases, it's also possible that the driver polls the hardware or something like that, but that's a less efficient solution, and I'm not sure anyone does it that way anymore (maybe on cheaper hardware?). From there, the driver process may call right into the end-user process, or it may simply mark the relevant end-user process as "runnable". Either way, control needs to be transferred to the end user process.
For that to happen, the end-user process must be running at a higher priority than anything else occupying the CPUs at that moment. To guarantee that your process will always be first in the queue, you can run it at a high priority; with the appropriate permissions, you can even run at very high (real-time) priorities, as sketched below.
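On Linux that means one of the real-time scheduling classes. A minimal sketch using sched_setscheduler(); the priority value 80 is an arbitrary example, and this needs root or CAP_SYS_NICE:

    #include <sched.h>
    #include <stdio.h>

    int make_realtime(void)
    {
        struct sched_param sp = {
            .sched_priority = 80,  /* SCHED_FIFO priorities run 1..99 */
        };

        /* pid 0 = calling process; it will preempt all normal (CFS) tasks */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
            perror("sched_setscheduler");
            return -1;
        }
        return 0;
    }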
The time it takes for the top priority process to go from runnable to running is sometimes called the "latency" of the OS, though I am sure there's a more specific technical term. The latency of Linux is on the order of 1 ms, but since it's not a "hard" real-time OS, this is not a guarantee. If this is too long to handle your chunks of data, you may have to buffer some of it in your driver.

Windows to embedded port: data and code memory size

I am in the process of porting a Windows 7 library to an embedded platform. In order to do so, my employer asks me for the amount of memory (and CPU, but let us concentrate on the memory for now) that my system will need once ported, so he can size the board to my needs.
I had a look on the internet and there does not seem to be much information about this question, hence my questions:
in order to get a rough idea of the memory footprint of the code in flash memory (code only, without memory for data), I read on the Internet that I should sum the sizes of all the DLLs I use. It seems that every compiler and platform gives a different size for the code footprint, but overall the sizes are often very close. Do you confirm?
in order to deal with the memory required by the data only (heap + stack but no code), I had a look at the Task Manager (and Process Explorer). It seems the overall amount of data I use is given by the 'peak working set'. I have a few questions about it though:
2.a. Does the 'working set' include the heap + stack memory, or does it correspond to the heap only?
2.b. Does the 'working set' include the size of the code as well (as I am on Windows 7, the code is also stored in RAM, not in flash as on embedded systems), or does it only correspond to the data?
2.c. It seems the 'peak working set' reflects the maximum amount of physical memory actually in RAM since the program was started, but not the size the program could grow to afterwards (if I happened to allocate memory at runtime, which would be bad ;), the peak value would keep increasing). Do you confirm?
2.d. Hence, do you also confirm that if I do not allocate memory at runtime, the 'peak working set' should roughly be the maximum amount of RAM my embedded system will need, up to some difference due to the difference in system technology?
Thanks,
Antoine.
Unless you are intending to run your application on Windows Embedded, then looking at the code and data usage in Windows is not going to be much of an indicator of anything useful!
1) DLLs are libraries; not all the code within them will be utilised by your code. Most embedded systems are statically linked, and the linker will link only the modules that are actually referenced by your code. So taking the sum of the DLL sizes is likely to lead to a gross overestimation of the memory requirement.
2) Windows memory management is profligate with memory, because it can be and because doing so generally improves the performance of typical desktop systems. For example, a thread stack in Windows is typically on the order of 2MB; you may seldom use that much, but Windows gives it to you anyway because it can, and doing so errs on the side of safety. A thread stack in an embedded system will typically range from a few tens of bytes to a few tens of kilobytes; it depends on your application.
Windows task manager shows what Windows allocates to your process, that may not relate to what your process needs. Also your application is using Windows services - all the memory used for kernel and device services will not show up as part of your process, but your embedded system may still need those.
If you do use your Windows prototype code to assess the embedded system requirements, then your best place to start is by getting the linker to generate a map file (e.g. -Wl,-Map=app.map with GCC), which will give a detailed description of memory usage in terms of statically allocated data and code size.
Code size depends not only on the performance of the compiler, but also on the efficiency of the instruction set; some architectures achieve higher code density than others. Windows application code size is never a good indicator of embedded code size, because the execution environment is likely to be so different. For example, a pre-emptive multitasking RTOS kernel on a 32-bit ARM can be implemented in less than 10KB of code, a file system in perhaps another 10KB, a network stack in anything from 10 to 30KB, and USB in another 10KB. As you can see, this is a different world from desktop code.
Data memory usage is perhaps more easily determined, but you do that through analysis of your application rather than by observing what Windows does. There is the data your application instantiates directly, and then there is data instantiated by the libraries and device drivers you might call; in Windows the latter is likely to be relatively large and out of your control. Typical embedded system libraries for things such as network stacks, USB, file systems etc. are far smaller and far more deterministic in both performance and size.
Your better bet is to describe your application in terms of its general purpose, performance requirements, real-time constraints, and its hardware requirements (display, networking, I/O, mass storage etc.), and then look at comparable solutions or at the libraries you will need to implement your solution; most embedded systems are "bare board" and do not have the services you find in Windows unless you write them or use third-party solutions - Windows is seldom a comparable solution to an embedded system.
If it is just a library rather than an application, then build it for a likely target using a Windows-hosted GCC cross-compiler and see how big it ends up. You don't need hardware for that, or even to spend any money.
