Why should the TX descriptor ring size be 4 times the RX descriptor ring size?

I was reading the Performance Optimization Guidelines for DPDK-based applications, and it recommends that the TX ring size be 4 times the RX ring size. I assume the Intel engineers arrived at this number empirically. Is there any reason to allocate the rings in a 1:4 ratio?
Here is the link for the performance guidelines:

From https://communities.intel.com/community/tech/wired/blog/2011/06/24/parameter-talk-tx-and-rx-descriptors
You can see the “Transmit buffers” are below it. To modify the number of descriptors you just increase the value. In our Windows offerings there will be a limit of 2048 and it must be in increments of 8. On the Transmit side the starting value is 512, but the same rules of 2048 by 8 still apply. Why more TX than RX? Our architecture favors the nondeterministic RX side for priority so there is more turnover of descriptors than the TX side. Plus the O/S can sometimes not return TX assets back to the driver in a timely fashion.
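For concreteness, here is a minimal sketch of how such a ratio translates into a DPDK port setup. The 512/2048 descriptor counts, the port ID, and the single queue pair are assumptions for illustration, not values taken from the guideline:

```c
#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Illustrative descriptor counts following the ~1:4 RX:TX recommendation. */
#define NB_RX_DESC 512
#define NB_TX_DESC (4 * NB_RX_DESC)   /* 2048 */

static int
setup_port(uint16_t port_id, struct rte_mempool *mbuf_pool)
{
    struct rte_eth_conf port_conf = { 0 };
    int ret;

    /* One RX queue and one TX queue for this sketch. */
    ret = rte_eth_dev_configure(port_id, 1, 1, &port_conf);
    if (ret < 0)
        return ret;

    /* RX ring: 512 descriptors. */
    ret = rte_eth_rx_queue_setup(port_id, 0, NB_RX_DESC,
                                 rte_eth_dev_socket_id(port_id),
                                 NULL, mbuf_pool);
    if (ret < 0)
        return ret;

    /* TX ring: 4x as many descriptors, so late transmit completions
     * do not starve the driver of TX slots. */
    ret = rte_eth_tx_queue_setup(port_id, 0, NB_TX_DESC,
                                 rte_eth_dev_socket_id(port_id),
                                 NULL);
    if (ret < 0)
        return ret;

    return rte_eth_dev_start(port_id);
}
```

Passing NULL for the rx_conf/tx_conf arguments simply uses the driver's default queue configuration.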


How does ultra wide band determine position?

Apple iPhones now have a U1 chip, which is described as "Ultra Wideband technology for spatial awareness". I've heard the technology can do time-of-flight calculations to determine range, but that doesn't explain how it determines relative position. How does the positioning work?
How does ultra-wideband work?
Travelling at the speed of light
The idea is to send radio waves from one module to another and measure the time of flight (TOF), or in other words, how long it takes. Because radio waves travel at the speed of light (c = 299792458 m/s), we can simply multiply the time of flight by this speed to get the distance.
However, perhaps you've noticed that radio waves travel fast. Very fast! In a single nanosecond, which is a billionth of a second, a wave travels almost 30 cm. So if we want to perform centimetre-accurate ranging, we have to be able to measure the timing very, very accurately! So now the question is: how can we do this? How can we even measure the timing of... a wave?
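To make the scale concrete, here is a small sketch of the distance arithmetic; the 16.7 ns time of flight is a made-up example value:

```c
#include <stdio.h>

int main(void)
{
    const double c = 299792458.0;   /* speed of light in m/s */
    double tof_ns = 16.7;           /* hypothetical one-way time of flight in ns */

    /* distance = speed * time: ~16.7 ns of flight corresponds to ~5 m */
    double distance_m = c * tof_ns * 1e-9;

    /* every nanosecond of timing error adds ~0.3 m of ranging error */
    double error_per_ns_m = c * 1e-9;

    printf("distance: %.2f m, ranging error per ns of timing error: %.2f m\n",
           distance_m, error_per_ns_m);
    return 0;
}
```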
It's all about the bandwidth
In physics, there is something called Heisenberg's uncertainty principle. The principle states that it is impossible to know both the frequency and the timing of a signal with arbitrary precision at the same time. Consider, for example, a sinusoid: a signal with a well-known frequency but a very ill-determined timing, since the signal has no beginning or end. However, if we combine multiple sinusoidal signals with slightly different frequencies, we can create a 'pulse' with more defined timing, i.e., the peak of the pulse. This is seen in the following figure from Wikipedia, which sequentially adds sinusoids to a signal to get a sharper pulse:
fig. 1
The range of frequencies that are used for this signal is called the bandwidth Δf. Using Heisenberg's uncertainty principle we can roughly determine the width Δt of the pulse, given a certain bandwidth Δf:
Δf · Δt ≥ 1/(4π)
From this simple formula we can see that if we want a narrow pulse, which is necessary for accurate timing, we need to use a large bandwidth. For example, using the bandwidth Δf = 20 MHz available to Wi-Fi systems, we obtain a pulse width of at least Δt ≈ 4 ns. At the speed of light this corresponds to a pulse about 1.2 m 'long', which is too much for accurate ranging. Firstly, it is hard to accurately determine the peak of such a wide pulse, and secondly there are reflections. Reflections come from the signals bouncing off objects (walls, ceilings, closets, desks, etc.) in the surrounding environment. These reflections are also captured by the receiver and may overlap with the line-of-sight pulse, which makes it very hard to measure the true peak of the pulse. With pulses 4 ns wide, any object within 1.2 m of the receiver or the transmitter will cause an overlapping pulse. Because of this, Wi-Fi time-of-flight ranging is not suitable for indoor applications.
Ultra-wideband signals typically have a bandwidth of 500 MHz, resulting in pulses only about 0.16 ns wide! This timing resolution is so fine that, at the receiver, we are able to distinguish several reflections of the signal. Hence, it remains possible to do accurate ranging even in places with a lot of reflectors, such as indoor environments.
fig. 2
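The two pulse widths quoted above follow directly from the uncertainty bound; a quick sketch of that calculation, using the Wi-Fi and UWB bandwidths mentioned in the text:

```c
#include <stdio.h>
#include <math.h>

/* Minimum pulse width dt ~ 1/(4*pi*df) and its spatial extent c*dt. */
static void pulse_width(const char *label, double bandwidth_hz)
{
    const double c = 299792458.0;                     /* speed of light, m/s */
    double dt = 1.0 / (4.0 * M_PI * bandwidth_hz);    /* uncertainty bound */
    printf("%s: dt >= %.3f ns, spatial extent >= %.3f m\n",
           label, dt * 1e9, dt * c);
}

int main(void)
{
    pulse_width("Wi-Fi,  20 MHz", 20e6);    /* ~4 ns,    ~1.2 m  */
    pulse_width("UWB,   500 MHz", 500e6);   /* ~0.16 ns, ~0.05 m */
    return 0;
}
```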
Where to put all this bandwidth?
So we need a lot of bandwidth. Unfortunately, everybody wants it: in wireless communication systems, more bandwidth means faster downloads. However, if everybody transmitted signals on the same frequencies, all the signals would interfere and nobody would be able to receive anything meaningful. Because of this, the use of the frequency spectrum is highly regulated.
So how is it possible that UWB gets 500 MHz of precious bandwidth while most other systems have to make do with a lot less? Well, UWB systems are only allowed to transmit at very low power (the power spectral density must be below -41.3 dBm/MHz). This very strict power constraint means that a single pulse is not able to reach far: at the receiver, the pulse will likely be below the noise level. To solve this issue, the transmitter sends a train of pulses (typically 128 or 1024) to represent a single bit of information. At the receiver, the received pulses are accumulated, and with enough pulses the power of the 'accumulated pulse' rises above the noise level and reception is possible. Hooray!
The IEEE 802.15.4 standard for Low-Rate Wireless Personal Area Networks defines a number of UWB channels, each at least 500 MHz wide. Depending on the country you're in, some of these channels may or may not be allowed. In general, the lower-band channels (1 to 4) can be used in most countries under some limitations on update rate (using mitigation techniques). Channel 5 is accepted in most parts of the world without any limitations, with the notable exception of Japan. Purely from physics, the lower the channel's centre frequency, the better the range.
A note on the received signal strength (RSS)
There exists another way to measure the distance between two points using radio waves, and that is the received signal strength. The further apart the two points are, the smaller the received signal strength will be. Hence, from this RSS value, we should be able to derive the distance. Unfortunately, it's not that simple. In general, the received signal strength is a combination of the power of all the reflections, not only of the desired line-of-sight path. Because of this, it becomes very hard to relate the RSS value to the true distance. The figure below shows just how bad it is.
In this figure, the RSS value of a Bluetooth signal is measured at certain distances. At every distance, the error bars show how the RSS value behaves at the given distance. Clearly, the variation on the RSS value is very large which makes RSS unsuitable for accurate ranging or positioning.
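For reference, RSS is usually mapped to distance with a log-distance path-loss model. Here is a minimal sketch, where the 1 m reference power and the path-loss exponent are assumed calibration values; in practice both vary strongly with the environment, which is exactly the problem described above:

```c
#include <math.h>
#include <stdio.h>

/* Log-distance path loss: RSS(d) = RSS(1 m) - 10*n*log10(d).
 * Inverting it gives a distance estimate from a measured RSS value. */
static double distance_from_rss(double rss_dbm, double rss_at_1m_dbm,
                                double path_loss_exponent)
{
    return pow(10.0, (rss_at_1m_dbm - rss_dbm) / (10.0 * path_loss_exponent));
}

int main(void)
{
    /* Assumed calibration values; both depend heavily on the environment. */
    double rss_at_1m_dbm = -45.0;
    double n = 2.5;

    printf("RSS -65 dBm -> ~%.1f m\n", distance_from_rss(-65.0, rss_at_1m_dbm, n));
    printf("RSS -70 dBm -> ~%.1f m\n", distance_from_rss(-70.0, rss_at_1m_dbm, n));
    return 0;
}
```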

How does Flutter calculate pixels for different resolutions?

Flutter apps can run on a variety of hardware, operating systems, and form factors. How are "pixels" calculated for different resolutions?
From https://api.flutter.dev/flutter/dart-ui/FlutterView/devicePixelRatio.html :
The number of device pixels for each logical pixel. This number might
not be a power of two. Indeed, it might not even be an integer. For
example, the Nexus 6 has a device pixel ratio of 3.5.
Device pixels are also referred to as physical pixels. Logical pixels
are also referred to as device-independent or resolution-independent
pixels.
By definition, there are roughly 38 logical pixels per centimeter, or
about 96 logical pixels per inch, of the physical display. The value
returned by devicePixelRatio is ultimately obtained either from the
hardware itself, the device drivers, or a hard-coded value stored in
the operating system or firmware, and may be inaccurate, sometimes by
a significant margin.
The Flutter framework operates in logical pixels, so it is rarely
necessary to directly deal with this property.
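The conversion itself is just a multiplication by devicePixelRatio. A rough sketch of the arithmetic, where the 3.5 ratio is the Nexus 6 value quoted above and the logical width is a made-up example (real Flutter code would normally stay in logical pixels, as the docs advise):

```c
#include <stdio.h>

int main(void)
{
    /* Logical pixels are roughly density-independent: ~38 per cm, ~96 per inch. */
    double device_pixel_ratio = 3.5;   /* Nexus 6, from the docs quoted above */
    double logical_width = 411.0;      /* hypothetical logical layout width */

    /* physical (device) pixels = logical pixels * devicePixelRatio */
    double physical_width = logical_width * device_pixel_ratio;

    printf("%.0f logical px -> %.1f physical px at ratio %.1f\n",
           logical_width, physical_width, device_pixel_ratio);
    return 0;
}
```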

How can I select an optimal window for Short Time Fourier Transform?

I want to select an optimal window for the STFT for different audio signals. For a signal with frequency content from 10 Hz to 300 Hz, what would be an appropriate window size? Similarly, for a signal with frequency content from 2000 Hz to 20000 Hz, what would be the optimal window size?
I know that a 10 ms window gives a frequency resolution of about 100 Hz. But if the frequency content of the signal lies between 100 Hz and 20000 Hz, is 10 ms still an appropriate window size, or should we choose some other window size because of the 20000 Hz content in the signal?
I know the classic "uncertainty principle" of the Fourier Transform. You can either have high resolution in time or high resolution in frequency but not both at the same time. The window lengths allow you to trade off between the two.
Windowed analysis is designed for quasi-stationary signals. Quasi-stationary signals are signals which change over time but which, over some short period of time, can be considered stable.
One example of a quasi-stationary signal is speech. The frequency components of this signal change over time as the position of the tongue and mouth changes, but over a short period of time, approximately 0.01 s, they can be considered stable because the tongue does not move that fast. The 0.01 s range is determined by our biology: we simply can't move the tongue faster than that.
Another example is music. When you pluck a string, you can consider that it produces a more or less stable sound for some short period of time, usually around 0.05 seconds. Within this period you can consider the sound stable.
There can be other types of signals; for example, a signal might be centred at 10 GHz and be quasi-stationary over 1 ms.
Windowed analysis allows you to capture both the stationary properties of the signal and the way the signal changes over time. Here it does not matter what sample rate the signal has, what frequency resolution you need, or whether the main harmonics are near 100 Hz or near 3000 Hz. What matters is over what period of time the signal is stationary and over what period it must be considered as changing.
So for speech a 25 ms window is good simply because speech is quasi-stationary over that range. For music you usually take longer windows because our fingers move more slowly than our mouths. You need to study your signal to decide the optimal window length, or you need to provide more information about it.
You need to specify your "optimality" criteria.
For a desired frequency resolution, you need a length or window size of roughly Fs/df (or from a fraction of to twice that length or more, depending on S/N and the window). However, the length also needs to be similar to or shorter than the length of time during which your signal is stationary within your desired frequency resolution bounds. This may not be possible, or may not be known, thus requiring you to specify which criterion (df vs. dt) is more important for your desired "optimality".
If multiple window lengths meet your criteria, then the shortest length that is a multiple of very small primes is likely to be the most computationally efficient for the following FFTs within an STFT computational sequence.
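As a rough sketch of the Fs/df rule above (the 44.1 kHz sample rate and 100 Hz target resolution are assumed values, chosen to match the 10 ms example in the question):

```c
#include <stdio.h>

/* Round up to the next power of two, a cheap FFT-friendly length. */
static unsigned next_pow2(unsigned n)
{
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

int main(void)
{
    double fs = 44100.0;   /* assumed sample rate in Hz */
    double df = 100.0;     /* desired frequency resolution in Hz */

    /* Window length in samples: N ~ Fs / df (about 441 samples, ~10 ms). */
    unsigned n = (unsigned)(fs / df + 0.5);

    printf("window: %u samples (%.1f ms), padded FFT length: %u\n",
           n, 1000.0 * n / fs, next_pow2(n));
    return 0;
}
```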
Based on the sampling theorem, the sampling frequency needs to be larger than twice the highest frequency of the signal. And based on the DFT (discrete Fourier transform), we also know that the frequency resolution is the inverse of the entire signal duration, and the entire frequency span is the inverse of the time resolution. Note that frequency is simply the inverse of the period, thus the relationships go inversely with each other.
Frequency resolution = 1 / (overall time duration)
Frequency span = 1 / (time resolution)
Having said that, to process a 20 kHz audio signal, we need to sample at 40 kHz. And if we want to get the frequency resolution down, say to 10 Hz, we will need to capture a duration of at least 0.1 s, which is 1/(10 Hz).
This is the reason we normally see audio files described as 44k: the human hearing range is limited to about 20 kHz, and to add some margin to it we use a 44.1 kHz sampling frequency instead of 40 kHz.
I think the uncertainty principle reflects the fact that a signal that is more localized in one domain spreads out in the other. For example, an impulse in the time domain has a spectrum that stretches from negative infinity to positive infinity, i.e. the entire spectrum. And vice versa: a single-frequency signal in the spectrum stretches from negative infinity to positive infinity in the time domain. This is simply because we would have to observe forever to know whether a signal is a pure sinusoid or not.
But for the DFT, we can always get the frequency span we need if we sample at twice the highest frequency of the signal, and the resolution we want if we sample for a long enough duration. So it is not as uncertain as the uncertainty principle suggests, as long as we know how many samples to take and how fast and how long to take them.
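Putting the two rules together for the numbers above (20 kHz content, 10 Hz resolution), a small sketch, with 44.1 kHz assumed as the sample rate with margin:

```c
#include <stdio.h>

int main(void)
{
    double f_max = 20000.0;   /* highest frequency content in Hz */
    double df    = 10.0;      /* desired frequency resolution in Hz */

    double fs_min   = 2.0 * f_max;   /* Nyquist: at least 40 kHz */
    double fs       = 44100.0;       /* practical rate with some margin */
    double duration = 1.0 / df;      /* resolution = 1/duration -> 0.1 s */

    printf("minimum sample rate: %.0f Hz (using %.0f Hz)\n", fs_min, fs);
    printf("capture duration: %.2f s = %.0f samples\n", duration, fs * duration);
    return 0;
}
```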

operating systems main memory fragmentation

Suppose a small computer system has 4 MB of main memory. The system manages it in fixed-size frames. A frames table maintains the status of each frame in memory. How large (how many bytes) should a frame be? You have a choice of one of the following: 1K, 5K, or 10K bytes. Which of these choices will minimize the total space wasted by processes due to fragmentation and frames-table storage?
Assume the following: On the average, 10 processes will reside in memory. The average amount of wasted space will be 1/2 frame for each process.
The frames table must have one entry for each frame. Each entry requires 10 bytes.
Here is my answer:
1K would minimize the fragmentation: as is known, a small frame size leads to a big table but less wasted space from fragmentation.
10 processes ~ 1/2 frame wasted on each.
Am I on the right track?
Yes, you are. I agree with you that on a system such as this, the smallest size makes the most sense. However, take for example x86-64, where the options are 4 KB, 2 MB, and 1 GB. Considering modern memory sizes of approximately 4 GB, 1 GB pages obviously make no sense, but because most programs nowadays contain quite a bit of compiled code, or, in the case of interpreted and VM languages, all of the code of the VM, 2 MB pages make the most sense. In other words, to determine these things, you have to think about the average memory usage of a program in the system, the number of programs, and most importantly, the ratio of average fragmentation to page-table size. While a small frame size benefits from low fragmentation, 4 KB pages on 4 GB of memory produce a very large page table. Very large.
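Since the question asks for the combined waste (internal fragmentation plus the frames table), it's worth plugging the stated assumptions into both terms for each candidate size; a quick sketch:

```c
#include <stdio.h>

int main(void)
{
    const double memory      = 4.0 * 1024 * 1024;   /* 4 MB of main memory */
    const double processes   = 10.0;                /* average resident processes */
    const double entry_bytes = 10.0;                /* bytes per frames-table entry */
    const double frame_sizes[] = { 1024.0, 5.0 * 1024, 10.0 * 1024 };

    for (int i = 0; i < 3; i++) {
        double f     = frame_sizes[i];
        double table = (memory / f) * entry_bytes;  /* frames-table storage */
        double frag  = processes * f / 2.0;         /* half a frame wasted per process */
        printf("%6.0f-byte frames: table %6.0f B + fragmentation %6.0f B = %6.0f B\n",
               f, table, frag, table + frag);
    }
    return 0;
}
```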

Unable to get correct frequency value on iphone

I'm trying to analyze frequency detection algorithms on the iOS platform. I found several implementations using the FFT and CoreAudio (example 1 and example 2), but in both cases there is some imprecision in the detected frequency:
(1) For A4 (440Hz) shows 441.430664 Hz.
(1) For C6 (1046.5 Hz) shows 1518.09082 Hz.
(2) For A4 (440Hz) shows 440.72 Hz.
(2) For C6 (1046.5 Hz) shows 1042.396606 Hz.
Why does this happen, and how can I avoid this problem and detect the frequency more accurately?
Resolution in the frequency domain is inversely related to the FFT length (the number of bins). You need to either:
increase the size of your FFT to get finer resolution
use the magnitudes of adjacent bins to refine the frequency estimate (see the sketch below)
use an alternative method for frequency estimation rather than the FFT, e.g. a parametric model
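A minimal sketch of the second option, parabolic interpolation over the peak bin and its neighbours; the 4096-point analysis at 44.1 kHz and the Hann window are assumptions, chosen so that the raw peak bin sits at the 441.43 Hz value seen in example (1), while the interpolated estimate lands much closer to 440 Hz:

```c
#include <math.h>
#include <stdio.h>

#define N  4096         /* analysis length (assumed) */
#define FS 44100.0      /* sample rate (assumed)     */

/* Magnitude of a single DFT bin of a Hann-windowed buffer. */
static double bin_magnitude(const double *x, int k)
{
    double re = 0.0, im = 0.0;
    for (int n = 0; n < N; n++) {
        double w  = 0.5 - 0.5 * cos(2.0 * M_PI * n / (N - 1));  /* Hann window */
        double ph = 2.0 * M_PI * (double)k * n / N;
        re += w * x[n] * cos(ph);
        im -= w * x[n] * sin(ph);
    }
    return sqrt(re * re + im * im);
}

int main(void)
{
    static double x[N];
    for (int n = 0; n < N; n++)                 /* 440 Hz test tone */
        x[n] = sin(2.0 * M_PI * 440.0 * n / FS);

    /* In a real detector k is the argmax of the FFT magnitudes;
     * for 440 Hz at these settings that is bin 41 (441.43 Hz). */
    int k = 41;
    double a = log(bin_magnitude(x, k - 1));
    double b = log(bin_magnitude(x, k));
    double c = log(bin_magnitude(x, k + 1));

    /* Vertex of the parabola through the peak bin and its neighbours. */
    double offset = 0.5 * (a - c) / (a - 2.0 * b + c);

    printf("raw bin estimate: %.3f Hz\n", k * FS / N);            /* 441.431 Hz */
    printf("refined estimate: %.3f Hz\n", (k + offset) * FS / N); /* ~440 Hz    */
    return 0;
}
```

Increasing the FFT size (the first option) narrows the raw bin spacing in the same way, at the cost of a longer analysis window and slower response to pitch changes.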
