I am troubleshooting an IPTV Layer 3 multicast VPN across a 10 Gb/s MPLS network.
Only certain HD channels are experiencing severe freezing and tiling; all other SD and HD channels work fine. Our IPTV video monitoring equipment detects and reports packet loss by monitoring a continuity counter. I have asked every equipment vendor we have and read every PDF I can find, and no one seems to know exactly:
How, when, and where does an MPEG transport stream continuity counter fit into a transport stream?
What packets/frames in the transport stream are being counted?
Why does the reported packet loss seem to occur in increments of 16 (0, 16, 32)?
How can there be an error condition with 0 packet loss?
How, when, and where does the PCR value fit into the transport stream?
That's a lot of questions! Let's clarify a bit:
Continuity Counter (CC) is carried in the header of every Transport Packet (TP) of the Transport Stream (TS).
Every TP also has a packet identifier (PID) in the header, and every PID has its own CC. According to Wikipedia the CC for a given PID is incremented every time the TP has a payload, but I think it is actually incremented on every new TP... [EDIT]: the CC is only incremented when the payload flag is true (cf. Mike Reedel's comment below).
Since the CC is a 4-bit field, the value runs from 0x0 to 0xF and then wraps back to 0x0.
Some multiplexers are careless about the standard: it can happen that the CC is not correctly incremented while the TS is being multiplexed. In that case you did not actually lose any packets, but because the CC sequence is broken your tool reports an error. The error can also be introduced anywhere along the transmission chain, including by a monitoring tool that does not sample at the correct rate.
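To make that concrete, here is a rough sketch (Swift, purely illustrative; the type and function names are mine, not from any library) of where the CC sits in the 4-byte TS packet header and how a monitor can check it per PID:

    import Foundation

    // Illustrative sketch only: parse the 4-byte header of a 188-byte TS packet
    // and check per-PID continuity.
    struct TSHeader {
        let pid: Int
        let hasPayload: Bool          // adaptation_field_control is '01' or '11'
        let continuityCounter: Int    // 4-bit field, wraps 0x0...0xF
    }

    func parseHeader(_ packet: [UInt8]) -> TSHeader? {
        guard packet.count == 188, packet[0] == 0x47 else { return nil }   // 0x47 sync byte
        let pid = (Int(packet[1] & 0x1F) << 8) | Int(packet[2])
        let afc = (packet[3] >> 4) & 0x03
        return TSHeader(pid: pid,
                        hasPayload: (afc & 0x01) != 0,
                        continuityCounter: Int(packet[3] & 0x0F))
    }

    // The CC only advances on packets that carry a payload; a single duplicate
    // packet (same CC repeated once) is also legal per ISO 13818-1.
    var lastCC: [Int: Int] = [:]
    func continuityOK(_ h: TSHeader) -> Bool {
        guard h.hasPayload else { return true }
        defer { lastCC[h.pid] = h.continuityCounter }
        guard let prev = lastCC[h.pid] else { return true }    // first packet seen on this PID
        let expected = (prev + 1) & 0x0F                       // 4-bit counter wraps, so gaps can only be measured modulo 16
        return h.continuityCounter == expected || h.continuityCounter == prev
    }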
Program Clock Reference (PCR) is a timestamp that is regularly inserted in the TS to provide an accurate 27 MHz clock to the decoder. It should be repeated at least every 40 ms according to the standards. There is no constraint on which PID carries the PCR, but most of the time it is the video PID: look at the PCR_PID field of the PMT to find out which PID it is on.
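And a similar hedged sketch for the PCR: it lives in the optional adaptation field, signalled by the PCR_flag, as a 33-bit base (90 kHz units) plus a 9-bit extension that together give the 27 MHz value:

    // Illustrative only: extract the PCR (if present) from a 188-byte TS packet.
    // PCR = base * 300 + extension, expressed in 27 MHz ticks.
    func extractPCR(_ p: [UInt8]) -> UInt64? {
        guard p.count == 188, p[0] == 0x47 else { return nil }
        let afc = (p[3] >> 4) & 0x03
        guard (afc & 0x02) != 0 else { return nil }                   // no adaptation field
        let afLength = Int(p[4])
        guard afLength >= 7, (p[5] & 0x10) != 0 else { return nil }   // PCR_flag not set
        let base = (UInt64(p[6]) << 25) | (UInt64(p[7]) << 17) |
                   (UInt64(p[8]) << 9)  | (UInt64(p[9]) << 1) |
                   (UInt64(p[10]) >> 7)                               // 33-bit base (90 kHz)
        let ext  = (UInt64(p[10] & 0x01) << 8) | UInt64(p[11])        // 9-bit extension
        return base * 300 + ext
    }

A monitoring probe typically compares the arrival spacing of successive PCRs on the PCR PID against the 40 ms repetition limit mentioned above.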
Some references:
Wikipedia: MPEG transport stream
ETSI TR 101 290 (DVB measurement guidelines)
Tektronix continuity counter FAQ
ISO/IEC 13818-1 (MPEG-2 Systems)
I have an iOS app that reads/writes on a BLE device. The device is sending me data longer than 20 bytes and I see it gets trimmed. Based on the following thread
Bluetooth LE maximum transmission size
it looks like iOS is trimming the data. That thread shows how to write larger payloads, but how do we read data larger than 20 bytes?
For anyone looking at this post years later like I am, we ran into this question as well at one point. I would like to share some helpful hints for data larger than 20 bytes.
Since the data is larger than one packet can handle, you will need to send it in multiple packets. It helps significantly if your data ALWAYS ends with some sort of END byte. For us, our end byte gives the size of the total byte array so we can check that at the end of reading.
Create a loop that checks for a packet constantly and stops when it receives that end byte (would also be good to have a timeout for that loop).
Make sure to clear the "buffer" when you start a new read.
It is nice to have an "isBusy" boolean to keep track of whether another function is currently waiting to read from the device. This prevents read overlaps. For us, if the port is currently busy, we wait a half second and try again.
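A minimal sketch of that flow (Swift, with made-up names; the end-byte handling and the timeout are assumptions, since the exact framing depends on your device):

    import Foundation

    // Rough sketch: accumulate BLE notification chunks until the end byte arrives.
    // Here the final byte is assumed to encode the expected total length, so it
    // doubles as a sanity check; adapt this to whatever framing your peripheral uses.
    final class ChunkAssembler {
        private var buffer = Data()
        private(set) var isBusy = false          // prevents overlapping reads
        var onMessage: ((Data) -> Void)?
        var onTimeout: (() -> Void)?

        func beginRead(timeout: TimeInterval = 0.5) -> Bool {
            guard !isBusy else { return false }  // port busy: caller waits and retries
            isBusy = true
            buffer.removeAll()                   // always start with a clean buffer
            // a production version would tie this timeout to a specific read (e.g. a token)
            DispatchQueue.main.asyncAfter(deadline: .now() + timeout) { [weak self] in
                guard let self = self, self.isBusy else { return }
                self.isBusy = false
                self.onTimeout?()                // gave up waiting for the end byte
            }
            return true
        }

        // Feed each chunk from peripheral(_:didUpdateValueFor:error:) in here.
        func append(_ chunk: Data) {
            guard isBusy else { return }
            buffer.append(chunk)
            if let declaredLength = buffer.last, Int(declaredLength) == buffer.count {
                isBusy = false
                onMessage?(buffer.dropLast())    // payload without the end byte
            }
        }
    }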
Hope this helps!
I'm here again with a new question; this time about PLCs.
I should start by saying I'm new to PLCs and had never seen one until a couple of months ago.
I've been asked to write a Delphi program that reads some data from a Siemens S7-300 PLC in order to archive it in a SQL Server database. I'm using the "libnodave" library.
The program is quite simple: I have to check a flag bit, and when it is on I read the data from the PLC and reset the bit. With that library I can read and write without problems, but the data I need is stored in a block of about 60 bytes, so I have to read some bytes, skip some others, and read more bytes. Moreover, the bit I have to test is at the end of this block.
So I read the entire block, put the data into a group of variables, then test the bit and, if it is on, store the data in the database.
To skip the bytes I don't need to read, I use statements like these:
    // read and discard 14 bytes from the result buffer to skip over them
    for i := 1 to 14 do
      daveGetU8(dc);
    // then skip 6 words (16-bit values) the same way
    for i := 1 to 6 do
      daveGetU16(dc);
My questions are these:
Is there a better way to read the data while skipping the parts I don't need?
Is it better to read the entire block of bytes and then test the bit, or to do two separate reads?
I ask because I've read on the internet that read operations take some time, so it's better to do as few reads as possible.
Eros
Communicating with a PLC involves some overhead.
You send a request and after some time you receive an answer.
Often the communication is through a serial line with limited bandwidth.
The timing then involves:
Time to send the request
Time for the PLC to respond
Time to transfer the response
It is difficult to give a definite answer to your questions, since we don't know how critical the timing is.
Anyway, polling only the flag byte seems like a reasonable way to go.
When the flag is set, read the entire block in one command and then clear the flag.
Reading the data in small parts to avoid the gaps is probably more time-consuming than reading the entire block at once.
You can do the maths yourself since you know the specifications.
Example:
Let's say the baud rate is 9600 baud. This means roughly 1 byte per millisecond of transfer time. The command to read is about 10 bytes long and the block answer about 70 bytes (assuming the protocol is binary). The PLC delay time is about 50 ms. This adds up to about 130 ms, while reading only the flag adds up to about 70 ms.
Only you can say if the additional polling time of 70 ms is acceptable.
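For what it's worth, the same back-of-the-envelope estimate in code (Swift; the 1 ms/byte and 50 ms figures are just the assumptions from the example above, not measurements):

    // Rough serial-line estimate: transfer time for request + response plus PLC delay.
    func transactionMs(requestBytes: Int, responseBytes: Int,
                       msPerByte: Double = 1.0,      // roughly 9600 baud
                       plcDelayMs: Double = 50) -> Double {
        Double(requestBytes + responseBytes) * msPerByte + plcDelayMs
    }

    let pollFlagOnly  = transactionMs(requestBytes: 10, responseBytes: 10)  // about 70 ms
    let readFullBlock = transactionMs(requestBytes: 10, responseBytes: 70)  // about 130 ms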
Edit: In a comment it is stated that the communication is via Ethernet on a 100+ Mbit/s line. In that case, I suggest reading all the data in one command and processing it on the PC. Timing is of little concern with that much bandwidth.
I'm writing an iOS/iPhone app that communicates with an sp-10c IMU via Bluetooth Low Energy. The packets I am receiving were originally 61 bytes long, but I have shortened them to 38 bytes so that each packet is sent in the minimum number of notifications.
I have no control over the actual programming of the sp-10c, but it does give me control over some of the settings. I can change what information I receive (accelerations, gyro rates, quaternions, etc.) and thereby change the packet size, and I can also change the transmission interval (in 10-millisecond increments). I have read a lot of the questions and answers on here related to this subject, but the only solutions I have seen require having programming control over the peripheral device, which I do not have.
I have also read Apple's handbook on this, which states that the minimum interval should be 20 ms. If I could reliably get entire packets at 50 Hz, that would work for my needs. Anything less is pretty much worthless. My problem is that I'm getting intervals of massive packet loss at the 20 ms interval (there are times when 40 or more packets are lost, and this happens regularly). I really don't see reliable results until I slow it down to an interval of 60 ms or more, which, as I said, is worthless. (I am trying to sense a sharp change in acceleration that lasts about 20-40 ms.)
Most of the questions and answers on here are somewhat dated, but I saw where someone said that things may have gotten worse as far as BLE connections on Apple devices go. It has been suggested that Classic Bluetooth is the only way to go when large amounts of throughput are needed, but the sp-10c does not offer Classic Bluetooth as an option.
My questions are:
Is this normal behavior for a BLE connection with Apple devices?
Has anyone had any success getting reliable BLE notifications at 20 ms with longer packets?
Should I give up now and try to find an IMU with Classic Bluetooth?
I guess what I'm really getting at is this: Am I doing something wrong, or is this just par for the ble/iPhone course?
Would it help to try and limit the packet to less than 20 bytes so it is received in one notification?
We have a critical need to lower the latency of our UDP listener on iOS.
We're implementing an alternative to RTP-MIDI that runs on iOS but relies on a simple UDP server to receive MIDI data. The problem we're having is that RTP-MIDI is able to receive and process messages around 20 ms faster than our simple UDP server on iOS.
We wrote 3 different code bases in order to eliminate the possibility that something else in the code was causing the unacceptable delays. In the end we concluded that there is a lag between the time when the iPad actually receives a packet and when that packet is presented to our application for reading.
We measured this with a scope. We put a pulse on one of the probes from the sending device every time it sent a Note-On command. We attached another probe to the audio output of the iPad. We triggered on the pulse and measured the amount of time it took to hear the audio. The resulting timing was a reliable average of 45 ms, with a minimum of 38 ms and a maximum of around 53 ms in rare situations.
We did the exact same test with RTP-MIDI (a far more verbose protocol) and it was 20 ms faster. The best hunch I have is that, being part of CoreMIDI, RTP-MIDI could be getting higher priority than our app, but simply acknowledging this doesn't help us. We really need to figure out how to fix this. We want our app to be just as fast as RTP-MIDI, if not faster, and I think this should be theoretically possible since our protocol will not be as messy. We've declared RTP-MIDI to be unacceptable for our application due to the poor design of its journal system.
The 3 code bases that were tested were:
Objective-C implementation derived from the PGMidi example which would forward data received on UDP verbatim via virtual midi ports to GarageBand etc.
Objective-C source base written by an experienced audio engine developer with a built-in low-latency sine wave generator for output.
Unity3D application with a Mono-based UDP listener and built-in sound-font synthesizer plugins.
All 3 implementations showed identical measurements on the scope test.
Any insights on how we can get our messages faster would be greatly appreciated.
NEWER INFORMATION in the search for answers:
I was digging around for answers and found this question, which seems to suggest that iOS might respond more quickly if the communication were TCP instead of UDP. This would take some effort to test on our part because our embedded system has only UDP, not TCP. I am curious whether I could hold open a TCP connection for the sole purpose of keeping the Wi-Fi responsive. Crazy idea? I don't know. Has anyone tried this? I need this to be as real-time as possible.
Answering my own question here:
In order to keep the UDP latency down, it turns out all I had to do was make sure the Wi-Fi doesn't go silent for more than roughly 150 ms. The exact timing requirements are unknown to me at this time; the initial tests I was running sent packets 500 ms apart, and that was too long. When I increased the packet rate to one every 150 ms, the UDP latency was on par with RTP-MIDI, giving a total lag time of around 18 ms on average (vs. 45 ms) using the same measurement technique I described in the original question.
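A sketch of that kind of keep-alive (Swift; the host, port and 100 ms interval are placeholders for illustration only):

    import Foundation
    import Network

    // Hypothetical keep-alive: send a tiny UDP datagram often enough that the
    // Wi-Fi radio never sits idle for the ~150 ms that seemed to trigger the latency.
    let keepAliveConnection = NWConnection(host: "192.168.1.50", port: 9000, using: .udp)
    keepAliveConnection.start(queue: .global())

    let keepAliveTimer = DispatchSource.makeTimerSource(queue: .global())
    keepAliveTimer.schedule(deadline: .now(), repeating: .milliseconds(100))
    keepAliveTimer.setEventHandler {
        // a single throwaway byte is enough; the payload content is irrelevant
        keepAliveConnection.send(content: Data([0x00]),
                                 completion: .contentProcessed { _ in })
    }
    keepAliveTimer.resume()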
In the article "Teach Yourself Programming in Ten Years" Peter Norvig (Director of Research, Google) gives the following approximate timings for various operations on a typical 1GHz PC back in 2001:
execute single instruction = 1 nanosec = (1/1,000,000,000) sec
fetch word from L1 cache memory = 2 nanosec
fetch word from main memory = 10 nanosec
fetch word from consecutive disk location = 200 nanosec
fetch word from new disk location (seek) = 8,000,000 nanosec = 8 millisec
What would the corresponding timings be for your definition of a typical PC desktop anno 2010?
Cache and main memory have gotten faster. Disks have higher sequential bandwidth. And SSDs have much lower seek time.
The original list is pretty crummy, though: he's mixing latency measures (like seek time) with 1/throughput (you're dreaming if you think you can round-trip to the disk controller in 200 ns, even if the data is already in cache and requires no head movement).
None of the latencies have really changed. The single instruction and L1 latency are actually longer than the figures he gave, but you get multiple instructions working in parallel (pipelining) and several words fetched from cache for the price of one. Similarly for disk transfers, you'll get consecutive blocks delivered in much more rapid succession, but the wait time after issuing a request hasn't changed much, unless you've moved to SSD.
CPU architecture has changed enough, though, that trying to put a single number on any of these is a loss. Different instructions require very different execution times, and data dependencies control the throughput you see. Cache behavior is dominated by the cost of sharing between multi-core CPUs. And so on.
The main thing I would try to obtain from those timings is the differences in scale between them: Memory is around an order of magnitude slower than executing code directly on the CPU, and disk is several orders of magnitude slower than that.
Many developers still think of optimization in terms of CPU time. But with one cache miss, your CPU idles for at least 10 clock cycles, given the above timings. A hard page fault would require 8 million clock cycles. This is why optimizing for memory usage (to reduce page faults) and optimizing your data layout (to reduce cache misses) will often have a higher payback than any optimizations that focus just on code flow.
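A quick way to see that scale difference for yourself (Swift sketch; the absolute numbers vary wildly by machine, only the ratio matters):

    import Foundation

    // Sum the same array twice: once sequentially (cache-friendly) and once in a
    // shuffled order (cache-miss heavy). The work is identical; only locality differs.
    let count = 1 << 24                              // ~16M Ints, larger than any cache
    let values = [Int](repeating: 1, count: count)
    var order = Array(0..<count)
    order.shuffle()

    func measure(_ body: () -> Int) -> (sum: Int, seconds: Double) {
        let start = DispatchTime.now()
        let s = body()
        let elapsed = Double(DispatchTime.now().uptimeNanoseconds - start.uptimeNanoseconds) / 1e9
        return (s, elapsed)
    }

    let sequential = measure { values.reduce(0, +) }
    let scattered = measure {
        var s = 0
        for i in order { s += values[i] }            // same additions, random memory order
        return s
    }
    print("sequential \(sequential.seconds)s, scattered \(scattered.seconds)s")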