How accurate is Delphi's 'Now' timestamp?

I'm going to be working on a project that will require (fairly) accurate time-stamping of incoming RS232 serial and network data from custom hardware. As the data will be coming from a number of independent hardware sources, I will need to timestamp all data so it can be deskewed/interpolated to a nominal point in time.
My immediate thought was just to use the built-in Now function to timestamp, but a quick Google search seems to indicate that this is only going to be accurate to around 50 ms or so.
Unfortunately, the more I read, the more confused I become. There seems to be a lot of conflicting advice on GetTickCount and QueryPerformanceCounter, with complications due to today's multicore processors and CPU throttling. I have also seen posts recommending the Windows multimedia timers, but I cannot seem to find any code snippets that do this.
So, can anyone advise me:
1) How accurate 'Now' will be.
2) Whether there is a simple, higher accuracy alternative.
Note: I am hoping to timestamp to within, say, 10 milliseconds, and I am not looking for a timer as such, just a better time-stamping method. This will be running on a low-power micro-PC with 32-bit Windows 7. I will be using either Delphi XE or Delphi 2007, if it makes any difference.

According to the documentation, Now is accurate only to the nearest second:
Although TDateTime values can represent milliseconds, Now is accurate only to the nearest second.
Despite this, looking at the current implementation, Now is as accurate as the GetLocalTime windows API could be.
Making a quick test, it shows Now returns values for each millisecond in the clock, for example:
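program Project1;
{$APPTYPE CONSOLE}
uses System.SysUtils;
var I: Integer;
begin
  // Include milliseconds in the long time format used by TimeToStr.
  System.SysUtils.FormatSettings.LongTimeFormat := 'hh:mm:ss.zzz';
  for I := 1 to 5000 do
    Writeln(TimeToStr(Now()));
end.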
When I ran this console program from the command line as project1 >times.txt on a 64-bit Windows 7 machine, I got a file that covers 29 consecutive milliseconds, with none missing.
You have to face the fact that in a Windows environment your application/thread may get processor slices with varying gaps in between, depending on how busy the system is and on the priority of your application/threads relative to all the other threads running in the system.
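One alternative the question mentions is QueryPerformanceCounter. The following is a minimal sketch, not a tested production routine: it takes a single Now reading as a base and adds the elapsed counter time to it. The names InitStamp, StampNow, BaseTime and BaseCount are made up for illustration, and the unit names assume Delphi 2007/XE.

```pascal
program QpcStamp;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;  // Winapi.Windows, System.SysUtils in XE2 and later

var
  Freq, BaseCount: Int64;
  BaseTime: TDateTime;

// Captures one wall-clock reading and the matching counter value.
procedure InitStamp;
begin
  if not QueryPerformanceFrequency(Freq) then
    raise Exception.Create('No high-resolution counter available');
  BaseTime := Now;
  QueryPerformanceCounter(BaseCount);
end;

// Wall-clock base plus elapsed counter time, expressed as a TDateTime.
function StampNow: TDateTime;
var
  Count: Int64;
begin
  QueryPerformanceCounter(Count);
  Result := BaseTime + ((Count - BaseCount) / Freq) / SecsPerDay;
end;

var
  I: Integer;
begin
  InitStamp;
  for I := 1 to 5 do
  begin
    Writeln(FormatDateTime('hh:nn:ss.zzz', StampNow));
    Sleep(3);
  end;
end.
```

Stamps produced this way are consistent with each other, which is what deskewing needs, but they will slowly drift from the system clock and do not follow clock adjustments. The scheduling caveat above still applies: the stamp records when your thread ran, not when the data arrived at the port.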

Related

Alternatives to D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS?

This is a follow-on to this question about using the DX11VideoRenderer sample (a replacement for EVR that uses DirectX11 instead of EVR's DirectX9).
I've been trying to track down why it uses so much more CPU than the EVR. Task Manager shows me that most of that time is kernel mode.
Using profiling tools, I see that a LOT of time is being spent in numerous calls to NtDelayExecution (aka Sleep). How many calls? ~100,000 over the course of ~12 seconds. Ok, yeah, I'm sending a lot of frames in those 12 seconds, but that's still a lot of calls, every one of which requires a kernel mode transition.
The call stack shows the last call in "my" code is to IDXGISwapChain1::Present(0, 0). The actual call seems to be Sleep(0) and comes from nvwgf2umx.dll (which is why this question is tagged NVidia: hopefully someone there can call up the code and see what the logic is behind such frequent calls).
I couldn't quite figure out why it would need to do /any/ Sleeping during Present. It's not like we wait for vertical retrace anymore, is it? But the other reason to use Sleep has to do with yielding to other threads. Which led me to a serious clue:
If I use D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS, the CPU utilization drops. Along with some other fixes, the DX11 version is now faster and uses less CPU time than the DX9 version (which is what I would hope/expect). Profiling shows that Sleep has dropped from >30% to <1%.
Unfortunately, this page tells me:
This flag is not recommended for general use.
Oh.
So, any ideas on how to get decent performance without using debug flags?

Control motor/position over slow bus

I seem to have coded myself into a corner with the following issue: I'm trying to control a motor on a robot through a slow RS485-based bus connection. Unfortunately, I don't have access to the firmware on the motor, so I'm stuck with the current setup.
The biggest issue is that I can only control the motor's target speed. While I can retrieve its absolute position through a built-in encoder, there is no positioning function built into the firmware on the motor itself.
The second issue is that the bus connection is really slow, the somewhat awkward protocol needs 25 ms for a full cycle - is controlling a position via speed adjustments even feasible this way?
I have tried a naive approach of estimating the position 25 ms ahead, subtracting the current position, and dividing by 25 ms to calculate the speed required to reach the next desired position. However, this oscillates badly at certain speeds when targeting a fixed position, I assume because the long cycle time produces a lot of overshoot.
Maybe a PID controller could help, but I am unsure what the target value would be -- every PID I have used so far used a fixed target. A completely moving target (i.e. the position) is hard to imagine, at least for me.
What's the usual way to deal with a situation like this? Maybe combine the naive approach and add PID-control only for an additional offset term? Or do I need to buy different motors?
If you want to keep the benefits of RS485 (it has some real advantages), then you likely need to rethink how you drive this motor.
It might be easier to change the motor control so that you only have to send the end position as numeric data and leave it to your smart controller to handle the rest. In that case your RS485 communication is minimal.
In industrial environments I tend to keep the "brains" where they are needed, so that you keep your I/O down; otherwise you someday end up with behemoths such as industrial Ethernet.
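If the firmware cannot be changed, the position loop has to run on the host. The naive scheme in the question amounts to a proportional loop that tries to remove the whole error in one 25 ms cycle; a common alternative is to remove only a fraction of the error per cycle. The sketch below is a simulation under stated assumptions, not a controller for the real motor: it assumes an ideal motor that reaches the commanded speed instantly and a position reading that is one bus cycle old. SpeedCommand, WorstLateError, LoopGain and MaxSpeed are hypothetical names and values.

```pascal
program SpeedPositionLoop;
{$APPTYPE CONSOLE}
uses
  SysUtils;

const
  CycleSec = 0.025;   // 25 ms bus cycle, from the question
  MaxSpeed = 100.0;   // hypothetical speed limit, position units per second

// Proportional position loop: speed to command for the coming cycle.
// LoopGain is the fraction of the measured error to remove per cycle.
function SpeedCommand(Target, Measured, LoopGain: Double): Double;
begin
  Result := LoopGain * (Target - Measured) / CycleSec;
  if Result > MaxSpeed then
    Result := MaxSpeed
  else if Result < -MaxSpeed then
    Result := -MaxSpeed;
end;

// Simulates a step to position 1.0 and returns the largest absolute
// error seen during the last 10 of Cycles bus cycles.
function WorstLateError(LoopGain: Double; Cycles: Integer): Double;
var
  Position, Stale, Speed: Double;
  K: Integer;
begin
  Position := 0.0;
  Stale := 0.0;
  Result := 0.0;
  for K := 1 to Cycles do
  begin
    Speed := SpeedCommand(1.0, Stale, LoopGain);
    Stale := Position;                       // this reading is used next cycle
    Position := Position + Speed * CycleSec;
    if (K > Cycles - 10) and (Abs(1.0 - Position) > Result) then
      Result := Abs(1.0 - Position);
  end;
end;

begin
  Writeln(Format('gain 1.0: late error %.4f', [WorstLateError(1.0, 40)]));
  Writeln(Format('gain 0.3: late error %.4f', [WorstLateError(0.3, 40)]));
end.
```

Under these assumptions a gain of 1.0 (the naive scheme) keeps oscillating around the target, while a gain of 0.3 settles. For a moving target, the target's own speed can be added to the command as a feed-forward term, so that the loop only has to correct the remaining error.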

Read data from PLC with Delphi and libnodave library

I'm here again with a new question, this time about PLCs.
I should start by saying that I'm new to PLCs and had never seen one until a couple of months ago.
I've been asked to write a Delphi program that reads some data from a Siemens S7-300 PLC in order to archive it in a SQL Server database. I'm using the "libnodave" library.
The program is quite simple. I must check a bit and, when it is on, read the data from the PLC and reset the bit. With the library I mentioned I can read and write without problems, but the data I need is stored in a group of about 60 bytes, so I have to read some bytes, skip some others, and then read more bytes. Moreover, the bit I must test is at the end of this group of bytes.
So I read the entire group of bytes, put the data read into a group of variables, and then test the bit; if it is on, I store the data in the database.
To skip the bytes I don't need, I use statements like these:
for i := 1 to 14 do
  daveGetU8(dc);
for i := 1 to 6 do
  daveGetU16(dc);
My questions are these:
1) Is there a better way to read the data while skipping the bytes I don't need?
2) Is it better to read the entire group of bytes and then test the bit, or to make two separate reads?
I ask because I've read online that each read operation takes some time, so it is better to make as few reads as possible.
Eros
Communicating with a PLC involves some overhead.
You send a request and after some time you receive an answer.
Often the communication is through a serial line with limited bandwidth.
The timing then involves:
- Time to send the request
- Time for the PLC to respond
- Time to transfer the response
It is difficult to give a definite answer to your questions, since we don't know how critical the timing is.
Anyway, polling only the flag byte seems like a reasonable way to go.
When the flag is set, read the entire block in one command and then clear the flag.
Reading the data in small parts to avoid the gaps is probably more time-consuming than reading the entire block at once.
You can do the math yourself, since you know the specifications.
Example:
Let's say the baud rate is 9600 baud. This means roughly 1 byte per millisecond of transfer time. The command to read is about 10 bytes long and the block answer about 70 bytes (assuming the protocol is binary). The PLC delay time is about 50 ms. This adds up to 130 ms, while reading the flag only adds up to about 70 ms.
Only you can say if the additional polling time of 70 ms is acceptable.
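The arithmetic in the example can be written out as a small sketch. TransactionMs is a hypothetical helper, and the reply size of about 10 bytes for the flag-only read is an assumption chosen to match the "about 70 ms" figure; the other numbers come from the example.

```pascal
program PollTiming;
{$APPTYPE CONSOLE}
uses
  SysUtils;

const
  MsPerByte  = 1.0;   // about 1 ms per byte at 9600 baud
  PlcDelayMs = 50.0;  // PLC response delay from the example

// Request transfer + PLC delay + reply transfer, in milliseconds.
function TransactionMs(RequestBytes, ReplyBytes: Integer): Double;
begin
  Result := RequestBytes * MsPerByte + PlcDelayMs + ReplyBytes * MsPerByte;
end;

begin
  // 10 + 50 + 70 = 130 ms for the full block.
  Writeln(Format('Full block: %.0f ms', [TransactionMs(10, 70)]));
  // A reply of about 10 bytes for the flag read is an assumption.
  Writeln(Format('Flag only:  %.0f ms', [TransactionMs(10, 10)]));
end.
```

With other baud rates or block sizes, only the constants and the two byte counts need to change.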
Edit: A comment states that the communication is via Ethernet on a 100+ Mbit/s line. In that case, I suggest reading all the data in one command and processing it on the PC. Timing is of little concern with such bandwidth.
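Once the whole block is in PC memory, skipping unwanted bytes is just a matter of indexing. The sketch below is not libnodave code: it assumes the block has already been copied into a local byte array by a single read, and the field offsets (14 and 59) are invented for illustration. U16At, BitAt and TBlock are hypothetical names. S7 PLCs store multi-byte values high byte first, which is why the bytes are combined in that order.

```pascal
program BlockDecode;
{$APPTYPE CONSOLE}
uses
  SysUtils;

type
  TBlock = array[0..59] of Byte;  // about 60 bytes, as in the question

// Reads a big-endian 16-bit value starting at Offset.
function U16At(const Block: TBlock; Offset: Integer): Word;
begin
  Result := (Word(Block[Offset]) shl 8) or Block[Offset + 1];
end;

// Tests one bit of one byte; BitNo 0 is the least significant bit.
function BitAt(const Block: TBlock; Offset, BitNo: Integer): Boolean;
begin
  Result := (Block[Offset] and (1 shl BitNo)) <> 0;
end;

var
  Block: TBlock;
begin
  // Stand-in data; in the real program the PLC read fills the block.
  FillChar(Block, SizeOf(Block), 0);
  Block[14] := $12;
  Block[15] := $34;
  Block[59] := $01;  // hypothetical "data ready" flag in the last byte
  if BitAt(Block, 59, 0) then
    Writeln(Format('Value at offset 14 = $%.4x', [U16At(Block, 14)]));
end.
```

Decoding by offset avoids the dummy-read loops and lets the flag at the end of the block be tested before any other field is touched.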

How to improve accuracy of profiling

I want to improve the running time of some code.
In order to do that, I first measure the running time of all relevant code, using code like this:
before:= rdtsc;
myobject.run;
after:= rdtsc;
Then I zoom in and time a relevant part, like so:
procedure myobject.part;
begin
  StartTime:= rdtsc;
  ...
  EndTime:= rdtsc;
  inc(TotalTime, (EndTime- StartTime));
end;
I have some code to copy paste the timings into Excel, a typical outcome would look like:
(the 89.8% and 10.2% adding up to 100% is a coincidence and has nothing to do with the data or the question)
(when the data shows 1 it means 0 to avoid divide by zero errors)
Note the difference between run A and run B.
I have not changed anything yet so run A and B should give the same running time.
Further note that I know that on both runs procedure part was invoked exactly the same number of times (the data is the same and the algorithm is deterministic).
The running time of procedure part is very short (it is just called many times).
If there was some way to block out other processes during these short bursts of runtime (less than 700 CPU cycles) my timings would be much more accurate.
How do I get these timings to be more reliable?
Is there a way to monopolize the CPU to only run my task when timing and nothing else?
Note that I'm not looking for obvious answers like:
- Close other running programs
- Disable the virusscanner etc...
I've tagged the question Delphi because I'm using Delphi right now (and there may be some Delphi specific option to achieve this result).
I've also tagged it language-agnostic because there may be some more general way.
Update
Because I'm using the CPU instruction RDTSC I'm not affected by CPU throttling. If the CPU slows down, the number of cycles stays the same.
Update2
I have 2 answers, but neither answers the question...
The question is how do I prevent these changes in running time?
Do I have to run the code 20x and always compare the lowest running time out of the 20 runs?
Or do I set my program priority to realtime?
Or is there some other trick to use so my code sample does not get interrupted?
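The two ideas raised above (take the lowest of several runs, and raise the priority while measuring) can be combined. This is a sketch under assumptions, not a guaranteed fix: it uses QueryPerformanceCounter instead of the question's rdtsc routine, and Workload, FastestRun and TTimedProc are hypothetical names. A high priority and a fixed core reduce interruptions by other threads but cannot rule out hardware interrupts.

```pascal
program MinOfRuns;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;  // Winapi.Windows, System.SysUtils in XE2 and later

type
  TTimedProc = procedure;

// Hypothetical stand-in for the short routine being measured.
procedure Workload;
var
  I, Sum: Integer;
begin
  Sum := 0;
  for I := 1 to 1000 do
    Inc(Sum, I);
  if Sum = 0 then
    Writeln;  // uses Sum so the loop result is not discarded
end;

// Runs Proc the given number of times; returns the fastest run in ticks.
function FastestRun(Proc: TTimedProc; Runs: Integer): Int64;
var
  T0, T1: Int64;
  I: Integer;
begin
  Result := High(Int64);
  for I := 1 to Runs do
  begin
    QueryPerformanceCounter(T0);
    Proc();
    QueryPerformanceCounter(T1);
    if T1 - T0 < Result then
      Result := T1 - T0;
  end;
end;

var
  OldPriority: Integer;
begin
  // Stay on one core and raise the priority only while measuring.
  SetThreadAffinityMask(GetCurrentThread, 1);
  OldPriority := GetThreadPriority(GetCurrentThread);
  SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_TIME_CRITICAL);
  try
    Writeln('Fastest of 20 runs: ', FastestRun(Workload, 20), ' ticks');
  finally
    SetThreadPriority(GetCurrentThread, OldPriority);
  end;
end.
```

The minimum is used rather than the average because interruptions can only make a run slower, so the fastest run is the one least disturbed by other activity.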
I want to improve the running time of some code.
In order to do that, I first measure the running time of all relevant code, ...
OK, I'm a bit of a stuck record on this subject, but lots of people think that to improve running time requires first measuring it accurately.
Not So.
Improving running time requires finding out what's taking a large fraction of time (the exact fraction does not matter) and doing it differently or maybe not at all.
What it's doing is often not revealed by timing individual routines.
Here's the method I use, and here's a very amateur video of it.
The problem with profiling your code like that, by sticking special statements into it, is that those special statements themselves take time to run. And since the things taking the most time are likely to be things happening in tight loops, the more they run, the more they distort your timings. What you need for good information is something that will observe your program from outside, without modifying the executing code.
In other words, you need a sampling profiler. And there just happens to be a very good one for Delphi available for free, by the rather descriptive name of Sampling Profiler. It runs your program and watches what it's doing, then correlates that against the map file (make sure to set up your project options to generate a Detailed map file) to give you an intelligible readout on what your program is spending its time on.
And if you want to narrow things down, you can use OutputDebugString to output profiling commands to make it only pay attention to specific parts of your code. It's got instructions in the help file.
I've used a lot of different methods, and this is the most useful way I've found to figure out what Delphi programs are spending their time on. And it's free. Give it a try.

DSP on Beaglebone

I have a Beaglebone running Ubuntu. We want to continuously sample from 3 on-board ATD converters at 100KS/s, and every window of samples we will run a cross correlation DSP algorithm. Once we find a correlation value above a threshold, we will send the value to a PC.
My concern is the process scheduling in Ubuntu. If our process gets swapped out and an ATD sample becomes available during this time, the process will miss the sample. We need to ensure that our process will capture every sample and save it in memory.
With this being said, is there a way to trigger interrupts on the Beaglebone so that if an ATD sample is ready, the sample will be saved in the memory of our program even if the program does not have the processor at the time?
Thanks!
You might be able to trigger the EDMA or use the PRUSS. Probably best to ask on beagleboard@googlegroups.com. There isn't a DSP per se on the BeagleBone.
This is not exactly an answer to your question, but hopefully it explains how the process works. Since you didn't mention what hardware you are running for AD conversion, maybe this is the best that can be done:
With audio hardware, which faces the same problem, the solution comes from the hardware and the drivers working together: whenever the hardware has filled up enough of the buffer it signals the driver (via an interrupt or some similar mechanism). In some cases, it's also possible that the driver polls the hardware or something like that, but that's a less efficient solution, and I'm not sure anyone does it that way anymore (maybe on cheaper hardware?). From there, the driver process may call right into the end-user process, or it may simply mark the relevant end-user process as "runnable". Either way, control needs to be transferred to the end user process.
For that to happen, the end-user process must be running at a higher priority than anything else occupying the CPUs at that moment. To guarantee that your process will always be first in the queue, you can run it at a high priority; with the appropriate permissions, you can even run it at very high priorities.
The time it takes for the top priority process to go from runnable to running is sometimes called the "latency" of the OS, though I am sure there's a more specific technical term. The latency of Linux is on the order of 1 ms, but since it's not a "hard" real-time OS, this is not a guarantee. If this is too long to handle your chunks of data, you may have to buffer some of it in your driver.

Resources