Snopt Exceeds Time Limit - drake

As a follow-up to this post, I've successfully passed the "Time limit" option through Drake and into Snopt. I've verified this by checking the debug output file that Snopt generates and seeing that my specified time limit was set.
However, I notice that Snopt often exceeds my time limit (e.g. a specified time limit of 0.5 s, but an actual runtime of 1 s). This is actually documented in Snopt, since the time limit is "only checked every 20 minor iterations".
I'm purely looking for advice from those who have worked with Snopt: are there any tricks that can force Snopt to stay closer to the time limit? One immediate idea is to run the solver in a separate thread that is killed at the time limit (sketched below), but are there any other tricks using parameters or modifications to the optimization problem? I'm trying to solve IK for a 7-DOF arm with self-collision avoidance.
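For what it's worth, here is a minimal sketch (Python with pydrake; build_ik_program is a hypothetical helper standing in for however you construct the IK problem, and the SNOPT option names should be checked against your installation's manual) of that "kill it from outside" idea: let SNOPT keep its own "Time limit", but enforce a hard wall-clock cutoff from the parent process.

import multiprocessing as mp

def _solve_ik(conn, target_pose, time_limit_s):
    # Imports and problem construction happen inside the child so nothing
    # unpicklable has to cross the process boundary.
    from pydrake.solvers import SnoptSolver
    prog, q_vars = build_ik_program(target_pose)   # hypothetical helper
    prog.SetSolverOption(SnoptSolver.id(), "Time limit", time_limit_s)
    # Tightening iteration limits is another indirect way to bound runtime.
    prog.SetSolverOption(SnoptSolver.id(), "Major iterations limit", 200)
    result = SnoptSolver().Solve(prog)
    conn.send(result.GetSolution(q_vars) if result.is_success() else None)

def solve_ik_with_hard_timeout(target_pose, time_limit_s=0.5, slack=1.5):
    parent, child = mp.Pipe()
    proc = mp.Process(target=_solve_ik, args=(child, target_pose, time_limit_s))
    proc.start()
    # Give SNOPT a little slack past its own limit, then cut it off.
    if parent.poll(time_limit_s * slack):
        q = parent.recv()
    else:
        q = None
        proc.terminate()          # SNOPT overran its budget; discard this attempt
    proc.join()
    return q

Tightening "Major iterations limit" or "Minor iterations limit" (both standard SNOPT options) is the other commonly used lever; it does not bound wall-clock time directly, but it often keeps individual solves shorter.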

Related

Break points of load testing

My application gets overloaded or becomes unable to perform actions after some time. What types of errors will we see, and how do we identify the application's breaking points with a load test? What types of tests can we run to identify breaking points? Thanks in advance.
If you want to overload a web front-end application, you can try setting up concurrent users in a Selenium test and seeing how it breaks.
If you want to test back-end applications, you could write unit/integration tests in a multi-threaded approach and hit the application with a lot of queries.
Your question does, however, need to be a bit more specific or provide some additional info.
There are 2 main performance testing types:
Load testing - when you put the system under the anticipated load, i.e. exactly mimic its real usage by real users and check whether it is capable of supporting X concurrent users while providing reasonable response times.
Stress testing - when you identify the application under test's boundaries and breaking points by putting it under heavier load. That is, start from the anticipated number of users (if you don't have an "anticipated" number, start from 1) and gradually increase the load while keeping an eye on performance metrics.
Ideally, when you increase the load by a factor of 2x, the throughput (number of requests per second) should increase by the same factor. When you increase the load but throughput does not increase, it means you have found the so-called saturation point - basically the maximum number of users your system can efficiently support before degradation.
If you continue increasing the load you will observe increased response times, and errors can start occurring. When response times start exceeding the maximum defined in your NFRs or SLA, you can call this the "breaking point".
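As a rough illustration of that ramp-up procedure, here is a sketch in Python (the endpoint URL, step sizes, and request counts are placeholders, and error handling is omitted) that doubles the number of concurrent users until throughput stops scaling:

import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:8080/api"   # placeholder endpoint

def one_request():
    start = time.perf_counter()
    with urllib.request.urlopen(URL, timeout=10) as resp:
        resp.read()
    return time.perf_counter() - start

def run_step(users, requests_per_user=20):
    with ThreadPoolExecutor(max_workers=users) as pool:
        t0 = time.perf_counter()
        latencies = list(pool.map(lambda _: one_request(),
                                  range(users * requests_per_user)))
        elapsed = time.perf_counter() - t0
    return len(latencies) / elapsed, max(latencies)   # throughput, worst latency

users, prev_throughput = 1, 0.0
while users <= 256:
    throughput, worst = run_step(users)
    print(f"{users:4d} users: {throughput:8.1f} req/s, worst {worst:.2f}s")
    # Saturation: load doubled but throughput barely moved.
    if prev_throughput and throughput < 1.2 * prev_throughput:
        print("Saturation point reached around", users, "users")
        break
    prev_throughput = throughput
    users *= 2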
There is also one more "interesting" performance testing type - soak testing - which is basically the same as load testing (or with a few more users) but over a prolonged period of time; this way you can detect the majority of memory leaks.

How to improve accuracy of profiling

I want to improve the running time of some code.
In order to do that, I first time the running time of all relevant code, using code like this:
before := rdtsc;   // read the time-stamp counter before the run
myobject.run;
after := rdtsc;    // ...and after, so (after - before) is the total cycle count
Then I zoom in and time a relevant part, like so:
procedure myobject.part;
begin
  StartTime := rdtsc;
  ...               // the code being measured
  EndTime := rdtsc;
  inc(TotalTime, EndTime - StartTime);   // accumulate cycles across all calls
end;
I have some code to copy-paste the timings into Excel; a typical outcome would look like this:
(The 89.8% and 10.2% adding up to 100% is a coincidence and has nothing to do with the data or the question.)
(When the data shows 1 it means 0; this is to avoid divide-by-zero errors.)
Note the difference between run A and run B.
I have not changed anything yet so run A and B should give the same running time.
Further note that I know that on both runs procedure part was invoked exactly the same number of times (the data is the same and the algorithm is deterministic).
The running time of procedure part is very short (it is just called many times).
If there was some way to block out other processes during these short bursts of runtime (less than 700 CPU cycles) my timings would be much more accurate.
How do I get these timings to be more reliable?
Is there a way to monopolize the CPU to only run my task when timing and nothing else?
Note that I'm not looking for obvious answers like:
- Close other running programs
- Disable the virus scanner, etc.
I've tagged the question Delphi because I'm using Delphi right now (and there may be some Delphi specific option to achieve this result).
I've also tagged it language-agnostic because there may be some more general way.
Update
Because I'm using the CPU instruction RDTSC I'm not affected by CPU throttling. If the CPU slows down, the number of cycles stays the same.
Update2
I have 2 answers, but neither answers the question...
The question is how do I prevent these changes in running time?
Do I have to run the code 20x and always compare the lowest running time out of the 20 runs?
Or do I set my program priority to realtime?
Or is there some other trick to use so my code sample does not get interrupted?
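For what it's worth, those last ideas combined look something like this language-agnostic sketch (written in Python with the psutil library, neither of which appears in the question; the priority-class constant is Windows-only): pin the process to one core, raise its priority, and keep the minimum of N repeated timings.

import time
import psutil

def measure(fn, repeats=20):
    proc = psutil.Process()
    proc.cpu_affinity([0])                     # stay on a single core
    proc.nice(psutil.HIGH_PRIORITY_CLASS)      # Windows-only constant; REALTIME is riskier
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter_ns()
        fn()
        timings.append(time.perf_counter_ns() - t0)
    # The minimum is the run least disturbed by other processes and interrupts.
    return min(timings)

print(measure(lambda: sum(range(100_000))), "ns (best of 20)")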
I want to improve the running time of some code.
In order to do that I first time the running time of all relevant code, ...
OK, I'm a bit of a stuck record on this subject, but lots of people think that improving running time requires first measuring it accurately.
Not So.
Improving running time requires finding out what's taking a large fraction of time (the exact fraction does not matter) and doing it differently or maybe not at all.
What it's doing is often not revealed by timing individual routines.
Here's the method I use,
and here's a very amateur video of it.
The problem with profiling your code like that, by sticking special statements into it, is that those special statements themselves take time to run. And since the things taking the most time are likely to be things happening in tight loops, the more they run, the more they distort your timings. What you need for good information is something that will observe your program from outside, without modifying the executing code.
In other words, you need a sampling profiler. And there just happens to be a very good one for Delphi available for free, by the rather descriptive name of Sampling Profiler. It runs your program and watches what it's doing, then correlates that against the map file (make sure to set up your project options to generate a Detailed map file) to give you an intelligible readout on what your program is spending its time on.
And if you want to narrow things down, you can use OutputDebugString to output profiling commands to make it only pay attention to specific parts of your code. It's got instructions in the help file.
I've used a lot of different methods, and this is the most useful way I've found to figure out what Delphi programs are spending their time on. And it's free. Give it a try.

Which factors affect the speed of cpu tracing?

When I use YJP to do CPU-tracing profiling on our own product, it is really slow.
The product runs on a 16-core machine with an 8 GB heap, and I use Grinder to run a small load test (e.g. 10 Grinder threads) which has about 7-10 steps during the profiling. I have a script that starts the product with the profiler, starts profiling (using the controller API), and then starts Grinder to emulate user operations. When all the operations finish, the script tells the profiler to stop profiling and save a snapshot.
During the profiling, each step in the Grinder test takes more than 1,000,000 ms to finish. The whole profiling run often takes more than 10 hours with just 10 Grinder threads, each running the test 10 times. Without the profiler, it finishes within 500 ms.
So... besides the problems with the product being profiled, is there anything else that affects the performance of the CPU tracing process itself?
Last I used YourKit (v7.5.11, which is pretty old; the current version is 12) it had two CPU profiling settings: sampling and tracing, the former being much faster but less accurate. Since tracing is supposed to be more accurate I used it myself and also observed a huge slowdown, in spite of the statement that the slowdown would be "average". Yet it was far less than your results: from 2 seconds to 10 minutes. My code is a fragment of a calculation engine, with virtually no I/O and no waits on anything, just reading an input, calculating, and writing the result to the console - so the whole slowdown comes from the profiler, with no external influences.
Back to your question: the option mentioned - sampling vs. tracing - will affect performance, so you may try sampling.
Now that I think of it: YourKit can be set up so that it does things automatically, like taking snapshots periodically or on low memory, profiling memory usage, or tracking object allocations; each of these will make profiling slower. Perhaps you should run an online session instead of a script-controlled one, to see what it really does.
According to the YourKit docs:
Although tracing provides more information, it has its drawbacks. First, it may noticeably slow down the profiled application, because the profiler executes special code on each enter to and exit from the methods being profiled. The greater the number of method invocations in the profiled application, the lower its speed when tracing is turned on.
The second drawback is that, since this mode affects the execution speed of the profiled application, the CPU times recorded in this mode may be less adequate than times recorded with sampling. Please use this mode only if you really need method invocation counts.
Also:
When sampling is used, the profiler periodically queries stacks of running threads to estimate the slowest parts of the code. No method invocation counts are available, only CPU time.
Sampling is typically the best option when your goal is to locate and discover performance bottlenecks. With sampling, the profiler adds virtually no overhead to the profiled application.
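To see why tracing gets slower as the invocation count grows, here is a toy illustration in Python (not YourKit; sys.setprofile stands in for the per-call tracing hook): the hooked run pays a fixed cost on every call and return, so code that makes millions of tiny calls slows down dramatically, while a sampler's cost depends only on its sampling interval.

import sys
import time

def helper(i):
    return i * i

def hot(n):
    total = 0
    for i in range(n):
        total += helper(i)      # millions of tiny calls
    return total

def timed(label, fn):
    t0 = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - t0:.2f} s")

calls = 0
def trace_hook(frame, event, arg):
    global calls
    if event == "call":
        calls += 1              # "special code on each enter" -> per-call cost

timed("no profiling  ", lambda: hot(2_000_000))

sys.setprofile(trace_hook)      # tracing-style: hook every call/return
timed("tracing-style ", lambda: hot(2_000_000))
sys.setprofile(None)
print("calls seen by tracing hook:", calls)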
Also, it's a little confusing what the doc means by "CPU time", because it also talks about "wall-clock time".
If you are doing any I/O, waits, sleeps, or any other kind of blocking, it is important to get samples on wall-clock time, not CPU-only time, because it's dangerous to assume that blocked time is either insignificant or unavoidable.
Fortunately, that appears to be the default (though it's still a little unclear):
The default configuration for CPU sampling is to measure wall time for I/O methods and CPU time for all other methods.
"Use Preconfigured Settings..." allows to choose this and other presents. (sic)
If your goal is to make the code as fast as possible, don't be concerned with invocation counts and measurement "accuracy"; do find out which lines of code are on the stack a large fraction of the time, and why.
More on all that.

How accurate is Delphi's 'Now' timestamp

I'm going to be working on a project that will require (fairly) accurate time-stamping of incoming RS232 serial and network data from custom hardware. As the data will be coming from a number of independent hardware sources, I will need to timestamp all data so it can be deskewed/interpolated to a nominal point in time.
My immediate thought was just to use the built-in Now command to timestamp, but a quick Google search seems to indicate that this is only going to be accurate to around 50 msecs or so.
Unfortunately, the more I read the more confused I become. There seems to be a lot of conflicting advice on GetTickCount and QueryPerformanceCounter, with complications due to today's multicore processors and CPU throttling. I have also seen posts recommending the Windows multimedia timers, but I cannot seem to find any code snippets for doing this.
So, can anyone advise me:
1) How accurate 'Now' will be.
2) Whether there is a simple, higher accuracy alternative.
Note: I would be hoping to timestamp to within, say, 10 milliseconds, and I am not looking for a timer as such, just a better time-stamping method. This will be running on a Windows 7 32-bit low-power micro-PC. I will be using either Delphi XE or Delphi 2007, if it makes any difference.
According to the documentation, Now is accurate only to the nearest second:
Although TDateTime values can represent milliseconds, Now is accurate only to the nearest second.
Despite this, looking at the current implementation, Now is as accurate as the GetLocalTime Windows API allows.
A quick test shows that Now returns values for each millisecond on the clock, for example:
var
  I: Integer;
begin
  System.SysUtils.FormatSettings.LongTimeFormat := 'hh:mm:ss.zzz';
  for I := 1 to 5000 do
    Writeln(TimeToStr(Now()));
end.
When I executed this console program from the command line as project1 >times.txt on a Windows 7 64-bit machine, I got a file that runs through 29 consecutive milliseconds with none missing from the file.
You have to face the fact that, running in a Windows environment, your application/thread may get processor slices with varying gaps in between, depending on how busy the system is and the priority of your application/threads versus all the other threads running in the system.
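If sub-10 ms stamps are needed, one common pattern (sketched below in Python purely for illustration; in Delphi the same idea maps to QueryPerformanceCounter/QueryPerformanceFrequency plus a one-time Now anchor) is to read the wall clock once and then derive every subsequent timestamp from a high-resolution monotonic counter, so individual stamps are not limited by the roughly 15 ms system clock tick.

import time
from datetime import datetime, timedelta, timezone

class HighResClock:
    def __init__(self):
        # One-time anchor: pair a wall-clock reading with a counter reading.
        self._wall_anchor = datetime.now(timezone.utc)
        self._perf_anchor = time.perf_counter()

    def now(self):
        # Offset the anchor by the elapsed counter time (sub-microsecond resolution).
        elapsed = time.perf_counter() - self._perf_anchor
        return self._wall_anchor + timedelta(seconds=elapsed)

clock = HighResClock()
stamp = clock.now()          # e.g. tag each incoming serial packet with this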

Interpretation of results of "Energy Usage" instrument tool

I am running "energy usage" instrument over ios application using a device, I wanted to use it to check how much battery is getting drained because of the app I am testing. It shows "Energy usage level" which is giving me numbers like 13/20 , 12/20 , etc over different points of time.
How to interpret the results(I know, it gives relative energy usage on a scale of 0-20) in terms of :
1) How much battery is getting drained because of the app and particular operation.
2) Which operation / function is causing this drain.
3) What number is considered as safe and what number should be considered as high / too high.
4) Any other conclusion that we can make ?
I would appreciate if some one can answer above questions or give me link for reference. I have searched around and could not find answers to above questions, I just found how to find out those relative energy usage numbers only.
My 2 cents:
1) You can create a UIAutomation script to repeatedly run some actions and collect the "energy usage" for each action, so that you can say "a 5-minute call takes xxx battery", "navigating for 5 minutes takes xxxx battery", and so on.
2) As mentioned above, you can collect data for each action.
3) I would say, try to find similar apps, benchmark them, and compare with yours.
4) Try different devices and iOS versions, and you can probably tell customers what device/iOS version is the minimum required or recommended.
The power consumption numbers that Energy Diagnostics reports (we call them "electricities" at my office) are fairly unreliable. Powergremlin gives you some insight into the actual numbers that make up said "electricity" units. That won't answer parts 2-4 of your question, but it does provide more detail and accuracy than Energy Diagnostics.
The battery consumption scale for an iOS app has a maximum of 20 points.
If your app is running at 1/20, it means your app would take 20 hours to drain the battery;
if it is running at 20/20, it would take 1 hour to drain the full battery.
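By that rule, the 13/20 reading mentioned in the question would correspond to roughly 20/13 ≈ 1.5 hours to drain a full battery at that rate, assuming the scale really is linear in the drain rate.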
