Why memory leaks with a dynamic array of a custom class? - mql4

I'm creating an indicator that recognizes candlestick shapes.
To do that I created a separate class Candlestick that I include in the indicator file.
The problem is that I suffer from memory leaks.
I'm new to pointers and after reading / watching a lot, I still seem to miss something here.
This is the Indicator class. The content of the Candlestick class is irrelevant so I leave that out.
Candlestick *candles[];

void OnDeinit(const int reason)
{
    for(int i = 0; i < ArraySize(candles); i++ ){
        delete(candles[i]);
    }
}

int OnCalculate(args here)
{
    ArrayResize(candles, Bars);
    for(int i = MathMax(Bars-2-IndicatorCounted(), 1); i >= 0; i--)
    {
        candles[i] = new Candlestick();
        // Do stuff with this candle (and other candles) here e.g.
        if(candles[i+1].type == BULLISH) Print("Last candle was Bullish");
    }
}
When I do this I get memory leak errors. It seems that I need to delete the pointers to the candles in that dynamic array. The problem is, when and where? Because I need them in the next iteration of the for(){...} loop. So I can't delete it there.
When I delete it in the OnDeinit() function there are still candles out there and I still get the leak error.
How come?

First, Nick, welcome to the world of MQL4.
As you may have already realised, MQL4 is not C.
Among the many important differences, the key one here is what the code-execution platform (the MetaTrader 4 Terminal) does, and when.
OnCalculate() is invoked by the terminal many times, at moments that are not under your control.
Also, by design, a call to OnCalculate() does not mean that a new bar has arrived.
How to?
MQL4 conceptually originates from the days when computing resources were orders of magnitude scarcer and CPU time-sharing during code execution was much more expensive.
Thus the MQL4 user-domain language still benefits from some hidden gems that are not directly accessible. One of these is a very efficient register-based update processing that keeps dynamic resource allocations to a minimum, because of their devastatingly adverse effects on real-time execution predictability.
This should help you understand how to design and handle your conceptual objects in a smarter way, best by mimicking this "stone-age" but very efficient behaviour (both time-wise and memory-wise), instead of flooding the memory pool with unmanaged instances. In the code above, the loop re-runs for i = 1 and i = 0 on every call of OnCalculate(), and each pass stores a fresh new Candlestick() in candles[i] without deleting the object that was already there. OnDeinit() can only delete the pointers still held in candles[], so every overwritten instance is leaked.
A good next step:
If in doubt, read about the best practices for ArrayResize() in the platform's built-in help/documentation, to start recognising the operations that introduce overheads (if not outright blocking) in a domain where nanoseconds count and hurt in professional software design.
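The ownership rule behind the leak can be sketched in plain C++ (not MQL4; the sketch assumes only that the same rule holds there, namely that every new needs exactly one matching delete). Candlestick here is a hypothetical stand-in that just counts live instances, and CandleStore is an illustrative name, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's Candlestick class.
// It only counts how many instances are currently alive.
struct Candlestick {
    static int live;
    Candlestick() { ++live; }
    ~Candlestick() { --live; }
};
int Candlestick::live = 0;

// Owns raw pointers, like the indicator's global candles[] array.
class CandleStore {
public:
    CandleStore() = default;
    CandleStore(const CandleStore&) = delete;
    CandleStore& operator=(const CandleStore&) = delete;
    ~CandleStore() { clear(); }

    // Grow-only resize; new slots start empty so deleting them is harmless.
    void resize(std::size_t n) {
        if (n > slots_.size()) slots_.resize(n, nullptr);
    }

    // Free the previous occupant before storing a fresh object.
    void replace(std::size_t i) {
        assert(i < slots_.size());
        delete slots_[i];
        slots_[i] = new Candlestick();
    }

    // Counterpart of the loop in OnDeinit().
    void clear() {
        for (Candlestick*& p : slots_) {
            delete p;
            p = nullptr;
        }
    }

private:
    std::vector<Candlestick*> slots_;
};
```

Calling replace(1) and replace(0) on every simulated tick keeps Candlestick::live at 2 however many ticks arrive, because each overwrite first releases the object it displaces. Without the delete in replace(), the count would grow by two per tick, which matches the leak described in the question.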

Related

How to handle searching Large Lists in Flutter?

I want to ask how I should handle a large list in Flutter. My app gets super slow when the item I am searching for is really deep in the list. My list holds 70,000+ objects of a data structure.
The following is how I am "Searching" the list.
Future<Iterable<SomeDataStruct>> _getAllData() async {
  return allData.where((a) => (a.dataTitle.toLowerCase().contains(querySearch.toLowerCase().trim())));
}
Building the list using a ListView.builder inside of a FutureBuilder.
When I search and the results come from deep inside the list, the app is extremely slow, to the point where I tap a list item and it takes a few seconds before its onTap runs. And if I need to change the search query, it takes time for the soft keyboard to come back up after I tap the TextField.
Where am I making a mistake or handling this wrong? What should I do to make my huge list searchable without making the app unbearable?
EDIT:
How I made it not slow down the app after change to code. Is this correct?
String tempQuery;
List<SomeDataStruct> searchResults = [];
Future<List<SomeDataStruct>> _getAllData() async {
  if(querySearch!=tempQuery) {
    tempQuery = querySearch;
    searchResults = allData.where((a) => (a.dataTitle.toLowerCase().contains(querySearch.toLowerCase().trim()))).toList();
  }
  return searchResults;
}
Contains is expensive
Contains-queries are expensive, because every entry needs to be checked at every position (up to value.length - searchTerm.length) if the search term can be found.
Limiting search support to the beginning of the string would improve performance a lot already. In addition you could create helper data-structures where the whole list of values is split into parts with the same character at the beginning. If the chunks are still too big another level could be added for the 2nd character. The lookup would be fast because there are only a limited number of characters.
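The helper data structure described above can be sketched as follows. The sketch is in C++ rather than Dart, and PrefixIndex, toLower and startsWith are hypothetical names chosen for illustration. It assumes ASCII titles and a single bucket level keyed on the first character.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Lowercase copy of an ASCII string (non-ASCII bytes are left unchanged here).
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Groups titles by their first (lowercased) character so that a prefix
// search only has to scan one bucket instead of the whole list.
class PrefixIndex {
public:
    explicit PrefixIndex(const std::vector<std::string>& titles) {
        for (const std::string& title : titles) {
            std::string key = toLower(title);
            if (key.empty()) continue;
            const char first = key[0];
            buckets_[first].push_back({std::move(key), title});
        }
    }

    // Returns the original titles that begin with `query`, ignoring case.
    std::vector<std::string> startsWith(const std::string& query) const {
        std::vector<std::string> hits;
        const std::string q = toLower(query);
        if (q.empty()) return hits;
        const auto bucket = buckets_.find(q[0]);
        if (bucket == buckets_.end()) return hits;
        for (const Entry& e : bucket->second) {
            if (e.lower.compare(0, q.size(), q) == 0) hits.push_back(e.original);
        }
        return hits;
    }

private:
    struct Entry { std::string lower; std::string original; };
    std::unordered_map<char, std::vector<Entry>> buckets_;
};
```

Lowercasing once at build time also removes the per-keystroke toLowerCase() calls over the whole list. If a bucket is still too large, the same idea can be repeated on the second character, as suggested above.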
Using a database might take off some programming work (maintaining indexes). A database like SQLite could be used with indexes specialized to your kind of queries.
Split into smaller chunks of work to allow the framework to do its work
If you can't limit to "beginning of string"-search, you could still split the data structure into smaller chunks and invoke the search for each chunk asynchronously. This way the UI gets some room to breathe and can re-render before the next chunk is searched. The search result would be updated incrementally.
Move work off the UI thread
Another way would be to start up another isolate and do the search there. Another isolate can run on another CPU (core) and therefore would not block the UI thread when searching. This way it wouldn't be necessary to split into chunks. It might still be advantageous though to incrementally update the UI instead of keeping the user waiting until the whole search result becomes available.
See also
https://api.dartlang.org/stable/2.2.0/dart-isolate/dart-isolate-library.html
https://pub.dartlang.org/packages/isolate
https://codingwithjoe.com/dart-fundamentals-isolates/
Caching
It might also help improve performance to keep search results in memory. For example if the user enters foo and then presses backspace then you could reuse the search result for fo that you previously calculated already, but that only helps in some cases.
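A minimal sketch of such a cache, written in C++ for illustration (CachedSearch, find and scans are hypothetical names). It assumes the item list does not change while the cache is alive and that keeping every previous query in memory is acceptable.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Remembers the result of every query so that returning to an earlier
// query (for example after a backspace) does not rescan the list.
class CachedSearch {
public:
    explicit CachedSearch(std::vector<std::string> items) : items_(std::move(items)) {}

    // Returns the indices of items containing `query`; scans only on a cache miss.
    const std::vector<std::size_t>& find(const std::string& query) {
        const auto hit = cache_.find(query);
        if (hit != cache_.end()) return hit->second;

        ++scans_;
        std::vector<std::size_t> matches;
        for (std::size_t i = 0; i < items_.size(); ++i) {
            if (items_[i].find(query) != std::string::npos) matches.push_back(i);
        }
        return cache_.emplace(query, std::move(matches)).first->second;
    }

    // Number of full list scans performed so far.
    std::size_t scans() const { return scans_; }

private:
    std::vector<std::string> items_;
    std::unordered_map<std::string, std::vector<std::size_t>> cache_;
    std::size_t scans_ = 0;
};
```

Typing f, fo, foo and then pressing backspace costs three scans instead of four, since the result for fo is already stored. The cache here grows without bound, so a real application would probably want to cap or clear it.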
Measuring
Another important point of course is to do benchmarking. Whatever you try to improve performance, create benchmarks to learn what measures have what effect and if it's worth it. You'll learn a lot about your scenario, your data, Dart, ..., and this will allow you to make good decisions.
Searching a list can crash your app if the list is large enough. I tested searching through a list of 150,000 items and my app crashed. It seems to work if what you are searching for is near the start of the list, but if you search for something deep in the list, for example at index 100,000, things slow down considerably and the app freezes.
To solve this I moved to a database solution using the moor library, which is built on top of sqlite. You simply follow the docs to set up the boilerplate; it doesn't take long.
You can create a simple database table with just 2 fields: id and another to hold your list values. For this example I call the other field itemName.
In your database class you can have a function to do the filtering e.g.
Future<List<Item>> getFilteredItems(search) => (select(Items)..where((t) => t.itemName.like(search))).get();
Then your code will look something like:
String tempQuery;
List<String> searchResults = [];
Future<List<String>> _getAllData() async {
  if (querySearch != tempQuery) {
    tempQuery = querySearch;
    // Wrap the term in % wildcards so LIKE matches it anywhere in itemName.
    final query = await MyDatabaseClass().getFilteredItems("%$querySearch%");
    // Replace the previous results instead of appending to them.
    searchResults = query.map((item) => item.itemName).toList();
  }
  return searchResults;
}
No more crashes or freezes after this.
Maybe this way :
ListView.builder(
  itemBuilder: (BuildContext context, int index) {
    return Text(data[index]);
  },
)

Best way to access/store persistent data in CUDA along multiple kernel calls [duplicate]

I have 2 very similar kernel functions, in the sense that the code is nearly the same, but with a slight difference. Currently I have 2 options:
Write 2 different methods (but very similar ones)
Write a single kernel and put the code blocks that differ in an if/else statement
How much will an if statement affect my algorithm performance?
I know that there is no branching, since all threads in all blocks will enter either the if, or the else.
So will a single if statement decrease my performance if the kernel function is called a lot of times?
You have a third alternative, which is to use C++ templating and make the variable which is used in the if/switch statement a template parameter. Instantiate each version of the kernel you need, and then you have multiple kernels doing different things with no branch divergence or conditional evaluation to worry about, because the compiler will optimize away the dead code and the branching with it.
Perhaps something like this:
template<int action>
__global__ void kernel()
{
    switch(action) {
    case 1:
        // First code
        break;
    case 2:
        // Second code
        break;
    }
}
template void kernel<1>();
template void kernel<2>();
It will slightly decrease your performance, especially if it's in an inner loop, since you're wasting an instruction issue slot every so often, but it's not nearly as much as if a warp were divergent.
If it's a big deal, it may be worth moving the condition outside the loop, however. If the warp is truly divergent, though, think about how to remove the branching: e.g., instead of
if (i>0) {
    x = 3;
} else {
    x = y;
}
try
x = ((i>0)*3) | ((i<=0)*y);
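The arithmetic replacement only matches the branch when the two conditions are exact complements, so that exactly one term is non-zero. A small host-side C++ sketch makes that easy to check; selectWithBranch and selectBranchFree are hypothetical names, and the code runs on the CPU rather than in a kernel.

```cpp
// Reference version with an explicit branch.
constexpr int selectWithBranch(int i, int y) {
    return (i > 0) ? 3 : y;
}

// Branch-free version: each comparison yields 0 or 1, and because the two
// conditions are exact complements, exactly one term survives the OR.
constexpr int selectBranchFree(int i, int y) {
    return ((i > 0) * 3) | ((i <= 0) * y);
}

// Compile-time spot checks on both sides of the condition.
static_assert(selectBranchFree(5, 42) == selectWithBranch(5, 42), "positive i selects 3");
static_assert(selectBranchFree(0, 42) == selectWithBranch(0, 42), "non-positive i selects y");
```

Whether the branch-free form is actually faster is something to measure, since compilers can often turn such a simple if/else into a select or predicated instruction on their own.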

What are the performance implications of these C# features?

I have been designing a component-based game library, with the overall intention of writing it in C++ (as that is my forte), with Ogre3D as the back-end. Now that I am actually ready to write some code, I thought it would be far quicker to test out my framework under the XNA4.0 framework (somewhat quicker to get results/write an editor, etc). However, whilst I am no newcomer to C++ or C#, I am a bit of a newcomer when it comes to doing things the "XNA" way, so to speak, so I had a few queries before I started hammering out code:
I read about using arrays rather than collections to avoid performance hits, then also read that this was not entirely true and that if you enumerated over, say, a concrete List<> collection (as opposed to an IEnumerable<>), the enumerator is a value-type that is used for each iteration and that there aren't any GC worries here. The article in question was back in 2007. Does this hold true, or do you experienced XNA developers have real-world gotchas about this? Ideally I'd like to go down a chosen route before I do too much.
If arrays truly are the way to go, no questions asked, I assume when it comes to resizing the array, you copy the old one over with new space? Or is this off the mark? Do you attempt to never, ever resize an array? Won't the GC kick in for the old one if this is the case, or is the hit inconsequential?
As the engine was designed for C++, the design allows for use of lambdas and delegates. One design uses the fastdelegate library which is the fastest possible way of using delegates in C++. A more flexible, but slightly slower approach (though hardly noticeable in the world of C++) is to use C++0x lambdas and std::function. Ideally, I'd like to do something similar in XNA, and allow delegates to be used. Does the use of delegates cause any significant issues with regard to performance?
If there are performance considerations with regards to delegates, is there a difference between:
public delegate void myDelegate(int a, int b);
private void myFunction(int a, int b)
{
}
event myDelegate myEvent;
myEvent += myFunction;
vs:
public delegate void myDelegate(int a, int b);
event myDelegate myEvent;
myEvent += (int a, int b) => { /* ... */ };
Sorry if I have waffled on a bit, I prefer to be clear in my questions. :)
Thanks in advance!
Basically the only major performance issue to be aware of in C# that is different to what you have to be aware of in C++, is the garbage collector. Simply don't allocate memory during your main game loop and you'll be fine. Here is a blog post that goes into detail.
Now to your questions:
1) If a framework collection iterator could be implemented as a value-type (not creating garbage), then it usually (always?) has been. You can safely use foreach on, for example, List<>.
You can verify if you are allocating in your main loop by using the CLR Profiler.
2) Use Lists instead of arrays. They'll handle the resizing for you. You should use the Capacity property to pre-allocate enough space before you start gameplay to avoid GC issues. Using arrays you'd just have to implement all this functionality yourself - ugly!
The GC kicks in on allocations (not when memory becomes free). On Xbox 360 it kicks in for every 1MB allocated and is very slow. On Windows it is a bit more complicated - but also doesn't have such a huge impact on performance.
3) C# delegates are pretty damn fast, and faster than most people expect. They are about on par with method calls on interfaces. Here and here are questions that provide more details about delegate performance in C#.
I couldn't say how they compare to the C++ options. You'd have to measure it.
4) No. I'm fairly sure this code will produce identical IL. You could disassemble it and check, or profile it, though.
I might add - without checking myself - I suspect that having an event myDelegate will be slower than a plain myDelegate if you don't need all the magic of event.

Game entities: Handling collisions

I'm trying to make a 2D game (my first one). I'm not looking for algorithms to determine whether or not objects collide, but rather how I should organize everything. I'm having great difficulty in figuring out which responsibility should go to which class, so much so that I started feeling stupid. =))
I guess my principal classes are Entity (and its children) and EntityManager. What interface should Entity provide, for example? How should entities become aware that they are in collision with another entity — should the manager perhaps update them and pass a CollisionEvent to the handleCollision function of each entity? Any suggestions are more than welcomed.
I assume that EntityManager contains all entities, so the manager is the one that needs to check collisions between all entities, like this:
for(int i = 0 ; i < totalEntities ; ++i)
    for(int j = i+1 ; j < totalEntities ; ++j)
    {
        CollisionInfo info;
        if( CheckCollision(entities[i], entities[j], info) )
        {
            // Okay, what should we do? I suggest two solutions
            // 1. simple one
            entities[i].OnCollide(info);
            entities[j].OnCollide(info);
            // 2. event-or-message driven system
            EventManager::Instance()->SendEvent(COLLISION_EVENT, info);
        }
    }
The first one is probably the simplest. However, what if some other objects are interested in this collision event, such as a sound, logging or scoring system? Even entities which are not related to that collision might want to "know" about this event so that they can change their behavior. (Think of a boss monster that gets angrier when its kids are killed by you!)
So, to make it more flexible, there is #2. First, you need to have your own event-or-message system (you can think of it as the Windows message system) where objects can subscribe to the specific messages they want to handle. Then, EntityManager can simply propagate collision events by sending messages. Entities can subscribe to this collision message type, and they should determine whether they need to handle a particular collision by examining the info. Likewise, your scoring system can subscribe to it and calculate the new score for kills.
If the game is simple enough, you could go for #1 but I highly recommend #2 and you will be very satisfied with it. Good Luck! :)
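A minimal sketch of what option #2 could look like in C++. EventBus, EventType and the fields of CollisionInfo are hypothetical; the answer only names EventManager, SendEvent and COLLISION_EVENT, so this is one possible shape rather than the design the answer had in mind.

```cpp
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical event kinds; a real game would have more than one.
enum class EventType { Collision };

// Hypothetical payload: the ids of the two entities that collided.
struct CollisionInfo {
    int firstId;
    int secondId;
};

// Lets any object register interest in an event type and be called back
// whenever that event is sent, without the sender knowing who listens.
class EventBus {
public:
    using Handler = std::function<void(const CollisionInfo&)>;

    void subscribe(EventType type, Handler handler) {
        handlers_[type].push_back(std::move(handler));
    }

    // Delivers the event to every subscriber of `type`, in subscription order.
    void send(EventType type, const CollisionInfo& info) const {
        const auto it = handlers_.find(type);
        if (it == handlers_.end()) return;
        for (const Handler& handler : it->second) handler(info);
    }

private:
    std::map<EventType, std::vector<Handler>> handlers_;
};
```

With this in place the manager's inner loop only calls send(EventType::Collision, info), while a scoring system, a sound system and the entities themselves each subscribe with their own handler and inspect the info to decide whether the collision concerns them.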

how to get page size

I was asked this question in an interview. Please tell me the answer:
You have no documentation of the kernel. You only know that your kernel supports paging.
How will you find the page size? There is no flag or macro that can tell you the page size.
I was given the hint that you can use time to get the answer. I still have no clue.
Run code like the following:
for (int stride = 1; stride < maxpossiblepagesize; stride += searchgranularity) {
    char* somemem = (char*)malloc(veryverybigsize*stride);
    starttime = getcurrentveryaccuratetime();
    for (char* pos = somemem; pos < somemem+veryverybigsize*stride; pos += stride) {
        // iterate over "veryverybigsize" chunks of size "stride"
        *pos = 'Q'; // Just write something to force the page back into physical memory
    }
    endtime = getcurrentveryaccuratetime();
    printf("stride %d, runtime %u\n", stride, endtime-starttime);
    free(somemem); // release the buffer before trying the next stride
}
Graph the results with stride on the X axis and runtime on the Y axis. There should be a point at stride=pagesize, where the performance no longer drops.
This works by incurring a number of page faults. Once stride surpasses pagesize, the number of faults ceases to increase, so the program's performance no longer degrades noticeably.
If you want to be cleverer, you could exploit the fact that the mprotect system call must work on whole pages. Try it with something smaller, and you'll get an error. I'm sure there are other "holes" like that, too - but the code above will work on any system which supports paging and where disk access is much more expensive than RAM access. That would be every seminormal modern system.
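The mprotect idea can be sketched like this for a POSIX-like system. probePageSize is a hypothetical name, and the sketch rests on several assumptions: mmap returns a page-aligned address, mprotect rejects a start address that is not page-aligned (Linux does this with EINVAL, but POSIX only says it may), the page size is a power of two, and it is smaller than the 16 MiB test region.

```cpp
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Returns the detected page size in bytes, or 0 if detection failed.
std::size_t probePageSize() {
    // Assumption: 16 MiB is larger than any base page size we might meet.
    const std::size_t regionSize = std::size_t{1} << 24;
    void* region = mmap(nullptr, regionSize, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return 0;

    char* base = static_cast<char*>(region);  // page-aligned start of the mapping
    std::size_t found = 0;
    // Try power-of-two offsets; the first one that mprotect accepts as a
    // start address is the smallest page-aligned offset, i.e. the page size.
    for (std::size_t offset = 1; offset < regionSize; offset <<= 1) {
        if (mprotect(base + offset, 1, PROT_READ) == 0) {
            found = offset;
            break;
        }
    }

    munmap(region, regionSize);
    return found;
}
```

On a system where sysconf is available, the result can be compared against sysconf(_SC_PAGESIZE) as a sanity check. Like the timing approach, this only reveals the base page size and says nothing about large pages.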
It looks to me like a question about 'how does paging actually work'
They want you to explain the impact that changing the page size will have on the execution of the system.
I am a bit rusty on this stuff, but when a page is full, the system starts page swapping, which slows everything down. So you want to run something that will fill up the memory to different sizes, and measure the time it takes to do a task. At some point there will be a jump, where the time taken to do the task will suddenly jump.
Like I said, I am a bit rusty on the implementation of doing this, but I'm pretty sure that is the shape of the answer they were after.
Whatever answer they were expecting, it would almost certainly be a brittle solution. For one thing, you can have multiple page sizes, so any answer you get for one small allocation may be irrelevant for the next multi-megabyte allocation (see things like Linux's Large Page support).
I suspect the question was more aimed at seeing how you approached the problem rather than the final solution you came up with.
By the way, this question isn't about Linux, because there you do have documentation as well as POSIX compliance, so you can just call sysconf(_SC_PAGE_SIZE).
