Instantiation within loop - primitives and objects - memory

To make this language-agnostic, let's pseudo-code something along the lines of:
for (int i = 0; i <= N; i++) {
    double d = 0;
    userDefinedObject o = new userDefinedObject();
    // effectively do something useful
    o.destroy();
}
Now, this may get into deeper details across Java/C++/Python, etc., but:
1 - Is doing this with primitives wrong, or just sort of ugly/overkill? (d could be declared above the loop and reset to 0 in each iteration if need be.)
2 - Is doing this with an object actually wrong? I know Java will take care of the memory, but for C++ let's assume we have a proper destructor that we call.
Now, the question is quite succinct: is this wrong, or just a matter of taste?
Thank you.

Java's garbage collector will take care of any allocation that no longer has a reference, which means that if you instantiate on each iteration, you allocate new memory and lose the reference to the previous object. From that you can conclude that the GC will reclaim the unreferenced memory, BUT you also have to consider that memory allocation, and object initialization in particular, takes time and processing. If you do this in a small program, you probably won't feel anything wrong. But say you're working with something like Bitmap: the repeated allocation will totally own your memory.
For both cases I'd say it is a matter of taste, but in a real-life project you should be completely sure that you actually need to instantiate within the loop.
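To make that trade-off concrete, here is a minimal sketch (written in Go only because the question is deliberately language-agnostic; the widget type and loop bodies are made up for illustration). The primitive costs nothing either way, while the object version trades one allocation per iteration against hoisting and reusing a single instance:

// widget stands in for userDefinedObject in the question.
type widget struct{ buf [1024]byte }

func allocatePerIteration(n int) {
    for i := 0; i < n; i++ {
        d := 0.0       // primitive: effectively free, the compiler reuses the slot
        w := &widget{} // object: typically one allocation per iteration (unless the compiler proves it never escapes)
        _, _ = d, w    // effectively do something useful
    }
}

func allocateOnce(n int) {
    var d float64
    w := &widget{} // allocated once, reset and reused inside the loop
    for i := 0; i < n; i++ {
        d = 0
        *w = widget{} // reset the object's state instead of reallocating
        _ = d
    }
}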

Related

Using something other than a Swift array for mutable fixed-size thread-safe data passed to OpenGL buffer

I am trying to squeeze every bit of efficiency out of the application I am working on.
I have a couple of arrays that meet the following conditions:
They are NEVER appended to; I always calculate the index myself
They are allocated once and never change size
It would be nice if they were thread-safe, as long as it doesn't cost performance
Some hold primitives like floats or unsigned ints. One of them holds a class.
Most of these arrays at some point are passed into a glBuffer
Never cleared, just overwritten
Some of the arrays' individual elements are replaced entirely with =; others are changed with +=
I currently am using native Swift arrays and am allocating them like var arr = [GLfloat](count: 999, repeatedValue: 0); however, I have been reading a lot of documentation and it sounds like Swift arrays are much more abstract than a traditional C-style array. I am not even sure whether they are allocated in one block or more like a linked list with bits and pieces thrown all over the place. I believe the code above causes them to be allocated in a contiguous block, but I'm not sure.
I worry that the abstract nature of Swift arrays is wasting a lot of precious processing time. As you can see from the conditions above, I don't need any of the fancy appending or safety features of Swift arrays. I just need it simple and fast.
My question is: In this scenario should I be using some other form of array? NSArray, somehow get a C-style array going, create my own data type?
I'm looking into thread safety: would a different, more thread-safe array type such as NSArray be any slower?
Note that your requirements are contradictory, particularly #2 and #7. You can't operate on them with += and also say they will never change size. "I always calculate the index myself" also doesn't make sense. What else would calculate it? The requirements for things you will hand to glBuffer are radically different than the requirements for things that will hold objects.
If you construct the Array the way you say, you'll get contiguous memory. If you want to be absolutely certain that you have contiguous memory, use a ContiguousArray (but in the vast majority of cases this will give you little to no benefit while costing you complexity; there appear to be some corner cases in the current compiler that give a small advantage to ContiguousArray, but you must benchmark before assuming that's true). It's not clear what kind of "abstractness" you have in mind, but there are no secrets about how Array works. All of stdlib is open source. Go look and see if it does things you want to avoid.
For certain kinds of operations, other types of data structures can be faster. For instance, there are cases where dispatch_data is better, cases where a regular Data would be better, and cases where you should use a ManagedBuffer to gain more control. But in general, unless you deeply know what you're doing, you can easily make things dramatically worse. There is no data structure that "is always faster" and works correctly for all the kinds of uses you describe. If there were, that would just be the implementation of Array.
None of this makes sense to pursue until you've built some code and started profiling it in optimized builds to understand what's going on. It is very likely that different uses would be optimized by different kinds of data structures.
It's very strange that you ask whether you should use NSArray, since that would be wildly (orders of magnitude) slower than Array for dealing with very large collections of numbers. You definitely need to experiment with these types a bit to get a sense of their characteristics. NSArray is brilliant and extremely fast for certain problems, but not for that one.
But again, write a little code. Profile it. Look at the generated assembler. See what's happening. Watch particularly for any undesired copying or retain counting. If you see that in a specific case, then you have something to think about changing data structures over. But there's no "use this to go fast." All the trade-offs to achieve that in the general case are already in Array.

Best practice for dealing with package allocation in Go

I'm writing a package which makes heavy use of buffers internally for temporary storage. I have a single global (but not exported) byte slice which I start with 1024 elements and grow by doubling as needed.
However, it's very possible that a user of my package would use it in such a way that caused a large buffer to be allocated, but then stop using the package, thus wasting a large amount of allocated heap space, and I would have no way of knowing whether to free the buffer (or, since this is Go, let it be GC'd).
I've thought of three possible solutions, none of which is ideal. My question is: are any of these solutions, or maybe ones I haven't thought of, standard practice in situations like this? Is there any standard practice? Any other ideas?
Screw it.
Oh well. It's too hard to deal with this, and leaving allocated memory lying around isn't so bad.
The problem with this approach is obvious: it doesn't solve the problem.
Exported "I'm done" or "Shrink internal memory usage" function.
Export a function which the user can call (and calling it intelligently is obviously up to them) which will free the internal storage used by the package.
The problem with this approach is twofold. First, it makes for a more complex, less clean interface to the user. Second, it may not be possible or practical for the user to know when calling such a function is wise, so it may be useless anyway.
Run a goroutine which frees the buffer after a certain period of the package going unused, or which shrinks the buffer (perhaps halving the length) whenever its size hasn't been increased in a while.
The problem with this approach is primarily that it puts unnecessary strain on the scheduler. Obviously a single goroutine isn't so bad, but if this were accepted practice, it wouldn't scale well if every package you imported were doing this under the hood. Also, if you have a time-sensitive application, you may not want code running when you're not aware of it (that is, you may assume that the package isn't doing any work when its functions are not being called - a reasonable assumption, I'd say).
So... any ideas?
NOTE: You can see the existing project here (the relevant code is only a few tens of lines).
A common approach to this is letting the client pass an existing []byte (or whatever) as an argument to some call/function/method. For example:
// The returned slice may be a sub-slice of dst if dst was large enough
// to hold the entire encoded block. Otherwise, a newly allocated slice
// will be returned. It is valid to pass a nil dst.
func Foo(dst []byte, whatever Bar) (ret []byte, err error)
(Example)
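A caller-side sketch of how that pattern is typically used (encodeAll and items are hypothetical; Foo and Bar are the names from the signature above):

func encodeAll(items []Bar) error {
    var scratch []byte
    for _, item := range items {
        var err error
        // Passing scratch[:0] keeps the backing array, so Foo only allocates
        // when an encoded block outgrows the current capacity.
        scratch, err = Foo(scratch[:0], item)
        if err != nil {
            return err
        }
        // ... use scratch before the next iteration overwrites it ...
    }
    return nil
}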
Another approach is to get a new []byte from, for example, a cache and/or a pool (if you prefer the latter name for the concept) and rely on clients to return used buffers to that "recycle bin".
BTW: You're doing it right by thinking about this. Where it's possible to reasonably reuse []byte buffers, there's potential for lowering the GC load and thus making your program perform better. Sometimes the difference can be critical.
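If you go the pool route, the standard library's sync.Pool is one common way to implement such a recycle bin; a minimal sketch (the Encode function and its body are hypothetical, not part of the asker's package):

import (
    "bytes"
    "sync"
)

// bufPool hands out reusable buffers; idle ones may eventually be reclaimed by the GC.
var bufPool = sync.Pool{
    New: func() interface{} { return new(bytes.Buffer) },
}

// Encode is a hypothetical exported entry point of the package.
func Encode(v []byte) []byte {
    buf := bufPool.Get().(*bytes.Buffer)
    buf.Reset()
    buf.Write(v)                               // stand-in for the real encoding work
    out := append([]byte(nil), buf.Bytes()...) // copy out so the buffer can be reused
    bufPool.Put(buf)
    return out
}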
You could reslice your buffer at the end of every operation.
buffer = buffer[:0]
Then your function extendAndSliceBuffer would have the original backing array most likely available if it needs to grow. If not, you would suffer a new allocation, which you might get anyway when you do extendAndSliceBuffer.
Overall, I think a cleaner solution is to do what #jnml said and let the users pass their own buffer if they care about performance. If they don't care about performance, then you should not use a global var: simply allocate the buffer as you need it and let it go when it goes out of scope.
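A small sketch of that reslicing idea, keeping the asker's package-level buffer (extendAndSliceBuffer is the asker's function name, its body here is only a guess at the usual grow-if-needed logic, and doWork is an illustrative caller):

var buffer []byte // the package-level scratch slice from the question

// extendAndSliceBuffer returns a buffer of length n, reusing the existing
// backing array when it is big enough and reallocating only when it is not.
func extendAndSliceBuffer(n int) []byte {
    if cap(buffer) < n {
        buffer = make([]byte, n)
    }
    buffer = buffer[:n]
    return buffer
}

func doWork() {
    b := extendAndSliceBuffer(4096)
    _ = b               // ... use b ...
    buffer = buffer[:0] // reslice at the end of the operation
}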
I have a single global (but not exported) byte slice which I start
with 1024 elements and grow by doubling as needed.
And there's your problem. You shouldn't have a global like this in your package.
Generally the best approach is to have an exported struct with methods attached. The buffer should reside in this struct, unexported. That way the user can instantiate one and let the garbage collector clean it up when they let go of it.
You also want to avoid globals like this because they can hamper unit tests. A unit test should be able to instantiate the exported struct, just as the user would, and do so fresh for every test.
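A minimal sketch of that shape (the Encoder name and its Encode method are illustrative, not from the asker's code):

// Encoder is exported; each value owns its own scratch buffer, so there is
// no package-level state and the GC frees the buffer along with the Encoder.
type Encoder struct {
    buf []byte // unexported; grows as needed
}

// Encode reuses e.buf across calls.
func (e *Encoder) Encode(data []byte) []byte {
    e.buf = e.buf[:0]
    e.buf = append(e.buf, data...) // stand-in for the real encoding work
    return e.buf
}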
Also, depending on what kind of buffer you need, bytes.Buffer may be useful, as it already implements io.Reader and io.Writer. bytes.Buffer also automatically grows and shrinks its buffer. In buffer.go you'll see various calls to b.Truncate(0) that do the shrinking, with the comment "reset to recover space".
It's generally really, really bad form to write Go code that is not thread-safe. If two different goroutines call functions that modify the buffer at the same time, who knows what state the buffer will be in when they finish? Just let the user provide a scratch-space buffer if they decide that allocation performance is a bottleneck.

Do instance variables throttle performance? Should one class serve two purposes?

I have a class that has multiple instance variables. I want to achieve two purposes with the class. It's possible that I may only use some variables for one purpose and sometimes use both.
Here's a more concrete example. I want to create a class so that every time the user taps the screen, a dog sprite and a cat sprite appear with an animation. If tapped again, they continue to perform different animations. However, sometimes I only want the dog sprite to appear and update. And some other, rarer times, I want the cat sprite to appear a couple of taps after the dog sprite appeared.
The question is: do instance variables allocate too much memory? I'm highly concerned with performance, because I'm planning to make a memory-intensive game. Since it's hard to predict when I'll actually use all the instance variables, should I divide them into two classes? Let's divide the possible scenarios to get a better idea.
Only the dog sprite is used and the cat sprite never appears: the cat's instance variables are left untouched if everything is in one class.
The dog sprite appears first, then the cat sprite appears later: both sprites will eventually appear. It's possible to divide them into two classes, but some methods would be duplicated, since things such as the touch-advance logic and the animation are similar. If we leave it in one class, scenario 1 could occur, which could possibly be solved without a lot of duplicated code.
Other things could occur, but the problems are already discussed above. These are the pros and cons from my point of view:
One Class Approach
Pro
Avoid some duplicate logic
No need to import multiple headers that lead to similar instance variables
Con
Possibly leave half of instance variables unused (including NSString, CCSprite, a lot of integers and floats, CCAnimation, CCLabelBMFont)
Two Class Approach
Pro
Fewer instance variables
Possibly inherit from the class without inheriting some unnecessary variables in the future
Con
Some logic is duplicated
It's difficult to decide which option I should use. Any suggestions would be great! Thank you in advance!
if (didHelp)
    for (int x = 0; x < 100; x++)
        NSLog(@"Thanks!");
I'm highly concerned with performance
You and thousands of other inexperienced developers. Seriously, there are two things you're most likely going to experience:
your idea is way out of proportion and no amount of performance optimization will make it work -> change your idea
performance won't matter the least bit and you simply wasted time
Performance is among the least important things a game developer needs to consider at the start of a project.
Why?
Case #2 is self evident.
Assessing case #1 with reasonable accuracy before you even get started requires experience. Even then it's difficult. Simply have a backup plan if feature X proves to be too technically challenging (or impossible). If you can't assess performance, and your idea won't work with any backup plan, again you have two options:
implement a different idea
create a quick prototype to find out the peak performance parameters: memory usage, CPU & GPU utilization, loading times, and whatever other fitness tests seem appropriate, so that within a few days, if not hours, you know whether your idea is feasible.
does instance variable allocate too much memory?
No, and allocated memory has very little to do with performance.
You can use class_getInstanceSize to see how much memory a class instance uses. Rarely will a class instance use more than 500 bytes. However, this only counts the memory allocated for the instance variables themselves, not the memory those variables may point to. In a cocos2d app it's fair to say that 95% of your memory usage will come from textures.
It's difficult to decide which option I should use
Always strive to:
write readable code
write maintainable code
write less code
write safer code
write code only once (avoid duplication)
EmbodiedD,
You are certainly worrying about too much here. The heap is going to get quite large in most applications. One simple class will be irrelevant. When you have 1000 instances of a data-intensive class, then you might have to start thinking about profiling.
If you are worried about organization, that's another thing altogether.
If you are loading classA with var1 and var2, or loading classA with var1 and classB with var2, it's more a matter of how you were taught to do abstraction.
This is a somewhat open-ended question, and there are indefinitely many ways to approach it. What follows is my approach, and it may not fit every scenario.
There are cases where an instance variable could be replaced -- however, that should not affect your decision when one is genuinely needed. Instance variables should be used when needed; do not perform endless calculation just to avoid a single instance variable. Do, however, demote instance variables to local variables when they are not needed outside a certain scope. Thanks to the informative users who posted here: instance variables left unused impact performance at such a microscopic scale that you should not worry about them.
From my point of view, a class should have only one focus -- one function -- and should pass any other information on to the other classes that need it. Information should remain encapsulated, each class with one function, to maintain reusability in other projects.
One should focus on the relationships between classes. IS-A is the relationship that says one object should inherit from another. In real-life terms, a Sienna IS-A car; a boat IS-A vehicle. Such objects should therefore inherit from their superclass. By contrast, HAS-A says that a class contains something, usually a component, that should not be inherited. A Sienna IS-A car, but a tire IS-NOT-A Sienna; rather, a Sienna HAS-A tire.
Another important relationship is delegation. The fancy definition says one object performs a task on behalf of another, much like how delegates in the US represent the people of their states. Basically, a class passes certain information on to another class, which, in good practice, should not affect the former class. The class should not need to know the exact identity of its delegate, only enough to pass on the information. Not knowing the delegate's exact identity keeps the classes loosely coupled.
In my case of cats and dogs, delegation along with IS-A is, subjectively, the best answer; your opinion may differ. A base class should contain all the information that the cat and dog share, and any other information that is needed, such as the sprite's position, should be passed to the other class through a delegate. And, per what I wrote above, a class should not normally be programmed to do two functions; a class should do one function and pass all other duties on to others.

What are the performance implications of these C# features?

I have been designing a component-based game library, with the overall intention of writing it in C++ (as that is my forte), with Ogre3D as the back-end. Now that I am actually ready to write some code, I thought it would be far quicker to test out my framework under the XNA4.0 framework (somewhat quicker to get results/write an editor, etc). However, whilst I am no newcomer to C++ or C#, I am a bit of a newcomer when it comes to doing things the "XNA" way, so to speak, so I had a few queries before I started hammering out code:
I read about using arrays rather than collections to avoid performance hits, then also read that this was not entirely true and that if you enumerated over, say, a concrete List<> collection (as opposed to an IEnumerable<>), the enumerator is a value-type that is used for each iteration and that there aren't any GC worries here. The article in question was back in 2007. Does this hold true, or do you experienced XNA developers have real-world gotchas about this? Ideally I'd like to go down a chosen route before I do too much.
If arrays truly are the way to go, no questions asked, I assume when it comes to resizing the array, you copy the old one over with new space? Or is this off the mark? Do you attempt to never, ever resize an array? Won't the GC kick in for the old one if this is the case, or is the hit inconsequential?
As the engine was designed for C++, the design allows for use of lambdas and delegates. One design uses the fastdelegate library which is the fastest possible way of using delegates in C++. A more flexible, but slightly slower approach (though hardly noticeable in the world of C++) is to use C++0x lambdas and std::function. Ideally, I'd like to do something similar in XNA, and allow delegates to be used. Does the use of delegates cause any significant issues with regard to performance?
If there are performance considerations with regards to delegates, is there a difference between:
public delegate void myDelegate(int a, int b);
private void myFunction(int a, int b)
{
}
event myDelegate myEvent;
myEvent += myFunction;
vs:
public delegate void myDelegate(int a, int b);
event myDelegate myEvent;
myEvent += (int a, int b) => { /* ... */ };
Sorry if I have waffled on a bit, I prefer to be clear in my questions. :)
Thanks in advance!
Basically the only major performance issue to be aware of in C# that is different to what you have to be aware of in C++, is the garbage collector. Simply don't allocate memory during your main game loop and you'll be fine. Here is a blog post that goes into detail.
Now to your questions:
1) If a framework collection iterator could be implemented as a value-type (not creating garbage), then it usually (always?) has been. You can safely use foreach on, for example, List<>.
You can verify if you are allocating in your main loop by using the CLR Profiler.
2) Use Lists instead of arrays. They'll handle the resizing for you. You should use the Capacity property to pre-allocate enough space before you start gameplay to avoid GC issues. Using arrays you'd just have to implement all this functionality yourself - ugly!
The GC kicks in on allocations (not when memory becomes free). On Xbox 360 it kicks in for every 1MB allocated and is very slow. On Windows it is a bit more complicated - but also doesn't have such a huge impact on performance.
3) C# delegates are pretty damn fast, and faster than most people expect. They are about on par with method calls on interfaces. Here and here are questions that provide more details about delegate performance in C#.
I couldn't say how they compare to the C++ options. You'd have to measure it.
4) No. I'm fairly sure this code will produce identical IL. You could disassemble it and check, or profile it, though.
I might add - without checking myself - I suspect that having an event myDelegate will be slower than a plain myDelegate if you don't need all the magic of event.

Why are the stack and the heap both required for memory allocation?

I've searched for a while, but there is no conclusive answer as to why value types have to be allocated on the stack while reference types, i.e. dynamically allocated objects, have to reside on the heap.
Why can't the same be allocated on the stack?
They can be. In practice they're not, because the stack is typically a scarcer resource than the heap, and allocating reference types on the stack may exhaust it quickly. Further, if a function returns data allocated on its stack, it requires copying semantics on the caller's part, or it risks returning something that will be overwritten by the next function call.
Value types, typically local variables, can be brought in and out of scope quickly and easily with native machine instructions. Copy semantics for value types on return is trivial as most fit into machine registers. This happens often and should be as cheap as possible.
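As a concrete illustration in Go (hedged: whether a given allocation stays on the stack is decided by the compiler's escape analysis, and the point type here is made up), a reference-type object that provably does not outlive its function can live on the stack, while one whose reference is returned must go on the heap:

type point struct{ x, y int }

// p never escapes sumLocal, so the compiler is free to keep it on the stack.
func sumLocal() int {
    p := &point{x: 1, y: 2}
    return p.x + p.y
}

// The returned pointer outlives the call, so the object must be heap-allocated.
func escape() *point {
    return &point{x: 1, y: 2}
}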
It is not correct that value types always live on the stack. Read Jon Skeet's article on the topic:
Memory in .NET - what goes where
I understand that the stack paradigm (nested allocations/deallocations) cannot handle certain algorithms which need non-nested object lifetimes.
just as the static allocation paradigm cannot handle recursive procedure calls. (e.g. naive calculation of fibonacci(n) as f(n-1) + f(n-2))
I'm not aware of a simple algorithm that would illustrate this fact though. any suggestions would be appreciated :-)
Local variables are allocated on the stack. If that were not the case, you wouldn't be able to have variables pointing to the heap when allocating a variable's memory. You CAN allocate things on the stack if you want: just create a buffer big enough locally and manage it yourself.
Anything a method puts on the stack will vanish when the method exits. In .net and Java, it would be perfectly acceptable (in fact desirable) if a class object vanished as soon as the last reference to it vanished, but it would be fatal for an object to vanish while references to it still exist. It is not in the general case possible for the compiler to know, when a method creates an object, whether any references to that object will continue to exist after the method exits. Absent such assurance, the only safe way to allocate class objects is to store them on the heap.
Incidentally, in .net, one major advantage of mutable value types is that they can be passed by reference without surrendering perpetual control over them. If class 'foo', or a method thereof, has a structure 'boz' which one of foo's methods passes by reference to method 'bar', it is possible for bar, or the methods it calls, to do whatever they want to 'boz' until they return, but once 'bar' returns any references it held to 'boz' will be gone. This often leads to much safer and cleaner semantics than the promiscuously-sharable references used for class objects.
