What is the __LINKEDIT block? - Instruments

I have been doing performance measurement on my app and I see a sizeable amount allocated to the __LINKEDIT block, and I just wished to know if anyone knows about it.
Thanks
Sudeep Dua

The __LINKEDIT segment contains raw data used by the dynamic linker,
such as symbol, string, and relocation table entries.
You can find more info about it here:
https://github.com/aidansteele/osx-abi-macho-file-format-reference
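If you want to see how big that segment is for your particular binary, running otool -l on the executable lists each segment's vmsize and filesize. You can also query it from inside the process; here is a minimal C sketch (macOS/iOS, 64-bit, assuming the <mach-o/getsect.h> API; it only inspects the main executable, not frameworks):

#include <mach-o/loader.h>
#include <mach-o/getsect.h>
#include <stdio.h>

int main(void) {
    /* Look up the __LINKEDIT load command of the main executable. */
    const struct segment_command_64 *seg = getsegbyname("__LINKEDIT");
    if (seg != NULL)
        printf("__LINKEDIT vmsize = %llu bytes, filesize = %llu bytes\n",
               (unsigned long long)seg->vmsize,
               (unsigned long long)seg->filesize);
    return 0;
}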

Related

Tracking address when writing to flash

My system needs to store data in an EEPROM flash. Strings of bytes will be written to the EEPROM one at a time, not all at once, and the length of the strings may vary. I want the strings to be saved in order, without wasting any space, by continuing from the last write address. For example, if the first string of bytes was written at addresses 0x00~0x08, then I want the second string of bytes to be written starting at address 0x09.
How can this be achieved? I found that some EEPROMs have a write command that does not require the address to be specified and just continues from the last written point, but the EEPROM I am using does not support that (I am using Spansion's S25FL1-K). I thought about allocating part of the memory to track the address and storing the address every time I write, but that might wear out the flash faster. What is a widely used method to handle such a case?
Thanks.
EDIT:
What I am asking is how to track/save the address in a non-volatile way, so that when the next write happens, I know what address to start at.
I have never worked with this particular flash, but I've implemented something similar. Unfortunately, without knowing your constraints and priorities (memory or CPU efficiency, how often writes happen, etc.) it is impossible to give a definite answer. Here are some techniques you may want to consider; I don't know if they are widely used, though.
Option 1: Write X bytes containing the string length before the string. Then on initialization you can parse your flash: read the length n, jump n bytes forward, and read the next byte. If it is empty (all ones for your flash, according to the datasheet) then you have found the first free location. Otherwise you've just read the length of the next string, so do the same over again (a sketch of this scan follows after Option 3 below).
This method allows you to quickly search for the last used sector, since the first byte of a used record is guaranteed to have a value. The flip side is the overhead of the extra length bytes (depending on the maximum string length) each time you write a string, and having to parse the flash to find the free address (although this only needs to be done once, at boot).
Option 2: Instead of prepending the size, append a unique "end-of-string" sequence, and then parse on boot for the last such sequence before the ones that represent empty flash.
The disadvantage here is a longer parse, but you could possibly get away with just one byte of overhead per string.
Option 3 would be just what you already thought of: allocating a separate sector that contains the value you need. To reduce flash wear you could also write these values back-to-back and search for the last one each time you boot. Also, you might consider the expected lifetime of the device you are programming versus the 100,000 erase cycles that your flash can sustain (again, according to the datasheet) - is wear even a problem? That of course depends on how often data will be saved.
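To make Option 1 concrete, here is a rough C sketch of the boot-time scan. eeprom_read(), the address constants and the 1-byte length prefix are assumptions for the example (your driver will differ); with a 1-byte prefix, strings must stay under 255 bytes so a real length never collides with the erased value 0xFF:

#include <stdint.h>

#define LOG_START  0x0000u
#define LOG_END    0x10000u     /* end of the area reserved for the string log */
#define EMPTY_BYTE 0xFFu        /* erased flash reads back as all ones */

extern uint8_t eeprom_read(uint32_t addr);   /* placeholder for your flash driver */

/* Walk the length-prefixed records until an erased prefix byte is found;
   that address is where the next string should be written. */
uint32_t find_next_free_address(void)
{
    uint32_t addr = LOG_START;
    while (addr < LOG_END) {
        uint8_t len = eeprom_read(addr);
        if (len == EMPTY_BYTE)
            break;              /* nothing written here yet */
        addr += 1u + len;       /* skip the prefix and the string itself */
    }
    return addr;
}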
Hope that helps.

In Lua, how to determine the size of an object?

Is there a way, in Lua, to determine the (in memory) size of an object?
I found an article on Gamepedia about Lua object memory sizes, but it is neither general nor precise.
I would give the same explanation as @NicolBolas, but different answers to the questions.
Is there a way, in Lua, to determine the (in-memory) size of an object?
Yes, but you may need to use an external module for that. See my earlier answer and specifically lua-getsize module.
Is there a way, in Lua, to determine if the table to be stored is greater than the MP size?
If you know the size of a table with X elements, you can probably extrapolate to a table with Y elements of approximately the same content, but you won't be able to limit the allocations to a particular size unless you use your own allocator that has that logic.
Is there a way, in Lua, to determine if the table to be stored is greater than the MP size?
No.
Is there a way, in Lua, to determine the (in-memory) size of an object?
No.
Lua is not responsible for things like capping memory and so forth. That ought to be handled from the C code that creates and manages the Lua state. So if you have a 16MB limit, then that needs to be built into the lua_State when you call lua_newstate. You pass it an allocation function that needs to keep track of all such allocations. It would also allocate storage from the memory pool, not from the heap.
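For illustration, here is a minimal sketch of such a counting allocator using the plain Lua C API; the 16 MB figure and the MemBudget struct are assumptions for the example:

#include <lua.h>
#include <stdlib.h>

typedef struct { size_t used; size_t limit; } MemBudget;

static void *capped_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    MemBudget *b = ud;
    if (nsize == 0) {                        /* a free request */
        free(ptr);
        if (ptr) b->used -= osize;
        return NULL;
    }
    /* Refuse the (re)allocation if it would push usage past the budget.
       Shrinking requests always fit, as Lua requires. */
    if (b->used - (ptr ? osize : 0) + nsize > b->limit)
        return NULL;
    void *p = realloc(ptr, nsize);
    if (p) b->used += nsize - (ptr ? osize : 0);
    return p;
}

int main(void) {
    MemBudget budget = { 0, 16 * 1024 * 1024 };
    lua_State *L = lua_newstate(capped_alloc, &budget);
    /* ... run scripts; allocations beyond 16 MB now fail with a memory error ... */
    lua_close(L);
    return 0;
}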
Of course, the allocator can't tell exactly why an allocation is happening. So there's no way to limit just this one specific table to 16MB, if you intend for the Lua state to also do other things.
If you have such specific memory needs for just this one table, you probably need to allocate and store it in C/C++, and then use the Lua interface to expose it to Lua to read/manipulate.
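If you take that route, the usual pattern is to own the data in C and register small accessor functions, so Lua never copies the bulk of it. A rough sketch (the names and the fixed-size array are made up for the example):

#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>

#define N_ITEMS 1000000
static double big_data[N_ITEMS];             /* lives in C, never copied into Lua */

static int l_get_item(lua_State *L) {
    lua_Integer i = luaL_checkinteger(L, 1); /* 1-based index from Lua */
    if (i < 1 || i > N_ITEMS)
        return luaL_error(L, "index out of range");
    lua_pushnumber(L, big_data[i - 1]);
    return 1;                                /* one result */
}

int main(void) {
    lua_State *L = luaL_newstate();
    luaL_openlibs(L);
    lua_register(L, "get_item", l_get_item); /* Lua code can now call get_item(i) */
    luaL_dostring(L, "print(get_item(1))");
    lua_close(L);
    return 0;
}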

CFMutableArray grows beyond its capacity

Consider that I have a CFMutableArray object created with the following function call:
CFMutableArrayRef marray = CFArrayCreateMutable(kCFAllocatorDefault, 1, &kCFTypeArrayCallBacks);
According to the CFMutableArray Documentation, the second argument of CFArrayCreateMutable, which is called capacity, is "the maximum number of values that can be contained by the new array. The array starts empty and can grow to this number of values (and it can have less).
Pass 0 to specify that the maximum capacity is not limited. The value must not be negative."
However, if I append more than one value to my new array, it keeps growing. I mean, if the new array already has one value and I append a new one with CFArrayAppendValue(marray, newValue), this value is stored and the array count goes to 2, exceeding its capacity.
So, why does this happen? Did I misunderstand the documentation?
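For what it's worth, a quick C test (compiled with -framework CoreFoundation) reproduces what you describe:

#include <CoreFoundation/CoreFoundation.h>
#include <stdio.h>

int main(void) {
    /* "capacity" of 1, yet appends beyond it still succeed */
    CFMutableArrayRef marray =
        CFArrayCreateMutable(kCFAllocatorDefault, 1, &kCFTypeArrayCallBacks);
    CFArrayAppendValue(marray, CFSTR("first"));
    CFArrayAppendValue(marray, CFSTR("second"));
    CFArrayAppendValue(marray, CFSTR("third"));
    printf("count = %ld\n", (long)CFArrayGetCount(marray));  /* prints 3 */
    CFRelease(marray);
    return 0;
}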
Interesting. I don't think you're misunderstanding the documentation. I will point out that CFMutableArrayRef is "Toll-Free Bridged" on iOS with NSMutableArray and therefore interchangeable. On iOS, [NSMutableArray arrayWithCapacity:] is NOT so limited, and capacity is basically "guidance", NOT a hard upper limit.
It might be worth filing a bug report on that, probably the docs are just wrong.
UPDATE: Just goes to show ya... always, always, always follow the maxim of the best docs are the source. I CMD-clicked on CFArrayCreateMutable to look at the comment in the source .h file.... guess I was right, because that says 'capacity' is a HINT that the implementation may ignore, as it apparently does in this case.
@function CFArrayCreateMutable
⋮
@param capacity A hint about the number of values that will be held by the CFArray.
Pass 0 for no hint. The implementation may ignore this hint, or may use it to
optimize various operations. An array's actual capacity is only limited by
address space and available memory constraints). If this parameter is negative,
the behavior is undefined.
Don't forget, header comments are written by developers, whilst "docs" are written by some tech writer that doesn't have the same depth of knowledge.
From docs:
CFArrayAppendValue
Adds a value to an array giving it the new largest index.
Parameters
theArray
The array to which value is to be added. If theArray is a limited-capacity array and it is full before this operation, the behavior is undefined.
So, I think you should only make sure not to add a value to a full limited-capacity array, or the behavior will be undefined!
Or, you can pass 0 for the capacity parameter to specify that the maximum capacity is not limited.
Update:
As @Cliff Ribaudo pointed out, it seems there is a contradiction between the official documentation and the code documentation:
@param capacity A hint about the number of values that will be held
by the CFArray. Pass 0 for no hint. The implementation may
ignore this hint, or may use it to optimize various
operations. An array's actual capacity is only limited by
address space and available memory constraints).
So we can assume the online documentation is outdated and the code documentation is probably the right one.

iOS: Memory consumption for data type allocation

It may be a basic question, but I am just curious to know the answer. How much memory is occupied when we create each variable, like int, NSString, NSDictionary, NSData, etc., in an iOS Objective-C program? Will it be similar to a C program, based on the size of the data type, or different?
Thank you!
In short, you can't. At least not in a useful fashion.
Just about every class will have any number of allocations that are not directly credited to the instance's allocation. As well, many classes -- NSDictionary, NSArray, NSString -- are actually part of a class cluster and, thus, what is actually allocated is a subclass. Finally, for the various collection and data classes, the size of the associated allocations will vary wildly based on their contents. Some classes -- UIImage, NSData, etc. -- may contain MBs of data that isn't actually represented by a heap allocation, in that they are mapping data from the filesystem.
Or, to summarize, class_getInstanceSize() is useless.
Instead, you need to focus on the memory usage of your app as a systemic characteristic of its operating behavior. The Allocations Instrument can do a good job of measuring memory usage and, more importantly, help you identify what is responsible for consumption therein.
You should consider the size of Objective-C objects an implementation detail that you can't rely on. They may happen to be consistent from one environment to another, but they offer no guarantee of that. Further, there's no consistent and complete way for you to measure their size since they may be implemented in all kinds of clever ways.
If you need precise management of memory allocations, just use C types, which are themselves entirely valid Objective-C.
You can get the size of any class by calling class_getInstanceSize(), for example:
#import <objc/runtime.h>
// Returns only the size of the instance's ivar layout, not of its contents.
size_t size = class_getInstanceSize([NSDictionary class]);
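Just be aware that this number is only the size of the instance's ivar layout, not of anything the object points to. A small plain-C illustration (linked with -framework Foundation so the classes are registered; the value stays the same no matter how much data an instance later holds):

#include <objc/runtime.h>
#include <stdio.h>

int main(void) {
    /* Fixed per-instance sizes of the (cluster) classes, not of their contents. */
    printf("NSDictionary instance: %zu bytes\n",
           class_getInstanceSize(objc_getClass("NSDictionary")));
    printf("NSData instance: %zu bytes\n",
           class_getInstanceSize(objc_getClass("NSData")));
    return 0;
}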
It depends on the content you would like to put into these classes (NSString, NSDictionary, NSData). You could say that NSMutableString and the like change their sizes, but they just reallocate themselves.
The document below, from the Apple developer site, may be helpful:
http://developer.apple.com/library/ios/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/MemoryManagementforYouriOSApp/MemoryManagementforYouriOSApp.html

TStringList, Dynamic Array or Linked List in Delphi?

I have a choice.
I have a number of already ordered strings that I need to store and access. It looks like I can choose between using:
A TStringList
A Dynamic Array of strings, and
A Linked List of strings (singly linked)
and Alan in his comment suggested I also add to the choices:
TList<string>
In what circumstances is each of these better than the others?
Which is best for small lists (under 10 items)?
Which is best for large lists (over 1000 items)?
Which is best for huge lists (over 1,000,000 items)?
Which is best to minimize memory use?
Which is best to minimize loading time to add extra items on the end?
Which is best to minimize access time for accessing the entire list from first to last?
On this basis (or any others), which data structure would be preferable?
For reference, I am using Delphi 2009.
Dimitry in a comment said:
Describe your task and data access pattern, then it will be possible to give you an exact answer
Okay. I've got a genealogy program with lots of data.
For each person I have a number of events and attributes. I am storing them as short text strings but there are many of them for each person, ranging from 0 to a few hundred. And I've got thousands of people. I don't need random access to them. I only need them associated as a number of strings in a known order attached to each person. This is my case of thousands of "small lists". They take time to load and use memory, and take time to access if I need them all (e.g. to export the entire generated report).
Then I have a few larger lists, e.g. all the names of the sections of my "virtual" treeview, which can have hundreds of thousands of names. Again I only need a list that I can access by index. These are stored separately from the treeview for efficiency, and the treeview retrieves them only as needed. This takes a while to load and is very expensive memory-wise for my program. But I don't have to worry about access time, because only a few are accessed at a time.
Hopefully this gives you an idea of what I'm trying to accomplish.
p.s. I've posted a lot of questions about optimizing Delphi here at StackOverflow. My program reads 25 MB files with 100,000 people and creates data structures and a report and treeview for them in 8 seconds but uses 175 MB of RAM to do so. I'm working to reduce that because I'm aiming to load files with several million people in 32-bit Windows.
I've just found some excellent suggestions for optimizing a TList at this StackOverflow question:
Is there a faster TList implementation?
Unless you have special needs, a TStringList is hard to beat because it provides the TStrings interface that many components can use directly. With TStringList.Sorted := True, binary search will be used which means that search will be very quick. You also get object mapping for free, each item can also be associated with a pointer, and you get all the existing methods for marshalling, stream interfaces, comma-text, delimited-text, and so on.
On the other hand, for special needs purposes, if you need to do many inserts and deletions, then something more approaching a linked list would be better. But then search becomes slower, and it is a rare collection of strings indeed that never needs searching. In such situations, some type of hash is often used where a hash is created out of, say, the first 2 bytes of a string (preallocate an array with length 65536, and the first 2 bytes of a string is converted directly into a hash index within that range), and then at that hash location, a linked list is stored with each item key consisting of the remaining bytes in the strings (to save space---the hash index already contains the first two bytes). Then, the initial hash lookup is O(1), and the subsequent insertions and deletions are linked-list-fast. This is a trade-off that can be manipulated, and the levers should be clear.
A TStringList. Pros: has extended functionality, allowing you to dynamically grow, sort, save, load, search, etc. Cons: with a large number of accesses to items by index, Strings[Index] introduces a noticeable performance loss (a few percent) compared to access to an array; memory overhead for each item cell.
A Dynamic Array of strings. Pros: combines the ability to grow dynamically, like a TStrings, with the fastest access by index and minimal memory usage compared to the others. Cons: limited standard "string list" functionality.
A Linked List of strings (singly linked). Pros: fast addition of an item at the end of the list. Cons: slowest access by index and searching, limited standard "string list" functionality, memory overhead for the "next item" pointer, speed overhead for each item's memory allocation.
TList<string>. As above.
TStringBuilder. I do not have a good idea of how to use TStringBuilder as storage for multiple strings.
Actually, there are many more approaches:
linked list of dynamic arrays
hash tables
databases
binary trees
etc
The best approach will depend on the task.
Which is best for small lists (under 10 items)?
Any of them; maybe even a static array with a variable for the total item count.
Which is best for large lists (over 1000 items)?
Which is best for huge lists (over 1,000,000 items)?
For large lists I will choose:
- dynamic array, if I need a lot of access by the index or search for specific item
- hash table, if I need to search by the key
- linked list of dynamic arrays, if I need many item appends and no access by the index
Which is best to minimize memory use?
A dynamic array will eat the least memory. But the question is not just about the overhead, but about the number of items at which this overhead becomes noticeable, and then about how to properly handle that number of items.
Which is best to minimize loading time to add extra items on the end?
A dynamic array may grow dynamically, but with a really large number of items the memory manager may not find a contiguous memory area. A linked list will work as long as there is memory for at least one more cell, but at the cost of a memory allocation for each item. The mixed approach - a linked list of dynamic arrays - should work.
Which is best to minimize access time for accessing the entire list from first to last?
dynamic array.
On this basis (or any others), which data structure would be preferable?
For which task?
If your stated goal is to improve your program to the point that it can load genealogy files with millions of persons in it, then deciding between the four data structures in your question isn't really going to get you there.
Do the math - you are currently loading a 25 MB file with about 100,000 persons in it, which causes your application to consume 175 MB of memory. If you wish to load files with several million persons in them, you can estimate that, without drastic changes to your program, you will need to multiply your memory needs by a factor of ten or more as well. There's no way to do that in a 32-bit process while keeping everything in memory the way you currently do.
You basically have two options:
Not keeping everything in memory at once, instead using a database, or a file-based solution which you load data from when you need it. I remember you had other questions about this already, and probably decided against it, so I'll leave it at that.
Keep everything in memory, but in the most space-efficient way possible. As long as there is no 64 bit Delphi this should allow for a few million persons, depending on how much data there will be for each person. Recompiling this for 64 bit will do away with that limit as well.
If you go for the second option then you need to minimize memory consumption much more aggressively:
Use string interning. Every loaded data element in your program that contains the same data but is kept in a different string is basically wasted memory. I understand that your program is a viewer, not an editor, so you can probably get away with only ever adding strings to your pool of interned strings. Doing string interning with millions of strings is still difficult; the "Optimizing Memory Consumption with String Pools" blog postings on the SmartInspect blog may give you some good ideas. These guys deal regularly with huge data files and had to make it work under the same constraints you are facing. (A small sketch of the interning idea follows at the end of this answer.)
This should also connect this answer to your question - if you use string interning you would not need to keep lists of strings in your data structures, but lists of string pool indexes.
It may also be beneficial to use multiple string pools, like one for names, but a different one for locations like cities or countries. This should speed up insertion into the pools.
Use the string encoding that gives the smallest in-memory representation. Storing everything as a native Windows Unicode string will probably consume much more space than storing strings in UTF-8, unless you deal regularly with strings that contain mostly characters which need three or more bytes in the UTF-8 encoding.
Due to the necessary character set conversion your program will need more CPU cycles for displaying strings, but with that amount of data it's a worthy trade-off, as memory access will be the bottleneck, and smaller data size helps with decreasing memory access load.
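To illustrate the interning idea in a language-neutral way, here is a small C sketch; a real pool would use a hash lookup instead of the linear search, and in Delphi the same idea maps to a dynamic array of strings plus a lookup structure. The names are made up for the example:

#include <stdlib.h>
#include <string.h>

typedef struct {
    char **items;          /* one copy of each distinct string */
    size_t count, cap;
} InternPool;

/* Return the index of s in the pool, adding a copy only if it is not there yet.
   Data structures then store the small index instead of the string itself. */
size_t intern(InternPool *p, const char *s)
{
    for (size_t i = 0; i < p->count; i++)
        if (strcmp(p->items[i], s) == 0)
            return i;                        /* already interned: reuse it */
    if (p->count == p->cap) {
        p->cap = p->cap ? p->cap * 2 : 64;
        p->items = realloc(p->items, p->cap * sizeof *p->items);
    }
    p->items[p->count] = strdup(s);
    return p->count++;
}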
One question: How do you query: do you match the strings or query on an ID or position in the list?
Best for small # strings:
Whatever makes your program easy to understand. Program readability is very important and you should only sacrifice it in real hotspots in your application for speed.
Best for memory (if that is the largest constrained) and load times:
Keep all strings in a single memory buffer (or memory-mapped file) and only keep pointers to the strings (or offsets). Whenever you need a string you can clip out a string using two pointers and return it as a Delphi string. This way you avoid the overhead of the string structure itself (refcount, length int, codepage int) and the memory manager structures for each string allocation.
This only works fine if the strings are static and don't change.
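A language-neutral C sketch of that layout; in Delphi the buffer could be a TBytes (or a memory-mapped file) and the offsets a dynamic array of integers:

#include <stddef.h>

typedef struct {
    const char   *buffer;   /* all string bytes stored back to back */
    const size_t *offsets;  /* offsets[i] = start of string i; offsets[count] = end of buffer */
    size_t        count;
} PackedStrings;

/* Return a pointer to string i and its length; no per-string allocation or header. */
static const char *packed_get(const PackedStrings *p, size_t i, size_t *len)
{
    *len = p->offsets[i + 1] - p->offsets[i];
    return p->buffer + p->offsets[i];
}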
TList, TList<>, array of string and the solution above have a "list" overhead of one pointer per string. A linked list has an overhead of at least 2 pointers (singly linked list) or 3 pointers (doubly linked list). The linked-list solution does not have fast random access, but it allows O(1) resizes, where the other options have O(lg N) (using a growth factor for resizing) or O(N) (using a fixed resize increment).
What I would do:
If < 1000 items and performance is not of utmost importance: use a TStringList or a dynamic array, whichever is easiest for you.
Else, if the data is static: use the trick above. This will give you O(lg N) query time, the least memory used and very fast load times (just gulp it in, or use a memory-mapped file).
All the structures mentioned in your question will fail with large amounts of data (1M+ strings) that need to be dynamically changed in code. In that case I would use a balanced binary tree or a hash table, depending on the type of queries I need to make.
From your description, I'm not entirely sure if it could fit in your design but one way you could improve on memory usage without suffering a huge performance penalty is by using a trie.
Advantages relative to binary search tree
The following are the main advantages of tries over binary search trees (BSTs):
Looking up keys is faster. Looking up a key of length m takes worst case O(m) time. A BST performs O(log(n)) comparisons of keys, where n is the number of elements in the tree, because lookups depend on the depth of the tree, which is logarithmic in the number of keys if the tree is balanced. Hence in the worst case, a BST takes O(m log n) time. Moreover, in the worst case log(n) will approach m. Also, the simple operations tries use during lookup, such as array indexing using a character, are fast on real machines.
Tries can require less space when they contain a large number of short strings, because the keys are not stored explicitly and nodes are shared between keys with common initial subsequences.
Tries facilitate longest-prefix matching, helping to find the key sharing the longest possible prefix of characters all unique.
Possible alternative:
I've recently discovered SynBigTable (http://blog.synopse.info/post/2010/03/16/Synopse-Big-Table) which has a TSynBigTableString class for storing large amounts of data using a string index.
A very simple, single-layer bigtable implementation; it mainly uses disk storage, so it consumes a lot less memory than expected when storing hundreds of thousands of records.
As simple as:
aId := UTF8String(Format('%s.%s', [name, surname]));
bigtable.Add(data, aId)
and
bigtable.Get(aId, data)
One catch: indexes must be unique, and the cost of an update is a bit high (first delete, then re-insert).
TStringList stores an array of pointers to (string, TObject) records.
TList stores an array of pointers.
TStringBuilder cannot store a collection of strings. It is similar to .NET's StringBuilder and should only be used to concatenate (many) strings.
Resizing dynamic arrays is slow, so do not even consider it as an option.
I would use Delphi's generic TList<string> in all your scenarios. It stores an array of strings (not string pointers). It should have faster access in all cases due to no (un)boxing.
You may be able to find or implement a slightly better linked-list solution if you only want sequential access. See Delphi Algorithms and Data Structures.
Delphi promotes its TList and TList<>. The internal array implementation is highly optimized and I have never experienced performance/memory issues when using it. See Efficiency of TList and TStringList
