Is "alphanumeric" the best way to sort an array in Delphi?
I found this comment in some old code in my application:
" The elements of this array must be in ascending, alphanumeric
sort order."
If so, what could be the reason?
-Vas
There's no "best" way to sort the elements of an array (or any collection, for that matter). Sort order is a human convention (things are not usually sorted on their own), so I'm guessing the comment has more to do with what your program is expecting.
More concretely, there's probably another section of code elsewhere that expects the array elements to be sorted alphanumerically. It can be something as simple as displaying the array in a TreeView already ordered, so that the calling code doesn't have to sort it first.
Arrays are stored as a contiguous block of memory so that access is fast. Internally the compiler just calls GetMem, asking for SizeOf(Type) * array size. Nothing about the order of the elements affects the performance or memory size of arrays in general. The requirement MUST come from the program logic.
Most often an array is sorted to provide faster search times. Given a list of length L, I can compare with the midpoint (L DIV 2) and quickly determine whether I need to look at the greater half or the lesser half, and recursively continue using this pattern until I either have nothing left to divide or have found my match. This is called a binary search. If the list is NOT sorted, then this type of operation is not available and instead I must inspect every item in the list until I find a match or reach the end.
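The halving described above can be sketched roughly as follows. This is an illustration in Python rather than Delphi, and the function name `binary_search` and the sample data are made up for the example; it uses a loop instead of recursion, which is an equivalent formulation.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2      # the midpoint of the remaining range
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1            # continue in the greater half
        else:
            high = mid - 1           # continue in the lesser half
    return -1                        # nothing left to divide: not found


names = ["ant", "bee", "cat", "dog", "eel"]   # must already be sorted
print(binary_search(names, "dog"))
print(binary_search(names, "fox"))
```

Each pass discards half of the remaining range, so a list of a million items needs about twenty comparisons instead of up to a million.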
No, there is no "best way" of sorting. And that's one of the reasons why you have multiple sorting techniques out there.
With QuickSort, you even provide the comparison function where you determine what order you ultimately want.
Sorting an array in some way is useful when you're trying to do a binary search on the array. A binary search can be extremely fast compared to other methods. But if the sort order is wrong, the search will be unable to find the record.
Other reasons to keep arrays sorted are almost always for cosmetic reasons, to decide how the array is sent to some output.
The best way to re-order an array depends on the length of the array and the type of data it contains. A QuickSort algorithm would give a fast result in most cases. Delphi uses it internally when you're working with string lists and some other lists. The question is, do you really need to sort it? Does it even need to stay an array?
But the best way to keep an array sorted is by keeping it sorted from the first element that you add to it! In general, I write a wrapper around my array types, which takes care of keeping the array ordered. The 'Add' method searches for the biggest value in the array that's less than or equal to the value that I want to add. I then insert the new item right after that position. To me, that would be the best solution. (With big arrays you could use the binary search method again to find the location where you need to insert the new record.) It's slower than appending records to the end, but you never have to wonder whether it's sorted or not, since it is.
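A minimal sketch of such a wrapper, written in Python for illustration rather than Delphi. The class name `SortedArray` and its methods are hypothetical; `bisect.bisect_right` from the standard library does the binary search for the position right after the largest element that is less than or equal to the new value.

```python
import bisect


class SortedArray:
    """Hypothetical wrapper that keeps its items in ascending order."""

    def __init__(self):
        self._items = []

    def add(self, value):
        # Binary search for the slot just after the last element <= value.
        pos = bisect.bisect_right(self._items, value)
        # Inserting shifts the following elements, so add() is O(n) overall.
        self._items.insert(pos, value)

    def __contains__(self, value):
        # Because the items are always sorted, lookups can use binary search.
        pos = bisect.bisect_left(self._items, value)
        return pos < len(self._items) and self._items[pos] == value

    def as_list(self):
        return list(self._items)


numbers = SortedArray()
for n in [5, 1, 4, 1, 3]:
    numbers.add(n)
print(numbers.as_list())
```

The caller never sees an unsorted state, which is the point of the wrapper: the ordering guarantee lives in one place instead of in a comment.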
I have some questions about the ideas proposed in this video.
The speaker shows an array that holds values and pointers, and he also shows a separate "free" linked list, that is updated whenever an item is added/removed.
Why are these used? Doesn't using an array / limiting yourself to a set of free nodes defeat the purpose of a linked list?
Isn't one of the perks of using a linked list the ability to traverse fragmented data?
Why use these free nodes, when you can dynamically allocate storage?
The proposed structure, to me, doesn't seem dynamic at all, and is in fact a convoluted and inefficient array.
The approach you mention makes sense in certain use cases. For example if the common case is that the array is 90% full and most of the time is spent iterating over it, you can very quickly loop over an array and just skip the few empty items. This can be much, much faster than "pointer chasing" which plain linked lists use, because the CPU's hardware prefetcher can predict which memory you will need in advance.
And compared with a plain array and no free list, it has the advantage of O(1) allocation of an element into an empty slot.
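To make the trade-off concrete, here is a simplified sketch in Python of an array with a free list. It is not the structure from the video (which is not reproduced here); the class `SlotPool` and its fields are assumptions made for illustration. The free list is threaded through the empty slots, so finding an empty slot is O(1), while iteration is a plain scan over contiguous storage.

```python
class SlotPool:
    """Fixed-capacity array of slots with a free list of empty slot indices."""

    def __init__(self, capacity):
        self.values = [None] * capacity
        self.used = [False] * capacity
        # next_free[i] is the index of the free slot that follows slot i, or -1.
        self.next_free = list(range(1, capacity)) + [-1]
        self.free_head = 0 if capacity > 0 else -1

    def allocate(self, value):
        """Store value in an empty slot in O(1) and return the slot index."""
        if self.free_head == -1:
            raise MemoryError("pool is full")
        slot = self.free_head
        self.free_head = self.next_free[slot]   # pop the head of the free list
        self.values[slot] = value
        self.used[slot] = True
        return slot

    def release(self, slot):
        """Empty a slot in O(1) and push it onto the free list."""
        if not self.used[slot]:
            raise ValueError("slot is already free")
        self.values[slot] = None
        self.used[slot] = False
        self.next_free[slot] = self.free_head
        self.free_head = slot

    def __iter__(self):
        # Linear scan over the array, skipping the few empty slots.
        for index, in_use in enumerate(self.used):
            if in_use:
                yield self.values[index]


pool = SlotPool(3)
a = pool.allocate("a")
b = pool.allocate("b")
c = pool.allocate("c")
pool.release(b)
d = pool.allocate("d")   # reuses the slot that "b" occupied
print(list(pool))
```

In Python the cache behaviour mentioned above does not really apply; the sketch only shows the bookkeeping, which would carry over to a language with real contiguous arrays.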
How to check if an array is sorted?
I am sorting using sort descriptors. Is there any API to check if an array is already in sorted order in Swift/Objective-C?
Thanks
I think there is no framework method for this. Simply iterate through the array and check whether each element is greater than or equal to the previous one (or less than or equal, or whatever kind of ordering you are looking for). This is the easiest way. Please look at this Question Solution
As far as I know, there isn't a built-in way to check if an array is already sorted according to sort descriptors. The best way to check is to iterate through the array and verify that no element should come before the element that precedes it (using whatever definition of "should come before" you want for your sort). If you're sorting custom objects, you can write some sort of compareTo method that compares two objects of your class, which will make it convenient to check using the method I described.
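The pairwise check described in these answers might look like the following sketch. It is written in Python rather than Swift or Objective-C, and the function name `is_sorted` and its `key` and `reverse` parameters are assumptions chosen for the example; `key` plays the role of the property a sort descriptor would name.

```python
def is_sorted(items, key=lambda item: item, reverse=False):
    """Return True if every element is in order relative to its predecessor."""
    keys = [key(item) for item in items]
    pairs = zip(keys, keys[1:])          # each element with the one after it
    if reverse:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)


people = [{"name": "Ann", "age": 31}, {"name": "Bob", "age": 25}]
print(is_sorted(people, key=lambda p: p["name"]))
print(is_sorted(people, key=lambda p: p["age"]))
```

The check is a single O(n) pass, which is cheaper than sorting again just to be safe.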
I'm simply curious as lately I have been seeing the use of Hashmaps in Java and wonder if Delphi's Sorted String list is similar at all.
Does the TStringList object generate a Hash to use as an index for each item? And how does the search string get checked against the list of strings via the Find function?
I make use of Sorted TStringLists very often and I would just like to understand what is going on a little bit more.
Please assume I don't know how a hash map works, because I don't :)
Thanks
I'm interpreting this question, quite generally, as a request for an overview of lists and dictionaries.
A list, as almost everyone knows, is a container that is indexed by contiguous integers.
A hash map, dictionary or associative array is a container whose index can be of any type. Very commonly, a dictionary is indexed with strings.
For sake of argument let us call our lists L and our dictionaries D.
Lists have true random access. An item can be looked-up in constant time if you know its index. This is not the case for dictionaries and they usually resort to hash-based algorithms to achieve efficient random access.
A sorted list can perform binary search when you attempt to find a value. Finding a value, V, is the act of obtaining the index, I, such that L[I]=V. Binary search is very efficient. If the list is not sorted then it must perform linear search which is much less efficient. A sorted list can use insertion sort to maintain the order of the list – when a new item is added, it is inserted at the correct location.
You can think of a dictionary as a list of <Key,Value> pairs. You can iterate over all pairs, but more commonly you use index notation to look-up a value for a given key: D[Key]. Note that this is not the same operation as finding a value in a list – it is the analogue of reading L[I] when you know the index I.
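The distinction between reading by index, finding a value, and looking up by key can be shown with a small Python sketch. The names `L` and `D` follow the text; the contents are made up for illustration, and Python's built-in `dict` stands in for a hash-based dictionary.

```python
L = ["pear", "apple", "fig"]              # list: indexed by 0, 1, 2, ...
D = {"pear": 3, "apple": 7, "fig": 1}     # dictionary: indexed by strings

by_index = L[1]            # constant-time read when the index is known
found_at = L.index("fig")  # finding a value: a linear search on an unsorted list
by_key = D["apple"]        # key lookup: the analogue of L[I], done via a hash

print(by_index, found_at, by_key)

# Iterating over all <Key, Value> pairs is also possible.
pairs = sorted(D.items())
```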
In older versions of Delphi it was common to coax dictionary behaviour out of string lists. The performance was terrible. There was little flexibility in the contents.
With modern Delphi, there is TDictionary, a generic class that can hold anything. The implementation uses a hash and although I have not personally tested its performance I understand it to be respectable.
There are commonly used algorithms that optimally use all of these containers: unsorted lists, sorted lists, dictionaries. You just need to use the right one for the problem at hand.
TStringList holds the strings in an array.
If you call Sort on an otherwise unsorted (Sorted property = false) string list then a QuickSort is performed to sort the items.
The same happens if you set Sorted to true.
If you call IndexOf on an unsorted string list (Sorted property = false; even if you explicitly called Sort, the list is considered unsorted as long as the Sorted property isn't true), a linear search is performed, comparing all strings from the start until a match is found. Find itself assumes a sorted list, and IndexOf only delegates to Find when Sorted is true.
If you call Find on a sorted string list (Sorted property = true) then a binary search is performed (see http://en.wikipedia.org/wiki/Binary_search for details).
If you add a string to a sorted string list, a binary search is performed to determine the correct insertion position, all following elements in the array are shifted by one and the new string is inserted.
Because of this, insertion performance gets a lot worse the larger the string list is. If you want to insert a large number of entries into a sorted string list, it's usually better to turn sorting off, insert the strings, then set Sorted back to true, which performs a quick sort.
The disadvantage of that approach is that you will not be able to prevent the insertion of duplicates.
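The two loading strategies, and the duplicate check that only the first one allows, can be sketched like this. The example is Python, not Delphi; the function names are hypothetical, `bisect` stands in for the binary search TStringList performs, and Python's `list.sort` is not a quick sort, so only the shape of the trade-off carries over.

```python
import bisect


def load_keeping_sorted(values):
    """Insert one at a time: binary search plus a shift on every insert."""
    result = []
    for value in values:
        bisect.insort(result, value)
    return result


def load_then_sort(values):
    """Append everything first, then sort once at the end."""
    result = list(values)
    result.sort()
    return result


def load_sorted_unique(values):
    """Keeping the list sorted makes it cheap to reject duplicates on insert."""
    result = []
    for value in values:
        pos = bisect.bisect_left(result, value)
        if pos == len(result) or result[pos] != value:
            result.insert(pos, value)
    return result


words = ["delta", "alpha", "charlie", "alpha", "bravo"]
print(load_keeping_sorted(words))
print(load_then_sort(words))
print(load_sorted_unique(words))
```

Both bulk strategies end with the same list; only the third one drops the repeated entry, because it can see the neighbouring element at the moment of insertion.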
EDIT: If you want a hash map, use TDictionary from unit Generics.Collections
You could look at the source code, since that comes with Delphi. Ctrl-Click on the "sort" call in your code.
It's a simple alphabetical sort in non-Unicode Delphi, and a slightly more complex Unicode one in later versions. You can supply your own comparison for custom sort orders. Unfortunately I don't have a recent version of Delphi so can't confirm, but I expect that under the hood there's a proper Unicode-aware and locale-aware string comparison routine. Unicode sorting/string comparison is not trivial and a little web searching will point out some of the pitfalls.
Supplying your own comparison routine is often done when you have delimited text in the strings or objects attached to them (the Objects property). In those cases you often want to sort by a property of the object or by something other than the first field in the string. Or it might be as simple as wanting a numerical sort on the strings (so "2" comes after "1" rather than after "19").
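Both cases can be illustrated with a short Python sketch (not Delphi's CustomSort API). The comparison function `compare_numeric` and the comma-delimited sample records are invented for the example; `functools.cmp_to_key` adapts a two-argument comparison of the kind a custom sort routine would receive.

```python
from functools import cmp_to_key


def compare_numeric(a, b):
    """Compare two strings by their integer value: negative, zero, or positive."""
    return (int(a) > int(b)) - (int(a) < int(b))


values = ["19", "2", "1"]
alphabetical = sorted(values)                               # plain string order
numerical = sorted(values, key=cmp_to_key(compare_numeric))  # custom comparison

# Delimited text: sort by the second field instead of the first.
records = ["smith,42", "jones,7", "adams,19"]
by_second_field = sorted(records, key=lambda r: int(r.split(",")[1]))

print(alphabetical)
print(numerical)
print(by_second_field)
```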
There is also a THashedStringList, which could be an option (especially in older Delphi versions).
BTW, the Unicode sort routines for TStringList are quite slow. If the strings only contain Ansi characters (which they will if you use English exclusively), you can override the TStringList.CompareStrings method and use customised Ansi string comparisons. I use my own customised TStringList class that does this, and it is 4 times faster than the standard TStringList class for a sorted list, for both reading and writing strings from/to the list.
Delphi's dictionary type (in generics-enabled versions of Delphi) is the closest thing to a hash map that ships with Delphi. THashedStringList makes lookups faster than they would be in a sorted string list. You can do lookups using a binary search in a sorted string list, so it's faster than brute-force searches, but not as fast as a hash.
The general theory of a hash is that it is unordered, but very fast on lookup and insertion. A sorted list is reasonably fast on insertion until the size of the list gets large, although it's not as efficient as a dictionary for insertion.
The big benefit of a list is that it is ordered but a hash-lookup dictionary is not.
Given two linked lists of integers, I was asked to return a linked list which contains the non-common elements. I know how to do it in O(n^2). Is there any way to do it in O(n)?
Use a hash table.
Iterate through the first linked list, entering the values you come across into a hash table.
Iterate through the second linked list, adding any element not found in the hash table to your list of non-common elements.
This solution should be O(n), assuming no collisions in the hash table.
Create a new empty list. Build a hash table for each list from its elements (complexity n). Then iterate over each list sequentially and, while iterating, put into the new list those elements which are not present in the other list's hash table (complexity n). Overall complexity is O(n).
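A rough sketch of the hash-table approach in Python, assuming "non-common" means elements that appear in exactly one of the two lists. The `Node` class and the helper names are hypothetical, and Python's built-in `set` serves as the hash table.

```python
class Node:
    """Singly linked list node holding an integer."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_values(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_values(head):
    """Collect the values of a linked list into a Python list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def non_common(a, b):
    """Return a new list of the elements found in only one of a and b."""
    seen_a = set(to_values(a))      # one pass over each list to fill the tables
    seen_b = set(to_values(b))
    dummy = tail = Node(None)       # dummy head simplifies appending
    for head, other in ((a, seen_b), (b, seen_a)):
        node = head
        while node is not None:
            if node.value not in other:   # expected O(1) hash lookup
                tail.next = Node(node.value)
                tail = tail.next
            node = node.next
    return dummy.next


result = non_common(from_values([1, 2, 3, 4]), from_values([3, 4, 5, 6]))
print(to_values(result))
```

Every node is visited a constant number of times, so the running time is O(n) in the combined length, as long as hash lookups stay near constant time.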
If they're unsorted, then I don't believe it is possible to get better than O(n^2). However, you can do better by sorting them... you can sort in reasonably fast time, and then get something like O(nlogn) (I'm not certain that's what it would be, but I think it can be that fast if you use the right algorithm).
According to the Wikipedia article on linked lists, inserting in the middle of a linked list is considered O(1). I would think it would be O(n). Wouldn't you need to locate the node which could be near the end of the list?
Does this analysis not account for the finding of the node operation (though it is required) and just the insertion itself?
EDIT:
Linked lists have several advantages over arrays. Insertion of an element at a specific point of a list is a constant-time operation, whereas insertion in an array may require moving half of the elements, or more.
The above statement is a little misleading to me. Correct me if I'm wrong, but I think the conclusion should be:
Arrays:
Finding the point of insertion/deletion O(1)
Performing the insertion/deletion O(n)
Linked Lists:
Finding the point of insertion/deletion O(n)
Performing the insertion/deletion O(1)
I think the only time you wouldn't have to find the position is if you kept some sort of pointer to it (as with the head and the tail in some cases). So we can't flatly say that linked lists always beat arrays for insert/delete operations.
You are correct, the article considers "Indexing" as a separate operation. So insertion is itself O(1), but getting to that middle node is O(n).
The insertion itself is O(1). Node finding is O(n).
No, when you decide that you want to insert, it's assumed you are already in the middle of iterating through the list.
Operations on Linked Lists are often done in such a way that they aren't really treated as a generic "list", but as a collection of nodes--think of the node itself as the iterator for your main loop. So as you're poking through the list you notice as part of your business logic that a new node needs to be added (or an old one deleted) and you do so. You may add 50 nodes in a single iteration and each of those nodes is just O(1) the time to unlink two adjacent nodes and insert your new one.
For purposes of comparing with an array, which is what that chart shows, it's O(1) because you don't have to move all the items after the new node.
So yes, they are assuming that you already have the pointer to that node, or that getting the pointer is trivial. In other words, the problem is stated: "given node at X, what is the code to insert after this node?" You get to start at the insert point.
Insertion into a linked list is different than iterating across it. You aren't locating the item, you are resetting pointers to put the item in there. It doesn't matter if it is going to be inserted near the front end or near the end, the insertion still involves pointers being reassigned. It'll depend on how it was implemented, of course, but that is the strength of lists - you can insert easily. Accessing via index is where an array shines. For a list, however, it'll typically be O(n) to find the nth item. At least that's what I remember from school.
Inserting is O(1) once you know where you're going to put it.
Does this analysis not account for the finding of the node operation (though it is required) and just the insertion itself?
You got it. Insertion at a given point assumes that you already hold a pointer to the item that you want to insert after:
InsertItem(item * newItem, item * afterItem)
No, it does not account for searching. But if you already have hold of a pointer to an item in the middle of the list, inserting at that point is O(1).
If you have to search for it, you'd have to add on the time for searching, which should be O(n).
Because it does not involve any looping.
Inserting is like:
insert element
link to previous
link to next
done
this is constant time in any case.
Consequently, inserting n elements one after the other is O(n).
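The steps above can be sketched for a doubly linked list as follows. This is Python for illustration; the `Node` class and the `insert_after` and `find` functions are made-up names, with `insert_after` mirroring the InsertItem(newItem, afterItem) idea of already holding the node to insert after.

```python
class Node:
    """Doubly linked list node."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


def insert_after(after_item, new_item):
    """Link new_item in right after after_item: a fixed number of steps, O(1)."""
    new_item.prev = after_item              # link to previous
    new_item.next = after_item.next         # link to next
    if after_item.next is not None:
        after_item.next.prev = new_item
    after_item.next = new_item


def find(head, value):
    """Locating a node is the separate O(n) part: walk until the value matches."""
    node = head
    while node is not None and node.value != value:
        node = node.next
    return node


head = Node("a")
insert_after(head, Node("c"))
insert_after(find(head, "a"), Node("b"))    # O(n) search, then O(1) insert

forward = []
node = head
while node is not None:
    forward.append(node.value)
    node = node.next
print(forward)
```

No loop appears in `insert_after`, which is why the insertion itself is constant time regardless of the list's length.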
The most common cases are probably inserting at the beginning or at the end of the list (and the ends of the list might take no time to find).
Contrast that with inserting items at the beginning or the end of an array (which requires resizing the array if it's at the end, or resizing and moving all the elements if it's at the beginning).
The article is about comparing arrays with lists. Finding the insert position for both arrays and lists is O(N), so the article ignores it.
O(1) depends on the fact that you already have the item before or after which you will insert the new item. If you don't, it's O(n) because you must find that item first.
I think it's just a case of what you choose to count for the O() notation. For insertion, the operations normally counted are copy operations. With an array, inserting in the middle involves copying everything above the location up in memory. With a linked list, this becomes setting two pointers. Either way, you need to find the location before you can insert.
If you already have a reference to the node to insert after, the operation is O(1) for a linked list.
For an array it is still O(n), since you have to move all subsequent elements.