F# Hiding stuff? - f#

Sorry for the bad headline; I couldn't find the right words.
At the moment I'm trying to make some basic data structures that F# can use in any situation; the first one is a doubly linked list.
My question is not so much how to get it implemented, but rather whether there is a way to hide the ugliness of the data structure. In short form, I have a node type that could look like
type Node<'N> =
    | Node of Node<'N> ref * 'N * Node<'N>
    | Empty
and analysing this when we have more than three items in the list is rather error prone. So is there a way I can change the "look" the user of the library sees, so it looks more like a List from .NET? I'm asking for a way that doesn't rely on an already established data type, and one that doesn't just return a string representation ( " ... " ).

You can wrap your F# type in a class and keep the actual F# representation hidden. For example, if you wanted a super simple mutable list, you could do something like this:
type private MyListNode<'T> =
  | Empty
  | Cons of 'T * MyListNode<'T>

type MyList<'T>() =
  let mutable nodes = Empty
  member x.Prepend(el) = nodes <- Cons(el, nodes)
  member x.ToArray() =
    let rec loop el = seq {
      match el with
      | Empty -> ()
      | Cons(x, xs) ->
          yield x
          yield! loop xs }
    loop nodes |> Array.ofSeq
The C# user can work with MyList, which is an ordinary class with Prepend and ToArray methods. The MyListNode type is private (hidden inside your F# library), and C# users will never see it.

This is not an answer to the question, but to the comments, because what I'm going to say requires diagrams and so it won't work in comments.
kam wrote:
But my doubly linked list operations run in O(1), and if we assume that it is only the data, and not the "pointers", that is immutable, then you could still copy the whole list in O(1) time, since the only thing you do when adding or deleting is change or create a pointer (ref cell) to the old list; then we still have a copy of the old list without copying every single element again.
If you try to do that, you will find that with a doubly-linked list, you can't, in fact, preserve the old list pointers. Here's why.
With a singly-linked list, you can prepend to a list in O(1) time while maintaining any pointers to the old list intact. Here's an example:
Old list containing three items:
New list after prepending a new head:
Note how the reference from the other code has stayed intact. The old list, referenced by the other code, is ["Item 1"; "Item 2"; "Item 3"]. The new list is ["New head item"; "Item 1"; "Item 2"; "Item 3"]. But the reference held by a different part of the code still points to a well-formed list. The "well-formed" part is important, as you're about to see.
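To make that concrete, here is a minimal sketch in Java (the node class and names are illustrative; the point itself is language-neutral): prepending allocates a single node and reuses the old nodes unchanged, so code that kept a reference to the old head still sees the same three-item list.

// Minimal immutable singly linked node: prepending reuses the old nodes.
final class Node<T> {
    final T value;
    final Node<T> next;   // null marks the end of the list
    Node(T value, Node<T> next) { this.value = value; this.next = next; }
}

class PrependDemo {
    public static void main(String[] args) {
        Node<String> oldList =
            new Node<>("Item 1", new Node<>("Item 2", new Node<>("Item 3", null)));
        // O(1) prepend: one allocation; the three old nodes are shared, not copied.
        Node<String> newList = new Node<>("New head item", oldList);
        // Code holding 'oldList' still sees exactly the old three-item list.
        System.out.println(newList.value + " -> " + newList.next.value);
    }
}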
With a doubly-linked list, things get more complicated — and it turns out that it is not possible to maintain immutability and have O(1) time. First let's look at the old list containing three items:
This is a well-formed doubly-linked list. It obeys the following properties that all well-formed doubly-linked lists should obey:
All nodes have a Fwd and Back pointer.
All nodes but the head node have a valid (non-null) Back pointer. Only the head node's Back pointer is null.
All nodes but the tail node have a valid (non-null) Fwd pointer. Only the tail node's Fwd pointer is null.
From any node that isn't the tail, going forward and then back should bring you back to the same node you started at.
From any node that isn't the head, going back and then forward should bring you back to the same node you started at.
Now, how do we add a new head item while still making sure that the reference from the other code continues to point to a well-formed doubly-linked list?
Here's a first attempt. We add the new head item, adjust its Fwd pointer to point to the "old" head node, and rewrite that node's Back pointer to point to the new head node:
This is still a well-formed list, as you can easily verify. All five properties still hold true for every node. But wait! The reference from some other part of the code has had its list changed out from under it! Where before it was pointing to a list of three items, now it's pointing to the second item of a list of four items! If that other code is only iterating forwards, it won't notice a change. But the minute it tries to iterate backward, it will notice that there's a new head item that wasn't there before! We have broken the immutability promise. Immutability is a guarantee to other code that consumes our data structure that "If you have a reference to this data structure, the data you see will never change out from under you." And we just broke that promise: the old code used to see the list ["Item 1"; "Item 2"; "Item 3"], and now it sees the list ["New head item"; "Item 1"; "Item 2"; "Item 3"].
Okay, then. Are there ways to keep that promise, and not change what the other code sees? Well, we could try not rewriting that old head node; that way the old code still sees a doubly-linked list of three items, and everyone's happy, right? Well, let's see what it would look like if we did it that way:
Great: the other code still sees the exact same doubly-linked list that it used to see, and there's no way to get from the old list to the new head node. So any part of that other code that tries to go backwards from the head of the list will find that the head still goes to null, like it should. But wait: what about the five properties of a well-formed list? Well, it turns out we've violated property #4: from the head node, going forward and then going back ends up at a null pointer, not the node we started at. So we no longer have a well-formed list: bummer.
Okay, so that approach won't work. What else could we try? Well... hey! I have an idea! Let's just make a copy of the old head node, and adjust the copy while we leave the old head node alone! That's still O(1) since we know we're only copying one node. Then the other code sees exactly what it used to see, a list of three items, but the new list has four items. Brilliant! Just what we want, right? Well, let's look at it:
Okay, does this work? Well, the other code has a reference to the old, unchanged head node, so that's fine: it can't ever accidentally see the new data, so it still continues to see exactly what it used to have, a list of three items. Good. And from the new head node, we can go forward and back and end up where we started, so that's good... but wait... no, there's still a problem. From the copy of item 1 node, going forward and then back takes us to the old "item 1" node, not to the "copy of item 1" node. So we have still violated the properties of well-formed lists, and this list isn't well-formed either.
There's an answer to that one, too: copy the node with item 2 in it. I'm getting tired of drawing diagrams and this answer is getting long, so I'll let you work that one out for yourself — but you'll quickly see that then, the node with the copy of item 2 has the same problem as before: going forward and back takes you to the "old" item 2. (Or else you've adjusted the "old" item 2 node, and thereby broken the immutability promise since the other code can now see the "new" data via some series of Fwd and/or Back operations).
But there's a solution to that one, too: just copy item 3 as well. I won't draw that diagram either, but you can work it out for yourself. And you'll find that once you've copied items 1, 2, and 3 into the new list, you've managed to satisfy both the immutability promise, and all the properties of well-formed lists. The other code still sees the untouched old list, and the new list has four items in it. The only problem is, you had to copy every item in the list — an O(N) operation, by definition — in order to achieve this result.
Summary: Singly-linked lists have three properties:
You can prepend items in O(1) time.
Other code that had a reference to the old list still sees the same data after your prepend operation.
The old list and the new list are both well-formed.
With doubly-linked lists, however, you only get to have two of those three properties. You can have an O(1) prepend operation and maintain well-formed lists, but then any other code will see the list data change. Or you could have O(1) prepend and still have the other code see the same data it used to, but then your lists will no longer be well-formed. Or, by copying every node in the list, you can let the other code still see the same data it used to, AND your new list will be well-formed — but you had to do an O(N) operation in order to achieve this result.
This is why I said that it's not possible to have an immutable doubly-linked list with O(1) operations.

Related

LRU cache with a singly linked list

Most LRU cache tutorials emphasize using both a doubly linked list and a dictionary in combination. The dictionary holds both the value and a reference to the corresponding node on the linked list.
When we perform a remove operation, we look up the linked-list node in the dictionary, and then we have to remove it from the list.
Now here's where it gets weird. Most tutorials argue that we need the preceding node in order to remove the current node from the linked list. This is done in order to get O(1) time.
However, there is a way to remove a node from a singly linked list in O(1) time: we set the current node's value to the next node's value and then unlink the next node.
My question is: why do all these tutorials show how to implement an LRU cache with a doubly linked list when we could save constant space by using a singly linked list?
You are correct: a singly linked list can be used instead of the doubly linked list, as described below:
The standard way is a hashmap pointing into a doubly linked list to make delete easy. To do it with a singly linked list without using an O(n) search, have the hashmap point to the preceding node in the linked list (the predecessor of the one you care about, or null if the element is at the front).
Retrieve list node:
  node = hashmap(key) ? hashmap(key)->next : list.head
Delete (case where hashmap(key) is non-null, i.e. the node is not at the front):
  successornode = hashmap(key)->next->next      // the node after the one being removed
  hashmap(successornode->key) = hashmap(key)    // the successor's predecessor is now our predecessor
  hashmap(key)->next = successornode            // unlink the removed node
  hashmap.delete(key)
Why is the doubly linked list so common in LRU solutions then? It is easier to understand and use.
If optimization is an issue, then the trade-off of the slightly less simple singly-linked-list solution is definitely worth it.
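Here is a minimal Java sketch of that scheme (class and method names are illustrative, not from any library). The map stores, for each key, the node preceding that key's node; a sentinel head node stands in for the "null if the element is at the front" case, so unlinking never needs a traversal.

import java.util.HashMap;
import java.util.Map;

// Sketch: singly linked list where the map stores each key's *predecessor* node.
class SinglyLinkedLru<K, V> {
    static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final Node<K, V> head = new Node<>(null, null);   // sentinel, precedes the first real node
    private final Map<K, Node<K, V>> prev = new HashMap<>();  // key -> node *before* that key's node

    void addFirst(K key, V value) {
        Node<K, V> node = new Node<>(key, value);
        node.next = head.next;
        if (head.next != null) prev.put(head.next.key, node); // old first node's predecessor changes
        head.next = node;
        prev.put(key, head);
    }

    void remove(K key) {                      // O(1): no list traversal; assumes the key is present
        Node<K, V> p = prev.get(key);         // predecessor of the node being removed
        Node<K, V> node = p.next;
        Node<K, V> successor = node.next;
        p.next = successor;                   // unlink
        if (successor != null) prev.put(successor.key, p);
        prev.remove(key);
    }
}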
There are a few complications with swapping the payload:
The payload could be large (such as buffers).
Part of the application code may still refer to the payload (have it pinned).
There may be locks or mutexes involved (which can be owned by the DLL/hash nodes and/or the payload).
In any case, modifying the DLL affects at most 2*2 pointers; swapping the payload needs a memcpy for the swap plus walking the hash chain (twice), which could touch any node in the structure.
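For reference, a minimal sketch of the value-swap trick from the question (illustrative node type, not a library API): it removes a node in O(1) without knowing the predecessor, at the cost of the complications listed above, and it only works when the node is not the tail.

// Sketch of "copy the successor's value, then drop the successor".
final class ListNode<T> {
    T value;
    ListNode<T> next;
    ListNode(T value) { this.value = value; }
}

class RemoveByCopy {
    // O(1) removal without knowing the predecessor; 'node' must not be the tail node.
    static <T> void remove(ListNode<T> node) {
        ListNode<T> successor = node.next;   // assumed non-null
        node.value = successor.value;        // this node now impersonates its successor
        node.next = successor.next;          // unlink the successor node
    }
}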

Merge sorting with linked chains in java

I am looking for a template of sorts for merging two linked chains that have already been sorted. I'm still fairly new to Java, and this seems to be a pretty challenging task to accomplish with the limited knowledge I have. I have an understanding of how to merge sort an array, but when it comes to linked lists I seem to be drawing blanks. Any help you all could give me, be it actual code or simply advice on where to start, would be greatly appreciated.
Thank you for your time!
If the two linked lists are already sorted, then it is easy to merge them. I'll give you the algorithm, but you should write the code yourself, since this seems to be a school project. First make a new linked list and assign its head to be the smaller of list1Head and list2Head. Then walk the two lists, each time picking the smaller of the two current nodes, appending it to the new list, and advancing the list it was picked from to its .Next. If one of the lists runs out of nodes, append the rest of the other list directly to the new list. Done.
Can't you look at the first element in each list and take the smallest? This is the start of the new list. Remove it from the front of whichever list it came from. Now look at the first elements again, take the smallest, and make it the second element in the new list. Then just repeat this process, zipping the two lists together.
If you want to avoid creating a new list, then just find the smallest element, then look at the thing it is pointing at and at the beginning of the other list and see which is smaller. If you are not already pointing at the smaller one, update the pointer so that you are. Then rinse and repeat.
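A minimal Java sketch of the merge described in the answers above (the Node class and names are illustrative, not from any library): it relinks the existing nodes rather than allocating new ones, and appends whatever remains when one chain runs out.

// Merge two already-sorted singly linked chains by relinking their nodes.
final class Node {
    int value;
    Node next;
    Node(int value) { this.value = value; }
}

class MergeSortedChains {
    static Node merge(Node a, Node b) {
        Node dummy = new Node(0);          // placeholder head, discarded at the end
        Node tail = dummy;
        while (a != null && b != null) {   // repeatedly take the smaller front node
            if (a.value <= b.value) { tail.next = a; a = a.next; }
            else                    { tail.next = b; b = b.next; }
            tail = tail.next;
        }
        tail.next = (a != null) ? a : b;   // append whatever remains of the other chain
        return dummy.next;
    }
}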

In practice is Linked List addition O(N) or O(1)?

It is said that addition and deletion in a linked list happen in constant time, i.e. O(1), but access to elements happens in time proportional to the size of the list, i.e. O(N). My question is: how can you remove or add any element without first traversing to it? In that case, isn't addition or deletion also of the order O(N)?
Taking the example of Java, what happens when we use the API like this:
LinkedList<Stamp> stamps = new LinkedList<>();
stamps.add(new Stamp("Brazil"));
stamps.add(new Stamp("Spain"));
// ...
stamps.add(new Stamp("UnitedStates")); // say this is the kth element in the list
// ...
stamps.add(new Stamp("India"));
Then when someone does stamps.remove(k), how can this operation happen in constant time?
Deleting items from a linked list works in constant time only if you have a pointer to the actual node on the list. If the only thing you have is the information that you want to delete the "n"th node, then there is no way to know which one it is - in which case you are required to traverse the list first, which is of course O(n).
Adding, on the other hand, always works in constant time, since it is in no way connected to the number of elements already contained by the list. In the example provided, every call to add() is O(1), not including the cost of calling the constructor of class Stamp. Adding to a linked list is simply attaching another element to its end. This is, of course, assuming that the implementation of the linked list knows which node is currently at the end of the list. If it doesn't know that, then, of course, traversal of the entire list is needed.
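For java.util.LinkedList specifically, the unlink itself is constant time once you are already positioned at the node, for example via a ListIterator. A small sketch (using plain strings instead of the question's Stamp class so it runs on its own):

import java.util.LinkedList;
import java.util.ListIterator;

class RemoveDemo {
    public static void main(String[] args) {
        LinkedList<String> stamps = new LinkedList<>();
        stamps.add("Brazil");   // add() appends at the end: O(1)
        stamps.add("Spain");
        stamps.add("India");

        // stamps.remove(1) would first walk to index 1: O(n) traversal + O(1) unlink.
        // With an iterator already positioned at the node, the unlink itself is O(1):
        ListIterator<String> it = stamps.listIterator();
        while (it.hasNext()) {
            if (it.next().equals("Spain")) {
                it.remove();    // O(1): relinks the neighbouring nodes, no further traversal
                break;
            }
        }
        System.out.println(stamps);  // [Brazil, India]
    }
}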

Best Possible algorithm to check if two linked lists are merging at any point? If so, where? [duplicate]

Possible Duplicate:
Linked list interview question
This is an interview question for which I don't have an answer.
Given two lists: you cannot change the lists and you don't know their lengths.
Give the best possible algorithm to:
Check whether the two lists merge at any point.
If they merge, find the point at which they merge.
If I allow you to change the lists, how would you modify your algorithm?
I'm assuming that we are talking about simple linked lists and we can safely create a hash table of the list element pointers.
Q1: Iterate to the end of both lists. If the respective last elements are the same node (reference equality), the lists merge at some point.
Complexity - O(N), space complexity - O(1)
Q2:
Put all elements of one list into a hash table
Iterate over 2nd list, probing the hash table for each element of the list. The first hit (if any) is the merge point, and we have the position in the 2nd list.
To get the position in the 1st list, iterate over the first list again looking for the element found in the previous step.
Time complexity - O(N). Space complexity - O(N)
Q3:
As Q1, but also reverse the direction of the list pointers.
Then iterate the reversed lists looking for the last common element - that is the merge point - and restore the lists to their original order.
Time complexity - O(N). Space complexity - O(1)
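A small Java sketch of Q1 and Q2 (the Node class is illustrative): both rely on reference equality of nodes; the HashSet behaves identity-based here because Node does not override equals/hashCode.

import java.util.HashSet;
import java.util.Set;

final class Node {
    int value;
    Node next;
    Node(int value) { this.value = value; }
}

class MergeDetection {
    // Q1: the lists merge iff they end at the very same tail node. O(N) time, O(1) space.
    static boolean merges(Node a, Node b) {
        while (a != null && a.next != null) a = a.next;
        while (b != null && b.next != null) b = b.next;
        return a != null && a == b;        // reference equality, not value equality
    }

    // Q2: store the first list's nodes, then the first node of the second list that is
    // already in the set is the merge point. O(N) time and space.
    static Node mergePoint(Node a, Node b) {
        Set<Node> seen = new HashSet<>();
        for (Node n = a; n != null; n = n.next) seen.add(n);
        for (Node n = b; n != null; n = n.next) if (seen.contains(n)) return n;
        return null;                       // the lists never merge
    }
}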
Number 1: Just iterate both lists and then check whether they end with the same element. That's O(n) and it can't be beaten (as it might well be only the last element that is common, and getting there always takes O(n)).
Walk the two lists in parallel, one element at a time, adding each element to a set of visited nodes (it can be a hash map or a simple set; you only need to check whether you have visited a node before). At each step, check whether you have already visited the current node (if yes, it is the merging point), and add it to the set if you are visiting it for the first time. Another version (as pointed out by #reinier) is to walk only the first list, store its nodes in the set, and then check only the second list against that set. The first approach is faster when your lists merge early, as you don't need to store all the nodes of the first list. The second is better in the worst case, where the lists don't merge at all, since it doesn't store the second list's nodes in the set.
see 1.
Instead of a set, you can try to mark each node, but if you cannot modify the structure, that's not much help. You could also try unlinking each visited node and linking it to some guard node (which you check against at each step while traversing). This saves the set's memory if the list is long enough.
Traverse both lists and keep a count of the NULL terminators encountered. If the lists merge at some point, there will be only one NULL (the shared tail); otherwise there will be two.

Why is inserting in the middle of a linked list O(1)?

According to the Wikipedia article on linked lists, inserting in the middle of a linked list is considered O(1). I would think it would be O(n). Wouldn't you need to locate the node which could be near the end of the list?
Does this analysis not account for the finding of the node operation (though it is required) and just the insertion itself?
EDIT:
Linked lists have several advantages over arrays. Insertion of an element at a specific point of a list is a constant-time operation, whereas insertion in an array may require moving half of the elements, or more.
The above statement is a little misleading to me. Correct me if I'm wrong, but I think the conclusion should be:
Arrays:
Finding the point of insertion/deletion O(1)
Performing the insertion/deletion O(n)
Linked Lists:
Finding the point of insertion/deletion O(n)
Performing the insertion/deletion O(1)
I think the only time you wouldn't have to find the position is if you kept some sort of pointer to it (as with the head and the tail in some cases). So we can't flatly say that linked lists always beat arrays for insert/delete options.
You are correct, the article considers "Indexing" as a separate operation. So insertion is itself O(1), but getting to that middle node is O(n).
The insertion itself is O(1). Node finding is O(n).
No, when you decide that you want to insert, it's assumed you are already in the middle of iterating through the list.
Operations on linked lists are often done in such a way that they aren't really treated as a generic "list", but as a collection of nodes: think of the node itself as the iterator for your main loop. So as you're poking through the list you notice as part of your business logic that a new node needs to be added (or an old one deleted) and you do so. You may add 50 nodes in a single iteration, and each of those additions is just O(1): the time to unlink two adjacent nodes and insert your new one.
For purposes of comparing with an array, which is what that chart shows, it's O(1) because you don't have to move all the items after the new node.
So yes, they are assuming that you already have the pointer to that node, or that getting the pointer is trivial. In other words, the problem is stated: "given node at X, what is the code to insert after this node?" You get to start at the insert point.
Insertion into a linked list is different than iterating across it. You aren't locating the item, you are resetting pointers to put the item in there. It doesn't matter if it is going to be inserted near the front end or near the end, the insertion still involves pointers being reassigned. It'll depend on how it was implemented, of course, but that is the strength of lists - you can insert easily. Accessing via index is where an array shines. For a list, however, it'll typically be O(n) to find the nth item. At least that's what I remember from school.
Inserting is O(1) once you know where you're going to put it.
Does this analysis not account for the finding of the node operation (though it is required) and just the insertion itself?
You got it. Insertion at a given point assumes that you already hold a pointer to the item that you want to insert after:
InsertItem(item * newItem, item * afterItem)
No, it does not account for searching. But if you already have hold of a pointer to an item in the middle of the list, inserting at that point is O(1).
If you have to search for it, you'd have to add on the time for searching, which should be O(n).
Because it does not involve any looping.
Inserting is like:
insert element
link to previous
link to next
done
this is constant time in any case.
Consequently, inserting n elements one after the other is O(n).
The most common cases are probably inserting at the beginning or at the end of the list (and the ends of the list might take no time to find).
Contrast that with inserting items at the beginning or the end of an array (which requires resizing the array if it's at the end, or resizing and moving all the elements if it's at the beginning).
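To make the pointer relinking concrete, here is a small Java sketch of "insert after a given node" in a doubly linked list (the Node class is illustrative; java.util.LinkedList does the equivalent relinking internally). No traversal happens: the caller is assumed to already hold a reference to the node to insert after.

final class Node<T> {
    T value;
    Node<T> prev, next;
    Node(T value) { this.value = value; }
}

class Insertion {
    static <T> Node<T> insertAfter(Node<T> after, T value) {
        Node<T> node = new Node<>(value);
        node.prev = after;
        node.next = after.next;
        if (after.next != null) after.next.prev = node;  // back-link from the old successor
        after.next = node;                               // forward-link from the given node
        return node;
    }
}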
The article is about comparing arrays with lists. Finding the insert position for both arrays and lists is O(N), so the article ignores it.
O(1) depends on the fact that you have an item where you will insert the new item (before or after). If you don't, it's O(n) because you must find that item.
I think it's just a case of what you choose to count for the O() notation. In the case of inserting the normal operation to count is copy operations. With an array, inserting in the middle involves copying everything above the location up in memory. With a linked list, this becomes setting two pointers. You need to find the location no matter what to insert.
If you have a reference to the node to insert after, the operation is O(1) for a linked list.
For an array it is still O(n), since you have to move all subsequent elements.
