A reference I'm using says the following:
For efficiency reasons, we choose the front of the queue to be at the head of the list, and the rear of the queue to be at the tail of the list. In this way, we remove from the head and insert at the tail.
I'm wondering: why would it be bad to insert at the head and remove at the tail? Is it because, in a singly linked list, removing the tail node is not so easy, since you have to access the node before it, and the only way of doing that in a singly linked list is to start from the beginning?
Yes.
Removing from the tail hurts performance, because you have to traverse the whole list from the start to reach the node before the tail. From the user's point of view, adding at the tail and removing from the head, or adding at the head and removing from the tail, give the same queue behaviour, since the head and tail are simply the two ends of the queue. So inserting at the head and removing at the tail has no advantage; it just makes removal O(n) instead of O(1).
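To make the trade-off concrete, here is a minimal Java sketch of a queue over a singly linked list with head and tail references (the class and method names are illustrative, not from the reference). With this layout both operations are O(1); removing at the tail instead would force an O(n) walk to find the tail's predecessor.

class LinkedQueue {
    static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    private Node head;   // front of the queue: remove here
    private Node tail;   // rear of the queue: insert here

    void enqueue(int data) {               // O(1): we hold a tail reference
        Node node = new Node(data);
        if (tail == null) {
            head = tail = node;            // queue was empty
        } else {
            tail.next = node;
            tail = node;
        }
    }

    Integer dequeue() {                    // O(1): head is directly reachable
        if (head == null) return null;     // queue is empty
        int data = head.data;
        head = head.next;
        if (head == null) tail = null;     // queue became empty
        return data;
    }
}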
Related
Implement an algorithm to find the kth to last element of a singly linked list.
Is reversing the linked list and then traversing it again to get the kth element a good solution to the above problem?
First of all, the list is singly linked, which is a good hint that you should not try to reverse it: you would need just as much storage again to build a reversed copy (or you would destroy the original ordering by reversing in place).
You can use a modified version of the turtle-hare algorithm:
Start a hare pointer at the start of the list.
Move it K elements ahead.
If you hit the end of the list before taking K steps, the list has fewer than K elements and you cannot find the Kth-to-last element.
Place a turtle pointer at the start of the list.
Now run the hare pointer to the end of the list; each time you move the hare pointer, move the turtle as well.
When the hare pointer reaches the end of the list, the turtle is on the Kth-to-last element.
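A minimal Java sketch of those steps, assuming a bare Node class with a next reference (the method name is illustrative):

Node kthToLast(Node head, int k) {
    Node hare = head;
    for (int i = 0; i < k; i++) {       // move the hare K elements ahead
        if (hare == null) return null;  // fewer than K elements in the list
        hare = hare.next;
    }
    Node turtle = head;
    while (hare != null) {              // run both pointers to the end together
        hare = hare.next;
        turtle = turtle.next;
    }
    return turtle;                      // the Kth-to-last element
}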
Let's set the context/limitations:
A linked-list consists of Node objects.
Nodes only have a reference to their next node.
A reference to the list is only a reference to the head Node object.
No preprocessing or indexing has been done on the linked-list other than construction (there are no other references to internal nodes, and no statistics such as the length have been collected).
The last node in the list has a null reference for its next node.
Below is some code for my proposed solution.
Node cursor = head;
Node middle = head;
// Advance cursor two nodes for each single advance of middle. The loop
// condition keeps two nodes of lookahead, so middle stops on the left
// middle when the list has an even number of nodes.
while (cursor != null && cursor.next != null && cursor.next.next != null) {
    cursor = cursor.next.next;
    middle = middle.next;
}
return middle;
Without changing the linked-list architecture (not switching to a doubly-linked list or storing a length variable), is there a more efficient way to find the middle element of singly-linked list?
Note: When this method finds the middle of an even number of nodes, it always finds the left middle. This is ideal as it gives you access to both, but if a more efficient method will always find the right middle, that's fine, too.
No, there is no more efficient way, given the information you have available to you.
Think about it in terms of transitions from one node to the next. You have to perform N transitions to work out the list length. Then you have to perform N/2 transitions to find the middle.
Whether you do this as a full scan followed by a half scan based on the discovered length, or whether you run the cursor (at twice speed) and middle (at normal speed) pointers in parallel, is not relevant here; the total number of transitions remains the same.
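For comparison, here is a sketch of the full-scan-then-half-scan variant mentioned above, reusing the question's Node class; it spends about N transitions measuring the list and about N/2 more reaching the left middle, the same total as the parallel-pointer version.

Node findMiddle(Node head) {
    int length = 0;
    for (Node n = head; n != null; n = n.next) {
        length++;                            // full scan: N transitions
    }
    Node middle = head;
    for (int i = 1; i < (length + 1) / 2; i++) {
        middle = middle.next;                // half scan: about N/2 transitions
    }
    return middle;                           // left middle for even lengths
}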
The only way to make this faster would be to introduce extra information to the data structure which you've discounted but, for the sake of completeness, I'll include it here. Examples would be:
making it a doubly-linked list with head and tail pointers, so you could find it in N transitions by "squeezing" in from both ends to the middle. That doubles the storage requirements for pointers however so may not be suitable.
having a skip list with each node pointing to both its "child" and its "grandchild". This would speed up the cursor transitions, resulting in only about N in total (that's N/2 for each of cursor and middle). Like the previous point, there's an extra pointer per node required for this.
maintaining the length of the list separately so you could find the middle in N/2 transitions.
same as the previous point but caching the middle node for added speed under certain circumstances.
That last point bears some extra examination. Like many optimisations, you can trade space for time and the caching shows one way to do it.
First, maintain the length of the list and a pointer to the middle node. The length is initially zero and the middle pointer is initially set to null.
If you're ever asked for the middle node when the length is zero, just return null. That makes sense because the list is empty.
Otherwise, if you're asked for the middle node and the pointer is null, it must be because you haven't cached the value yet.
In that case, calculate it using the length (N/2 transitions) and then store that pointer for later, before returning it.
As an aside, there's a special case here when adding to the end of the list, something that's common enough to warrant special code.
When adding to the end when the length is going from an even number to an odd number, just set middle to middle->next rather than setting it back to null.
This will save a recalculation and works because you (a) have the next pointers and (b) you can work out how the middle "index" (one-based and selecting the left of a pair as per your original question) changes given the length:
Length  Middle (one-based)
------  ------------------
   0    none
   1    1
   2    1
   3    2
   4    2
   5    3
   :    :
This caching means, provided the list doesn't change (or only changes at the end), the next time you need the middle element, it will be near instantaneous.
If you ever delete a node from the list (or insert somewhere other than the end), set the middle pointer back to null. It will then be recalculated (and re-cached) the next time it's needed.
So, for a minimal extra storage requirement, you can gain quite a bit of speed, especially in situations where the middle element is needed more often than the list is changed.
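As a sketch of that caching scheme in Java (the class and method names are mine, and only appending at the end is shown; any other mutation would just null out the cache as described):

class MiddleCachingList {
    static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    private Node head, tail;
    private Node middle;   // cached middle; null means "not cached yet"
    private int length;    // maintained on every mutation

    void addLast(int data) {
        Node node = new Node(data);
        if (head == null) { head = tail = node; }
        else { tail.next = node; tail = node; }
        length++;
        // Length going even -> odd: the (left) middle shifts one to the right.
        if (middle != null && length % 2 == 1) middle = middle.next;
    }

    // Call this on deletion, or insertion anywhere other than the end.
    void invalidateMiddle() { middle = null; }

    Node middle() {
        if (length == 0) return null;         // empty list
        if (middle == null) {                 // not cached: N/2 transitions
            middle = head;
            for (int i = 1; i < (length + 1) / 2; i++) middle = middle.next;
        }
        return middle;                        // near instantaneous when cached
    }
}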
Detecting cycles in a single linked list is a well known problem. I know that this question has been asked a zillion times all over the internet. The reason why I am asking it again is I thought of a solution which I did not encounter at other places. (I admit I haven't searched that deeply either).
My solution is:
Given a linked list and a pointer to some node, break the link between node and node->next(), remembering what node->next() was.
Then start at that remembered next node and traverse till either you hit an end (which means there was no loop) or till you reach node (which means there was a loop).
Is there anything wrong/good about above solution ?
Note: Do join the link back once you are done.
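For concreteness, here is a rough Java rendering of that proposal (Node and the method name are illustrative). As the answer below points out, it only works for cycles that pass back through the chosen node:

boolean loopsBackTo(Node node) {
    Node start = node.next;       // remember the successor
    node.next = null;             // break the link
    boolean cycle = false;
    // Caution: if there is a cycle that does not pass through node, this
    // walk reaches neither node nor null and never terminates.
    for (Node n = start; n != null; n = n.next) {
        if (n == node) {          // we came back around to node
            cycle = true;
            break;
        }
    }
    node.next = start;            // join the link back once you are done
    return cycle;
}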
That will work to detect complete cycles (i.e., cycles with a period of the whole list), e.g.:
A -> B -> C -> D -> A
But what if we have a cycle somewhere else in the list?
e.g.,
A -> B -> C -> D -> E -> C
I can't see that your algorithm will detect the cycle in this case.
Keep in mind that to detect the first case, we need not even break the link. We could just traverse the list and keep comparing the next link for each node with the head element to see if we'd started back at the start yet (or hit the end).
I guess the most trivial approach (not necessarily the best, but one that everybody should know how to implement in Java in a few lines of code) is to build a hash set of the nodes and keep adding them until you find one that you have already seen. It takes extra memory, though.
If you can mark nodes, start marking them until you find one you marked before (the hash set is essentially an external marker).
And check the usual graph theory books...
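A minimal Java sketch of that hash-set approach, assuming a bare Node class that does not override equals/hashCode (so the set compares nodes by identity):

import java.util.HashSet;
import java.util.Set;

static boolean hasCycle(Node head) {
    Set<Node> seen = new HashSet<>();   // the "external marker"
    for (Node n = head; n != null; n = n.next) {
        if (!seen.add(n)) {             // add() returns false if already present
            return true;                // revisited a node: cycle detected
        }
    }
    return false;                       // reached the end: no cycle
}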
You are not allowed to break a link, even if you join it back at the end. What if other programs read the list at the same time?
The algorithm must not damage the list while working on it.
Using a Gremlin script and Neo4j, I am trying to find all paths between two nodes, descending at most 10 levels down. But all I get as a response from the REST API is a
java.lang.ArrayIndexOutOfBoundsException: -1
Here is the script:
x = g.v(2)
y = g.v(6)
x.both.loop(10){!it.object.equals(y)}.paths
I looked through the documentation, but I couldn't find anything relevant for this use case.
In Gremlin, the argument to loop is the number of steps back that you wish to go, and the closure is evaluated to determine when to break out of the loop. In this case, because you have loop(10), it will go back far too many steps, to a point where the pipeline is not defined. With respect to the closure, you'll need to check not only whether the object is the one in question (in which case you should stop) but also whether or not you've done 10 loops already.
What you really want is something like this:
x.both.loop(1){!it.object.equals(y) && it.loops < 10}.paths
However, I should add that if there is a cycle in the graph, this will gladly traverse the cycle over and over and result in far too many paths. You can apply some clever filter and sideEffect to avoid visiting nodes multiple times.
For more information see the Loop Pattern Page on the Gremlin Wiki.
A lot of what I'm reading says that removing an internal element in a doubly linked list (DLL) is O(1); but why is this the case?
I understand why it's O(n) for SLLs: traverse the list in O(n), then remove in O(1). But don't you still need to traverse the list in a DLL to find the element?
For a doubly linked list, it's constant time to remove an element once you know where it is.
For a singly linked list, it's constant time to remove an element once you know where it and its predecessor are.
Since the link you point to shows singly linked list removal as O(n) and doubly linked removal as O(1), it's clear that those figures assume you already know where the element you want to remove is, and nothing else.
In that case, for a doubly linked list, you can just use the prev and next pointers to remove it, giving you O(1). Ignoring the edge cases where you're at the head or tail, that means something like:
corpse->prev->next = corpse->next;   // route the forward chain around corpse
corpse->next->prev = corpse->prev;   // route the backward chain around corpse
free(corpse);
However, in a singly linked list where you only know the node you want deleted, you can't use corpse->prev to get the one preceding it because there is no prev link.
You have to instead find the previous item by traversing the list from the head, looking for one which has a next of the element you want to remove. That will take O(n), after which it's once again O(1) for the actual removal, such as (again, ignoring the edge cases for simplicity):
lefty = head
while lefty->next != corpse:
    lefty = lefty->next          # find the node preceding corpse: O(n)
lefty->next = corpse->next       # unlink corpse: O(1)
free(corpse)
That's why the two complexities are different in that article.
As an aside, there are optimisations in a singly-linked list which can make the whole delete-by-value operation a single O(n) pass (the unlinking itself being effectively O(1) once you've found both the item you want to delete and the previous item). In code terms, that goes something like:
# Delete a node, returns true if found, otherwise false.
def deleteItem(key):
    # Special cases (empty list and deleting head).
    if head == null: return false
    if head.data == key:
        curr = head
        head = head.next
        free curr
        return true

    # Search non-head part of list (so prev always exists).
    prev = head
    curr = head.next
    while curr != null:
        if curr.data == key:
            # Found it so delete (using prev).
            prev.next = curr.next
            free curr
            return true
        # Advance to next item.
        prev = curr
        curr = curr.next

    # Not found, so fail.
    return false
As it's stated where your link points to:
The cost for changing an internal element is based on already having a pointer to it, if you need to find the element first, the cost for retrieving the element is also taken.
So, for both DLL and SLL linear search is O(n), and removal via pointer is O(1).
The complexity of removal in DLL is O(1).
It can also be O(1) in SLL if provided pointer to preceding element and not to the element itself.
This complexity is assuming you know where the element is.
I.e. the operation signature is akin to remove_element(list* l, link* e)
Searching for the element is O(n) in both cases.
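To illustrate that last point in Java: given a reference pred to the preceding element (an assumed variable name, not from the answer), unlinking its successor is O(1) even in a singly linked list.

Node victim = pred.next;     // the element to remove
pred.next = victim.next;     // bypass it in one constant-time step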
@Matuku: You are correct.
I humbly disagree with most answers here trying to justify how delete operation for DLL is O(1). It's not.
Let me explain.
Why are we considering the scenario that we 'would' have a pointer to the node being deleted? LinkedLists (singly or doubly linked) are traversed linearly; that's their definition. They have pointers only to the head/tail. How can we suddenly have a pointer to some node in the middle? That defeats the purpose of the data structure. Going by that assumption, if I have a DLL of, say, 1 million nodes, do I also have to maintain 1 million pointers (call them access pointers) pointing to each of those nodes so that I can delete them in O(1)? How would I store those 1 million access pointers? And how do I know which access pointer points to the correct data/node that I want to delete?
Can we have a real world example where we 'have' the pointer to the data that has to be deleted 100% of the time?
And if you know the exact location/pointer/reference of/to the node to be deleted, why even use a LinkedList? Just use an array! That's what arrays are for: direct access to what you want!
Assuming that you have direct access to any node you want in a DLL goes against the whole idea of a LinkedList as a conceptual data structure. So I agree with the OP; he's correct. I will stick with this: doubly linked lists cannot have O(1) for deleting any node. You still need to start from either the head or the tail, which brings it down to O(n).
" If " we have the pointer to the node to be deleted say X, then of course it's O(1) because we have pointers to the next and prev node we can delete X. But that big if is imaginary, not real.
We cannot play with the definition of the sacred Data Structure called LinkedLists for some weird assumptions we may have from time to time.