Empty list when tail of singly linked list is null

It seems that people always say that if the head of a singly linked list is null, then the list is empty, but would checking the tail work as well? Supposing I know for sure that a list keeps a tail reference, can I check whether the tail is null to determine if the list is empty?

In the simplest implementation of a singly linked list, you keep a reference to a "node" struct, which contains both a value and a reference to the tail (the rest of the list).
If the list is empty, this reference is null: there is neither a value nor a tail.
If the list contains at least one element, the reference points to the head.
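As a rough sketch of that minimal representation (illustrative Python, names are not from the original answer):

```python
class Node:
    """One cell of a singly linked list: a value plus a reference to the rest."""
    def __init__(self, value, tail=None):
        self.value = value
        self.tail = tail  # next node, or None at the end of the list

# An empty list is simply a null reference: no value, no tail.
empty = None

# A one-element list: the reference points at the head node.
one = Node(42)

def is_empty(head):
    # Emptiness is just a null check on that single reference.
    return head is None
```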

For a singly linked list, I assume that you are keeping pointers or references to the 'head' node and the 'tail' node. If this is the case, the answer depends on how you handle setting these references.
Case 1 - The 'tail' is not set until there are at least two nodes.
0 Nodes in list: The 'head' and 'tail' are both null.
1 Node in list: The 'head' is set, but the 'tail' is null.
2+ Nodes in list: The 'head' and 'tail' are both set, to different nodes.
Case 2 - The 'tail' is always set if the head is set.
0 Nodes in list: The 'head' and 'tail' are both null.
1 Node in list: The 'head' and 'tail' are both set, to the same node.
2+ Nodes in list: The 'head' and 'tail' are both set, to different nodes.
Case 1 makes sense if you have data where you want a single node to mean something special, like a dead end or an outlier. Here your choice to check the tail's existence to determine whether the list is empty would fail.
Case 2 is simpler to code and requires fewer decisions. The head is always the first node, and the tail is always the last node, even if they are the same node. Here your choice to check for the tail's existence makes perfect sense.
If you want to check the tail, just make sure you check the specification of your library or build your own list to the spec of case 2.
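A minimal sketch of Case 2 (illustrative Python, not from any particular library): head and tail are always set together, so checking either one for null correctly detects an empty list.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.head = None  # first node
        self.tail = None  # last node; under Case 2, set whenever head is set

    def is_empty(self):
        # Under Case 2 semantics, checking the tail works as well as the head.
        return self.tail is None

    def append(self, value):
        node = Node(value)
        if self.head is None:
            # 0 -> 1 nodes: head and tail point to the same node
            self.head = self.tail = node
        else:
            # 1+ nodes: only the tail moves
            self.tail.next = node
            self.tail = node
```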

Related

Why does adding to a singly linked list require O(1) constant time?

While doing leetcode, it says adding to a specific node in a singly linked list requires O(1) time complexity:
Unlike an array, we don’t need to move all elements past the inserted element. Therefore, you can insert a new node into a linked list in O(1) time complexity, which is very efficient.
When deleting it's O(n) time, which makes sense because you need to traverse to the node before the one being deleted and change the pointers. Isn't it the same when adding, which means it should also be O(n) time complexity?
Specifically, when adding, you still need to traverse to the node before the position you want to insert at and change that node's .next to the new node.
Leetcode reference - adding: here
Leetcode reference - conclusion: chart
It is important to know what the given input is. For instance, for the insert operation on a singly linked list you have highlighted the case where the node is given after which a new node should be inserted. This is indeed a O(1) operation. This is so because all the nodes that precede the given node are not affected by this operation.
This is different in this case: if for the delete operation the node is given that must be deleted, it cannot be done in O(1) in a singly linked list, because the node that precedes it must be updated. So now that preceding node must be retrieved by iterating from the start of the list.
We can "turn the tables":
What would be the time complexity if we were given a node and need to insert a new node before it? Then it will not be O(1), but O(n), for the simple reason that a change must be made to the node that precedes the given node.
What would be the time complexity if for a delete action we were given the node that precedes it? Then it can be done in O(1).
Still, if the input for either an insert or delete action is not a node reference, but an index, or a node's value, then both have a time complexity of O(n): the list must be traversed to find the given index or the given value.
So the time complexity for an action on a singly linked list depends very much on what the input is.
No, you do not need to traverse the list to insert an element past an existing, given element. For this, you only need to update the next pointers of the element you already have and of the element you are inserting. It's not necessary to know the previous element.
Note that even insertion past the last element can be implemented in O(1) on a singly-linked list, if you keep a reference to the last element of the list.
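To make the distinction concrete, here is an illustrative sketch (not from the original answers): inserting after a node you already hold is O(1), and with a stored tail reference, appending at the end is too.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(node, value):
    """O(1) insertion: only the given node's pointer and the new node change.

    No traversal is needed, and the preceding node is irrelevant.
    """
    node.next = Node(value, node.next)
    return node.next

class TailList:
    """With a stored tail reference, appending at the end is also O(1)."""
    def __init__(self):
        self.head = self.tail = None

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node
```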

LRU cache with a singly linked list

Most LRU cache tutorials emphasize using both a doubly linked list and a dictionary in combination. The dictionary holds both the value and a reference to the corresponding node on the linked list.
When we perform a remove operation, we look up the node in the dictionary and then have to remove it from the linked list.
Now here's where it gets weird. Most tutorials argue that we need the preceding node in order to remove the current node from the linked list in O(1) time.
However, there is a known way to remove a node from a singly linked list in O(1) time: copy the next node's value into the current node, then delete the next node.
My question is: why do all these tutorials implement an LRU cache with a doubly linked list when we could save constant space per node by using a singly linked list?
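The copy-and-delete trick referred to above can be sketched as follows (illustrative Python; note that it cannot delete the tail node, since there is no next node to copy from):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def delete_in_place(node):
    """Remove `node` from its list in O(1) by overwriting it with its successor.

    Requires node.next to exist, so it does not work for the tail node.
    """
    nxt = node.next
    node.value = nxt.value  # the node takes on its successor's payload
    node.next = nxt.next    # and the successor is unlinked
```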
You are correct; a singly linked list can be used instead of the doubly linked list, as described below:
The standard way is a hashmap pointing into a doubly linked list to make delete easy. To do it with a singly linked list without using an O(n) search, have the hashmap point to the preceding node in the linked list (the predecessor of the one you care about, or null if the element is at the front).
Retrieve list node:
hashmap(key) ? hashmap(key)->next : list.head
Delete:
successornode = hashmap(key)->next->next
hashmap(successornode->key) = hashmap(key)
hashmap(key)->next = successornode
hashmap.delete(key)
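A rough Python rendering of the predecessor-map idea above (hypothetical names; entries for the head node map to None):

```python
class Node:
    def __init__(self, key, value, next=None):
        self.key = key
        self.value = value
        self.next = next

class SinglyLinkedMap:
    """Hashmap entries point at the *predecessor* of each key's node."""
    def __init__(self):
        self.head = None
        self.pred = {}  # key -> preceding Node, or None if the node is at the head

    def push_front(self, key, value):
        node = Node(key, value, self.head)
        if self.head is not None:
            self.pred[self.head.key] = node  # old head now has a predecessor
        self.head = node
        self.pred[key] = None

    def node_for(self, key):
        p = self.pred[key]
        return p.next if p is not None else self.head

    def delete(self, key):
        p = self.pred[key]
        node = p.next if p is not None else self.head
        succ = node.next
        if succ is not None:
            self.pred[succ.key] = p  # successor's predecessor changes
        if p is not None:
            p.next = succ            # unlink in O(1), no search needed
        else:
            self.head = succ
        del self.pred[key]
```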
Why is the doubly linked list so common in LRU solutions, then? It is easier to understand and use.
If optimization is an issue, then the trade-off of a slightly less simple singly-linked-list solution is definitely worth it.
There are a few complications with swapping the payload:
The payload could be large (such as buffers).
Part of the application code may still refer to the payload (have it pinned).
There may be locks or mutexes involved (which can be owned by the DLL/hash nodes and/or the payload).
In any case, modifying the DLL affects at most 2*2 pointers; swapping the payload needs (a memcpy for the swap, plus) walking the hash chain twice, which could require access to any node in the structure.

Cypher query - Property value change propagation

Hi,
In the above graph, we have a scenario where, when the value property of any one node is updated, the effect of that change must be propagated to the remaining nodes. How should this value-change event be propagated through a Cypher query?
Appreciate your support
Is the requirement that this particular property should always be the same for this group of nodes? If it must be the same, then I would recommend extracting it into a node instead, and creating relationships to that node from all nodes that should use it.
With the value in a single place, it will only require a single property change on that node and everything will be in the right state.
EDIT
Requirements are rather fuzzy, so my answer will be fuzzy as well.
If you're matching based upon relationship types, then you'll want some kind of multiplicity on the relationship and maybe specifying allowed types in the match. Such as:
MATCH (start:RNode)-[:R45|R34|R23|R12*]->(r:RNode)
WHERE start.ID = 123 (or however you're matching on your start node)
That will match on every single node from your startNode up the relationship chain until there are no more of the allowed relationships to continue traversing.
If you need a more complicated expansion, you may want to look at the APOC Procedure library's Path Expander.
After you find the right matching query, then it should just be a matter of doing the recalculation for all matched nodes.

implementing a 'greedy' match to find the extent of a subtree in Cypher

I have a graph that contains many 'subtrees' of items where an original item can be cloned which results in
(clone:Item)-[:clones]->(original:Item)
and a cloned item can also be cloned:
(newclone:Item)-[:clones]->(clone:Item)
the first item is created by a user:
(:User)-[:created]->(:Item)
and the clones are collected by a user:
(:User)-[:collected]->(:Item)
Given any item in the tree, I want to be able to match all the items in the tree. I'm using:
(1) match (known:Item)-[:clones*]-(others:Item)
My understanding is that this implements a 'greedy' match, traversing the tree in all directions, matching all items.
In general this works, however in some circumstances it doesn't seem to match all the items in the tree. For example, in the following query, this doesn't seem to be matching the whole subtree.
match p = (known:Item)-[r:clones*]-(others:Item) where not any(x in nodes(p) where (x)<-[:created]-(:User)) return p
Here I'm trying to find subtrees which are missing a 'created' item (these were deleted in the source SQL database).
What I'm finding is that it's giving me false positives because it's matching only part of a particular tree. For example, if there is a tree with 5 items structured properly as described above, it seems (in some cases) to match a subset of the tree (maybe 2 out of 5 items), and that subset doesn't contain the created item, so it is returned by the query when I didn't expect it to be.
Question
Is my logic correct or am I misunderstanding something? I'm suspecting that I'm misunderstanding paths, but I'm confused by the fact that the basic 'greedy' match works in most cases.
I think that my problem is that I've been confused because the query is finding multiple paths in the tree, some of which satisfy the test in the query and some don't. When viewed in the neo4j visualisation, the multiple paths are consolidated into what looks like the whole tree whereas the tabular results show that the match (1) above actually gives multiple paths.
I'm now thinking that I should be using collections rather than paths for this.
You are quite right that the query matches more paths than what is apparent in the browser visualization. The query is greedy in the sense that it has no upper bound for depth, but it also has no lower bound (well, strictly the lower bound is 1), which means it will emit a short path and a longer path that includes it if there are such. For data like
CREATE
(u)-[:CREATED]->(i)<-[:CLONES]-(c1)<-[:CLONES]-(c2)
the query will match paths
i<--c1
i<--c1<--c2
c1<--c2
c2-->c1
c2-->c1-->i
c1-->i
Of these paths, only the ones containing i will be filtered by the condition NOT x<-[:CREATED]-(), leaving paths
c1<--c2
c2-->c1
You need a further condition in your pattern before that filter: a condition such that each path that passes it can be expected to contain some node x where (x)<-[:CREATED]-(). That way the filter condition is unequivocal.
Judging from the example model/data in your question, you could try matching all directed variable-depth (clone)-[:CLONES]->(cloned) paths where the last cloned does not itself clone anything. That last cloned should be a created item, so each path found can now be expected to contain a (b)<-[:CREATED]-(). That is, if created items don't clone anything, something like this should work:
MATCH (a)-[:CLONES*]->(b)
WHERE NOT (b)-[:CLONES]->()
AND NOT (b)<-[:CREATED]-()
This relies on only matching paths where a particular node in each path can be expected to be created. An alternative is to work on each whole tree by itself: get a single pointer into the tree and test the entire tree for any created item node. The problem with your query could then be said to be that it treats c1<--c2 as if it were a full tree, and the solution is a pattern that only matches once per tree. You can then collect the nodes of the tree with the variable-depth match from there. You can get such a pointer in different ways; the easiest is perhaps to provide a discriminating property to find a specific node and collect all the items in that node's tree. Perhaps something like
MATCH (i {prop:"val"})-[:CLONES*]-(c)
WITH i, collect(distinct c) as cc
WHERE NOT (
  (i)<-[:CREATED]-() OR
  ANY (c IN cc WHERE (c)<-[:CREATED]-())
) // etc
This is not a generic query, however, since it only works on the one tree of the one node. If you have a property pattern that is unique per tree, you can use that. You can also model your data so that each tree has exactly one relationship to a containing 'forest'.
MATCH (forest)-[:TREE]->(tree)-->(item)-[:CLONES*]-(c) // etc
If your [:COLLECTED] or some other relationship, or a combination of relationships and properties make a unique pattern per tree, these can also be used.

What would be the benefit of having a root node in a linked list

In community college, I was told to implement a linked list with the starting node as an empty node and to append data nodes to that empty node, but at university, they don't use an empty node. I remember there were advantages to having an empty node but cannot recall them at this point.
What would be the benefit of having an empty node?
One that I can think of is that the empty starting node can store list properties such as the size of the linked list, and because it never gets deleted, we can always extract list properties from it.
This is an example of having an empty node: (also refer to empty node implementation)
(EmptyNode)->(1st Data)->(2nd Data)->null
And this is an example of not having an empty node which is more common.
(1st Data)->(2nd Data)->null
Thank you in advance.
The advantage of an empty node is that it makes it easier to represent an empty list that still otherwise exists.
While you can sometimes represent an empty list as simply null, the disadvantage is that it assumes lists are always represented as pointers. Another disadvantage is that you can't call any functions on null, which makes the interface awkward.
Imagine:
RootedListNode<char> list; // start empty: the root/empty node exists from the start
list.add('a');
list.add('b');

RootlessListNode<char> * list = null; // start empty
//list->add('a'); // can't call a method on null
list = new RootlessListNode<char>('a');
list->add('b');
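The sentinel idea can also be sketched in Python (illustrative names): the list object always holds a dummy node, so add never has to special-case an empty list, and list-wide properties such as size have a permanent home.

```python
class Node:
    def __init__(self, value=None, next=None):
        self.value = value
        self.next = next

class SentinelList:
    def __init__(self):
        self.sentinel = Node()  # empty node: always present, holds no data
        self.size = 0           # list-wide properties can live alongside it

    def add(self, value):
        # No null check needed: even an empty list has the sentinel to walk from.
        node = self.sentinel
        while node.next is not None:
            node = node.next
        node.next = Node(value)
        self.size += 1
```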

Resources