I'm having trouble understanding why we use node as the data type.*
*(I'm doing CS50, and in the problem sets it's given like this)
node *hashtable[50];
(here node refers to linked list node)
Since we are just storing a pointer to a linked list in it, wouldn't it be better to define it as just an array of char*?
char *hashtable[50];
Hashing functions have collisions. When a key hashes to an index that is already occupied, one strategy to resolve the collision is to keep a linked list at that index and simply append to it.
There are other collision resolution strategies, but the separate chaining strategy is probably the simplest.
In order to be able to treat the hash table items as linked lists, they need to have at least a next pointer in addition to their payload. Hence the items need to be some kind of struct node* rather than the payload type directly.
Most LRU cache tutorials emphasize using both a doubly linked list and a dictionary in combination. The dictionary holds both the value and a reference to the corresponding node on the linked list.
When we perform a remove operation, we look up the node in the dictionary and then have to remove it from the linked list.
Now here's where it gets weird. Most tutorials argue that we need the preceding node in order to remove the current node from the linked list in O(1) time.
However, there is a way to remove a node from a singly linked list in O(1) time: copy the next node's value into the current node, then delete the next node.
My question is: why do all these tutorials implement an LRU cache with a doubly linked list, when we could save constant space per node by using a singly linked list?
You are correct; a singly linked list can be used instead of the doubly linked list:
The standard way is a hashmap pointing into a doubly linked list to make delete easy. To do it with a singly linked list without using an O(n) search, have the hashmap point to the preceding node in the linked list (the predecessor of the one you care about, or null if the element is at the front).
Retrieve the list node for a key (the map stores the predecessor, so the node itself is one step further along):
hashmap(key) ? hashmap(key)->next : list.head
Delete a key (the node being removed is hashmap(key)->next; the successor's map entry must be repointed at the removed node's predecessor):
successornode = hashmap(key)->next->next
hashmap(successornode.key) = hashmap(key)
hashmap(key)->next = successornode
hashmap.delete(key)
Why is the doubly linked list so common in LRU solutions, then? It is easier to understand and use.
If optimization is a concern, then the trade-off of the slightly less simple singly linked solution is definitely worth it.
There are a few complications with swapping the payload:
The payload could be large (such as buffers)
Part of the application code may still refer to the payload (have it pinned)
There may be locks or mutexes involved (which can be owned by the DLL/hash nodes and/or the payload)
In any case, modifying the DLL touches at most 2×2 pointers, while swapping the payload needs a memcpy for the swap plus walking the hash chain (twice), which may require access to any node in the structure.
So, a table constructor has two components, list-like and record-like. Do list-like entries always take precedence over record-like ones? I mean, consider the following scenario:
a = {[1]=1, [2]=2, 3}
print(a[1]) -- 3
a = {1, [2]=2, [1]=3}
print(a[1]) -- 1
Is the index 1 always associated with the first list-like entry, 2 with the second, and so on? Or is there something else?
There are two types of tables in Lua, arrays and dictionaries; these are what you call "lists" and "records". An array contains values in order, which gives you a few advantages, such as faster iteration and inserting/removing values. A dictionary works like a giant lookup table: it has no order, but its advantage is that you can use any value as a key, and you are not as restricted.
When you construct a table, you have two syntaxes. You can separate the values with commas, e.g. {2,4,6,8}, thereby creating an array with keys 1 through n; or you can define key-value pairs, e.g. {[1]=2,[58]=4,[368]=6,[48983]=8}, creating a dictionary. You can often mix these syntaxes without running into any problems, but that is not the case in your scenario.
What you are doing is defining the same key twice during the table's initial construction. This is generally impractical, and as such it hasn't had any serious thought put into it during the language's development. What happens is essentially unspecified behaviour: the result may differ across platforms or implementations.
As such, you should not use this in any commercial projects, or any code you'll be sharing with other people. When in doubt, construct an empty table and define the key-value pairs afterward.
I see people usually use a temp node to manipulate a linked list: for example, create a new node whose pointer is stored in temp, point the previous node to it, then reuse temp for the next node.
Why not give each node a designated name (a variable that stores its address), so that we can access the node simply by dereferencing its name? This way, we could still insert a new node by pointing the previous node to it and pointing it to the next node.
I know there is a reason why linked lists are not made this way; I just can't figure out why.
The linked list data type is simply not made for having a name for each item, and in many cases you don't need to name everything. If you need such behavior, you can extend the type for your needs.
It all comes down to: Use the data structure which fits to your actual use case.
In Java, for example, there is a pre-defined type which does exactly what you have described:
LinkedHashMap<K, V>
In my code I want to take advantage of ETS's bag type, which can store multiple values for a single key. However, it would be very useful to know whether an insertion actually inserted a new value or not (i.e. whether the inserted key and value were or were not already present in the bag).
With the set type of ETS I could use ets:insert_new/2, but the semantics are different for bag (emphasis mine):
This function works exactly like insert/2, with the exception that instead of overwriting objects with the same key (in the case of set or ordered_set) or adding more objects with keys already existing in the table (in the case of bag and duplicate_bag), it simply returns false.
Is there a way to achieve such functionality with one call? I understand it can be achieved by a lookup followed by a conditional insert, but I am afraid that might hurt the performance of concurrent access.
Is there any way to add elements to a tuple after initialisation?
Like :
var tupleX = ("Hi", "Rachit")
Now I want to add an element, so that tupleX will have 3 or more elements.
Is it possible?
No. A tuple has a set number of elements. You may want to use an Array or some other class instead.
The difference between a tuple and a list (or other collections) is precisely the fixed amount of elements it contains.
From a type-system perspective, (1, 2) and (1, 2, 3) are of two distinct types, so of course you cannot alter the number of elements; you would be changing the type.
It's probably also important to note that, as the Swift documentation explains,
Tuples are useful for temporary groups of related values. They are not suited to the creation of complex data structures. If your data structure is likely to persist beyond a temporary scope, model it as a class or structure, rather than as a tuple.
So if you need to alter a tuple over time, you probably don't want a tuple at all, but rather a class, a struct, or even a dictionary.