I am writing a maze-generating algorithm using recursive backtracking. I keep track of the points I visit in a stack stored as a matrix with two columns, one for the x-coordinate and one for the y-coordinate. The program works for small mazes, but for bigger mazes my calculator runs out of memory. Is there a less memory-intensive way to implement a stack? I'm thinking about using strings as a possible way to do it. I use a TI-84 CSE, by the way.
Your stack should probably be implemented using a list. I'll be using L1 for demonstration purposes. A stack is a last-in, first-out data structure.
List elements are accessible by using
L1(X)
where X is the index of the item you want. This means the first item pushed goes to L1(1) (the beginning of the list), later items follow it, and the item to pop is always the last one in the list. To find how many items are in a list (and therefore which index is the last), use
dim(L1)
This returns the number of items in the list. Instead of storing it in a variable, we can use it directly to always access the last item in the list, like this:
L1(dim(L1))->M
//this addresses the item of L1 at dim(L1), meaning the last item
Now M will have the value of the last item. This is the first-out part. Then, to destroy the last item (since you popped it off), do this:
dim(L1)-1->dim(L1)
So putting it all together, your "pop" code will be:
If dim(L1)>0
Then
// It checks if L1 has an element to pop off in the first place
L1(dim(L1))->M
dim(L1)-1->dim(L1)
End
Now, M will have the value of the last item, and the last item is destroyed. Now, onto the push code. To push, you must put your number into a new slot one higher than the old last number. Essentially, you must make a new last item to put it in. Luckily, this is very easy in TI-Basic. Your "push" code would be:
If dim(L1)<999
// It checks if L1 has fewer than the maximum number of elements,
// which is 999.
M->L1(dim(L1)+1)
And if you're gonna be storing X/Y coordinates with your stack, I'd recommend packing both into one number (this works as long as Y is a whole number from 0 to 99):
X + .01Y -> M
//X=3, Y = 15
// This would make M be 3.15
And to separate these back into two separate coordinates:
int(M)->X
// The integer value of M is 3, which is what X was earlier
100*fPart(M)->Y
// The fraction part of M was .15. Multiply that by 100 to get 15,
// which is what Y was earlier
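For readers who want to try the idea off the calculator, here is a minimal Python sketch of the same push/pop logic with packed coordinates. The names `pack`, `unpack`, `push`, `pop`, and `MAX_LEN` are made up for this illustration. One difference from the calculator is assumed here: Python uses binary floating point, so the fractional part is rounded when unpacking.

```python
MAX_LEN = 999  # assumed cap, mirroring the calculator's list size limit


def pack(x, y):
    """Store X in the integer part and Y (0..99) in the first two decimals."""
    return x + 0.01 * y


def unpack(m):
    """Split a packed value back into (x, y)."""
    x = int(m)
    # round() absorbs binary floating-point error such as 14.999999...
    y = round(100 * (m - x))
    return x, y


def push(stack, x, y):
    """Append a packed point; return False if the stack is full."""
    if len(stack) < MAX_LEN:
        stack.append(pack(x, y))
        return True
    return False


def pop(stack):
    """Remove and return the last point, or None if the stack is empty."""
    if stack:
        return unpack(stack.pop())
    return None


stack = []
push(stack, 3, 15)
push(stack, 7, 2)
print(pop(stack))  # last in, first out: the point pushed second
print(pop(stack))
```

Packing halves the number of stored values compared with a two-column matrix, which is the memory saving the answer is after.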
I'm reading how the probabilistic data structure count-min-sketch is used in finding the top k elements in a data stream. But I cannot seem to wrap my head around the step where we maintain a heap to get our final answer.
The problem:
We have a stream of items [B, C, A, B, C, A, C, A, A, ...]. We are asked to find out the top k most frequently appearing items.
My understanding is that this can be done using micro-batching, in which we accumulate N items before we start doing some real work.
The hashmap+heap approach is easy enough for me to understand. We traverse the micro-batch and build a frequency map (e.g. {B:34, D: 65, C: 9, A:84, ...}) by counting the elements. Then we maintain a min-heap of size k by traversing the frequency map, adding to and evicting from the heap with each [item]:[freq] as needed. Straightforward enough and nothing fancy.
Now with CMS+heap, instead of a hashmap, we have this probabilistic lossy 2D array, which we build by traversing the micro-batch. The question is: how do we maintain our min-heap of size k given this CMS?
The CMS only contains a bunch of numbers, not the original items. Unless I also keep a set of unique elements from the micro-batch, there is no way for me to know which items I need to build my heap against at the end. But if I do, doesn't that defeat the purpose of using CMS to save memory space?
I also considered building the heap in real-time when we traverse the list. With each item coming in, we can quickly update the CMS and get the cumulative frequency of that item at that point. But the fact that this frequency number is cumulative does not help me much. For example, with the example stream above, we would get [B:1, C:1, A:1, B:2, C:2, A:2, C:3, A:3, A:4, ...]. If we use the same logic to update our min-heap, we would get incorrect answers (with duplicates).
I'm definitely missing something here. Please help me understand.
Keep a hashmap of size k, where the key is the id and the value is Item(id, count).
Keep a min-heap of size k holding the same Items.
As events come in, update the count-min 2D array, take the minimum across its rows as the estimated count, update the Item in the hashmap, and bubble it up or down in the heap to restore the order. If the heap size exceeds k, poll the min Item out and remove its id from the hashmap as well.
The explanation below comes from a comment on this YouTube video:
We need to store the keys, but only K of them (or a bit more). Not all.
When each key arrives, we do the following:
Add it to the count-min sketch.
Get key count from the count-min sketch.
Check if the current key is in the heap. If it is present in the heap, we update its count value there. If it is not present, we check whether the heap is already full. If it is not full, we add this key to the heap. If the heap is full, we compare the value of the minimal heap element with the current key's count. At this point we may remove the minimal element and add the current key (if current key count > minimal element value).
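The steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a reference implementation: `CountMinSketch` and `top_k` are hypothetical names, the hash is built from `hashlib.sha256` purely for repeatability, and a plain dict scanned with `min()` stands in for the hashmap plus min-heap pair to keep the example short (a real heap would make the eviction step O(log k) instead of O(k)).

```python
import hashlib


class CountMinSketch:
    """A small count-min sketch: depth rows of width counters each."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One (row, column) pair per row, derived from a per-row hash.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, key):
        for row, col in self._cells(key):
            self.table[row][col] += 1

    def estimate(self, key):
        # The minimum over rows never underestimates the true count.
        return min(self.table[row][col] for row, col in self._cells(key))


def top_k(stream, k, sketch=None):
    """Track at most k candidate keys with their estimated counts."""
    if sketch is None:
        sketch = CountMinSketch()
    candidates = {}  # key -> latest estimated count (stand-in for the heap)
    for key in stream:
        sketch.add(key)
        count = sketch.estimate(key)
        if key in candidates or len(candidates) < k:
            candidates[key] = count  # update in place, or fill a free slot
        else:
            weakest = min(candidates, key=candidates.get)
            if count > candidates[weakest]:
                del candidates[weakest]  # evict the current minimum
                candidates[key] = count
    return candidates


print(top_k("BCABCACAA", 2))
```

Because each key appears at most once among the candidates and its count is overwritten rather than re-inserted, the duplicates worry from the question does not arise, and only about k keys are stored rather than every distinct item.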
In Swift 3 Collection indices have to conform to Comparable instead of Equatable.
The full story can be read in swift-evolution/0065.
Here's a relevant quote:
Usually an index can be represented with one or two Ints that
efficiently encode the path to the element from the root of a data
structure. Since one is free to choose the encoding of the “path”, we
think it is possible to choose it in such a way that indices are
cheaply comparable. That has been the case for all of the indices
required to implement the standard library, and a few others we
investigated while researching this change.
In my implementation of a custom linked list collection a node (pointing to a successor) is the opaque index type. However, given two instances, it is not possible to tell if one precedes another without risking traversal of a significant part of the chain.
I'm curious, how would you implement Comparable for a linked list index with O(1) complexity?
The only idea that I currently have is to somehow count steps while advancing the index, storing it within the index type as a property and then comparing those values.
Serious downside of this solution is that indices must be invalidated when mutating the collection. And while that seems reasonable for arrays, I do not want to break that huge benefit linked lists have - they do not invalidate indices of unchanged nodes.
EDIT:
It can be done at the cost of two additional integers as collection properties, assuming that the singly linked list implements front insert, front remove and back append. Any meddling around in the middle would break the O(1) complexity requirement anyway.
Here's my take on it.
a) I introduced one private integer type property to my custom Index type: depth.
b) I introduced two private integer type properties to the collection: startDepth and endDepth, which both default to zero for an empty list.
Each front insert decrements the startDepth.
Each front remove increments the startDepth.
Each back append increments the endDepth.
Thus all indices startIndex..<endIndex have a reflecting integer range startDepth..<endDepth.
c) Whenever the collection vends an index through either startIndex or endIndex, the index inherits the corresponding depth value from the collection. When the collection is asked to advance an index through index(after:), I simply initialize a new Index instance with an incremented depth value (depth + 1).
Conforming to Comparable boils down to comparing left-hand side depth value to the right-hand side one.
Note that because I expand the integer range from both sides as well, all the depth values for the middle indices remain unchanged (thus are not invalidated).
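As a language-neutral illustration of points a) to c), here is a small Python sketch rather than Swift. All names (`Node`, `Index`, `LinkedList`, `start_depth`, `end_depth`, and the methods) are invented for this example and only mirror the description above; the Swift `Collection` protocol itself is not modelled.

```python
from functools import total_ordering


class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


@total_ordering
class Index:
    """Opaque index: a node reference plus a depth used only for ordering."""

    def __init__(self, node, depth):
        self.node = node
        self.depth = depth

    def __eq__(self, other):
        return self.depth == other.depth

    def __lt__(self, other):
        return self.depth < other.depth  # O(1), no traversal


class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.start_depth = 0  # both depths are zero for an empty list
        self.end_depth = 0

    def push_front(self, value):
        self.head = Node(value, self.head)
        if self.tail is None:
            self.tail = self.head
        self.start_depth -= 1  # front insert decrements startDepth

    def pop_front(self):
        node = self.head
        self.head = node.next
        if self.head is None:
            self.tail = None
        self.start_depth += 1  # front remove increments startDepth
        return node.value

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        self.end_depth += 1  # back append increments endDepth

    def start_index(self):
        return Index(self.head, self.start_depth)

    def end_index(self):
        return Index(None, self.end_depth)

    def index_after(self, i):
        return Index(i.node.next, i.depth + 1)


lst = LinkedList()
for v in (1, 2, 3):
    lst.append(v)
j = lst.index_after(lst.start_index())  # index of the node holding 2
lst.push_front(0)                       # j is not invalidated by this
print(lst.start_index() < j < lst.end_index())
```

The index `j` obtained before the front insert still compares correctly afterwards, because the depth range only grows at its two ends.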
Conclusion:
I gained O(1) index comparisons at the cost of a minor increase in memory footprint and a few integer increments and decrements. I expect index lifetimes to be short and the number of collections to be relatively small.
If anyone has a better solution I'd gladly take a look at it!
I may have another solution. If you use floats instead of integers, you can get roughly O(1) insertion-in-the-middle performance by setting the sortIndex of the inserted node to a value between the predecessor's and the successor's sortIndex. This would require storing (and updating) the predecessor's sortIndex on your nodes (I imagine this should not be too hard, since it only changes on insertion or removal and it can always be propagated 'up').
In your index(after:) method you need to query the successor node, but since you use your node as the index, that is straightforward.
One caveat is the finite precision of floating-point numbers: if, on insertion, the distance between the two sort indices is too small, you need to reindex at least part of the list. Since you said you only expect small scale, I would just go through the whole list and use the position for that.
This approach has all the benefits of your own, with the added benefit of good performance on insertion in the middle.
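The precision caveat can be made concrete with a tiny Python sketch. `key_between` is a hypothetical helper, and Python floats (IEEE 754 doubles) are assumed; it returns `None` when no representable value lies strictly between the two neighbours, which is the signal to reindex.

```python
def key_between(lo, hi):
    """Return a sort key strictly between lo and hi, or None if the
    gap is too small for a double to represent a midpoint."""
    mid = (lo + hi) / 2
    if lo < mid < hi:
        return mid
    return None  # precision exhausted: the caller should reindex


def inserts_until_reindex(lo, hi):
    """Count how many times a node can be inserted directly after lo
    before the midpoint trick stops working."""
    count = 0
    while True:
        mid = key_between(lo, hi)
        if mid is None:
            return count
        hi = mid  # the new node becomes lo's successor
        count += 1


print(key_between(1.0, 2.0))
print(inserts_until_reindex(1.0, 2.0))
```

Repeatedly inserting at the same spot halves the gap each time, so the number of inserts available before a reindex is bounded by the mantissa width, not by the list length.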
Let's set the context/limitations:
A linked-list consists of Node objects.
Nodes only have a reference to their next node.
A reference to the list is only a reference to the head Node object.
No preprocessing or indexing has been done on the linked-list other than construction (there are no other references to internal nodes or statistics collected, i.e. length).
The last node in the list has a null reference for its next node.
Below is some code for my proposed solution.
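Node cursor = head;
Node middle = head;
while (cursor != null) {
    cursor = cursor.next;
    if (cursor != null) {
        cursor = cursor.next;
        // Advance middle only if cursor has not run off the end;
        // otherwise an even-length list would yield the right middle.
        if (cursor != null) {
            middle = middle.next;
        }
    }
}
return middle;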
Without changing the linked-list architecture (not switching to a doubly-linked list or storing a length variable), is there a more efficient way to find the middle element of singly-linked list?
Note: When this method finds the middle of an even number of nodes, it always finds the left middle. This is ideal as it gives you access to both, but if a more efficient method will always find the right middle, that's fine, too.
No, there is no more efficient way, given the information you have available to you.
Think about it in terms of transitions from one node to the next. You have to perform N transitions to work out the list length. Then you have to perform N/2 transitions to find the middle.
Whether you do this as a full scan followed by a half scan based on the discovered length, or whether you run the cursor (at twice speed) and middle (at normal speed) pointers in parallel is not relevant here, the total number of transitions remains the same.
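To make the transition count tangible, here is a small Python sketch that counts every `.next` dereference for both strategies. The function names and the step counters are additions for this illustration only, and the two-pointer loop shown is a left-middle variant (the middle pointer only advances while the cursor is still on the list).

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def build(n):
    """Build the list 1 -> 2 -> ... -> n and return its head."""
    head = None
    for v in range(n, 0, -1):
        head = Node(v, head)
    return head


def middle_two_pass(head):
    """Full scan for the length, then a half scan to the left middle."""
    steps = 0
    length = 0
    node = head
    while node is not None:
        node = node.next
        steps += 1
        length += 1
    node = head
    for _ in range((length - 1) // 2):
        node = node.next
        steps += 1
    return node, steps


def middle_two_pointer(head):
    """Cursor moves two nodes per round, middle moves one."""
    steps = 0
    cursor = middle = head
    while cursor is not None:
        cursor = cursor.next
        steps += 1
        if cursor is not None:
            cursor = cursor.next
            steps += 1
            if cursor is not None:
                middle = middle.next
                steps += 1
    return middle, steps


for n in (1, 2, 5, 10):
    a, steps_a = middle_two_pass(build(n))
    b, steps_b = middle_two_pointer(build(n))
    print(n, a.value, b.value, steps_a, steps_b)
```

Both functions land on the same node with the same number of pointer hops, which is the point being made: the two-pointer form is tidier, not cheaper.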
The only way to make this faster would be to introduce extra information to the data structure which you've discounted but, for the sake of completeness, I'll include it here. Examples would be:
making it a doubly-linked list with head and tail pointers, so you could find it in N transitions by "squeezing" in from both ends to the middle. That doubles the storage requirements for pointers however so may not be suitable.
having a skip list with each node pointing to both its "child" and its "grandchild". This would speed up the cursor transitions, resulting in only about N in total (that's N/2 for each of cursor and middle). Like the previous point, there's an extra pointer per node required for this.
maintaining the length of the list separately so you could find the middle in N/2 transitions.
same as the previous point but caching the middle node for added speed under certain circumstances.
That last point bears some extra examination. Like many optimisations, you can trade space for time and the caching shows one way to do it.
First, maintain the length of the list and a pointer to the middle node. The length is initially zero and the middle pointer is initially set to null.
If you're ever asked for the middle node when the length is zero, just return null. That makes sense because the list is empty.
Otherwise, if you're asked for the middle node and the pointer is null, it must be because you haven't cached the value yet.
In that case, calculate it using the length (N/2 transitions) and then store that pointer for later, before returning it.
As an aside, there's a special case here when adding to the end of the list, something that's common enough to warrant special code.
When adding to the end while the middle is already cached and the length is going from an even number to an odd number, just set middle to middle->next rather than setting it back to null.
This will save a recalculation and works because you (a) have the next pointers and (b) you can work out how the middle "index" (one-based and selecting the left of a pair as per your original question) changes given the length:
Length  Middle (one-based)
------  ------------------
0       none
1       1
2       1
3       2
4       2
5       3
:       :
This caching means, provided the list doesn't change (or only changes at the end), the next time you need the middle element, it will be near instantaneous.
If you ever delete a node from the list (or insert somewhere other than the end), set the middle pointer back to null. It will then be recalculated (and re-cached) the next time it's needed.
So, for a minimal extra storage requirement, you can gain quite a bit of speed, especially in situations where the middle element is needed more often than the list is changed.
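A minimal Python sketch of the caching scheme described above might look like this. The class and method names are invented for the example, and a tail pointer is assumed so that appends are O(1); neither is part of the original question's constraints.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class CachedMiddleList:
    def __init__(self):
        self.head = None
        self.tail = None      # assumption: kept so append is O(1)
        self.length = 0
        self._middle = None   # cached left-middle node, None if unknown

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        self.length += 1
        # Even -> odd length: the left middle moves one node forward.
        if self._middle is not None and self.length % 2 == 1:
            self._middle = self._middle.next

    def pop_front(self):
        node = self.head
        self.head = node.next
        if self.head is None:
            self.tail = None
        self.length -= 1
        self._middle = None   # any other change invalidates the cache
        return node.value

    def middle(self):
        if self.length == 0:
            return None
        if self._middle is None:
            # Walk (length - 1) // 2 links to reach the left middle.
            node = self.head
            for _ in range((self.length - 1) // 2):
                node = node.next
            self._middle = node
        return self._middle


lst = CachedMiddleList()
for v in range(1, 6):
    lst.append(v)
    print(lst.length, lst.middle().value)
```

The loop prints the same length/middle pairs as the table above; after the first call to `middle()`, each further append keeps the cache valid without rescanning.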
Sorry, I made a mistake in my earlier question. Because of that I didn't get the answer I wanted.
The teacher told us that every time you divide something by 2, the run-time is likely to be log n. For instance, if we divide an array into two, each time we traverse one of the array, the run-time would be log n. However, we may run into a case with LinkedList where we may be easily misled. For instance, we may have an algorithm to set the nth element of the list to something else by starting from either the head or the tail in order to have a run-time of less than n. Logically, we may think that the run time would be log n, but it's not. Why is that? And how do you determine that?
Do we need to absolutely have splitting to get a run-time of log n? I don't think it makes any logical sense to say the run-time of n when the maximum run-time of the loop is n/2.
I think some concepts need a bit of refining here: time complexity describes how an algorithm's running time grows as the input grows. It is a property of the algorithm, not of the particular size of the data structure you happen to be operating on.
The teacher told us that every time you divide something by 2, the run-time is likely to be log n. For instance, if we divide an array into two, each time we traverse one of the array, the run-time would be log n.
Now, traversing an array, like
for (int i = 0; i < array.size; i++) {
    variable = array[i];
}
runs in O(n): the time needed to perform such an operation varies linearly with the size of the array. You will have O(log n) for operations like a binary search on an array, but you cannot generalize this concept to all array operations, and especially not to those that need to iterate over the array.
Now, this sentence
For instance, we may have an algorithm to set the nth element of the list to something else by starting from either the head or the tail in order to have a run-time of less than n.
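leads me to believe that you think that the n as used in big O and what you call the "nth element" are directly related. They aren't. On a linked list your only option to go to element n is to go to the start of the list and follow the links down to the element you're looking for (or, in the case of a doubly linked list, go to the start or end depending on the position of the element you're looking for), so this operation has a time complexity of O(n), i.e. linearly related to the length of the collection. Starting from the nearer end caps the walk at n/2 steps, but big O drops constant factors: n/2 still doubles when the list doubles, so it is O(n), not O(log n). You get O(log n) only when every step discards half of the remaining work, as binary search does.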
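A quick way to see the difference is to count steps while doubling the input size. The Python sketch below uses made-up helper names and counts worst-case steps only: link hops for a doubly linked list accessed from its nearer end, and probes for a binary search that keeps going right.

```python
def worst_list_steps(n):
    """Worst-case link hops to reach an element of a doubly linked
    list of n nodes when starting from whichever end is nearer."""
    return max(min(i, n - 1 - i) for i in range(n))


def binary_search_steps(n):
    """Probes made by a binary search over n sorted items when the
    target is larger than every item (a worst case)."""
    steps = 0
    lo, hi = 0, n - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        lo = mid + 1  # discard the lower half, including mid
    return steps


for n in (1000, 2000, 4000):
    print(n, worst_list_steps(n), binary_search_steps(n))
```

Each doubling of n roughly doubles the list figure (about n/2 hops) but adds only a single probe to the binary search, which is what separates O(n) from O(log n).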
I am trying to implement a 1D DCT type II filter in Labview. The formula for this can be seen here
As you can see xk = the sum of a sum function involving an iteration of n.
As far as I know, the nested for loop should handle the function, with the shift registers keeping a running total of the output. My problem lies with the output to the matrix xk. There is either only one output to the matrix, or each output overwrites the last one because there is no indexing. Trying to put the matrix inside the for loop results in an error between the shift register and the matrix:
You have connected two terminals of different types.
The source is a double and the sink is a 1D array of double
Anyone know how I can index the output to the array?
I believe this should work. Please check the math.
The inner for loop will run either 8 times or once per element in the array xn, whichever is smaller; LabVIEW uses the smaller number to determine the iteration count. So if xn is empty, the for loop won't run at all. If xn has 20 elements, the for loop will run 8 times.
Regardless, the outer loop will always run 8 times, so xk will have 8 elements total.
Also, shift registers that do not initialize a value at the beginning of a for or while loop can cause problems, unless you mean to do that. The value stored in the shift register after running the first time could be a problem the second time you go to run it.
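Since the block diagram itself is not reproduced here, the loop structure can be sketched in text form with Python. This assumes the standard unnormalized DCT-II, X_k = sum over n of x_n * cos(pi/N * (n + 1/2) * k), which may differ by a scale factor from the formula linked in the question; `dct2` is a hypothetical name. The inner accumulator plays the role of the shift register, reset for every k, and appending once per outer iteration plays the role of an auto-indexed output tunnel.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a sequence, written as two nested loops."""
    n_samples = len(x)
    out = []
    for k in range(n_samples):
        total = 0.0  # running sum, reinitialized for each k
        for n in range(n_samples):
            total += x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k)
        out.append(total)  # one output element per outer iteration
    return out


print(dct2([1.0] * 8))
```

For a constant input, only the k = 0 term is non-zero (up to rounding), which makes a convenient sanity check for the wiring.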