I can write code using the Tortoise and Hare algorithm, and it works. But it doesn't make intuitive sense to me. It feels like it could break down for the right cycle length and hare step size.
Are there alternative explanations to help me understand?
I just saw someone's solution that used this today. I'll explain what I understood from it.
Suppose both the turtle and the hare start from position 1 in a circular list of size 5. The turtle advances 1 step and the hare advances 2 steps at a time.
1st iteration:
turtle = 2,
hare = 3,
hare is behind by 4 steps (4,5,1,2)
2nd iteration:
turtle = 3,
hare = 5,
hare is behind by 3 steps (1,2,3)
3rd iteration:
turtle = 4,
hare = 2,
hare is behind by 2 steps (3,4)
4th iteration:
turtle = 5,
hare = 4,
hare is behind by 1 step (5)
5th iteration:
turtle = 1,
hare = 1,
hare caught up.
The gap decreases by 1 every iteration. So eventually the hare catches up to the turtle.
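The walk-through above can be reproduced with a few lines of Python. This is only a sketch: the function name `trace_chase` and the 1-based positions on a cycle of length 5 are choices made for the illustration, and the "gap" is the number of steps the hare still needs to reach the turtle.

```python
def trace_chase(cycle_length=5, start=1):
    """Return (turtle, hare, gap) after each iteration until they meet."""
    turtle = hare = start
    rows = []
    while True:
        # Positions are 1-based, so shift to 0-based for the modulo step.
        turtle = (turtle - 1 + 1) % cycle_length + 1
        hare = (hare - 1 + 2) % cycle_length + 1
        # Steps the hare must still travel forward to reach the turtle.
        gap = (turtle - hare) % cycle_length
        rows.append((turtle, hare, gap))
        if gap == 0:
            return rows


for turtle, hare, gap in trace_chase():
    print(f"turtle = {turtle}, hare = {hare}, hare is behind by {gap}")
```

The gap column counts down 4, 3, 2, 1, 0, matching the five iterations listed above.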
If there is a cycle, you can think of it as an infinitely repeating number line.
At the start of the Tortoise and Hare algorithm, we place the hare one node ahead of the tortoise. But once they enter the cycle, you can think of the hare being positioned behind the tortoise.
The typical steps for the tortoise and hare are 1 and 2, respectively. And no matter how far behind the hare is, it reduces down to two options - 1 step back, or 2 steps back.
If the hare is 1 step back, it will meet with the tortoise on the next step. Tortoise moves 1, Hare moves 2.
If the hare is 2 steps back, the tortoise will move forward 1, and the hare will move forward 2. Now, the hare is only one step back, and they'll meet on the next step.
To me this makes sense, and it appears to generalize to any size step the hare takes. The key insight for me was changing my mental approach from a cycle to an infinite, repeating number line.
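For reference, here is a minimal sketch of the usual linked-list version with steps of 1 and 2, with the hare starting one node ahead as described above. The `Node` class and the name `has_cycle` are illustrative, not taken from any particular solution.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def has_cycle(head):
    """Return True if following .next from head ever loops back."""
    if head is None:
        return False
    tortoise = head
    hare = head.next  # the hare starts one node ahead
    while hare is not None and hare.next is not None:
        if hare is tortoise:
            return True  # the gap closed to zero inside the cycle
        tortoise = tortoise.next   # 1 step
        hare = hare.next.next      # 2 steps
    return False  # the hare ran off the end, so there is no cycle
```

If the list has no cycle, the hare reaches the end first and the loop stops. If it does, the hare ends up "behind" the tortoise inside the cycle and the gap shrinks by one node per iteration until the two references are the same node.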
I am having an issue with the results I get from value iteration: the numbers increase to infinity, so I assume there is a problem somewhere in my logic.
Initially I have a 10x10 grid, some tiles with a reward of +10, some with a reward of -100, and some with a reward of 0. There are no terminal states. The agent can perform 4 non-deterministic actions: move up, down, left, and right. It has an 80% chance of moving in the chosen direction, and a 20% chance of moving perpendicularly.
My process is to loop over the following:
For every tile, calculate the value of the best action from that tile
For example to calculate the value of going north from a given tile:
self.northVal = 0
self.northVal += (0.1 * grid[x-1][y])
self.northVal += (0.1 * grid[x+1][y])
self.northVal += (0.8 * grid[x][y+1])
For every tile, update its value to be: the initial reward + ( 0.5 * the value of the best move for that tile )
Check to see if the updated grid has changed since the last loop, and if not, stop the loop as the numbers have converged.
I would appreciate any guidance!
What you're trying to do here is not Value Iteration: value iteration works with a state value function, where you store a value for each state. This means, in value iteration, you don't keep an estimate of each (state,action) pair.
Please refer to the 2nd edition of the Sutton and Barto book (Section 4.4) for an explanation, but here's the algorithm for quick reference. Note the initialization step: you only need a vector storing the value for each state.
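As a rough illustration of storing one value per state, here is a minimal Python sketch for the grid described in the question. It is not the book's pseudocode. The function name `value_iteration`, the tolerance `tol`, and the rule that bumping into the edge leaves the agent in place are assumptions; the 0.8/0.1/0.1 probabilities and the 0.5 discount come from the question. The rewards stay in their own grid and are never overwritten by the values.

```python
def value_iteration(rewards, gamma=0.5, tol=1e-6):
    """Return a grid of state values for a grid of per-tile rewards."""
    rows, cols = len(rewards), len(rewards[0])
    values = [[0.0] * cols for _ in range(rows)]
    # For each action: the intended move, then the two perpendicular slips.
    moves = {
        "up": ((-1, 0), (0, -1), (0, 1)),
        "down": ((1, 0), (0, -1), (0, 1)),
        "left": ((0, -1), (-1, 0), (1, 0)),
        "right": ((0, 1), (-1, 0), (1, 0)),
    }

    def value_at(r, c, dr, dc):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            return values[nr][nc]
        return values[r][c]  # assumption: hitting the edge means staying put

    while True:
        new_values = [[0.0] * cols for _ in range(rows)]
        delta = 0.0
        for r in range(rows):
            for c in range(cols):
                best = max(
                    0.8 * value_at(r, c, *main)
                    + 0.1 * value_at(r, c, *slip_a)
                    + 0.1 * value_at(r, c, *slip_b)
                    for main, slip_a, slip_b in moves.values()
                )
                new_values[r][c] = rewards[r][c] + gamma * best
                delta = max(delta, abs(new_values[r][c] - values[r][c]))
        values = new_values
        if delta < tol:  # stop once no value changed by more than tol
            return values
```

With a discount below 1 and fixed rewards, the values stay bounded by the largest reward divided by (1 - gamma), so runaway growth usually points to the reward grid being mixed up with the value grid.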
So, I'm working on a programming puzzle and need some help to see what I'm missing.
Here is the prompt:
What are the contents of the stack after running the following algorithm?
Given an array and an empty stack:
1. Push the values of the array onto the stack until a -1 value is found. Do not push the -1.
2. Pop the values in the stack until an even value is found or the stack is empty. The even value shouldn't be popped.
3. Repeat steps 1 and 2 until the stack is full or the array is read completely.
Type the contents of the stack from top to bottom, separated by a space, in the Answer box. If the stack is empty, enter a 0.
For this problem, use an array = {10, 1, 3, 2, 3, -1, 4, 5, 7}, and a stack size of 6.
I honestly feel like this should be pretty easy but for some reason I'm missing something and continue to get it wrong.
Here is my thought process: (stack from bottom to top)
10 1 3 2 3
10 1 3 2
10 1 3 2 10 1
10 1 3 2 10
10 1 3 2 10 10
Final from top to bottom: 10 10 2 3 1 10
Is there something about stacks that I'm missing?
Any ideas?
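One way to check a hand trace is to simulate the steps. The sketch below is one reading of the prompt, not an official solution: it assumes that step 1 resumes from the element after the -1 rather than restarting the array, and that everything stops the moment the stack holds `capacity` items. The names `run_stack_puzzle` and `capacity` are made up for the example.

```python
def run_stack_puzzle(array, capacity):
    """Return the stack (bottom to top) after running the puzzle's steps."""
    stack = []
    i = 0  # index of the next unread array element
    while i < len(array) and len(stack) < capacity:
        # Step 1: push until a -1 is read; the -1 itself is not pushed.
        while i < len(array) and len(stack) < capacity:
            value = array[i]
            i += 1
            if value == -1:
                break
            stack.append(value)
        if len(stack) == capacity:
            break  # assumption: a full stack ends the algorithm immediately
        # Step 2: pop until the top is even or the stack is empty.
        while stack and stack[-1] % 2 != 0:
            stack.pop()
    return stack


stack = run_stack_puzzle([10, 1, 3, 2, 3, -1, 4, 5, 7], 6)
# Print from top to bottom, or 0 if the stack is empty.
print(" ".join(str(v) for v in reversed(stack)) or "0")
```

Under these assumptions the second pass pushes 4 and 5 on top of 10 1 3 2, at which point the stack is full and the 7 is never read.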
I have created a nice pattern using the following one line of code:
repeat 36 [repeat 10[fd 10 rt 36] rt 10]
Now I want this to appear as if it is rotating. I have tried to clear the screen, rotate the turtle by a specific angle, and then draw the pattern again. But there is something completely wrong in my logic. Can anybody help?
In order to accomplish animation, you need an interpreter which supports it. The interpreter must be one which renders the entire output before displaying it (doesn't show the turtle movement during drawing), and it also must support the wait command (or something similar to it). An example of an interpreter that meets these qualifications would be the one at www.logointerpreter.com. Here's an example which spins your wheel a full rotation and works with that interpreter:
ht
repeat 360
[
clean
repeat 36 [repeat 10[fd 10 rt 36] rt 10]
wait 10
rt 1
]
As you can see, the outer loop draws 360 separate frames. After drawing each frame, it waits 10 milliseconds, so you can see the frame. It then rotates the turtle one degree before clearing the screen and beginning the drawing of the next frame. If you need a little more control, you could also store the starting angle for each frame in a variable, like this:
ht
make "start 0
repeat 360
[
cs
rt :start
repeat 36 [repeat 10[fd 10 rt 36] rt 10]
wait 10
make "start (:start + 1)
]
I am creating a football simulation game and I would like to make a 2D view of the match. My match is 90 minutes long and there are 22 players on the field. How could I save the players' movements/paths so that they don't take up lots of space? I know I could save it as something like
Minute: min,
Player: id,
X: xCoord,
Y: yCoord
and then just move objects with jQuery from point A to point B, but I am sure it isn't the best solution, because it would require lots of space and database entries.
I am using MongoDB, but all suggestions are welcome.
How do the players move? Do they move a little in each step of the main loop? Or do they go in long straight lines and then make sudden turns and go in other straight lines? In the first case you would probably need to save each millisecond or so (each step of the main loop), or you could save their positions every ten steps or every second, etc. The replay could then interpolate between the saved points (though the replay would look "gross" like that, it could save a lot of space in your db). In the second case (straight lines), you could just save the points where the players turn in another direction. In this case you'll save their position, angle and speed (along with time, obviously).
The first table could be (the intervals could be more than 1ms, depending on the power of the machine):
PLAYER  TIME(ms)  X  Y
1       0         0  0
1       1         0  2
1       2         0  4
1       3         0  7
1       4         0  10
1       5         4  13
While the second table would be:
PLAYER  TIME(ms)  X  Y   Dir  Speed
1       0         0  0   90   2
1       2         0  4   90   3
1       4         0  10  60   5
or something like that. Dir is the direction in degrees. Hope that helps!
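To show how a replay could fill in the points that were not saved, here is a small Python sketch that linearly interpolates between stored samples. The function name `position_at` and the `(time_ms, x, y)` tuple layout are assumptions for the example; the sample values are player 1's rows at 0, 2 and 4 ms from the tables above.

```python
from bisect import bisect_right


def position_at(samples, t):
    """Interpolate (x, y) at time t from samples sorted by time.

    samples is a list of (time_ms, x, y) tuples for one player.
    """
    times = [s[0] for s in samples]
    i = bisect_right(times, t)
    if i == 0:
        return samples[0][1:]   # before the first sample
    if i == len(samples):
        return samples[-1][1:]  # at or after the last sample
    t0, x0, y0 = samples[i - 1]
    t1, x1, y1 = samples[i]
    f = (t - t0) / (t1 - t0)    # fraction of the way from sample i-1 to i
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))


# Only every other row of the first table is stored.
sparse = [(0, 0, 0), (2, 0, 4), (4, 0, 10)]
print(position_at(sparse, 1))
print(position_at(sparse, 3))
```

For these particular rows the interpolated points at 1 ms and 3 ms land on the values in the dense table, so nothing is lost by skipping them. Where a player turns between samples the interpolated path will cut the corner, which is the trade-off against storage.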
I'm currently writing in C# what could basically be called my own interpretation of the NES hardware for an old-school looking game that I'm developing. I've fired up FCE and have been observing how the NES displayed and rendered graphics.
In a nutshell, the NES could hold two bitmaps' worth of graphical information, each with dimensions of 128x128. These are called the PPU tables. One was for BG tiles and the other was for sprites. The data had to be in this memory for it to be drawn on-screen. Now, if a game had more graphical data than these two banks could hold, it could write portions of the new information to these banks (overwriting what was there) at the end of each frame, and use it from the next frame onward.
So, in old games how did the programmers 'bank switch'? I mean, within the level design, how did they know which graphic set to load? I've noticed that Mega Man 2 bankswitches when the screen programmatically scrolls from one portion of the stage to the next. But how did they store this information in the level: which sprites to copy over into the PPU tables, and where to write them?
Another example would be hitting pause in MM2. BG tiles get over-written during pause, and then get restored when the player unpauses. How did they remember which tiles they replaced and how to restore them?
If I was lazy, I could just make one huge static bitmap and just grab values that way. But I'm forcing myself to limit these values to create a more authentic experience. I've read the amazing guide on how M.C. Kids was made, and I'm trying to be barebones about how I program this game. It still just boggles my mind how these programmers accomplished what they did with what they had.
EDIT: The only solution I can think of would be to hold separate tables that state what tiles should be in the PPU at what time, but I think that would be a huge memory resource that the NES wouldn't be able to handle.
So after a night of thinking and re-reading documents, I think I came up with a perfect solution. A matrix!
Given the following data:
3, -1, -1, -1, -1
-1, 0, 1, 2, -1
-1, -1, -1, 3, -1
-1, -1, 5, 4, -1
-1, -1, -1, -1, -1
I can use this data to index into lookup tables and determine what I need. The first entry (0,0) defines the whole map, whereas the other values define what is needed in that particular screen.
MAP ARRAY PALETTE MUSIC TILESET STARTINGSCR
0 0 0 1 4
1 4 3 2 2
2 etc.
3
So when loading the map, I look at item (0,0). It will say I need to load X tiles into the PPU, use Y color palette, Z tileset, and A music. It will also say that screen 0 is the starting screen and that the level starts there, so I position the character accordingly.
SCREEN  PALETTE  TILESET  MUSIC  TILEDATA  SCROLLL  SCROLLR  SCROLLU  SCROLLD
0       0        1        2      4         true     true     true     true
1       etc
2       2        1        2      3         false    false    false    true
Now let's say I need to transition screens. I can look at the current screen vs the target screen. If the new screen needs information not in the PPU, I can initiate a transition that will load the data during it. I can also see if I can scroll in that direction; e.g., if the target screen is -1, I cannot scroll that direction. I can also store a flag somewhere to determine that if scrolled onto that screen, I cannot scroll back. E.g., I can go right into screen #2 but cannot scroll left into screen 1.
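Here is a rough sketch of the neighbour lookup on that matrix. It is in Python rather than C# just to keep it short, and the names `SCREEN_GRID`, `find_screen` and `neighbor_screen` are made up for the example. It assumes cell (0,0) is reserved for the map id and is never a screen, and that -1 means "no screen there".

```python
SCREEN_GRID = [
    [3, -1, -1, -1, -1],
    [-1, 0, 1, 2, -1],
    [-1, -1, -1, 3, -1],
    [-1, -1, 5, 4, -1],
    [-1, -1, -1, -1, -1],
]

# (row delta, column delta) for each scroll direction.
DIRECTIONS = {"left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0)}


def find_screen(grid, screen_id):
    """Return (row, col) of a screen id, skipping the map id at (0, 0)."""
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if (r, c) != (0, 0) and cell == screen_id:
                return r, c
    return None


def neighbor_screen(grid, screen_id, direction):
    """Return the screen id in the given direction, or None if blocked."""
    pos = find_screen(grid, screen_id)
    if pos is None:
        return None
    dr, dc = DIRECTIONS[direction]
    r, c = pos[0] + dr, pos[1] + dc
    if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
        return None  # off the edge of the matrix
    if (r, c) == (0, 0) or grid[r][c] == -1:
        return None  # the map-id cell or an empty cell: cannot scroll
    return grid[r][c]


print(neighbor_screen(SCREEN_GRID, 0, "right"))
print(neighbor_screen(SCREEN_GRID, 0, "left"))
```

The per-screen scroll flags from the second table would be an extra check on top of this, and comparing the tileset of the current and target screens would tell the transition whether anything needs to be copied into the PPU.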