Why is this Turing Machine not a decider? - automata

I understand that a Turing machine is a decider if it halts on every input. But this question just gave me this diagram and asked me to deduce whether the Turing machine is a decider, and the answer was no. How was 'no' deduced? I can't seem to wrap my head around the difference between decider and non-decider Turing machines. Could anyone help explain this to me? Thanks.

If I am getting this right, then if I write "ba" on the tape and start the Turing machine with the head located at "b", then it will loop forever.
Starting at position 0 in state 0.
Read "b", replace "b" by "b", go right to position 1, move to state 1
Read "a", replace "a" by "a", go left to position 0, move to state 0
Read "b", replace "b" by "b", go right to position 1, move to state 1
Read "a", replace "a" by "a", go left to position 0, move to state 0
Read "b", replace "b" by "b", go right to position 1, move to state 1
Read "a", replace "a" by "a", go left to position 0, move to state 0
...
You can see that this will loop forever.
Because there is an input on which the Turing machine doesn't halt, it's not a decider.
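The trace above can be checked mechanically. The following is a minimal sketch that encodes only the two transitions visible in the trace (the rest of the diagram is not reproduced here, so the `TRANSITIONS` table is an assumption, as is treating a missing transition as a halt). It reports a loop as soon as the exact same configuration (state, head position, tape contents) occurs twice, which for a deterministic machine means it will repeat forever.

```python
# Only the two transitions seen in the trace; the full diagram is not available here.
# Format: (state, read_symbol) -> (write_symbol, move, next_state)
TRANSITIONS = {
    (0, "b"): ("b", "R", 1),
    (1, "a"): ("a", "L", 0),
}

def run(transitions, tape, state=0, head=0, max_steps=1000):
    """Simulate a deterministic TM; return ("halts" | "loops" | "unknown", steps)."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    seen = set()
    for step in range(max_steps):
        config = (state, head, tuple(sorted(cells.items())))
        if config in seen:
            # A repeated configuration of a deterministic machine repeats forever.
            return "loops", step
        seen.add(config)
        symbol = cells.get(head, "_")  # "_" stands for a blank cell
        if (state, symbol) not in transitions:
            return "halts", step  # assumption: no transition means the machine halts
        write, move, state = transitions[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "unknown", max_steps

print(run(TRANSITIONS, "ba"))  # ('loops', 2)
```

Repeated-configuration detection only catches loops that revisit an identical configuration, as this one does; in general there is no procedure that decides halting for every machine and input.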


Reactor - Publish behavior

I am very confused by the publish behavior in Flux. Why does the second subscriber not print anything while the first one does? It is a hot publisher and values are emitted once a second, so both subscribers should share the same elements.
Flux<String> flux = Flux.fromIterable(Arrays.asList("a", "b", "c", "d", "e", "f"))
.publish()
.autoConnect()
.delayElements(Duration.ofSeconds(1));
flux.subscribe(s -> System.out.println("1 - " + s));
flux.subscribe(s -> System.out.println("2 - " + s));
Interestingly, the share() method shows the output for both subscribers.
The fact it's a "hot publisher" means that you may miss values if they're emitted before you subscribe, which is what's happening here - the first subscriber is causing the Flux to start publishing, and it publishes all its values before your second subscriber.
You may expect delayElements() to change this behaviour, as it's delaying each one by a second, which should be more than enough time for the second subscriber to subscribe. However, this doesn't happen, because you're only delaying the elements after the publish() and autoConnect() calls, or to put it another way, only after the Flux has been made "hot". This means all the values are emitted near instantly, before being delayed, and before the second subscriber gets a chance to subscribe.
Instead, I suspect you want the delayElements() call before it's a hot Flux as follows:
Flux<String> flux = Flux.fromIterable(Arrays.asList("a", "b", "c", "d", "e", "f"))
.delayElements(Duration.ofSeconds(1))
.publish()
.autoConnect();
This will then delay the elements before the Flux becomes hot, almost certainly giving your second subscriber enough time to subscribe, and printing both sets of results as you expect.

Defining states, Q and R matrix in reinforcement learning

I am new to RL and am working through a couple of books and tutorials, yet I have a basic question that I hope to find an answer to here.
The primary references are Sutton & Barto (2nd edition) and a blog.
Problem description (Q-learning approach only): the agent has to get from point A to point B along a straight line. Point B is static; only the agent's initial position is random.
-----------A(60,0)----------------------------------B(100,0)------------->
To keep it simple, the agent always moves in the forward direction. B is always at x-axis position 100, which is also the goal state, and in the first iteration A is at x-axis position 60. So the actions are just "Go forward" and "Stop". The reward structure is: the agent gets 100 when A reaches point B, 0 otherwise, and -500 when A crosses B. So the goal for the agent is to reach and stop at position B.
1) How many states are required to go from point A to point B in this case, and how do I define a Q and an R matrix for this?
2) How do I add a new column and row if a new state is found?
Any help would be greatly appreciated.
Q_matrix implementation:
Q_matrix((find(List_Ego_pos_temp == current_state)) ,
possible_actions) = Q_matrix(find(List_Ego_pos_temp == current_state),possible_actions) + this.learning_rate * (Store_reward(this.Ego_pos_counter) + ...
this.discount * max(Q_matrix(find(List_Ego_pos_temp == List_Ego_pos_temp(find(current_state)+1))),possible_actions) - Q_matrix((find(List_Ego_pos_temp == current_state)) , possible_actions));
This implementation is in MATLAB.
List_Ego_pos_temp is a temporary list which stores all the positions of the agent.
Also, let's say there are ten states, 1 to 10, and we know the speed and distance with which the agent moves in each state until it reaches state 10. The agent can only move sequentially, i.e. from s1 to s2 to s3 and so on up to s10, never from s1 straight to s4 or s10.
Let's say s8 is the goal state with reward 10, s10 is a terminal state with reward -10, and s1 to s7 give a reward of 0.
Would it be a correct approach to calculate the Q table by taking state 1 as the current state and state 2 as the next state, then in the next iteration state 2 as the current state and state 3 as the next state, and so on? Will this calculate the Q table correctly, given that the next state is already fed in and nothing is predicted?
Since you are defining the problem in this case, many of the variables are dependent on you.
You can define a minimum state (e.g. 0) and a maximum state (e.g. 150) and treat each step as a state, so positions 0 through 150 give 151 possible states. Then 100 will be your goal state. Your actions will be +1 (move one step) and 0 (stop). The Q matrix will then be a 151x2 matrix covering all possible states and all actions. The reward will be a scalar, as you have defined.
You do not need to add new columns and rows, since the entire Q matrix is defined up front.
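As a rough illustration of a fixed-size Q table for this problem, here is a minimal tabular Q-learning sketch in Python (not the asker's MATLAB code). It allocates one row per position 0 through 150 and one column per action. Several details are assumptions made for the sketch: the +100 is paid on arriving at position 100, stopping pays 0 and ends the episode, actions are explored uniformly at random, and the values of `alpha`, `gamma`, `episodes` and `seed` are arbitrary.

```python
import random

GOAL, MAX_POS = 100, 150   # goal position and largest position in the table
STOP, FORWARD = 0, 1       # column indices of the two actions

def step(pos, action):
    """Return (next_pos, reward, done) for one deterministic move."""
    if action == STOP:
        return pos, 0.0, True            # assumption: stopping ends the episode
    nxt = pos + 1
    if nxt > GOAL:
        return nxt, -500.0, True         # crossed B
    return nxt, (100.0 if nxt == GOAL else 0.0), False

def train(episodes=50_000, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # Whole Q table allocated up front: one row per position, one column per action.
    q = [[0.0, 0.0] for _ in range(MAX_POS + 1)]
    for _ in range(episodes):
        pos = rng.randint(0, GOAL)       # random start position
        done = False
        while not done:
            action = rng.choice((STOP, FORWARD))   # purely random exploration
            nxt, reward, done = step(pos, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[pos][action] += alpha * (target - q[pos][action])
            pos = nxt
    return q

def greedy_policy(q):
    """Pick FORWARD only where it is strictly better than STOP."""
    return [FORWARD if row[FORWARD] > row[STOP] else STOP for row in q]

policy = greedy_policy(train())
print(policy[60], policy[99], policy[100])
```

Because every state the agent can reach already has a row, no rows or columns are ever added during learning; rows for positions the agent never visits simply stay at zero.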
Best of luck.

Stack: The terminology and example

For this simple problem, I need to find the value(s) left in stack1, in order, if any. A stack follows the LIFO (last in, first out) principle, also called FILO (first in, last out). Stacks are often used to reverse data and display it in reverse order.
Stack<Integer> stack1 = new Stack<Integer>();
stack1.push (2);
stack1.push(5);
stack1.push (stack1.pop() - stack1.pop());
stack1.push(8);
The question above made me think: if we apply the principle, should the answer be 8, 3, 5, 2?
8 comes first because it was the last value pushed; the next value is 3, computed from 5 and 2 (each "pop" deleting the element at the "head"); then come 5 and 2. Is that the right answer, or did I get it wrong?
A stack is LIFO (last in, first out): the first element you push is the last one to come out. You should also check out "What is the basic difference between stack and queue?".
As for the example, the answer is 8 and 3 only: when you calculated the 3 as stack1.pop() - stack1.pop(), you removed the 5 and the 2 from the stack, so they won't be there anymore.
Stack<Integer> stack1 = new Stack<Integer>();   // []
stack1.push(2);                                 // [2]
stack1.push(5);                                 // [2, 5]
stack1.push(stack1.pop() - stack1.pop());       // pops 5, then 2; pushes 5 - 2 = 3 -> [3]
stack1.push(8);                                 // [3, 8]
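The same sequence can be replayed with a Python list standing in for the Java Stack (append is push, pop removes the top). This is only an illustrative sketch; the names `first` and `second` are introduced here to make the order of the two pops explicit, since the left operand of the subtraction is popped first.

```python
stack = []
stack.append(2)               # [2]
stack.append(5)               # [2, 5]
first = stack.pop()           # 5, the top element; stack is now [2]
second = stack.pop()          # 2; stack is now []
stack.append(first - second)  # 5 - 2 = 3 -> [3]
stack.append(8)               # [3, 8]

print(stack)        # [3, 8]
print(stack[::-1])  # [8, 3], the order in which values would be popped
```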

Why are only the states 0 and 2 present in line 8?

LR Parsing:
LR Parsing Table:
In line 7, we reduce by T->T*F.
And State 7 on T does not have any transition.
In line 8, why do we have only the states 0 and 2?
At step 7, we reduce T→T*F, which means that:
1) We pop the right-hand side off of the stack, leaving only the state 0 corresponding to symbol $.
2) We consult the goto transitions of state 0 (the new top of the stack) for the left-hand symbol T. That says we should goto state 2.
3) We push the new state 2 onto the stack along with the associated symbol T.
At the end, the stack is 0 2 with symbols $ T, as shown at step 8.
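The reduce step described above can be sketched in a few lines of Python. The state stack 0 2 7 10 before the reduction is an assumption taken from the usual textbook trace for this grammar (the chart itself is not reproduced here); the only goto entry used, goto(0, T) = 2, comes from the explanation above. The function name `reduce_step` and the dictionary layout are hypothetical.

```python
def reduce_step(states, lhs, rhs_len, goto):
    """Apply one LR reduction to a stack of states and return the new stack."""
    remaining = states[:len(states) - rhs_len]  # pop one state per right-hand-side symbol
    exposed = remaining[-1]                     # state now on top of the stack
    return remaining + [goto[(exposed, lhs)]]   # push goto(exposed, lhs)

GOTO = {(0, "T"): 2}             # the single goto entry needed for this step
before = [0, 2, 7, 10]           # assumed stack at line 7 (symbols $ T * F)
after = reduce_step(before, "T", 3, GOTO)   # reduce by T -> T * F, three symbols
print(after)                     # [0, 2]
```

State 7 having no transition on T does not matter: by the time the goto table is consulted, state 7 has already been popped and state 0 is on top.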
This is well-described in the text and pseudocode algorithms of the excellent book from which those charts were copied.

XMonad: Is there a way to bind a simultaneously triggered keychord?

Is there a way to make simultaneous key presses into a keybinding, e.g. for the keys w, e, f, when pressed within 0.05 seconds of each other, to trigger a command?
To be more specific:
If w, e, f are pressed within 0.05 seconds of each other, then upon the pressing of the last one, XMonad should trigger said command. XMonad should also have intercepted the three keys so that they are not superfluously sent to the focused window.
Otherwise (if at least one of them is not pressed within the 0.05 second time period) XMonad should send the keys to the focused window as usual.
My goal in this is to use w, e, f to "Escape" into a vim-like "Normal Mode", an XMonad.Actions.Submap (submap).
Update with a failed method, in case anyone can see a way to fix it:
I attempted to implement this using submaps, so that, for example, if you pressed w you would end up in chord_mode_w, if you pressed e from there you would end up in chord_mode_we, and if you pressed f from there you would finally end up in normal_mode, for instance. The implementation was very messy: I included, in my main keybindings, something like:
("w", spawn "xdotool key <chord_mode_w_keybinding> ; sleep 0.05 ; xdotool key <abort_keybinding>")
(chord_mode_w_keybinding, chord_mode_w)
for detecting w (the rest would be similar), along with (incomplete) submaps such as:
chord_mode_w = submap . mkKeymap c $
[
("e", chord_mode_we )
, ("f", chord_mode_wf )
, (abort_keybinding, pasteString "w")
-- in order for the submap to not eat all other letters,
-- would need to include all mappings like:
, ("a", pasteString "wa")
, ("b", pasteString "wb")
...
]
chord_mode_we = submap . mkKeymap c $
[
("f", normal_mode )
, (abort_keybinding, pasteString "we")
-- in order for the submap to not eat all other letters,
-- would need to include all mappings like:
, ("a", pasteString "wea")
, ("b", pasteString "web")
...
]
chord_mode_wf = submap . mkKeymap c $
[
("e", normal_mode )
, (abort_keybinding, pasteString "wf")
-- in order for the submap to not eat all other letters,
-- would need to include all mappings like:
, ("a", pasteString "wfa")
, ("b", pasteString "wfb")
...
]
A complete implementation would evidently have been very messy, but in theory should have sent me to normal_mode if I pressed "wef" within 0.05 seconds of each other, aborting and typing out the characters otherwise. There were two problems, however:
pasteString (as well as the other paste functions in XMonad.Util.Paste) is too slow for normal typing
I would end up in normal_mode only a small fraction of the time even if I set the abort delay much higher. Not sure of the reason behind this.
(The reason I used pasteString when aborting instead of spawning another xdotool was that output of xdotool would re-trigger one of the chord_mode_w_keybinding, chord_mode_e_keybinding, chord_mode_f_keybinding, back in the main keybindings, sending me back to the chord modes indefinitely.)
https://hackage.haskell.org/package/xmonad-contrib-0.13/docs/XMonad-Actions-Submap.html
Submap really does do almost what you want (it gets you most of the way there) ... and I will suggest that you may want to change what you are trying to do, ever so slightly, and then Submaps handle it perfectly.
You can configure Submap to capture a w key event, and start waiting for an e which then waits for an f. I even tried this out, and confirmed that it works:
, ((0, xK_w), submap . M.fromList $
[ ((0, xK_e), submap . M.fromList $
[ ((0, xK_f), spawn "notify-send \"wef combo detected!\"" ) ])
])
However, the above is almost certainly not something you'd actually want to do, since it is now impossible to send a w keypress to a window (I had to disable that config before typing this answer, which required sending several w keypress events to the active window).
The behavior I saw just now when playing with this is: if I press w, xmonad traps that event (does not send it to the active window) and is now in a state where it is waiting for either e or something else ... if I press something else, xmonad is no longer in that state, but it does not "replay" those trapped events. So if I press w and then some other key which isn't e, the result is only that xmonad is back out of the state listening for keys in the submap. It does not ever allow a w through to the active window... which I found inconvenient.
Your options as I see it are:
1) settle for the initial keybinding having a modifier, so your multi-key command would be Mod4-w e f
2) find a way to hack the delay logic you described into the action in the submap
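To make option 2 concrete, the buffering and replay logic can be sketched independently of XMonad. The Python below is not XMonad code and uses no XMonad API; the function name `process`, the event format and the output tuples are all hypothetical. It takes timestamped key events, holds back chord keys for up to 0.05 seconds, emits a chord when all three arrive inside the window, and otherwise replays the held keys.

```python
CHORD = {"w", "e", "f"}
WINDOW = 0.05  # seconds allowed between the first and last chord key

def process(events):
    """events: (timestamp, key) pairs sorted by time.
    Returns a list of ("chord",) or ("key", k) outputs in order."""
    out = []
    pending = []  # chord keys held back so far, as (timestamp, key)

    def flush():
        # Replay held keys to the focused window as ordinary keypresses.
        out.extend(("key", k) for _, k in pending)
        pending.clear()

    for t, key in events:
        if pending and t - pending[0][0] > WINDOW:
            flush()  # the window expired before the chord completed
        if key in CHORD and key not in {k for _, k in pending}:
            pending.append((t, key))
            if len(pending) == len(CHORD):
                out.append(("chord",))  # all three keys arrived in time
                pending.clear()
        else:
            flush()
            out.append(("key", key))
    flush()
    return out

print(process([(0.00, "w"), (0.01, "e"), (0.02, "f")]))  # [('chord',)]
print(process([(0.00, "w"), (0.20, "e")]))               # [('key', 'w'), ('key', 'e')]
```

In this sketch a held key is only released when the next event arrives; a real implementation would need a timer to release it after 0.05 seconds, which is exactly the added latency on every plain w keypress.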
I started using a config like this, where I nest conceptually similar actions that are infrequently needed under a tree of nested submaps, analogous to what I pasted above. The root of that tree, however, always has a modifier, so it doesn't steal keypresses which I want to forward to the active window. I use Mod3-semicolon as the root of that tree, and then there are many unmodified keypresses which are just letters (they are mnemonics for the actions).
To me, this seems like a better solution, rather than waiting for a few hundred milliseconds, and then forwarding the events unless they matched. I feel like I would find that annoying, since it would delay any w keypress event...
YMMV, hope it helps someone
