Machine Level Architecture, implement commands using others - memory

I'm preparing for interviews by working through questions people have posted on Glassdoor for similar positions. I ran into one that I'm stuck on and a little confused by.
A processor with only 1 register and 2 memory slots. It has two instructions SUB and STO. Implement LOD, ADD, and MOV using only the following:
SUB a, memory1
SUB a, memory2
STO memory1, a
STO memory2, a
I'm assuming STO is store and LOD is load here. So would the register be assumed to start with a value of 0? If not, I'm not sure how to even start, since I can't use subtract with the register if it has no value in it, can I? Bit lost here.

This is basically puzzle solving. Which is not terribly productive as an interview technique, but it definitely pays to understand that is the case and to treat it a bit differently than e.g. a methodical programming problem. In the interview context you want to make it very clear how you are approaching the search space and what you are thinking rather than just spitting out the answer.
In this spirit, I'll approach this a bit more conversationally...
Per the question of whether a has to be zero initially, what would we do if the initial value is arbitrary? How do we get it to be zero? The only compute instruction we have is a subtract... How do you get a guaranteed zero from that? Well X - X is always zero right? So if we need the accumulator to be zero, we store to a memory location and then subtract it back from the accumulator. The post condition for that is the accumulator is zero.
So from here we can see that one constraint we'll have is using one or both memory locations as temporary storage. This is a fairly big issue, but we likely have to insist that the composite instruction sequences get to use one memory location as a temp while the other is the input operand. It is an open question whether all operations can be implemented without destroying the input as well.
Let's start with load as it should be simplest. We want:
LOD a, memory1 # a = *memory1 -- destroys value in memory2
Let's try:
# zero a
STO memory2, a
SUB a, memory2
SUB a, memory1 # a == -*memory1
# Save -*memory1 and zero a again
STO memory2, a
SUB a, memory2 # a = 0
SUB a, memory2 # a = 0 - (-*memory1) == *memory1
There's LOD. It may well be possible to be more efficient, but that should get you started.
ADD a, memory1 # a = a + *memory1 -- destroys value in memory2
As above, we'll be using the X - (-Y) == X + Y algebraic equivalence.
STO memory2, a # save a, call this "original_a"
SUB a, memory2 # a = 0
SUB a, memory1 # a = -*memory1
SUB a, memory2 # a = -*memory1 - original_a
# Save -*memory1 - original_a, then zero a
STO memory2, a
SUB a, memory2 # a = 0
SUB a, memory2 # a = -(-*memory1 - original_a) == *memory1 + original_a
And so on...
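One way to check sequences like these is to simulate the machine. The following is a minimal Python sketch, assuming unbounded integers (no word width or overflow). The names `Machine`, `lod`, and `add` are hypothetical and not part of the original question.

```python
class Machine:
    """Toy model of the processor: one register and two memory slots."""

    def __init__(self, a, memory1, memory2):
        self.a = a
        self.mem = {"memory1": memory1, "memory2": memory2}

    def sub(self, slot):
        # SUB a, slot  ->  a = a - *slot
        self.a -= self.mem[slot]

    def sto(self, slot):
        # STO slot, a  ->  *slot = a
        self.mem[slot] = self.a


def lod(m):
    """LOD a, memory1 built from SUB/STO; clobbers memory2."""
    m.sto("memory2")
    m.sub("memory2")   # a = 0
    m.sub("memory1")   # a = -*memory1
    m.sto("memory2")   # memory2 = -*memory1
    m.sub("memory2")   # a = 0
    m.sub("memory2")   # a = *memory1


def add(m):
    """ADD a, memory1 built from SUB/STO; clobbers memory2."""
    m.sto("memory2")   # memory2 = original_a
    m.sub("memory2")   # a = 0
    m.sub("memory1")   # a = -*memory1
    m.sub("memory2")   # a = -*memory1 - original_a
    m.sto("memory2")
    m.sub("memory2")   # a = 0
    m.sub("memory2")   # a = *memory1 + original_a
```

Starting from arbitrary register contents, `lod` should leave `a` equal to `memory1`, and `add` should leave `a` equal to the old `a` plus `memory1`, with `memory1` untouched in both cases.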

Related

"How to make a Turing machine for {w ∈ {a,b}∗ | 2·n_a(w) = 3·n_b(w)}. My question is in how to apply the condition"

This is an assignment I have for a module. I understand Turing machines; the problem for me is how to make sure the ratio is maintained. I can see how one would check this if we could check every 5 symbols where the letters are intermixed (e.g. {aababaabab}), but not for words like {aaaaaabbbb}. Very lost.
Any tips/help?
I assume this means that the ratio n/m = 3/2, where n is the number of occurrences of a and m is the number of occurrences of b.
To solve the general case, lots of solutions are possible; here is one that might be easy to understand.
1. Scanning left to right, find 3 occurrences of a, skipping over A, B and b. Change the three a you find to A and go back to the beginning of the tape. If you hit the end of the tape without crossing over any a or b, halt-accept. If you hit the end of the tape having seen some a or b but fewer than three a, halt-reject. Otherwise, continue to step 2.
2. Scanning left to right, find 2 occurrences of b, skipping over A, B and a. Change the two b you find to B and go back to the beginning of the tape. If you hit the end without seeing at least two b, halt-reject. Otherwise, repeat from step 1 until you halt-accept or halt-reject.
Examples:
aababaabab   aaaaaabbbb   aaaaabbbb
AAbAbaabab   AAAaaabbbb   AAAaabbbb
AABABaabab   AAAaaaBBbb   AAAaaBBbb
AABABAAbAb   AAAAAABBbb   halt_reject
AABABAABAB   AAAAAABBBB
halt_accept  halt_accept
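The marking procedure can be sketched in Python to try it on other inputs. This is only a simulation of the two scanning steps on a list, not a formal transition table, and the function name `accepts` is made up for illustration.

```python
def accepts(word):
    """Simulate the two-step marking procedure on a tape of a's and b's."""
    tape = list(word)
    while True:
        # Step 1: mark three unmarked a's, scanning left to right.
        marked_a = 0
        saw_b = False
        for i, cell in enumerate(tape):
            if cell == "a":
                tape[i] = "A"
                marked_a += 1
                if marked_a == 3:
                    break
            elif cell == "b":
                saw_b = True
        if marked_a < 3:
            # Reached the end of the tape: accept only if nothing unmarked was seen.
            return marked_a == 0 and not saw_b

        # Step 2: mark two unmarked b's, scanning left to right.
        marked_b = 0
        for i, cell in enumerate(tape):
            if cell == "b":
                tape[i] = "B"
                marked_b += 1
                if marked_b == 2:
                    break
        if marked_b < 2:
            return False
```

Calling `accepts` on the three example words above should reproduce the accept, accept, reject outcomes shown in the table.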

Lowering memory usage in HashMap in Rust

I'm trying to parse a very long file by using a fixed-size sliding window over it. For each such window I'd like to either insert it to a HashMap as the key with custom struct as value, or modify existing value for the window. My main problem is memory usage, since it should scale into very large quantities (up to several billions of distinct windows) and I want to reuse existing keys.
I would like to append windows (or more specifically bytes) to a vector and use the index as a key in the HashMap, but use the window under index for hash computation and key comparison. Because windows are overlapping, I will append only the part of the window which is new (if I have an input AAAB and size 3 I would have 2 windows: AAA and AAB, but would only store 4 bytes - AAAB; windows would have indices 0 and 1 respectively), which is the reason behind not keying the HM with window itself.
Here's the simplified pseudo-code, in which I omitted the minimal-input problem:
let file = // content of the file on which I can call windows()
let vec = Rc::new(RefCell::new(Vec::new())); // RefCell allows me to store Rc in the hashmap while mutating the underlying vector
let mut hm: HashMap<KeyStruct, ValueStruct> = HashMap::new();
for i in file.windows(FIXED_SIZE) {
    let idx = vec.borrow().len();
    vec.borrow_mut().push(i);
    if hm.contains_key(&KeyStruct::new(idx)) {
        // get the key associated with i
        // modify the value associated with i
        // do something with the key
        vec.borrow_mut().pop(); // window is already in the vector
    } else {
        hm.insert(KeyStruct::new(idx), ValueStruct::new(...));
    }
}
I have come up with 2 different approaches: either modifying the existing HashMap implementation so that it works as intended, or using a custom struct as the key to the HashMap. Since I would only use one vector to store windows, I could store an Rc to it in the HashMap and then use that for lookups.
I could also create a struct which would hold both a Rc and index, using it as a key to the HashMap. The latter solution works with a vanilla HashMap, but stores a lot of redundant Rcs to the same vector. I also thought about storing a static pointer to Rc and then get Rc in unsafe blocks, but I would have to guarantee that the position of the Rc on the stack never changes and I'm not sure if I can guarantee that.
I tried to implement the first approach (custom HashMap), but it turns out that Buckets use a lot of features which are gated, and I can't compile the project using the stable compiler.
What's even worse is that I would like to get the key that is already in the HashMap on a successful lookup (because different indices can store the same window, for which the hash/cmp would be identical) and use it inside the value structure. I couldn't find a way to do this using the provided API for HashMap - the closest I get is by using entry(), which can contain an OccupiedEntry, but it doesn't have any way to retrieve the key, and there's no way to get it by unsafe memory lookups, because documentation on repr() says that the order in structs is not guaranteed in the default representation. I can store the key (or only the index) in the value struct, but that adds yet another size_of::<usize>() bytes per entry, only to store the index/key in a reachable manner, which is kept with that entry either way.
My questions are:
Is it possible to compile/reuse parts of std::collections which are not pub, such that I could modify a few methods of HashMap and compile the whole project?
Is there any way of getting the key after a successful lookup in the HashMap? (I even found out that the libs team decided against implementing a method on Entry which would allow me to get the key...)
Can you see any alternative to solutions that I mentioned?
EDIT
To clarify the problem let's consider a simple example - input ABABCBACBC and window size of 2. We should give index as a key to the HashMap, and it should get the window-size number of bytes as window starting from that index: with vector [A, A, C], index 1 and window-size 2 HashMap should try to find a hash/key for AC.
We get windows like this:
AB -> BA -> AB -> BC -> CB -> BA -> AC -> CB -> BC
First pair is AB, we append it into the empty vector and give it an index of 0.
vec = [A, B]
hm = [(0, val)]
The next pair is BA:
start with vec = [A, B]
using algorithm not shown here, I know that I have a common part between last inserted window (AB) and current window (BA), namely B
append part of the window to the existing vector, so we have vec = [A, B, A]
perform a lookup using index 1 as the index of window
it has not been found so the new key, val is inserted to HashMap
vec = [A, B, A]
hm = [(0, val0), (1, val1)]
Next up is window AB:
once again we have a common part - A
append: vec = [A, B, A, B]
lookup using index 2
it is successful, so I should delete the newly inserted part of window and get the index of the window inside vector - in this case 0
modify value, do something with the key etc...
vec = [A, B, A]
hm = [(0, val0_modified), (1, val1)]
After looping over this input I should end up with:
vec = [A, B, A, B, C, B, A, C]
and indices for pairs could be represented as: [(AB, 0), (BA, 1), (BC, 3), (CB, 4), (AC, 6)]
I do not want to modify keys. I also don't want to modify the vector with the exception of pushing/popping the window during lookup/insertion.
Sidenote: even though I still have redundant information in this particular example after putting everything into vector, it won't be the case while working with the original data.
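The bookkeeping in this walkthrough can be modelled in a few lines of Python. This is only a sketch of the intended behaviour, not a Rust solution and not a memory-efficient one. The overlap rule (longest suffix of the buffer that is a proper prefix of the window) is an assumption standing in for the algorithm not shown, and `WindowKey` and `index_windows` are hypothetical names.

```python
class WindowKey:
    """Key that stores only an index; hash and equality read the shared buffer."""
    __slots__ = ("buf", "idx", "size")

    def __init__(self, buf, idx, size):
        self.buf, self.idx, self.size = buf, idx, size

    def window(self):
        return bytes(self.buf[self.idx:self.idx + self.size])

    def __hash__(self):
        return hash(self.window())

    def __eq__(self, other):
        return self.window() == other.window()


def index_windows(data, size):
    buf = bytearray()   # plays the role of vec
    table = {}          # plays the role of hm: key -> index of the stored key
    for start in range(len(data) - size + 1):
        window = data[start:start + size]
        # Assumed overlap rule: reuse the longest buffer suffix that the window starts with.
        overlap = 0
        for k in range(min(size - 1, len(buf)), 0, -1):
            if buf[-k:] == window[:k]:
                overlap = k
                break
        added = size - overlap
        buf.extend(window[overlap:])
        probe = WindowKey(buf, len(buf) - size, size)
        if probe in table:
            del buf[-added:]            # window already known: undo the append
        else:
            table[probe] = probe.idx    # new window: keep the bytes and the key
    return bytes(buf), {key.window(): idx for key, idx in table.items()}
```

Running `index_windows(b"ABABCBACBC", 2)` under these assumptions should give the vector and index pairs listed in the example above. Storing the index in the value, as done here, is the workaround the question describes as costing an extra `usize` per entry.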

Possible to use less/greater than operators with IF ANY?

Is it possible to use <,> operators with the if any function? Something like this:
select if (any(>10,Q1) AND any(<2,Q2 to Q10))
You definitely need to create an auxiliary variable to do this.
@Jignesh Sutar's solution is one that works fine. However, there are often multiple ways in SPSS to accomplish a certain task.
Here is another solution where the COUNT command comes in handy.
It is important to note that the following solution assumes that the values of the variables are integers. If you have float values (1.5 for instance) you'll get a wrong result.
* count occurrences where Q2 to Q10 is less than 2.
COUNT #QLT2 = Q2 TO Q10 (LOWEST THRU 1).
* select if Q1>10 and
* there is at least one occurrence where Q2 to Q10 is less than 2.
SELECT IF (Q1>10 AND #QLT2>0).
There is also a variant of this solution that handles float values correctly, though I think it is less intuitive.
* count occurrences where Q2 to Q10 is 2 or higher.
COUNT #QGE2 = Q2 TO Q10 (2 THRU HIGHEST).
* select if Q1>10 and
* not every occurrence of (the 9 variables) Q2 to Q10 is two or higher.
SELECT IF (Q1>10 AND #QGE2<9).
Note: Variables beginning with # are temporary variables. They are not stored in the data set.
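To see why the first variant needs integer data while the second does not, the two counting rules can be mimicked outside SPSS. This is a rough Python analogue, assuming no missing values. The function names are invented for illustration and are not SPSS features.

```python
def count_in_range(values, low, high):
    """Analogue of COUNT with a (low THRU high) value range."""
    return sum(1 for v in values if low <= v <= high)


def keep_case_lt2(q1, q2_to_q10):
    # Mirrors COUNT ... (LOWEST THRU 1): only catches values <= 1.
    hits = count_in_range(q2_to_q10, float("-inf"), 1)
    return q1 > 10 and hits > 0


def keep_case_ge2(q1, q2_to_q10):
    # Mirrors COUNT ... (2 THRU HIGHEST): anything not counted is below 2.
    hits = count_in_range(q2_to_q10, 2, float("inf"))
    return q1 > 10 and hits < len(q2_to_q10)
```

With integer answers both functions agree. With a value such as 1.5 the first rule misses the case, because 1.5 is below 2 but not in the range up to 1, while the second rule still keeps it.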
I don't think you can (would be nice if you could - you can do something similar in Excel with COUNTIF & SUMIF IIRC).
You'd have to construct a new variable which tests the multiple ANY less-than condition, as in the example below:
input program.
loop #j = 1 to 1000.
compute ID=#j.
vector Q(10).
loop #i = 1 to 10.
compute Q(#i) = trunc(rv.uniform(-20,20)).
end loop.
end case.
end loop.
end file.
end input program.
execute.
vector Q=Q2 to Q10.
loop #i=1 to 9 if Q(#i)<2.
compute #QLT2=1.
end loop if Q(#i)<2.
select if (Q1>10 and #QLT2=1).
exe.

Pathfinding in Prolog

I'm trying to teach myself Prolog. Below, I've written some code that I think should return all paths between nodes in an undirected graph... but it doesn't. I'm trying to understand why this particular code doesn't work (which I think differentiates this question from similar Prolog pathfinding posts). I'm running this in SWI-Prolog. Any clues?
% Define a directed graph (nodes may or may not be "room"s; edges are encoded by "leads_to" predicates).
room(kitchen).
room(living_room).
room(den).
room(stairs).
room(hall).
room(bathroom).
room(bedroom1).
room(bedroom2).
room(bedroom3).
room(studio).
leads_to(kitchen, living_room).
leads_to(living_room, stairs).
leads_to(living_room, den).
leads_to(stairs, hall).
leads_to(hall, bedroom1).
leads_to(hall, bedroom2).
leads_to(hall, bedroom3).
leads_to(hall, studio).
leads_to(living_room, outside). % Note "outside" is the only node that is not a "room"
leads_to(kitchen, outside).
% Define the undirected version of the graph. This is what we'll work with.
neighbor(A,B) :- leads_to(A, B).
neighbor(A,B) :- leads_to(B, A).
Iff A --> B --> C --> D is a loop-free path, then
path(A, D, [B, C])
should be true. I.e., the third argument contains the intermediate nodes.
% Base Rule (R0)
path(X,Y,[]) :- neighbor(X,Y).
% Inductive Rule (R1)
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), not(member(Z, P)), path(Z,Y,P).
Yet,
?- path(bedroom1, stairs, P).
is false. Why? Shouldn't we get a match to R1 with
X = bedroom1
Y = stairs
Z = hall
P = []
since,
?- neighbor(bedroom1, hall).
true.
?- not(member(hall, [])).
true.
?- path(hall, stairs, []).
true .
?
In fact, if I evaluate
?- path(A, B, P).
I get only the length-1 solutions.
Welcome to Prolog! The problem, essentially, is that when you get to not(member(Z, P)) in R1, P is still a pure variable, because the evaluation hasn't gotten to path(Z, Y, P) to define it yet. One of the surprising yet inspiring things about Prolog is that member(Ground, Var) will generate lists that contain Ground and unify them with Var:
?- member(a, X).
X = [a|_G890] ;
X = [_G889, a|_G893] ;
X = [_G889, _G892, a|_G896] .
This has the confusing side-effect that checking for a value in an uninstantiated list will always succeed, which is why not(member(Z, P)) will always fail, causing R1 to always fail. The fact that you get all the R0 solutions and none of the R1 solutions is a clue that something in R1 is causing it to always fail. After all, we know R0 works.
If you swap these two goals, you'll get the first result you want:
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), path(Z,Y,P), not(member(Z, P)).
?- path(bedroom1, stairs, P).
P = [hall]
If you ask for another solution, you'll get a stack overflow. This is because after the change we're happily generating solutions with cycles as quickly as possible with path(Z,Y,P), only to discard them post-facto with not(member(Z, P)). (Incidentally, for a slight efficiency gain we can switch to memberchk/2 instead of member/2. Of course doing the wrong thing faster isn't much help. :)
I'd be inclined to convert this to a breadth-first search, which in Prolog would imply adding an "open set" argument to contain solutions you haven't tried yet, and at each node first trying something in the open set and then adding that node's possibilities to the end of the open set. When the open set is exhausted, you've tried every node you could get to. For some path finding problems it's a better solution than depth-first search anyway. Another thing you could try is separating the path into a visited and a future component, and only checking the visited component. As long as you aren't generating a cycle in the current step, you can be assured you aren't generating one at all, so there's no need to worry about future steps.
The way you worded the question leads me to believe you don't want a complete solution, just a hint, so I think this is all you need. Let me know if that's not right.
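For readers who do want to see the "visited component" idea spelled out, here is a small sketch in Python rather than Prolog, so it stays a hint and not a drop-in answer. The edge list is copied from the question, and the names `neighbors` and `simple_paths` are made up for illustration.

```python
EDGES = [
    ("kitchen", "living_room"), ("living_room", "stairs"),
    ("living_room", "den"), ("stairs", "hall"),
    ("hall", "bedroom1"), ("hall", "bedroom2"),
    ("hall", "bedroom3"), ("hall", "studio"),
    ("living_room", "outside"), ("kitchen", "outside"),
]


def neighbors(node):
    """Undirected adjacency, like neighbor/2 in the question."""
    for a, b in EDGES:
        if a == node:
            yield b
        elif b == node:
            yield a


def simple_paths(start, goal, visited=None):
    """Yield the intermediate nodes of every loop-free path from start to goal."""
    # visited holds the nodes already on the current path, origin first.
    visited = [start] if visited is None else visited
    for nxt in neighbors(start):
        if nxt == goal:
            yield visited[1:]          # drop the origin, keep the intermediates
        elif nxt not in visited:       # only the already-visited part is checked
            yield from simple_paths(nxt, goal, visited + [nxt])
```

Because the membership test only ever looks at nodes that are already bound, the search terminates on this graph. For example, `list(simple_paths("bedroom1", "stairs"))` should contain just the path through `hall`.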

Find all possible pairs between the subsets of N sets with Erlang

I have a set S. It contains N subsets (which in turn contain some sub-subsets of various lengths):
1. [[a,b],[c,d],[*]]
2. [[c],[d],[e,f],[*]]
3. [[d,e],[f],[f,*]]
N. ...
I also have a list L of 'unique' elements that are contained in the set S:
a, b, c, d, e, f, *
I need to find all possible combinations between each sub-subset from each subset so, that each resulting combination has exactly one element from the list L, but any number of occurrences of the element [*] (it is a wildcard element).
So, the result of the needed function working with the above mentioned set S should be (not 100% accurate):
- [a,b],[c],[d,e],[f];
- [a,b],[c],[*],[d,e],[f];
- [a,b],[c],[d,e],[f],[*];
- [a,b],[c],[d,e],[f,*],[*];
So, basically I need an algorithm that does the following:
1. Take a sub-subset from subset 1.
2. Add one more sub-subset from the next subset, maintaining the list of 'unique' elements acquired so far (the check on the 'unique' list is skipped if the sub-subset contains the * element).
3. Repeat step 2 until subset N is reached.
In other words, I need to generate all possible 'chains' (it is pairs, if N == 2, and triples if N==3), but each 'chain' should contain exactly one element from the list L except the wildcard element * that can occur many times in each generated chain.
I know how to do this with N == 2 (it is a simple pair generation), but I do not know how to enhance the algorithm to work with arbitrary values for N.
Maybe Stirling numbers of the second kind could help here, but I do not know how to apply them to get the desired result.
Note: The type of data structure to be used here is not important for me.
Note: This question has grown out from my previous similar question.
Here are some pointers (not complete code) that should take you in the right direction:
I don't think you will need any advanced data structures here; make use of Erlang list comprehensions. You should also explore the Erlang sets and lists modules. Since you are dealing with sets and lists of subsets, they seem like an ideal fit.
Here is how list comprehensions make this easy: [{X,Y} || X <- [[c],[d],[e,f]], Y <- [[a,b],[c,d]]]. Here I am simply generating a list of {X,Y} 2-tuples, but for your use case you will have to put real logic here (including your star case).
Further note that with list comprehensions you can use the output of one generator as the input of a later generator, e.g. [{X,Y} || X1 <- [[c],[d],[e,f]], X <- X1, Y1 <- [[a,b],[c,d]], Y <- Y1].
Also, to remove duplicates from a list such as L = ["a", "b", "a"]., you can simply do sets:to_list(sets:from_list(L)).
With the above tools you can generate all possible chains and enforce your logic as these chains get generated.
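Extending pair generation to arbitrary N is mostly a matter of recursion over the list of subsets. Below is a minimal Python sketch of that recursion, not Erlang. It assumes one reading of the question: a chain takes one sub-subset per subset, no non-wildcard element may repeat, and a sub-subset containing * skips the uniqueness check. The name `chains` is hypothetical.

```python
WILDCARD = "*"


def chains(subsets, seen=frozenset()):
    """Yield every chain that picks one sub-subset from each subset in order."""
    if not subsets:
        yield []
        return
    first, rest = subsets[0], subsets[1:]
    for sub in first:
        concrete = {e for e in sub if e != WILDCARD}
        # Assumed rule: the uniqueness check is skipped when the wildcard is present.
        if WILDCARD not in sub and concrete & seen:
            continue
        for tail in chains(rest, seen | concrete):
            yield [sub] + tail
```

For instance, `list(chains([[["a"], ["*"]], [["a"], ["b"]]]))` drops the chain that would use `a` twice and keeps the other three. The same shape can be written in Erlang as a recursive function whose body is a list comprehension over the first subset.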
