loop to sum array returns address like reading instead of the correct answer. why? - forth

I'm trying to make a program that will sum an array for me, but it keeps giving me a really long number resembling an address when I run the word sum. I tried taking it apart and running it line by line outside of the word in the terminal, and manually looping worked fine, but it totally fails when I run it as a word. What am I doing wrong?
variable length \ length var declared
create list \ space for my list made
0 variable cumsum \ sum value initialized to zero
: upload ( n1 n2 n3 --) \ loops thru and stuffs data into array
depth ( n1 n2 n3 -- n1 n2 n3 depth) \ get depth
length ! ( n1 n2 n3 depth -- n1 n2 n3) \ array length stored
list ( n1 n2 n3 -- n1 n2 n3 addr)
length @ ( n1 n2 n3 addr -- n1 n2 n3 addr nlength)
cells allot ( n1 n2 n3 addr nlength -- n1 n2 n3)
length @ 1+ ( n1 n2 n3 -- n1 n2 n3 nlength) \ consume all entries
0 ( n1 n2 n3 nl -- n1 n2 n3 nl 0) \ lower loop parameter..
do ( n1 n2 n3 nl 0 -- n1 n2 n3) \ loop begins
list ( n1 n2 n3 -- n1 n2 n3 addr)
I ( n1 n2 n3 addr -- n1 n2 n3 addr I) \ calculating address
cells ( n1 n2 n3 addr I -- n1 n2 n3 addr Ibytes)
+ ( n1 n2 n3 addr Ibytes -- n1 n2 n3 addr1+)
! ( n1 n2 n3 addr1+ -- n1 n2) \ storing into calculated address
loop
;
upload works like a charm, but then I go to use this word after
: sum ( n1 n2 n3 -- nsum)
upload \ initiates the array
length @ \ invokes upper limit of loop
0 \ lower limit of loop
do
list ( -- addr) \ addr invoked
I cells + ( addr -- addr+) \ offset calculated and added
@ ( addr+ -- nl) \ registered value at address fetched
cumsum @ ( nl -- nl ncs) \ cum sum value fetched to stack
+ ( nl ncs -- nsum) \ summation
cumsum ! ( nsum --) \ new sum written to cumsum
loop
cumsum ? ( -- cumsum) \ show sum
;
and it returns a really long number that looks like an address, and not the sum of some list of small numbers that I add to test it with.
1 ok
2 ok
3 ok
sum 140313777201982 ok

So if I understand correctly, the problem is to:
Store all numbers on the stack into an array.
Sum all numbers stored in an array.
I would do something like this:
: upload ( ... "name" -- ) create depth dup , 0 ?do , loop ;
: sum ( a -- n ) 0 swap @+ 0 ?do @+ rot + swap loop drop ;
Use like this:
1 2 3 4 upload array
array sum .

In UPLOAD you execute LIST LENGTH @ CELLS ALLOT. ALLOT allocates memory at the current dictionary (data space) pointer, not necessarily at the address returned by LIST, and it does not consume a start address from the stack. The address that LIST left on the stack in that snippet is therefore still there later, when ! runs in your array-filling loop, and it gets stored as the data for the first array cell. Hence the address-like number returned by SUM.
It is best to keep CREATE and ALLOT together. Some dictionary additions occurred between the creation of LIST and the execution of ALLOT, so your array cells might not be where LIST is pointing.
In general, variables don't consume a number from the stack. Most of the time they're automatically initialised to 0. So 0 VARIABLE CUMSUM will leave a zero on the stack.
This has consequences for DEPTH, and thus LENGTH, if you run or type the code in one go.
Try to avoid DEPTH. It is better to explicitly tell array-defining words how many items you want, for example: CREATE LIST 3 CELLS ALLOT
BTW, running your code as is in SwiftForth, I allocate a 4 cell array right after the dictionary entry for SUM. I store 5 items ( LENGTH @ 1+ in UPLOAD ) in the dictionary right after LIST, overwriting parts of the dictionary entry for CUMSUM ...
Lars Brinkhoff shows a nice alternative, apart from DEPTH that is ;-)

The main problem was as @roelf explained. I'll just add two points of interest.
1) How could you know that something was off? When you have a problem like this, examine the memory - do a hex dump!
1 2 3 sum -1223205794 ok
list 32 dump
B7175C58: 58 5C 17 B7 03 00 00 00 - 02 00 00 00 01 00 00 00 X\..............
B7175C68: 00 00 00 00 00 00 00 00 - 5E 5C 17 B7 58 5C 17 B7 ........^\..X\..
ok
You can see that the first cell of list is garbage. So maybe upload wasn't working so well after all!
2) Note that if you just want to add all values on the stack, you don't need to clutter the namespace with variables:
: sum depth 1 do + loop ;
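As a rough cross-check of the explanation above, the stack and memory behaviour can be modelled outside Forth. The following Python sketch is only an illustration: the function names are hypothetical, the address is the one shown in the hex dump, and it assumes CUMSUM starts at 0 and that nothing else in the dictionary gets clobbered.

```python
def simulate_buggy_sum(values, list_addr):
    """Model the data stack and the cells at LIST for the question's UPLOAD and SUM."""
    stack = [0]                    # `0 variable cumsum` leaves this 0 behind
    stack.extend(values)           # the numbers typed before SUM, e.g. 1 2 3
    length = len(stack)            # DEPTH LENGTH !  (counts the stray 0 too)
    stack.append(list_addr)        # LIST pushes an address that ALLOT never consumes
    cells = []
    for _ in range(length + 1):    # LENGTH @ 1+ 0 DO ... LOOP
        cells.append(stack.pop())  # LIST I CELLS + !  stores the current top of stack
    total = sum(cells[:length])    # SUM adds the first LENGTH cells
    return cells, total


def as_signed_32(n):
    """Interpret n as a signed 32-bit cell, the way a 32-bit Forth prints it."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n & 0x80000000 else n
```

Under those assumptions, simulate_buggy_sum([1, 2, 3], 0xB7175C58) gives the cells B7175C58, 3, 2, 1, 0 (the same layout as the dump), and as_signed_32 of the total is the -1223205794 printed by sum, that is, the address of list plus 3 + 2 + 1.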

Related

Writing the production rules of this finite state machine

Consider the following state diagram which accepts the alphabet {0,1} and accepts if the input string has two consecutive 0's or 1's:
01001 --> Accept
101 --> Reject
How would I write the production rules to show this? Is it just:
D -> C0 | B1 | D0 | D1
C -> A0 | B0
B -> A1 | C1
And if so, how would the terminals (0,1) be differentiated from the states (A,B,C) ? And should the state go before or after the input? That is, should it be A1 or 1A for example?
The grammar you suggest has no A: it's not a non-terminal because it has no production rules, and it's not a terminal because it's not present in the input. You could make that work by writing, for example, C → 0 | B 0, but a more general solution is to make A into a non-terminal using an ε-rule: A → ε and then
C → A 0 | B 0.
B0 is misleading, because it looks like a single thing. But it's two grammatical symbols, a non-terminal (B) and a terminal 0.
With those modifications, your grammar is fine. It's a left linear grammar; a right linear grammar can also be constructed from the FSA by considering out-transitions rather than in-transitions. In this version, the epsilon production corresponds to final states rather than initial states.
A → 1 B | 0 C
B → 0 C | 1 D
C → 1 B | 0 D
D → 0 D | 1 D | ε
If it's not obvious why the FSM corresponds to these two grammars, it's probably worth grabbing a pad of paper and constructing a derivation with each grammar for a few sample sentences. Compare the derivations you produce with the progress through the FSM for the same input.
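For readers who prefer to check this mechanically, here is a small Python sketch that walks the machine and records the matching right linear derivation. The transition table is read off the right linear grammar above (the state diagram itself is not reproduced here), and the names TRANSITIONS, accepts and derive are made up for this illustration.

```python
# Transitions implied by the right linear grammar:
#   A -> 1 B | 0 C,  B -> 0 C | 1 D,  C -> 1 B | 0 D,  D -> 0 D | 1 D | epsilon
TRANSITIONS = {
    ("A", "0"): "C", ("A", "1"): "B",
    ("B", "0"): "C", ("B", "1"): "D",
    ("C", "0"): "D", ("C", "1"): "B",
    ("D", "0"): "D", ("D", "1"): "D",
}
START, FINAL = "A", "D"


def accepts(text):
    """Run the machine over text and report whether it ends in the final state."""
    state = START
    for symbol in text:
        state = TRANSITIONS[(state, symbol)]
    return state == FINAL


def derive(text):
    """List the sentential forms of the right linear derivation that mirrors the run."""
    state = START
    forms = [state]
    for i, symbol in enumerate(text):
        state = TRANSITIONS[(state, symbol)]
        forms.append(text[: i + 1] + state)
    if state == FINAL:
        forms.append(text)  # apply D -> epsilon to finish the derivation
    return forms
```

With this table, derive("01001") yields A, 0C, 01B, 010C, 0100D, 01001D, 01001, while for "101" the derivation gets stuck at 101B because B has no epsilon rule, which corresponds to the machine rejecting.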

Add two numbers up to one constant

I have a constant number X. I also have two numbers that add up to it. How can I make it so that if I change one number, the other number automatically changes so that it still adds up to X.
I have tried to subtract the one number from X and add it to the other number, but instead I got two numbers in the thousands.
Assuming your constant value is 10, you can set this in a cell and make all your other calculations based on it.
For example, you can have cell C2 containing your constant, in this example, 10
Then in C4 you can have the number which you change, and the value of C5 will be equal to the value of the constant minus the value in C4.
You can then finally do your sum wherever you want, adding up the values of C4 and C5.
Here's an example Spreadsheet:
Untitled spreadsheet
Row   Label        Column C
2     Constant:    10
4     Number 1:    3
5     Number 2:    =(C2 - C4)
7     Sum:         =(C4 + C5)
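The same idea, storing only the constant and one number and always deriving the other, can be sketched outside a spreadsheet too. The class below is a hypothetical Python illustration; the names SplitConstant, total, first and second are not from the question.

```python
class SplitConstant:
    """Two numbers that always add up to a fixed total."""

    def __init__(self, total, first):
        self.total = total   # plays the role of cell C2
        self.first = first   # plays the role of cell C4, the number you change

    @property
    def second(self):
        # Plays the role of C5: recomputed from the constant on every read,
        # so it can never drift away from total - first.
        return self.total - self.first
```

Because second is computed rather than stored, changing first needs no manual adjustment of the other number, which avoids the kind of runaway values that come from repeatedly adding and subtracting in place.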

Overfitting in data frame that some rows repeated

I have a machine learning problem with a logistic regression algorithm. I have a data frame where some rows and features are repeated, as in the table below:
feature 1   feature 2   feature 3   ...   feature n-1   feature n   Target
a1          a2          a3          ..    an            1           1
b1          b2          b3          ..    bn            1           0
c1          c2          c3          ..    cn            1           1
..          ..          ..          ..    ..            1           ..
a1          a2          a3          ..    an            2           ..
b1          b2          b3          ..    bn            2           ..
c1          c2          c3          ..    cn            2           ..
..          ..          ..          ..    ..            2           ..
a1          a2          a3          ..    an            3           ..
b1          b2          b3          ..    bn            3           ..
c1          c2          c3          ..    cn            3           ..
..          ..          ..          ..    ..            ..          ..
Is it possible for overfitting or underfitting to occur with this data frame?
And what about a data frame that has between 6 and 8 features and about 500 rows?
I should add that the rows whose features 1 to n-1 are repeated differ in feature n.
Whether you overfit or not is due to:
the complexity of the model
the available data.
But what's important is the actual data. If you double the data by repeating it, you don't effectively change the data you have. In fact, many algorithms randomly sample from the dataset, so having duplicates changes nothing (unless the duplicated data has a different distribution than the non-duplicated data).
As such, removing the duplicates from the data will not affect whether you overfit or not.
Edit: Now, if the data is not duplicated but rather modified, it is a different story. If it is as you first described,
"where some rows and features are repeated"
then there is no effect.
But if the data is modified, as the table shows, then you need to explain: Are these actual noisy measurements? Is this some random transcription or data collection error?
However, if these are not errors in the dataset but actual data, then it is important to keep them. This is not about overfitting; it is about training with the actual data.
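The claim that exact duplication changes nothing can be checked on a toy case. The sketch below is a hand-rolled, unregularized logistic regression trained by full-batch gradient descent on the mean log-loss; the function name, the toy data and the hyperparameters are all made up for illustration. With a regularization term or a summed (rather than averaged) loss, duplication would act like a reweighting, so treat this as one scenario rather than a general proof.

```python
import math


def fit_logistic(rows, labels, learning_rate=0.5, steps=500):
    """Fit weights and bias by gradient descent on the mean log-loss (no regularization)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))      # predicted probability of class 1
            error = p - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        # Dividing by n makes the update depend on the data distribution, not the row count.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Toy, non-separable data with a single feature.
X = [[0.0], [1.0], [2.0], [3.0], [1.5], [2.5]]
y = [0, 0, 1, 1, 1, 0]

original = fit_logistic(X, y)
doubled = fit_logistic(X * 2, y * 2)   # every row repeated once
```

In this setup, original and doubled should agree up to floating-point rounding, because the averaged gradient of the doubled dataset is the same as that of the original.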

wxMaxima ezunits funny business

Is the handling of the units broken or what am I missing?
load(ezunits);
σ_N: 10000`N/(50`mm*10`mm);
newts: 123`kg*m/s^3; newts `` N; newts + 321 `kg*m/s^2;
produces not what one would have hoped for:
(%i1) load(ezunits);
(%o1) "C:/maxima-5.43.2/share/maxima/5.43.2/share/ezunits/ezunits.mac"
(%i2) σ_N: 10000`N/(50`mm*10`mm);
(σ_N) 10000 ` (N/500 ` 1/mm^2)
(%i5) newts: 123`kg*m/s^3; newts `` N; newts + 321 `kg*m/s^2;
(newts) 123 ` (kg*m)/s^3
(%o4) 123/s ` N
(%o5) 321 ` (kg*m)/s^2+123 ` (kg*m)/s^3
Should be:
σ_N= 20 N/mm^2
newts= 123 N/s
For the first part, you have to use parentheses to indicate the grouping you want. When you write a ` b/c, it is interpreted as a ` (b/c), but in this case you want (a ` b)/c. (Grouping works that way because it's assumed that stuff like x ` m/s is more common than (x ` m)/s.)
(%i2) σ_N: (10000`N)/(50`mm*10`mm);
                N
(%o2)     20 ` ---
                 2
               mm
Just for fun, let's check the dimensions of this quantity. I guess it should be force/area.
(%i3) dimensions (%);
                  mass
(%o3)         ------------
                         2
              length time
(%i4) dimensions (N);
              length mass
(%o4)         -----------
                     2
                 time
(%i5) dimensions (mm);
(%o5)         length
Looks right to me.
For the second part, I don't understand what you're trying to do. The variable newts has units equivalent to N/s, so I don't understand why you're trying to convert it to N, and I don't understand why you're trying to add N/s to N. Anyway, here's what I can make of it.
(%i6) newts: 123`kg*m/s^3;
                 kg m
(%o6)      123 ` ----
                   3
                  s
(%i7) newts `` N/s;
                 N
(%o7)      123 ` -
                 s
When quantities with different dimensions are added, ezunits just lets it stand; it doesn't produce an error or anything.
(%i8) newts + 321 ` kg*m/s^2;
                 kg m         kg m
(%o8)      321 ` ---- + 123 ` ----
                   2            3
                  s            s
The motivation for that is that it allows for stuff like 3`sheep + 2`horse or x`hour + y`dollar-- the conversion rate can be determined after the fact. In general, allowing for expressions to be reinterpreted after the fact is, I believe, the mathematical attitude.

Find last value in column A, if condition in column B is true

I've got hiking distance data from a start point in column A and a column with a yes/no condition (let's say a "Y" denotes a campsite, for example).
What I'm trying to achieve is to calculate the distance between each distance marker in column A that has the condition "Y" in column B. (Desired output is column C.)
A      B    C
--------------
0      Y
12
26     Y    26   (26 - 0 = 26)
57
124    Y    98   (124 - 26 = 98)
137
152    Y    28   (152 - 124 = 28)
169
. . .
. . .
. . .
I can pull out the distance from column A with a simple IF statement, but that doesn't get me anywhere, of course.
I've searched the Internet extensively and there are a ton of threads out there about finding the last value or last non-empty value in a column.
So I've tried to use INDEX, FILTER, and LOOKUP in all sorts of combinations, but sadly nothing produces the result I'm looking for.
The tricky part, I guess, is to find the last value with a Y above the "current" Y (if that makes any sense).
In C2 try
=ArrayFormula(if(B2:B="y", A2:A-iferror(vlookup(row(A2:A)-1, filter({row(A2:A), A2:A}, len(B2:B)),2)),))
and see if that works?
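To make the intended logic easier to follow, here is the same "distance since the previous Y" calculation as a plain Python sketch. It only illustrates the desired column C and is not a translation of the formula; the function name and the use of None for blank cells are choices made for this example.

```python
def distances_between_marked(distances, flags):
    """For each row flagged Y, return its distance minus the previous flagged distance."""
    result = []
    previous = None                      # distance at the most recent Y row, if any
    for distance, flag in zip(distances, flags):
        if flag.strip().upper() == "Y":
            # The first Y has nothing to compare against, so it stays blank.
            result.append(None if previous is None else distance - previous)
            previous = distance
        else:
            result.append(None)          # rows without Y stay blank
    return result
```

Fed the sample data from the question (0, 12, 26, 57, 124, 137, 152, 169 with a Y on every other row starting at 0), this gives 26, 98 and 28 on the flagged rows after the first, matching the desired column C.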
