Display nodes of a DFS (depth-first search) algorithm - graph-algorithm

While creating the adjacent nodes I built a vector in which to collect the adjacent nodes until the destination node is reached, but I do not understand why it ends up empty, since I only delete it at the end.
vettore_nuovo_percorso =
1 2 4 5
vettore_nuovo_percorso =
1 2 4 5
nodi =
1 2 4 5
ans =
1 1 1 1 1 1 1
0 2 4 2 2 2 2
0 0 0 3 4 4 4
0 0 0 0 0 3 5
nodi_sorgenti =
3 5
nodi_sorgenti =
3 5
'BLACK'
'BLACK'
'BLACK'
'BLACK'
'BLACK'
ans =
1 1 1 1 1 1 1
0 2 4 2 2 2 2
0 0 0 3 4 4 4
0 0 0 0 0 3 5
conta_nodo_trovato =
1
Crucial aspect of the code: the path vector should not vanish, yet here it does:
vettore_nuovo_percorso =
[]
nodi =
[]
ans =
1 1 1
0 2 4
nodi_sorgenti =
2 4
nodi_sorgenti =
2 4
'BLACK'
'BLACK'
'WHITE'
'BLACK'
'WHITE'
ans =
1 1 1
0 2 4
function [vertice,nodi]=DFS_Visit(edges,vertices,self,nodi_visitati,vb,conta_impo_colore,conta_nodo_trovato)
%with a loop I copy onto the vector nodi_sorgenti all the destinations
%reachable from the source; it is then passed as the third parameter to the call
%DFS_Visit(edges,vertices,self,nodi_visitati,vb,conta_impo_colore,conta_nodo_trovato)
for j6=1:length(nodi_visitati)
vertices.conta_righe=vertices.conta_righe+1;
%copy all the possible destinations onto the path
%vector only once
display(vertices.conta_righe);
if vertices.conta_righe==1
for j=1:length(nodi_visitati)
self.vettore_percorso(vertices.conta_righe,j)=nodi_visitati(j6);
display(self.vettore_percorso);
end
end
[n1,m1]=size(self.vettore_percorso);
for j2=1:m1
%fetch the source node
if nodi_visitati(j6)==self.vettore_percorso(n1,j2)
if nodi_visitati(j6)~=vb
conta_edge=0;
conta_nodi_sorgenti=0;
%row and column dimensions
[n1,m1]=size(self.vettore_percorso);
display(self.vettore_percorso);
%if it exists, fetch all the destinations of the node
edges2=ver.connectedEdges(edges,nodi_visitati(j6));
for j8=1:length(edges2)
%counter of source nodes
%insert the destinations into the vector nodi_sorgenti
conta_nodi_sorgenti=conta_nodi_sorgenti+1;
nodi_sorgenti(conta_nodi_sorgenti)=edges2(j8);
if nodi_sorgenti(conta_nodi_sorgenti)==vb
indice=conta_nodi_sorgenti;
end
%copy the value of the destination into a vector,
%on a new row and in the same column
display('edg');
display(edges2);
end
[n2,m2]=size(self.vettore_percorso);
%copy onto the vector all the values equal to
%the destination values, into the column j2
%where the found source resides
if self.vettore_percorso(n1,j2)~=vb
for k=1:n2
for j=1:length(edges2)
self.vettore_percorso(k,m2+j)=self.vettore_percorso(k,j2);
display(self.vettore_percorso);
end
if k==n2
for k3=1:length(edges2)
%copy the reachable
%destinations onto all
%the vectors in
%self.vettore_percorso
self.vettore_percorso(n2+1,m2+k3)=edges2(k3);
end
end
end
end
end
end
end
end
[n,m]=size(self.vettore_percorso);
%call the DFS function
for j=1:length(nodi_sorgenti)
if nodi_sorgenti(j)~=vb
self.conta_v=1;
display(conta_visita);
display('nkl');
display(conta_visita);
display(nodi_sorgenti);
else
self.conta_v=0;
end
end
if self.conta_v==0
DFS.DFS_Visit(edges,vertices,self,nodi_sorgenti,vb,conta_impo_colore,0);
end
end
%copy onto the vector all the paths until
%the destination is reached
[n3,m3]=size(self.vettore_percorso);
for j3=1:m3
if self.vettore_percorso(j4,j3)==vb
conta_nodo_trovato=1;
[n3,m3]=size(self.vettore_percorso);
display('j3');
conta_visita=1;
display(j3);
conta_percorso=conta_percorso+1;
for j7=1:m3
if j7==j3
for j8=1:n3
vettore_nuovo_percorso(conta_percorso,j8)=self.vettore_percorso(j8,j3);
display(vb);
end
end
end
end
end
end
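For reference, here is a minimal sketch (in Python rather than the MATLAB-style code above, with a hypothetical adjacency dictionary standing in for edges/vertices) of the usual way to keep every discovered path alive in a recursive DFS: the partial path is copied on each recursive call and completed paths are appended to a separate list, so nothing ever has to be emptied afterwards.

# Minimal sketch (not the poster's code): a recursive DFS that records every
# path from `src` to `dst`. The partial path is copied on each recursive call,
# so completed paths are never overwritten or cleared.
def dfs_paths(adj, src, dst, path=None, found=None):
    path = (path or []) + [src]          # copy, do not mutate the caller's path
    found = found if found is not None else []
    if src == dst:
        found.append(path)               # keep the completed path
        return found
    for nxt in adj.get(src, []):
        if nxt not in path:              # avoid cycles
            dfs_paths(adj, nxt, dst, path, found)
    return found

# example graph roughly matching the output pasted above: 1-2, 2-4, 4-3, 4-5
adj = {1: [2], 2: [4], 4: [3, 5]}
print(dfs_paths(adj, 1, 5))              # [[1, 2, 4, 5]]

The same idea carries over to the code above: if the path matrix is rebuilt or cleared inside the recursion, completed paths will appear to vanish.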

Related

What is this function doing in Lua?

function splitSat(str, pat, max, regex)
pat = pat or "\n" -- search pattern
max = max or #str
local t = {}
local c = 1
if #str == 0 then
return {""}
end
if #pat == 0 then
return nil
end
if max == 0 then
return str
end
repeat
local s, e = str:find(pat, c, not regex) -- within the string str, look for the pattern pat starting at position c
-- store the start index in s and the end index in e
max = max - 1
if s and max < 0 then
if #(str:sub(c)) > 0 then -- if the length of the portion of the string from c to the end is greater than 0
t[#t+1] = str:sub(c)
else
t[#t+1] = "" --create a table with empty values
end
else
if #(str:sub(c, s and s - 1)) > 0 then -- if the length of the portion of str between c and s - 1 is greater than 0
t[#t+1] = str:sub(c, s and s - 1)
else
t[#t+1] = "" --create a table with empty values
end
end
c = e and e + 1 or #str + 1
until not s or max < 0
return t
end
I'd like to know what this function is doing. I know that it builds a kind of table from a string and a pattern. In particular, I want to know what *t[#t+1] = str:sub(c, s and s - 1)* is doing.
From what I can tell, it splits a long string on occurrences of the pattern, returning the substrings that lie between the matches (the default pattern "\n" splits the string into lines). For example, splitting the string "a,b,c" on the pattern "," results in the table {"a", "b", "c"}.
t[#t+1] = <something> inserts a value at the end of table t, it's the same as table.insert(t, <something>)
#t returns the length of an array (that is, a table with consecutive numeric indices); for example, #{1, 2, 3} == 3
str:sub(c, s and s - 1) takes advantage of several of Lua's features. s and s - 1 evaluates to s - 1 if s is not nil, and to nil otherwise. Just s - 1 would throw an error if s were nil:
10 and 10 - 1 == 9
10 - 1 == 9
nil and nil - 1 == nil
nil - 1 -> throws an error
str:sub(a, b) just returns a substring starting at a and ending at b (a and b being numeric indices)
("abcde"):sub(2,4) == "bcd"

Do math on string count (and text parsing with awk)

I have a 4 column file (input.file) with a header:
something1 something2 A B
followed by many 4-column rows with the same format (e.g.):
ID_00001 1 0 0
ID_00002 0 1 0
ID_00003 1 0 0
ID_00004 0 0 1
ID_00005 0 1 0
ID_00006 0 1 0
ID_00007 0 0 0
ID_00008 1 0 0
Where "1 0 0" is representative of "AA", "0 1 0" means "AB", and "0 0 1" means "BB"
First, I would like to create a 5th column to identify these representations:
ID_00001 1 0 0 AA
ID_00002 0 1 0 AB
ID_00003 1 0 0 AA
ID_00004 0 0 1 BB
ID_00005 0 1 0 AB
ID_00006 0 1 0 AB
ID_00007 0 0 0 no data
ID_00008 1 0 0 AA
Note that the A's and B's need to be parsed from columns 3 and 4 of the header row, as they are not always A and B.
Next, I want to "do math" on the counts for (the new) column 5 as follows:
(2BB + AB) / 2(AA + AB + BB)
Using the example, the math would give:
(2(1) + 3) / 2(3 + 3 + 1) = 5/14 = 0.357
which I would like to append to the end of the desired output file (output.file):
ID_00001 1 0 0 AA
ID_00002 0 1 0 AB
ID_00003 1 0 0 AA
ID_00004 0 0 1 BB
ID_00005 0 1 0 AB
ID_00006 0 1 0 AB
ID_00007 0 0 0 no data
ID_00008 1 0 0 AA
B_freq = 0.357
So far I have this:
awk '{ if ($2 = 1) {print $0, $5="AA"} \
else if($3 = 1) {print $0, $5="AB"} \
else if($4 = 1) {print $0, $5="BB"} \
else {print$0, $5="no data"}}' input.file > output.file
Obviously, I was not able to figure out how to parse the info from row 1 (the header row), much less do the math.
Thanks guys!
A more structured approach...
NR==1 {a["100"]=$3$3; a["010"]=$3$4; a["001"]=$4$4; print; next}
{k=$2$3$4;
print $0, (k in a)?a[k]:"no data";
c[k]++}
END {printf "\nB freq = %.3f\n",
(2*c["001"]+c["010"]) / 2 / (c["100"]+c["010"]+c["001"])}
UPDATE
For non-binary data you can follow the same logic with some pre-processing. Something like this should work in the main block:
for(i=2;i<5;i++) v[i]=(($i-0.9)^2<=0.1^2)?1:0;
k=v[2] v[3] v[4];
...
Here the value is quantized to one for the range [0.8, 1] and to zero otherwise.
To capture "B" (or whatever string substitutes for it), set h=$4 in the first block and use it as printf "\n%s freq...", h, (2*c...

How does Weka evaluate a classifier model

I used the random forest algorithm and got this result:
=== Summary ===
Correctly Classified Instances 10547 97.0464 %
Incorrectly Classified Instances 321 2.9536 %
Kappa statistic 0.9642
Mean absolute error 0.0333
Root mean squared error 0.0952
Relative absolute error 18.1436 %
Root relative squared error 31.4285 %
Total Number of Instances 10868
=== Confusion Matrix ===
a b c d e f g h i <-- classified as
1518 1 3 1 0 14 0 0 4 | a = a
3 2446 0 0 0 1 1 27 0 | b = b
0 0 2942 0 0 0 0 0 0 | c = c
0 0 0 470 0 1 1 2 1 | d = d
9 0 0 9 2 19 0 3 0 | e = e
23 1 2 19 0 677 1 22 6 | f = f
4 0 2 0 0 13 379 0 0 | g = g
63 2 6 17 0 15 0 1122 3 | h = h
9 0 0 0 0 9 0 4 991 | i = i
I wonder how Weka evaluates errors (mean absolute error, root mean squared error, ...) for non-numerical class values ('a', 'b', ...).
I mapped each class to a number from 0 to 8 and computed the errors manually, but my results differed from Weka's.
How can I re-implement Weka's evaluation steps?
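As a hedged illustration (this is my understanding of how probability-based classification metrics are commonly defined, and it should be checked against weka.classifiers.Evaluation before relying on it): the error measures are computed from the predicted class-probability distribution compared against a 0/1 indicator vector for the true class, averaged over instances and classes, rather than from classes mapped to the numbers 0 to 8. A sketch of that idea in Python:

import numpy as np

# Sketch: MAE/RMSE from predicted class probabilities vs. 0/1 indicator vectors.
def prob_errors(probs, true_idx):
    probs = np.asarray(probs, dtype=float)          # shape (n_instances, n_classes)
    actual = np.zeros_like(probs)
    actual[np.arange(len(probs)), true_idx] = 1.0   # indicator vector per instance
    diff = probs - actual
    mae = np.abs(diff).mean()                       # averaged over instances and classes
    rmse = np.sqrt((diff ** 2).mean())
    return mae, rmse

# e.g. three instances, classes a/b/c encoded as 0/1/2
mae, rmse = prob_errors([[0.9, 0.1, 0.0],
                         [0.2, 0.7, 0.1],
                         [0.0, 0.3, 0.7]], [0, 1, 2])
print(mae, rmse)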

How to combine similar fields parsed from CSV using Ruby

I am parsing baseball statistics from a CSV file, and I need to account for players who played for multiple teams within a season. Currently my code looks like this:
require 'CSV'
CSV.foreach("Batting-07-12-resaved.csv",{:headers=>:first_row}) do |row|
if row[7].to_i != 0 && row[5] != 0 && row[1].to_i == 2009
avg = row[7].to_f / row[5].to_f
puts row[0] + ": " + avg.round(3).to_s[1..-1]
end
end
The CSV headers look like this, and a player is identified by a key that loosely resembles their name; the key may recur for the different teams they played for (here are a few lines, copied from the formatted file):
playerID yearID league teamID G AB R H 2B 3B HR RBI SB CS
aardsda01 2012 AL NYA 1
aardsda01 2010 AL SEA 53 0 0 0 0 0 0 0 0 0
aardsda01 2009 AL SEA 73 0 0 0 0 0 0 0 0 0
aardsda01 2008 AL BOS 47 1 0 0 0 0 0 0 0 0
aardsda01 2007 AL CHA 25 0 0 0 0 0 0 0 0 0
abadfe01 2012 NL HOU 37 7 0 1 0 0 0 0 0 0
abadfe01 2011 NL HOU 28 0 0 0 0 0 0 0 0 0
abadfe01 2010 NL HOU 22 1 0 0 0 0 0 0 0 0
abercre01 2008 NL HOU 34 55 10 17 5 0 2 5 5 2
abercre01 2007 NL FLO 35 76 16 15 3 0 2 5 7 1
abreubo01 2012 AL LAA 8 24 1 5 3 0 0 5 0 0
abreubo01 2012 NL LAN 92 195 28 48 8 1 3 19 6 2
So, for example, the bottom two lines, Bobby Abreu played for two different teams in the 2012 season.
How could I combine the numbers from these two rows under the same playerId for the 2012 season to calculate his 2012 batting average?
You need to keep a data structure that holds data about each playerID as you iterate through the CSV data. Using a hash would be perfect (see the Hash page in the ruby-doc.org manual).
require 'csv'
# Hashes are built into Ruby. Using a hash literal
# is more idiomatic than h = Hash.new()
h = {}
CSV.foreach("Batting-07-12-resaved.csv",{:headers=>:first_row}) do |row|
if row[7].to_i != 0 && row[5].to_i != 0 && row[1].to_i == 2009
playerData = h[row[0]]
if (!playerData)
playerData = [row[0], row[7].to_f, row[5].to_f]
else
playerData = [row[0], row[7].to_f+playerData[1], row[5].to_f+playerData[2]]
end
h[row[0]]=playerData
end
end
h.each {|key, value|
puts "#{value[0]} is #{value[1]/value[2]}"
}
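For comparison, a short Python sketch of the same accumulate-into-a-hash idea, keyed on (playerID, yearID) so that a player's stints with different teams in one season are summed; it assumes the same column layout as the question's file.

import csv
from collections import defaultdict

# Sketch: sum hits (column 7) and at-bats (column 5) per (playerID, yearID),
# then print each combined batting average.
totals = defaultdict(lambda: [0, 0])                # key -> [hits, at_bats]
with open("Batting-07-12-resaved.csv") as f:
    for row in csv.reader(f):
        if len(row) > 7 and row[0] != "playerID":   # skip header and short rows
            key = (row[0], row[1])
            totals[key][0] += int(row[7] or 0)
            totals[key][1] += int(row[5] or 0)

for (player, year), (hits, at_bats) in totals.items():
    if at_bats > 0:
        print("%s (%s): %.3f" % (player, year, hits / at_bats))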

Random Forest overfitting?

I'm facing the following problem: I'm training a random forest for binary prediction. The data is structured as follows:
> str(data)
'data.frame': 120269 obs. of 11 variables:
$ SeriousDlqin2yrs : num 1 0 0 0 0 0 0 0 0 0 ...
$ RevolvingUtilizationOfUnsecuredLines: num 0.766 0.957 0.658 0.234 0.907 ...
$ age : num 45 40 38 30 49 74 39 57 30 51 ...
$ NumberOfTime30.59DaysPastDueNotWorse: num 2 0 1 0 1 0 0 0 0 0 ...
$ DebtRatio : num 0.803 0.1219 0.0851 0.036 0.0249 ...
$ MonthlyIncome : num 9120 2600 3042 3300 63588 ...
$ NumberOfOpenCreditLinesAndLoans : num 13 4 2 5 7 3 8 9 5 7 ...
$ NumberOfTimes90DaysLate : num 0 0 1 0 0 0 0 0 0 0 ...
$ NumberRealEstateLoansOrLines : num 6 0 0 0 1 1 0 4 0 2 ...
$ NumberOfTime60.89DaysPastDueNotWorse: num 0 0 0 0 0 0 0 0 0 0 ...
$ NumberOfDependents : num 2 1 0 0 0 1 0 2 0 2 ...
- attr(*, "na.action")=Class 'omit' Named int [1:29731] 7 9 17 33 42 53 59 63 72 87 ...
.. ..- attr(*, "names")= chr [1:29731] "7" "9" "17" "33" ...
I split the data
index <- sample(1:nrow(data),round(0.75*nrow(data)))
train <- data[index,]
test <- data[-index,]
Then I run the model and try to make predictions:
model.rf <- randomForest(as.factor(train[,1]) ~ ., data=train,ntree=1000,mtry=10,importance=TRUE)
pred.rf <- predict(model.rf, test, type = "prob")
rfpred <- c(1:22773)
rfpred[pred.rf[,1]<=0.5] <- "yes"
rfpred[pred.rf[,1]>0.5] <- "no"
rfpred <- factor(rfpred)
test[,1][test[,1]==1] <- "yes"
test[,1][test[,1]==0] <- "no"
test[,1] <- factor(test[,1])
confusionMatrix(as.factor(rfpred), as.factor(test$Y))
What I get is the following output:
> print(model.rf)
Call:
randomForest(formula = as.factor(train[, 1]) ~ ., data = train, ntree = 1000, mtry = 10, importance = TRUE)
Type of random forest: classification
Number of trees: 1000
No. of variables tried at each split: 10
OOB estimate of error rate: 0%
Confusion matrix:
0 1 class.error
0 43093 0 0
1 0 25225 0
> head(pred.rf)
0 1
45868.1 1 0
112445 1 0
39001 1 0
133443 1 0
137460 1 0
125835.1 1 0
> confusionMatrix(as.factor(rfpred), as.factor(test$Y))
Confusion Matrix and Statistics
Reference
Prediction no yes
no 14570 0
yes 0 8203
Accuracy : 1
95% CI : (0.9998, 1)
No Information Rate : 0.6398
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 1
Mcnemar's Test P-Value : NA
Sensitivity : 1.0000
Specificity : 1.0000
Pos Pred Value : 1.0000
Neg Pred Value : 1.0000
Prevalence : 0.6398
Detection Rate : 0.6398
Detection Prevalence : 0.6398
Balanced Accuracy : 1.0000
'Positive' Class : no
Obviously the model cannot be this accurate! What's wrong with my code?
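One possibility worth checking, offered as a hedged note rather than a confirmed diagnosis: with the formula as.factor(train[,1]) ~ . and data=train, the dot may expand to every column of train, including the response column SeriousDlqin2yrs itself, so the forest can simply read the label back. The sketch below reproduces that target-leakage effect in Python with scikit-learn (assumed available); it is an illustration of the general phenomenon, not a rewrite of the R code.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Sketch: when the label leaks into the feature matrix, accuracy looks perfect.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 1000)                    # random binary labels
X_noise = rng.normal(size=(1000, 5))            # uninformative features
X_leaky = np.column_stack([X_noise, y])         # the label sneaks in as a feature

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_leaky, y)
print(clf.score(X_leaky, y))                    # ~1.0: the forest just reads the label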
