NetLogo: histogram relative frequency - histogram

I'm still having problems with [histogram].
I have a global variable (age-sick) that stores the age of the turtles when they got sick, and I want to plot its distribution: histogram age-sick
However, I do not want the absolute number of turtles who got sick at each age, but the relative one.
Is there a way to do so?

I have tried to overcome the problem in the following way:
let age-freq (list)
let i 0
while [i <= max age-sick] [
  let a filter [? = i] age-sick
  repeat (length a / length age-sick * 1000) [ set age-freq lput i age-freq ]
  set i i + 1
]
histogram age-freq
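The arithmetic behind a relative-frequency histogram can be sketched outside NetLogo. The following is a minimal Python illustration, not NetLogo code; the function name relative_frequencies and the sample ages are hypothetical. It divides each age's count by the total number of recorded ages, which is the quantity the workaround above approximates by repeating each age in proportion to its share.

```python
from collections import Counter

def relative_frequencies(ages):
    """Return {age: share of all recorded ages}, with shares summing to 1."""
    if not ages:
        return {}
    counts = Counter(ages)      # absolute count per age
    total = len(ages)           # number of turtles that got sick
    return {age: n / total for age, n in sorted(counts.items())}

# Hypothetical sample: four turtles got sick at ages 1, 1, 2 and 3.
freqs = relative_frequencies([1, 1, 2, 3])
print(freqs)
```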

Related

Linear Interpolation - shrinking a line

Suppose we have a 1D array named Source that consists of 9 elements:
Source[0 to 8].
Using "Linear Interpolation" we want to shrink it into a smaller 4 point array: Destination [0 to 3].
This is how I understand the Algorithm:
Calculate the ratio between both array lengths: 9/4 = 2.5
Iterate over the destination coordinates and find the appropriate source coordinate:
Destination [0] = 0 * 2.5 = Source [0] -> Success! use this exact value.
Destination [1] = 1 * 2.5 = Source [2.5] -> No such element! Calculate the average of Source[2] and Source[3].
Destination [2] = 2 * 2.5 = Source [5] -> Success! use this exact value.
Destination [3] = 3 * 2.5 = Source [7.5] -> No such element! Calculate the average of Source[7] and Source[8].
Is this correct?
Almost correct. 9/4 = 2.25. ;-)
Anyway, if you want to preserve the endpoint values, you should calculate the ratio as (9-1)/(4-1) = 2.666... (Between the points 0, 1, 2, 3 there are only three segments, so the length equals 3. The same applies to 0...8, which has eight segments.)
If you don't hit an exact element, remember to compute a weighted mean, e.g.
Destination[1] = 1 * 2.667 -> (3-2.667)*Source[2] + (2.667-2)*Source[3]
This follows from the equation (valid here because x1 - x0 = 1),
y = y0(x1-x) + y1(x-x0)
where, in this case,
x=2.667
x0=2
x1=3
y0=Source[2]
y1=Source[3]
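The endpoint-preserving variant described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name shrink_linear and the sample data are hypothetical, and the input is assumed to be a list of numbers with at least two elements.

```python
def shrink_linear(source, n_out):
    """Resample `source` to `n_out` points by linear interpolation,
    keeping the first and last values unchanged."""
    if len(source) < 2 or n_out < 2:
        raise ValueError("need at least two source and two destination points")
    # Ratio of segment counts, e.g. (9 - 1) / (4 - 1) = 2.666...
    ratio = (len(source) - 1) / (n_out - 1)
    result = []
    for k in range(n_out):
        x = k * ratio                           # fractional source coordinate
        x0 = min(int(x), len(source) - 2)       # left neighbour, clamped at the end
        t = x - x0                              # distance from the left neighbour
        # Weighted mean: y = y0 * (x1 - x) + y1 * (x - x0), with x1 = x0 + 1
        result.append(source[x0] * (1 - t) + source[x0 + 1] * t)
    return result

out = shrink_linear([0, 10, 20, 30, 40, 50, 60, 70, 80], 4)
print(out)
```

For Destination[1] this evaluates (3-2.667)*Source[2] + (2.667-2)*Source[3], as in the worked line above.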

Algorithm to always sum sliders to 100% failing due to zeroes

This is (supposed to be) a function which makes sure that the sum of a number of sliders' values always adds up to globalTotal.
The user can manually change one slider's value to changer.value. Applying this function to each of the other sliders then determines that slider's new value, endVal.
It takes the startVal of the slider which needs changing and the original value of the slider that changed, changerStartVal, and determines the new value of the other slider by weighting.
My problem and question is this: sometimes remainingStartVals can be zero (when the slider being changed gets moved all the way to maximum) or startVal can be zero (when the slider being changed is moved to zero and then another slider is moved). When this happens I get a divide-by-zero or a multiply-by-zero respectively. Both are bad and lead to incorrect results. Is there an easy way to fix this?
func calcNewVal(startVal: Float, changerStartVal: Float) -> Float {
    let remainingStartVals = globalTotal - changerStartVal
    let remainingNewVals = globalTotal - changer.value
    let endVal = ((startVal * (100 / remainingStartVals)) / 100) * remainingNewVals
    return endVal
}
This is a mathematical problem, not a problem related to Swift or any specific programming language so I'll answer with mathematical formulas and explanations rather than code snippets.
I don't really understand your algorithm either. For example in this line:
let endVal = ((startVal * (100 / remainingStartVals)) / 100) * remainingNewVals
you first multiply by 100 and then divide by 100, so you could just leave all these 100 factors out in the first place!
However, I think I understand what you're trying to achieve and the problem is that there is no generic solution. Before writing an algorithm you have to define exactly how you want it to behave, including all edge cases.
Let's define:
vi as the value of the i-th slider and
Δi as the change of the i-th slider's value
Then you have to think of the following cases:
Case 1:
0 < vi ≤ 1 for all sliders (other than the one you changed)
This is probably the common case you were thinking about. In this case you want to adjust the values of your unchanged sliders so that their total change cancels the change Δchanged of the slider you changed. In other words, all changes must sum to zero:
∑i Δi = 0
If you have 3 sliders this reduces to:
Δ1 + Δ2 + Δ3 = 0
And if the slider that changed is the one with i = 1 then this requirement would read:
Δ1 = – (Δ2 + Δ3)
You want the sliders to adjust proportionally which means that this change Δ1 should not be distributed equally on the other sliders but depending on their current value:
Δ2 = – w2 * Δ1
Δ3 = – w3 * Δ1
The normed weight factors are
w2 = v2 / (v2 + v3)
w3 = v3 / (v2 + v3)
Thus we get:
Δ2 = – v2 / (v2 + v3) * Δ1
Δ3 = – v3 / (v2 + v3) * Δ1
So these are the formulas to apply in this particular case.
However, there are quite a few other cases that don't work with this approach:
Case 2:
vi = 0 for at least one, but not all of the sliders (other than the one you changed)
In this case the approach from case 1 would still work (plus it would be the logical thing to do). However, a slider's value would never change if it's zero. All of the change will be distributed over the sliders with a value > 0.
Case 3:
vi = 0 for all sliders (other than the one you changed)
In this case the proportional change doesn't work because there is simply no information how to distribute the change over the sliders. They're all zero! This is actually your zero division problem: In the case where we have 3 sliders and the slider 1 changes we'll get
v2 + v3 = 0
This is only another manifestation of the fact that the weight factors wi are simply undefined. Thus, you'll have to manually define what will happen in this case.
The most plausible thing to do in this case is to distribute the change evenly over all sliders:
Δi = – (1 / n) * Δ1
where n is the number of sliders (excluding the one that was changed!). With this logic, every slider gets "the same share" of the change.
Now that we're clear on our algorithm, you can implement these cases in code. Here is some pseudocode as an example:
if sum(valuesOfAllSlidersOtherThanTheSliderThatChanged) == 0 {
    for allUnchangedSliders {
        // distribute change evenly over the sliders
        Δi = – (1 / n) * Δ_changedSlider
    }
}
else {
    for allUnchangedSliders {
        // use weight factor to change proportionally
        Δi = – v_i / ∑(v_i) * Δ_changedSlider
    }
}
Please be aware that you must cache the values of the current state of your sliders at the beginning or (even better) first compute all the changes and then apply all the changes in a batch. Otherwise you will use a value v2' that you just computed for determining the value v3' which will obviously result in incorrect values.
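The two branches can be sketched as runnable Python. This is a minimal illustration, not the asker's Swift code; the function name redistribute and the example values are hypothetical. All new values are computed from the cached old values and returned as one batch, so no freshly computed value feeds into another slider's result.

```python
def redistribute(values, changed, new_value):
    """Return new slider values after slider `changed` moves to `new_value`,
    keeping the overall sum constant."""
    delta = new_value - values[changed]                  # change of the moved slider
    others = [i for i in range(len(values)) if i != changed]
    remaining = sum(values[i] for i in others)           # sum of the other sliders
    result = list(values)                                # work on a copy (batch update)
    result[changed] = new_value
    for i in others:
        if remaining == 0:
            # Case 3: all other sliders are zero, share the change evenly
            result[i] = values[i] - delta / len(others)
        else:
            # Cases 1 and 2: share the change in proportion to the old values
            result[i] = values[i] - values[i] / remaining * delta
    return result

proportional = redistribute([50.0, 30.0, 20.0], 0, 60.0)
even_split = redistribute([100.0, 0.0, 0.0], 0, 40.0)
print(proportional, even_split)
```

In the second call the other sliders start at zero, so the freed 60 points are split evenly between them.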
Hey @Sean, the simplest adjustment I can think of is to add two checks. First, check that remainingStartVals is not 0, which means that there are weights assigned to the other sliders. Second, check that the slider had a weight to begin with, which means its startVal is not 0.
func calcNewVal(startVal: Float, changerStartVal: Float) -> Float {
    // Default result when there is nothing to distribute proportionally
    var endVal: Float = 0
    let remainingStartVals = globalTotal - changerStartVal
    // Only divide when both the remaining sum and this slider's value are non-zero
    if remainingStartVals != 0 && startVal != 0 {
        let remainingNewVals = globalTotal - changer.value
        endVal = ((startVal * (100 / remainingStartVals)) / 100) * remainingNewVals
    }
    return endVal
}

How to export/convert line projection to excel table and order the Y coordinate

I wrote code that gets the line projection (intensity profile) of an image, and I would like to convert/export this line projection to an Excel table and then sort all the Y coordinates. For example, apart from the maximum and minimum of all the Y coordinates, I would like to know the 5 largest coordinate values and the smallest coordinate values.
Is there any code that can do this? Thanks,
image line_projection
Realimage imgexmp
imgexmp := GetFrontImage()
number samples = 256, xscale, yscale, xsize, ysize
GetSize( imgexmp, xsize, ysize )
line_projection := CreateFloatImage( "line projection", Xsize, 1 )
line_projection = 0
line_projection[icol,0] += imgexmp
line_projection /= samples
ShowImage( line_projection )
Finding a 'sorted' list of values
If you need to sort through large lists of values (i.e. large images), the following might not be very efficient. However, if your aim is to get the "x highest" values for a relatively small x, then the following code is just fine:
number nFind = 10
image test := GetFrontImage().ImageClone()
Result( "\n\n" + nFind + " highest values:\n" )
number x,y,v
For( number i=0; i<nFind; i++ )
{
    v = max(test,x,y)
    Result( "\t" + v + " at " + x + "\n" )
    test[x,y] = - Infinity()
}
The script works on a copy and "removes" each maximum it finds by overwriting that pixel with negative infinity. The max command is fast, even for large images, but the for-loop iteration and the setting of individual pixels are slow. Hence this script is too slow for a complete 'sorting' of the data if it is big, but it can quickly get you the n 'highest' values.
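The same "n highest values with their positions" idea can be sketched in Python. This is only an illustration, not DigitalMicrograph script; the function name n_largest_with_positions and the sample profile are hypothetical, and the profile is assumed to be a plain list of numbers.

```python
import heapq

def n_largest_with_positions(profile, n):
    """Return the n largest values of a 1D profile as (value, x) pairs,
    largest first, without sorting the whole profile."""
    return heapq.nlargest(n, ((value, x) for x, value in enumerate(profile)))

# Hypothetical intensity profile with distinct values
top_two = n_largest_with_positions([3.0, 9.0, 1.0, 7.0, 5.0], 2)
print(top_two)
```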
This is a non-coding answer:
If you have a LinePlot display in DigitalMicrograph, you can simply copy-paste it into Excel to get the numbers.
With the LinePlot image frontmost, press CTRL + C to copy
(make sure there are no ROIs on it).
Switch to Excel and press CTRL + V. Done.

logistic regression with gradient descent error

I am trying to implement logistic regression with gradient descent.
I record my cost function j_theta at every iteration, and fortunately j_theta decreases when I plot it against the number of iterations.
The data set I use is given below:
x=
1 20 30
1 40 60
1 70 30
1 50 50
1 50 40
1 60 40
1 30 40
1 40 50
1 10 20
1 30 40
1 70 70
y=
0
1
1
1
0
1
0
0
0
0
1
The code that I managed to write for logistic regression using Gradient descent is:
%1. The below code would load the data present in your desktop to the octave memory
x=load('stud_marks.dat');
%y=load('ex4y.dat');
y=x(:,3);
x=x(:,1:2);
%2. Now we want to add a column x0 with all the rows as value 1 into the matrix.
%First take the length
[m,n]=size(x);
x=[ones(m,1),x];
X=x;
% Now we limit the x1 and x2 we need to leave or skip the first column x0 because they should stay as 1.
mn = mean(x);
sd = std(x);
x(:,2) = (x(:,2) - mn(2))./ sd(2);
x(:,3) = (x(:,3) - mn(3))./ sd(3);
% We will not use vectorized technique, Because its hard to debug, We shall try using many for loops rather
max_iter=50;
theta = zeros(size(x(1,:)))';
j_theta=zeros(max_iter,1);
for num_iter=1:max_iter
  % We calculate the cost Function
  j_cost_each=0;
  alpha=1;
  theta
  for i=1:m
    z=0;
    for j=1:n+1
      % theta(j)
      z=z+(theta(j)*x(i,j));
      z
    end
    h= 1.0 ./(1.0 + exp(-z));
    j_cost_each=j_cost_each + ( (-y(i) * log(h)) - ((1-y(i)) * log(1-h)) );
    % j_cost_each
  end
  j_theta(num_iter)=(1/m) * j_cost_each;
  for j=1:n+1
    grad(j) = 0;
    for i=1:m
      z=(x(i,:)*theta);
      z
      h=1.0 ./ (1.0 + exp(-z));
      h
      grad(j) += (h-y(i)) * x(i,j);
    end
    grad(j)=grad(j)/m;
    grad(j)
    theta(j)=theta(j)- alpha * grad(j);
  end
end
figure
plot(0:max_iter-1, j_theta, 'b', 'LineWidth', 2)
hold off
figure
%3. In this step we will plot the graph for the given input data set just to see how is the distribution of the two class.
pos = find(y == 1); % This will take the postion or array number from y for all the class that has value 1
neg = find(y == 0); % Similarly this will take the position or array number from y for all class that has value 0
% Now we plot the graph column x1 Vs x2 for y=1 and y=0
plot(x(pos, 2), x(pos,3), '+');
hold on
plot(x(neg, 2), x(neg, 3), 'o');
xlabel('x1 marks in subject 1')
ylabel('y1 marks in subject 2')
legend('pass', 'Failed')
plot_x = [min(x(:,2))-2, max(x(:,2))+2]; % This min and max decides the length of the decision graph.
% Calculate the decision boundary line
plot_y = (-1./theta(3)).*(theta(2).*plot_x +theta(1));
plot(plot_x, plot_y)
hold off
%%%%%%% The only difference is In the last plot I used X where as now I use x whose attributes or features are featured scaled %%%%%%%%%%%
If you view the graph of x1 vs x2, it looks like this:
After I run my code I create a decision boundary. The shape of the decision line seems to be okay, but it is a bit displaced. The graph of x1 vs x2 with the decision boundary is given below:
![enter image description here][2]
Please tell me where I am going wrong.
Thanks :)
The new graph:
![enter image description here][1]
If you look at the new graph, the coordinates of the x axis have changed. That's because I use x (feature scaled) instead of X.
The problem lies in your cost function calculation and/or gradient calculation; your plotting function is fine. I ran your dataset through the algorithm I implemented for logistic regression, but using the vectorized technique, because in my opinion it is easier to debug.
The final values I got for theta were
theta =
[-76.4242,
0.8214,
0.7948]
I also used alpha = 0.3
I plotted the decision boundary and it looks fine. I would recommend using the vectorized form, as in my opinion it is easier to implement and to debug.
I also think your implementation of gradient descent is not quite correct. 50 iterations is just not enough, and the cost at the last iteration is not good enough. Maybe you should try to run it for more iterations with a stopping condition.
Also check this lecture for optimization techniques:
https://class.coursera.org/ml-006/lecture/37
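As a rough illustration of batch gradient descent for logistic regression, here is a minimal pure-Python sketch. It is not the answerer's implementation: the function names are hypothetical, the features are standardized as in the question, the marks and labels are copied from the dataset above, and alpha = 0.3 with 500 iterations is an assumed setting. The important detail is that the whole gradient is computed from the old theta before any component of theta is updated.

```python
import math
from statistics import mean, stdev

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def standardize(column):
    """Subtract the mean and divide by the sample standard deviation."""
    mu, sd = mean(column), stdev(column)
    return [(v - mu) / sd for v in column]

def fit_logistic(rows, labels, alpha=0.3, iterations=500):
    """Batch gradient descent; returns (theta, cost per iteration)."""
    m, n = len(rows), len(rows[0])
    theta = [0.0] * n
    costs = []
    eps = 1e-12  # keeps log() away from zero
    for _ in range(iterations):
        # Hypothesis for every sample, using the current (old) theta
        h = [sigmoid(sum(t * x for t, x in zip(theta, row))) for row in rows]
        costs.append(-sum(y * math.log(max(p, eps))
                          + (1 - y) * math.log(max(1 - p, eps))
                          for p, y in zip(h, labels)) / m)
        # Full gradient first, then one simultaneous update of all theta(j)
        grad = [sum((p - y) * row[j] for p, y, row in zip(h, labels, rows)) / m
                for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta, costs

marks1 = [20, 40, 70, 50, 50, 60, 30, 40, 10, 30, 70]
marks2 = [30, 60, 30, 50, 40, 40, 40, 50, 20, 40, 70]
labels = [0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1]
rows = [[1.0, a, b] for a, b in zip(standardize(marks1), standardize(marks2))]
theta, costs = fit_logistic(rows, labels)
print(theta, costs[0], costs[-1])
```

Because theta is fitted on standardized features here, its values are not directly comparable to a theta fitted on the raw marks.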

Scaling a number between two values

If I am given a floating point number but do not know beforehand what range the number will be in, is it possible to scale that number in some meaningful way to be in another range? I am thinking of checking to see if the number is in the range 0<=x<=1 and if not scale it to that range and then scale it to my final range. This previous post provides some good information, but it assumes the range of the original number is known beforehand.
You can't scale a number in a range if you don't know the range.
Maybe what you're looking for is the modulo operator. Modulo is basically the remainder of a division; the operator in most languages is %.
0 % 5 == 0
1 % 5 == 1
2 % 5 == 2
3 % 5 == 3
4 % 5 == 4
5 % 5 == 0
6 % 5 == 1
7 % 5 == 2
...
Indeed, it is not possible. You can define a range and ignore all values outside it. Or you can collect statistics to find the range at run time (e.g. via histogram analysis).
Is it really about image processing? There are lots of related problems in the image segmentation field.
You want to scale a single random floating point number to be between 0 and 1, but you don't know the range of the number?
What should 99.001 be scaled to? If the range of the random number was [99, 100], then our scaled-number should be pretty close to 0. If the range of the random number was [0, 100], then our scaled-number should be pretty close to 1.
In the real world, you always have some sort of information about the range (either the range itself, or how wide it is). Without further info, the answer is "No, it can't be done."
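The 99.001 example can be checked with a short Python sketch of ordinary linear rescaling. The function name rescale and its parameters are hypothetical; the point is that the input range must be passed in, and the result depends entirely on it.

```python
def rescale(x, in_min, in_max, out_min=0.0, out_max=1.0):
    """Map x linearly from [in_min, in_max] to [out_min, out_max]."""
    if in_max == in_min:
        raise ValueError("input range must have non-zero width")
    return out_min + (x - in_min) / (in_max - in_min) * (out_max - out_min)

near_zero = rescale(99.001, 99.0, 100.0)   # assumed range [99, 100]
near_one = rescale(99.001, 0.0, 100.0)     # assumed range [0, 100]
print(near_zero, near_one)
```

The same number lands close to 0 under the first assumed range and close to 1 under the second.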
I think the best you can do is something like this:
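double scale(double x) {
    // Compress everything below -1 into the interval (-2, -1)
    if (x < -1) return -2 - 1 / x;
    // Compress everything above 1 into the interval (1, 2)
    if (x > 1) return 2 - 1 / x;
    // Values in [-1, 1] pass through unchanged
    return x;
}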
This function is monotonic, and has a range of -2 to 2, but it's not strictly a scaling.
I am assuming that you have the result of some 2-dimensional measurements and want to display them in color or grayscale. For that, I would first want to find the maximum and minimum and then scale between these two values.
static double[][] scale(double[][] in, double outMin, double outMax) {
    // First pass: find the minimum and maximum of the input.
    double inMin = Double.POSITIVE_INFINITY;
    double inMax = Double.NEGATIVE_INFINITY;
    for (double[] inRow : in) {
        for (double d : inRow) {
            if (d < inMin)
                inMin = d;
            if (d > inMax)
                inMax = d;
        }
    }
    double inRange = inMax - inMin;
    double outRange = outMax - outMin;
    // Second pass: map each value linearly from [inMin, inMax] to [outMin, outMax].
    double[][] out = new double[in.length][];
    for (int i = 0; i < in.length; i++) {
        double[] inRow = in[i];
        double[] outRow = new double[inRow.length];
        for (int j = 0; j < inRow.length; j++) {
            double normalized = (inRow[j] - inMin) / inRange; // 0 .. 1
            outRow[j] = outMin + normalized * outRange;
        }
        out[i] = outRow; // store the scaled row in the result
    }
    return out;
}
This code is untested and just shows the general idea. It further assumes that all your input data is in a "reasonable" range, away from infinity and NaN.

Resources