Under what circumstances would Julia allocate memory to single digits? - memory

Suppose I write this function
function test_function(T)
    c = 1
    d = 31
    q = 321
    b = 32121
    a = 10
    for i in 1:T
        c = d + q + b + a
    end
end
There is no memory allocation. However, in my own code I wrote a similar loop, and there I encounter a huge amount of memory allocation. I can't share the entirety of my code, but when I run with --track-allocation=user I see the following results:
    80000     q = 3
        -     p = 0.1
        -     p_2 = 3
        -     q_2 = .2
        -
   240000     r = p - p_2 + q_2 - q;
The code above is in a for loop. This is just strange to me - why would Julia ever allocate memory to single digits?
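For reference, one common way this symptom shows up (this is an assumption, not necessarily what happens in the real code) is when the values used in the loop are non-const globals: every read is then type-unstable and the intermediate results get boxed on the heap. A minimal sketch:
p = 0.1; p_2 = 3; q_2 = 0.2; q = 3          # non-const globals

function global_loop(T)
    r = 0.0
    for i in 1:T
        r = p - p_2 + q_2 - q               # reads globals -> allocates every iteration
    end
    return r
end

function local_loop(T)
    p = 0.1; p_2 = 3; q_2 = 0.2; q = 3      # locals -> type-stable, no allocations
    r = 0.0
    for i in 1:T
        r = p - p_2 + q_2 - q
    end
    return r
end

global_loop(1); local_loop(1)               # compile first
println(@allocated global_loop(10_000))     # nonzero
println(@allocated local_loop(10_000))      # 0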

Related

Training NN with Julia's Flux - Loss function with derivative of output and functions of output

I want to train this NN whose input is time over some interval. There are no labels, and the loss function requires the derivatives of the outputs and a specified function (H in my code), which is itself a function of the outputs. I believe my loss function is not set up properly yet. I would also like to see how the loss decreases, to see how close I am getting to the actual function, but I can't find a way to watch how the loss progresses.
Here is my new code:
using Flux, Zygote, ForwardDiff
##Data
t=vcat(0:0.1:4)
##Problem parameters
α = 2; C = 1; β = 0.5; P = 1; π₀ = 0.5
#Initial and final conditions
x₀ = 0.5
p₄ = 1
t₀ = 0
t𝔣 = 4
#Hidden layer length
len_hidden=5
X = Chain(Dense(1,len_hidden),Dense(len_hidden,1,relu))
x(t) = (t - t₀)*X([t])[1] + x₀
dxdt(t) = ForwardDiff.derivative(x,t)
Ρ = Chain(Dense(1,len_hidden),Dense(len_hidden,1,relu))
p(t) = p₄ + (t - t𝔣)*Ρ([t])[1]
dpdt(t) = ForwardDiff.derivative(p,t)
U = Chain(Dense(1,len_hidden),Dense(len_hidden,1,relu))
u(t) = U([t])[1]
Θ = Flux.params(X,Ρ,U)
H(x,p,u) = α*u*x - C*u^2 + p*β*x*(1 - x)*(P*u - π₀)
#Partials
dHdx(t) = α*u(t) + p(t)*(1 - x(t))*β*(P*u(t) - π₀) - p(t)*x(t)*(P*u(t) - π₀)
dHdp(t) = (1 - x(t))*x(t)*β*(P*u(t) - π₀)
dHdu(t) = α*x(t) - 2*C*u(t) + P*p(t)*β*x(t)*(1 - x(t))
#Loss function
function loss(t)
    return (-dxdt(t) + dHdp(t))^2 + (dpdt(t) + dHdx(t))^2 + (dHdu(t))^2
end
opt = Descent()
parameters = Θ
data = t
Flux.train!(loss, parameters, data, opt, cb = () -> println("Training"))
Is the way I wrote the loss function correct? For each time instant in my data vector, am I computing the loss with the updated value of each function inside loss()? So far, x(0) matches the imposed initial condition, but it stays constant for all other time instants, which makes me think the loss is not being evaluated and minimized over time while taking into account the evolution of all the other functions.
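On the point of watching the loss progress: with the Flux.train!/Flux.params style API used above, one option (a sketch, assuming loss, t, Θ, and opt as defined above; the epoch count is arbitrary) is a throttled callback that evaluates the total loss over the data:
total_loss() = sum(loss, t)                                                # total loss over all time points
evalcb = Flux.throttle(() -> println("total loss = ", total_loss()), 1)    # print at most once per second
for epoch in 1:100
    Flux.train!(loss, Θ, t, opt, cb = evalcb)
end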

Minimizing memory usage in Julia function

This function is a workhorse which I want to optimize. Any ideas on how its memory usage can be reduced would be great.
function F(len, rNo, n, ratio = 0.5)
    s = zeros(len); m = copy(s); d = copy(s);
    s[rNo] = 1
    rNo ≤ len-1 && (m[rNo + 1] = s[rNo + 1] = -n[rNo])
    rNo > 1 && (m[rNo - 1] = s[rNo - 1] = n[rNo - 1])
    r = 1
    while true
        for i ∈ 2:len-1
            d[i] = (n[i]*m[i+1] - n[i-1]*m[i-1])/(r+1)
        end
        d[1] = n[1]*m[2]/(r+1);
        d[len] = -n[len-1]*m[len-1]/(r+1);
        for i ∈ 1:len
            s[i] += d[i]
        end
        sum(abs.(d))/sum(abs.(m)) < ratio && break #converged
        m = copy(d); r += 1
    end
    return reshape(s, 1, :)
end
It calculates rows of a special matrix exponential which I stack later.
Although the full method is quite a bit faster than the built-in exp thanks to the special properties, it takes up far more memory as measured by @time.
Since I am a noob at memory management and also at Julia, I am sure it can be optimized quite a bit.
Am I doing something obviously wrong?
I think most of your allocations come from sum(abs.(d))/sum(abs.(m)) < ratio && break #converged. If you replace it with sum(abs, d)/sum(abs, m) < ratio && break #converged those allocations should go away. (It will also be a speed boost.)
Your other allocations can be removed by replacing m = copy(d) with m .= d, which does an element-wise copy.
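A quick way to see the difference (an illustrative sketch on an arbitrary vector, not your data):
d = randn(10_000)
f_broadcast(d) = sum(abs.(d))      # materializes the temporary vector abs.(d)
f_functional(d) = sum(abs, d)      # applies abs element by element, no temporary
f_broadcast(d); f_functional(d)    # compile first
println(@allocated f_broadcast(d))   # roughly 80 kB for the temporary
println(@allocated f_functional(d))  # 0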
There are also a couple of style things where I think you could make this a nicer function to read and use. My changes would be as follows
function F(rNo, v, ratio = 0.5)
    len = length(v)
    s = zeros(len+1); m = copy(s); d = copy(s);
    s[rNo] = 1
    rNo ≤ len && (m[rNo + 1] = s[rNo + 1] = -v[rNo])
    rNo > 1 && (m[rNo - 1] = s[rNo - 1] = v[rNo - 1])
    r = 1
    while true
        for i ∈ 2:len
            d[i] = (v[i]*m[i+1] - v[i-1]*m[i-1]) / (r+1)
        end
        d[1] = v[1]*m[2]/(r+1);
        d[end] = -v[end]*m[end-1]/(r+1);
        s .+= d
        sum(abs, d)/sum(abs, m) < ratio && break #converged
        m .= d; r += 1
    end
    return reshape(s, 1, :)
end
The most notable change is removing len from the arguments. Including an array length argument is common in C (and probably other languages) where finding the length of an array is hard, but in Julia length is cheap (O(1)), and extra arguments are just more clutter and confusion for the people using the function. I also made use of the fact that Julia turns s[end] into s[length(s)], which makes this a little cleaner. Also, in general when using Julia you should look for ways to use dotted (broadcast) operations rather than writing for loops. The for loops will be fast, but why take three lines to do what you can do in one shorter line? (I also renamed n to v since, to me, n is a number and v is a vector, but that is pure preference.)
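For instance, the explicit summation loop and the broadcast assignment perform the same in-place update (a toy sketch):
s = zeros(5); d = ones(5)
for i in eachindex(s)    # three-line version
    s[i] += d[i]
end
s .+= d                  # equivalent fused, in-place one-liner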
I hope this helps.

Runtime of while loop pseudocode

I have some pseudocode for which I'm trying to do a detailed analysis: determine the cost of each line, the total runtime, and an asymptotic bound:
sum = 0
i = 1
while (i ≤ n) {
    sum = sum + i
    i = 2i
}
return sum
My assignment requires that I write the cost/runtime for every line, add these together, and find a Big-Oh notation for the runtime. My analysis looks like this for the moment:
sum = 0              1
long i = 1           1
while (i ≤ n) {      log n + 1
    sum = sum + i    n log n
    i = 2i           n log n
}
return sum           1
                     => 2 n log n + log n + 4 = O(n log n)
Is this correct? Also, should I use n^2 for the while loop instead?
Because i doubles on every iteration, the loop body executes only floor(lg(n)) + 1 times, so the runtime is
O(floor(lg(n)) + 1) = O(lg(n)) = O(log n).
Each line inside the body costs O(1) per iteration, so its total contribution is on the order of log n, not n log n.
Let's step through your pseudocode. Consider the case n = 5.
iteration#   i   lg(i)   n
--------------------------
     1       1     0     5
     2       2     1     5
     3       4     2     5
By inspection we see that
iteration# = lg(i) + 1
So in summary:
sum = 0           // O(1)
i = 1             // O(1)
while (i ≤ n) {   // O(floor(lg(n)) + 1) iterations
    sum = sum + i // 1 flop + 1 mem op = O(1) per iteration
    i = 2i        // 1 flop + 1 mem op = O(1) per iteration
}
return sum        // 1 mem op = O(1)
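If you want to convince yourself numerically, here is a small check (written in Julia, purely illustrative) that the loop body runs floor(lg(n)) + 1 times:
function count_iterations(n)
    i = 1
    iters = 0
    while i <= n
        iters += 1
        i *= 2
    end
    return iters
end

for n in (1, 5, 16, 1000)
    println(n, " => ", count_iterations(n), " vs ", floor(Int, log2(n)) + 1)
end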

confused about a pair of recursion relations

Thanks in advance for your help in figuring this out. I'm taking an algorithms class and I'm stuck on something. According to the professor, the following holds, where C(1) = 1 and n is a power of 2:
C(n) = 2 * C(n/2) + n resolves to C(n) = n * lg(n) + n
C(n) = 2 * C(n/2) + lg(n) resolves to C(n) = 3 * n - lg(n) - 2
The first one I completely grok. As I understand the form, C(n) splits into two sub-problems, each of which requires C(n/2) work to solve, plus an additional n amount of work to split and merge everything. As such, after k levels of splitting the leading 2 becomes 2^k, the 2 in n/2 likewise becomes 2^k, and the final n gets multiplied by k, because each of the k levels contributes n extra work.
My confusion stems from the second relation. Given that the first and second relations are almost identical, why isn't the result of the second something like nlgn+(lgn^2)?
The general result is the Master Theorem
But in this specific case, you can work out the math for a power of 2:
C(2^k)
  = 2 * C(2^(k-1)) + lg(2^k)
  = 4 * C(2^(k-2)) + 2 * lg(2^(k-1)) + lg(2^k)
  = ... repeat ...
  = 2^k * C(1) + sum (from i=1 to k) 2^(k-i) * lg(2^i)
  = 2^k + sum (from i=1 to k) 2^(k-i) * i        [since lg(2^i) = i]
  = 2^k + 2^(k+1) - k - 2
  = 3 * 2^k - k - 2
  = 3 * n - lg(n) - 2
Intuitively, the extra lg term per call is so small that, once weighted by the geometric growth in the number of subproblems, it only adds Θ(n) total work rather than Θ(n lg n); most of the cost sits in the many small subproblems near the leaves.
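A quick numerical check of that closed form (a Julia sketch, just to verify the algebra):
# C(n) = 2*C(n/2) + lg(n), C(1) = 1, for n a power of 2
C(n) = n == 1 ? 1 : 2 * C(n ÷ 2) + log2(n)

for k in 0:6
    n = 2^k
    println(n, ": recurrence = ", C(n), ", closed form = ", 3*n - log2(n) - 2)
end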

Incrementation in Lua

I am playing a little bit with Lua.
I came across the following code snippet that has unexpected behavior:
a = 3;
b = 5;
c = a-- * b++; // some computation
print(a, b, c);
Lua runs the program without any error but does not print 2 6 15 as expected. Why?
-- starts a single line comment, like # or // in other languages.
So it's equivalent to:
a = 3;
b = 5;
c = a
Lua doesn't increment and decrement with ++ and --; -- will instead start a comment.
There is no -- or ++ in Lua,
so you have to use a = a + 1 or a = a - 1 or something like that.
If you want 2 6 15 as the output, try this code:
a = 3
b = 5
c = a * b
a = a - 1
b = b + 1
print(a, b, c)
This will give
3 5 3
because the 3rd line will be evaluated as c = a.
Why? Because in Lua, comments start with --. Therefore, c = a-- * b++; // some computation is evaluated as two parts:
expression: c = a
comment: * b++; // some computation
There are a few problems in your Lua code:
a = 3;
b = 5;
c = a-- * b++; // some computation
print(a, b, c);
One, Lua does not currently support ++/-- incrementation. You have to write the updates out explicitly:
c = a * b
a = a - 1
b = b + 1
print(a, b, c)
Two, -- in Lua starts a comment, so using a-- just translates to a, and the comment is * b++; // some computation.
Three, // does not work in Lua; use -- for comments.
Also, it's optional to use ; at the end of every line.
You can do the following:
local default = 0
local max = 100
while default < max do
default = default + 1
print(default)
end
EDIT: Using SharpLua (Lua in C#), incrementing/decrementing can be done in shorthand like so:
a+=1 --increment by some value
a-=1 --decrement by some value
In addition, multiplication/division can be done like so:
a*=2 --multiply by some value
a/=2 --divide by some value
The same method can be used if adding, subtracting, multiplying or dividing one variable by another, like so:
a+=b
a-=b
a/=b
a*=b
This is much simpler and tidier and I think a lot less complicated, but not everybody will share my view.
Hope this helps!
