How do I set up the timestep when using DifferentialEquations.jl in Julia for an irregular time series?

I am playing with the harmonic oscillator, where the differential equation is driven by a regular time series
w_i sampled in the millisecond range.
using DifferentialEquations
ζ = 1/4pi # damping ratio
function oscillator!(du,u,p,t)
    du[1] = u[2] # y'(t) = z(t)
    du[2] = -2*ζ*p(t)*u[2] - p(t)^2*u[1] # z'(t) = -2ζw(t)z(t) - w(t)^2y(t)
end
y0 = 0.0 # initial position
z0 = 0.0002 # initial speed
u0 = [y0, z0] # initial state vector
tspan = (0.0,10) # time interval
dt = 0.001 # timestep
w = t -> freq[Int(floor(t/dt))+1] # time series
prob = ODEProblem(oscillator!,u0,tspan,w) # define ODEProblem
sol = solve(prob,DP5(),adaptive=false,dt=0.001)
How do I set up the timestep when the parameter w_i is an irregular time series in the millisecond range?
date │ w
────────────────────────┼───────
2022-09-26T00:00:00.023 │ 4.3354
2022-09-26T00:00:00.125 │ 2.34225
2022-09-26T00:00:00.383 │ -2.0312
2022-09-26T00:00:00.587 │ -0.280142
2022-09-26T00:00:00.590 │ 6.28319
2022-09-26T00:00:00.802 │ 9.82271
2022-09-26T00:00:00.906 │ -5.21289
....................... | ........

While it's possible to disable adaptivity, and even if it was possible to force arbitrary step sizes, this isn't in general what you want to do, as it limits the accuracy of the solution greatly.
Instead, interpolate the parameter to let it take any value of t.
Fortunately, it's really simple to do!
using Interpolations
...
ts = [0, 0.1, 0.4, 1.0]
ws = [1.0, 2.0, 3.0, 4.0]
w = linear_interpolation(ts, ws)
tspan = first(ts), last(ts)
prob = ODEProblem(oscillator!, u0, tspan, w)
sol = solve(prob, DP5(), dt=0.001)
Of course, it doesn't need to be a linear interpolation.
If you still need the solution saved at particular time points, have a look at saveat for solve. E.g. saving the solution using ts used in the interpolation:
sol = solve(prob, DP5(), dt=0.001, saveat=ts)
Edit: Follow up on comment:
Mathematically, you are always making some assumption about w(t) over the entire domain tspan. There is no such thing as "driven by a time series".
For example, the standard Runge-Kutta method you have chosen here will require that the ODE function is evaluated at h/2. For the better DP5() it is evaluated at several more sub-steps. This is of course unavoidable, regardless of whether adaptivity is used or not.
Try adding println(t) into your ODE function and you will see this.
In case someone comes from MATLAB's ode45, note that it still uses adaptivity, and just treats explicit time steps the same way saveat does. And, of course, it will evaluate the function at various t outside of the explicit steps as well.
So even in your first example, you are interpolating your w. You are making a strange type of constant_interpolation (but with floor, which combined with floats will cause other issues, since floor(n*dt/dt) might evaluate to n or n-1.).
And even if you were to pick a method that only will try to evaluate at exactly the predetermined time steps, say e.g. ExplicitEuler(), you are still implicitly making the same assumption that w(t) is constant up until the next time step.
Only now, you are also getting a much worse solution from just the ODE integration.
If a constant-previous type interpolation really is how w(t) is defined over the entire domain (which is what you did with floor(t/dt) here), then what we have is:
w = extrapolate(interpolate((ts,), ws, Gridded(Constant{Previous}())), Flat())
There is simply no mathematical way to ignore what happens across the time step, and there is no reason to limit the time-stepping to the sample points of our "load" function. Doing so is neither natural nor correct in any mathematical sense.
u'(t) has to be defined on the entire domain we integrate over.
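To make the two assumptions about w(t) concrete, here is a minimal plain-Python sketch (not the Interpolations.jl API) of a constant-previous lookup and a linear interpolation over irregular sample times. The sample values are the ones from the interpolation example above; the function names `w_previous` and `w_linear` and the flat behaviour outside the sampled range are assumptions made for illustration.

```python
from bisect import bisect_right

# Irregular sample times and values, reused from the interpolation example.
ts = [0.0, 0.1, 0.4, 1.0]
ws = [1.0, 2.0, 3.0, 4.0]


def w_previous(t):
    """Constant-previous lookup: value of the latest sample at or before t."""
    i = bisect_right(ts, t) - 1
    i = min(max(i, 0), len(ts) - 1)  # flat extrapolation outside the samples
    return ws[i]


def w_linear(t):
    """Linear interpolation between neighbouring samples, flat outside them."""
    if t <= ts[0]:
        return ws[0]
    if t >= ts[-1]:
        return ws[-1]
    i = bisect_right(ts, t) - 1
    frac = (t - ts[i]) / (ts[i + 1] - ts[i])
    return ws[i] + frac * (ws[i + 1] - ws[i])
```

Either function is defined for every t in the integration interval, which is what the solver needs. Looking the sample up by comparing timestamps also avoids the floor(t/dt) rounding problem mentioned above.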


How to properly reset the ContinuousState in a class derived from LeafSystem?

I want to write a continuous-time system derived from LeafSystem that can have its continuous state reset to other values if some conditions are met. However, the system does not work as I expected. To find out the reason, I implemented a simple multi-step integrator system as below:
class MultiStepIntegrator(LeafSystem):
    def __init__(self):
        LeafSystem.__init__(self)
        self.state_index = self.DeclareContinuousState(1)
        self.DeclareStateOutputPort("x", self.state_index)
        self.flag_1 = True
        self.flag_2 = True

    def reset_state(self, context, value):
        state = context.get_mutable_continuous_state_vector()
        state.SetFromVector(value)

    def DoCalcTimeDerivatives(self, context, derivatives):
        t = context.get_time()
        if t < 2.0:
            V = [1]
        elif t < 4.0:
            if self.flag_1:
                self.reset_state(context, [0])
                print("Have done the first reset")
                self.flag_1 = False
            V = [1]
        else:
            if self.flag_2:
                self.reset_state(context, [0])
                print("Have done the second reset")
                self.flag_2 = False
            V = [-1]
        derivatives.get_mutable_vector().SetFromVector(V)
What I expect from this system is a piecewise, discontinuous trajectory. Given that I set the state initially to 0, the state will first go from 0 to 2 for $t \in [0,2]$, then again from 0 to 2 for $t \in [2,4]$, and then from 0 to -2 for $t \in [4,6]$.
Then I simulate this system, and plot the logging with
builder = DiagramBuilder()
plant, scene_graph = AddMultibodyPlantSceneGraph(builder, 1e-4)
plant.Finalize()
integrator = builder.AddSystem(MultiStepIntegrator())
state_logger = LogVectorOutput(integrator.get_output_port(), builder, 1e-2)
diagram = builder.Build()
simulator = Simulator(diagram)
context = simulator.get_context()  # root context, needed by FindLog below
simulator.AdvanceTo(6.0)  # run over t in [0, 6], matching the expected trajectory
log_state = state_logger.FindLog(context)
fig = plt.figure()
t = log_state.sample_times()
plt.plot(t, log_state.data()[0, :])
fig.set_size_inches(10, 6)
plt.tight_layout()
It seems that the resets never happen. However I do see the two logs indicating that the resets are done:
Have done the first reset
Have done the second reset
What happened here? Are there checks done behind the scenes so that the ContinuousState cannot jump (as the name indicates)? How can I reset the state value when some conditions are met?
Thank you very much for your help!
In DoCalcTimeDerivatives, the context is a const (input-only) argument. It cannot be modified. The only thing DoCalcTimeDerivatives can do is output the derivative to enable the integrator to integrate the continuous state.
Not all integrators use fixed-size time steps. Some might need to evaluate the gradients multiple times before deciding what step size(s) to use. Therefore, it's not reasonable for a dx/dt calculation to have any side effects. It must be a pure function, whose only consequence is to report a dx/dt.
To change a continuous state value other than through pure integration, the System needs to use an "unrestricted update" event. That event can mutate any and all elements of the State (including continuous state).
If the timing of the discontinuities is periodic (even if some events make no change to the state), you can use DeclarePeriodicUnrestrictedUpdateEvent to declare the update calculation.
If the discontinuities happen per a witness function, see bouncing_ball or rimless_wheel or compass_gait for an example.
If you need a generalized (bespoke) triggering schedule for the discontinuity events, you'll need to override DoCalcNextUpdateTime to manually inject the next event timing, something like the LcmSubscriberSystem. We don't have many good examples of this to my knowledge.
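The split between a pure derivative and a separate state-changing event can be illustrated without Drake. The following is a minimal plain-Python sketch, not the Drake API: `derivative`, `simulate`, the forward-Euler loop, and the fixed reset times are all assumptions made for illustration. The derivative function only reports dx/dt, and the jumps are applied between integration steps, which is the role an unrestricted update event plays.

```python
def derivative(t, x):
    """Pure function: reports dx/dt and never modifies the state."""
    return 1.0 if t < 4.0 else -1.0


def simulate(t_end=6.0, dt=0.01, reset_times=(2.0, 4.0)):
    """Forward-Euler loop with resets applied between integration steps."""
    n_steps = round(t_end / dt)
    reset_steps = {round(tr / dt) for tr in reset_times}
    x = 0.0
    times, states = [0.0], [x]
    for k in range(1, n_steps + 1):
        # Integrate one step using only the derivative's return value.
        x += dt * derivative((k - 1) * dt, x)
        # The discontinuous jump is handled by the event, not the derivative.
        if k in reset_steps:
            x = 0.0
        times.append(k * dt)
        states.append(x)
    return times, states


times, states = simulate()
```

With these assumed settings the logged state ramps up to just under 2, drops to 0 at t = 2, ramps up again, drops to 0 at t = 4, and then ramps down to about -2 at t = 6, which is the trajectory the question was expecting.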

Genetic algorithm time series forecast: creating an initial population

I am building a genetic algorithm that does time series forecasting via symbolic regression. I'm trying to get the algorithm to find an equation that matches the underlying trend of the data (predicting monthly beer sales).
The idea is to use Lisp-like expressions, which write the equation as a tree. This allows for branch swapping in the crossover/mating stage.
5* (5 +5)
Written as:
X = '(mul 5 (add 5 5))'
Y = parser(X)
y = ['mul', 5, ['add', 5, 5]]
I want to know how to automatically create an initial population in which the individuals represent different expressions, and where their “fitness” is related to how well each equation matches the underlying trend.
For example, one individual could be: '(add 100 (mul x (sin (mul x 3))))'
Where x is time in months.
How do I automatically generate expressions for my population? I have no idea how to do this, any help would be very appreciated.
You can easily solve this problem with recursion and a random number generator random() which returns a (pseudo-)random float between 0 and 1. Here is some pseudocode:
randomExp() {
    // Choose a function (like mul or add):
    func = getRandomFunction() // Just choose one of your functions randomly.
    arg1 = ""
    rand1 = random()
    // Choose the arguments. You may choose other percentages here depending how deep you want it to be and how many 'x' you want to have.
    if(rand1 < 0.2)
        arg1 = randomExp() // Here add a new expression
    else if(rand1 < 0.5)
        arg1 = "x"
    else
        arg1 = randomConstant() // Get a random constant in a predefined range.
    // Do the same for the second argument:
    arg2 = ""
    …
    …
    // Put everything together and return it:
    return "("+func+" "+arg1+" "+arg2+")"
}
You might want to also limit the recursion depth, as this might return you a theoretically infinitely long expression.
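As one possible way to turn the pseudocode into something runnable with a depth limit, here is a hedged Python sketch. The function list, the integer constant range 1 to 10, the `max_depth` default, and the names `random_exp`, `random_arg`, and `nesting_depth` are assumptions made for illustration; only binary functions are used so that every expression has exactly two arguments.

```python
import random

FUNCTIONS = ["add", "mul"]  # assumed set of binary functions


def random_arg(rng, depth, max_depth):
    """Pick a sub-expression, the variable x, or a constant."""
    r = rng.random()
    if r < 0.2 and depth < max_depth:
        return random_exp(rng, depth + 1, max_depth)
    if r < 0.5:
        # Also reached when the depth limit blocks a new sub-expression.
        return "x"
    return str(rng.randint(1, 10))  # assumed constant range


def random_exp(rng, depth=1, max_depth=4):
    """Build a random Lisp-like expression string, at most max_depth deep."""
    func = rng.choice(FUNCTIONS)
    arg1 = random_arg(rng, depth, max_depth)
    arg2 = random_arg(rng, depth, max_depth)
    return f"({func} {arg1} {arg2})"


def nesting_depth(expr):
    """Maximum parenthesis nesting of an expression string."""
    depth = deepest = 0
    for ch in expr:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
    return deepest


# One individual per seed, so the initial population is reproducible.
population = [random_exp(random.Random(seed)) for seed in range(20)]
```

Raising the 0.2 threshold gives deeper trees on average, and `max_depth` caps them regardless of the random draws.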

Performing an "online" linear interpolation

I have a problem where I need to do a linear interpolation on some data as it is acquired from a sensor (it's technically position data, but the nature of the data doesn't really matter). I'm doing this now in matlab, but since I will eventually migrate this code to other languages, I want to keep the code as simple as possible and not use any complicated matlab-specific/built-in functions.
My implementation initially seems OK, but when checking my work against matlab's built-in interp1 function, it seems my implementation isn't perfect, and I have no idea why. Below is the code I'm using on a dataset already fully collected, but as I loop through the data, I act as if I only have the current sample and the previous sample, which mirrors the problem I will eventually face.
%make some dummy data
np = 109; %number of data points for x and y
x_data = linspace(3,98,np) + (normrnd(0.4,0.2,[1,np]));
y_data = normrnd(2.5, 1.5, [1,np]);
%define the query points the data will be interpolated over
qp = [1:100];
kk=2; %indexes through the data
cc = 1; %indexes through the query points
qpi = qp(cc); %qpi is the current query point in the loop
y_interp = qp*nan; %this will hold our solution
while kk<=length(x_data)
    kk = kk+1; %update the data counter
    %perform online interpolation
    if cc<length(qp)-1
        if qpi>=y_data(kk-1) %the query point, of course, has to be in-between the current value and the next value of x_data
            y_interp(cc) = myInterp(x_data(kk-1), x_data(kk), y_data(kk-1), y_data(kk), qpi);
        end
        if qpi>x_data(kk), %if the current query point is already larger than the current sample, update the sample
            kk = kk+1;
        else %otherwise, update the query point to ensure its in between the samples for the next iteration
            cc = cc + 1;
            qpi = qp(cc);
            %It is possible that if the change in x_data is greater than the resolution of the query
            %points, an update like the above wont work. In this case, we must lag the data
            if qpi<x_data(kk),
                kk=kk-1;
            end
        end
    end
end
%get the correct interpolation
y_interp_correct = interp1(x_data, y_data, qp);
%plot both solutions to show the difference
figure;
plot(y_interp,'displayname','manual-solution'); hold on;
plot(y_interp_correct,'k--','displayname','matlab solution');
leg1 = legend('show');
set(leg1,'Location','Best');
ylabel('interpolated points');
xlabel('query points');
Note that the "myInterp" function is as follows:
function yi = myInterp(x1, x2, y1, y2, qp)
    %linearly interpolate the function value y(x) over the query point qp
    yi = y1 + (qp-x1) * ( (y2-y1)/(x2-x1) );
end
And here is the plot showing that my implementation isn't correct :-(
Can anyone help me find where the mistake is? And why? I suspect it has something to do with ensuring that the query point is in-between the previous and current x-samples, but I'm not sure.
The problem in your code is that you at times call myInterp with a value of qpi that is outside of the bounds x_data(kk-1) and x_data(kk). This leads to invalid extrapolation results.
Your logic of looping over kk rather than cc is very confusing to me. I would write a simple for loop over cc, which are the points at which you want to interpolate. For each of these points, advance kk, if necessary, such that qp(cc) is in between x_data(kk) and x_data(kk+1) (you can use kk-1 and kk instead if you prefer, just initialize kk=2 to ensure that kk-1 exists, I just find starting at kk=1 more intuitive).
To simplify the logic here, I'm limiting the values in qp to be inside the limits of x_data, so that we don't need to test that x_data(kk+1) exists, nor that x_data(1)<qp(cc). You can add those tests if you wish.
Here's my code:
qp = [ceil(x_data(1)+0.1):floor(x_data(end)-0.1)];
y_interp = qp*nan; % this will hold our solution
kk=1; % indexes through the data
for cc=1:numel(qp)
    % advance kk to where we can interpolate
    % (this loop is guaranteed to not index out of bounds because x_data(end)>qp(end),
    % but needs to be adjusted if this is not ensured prior to the loop)
    while x_data(kk+1) < qp(cc)
        kk = kk + 1;
    end
    % perform online interpolation
    y_interp(cc) = myInterp(x_data(kk), x_data(kk+1), y_data(kk), y_data(kk+1), qp(cc));
end
As you can see, the logic is a lot simpler this way. The result is identical to y_interp_correct. The inner while x_data... loop serves the same purpose as your outer while loop, and would be the place where you read your data from wherever it's coming from.
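Since the code is meant to be ported to other languages and run on data as it arrives, the same idea can be sketched in Python as a generator that keeps only the previous sample. This is an illustrative sketch, not a drop-in replacement for interp1: the name `online_interp` is hypothetical, and it assumes strictly increasing sample positions and sorted query points. Query points outside the sampled range produce no output.

```python
def online_interp(samples, query_points):
    """Yield (q, y) pairs as samples arrive, keeping only the previous sample.

    samples: iterable of (x, y) with strictly increasing x (assumed).
    query_points: iterable of sorted query positions (assumed).
    """
    queries = iter(query_points)
    q = next(queries, None)
    prev = None
    for x, y in samples:
        if prev is not None:
            x0, y0 = prev
            # Emit every pending query that the new sample now brackets.
            while q is not None and q <= x:
                if q >= x0:
                    yield q, y0 + (q - x0) * (y - y0) / (x - x0)
                # Queries before the first sample are skipped without output.
                q = next(queries, None)
        prev = (x, y)


result = list(online_interp([(0, 0), (2, 4), (3, 3)], [-1, 0.5, 2, 2.5, 4]))
```

Here the outer for loop plays the role of reading the next sensor sample, and the inner while loop advances through the query points that can be answered with the data seen so far.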

How to randomly get a value from a table [duplicate]

I am working on programming a Markov chain in Lua, and one element of this requires me to uniformly generate random numbers. Here is a simplified example to illustrate my question:
example = function(x)
    local r = math.random(1,10)
    print(r)
    return x[r]
end
exampleArray = {"a","b","c","d","e","f","g","h","i","j"}
print(example(exampleArray))
My issue is that when I re-run this program multiple times (mash F5) the exact same random number is generated resulting in the example function selecting the exact same array element. However, if I include many calls to the example function within the single program by repeating the print line at the end many times I get suitable random results.
This is not my intention as a proper Markov pseudo-random text generator should be able to run the same program with the same inputs multiple times and output different pseudo-random text every time. I have tried resetting the seed using math.randomseed(os.time()) and this makes it so the random number distribution is no longer uniform. My goal is to be able to re-run the above program and receive a randomly selected number every time.
You need to run math.randomseed() once before using math.random(), like this:
math.randomseed(os.time())
From your comment, you saw that the first number is still the same. This is caused by the implementation of the random generator on some platforms.
The solution is to pop some random numbers before using them for real:
math.randomseed(os.time())
math.random(); math.random(); math.random()
Note that the standard C library random() is usually not very uniformly random. If your platform provides a better random generator, use that instead.
Reference: Lua Math Library
The standard C random number generator used in Lua isn't guaranteed to be good for simulation. The words "Markov chain" suggest that you may need a better one. Here's a generator widely used for Monte Carlo calculations:
local A1, A2 = 727595, 798405 -- 5^17=D20*A1+A2
local D20, D40 = 1048576, 1099511627776 -- 2^20, 2^40
local X1, X2 = 0, 1
function rand()
    local U = X2*A2
    local V = (X1*A2 + X2*A1) % D20
    V = (V*D20 + U) % D40
    X1 = math.floor(V/D20)
    X2 = V - X1*D20
    return V/D40
end
It generates a number between 0 and 1, so r = math.floor(rand()*10) + 1 would go into your example.
(That's multiplicative random number generator with period 2^38, multiplier 5^17 and modulo 2^40, original Pascal code by http://osmf.sscc.ru/~smp/)
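For comparison, the same recurrence is shorter in a language with arbitrary-precision integers, because the split into 20-bit halves (needed in Lua to stay within exact floating-point range) is no longer required. The following Python sketch is an illustration only: the class name `MultiplicativeRng` and the helper `random_index` are hypothetical, and the default seed of 1 mirrors the initial `X1, X2 = 0, 1` above. The seed is assumed to be odd.

```python
MULTIPLIER = 5 ** 17  # same multiplier as the Lua code (D20*A1 + A2)
MODULUS = 2 ** 40


class MultiplicativeRng:
    """Multiplicative congruential generator x <- x * 5^17 mod 2^40."""

    def __init__(self, seed=1):
        # Assumption: the seed is odd, as with the Lua default state of 1.
        self.state = seed % MODULUS

    def rand(self):
        """Advance the state and return a float in [0, 1)."""
        self.state = (self.state * MULTIPLIER) % MODULUS
        return self.state / MODULUS


def random_index(rng, n):
    """Map a draw to an integer in 1..n, like math.floor(rand()*n) + 1."""
    return int(rng.rand() * n) + 1


rng = MultiplicativeRng()
first = rng.rand()
```

Like the Lua version, this always produces the same sequence for the same starting state, so a varying seed is still needed to get different output on each run.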
math.randomseed(os.clock()*100000000000)
for i=1,3 do
math.random(10000, 65000)
end
This always results in new random numbers, because changing the seed value changes the sequence. Don't rely on os.time(): it is the epoch time and only changes once per second, whereas os.clock() won't return the same value at two closely spaced instants.
There's the Luaossl library solution: (https://github.com/wahern/luaossl)
local rand = require "openssl.rand"
local randominteger
if rand.ready() then -- rand has been properly seeded
    -- rand.uniform(n) returns a cryptographically strong uniform random integer in the interval [0, n−1].
    randominteger = rand.uniform(100) + 1 -- random integer in the range 1 to 100
end
http://25thandclement.com/~william/projects/luaossl.pdf

Breaking A* admissibility caused exponential speed-up?

I've been working on a generalized version of the sliding tile puzzle where the tiles do not have numbers. Instead, each location either has a tile or a hole and is represented with a boolean as true or false (tile or hole).
The point of the search is to take an initial state with n tiles and a goal state with n target locations and use A* to find the solution of how to move the tiles so that every target location is populated. Here is an example below for a 4x3 grid:
Initial State:
T F T F
F F T F
F F T T
Goal State
T T T T
T F F F
F F F F
I had been working on different heuristics to do this and the most successful had a logic that went something like this:
int heuristicVal = 0
for every tile (i)...
    int closest = infinity
    for every goal location (j)...
        if (manhattan distance of ij < closest) closest = manhattan distance of ij
    end for
    heuristicVal += closest
end for
return heuristicVal
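The pseudocode above can be written out as a short Python sketch, evaluated here on the 4x3 example grids from the question. The names `tile_positions` and `heuristic` and the list-of-strings grid encoding are assumptions made for illustration.

```python
def tile_positions(grid):
    """Return (row, col) for every cell marked 'T' in a list of row strings."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row.split())
        if cell == "T"
    ]


def heuristic(state, goal):
    """Sum over tiles of the Manhattan distance to the closest goal location."""
    goals = tile_positions(goal)
    total = 0
    for tr, tc in tile_positions(state):
        total += min(abs(tr - gr) + abs(tc - gc) for gr, gc in goals)
    return total


initial_state = ["T F T F", "F F T F", "F F T T"]
goal_state = ["T T T T", "T F F F", "F F F F"]
h = heuristic(initial_state, goal_state)  # 0 + 0 + 1 + 2 + 2 = 5
```

In this example three different tiles pick (0, 2) as their closest goal, which is the situation the question describes: several tiles are guided toward the same target location.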
Unfortunately, this was still too slow in situations where two or more tiles were being guided by the heuristic to the same target location. I tried multiplying heuristicVal by the number of tiles and suddenly there was an exponential speed-up. Problems that were taking 28 seconds before were taking less than 1 second.
Edit: It turns out it is not always producing optimal solutions after all with this change. However, I don't understand why it sped up so much or why it is still finding the correct (although suboptimal) answer despite no longer being admissible.
If you break admissibility, A* no longer works correctly. Note that "no longer works correctly" doesn't mean you're never going to get an optimal result; you're just no longer guaranteed to get one. You can also end up converging faster on a solution, but what's the point if that solution is not the right one?
