Genetic Algorithm timeseries forcast creating an initial population - time-series

I am building a genetic algorithm that does a time series forecast in the symbolic regression analysis. I’m trying to get the algorithm to find an equation that will match the underlying trend of the data. (predict monthly beer sales)
The idea is to use lisp like expressions, which writes the equation in a tree. This allows for branch swapping in the crossover/mating stage.
5* (5 +5)
Written as:
X = '(mul 5 (add 5 5))'
Y = parser(X)
y = ['mul', 5, ['add', 5, 5]]
I want to know how to create an initial population set where the individuals represent different expressions automatically. Where there “fitness” is related to how well each equation matches the underlying trend.
For example, one individual could be: '(add 100 (mul x (sin (mul x 3))))'
Where x is time in months.
How do I automatically generate expressions for my population? I have no idea how to do this, any help would be very appreciated.

You can easily solve this problem with recursion and a random number generator random() which returns a (pseudo-)random float between 0 and 1. Here is some pseudocode:
randomExp() {
// Choose a function(like mul or add):
func = getRandomFunction() // Just choose one of your functions randomly.
arg1 = ""
rand1 = random()
// Choose the arguments. You may choose other percentages here depending how deep you want it to be and how many 'x' you want to have.
if(rand1 < 0.2)
arg1 = randomExp() // Here add a new expression
else if(rand1 < 0.5)
arg1 = "x"
else
arg1 = randomConstant() // Get a random constant in a predefined range.
// Do the same for the second argument:
arg2 = ""
…
…
// Put everything together and return it:
return "("+func+" "+arg1+" "+arg2+")"
}
You might want to also limit the recursion depth, as this might return you a theoretically infinitely long expression.

Related

How to randomly get a value from a table [duplicate]

I am working on programming a Markov chain in Lua, and one element of this requires me to uniformly generate random numbers. Here is a simplified example to illustrate my question:
example = function(x)
local r = math.random(1,10)
print(r)
return x[r]
end
exampleArray = {"a","b","c","d","e","f","g","h","i","j"}
print(example(exampleArray))
My issue is that when I re-run this program multiple times (mash F5) the exact same random number is generated resulting in the example function selecting the exact same array element. However, if I include many calls to the example function within the single program by repeating the print line at the end many times I get suitable random results.
This is not my intention as a proper Markov pseudo-random text generator should be able to run the same program with the same inputs multiple times and output different pseudo-random text every time. I have tried resetting the seed using math.randomseed(os.time()) and this makes it so the random number distribution is no longer uniform. My goal is to be able to re-run the above program and receive a randomly selected number every time.
You need to run math.randomseed() once before using math.random(), like this:
math.randomseed(os.time())
From your comment that you saw the first number is still the same. This is caused by the implementation of random generator in some platforms.
The solution is to pop some random numbers before using them for real:
math.randomseed(os.time())
math.random(); math.random(); math.random()
Note that the standard C library random() is usually not so uniformly random, a better solution is to use a better random generator if your platform provides one.
Reference: Lua Math Library
Standard C random numbers generator used in Lua isn't guananteed to be good for simulation. The words "Markov chain" suggest that you may need a better one. Here's a generator widely used for Monte-Carlo calculations:
local A1, A2 = 727595, 798405 -- 5^17=D20*A1+A2
local D20, D40 = 1048576, 1099511627776 -- 2^20, 2^40
local X1, X2 = 0, 1
function rand()
local U = X2*A2
local V = (X1*A2 + X2*A1) % D20
V = (V*D20 + U) % D40
X1 = math.floor(V/D20)
X2 = V - X1*D20
return V/D40
end
It generates a number between 0 and 1, so r = math.floor(rand()*10) + 1 would go into your example.
(That's multiplicative random number generator with period 2^38, multiplier 5^17 and modulo 2^40, original Pascal code by http://osmf.sscc.ru/~smp/)
math.randomseed(os.clock()*100000000000)
for i=1,3 do
math.random(10000, 65000)
end
Always results in new random numbers. Changing the seed value will ensure randomness. Don't follow os.time() because it is the epoch time and changes after one second but os.clock() won't have the same value at any close instance.
There's the Luaossl library solution: (https://github.com/wahern/luaossl)
local rand = require "openssl.rand"
local randominteger
if rand.ready() then -- rand has been properly seeded
-- Returns a cryptographically strong uniform random integer in the interval [0, n−1].
randominteger = rand.uniform(99) + 1 -- randomizes an integer from range 1 to 100
end
http://25thandclement.com/~william/projects/luaossl.pdf

Can Z3 call python function during decision making of variables?

I am trying to solve a problem, for example I have a 4 point and each two point has a cost between them. Now I want to find a sequence of nodes which total cost would be less than a bound. I have written a code but it seems not working. The main problem is I have define a python function and trying to call it with in a constraint.
Here is my code: I have a function def getVal(n1,n2): where n1, n2 are Int Sort. The line Nodes = [ Int("n_%s" % (i)) for i in range(totalNodeNumber) ] defines 4 points as Int sort and when I am adding a constraint s.add(getVal(Nodes[0], Nodes[1]) + getVal(Nodes[1], Nodes[2]) < 100) then it calls getVal function immediately. But I want that, when Z3 will decide a value for Nodes[0], Nodes[1], Nodes[2], Nodes[3] then the function should be called for getting the cost between to points.
from z3 import *
import random
totalNodeNumber = 4
Nodes = [ Int("n_%s" % (i)) for i in range(totalNodeNumber) ]
def getVal(n1,n2):
# I need n1 and n2 values those assigned by Z3
cost = random.randint(1,20)
print cost
return IntVal(cost)
s = Solver()
#constraint: Each Nodes value should be distinct
nodes_index_distinct_constraint = Distinct(Nodes)
s.add(nodes_index_distinct_constraint)
#constraint: Each Nodes value should be between 0 and totalNodeNumber
def get_node_index_value_constraint(i):
return And(Nodes[i] >= 0, Nodes[i] < totalNodeNumber)
nodes_index_constraint = [ get_node_index_value_constraint(i) for i in range(totalNodeNumber)]
s.add(nodes_index_constraint)
#constraint: Problem with this constraint
# Here is the problem it's just called python getVal function twice without assiging Nodes[0],Nodes[1],Nodes[2] values
# But I want to implement that - Z3 will call python function during his decission making of variables
s.add(getVal(Nodes[0], Nodes[1]) + getVal(Nodes[1], Nodes[2]) + getVal(Nodes[2], Nodes[3]) < 100)
if s.check() == sat:
print "SAT"
print "Model: "
m = s.model()
nodeIndex = [ m.evaluate(Nodes[i]) for i in range(totalNodeNumber) ]
print nodeIndex
else:
print "UNSAT"
print "No solution found !!"
If this is not a right way to solve the problem then could you please tell me what would be other alternative way to solve it. Can I encode this kind of problem to find optimal sequence of way points using Z3 solver?
I don't understand what problem you need to solve. Definitely, the way getVal is formulated does not make sense. It does not use the arguments n1, n2. If you want to examine values produced by a model, then you do this after Z3 returns from a call to check().
I don't think you can use a python function in your SMT logic. What you could alternatively is define getVal as a Function like this
getVal = Function('getVal',IntSort(),IntSort(),IntSort())
And constraint the edge weights as
s.add(And(getVal(0,1)==1,getVal(1,2)==2,getVal(0,2)==3))
The first two input parameters of getVal represent the node ids and the last integer represents the weight.

if input is nth term in fibonacci series, finding n

in fibonacci series let's assume nth fibonacci term is T. F(n)=T. but i want to write a a program that will take T as input and return n that means which term is it in the series( taken that T always will be a fibonacci number. )i want to find if there lies an efficient way to find it.
The easy way would be to simply start generating Fibonacci numbers until F(i) == T, which has a complexity of O(T) if implemented correctly (read: not recursively). This method also allows you to make sure T is a valid Fibonacci number.
If T is guaranteed to be a valid Fibonacci number, you can use approximation rules:
Formula
It looks complicated, but it's not. The point is: from a certain point on, the ratio of F(i+1)/F(i) becomes a constant value. Since we're not generating Fibonacci Numbers but are merely finding the "index", we can drop most of it and just realize the following:
breakpoint := f(T)
Any f(i) where i > T = f(i-1)*Ratio = f(T) * Ratio^(i-T)
We can get the reverse by simply taking Log(N, R), R being Ratio. By adjusting for the inaccuracy for early numbers, we don't even have to select a breakpoint (if you do: it's ~ correct for i > 17).
The Ratio is, approximately, 1.618034. Taking the log(1.618034) of 6765 (= F(20)), we get a value of 18.3277. The accuracy remains the same for any higher Fibonacci numbers, so simply rounding down and adding 2 gives us the exact Fibonacci "rank" (provided that F(1) = F(2) = 1).
The first step is to implement fib numbers in a non-recursive way such as
fib1=0;fib2=1;
for(i=startIndex;i<stopIndex;i++)
{
if(fib1<fib2)
{
fib1+=fib2;
if(fib1=T) return i;
if(fib1>T) return -1;
}
else
{
fib2+=fib1;
if(fib2=T) return i;
if(fib2>t) return -1;
}
}
Here startIndex would be set to 3 stopIndex would be set to 10000 or so. To cut down in the iteration, you can also select 2 seed number that are sequential fib numbers further down the sequence. startIndex is then set to the next index and do the computation with an appropriate adjustment to the stopIndex. I would suggest breaking the sequence up in several section depending on machine performance and the maximum expected input to minimize the run time.

Sequence constructed from the previous element of the Sequence and another Sequence

For learning purposes I am trying out running a simulation as a sequence with F#. Starting from a sequence of random numbers, map is a straightforward way to generate a sequence of states if the states do not depend on the previous states. Where I run into a problem is when I try to do something like:
State(i+1) = F (State(i), random number)
I managed to get something working by using unfold, passing in the random generator along the lines of
let unfold (day:State,rnd:Random) =
let rand = rnd.NextDouble()
let nextDay = NextState day rand
Some (nextDay, (nextDay, rnd))
However, at least to my inexperienced eyes, something about passing around the Random instance seems fishy. Is there a way to achieve something similar but passing in a sequence of random numbers, rather than the generator?
I think your hunch about passing around a Random instance as being fishy is fair: when mutable state is useful it's a good idea to isolate it, so that you benifit from purity as much as possible.
We can isolate the state here by creating a sequence which yields a different set of random numbers upon each iteration
open System
let rndSeq =
seq {
//note that by putting rnd inside seq expression here, we ensure that each iteration of the sequence
//yields a different sequnce of random numbers
let rnd = new Random()
while true do yield rnd.NextDouble()
}
then, you can use Seq.scan to iterate the random sequence by mapping elements using a function which is informed by the previous element which was mapped.
let runSimulation inputSeq initialState =
inputSeq
|> Seq.scan
(fun (previousState:State) (inputElement:float) -> NextState previousState inputElement)
initialState
runSimulation rndSeq initialState //run the simulation using a random sequence of doubles greater than or equal to 0.0 and less than 1
You can see as an added bonus here that your simulated input and simulation implementation are no longer bound together, you can run your simulation using any input sequence.
I'd agree with BrokenGlass that using a global Random instance feels allright in this case. This is a reasonably localized use of mutable state, so it shouldn't be confusing.
As an alternative to unfold, you can consider writing the computation explicitly:
let simulationStates =
let rnd = new Random()
let rec generate (day:State) = seq {
let rand = rnd.NextDouble()
let nextDay = NextState day rand
yield nextDay
yield! generate nextDay }
generate InitialState
Note that the rnd value is local variable with a scope limited only to the definition of simulationStates. This is quite nice way to keep mutable state separate from the rest of the program.
The version using unfold is probably more succinct; this one may be easier to read, so it depends on your personal style preferences.
Might be against the spirit, but I would just use a global Random instance in this case - alternatively you could define a sequence of random numbers like this:
let randomNumbers =
seq {
let rnd = new Random();
while true do
yield rnd.NextDouble();
}

Lua base converter

I need a base converter function for Lua. I need to convert from base 10 to base 2,3,4,5,6,7,8,9,10,11...36 how can i to this?
In the string to number direction, the function tonumber() takes an optional second argument that specifies the base to use, which may range from 2 to 36 with the obvious meaning for digits in bases greater than 10.
In the number to string direction, this can be done slightly more efficiently than Nikolaus's answer by something like this:
local floor,insert = math.floor, table.insert
function basen(n,b)
n = floor(n)
if not b or b == 10 then return tostring(n) end
local digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
local t = {}
local sign = ""
if n < 0 then
sign = "-"
n = -n
end
repeat
local d = (n % b) + 1
n = floor(n / b)
insert(t, 1, digits:sub(d,d))
until n == 0
return sign .. table.concat(t,"")
end
This creates fewer garbage strings to collect by using table.concat() instead of repeated calls to the string concatenation operator ... Although it makes little practical difference for strings this small, this idiom should be learned because otherwise building a buffer in a loop with the concatenation operator will actually tend to O(n2) performance while table.concat() has been designed to do substantially better.
There is an unanswered question as to whether it is more efficient to push the digits on a stack in the table t with calls to table.insert(t,1,digit), or to append them to the end with t[#t+1]=digit, followed by a call to string.reverse() to put the digits in the right order. I'll leave the benchmarking to the student. Note that although the code I pasted here does run and appears to get correct answers, there may other opportunities to tune it further.
For example, the common case of base 10 is culled off and handled with the built in tostring() function. But similar culls can be done for bases 8 and 16 which have conversion specifiers for string.format() ("%o" and "%x", respectively).
Also, neither Nikolaus's solution nor mine handle non-integers particularly well. I emphasize that here by forcing the value n to an integer with math.floor() at the beginning.
Correctly converting a general floating point value to any base (even base 10) is fraught with subtleties, which I leave as an exercise to the reader.
you can use a loop to convert an integer into a string containting the required base. for bases below 10 use the following code, if you need a base larger than that you need to add a line that mapps the result of x % base to a character (usign an array for example)
x = 1234
r = ""
base = 8
while x > 0 do
r = "" .. (x % base ) .. r
x = math.floor(x / base)
end
print( r );

Resources