Generating unique solutions for N-Queens - n-queens

I am trying to write a code to generate unique solutions for N-Queens. I am able to generate correct solution for N = 4 and 6 (but not for other cases of N) and have eliminated the cases of generation of non-distinct solutions due to rotation by 180 and 90 degrees. The problem lies in my understanding of how to eliminate other redundant cases.
For example, for N = 5, I should get 2 unique solutions but I am generating 3. The 2 redundant solutions, out of which one is to be eliminated are (the 3rd solution is correct):
Q---- Q----
--Q-- ---Q-
----Q -Q---
-Q--- ----Q
---Q- --Q--

Related

Z3PY extremely slow with many variables?

I have been working with the optimizer in Z3PY, and only using Z3 ints and (x < y)-like constraints in my project. It has worked really well. I have been using up to 26 variables (Z3 ints), and it takes the solver about 5 seconds to find a solution and I have maybe 100 soft constraints, at least. But now I tried with 49 variables, and it does not solve it at all (I shut it down after 1 hour).
So I made a little experiment to find out what was slowing it down, is it the amount of variables or the amount of soft constraints? It seems like the bottle neck is the amount of variables.
I created 26 Z3-ints. Then I added as hard constraints, that it should not be lower than 1 or more than 26. Also, all numbers must be unique. No other constraints was added at all.
In other words, the solution that the solver will find is a simple order [1,2,3,4,5....up to 26]. Ordered in a way that the solver finds out.
I mean this is a simple thing, there are really no constraints except those I mentioned. And the solver solves this in 0.4 seconds or something like that, fast and sufficient. Which is expected. But if I increase the amount of variables to 49 (and of course the constraints now are that it should not be lower than 1 or more than 49), it takes the solver about 1 minute to solve. That seems really slow for such a simple task? Should it be like this, anybody knows? The time complexity is really extremely increased?
(I know that I can use Solver() instead of Optimizer() for this particular experiment, and it will be solved within a second, but in reality I need it to be done with Optimizer since I have a lot of soft constraints to work with.)
EDIT: Adding some code for my example.
I declare an array with Z3 ints that I call "reqs".
The array is consisting of 26 variables in one example and 49 in the other example I am talking about.
solver = Optimize()
for i in (reqs):
solver.add(i >= 1)
for i in (reqs):
solver.add(i <= len(reqs))
d = Distinct(reqs)
solver.add(d)
res = solver.check()
print(res)
Each benchmark is unique, and it's impossible to come up with a good strategy that applies equally well in all cases. But the scenario you describe is simple enough to deal with. The performance problem comes from the fact that Distinct creates too many inequalities (quadratic in number) for the solver, and the optimizer is having a hard time dealing with them as you increase the number of variables.
As a rule of thumb, you should avoid using Distinct if you can. For this particular case, it'd suffice to impose a strict ordering on the variables. (Of course, this may not always be possible depending on your other constraints, but it seems what you're describing can benefit from this trick.) So, I'd code it like this:
from z3 import *
reqs = [Int('i_%d' % i) for i in range(50)]
solver = Optimize()
for i in reqs:
solver.add(i >= 1, i <= len(reqs))
for i, j in zip(reqs, reqs[1:]):
solver.add(i < j)
res = solver.check()
print(res)
print(solver.model())
When I run this, I get:
$ time python a.py
sat
[i_39 = 40,
i_3 = 4,
...
i_0 = 1,
i_2 = 3]
python a.py 0.27s user 0.09s system 98% cpu 0.365 total
which is pretty snippy. Hopefully you can generalize this to your original problem.

"Contrast Error" message with lsmeans Tukey Test on GLM

I have defined a generalised linear model as follows:
glm(formula = ParticleCount ~ ParticlePresent + AlgaePresent +
ParticleTypeSize + ParticlePresent:ParticleTypeSize + AlgaePresent:ParticleTypeSize,
family = poisson(link = "log"), data = PCB)
and I have the below significant interactions
Df Deviance AIC LRT Pr(>Chi)
<none> 666.94 1013.8
ParticlePresent:ParticleTypeSize 6 680.59 1015.4 13.649 0.033818 *
AlgaePresent:ParticleTypeSize 6 687.26 1022.1 20.320 0.002428 **
I am trying to proceed with a posthoc test (Tukey) to compare the interaction of ParticleTypeSize using the lsmeans package. However, I get the following message as soon as I proceed:
library(lsmeans)
leastsquare=lsmeans(glm.particle3,~ParticleTypeSize,adjust ="tukey")
Error in `contrasts<-`(`*tmp*`, value = contrasts.arg[[nn]]) :
contrasts apply only to factors
I've checked whether ParticleTypeSize is a valid factor by applying:
l<-sapply(PCB,function(x)is.factor(x))
l
Sample AlgaePresent ParticlePresent ParticleTypeSize
TRUE FALSE FALSE TRUE
ParticleCount
FALSE
I'm stumped and unsure as to how I can rectify this error message. Any help would be much appreciated!
That error happens when the variable you specify is not a factor. You tested and found that it is, so that's a mystery and all I can guess is that the data changed since you fit the model. So try re-fitting the model with the present dataset.
All that said, I question what you are trying to do. First, you have ParticleTypeSize interacting with two other predictors, which means it is probably not advisable to look at marginal means (lsmeans) for that factor. The fact that there are interactions means that the pattern of those means changes depending on the values of the other variables.
Second, are AlgaePresent and ParticlePresent really numeric variables? By their names, they seem like they ought to be factors. If they are really indicators (0 and 1), that's OK, but it is still cleaner to code them as factors if you are using functions like lsmeans where factors and covariates are treated in distinctly different ways.
BTW, the lsmeans package is being deprecated, and new developments are occurring in its successor, the emmeans package.

Does Z3 have support for optimization problems

I saw in a previous post from last August that Z3 did not support optimizations.
However it also stated that the developers are planning to add such support.
I could not find anything in the source to suggest this has happened.
Can anyone tell me if my assumption that there is no support is correct or was it added but I somehow missed it?
Thanks,
Omer
If your optimization has an integer valued objective function, one approach that works reasonably well is to run a binary search for the optimal value. Suppose you're solving the set of constraints C(x,y,z), maximizing the objective function f(x,y,z).
Find an arbitrary solution (x0, y0, z0) to C(x,y,z).
Compute f0 = f(x0, y0, z0). This will be your first lower bound.
As long as you don't know any upper-bound on the objective value, try to solve the constraints C(x,y,z) ∧ f(x,y,z) > 2 * L, where L is your best lower bound (initially, f0, then whatever you found that was better).
Once you have both an upper and a lower bound, apply binary search: solve C(x,y,z) ∧ 2 * f(x,y,z) > (U - L). If the formula is satisfiable, you can compute a new lower bound using the model. If it is unsatisfiable, (U - L) / 2 is a new upper-bound.
Step 3. will not terminate if your problem does not admit a maximum, so you may want to bound it if you are not sure it does.
You should of course use push and pop to solve the succession of problems incrementally. You'll additionally need the ability to extract models for intermediate steps and to evaluate f on them.
We have used this approach in our work on Kaplan with reasonable success.
Z3 currently does not support optimization. This is on the TODO list, but it has not been implemented yet. The following slide decks describe the approach that will be used in Z3:
Exact nonlinear optimization on demand
Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals
The library for computing with infinitesimals has already been implemented, and is available in the unstable (work-in-progress) branch, and online at rise4fun.

Finding the largest prime factor of 600851475143 [duplicate]

This question already has answers here:
Project Euler #3 in Ruby solution times out
(2 answers)
Closed 9 years ago.
I'm trying to use a program to find the largest prime factor of 600851475143. This is for Project Euler here: http://projecteuler.net/problem=3
I first attempted this with this code:
#Ruby solution for http://projecteuler.net/problem=2
#Prepared by Richard Wilson (Senjai)
#We'll keep to our functional style of approaching these problems.
def gen_prime_factors(num) # generate the prime factors of num and return them in an array
result = []
2.upto(num-1) do |i| #ASSUMPTION: num > 3
#test if num is evenly divisable by i, if so add it to the result.
result.push i if num % i == 0
puts "Prime factor found: #{i}" # get some status updates so we know there wasn't a crash
end
result #Implicit return
end
#Print the largest prime factor of 600851475143. This will always be the last value in the array so:
puts gen_prime_factors(600851475143).last #this might take a while
This is great for small numbers, but for large numbers it would take a VERY long time (and a lot of memory).
Now I took university calculus a while ago, but I'm pretty rusty and haven't kept up on my math since.
I don't want a straight up answer, but I'd like to be pointed toward resources or told what I need to learn to implement some of the algorithms I've seen around in my program.
There's a couple problems with your solution. First of all, you never test that i is prime, so you're only finding the largest factor of the big number, not the largest prime factor. There's a Ruby library you can use, just require 'prime', and you can add an && i.prime? to your condition.
That'll fix inaccuracy in your program, but it'll still be slow and expensive (in fact, it'll now be even more expensive). One obvious thing you can do is just set result = i rather than doing result.push i since you ultimately only care about the last viable i you find, there's no reason to maintain a list of all the prime factors.
Even then, however, it's still very slow. The correct program should complete almost instantly. The key is to shrink the number you're testing up to, each time you find a prime factor. If you've found a prime factor p of your big number, then you don't need to test all the way up to the big number anymore. Your "new" big number that you want to test up to is what's left after dividing p out from the big number as many times as possible:
big_number = big_number/p**n
where n is the largest integer such that the right hand side is still a whole number. In practice, you don't need to explicitly find this n, just keep dividing by p until you stop getting a whole number.
Finally, as a SPOILER I'm including a solution below, but you can choose to ignore it if you still want to figure it out yourself.
require 'prime'
max = 600851475143; test = 3
while (max >= test) do
if (test.prime? && (max % test == 0))
best = test
max = max / test
else
test = test + 2
end
end
puts "Here's your number: #{best}"
Exercise: Prove that test.prime? can be eliminated from the if condition. [Hint: what can you say about the smallest (non-1) divisor of any number?]
Exercise: This algorithm is slow if we instead use max = 600851475145. How can it be improved to be fast for either value of max? [Hint: Find the prime factorization of 600851475145 by hand; it's easy to do and it'll make it clear why the current algorithm is slow for this number]

Genetic Algorithms - Crossover and Mutation operators for paths

I was wondering if anyone knew any intuitive crossover and mutation operators for paths within a graph? Thanks!
Question is a bit old, but the problem doesn't seem to be outdated or solved, so I think my research still might be helpful for someone.
As far as mutation and crossover is quite trivial in the TSP problem, where every mutation is valid (that is because chromosome represents an order of visiting fixed nodes - swapping order then always can create a valid result), in case of Shortest Path or Optimal Path, where the chromosome is a exact route representation, this doesn't apply and isn't that obvious. So here is how I approach problem of solving Optimal Path using GA.
For crossover, there are few options:
For routes that have at least one common point (besides start and end node) - find all common points and swap subroutes in the place of crossing
Parent 1: 51 33 41 7 12 91 60
Parent 2: 51 9 33 25 12 43 15 60
Potential crossing point are 33 and 12. We can get following children: 51 9 33 41 7 12 43 15 60 and 51 33 25 12 91 60 that are the result of crossing using both of these crossing points.
When two routes don't have common point, select randomly two points from each parent and connect them (you can use for that either random traversal, backtracking or heuristic search like A* or beam search). Now this path may be treated as crossover path. For better understanding, see below picture of two crossover methods:
see http://i.imgur.com/0gDTNAq.png
Black and gray paths are parents, pink and orange paths are
children, green point is a crossover place, and red points are start
and end nodes. First graph shows first type of crossover, second graph is example of another one.
For mutation, there are also few options. Generally, dummy mutation like swapping order of nodes or adding random node is really ineffective for graphs with average density. So here are the approaches that guarantee valid mutations:
Take randomly two points from path and replace them with a random path between those two nodes.
Chromosome: 51 33 41 7 12 91 60 , random points: 33 and 12, random/shortest path between then: 33 29 71 12, mutated chromosome: 51 33 29 71 12 91 60
Find random point from path, remove it and connect its neighbours (really very similar to the first one)
Find random point from path and find random path to its neighbour
Try subtraversing the path from some randomly chosen point, until reaching any point on the initial route (slight modification of the first method).
see http://i.imgur.com/19mWPes.png
Each graph corresponds to each mutation method in appropriate order. In last example, the orange path is the one that would replace original path between mutation points (green nodes).
Note: this methods obviously may have performance drawback in the case, when finding alternative subroute (using a random or heuristic method) will stuck at some place or find very long and useless subpath, so consider bounding the time of mutation execution or trials number.
For my case, which is finding an optimal path in terms of maximizing sum of vertices weights while keeping sum of nodes weight less than given bound, those methods are quite effective and give a good result. Should you have any question, feel free to ask. Also, sorry for my MS Paint skills ;)
Update
One big hint: I basically used this approach in my implementation, but there was one big drawback of using random path generating. I decided to switch to semi-random route generation using shortest path traversing randomly picked point(s) - it is much more efficent (but obviously may not be applicable for all problems).
Emm.. That is very difficult question, people write dissertations for that and still there is no right answer to that.
The general rule is "it all depends on your domain".
There are some generic GA libraries that will do some work for you, but for the best results it is recommended to implement your GA operations yourself, specifically for your domain.
You might have more luck with answers on Theoretical CS, but you need to expand your question more and add more details about your task and domain.
Update:
So you have a graph. In GA terms, a path through the graph represents an individual, nodes in the path would be chromosomes.
In that case I would say a mutation can be represented as deviation of the path somewhere from the original - one of the nodes is moved somewhere, and the path is adjusted so the start and end values in the path are remaining the same.
Mutation can lead to invalid individuals. And in that case you need to make a decision: allow invalid ones and hope that they will converge to some unexplored solution. Or kill them on the spot. When I was working with GA, I did allow invalid solution, adding "Unfitness" value along with fitness. Some researchers suggest this can help with broad exploring of the solution space.
Crossover can only happen to the paths that are crossing each other: on the point of the crossing, swap the remains of the path with the parents.
Bear in mind that there are various ways for crossover: individuals can be crossed-over in multiple points or just in one. In the case with graphs you can have multiple crossing points, and that can naturally lead to the multiple children graphs.
As I said before, there is no right or wrong way of doing this, but you will find out the best way only by experimenting on it.

Resources