I have several vectors and want to set constraints on their medians, where some of the medians are calculated over different subsets of a vector.
For example, given
age = IntVector('age', 10)
male = BoolVector('male', 10)
salary = IntVector('salary', NUM)
I want a salary median of 50 for all females older than 50, and an age median of 40 for all males with salary > 70.
I know how to filter out the relevant data:
And(male[i] == False, age[i] > 50)
I also know how to get the mean, e.g.:
Sum([If(And(male[i] == False, salary[i] > 50), age[i], 0) for i in range(10)]) / (10 - NUM_MALE) == 50
However, for the median I need a sorted list so that I can say something like:
(age[4] + age[5]) / 2 == MEDIAN
However, I cannot model a constraint that keeps both age and salary ordered at once, since person_1 will not necessarily be the youngest AND have the lowest salary.
So I would need a temporary ordering of all my vectors by either age or salary.
I recently came up with a slightly different approach from the one in the paper, which is to maintain a list of indices.
In your example, suppose we have a list of indices for the people who are female and have salary > 50:
indices = [0, 5, 7, 9]
Then adding mean and median constraints is relatively straightforward:
# adding mean constraint
Sum([age[i] for i in indices]) / 4 == MEAN
# adding median constraint
age[indices[1]] + age[indices[2]] == MEDIAN * 2
The next question is how to come up with the indices above. We can achieve that by adding some more constraints on the indices:
indices must be within range (e.g., 0 to 9)
indices must be distinct
indices must be in order
male[i] == False for i in indices
salary[i] > 50 for i in indices
Once all the constraints are in place, z3 will try to find the indices we need, and adding mean and median constraints becomes an easy task.
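To make the five index constraints concrete, here is a minimal plain-Python sketch that checks them on a fixed, made-up data set. It is not Z3 code: the data values, the helper name `valid_indices`, and the reading of "in order" as strictly increasing index values are all assumptions for illustration. In Z3 the same conditions would be asserted on symbolic integers instead of being evaluated.

```python
from statistics import mean, median

def valid_indices(indices, male, salary, n):
    """Check the index constraints listed above on a concrete candidate."""
    # indices must be within range
    if not all(0 <= i < n for i in indices):
        return False
    # indices must be distinct
    if len(set(indices)) != len(indices):
        return False
    # indices must be in order (assumed: strictly increasing)
    if not all(a < b for a, b in zip(indices, indices[1:])):
        return False
    # male[i] == False and salary[i] > 50 for every selected index
    return all((not male[i]) and salary[i] > 50 for i in indices)

# Hypothetical data for 10 people (not from the original post).
male = [False, True, True, True, True, False, True, False, True, False]
salary = [60, 40, 80, 30, 90, 70, 20, 55, 45, 65]
age = [30, 41, 52, 29, 33, 40, 47, 50, 38, 60]

indices = [0, 5, 7, 9]
ok = valid_indices(indices, male, salary, 10)
selected_ages = [age[i] for i in indices]
age_mean = mean(selected_ages)      # 45
age_median = median(selected_ages)  # 45.0, statistics.median sorts internally
```

Note that `statistics.median` sorts the selected ages itself, whereas the constraint form `age[indices[1]] + age[indices[2]]` only gives the median when the ages at those positions are already in sorted order.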
How long will it take for a program to print a truth table for n propositional symbols?
(Symbols: P1, P2, ..., Pn)
I can't seem to crack this question and am not quite sure how to calculate it.
It will take time proportional to at least n*2^n. Each of the propositional symbols can assume one of two values - true or false. The portion of the truth table that lists all possible assignments of n such variables must have at least 2 * 2 * … * 2 (n times) = 2^n rows of n entries each; and that's not even counting the subexpressions that make up the rest of the table. This lower bound is tight since we can imagine the proposition P1 and P2 and … and Pn and the following procedure taking time Theta(n*2^n) to write out the answer:
fill up P1's column with 2^(n-1) TRUE and then 2^(n-1) FALSE
fill up P2's column with 2^(n-2) TRUE and then 2^(n-2) FALSE, alternating
…
fill up Pn's column with 1 TRUE and 1 FALSE, alternating
fill up the last column with a TRUE at the top and FALSE all the way down
If you have more complicated propositions then you should probably take the number of subexpressions as another independent variable since that could have an asymptotically relevant effect (using n propositional symbols, you can have arbitrarily many unique subexpressions that must be given their own columns in a complete truth table).
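As a rough illustration of the n*2^n bound, the following Python sketch enumerates the table for the conjunction P1 and P2 and ... and Pn and counts the cells in the symbol columns. The function name `truth_table_rows` is made up for this example, and using `itertools.product` is just one convenient way to produce the rows in the TRUE-first order described above.

```python
from itertools import product

def truth_table_rows(n):
    """Yield each assignment of n symbols (TRUE first) plus the conjunction column."""
    for values in product([True, False], repeat=n):
        yield values + (all(values),)

n = 3
rows = list(truth_table_rows(n))

# Cells in the symbol columns only: n entries in each of the 2^n rows.
symbol_cells = sum(len(row) - 1 for row in rows)

for row in rows:
    print(" ".join("T" if v else "F" for v in row))
```

For n = 3 this produces 8 rows and 24 symbol cells, and only the first row has TRUE in the last column.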
From the figure we can see that Arsenal have won three matches consecutively, but I could not write the query for it.
Here is a query that should return the maximum number of consecutive wins for Arsenal:
MATCH (a:Club {name:'Arsenal FC'})-[r:played_with]-(:Club)
WITH ((CASE a.name WHEN r.home THEN 1 ELSE -1 END) * (TOINT(r.score[0]) - TOINT(r.score[1]))) > 0 AS win, r
ORDER BY TOINT(r.time)
RETURN REDUCE(s = {max: 0, curr: 0}, w IN COLLECT(win) |
  CASE WHEN w
    THEN {
      max: CASE WHEN s.max < s.curr + 1 THEN s.curr + 1 ELSE s.max END,
      curr: s.curr + 1}
    ELSE {max: s.max, curr: 0}
  END
).max AS result;
The WITH clause sets the win variable to true iff Arsenal won a particular game. Notice that the ORDER BY clause converts the time property to an integer, because the ordering of numeric strings does not work properly if the strings could be of different lengths (I am being a bit picky here, admittedly). The REDUCE function is used to calculate the maximum number of consecutive wins.
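For readers less familiar with REDUCE, the same fold can be sketched in Python. This is only an analogue of the Cypher logic, not something Neo4j runs: the function name `longest_win_streak` and the sample list of results are assumptions for illustration.

```python
from functools import reduce

def longest_win_streak(wins):
    """Fold over time-ordered win flags, tracking current and maximum streak."""
    def step(state, won):
        if won:
            curr = state["curr"] + 1
            return {"max": max(state["max"], curr), "curr": curr}
        # A loss or draw resets the current streak but keeps the maximum.
        return {"max": state["max"], "curr": 0}

    return reduce(step, wins, {"max": 0, "curr": 0})["max"]

# Hypothetical results, already sorted by match time.
results = [True, False, True, True, True, False, True]
streak = longest_win_streak(results)
```

The accumulator plays the role of `s` in the query, and each element of `results` plays the role of `w`.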
======
Finally, here are some suggestions for improving your data model:
It looks like your played_with relationship always points from the home team to the away team. If so, you can get rid of the redundant home and away properties, and you can also rename the relationship type to HOSTED to make the direction of the relationship clearer.
The scores and time should be stored as integers, not strings. That would make your queries more efficient, and easier to write and understand.
You could also consider splitting the scores property into two scalar properties, say homeScore and awayScore, which would make your code clearer. There seems to be no advantage to storing the scores in an array.
If you made all the above suggested changes, then you would just need to change the beginning of the above query to this:
MATCH (a:Club {name:'Arsenal FC'})-[r:HOSTED]-(:Club)
WITH ((CASE a WHEN STARTNODE(r) THEN 1 ELSE -1 END) * (r.homeScore - r.awayScore)) > 0 AS win, r
ORDER BY r.time
...
I have two range vectors (# of hits and misses) that I want to aggregate by their types. Some of the types have hits, others misses, and some have both. These are two independent metrics that I'm trying to get a union of, but the resulting vector doesn't make sense. It's missing some of the values, and I think it's because they have either all hits or all misses. Am I doing this completely the wrong way?
sum by (type) (increase(metric_hit{}[24h])) + sum by (type) (increase(metric_miss{}[24h]))
First off, it's recommended to always initialise all your potential label values to avoid this sort of issue.
This can be handled with the or operator:
sum by (type) (
(increase(metric_hit[1d]) or metric_miss * 0)
+
(increase(metric_miss[1d]) or metric_hit * 0)
)
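To see why the zero-filling matters, here is a small Python model of the idea. It is a deliberate simplification and an assumption on my part: each series is collapsed to a single `type` label and a single number, whereas real PromQL matches on full label sets. The helper name `or_fill` and the sample values are made up.

```python
def or_fill(primary, other):
    """Model `primary or other * 0`: keep primary values, add 0 for labels only in other."""
    filled = {label: 0.0 for label in other}
    filled.update(primary)
    return filled

# Hypothetical 24h increases per type.
hits = {"a": 5.0, "b": 2.0}
misses = {"b": 1.0, "c": 4.0}

# A plain `+` only keeps labels present on both sides.
plain_sum = {t: hits[t] + misses[t] for t in hits if t in misses}

# With zero-filling, every type seen on either side survives the addition.
hits_filled = or_fill(hits, misses)
misses_filled = or_fill(misses, hits)
total = {t: hits_filled[t] + misses_filled[t] for t in hits_filled}
```

Here `plain_sum` contains only type "b", which mirrors the missing values in the question, while `total` covers "a", "b" and "c".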
I am attempting a coding problem which asks me to print "YES" if the sum of any consecutive array numbers is equal to the given number and "NO" if none.
Here is the question:
Prateek wants to give a party to his N friends on his birthday, where each friend is numbered from 1 to N. His friends are asking for a gift to come to the party, instead of giving him one. The cost of the gifts are given in the array Value where ith friend asks for a gift which has a cost Costi.
But, Prateek has only X amount of money to spend on gifts and he wants to invite his friends which are in continuous range such that sum of the cost of the gifts of those friends will be exactly equal to X.
If he can invite his friends, who can satisfy the above condition then, print YES otherwise print NO.
Input:
The first line contains a single integer T, denoting the number of test cases. In each test case, the following input will be present:
- The next line contains two space-separated integers N and X, where N represents the number of friends and X represents the amount of money which Prateek can spend on gifts.
- The next N lines contain N integers, where the ith line contains the ith integer, which represents Costi.
Output
Output exactly T lines, each containing the answer to the corresponding test case .
Constraints:
1 <= T <= 10
1 <= N, Costi <= 10^6
1 <= X <= 10^12
SAMPLE INPUT
1
5 12
1
3
4
5
2
SAMPLE OUTPUT
YES
Explanation
In the sample input, T is equal to 1. So, accordingly, in next line, values of N and X are given which are 5 and 12 respectively. In the next 5 lines, you have costi asked by ith friend. As friends numbered from 2 to 4 (inclusively) have gifts value which are {3, 4, 5}, and their sum equals to 12 - that is, the given value of X. So, the answer is YES.
My solution is here:
b = Array.new
a = Array.new
t = gets.to_i
if t >= 0 && t <= 10
  t.times do
    n, x = gets.chomp.split.map(&:to_i)
    n.times do
      a << gets.to_i
    end
    (1..a.length).each do |num|
      a.each_cons(num).each do |pair|
        if pair.inject(:+) == x
          b << "YES"
        else
          b << "NO"
        end
      end
    end
    if b.include?("YES")
      puts "YES"
    else
      puts "NO"
    end
  end
end
Although they have accepted my answer, it does not pass all the test cases, hence I am not satisfied. Can someone help me with a correct, more efficient and elegant solution?
Have a look at each_cons:
array = [1,2,3,4,5]
number = 5
array.each_cons(2) { |pair| puts 'YES' if pair.inject(:+) == number }
# prints YES
number = 10
array.each_cons(2) { |pair| puts 'YES' if pair.inject(:+) == number }
# prints nothing
Or when you want to return 'YES' or 'NO':
array.each_cons(2).any? { |pair| pair.inject(:+) == number } ? 'YES' : 'NO'
I suggest that you split your answer into several parts:
Reading input from the user
Determining whether an array contains a consecutive subarray that sums to a given number
Printing YES or NO.
Your code is difficult to read because all these responsibilities are intertwined. Point 2 is crucial. To play nicely with points 1 and 3 it can be solved with a function that accepts an array of numbers and the desired sum as arguments and returns true if there's a consecutive subarray with the desired sum and false otherwise.
The simplest algorithm looks at all subarrays, computes their sums and compares each with the desired sum.
def consecutive_sum?(array, sum)
(0...array.size).each do |start|
(start...array.size).each do |stop|
return true if array[start..stop].inject(&:+) == sum
end
end
false
end
start and stop mark the beginning and end of the subarray. Array#inject is used to compute the sum of the subarray.
I'll leave points 1 and 3 to you.
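Since the question also asks for something more efficient, here is a sketch of a different technique, a sliding window (two pointers), written in Python rather than Ruby. It is not the algorithm shown above, and it relies on the stated constraint that every cost is at least 1, so the window sum only grows when the right end moves and only shrinks when the left end moves. The function name `has_consecutive_sum` is made up.

```python
def has_consecutive_sum(costs, target):
    """Return True if some non-empty contiguous run of positive costs sums to target."""
    start = 0   # left end of the current window
    window = 0  # sum of costs[start..stop]
    for stop, cost in enumerate(costs):
        window += cost
        # Shrink from the left while the window overshoots the target.
        while window > target and start <= stop:
            window -= costs[start]
            start += 1
        if window == target and start <= stop:
            return True
    return False

sample = [1, 3, 4, 5, 2]
answer = "YES" if has_consecutive_sum(sample, 12) else "NO"
```

Each element enters and leaves the window at most once, so this runs in time linear in N, compared with the cubic cost of summing every subarray from scratch.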
For simplicity say we have a sample set of possible scores {0, 1, 2}. Is there a way to calculate a mean based on the number of scores without getting into hairy lookup tables etc for a 95% confidence interval calculation?
dreeves posted a solution to this here: How can I calculate a fair overall game score based on a variable number of matches?
Now say we have 2 scenarios ...
Scenario A) 2 votes of value 2 result in SE=0 resulting in the mean to be 2
Scenario B) 10000 votes of value 2 result in SE=0 resulting in the mean to be 2
I wanted Scenario A to be some value less than 2 because of the low number of votes, but it doesn't seem like this solution handles that (dreeves's equations hold when the values in your set are not all equal to each other). Am I missing something, or is there another algorithm I can use to calculate a better score?
The data available to me is:
n (number of votes)
sum (sum of votes)
{set of votes} (all vote values)
Thanks!
You could just give it a weighted score when ranking results, as opposed to just displaying the average vote so far, by multiplying with some function of the number of votes.
An example in C# (because that's what I happen to know best...) that could easily be translated into your language of choice:
double avgScore = Math.Round((double)sum / n); // cast first to avoid integer division
double rank = avgScore * Math.Log(n);
Here I've used the natural logarithm of n (which is what Math.Log with a single argument computes) as the weighting function, but it will only work well if the number of votes is neither too small nor too large. Exactly how large is "optimal" depends on how much you want the number of votes to matter.
If you like the logarithmic approach, but the natural base doesn't really work with your vote counts, you could easily use another base. For example, to do it in base 3 instead:
double rank = avgScore * Math.Log(n, 3);
Which function you should use for weighting is probably best decided by the order of magnitude of the number of votes you expect to reach.
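A quick way to get a feel for the logarithmic weighting is to try it on the two scenarios from the question. The Python sketch below is only an illustration: the function name `rank` is made up, and unlike the C# snippet it does not round the average.

```python
import math

def rank(total, n, base=math.e):
    """Average vote multiplied by the logarithm of the vote count."""
    avg = total / n
    return avg * math.log(n, base)

few_votes = rank(4, 2)           # 2 votes of value 2
many_votes = rank(20000, 10000)  # 10000 votes of value 2
single_vote = rank(2, 1)         # log(1) is 0, so a single vote ranks as 0
base3 = rank(18, 9, base=3)      # average 2 times log base 3 of 9
```

Both scenarios have the same average of 2, yet the one with 10000 votes ranks far higher. The single-vote case shows one reason the weighting misbehaves when the number of votes is very small.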
You could also use a custom weighting function by defining
double rank = avgScore * w(n);
where w(n) returns the weight value depending on the number of votes. You then define w(n) as you wish, for example like this:
double w(int n) {
    // caution! ugly example code ahead...
    // if you even want this approach, at least use a switch... :P
    if (n > 100) {
        return 10;
    } else if (n > 50) {
        return 8;
    } else if (n > 40) {
        return 6;
    } else if (n > 20) {
        return 3;
    } else if (n > 10) {
        return 2;
    } else {
        return 1;
    }
}
If you want to use the idea in my other referenced answer (thanks!) of using a pessimistic lower bound on the average then I think some additional assumptions/parameters are going to need to be injected.
To make sure I understand: With 10000 votes, every single one of which is "2", you're very sure the true average is 2. With 2 votes, each a "2", you're very unsure -- maybe some 0's and 1's will come in and bring down the average. But how to quantify that, I think is your question.
Here's an idea: Everyone starts with some "baggage": a single phantom vote of "1". The person with 2 true "2" votes will then have an average of (1+2+2)/3 = 1.67, whereas the person with 10000 true "2" votes will have an average of 20001/10001 = 1.9999. That alone may satisfy your criteria. Or to add the pessimistic lower bound idea, the person with 2 votes would have a pessimistic average score of 1.333 and the person with 10k votes would be 1.99948.
(To be absolutely sure you'll never have the problem of zero standard error, use two different phantom votes. Or perhaps use as many phantom votes as there are possible vote values, one vote with each value.)
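The phantom-vote idea is easy to try out. The following Python sketch computes only the padded average, not the pessimistic lower bound. The function name `padded_average` and its `phantom_votes` parameter are hypothetical names chosen for this example.

```python
def padded_average(votes, phantom_votes=(1,)):
    """Average of the real votes plus a fixed set of phantom votes."""
    all_votes = list(votes) + list(phantom_votes)
    return sum(all_votes) / len(all_votes)

scenario_a = padded_average([2, 2])       # (1 + 2 + 2) / 3
scenario_b = padded_average([2] * 10000)  # 20001 / 10001

# One phantom vote per possible value, as suggested in the parenthetical above.
scenario_a_all = padded_average([2, 2], phantom_votes=(0, 1, 2))  # 7 / 5

print(round(scenario_a, 4), round(scenario_b, 4), scenario_a_all)
```

With few real votes the phantom votes pull the score noticeably toward the middle, and with many real votes their effect almost disappears.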