[Homework] Solving the recurrence t(n) = t(n/2) + n^2 using the iteration method

Any hints on how to solve t(n) = t(n/2) + n^2 with the iteration method? What I've got so far:
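One way to start: unroll the recurrence by repeated substitution (assuming n is a power of 2):

t(n) = n^2 + t(n/2)
     = n^2 + (n/2)^2 + t(n/4)
     = n^2 + (n/2)^2 + (n/4)^2 + ... + t(1)
     = n^2 * (1 + 1/4 + 1/4^2 + ...) + t(1)

The series is geometric with ratio 1/4, so it sums to at most 4/3, and t(n) = Theta(n^2).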

Related

How does this method or formula for calculating ROC AUC work?

I was trying to calculate the AUC using MySQL for data in a table like the one below:
y p
1 0.872637
0 0.130633
0 0.098054
...
...
1 0.060190
0 0.110938
I came across the following SQL query, which gives the correct AUC score (I verified it using sklearn):
SELECT (sum(y*r) - 0.5*sum(y)*(sum(y)+1)) / (sum(y) * sum(1-y)) AS auc
FROM (
SELECT y, row_number() OVER (ORDER BY p) r
FROM probs
) t
Using pandas this can be done as follows:
import numpy as np

temp = df.sort_values(by="p")
temp['r'] = np.arange(1, len(df) + 1)  # rank of each row after sorting by p
temp['yr'] = temp['y'] * temp['r']
print((sum(temp.yr) - 0.5*sum(temp.y)*(sum(temp.y)+1)) / (sum(temp.y) * sum(1-temp.y)))
I did not understand how we are able to calculate the AUC using this method. Can somebody please give the intuition behind this?
I am already familiar with the trapezoidal method, which involves summing the areas of small trapezoids under the ROC curve.
Short answer: it is the Wilcoxon-Mann-Whitney statistic; see
https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve
The page has a proof as well.
The bottom part of your formula is identical to the formula in the wiki. The top part is trickier. f in the wiki corresponds to p in your data, and t_0 and t_1 are indexes into the data frame. Note that we first sort by p, which makes our life easier.
Note that the double sum may be decomposed as
Sum_{t_1 such that y(t_1)=1} #{t_0 such that p(t_0) < p(t_1) and y(t_0)=0}
Here # stands for the total number of such indexes.
For each row index t_1 (such that y(t_1) = 1), how many t_0 are such that p(t_0) < p(t_1) and y(t_0) = 0? Because the values are sorted, there are exactly t_1 values of p that are less than or equal to p(t_1). We conclude that
#{t_0: p(t_0) < p(t_1) and y(t_0)=0} = t_1 - #{t_0: t_0 <= t_1 and y(t_0)=1}
Now imagine scrolling down the sorted dataframe. The first time we meet y=1, #{t_0: t_0 <= t_1 and y(t_0)=1} = 1; the second time we meet y=1, the same quantity is 2; the third time, it is 3; and so on. Therefore, when we sum the equality over all indexes t_1 with y(t_1)=1, we get
Sum_{t_1: y(t_1)=1} #{t_0: p(t_0) < p(t_1) and y(t_0)=0} = Sum_{t_1: y(t_1)=1} t_1 - (1 + 2 + 3 + ... + n),
where n is the total number of ones in the y column. Now we need to do one more simplification. Note that
Sum_{t_1: y(t_1)=1} t_1 = Sum_{t_1: y(t_1)=1} t_1 y(t_1)
If y(t_1) is not one, it is zero. Therefore,
Sum_{t_1: y(t_1)=1} t_1 = Sum_{t_1: y(t_1)=1} t_1 y(t_1) = Sum_{t} t y(t)
Plugging this into our formula and using
1 + 2 + 3 + ... + n = n(n+1)/2
finishes the proof of the formula you found.
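As a quick numerical sanity check (my own sketch, assuming numpy and scikit-learn are available), the rank formula matches sklearn's trapezoidal AUC on random, tie-free scores:

import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)    # binary labels
p = rng.random(size=1000)            # continuous scores, so no ties

r = p.argsort().argsort() + 1        # rank of each row when sorted by p
n1 = y.sum()                         # number of ones
auc_rank = (np.sum(y * r) - 0.5 * n1 * (n1 + 1)) / (n1 * (len(y) - n1))
print(auc_rank, roc_auc_score(y, p)) # the two values agree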
P.S. I think that posting this question on the Math or Stats Stack Exchange would make more sense.

Regression with the stochastic gradient descent algorithm

I am studying regression with the book Machine Learning in Action, and I saw code like the following:
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stocGradAscent0(dataMatrix, classLabels):
    m, n = np.shape(dataMatrix)
    alpha = 0.01
    weights = np.ones(n)                           # initialize to all ones
    for i in range(m):                             # a single pass over the samples
        h = sigmoid(sum(dataMatrix[i] * weights))  # predicted probability
        error = classLabels[i] - h                 # difference between label and prediction
        weights = weights + alpha * error * dataMatrix[i]
    return weights
You may guess what the code means, but I didn't understand it. I read the book several times and searched related material (wiki, Google) for where the exponential function comes from and how it yields the weights that minimize the differences. Why do we get proper weights using the exponential function of the sum of X*weights? It seems like a kind of OLS. Anyway, then we get a result like the one below:
Thanks!
It's just the basics of logistic regression. In the for loop it computes the hypothesis and the error:
Z = β₀ + β₁X ; where β₁ and X are matrices
hΘ(x) = sigmoid(Z)
i.e. hΘ(x) = 1/(1 + e^(-(β₀ + β₁X)))
and then updates the weights. Normally it's better to use a high number of iterations in the for loop, like 1000; m alone would be small, I guess.
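A minimal sketch of that idea (my own code, assuming numpy; not the book's version), running the same update for many epochs instead of a single pass over the data:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stoc_grad_ascent(X, y, alpha=0.01, epochs=1000):
    m, n = X.shape
    w = np.ones(n)                             # initialize to all ones
    for _ in range(epochs):                    # many passes converge better than one
        for i in range(m):
            h = sigmoid(np.dot(X[i], w))       # predicted probability for sample i
            w = w + alpha * (y[i] - h) * X[i]  # gradient ascent on the log-likelihood
    return w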
I want to explain more, but I can't explain it better than this dude here.
Happy learning!!

How to solve T(n)=4T(n/4)+n^2 by recursion tree and master theorem?

T(n) = 4T(n/4) + n^2 (if n = 1, T(1) = c for some positive constant c)
I asked on Math Stack Exchange but no one answered.
What I want is the solution to this problem by both the master theorem and the recursion tree method.
My conclusions so far:
Master theorem = theta(n^2)
Recursion tree = theta(n^2 log_4 n)
How to solve and what is the answer?
At the first level we have O(n^2) time-complexity. At the second level we have 4 subproblems, each costing O((n/4)^2). At the next level we have 4*4 subproblems, each costing O((n/(4*4))^2), and so on.
So we have
T(n) = n^2 + 4*(n/4)^2 + 4^2*(n/4^2)^2 + ... = n^2 * (1 + 1/4 + 1/4^2 + ... + 1/4^m) = Theta(n^2)
PS:
The last part is a geometric series with a = 1 and q = 1/4, summed up to m, where m = log_4(n). Its sum is bounded by the constant 1/(1 - 1/4) = 4/3, which is why the recursion tree also gives Theta(n^2), not Theta(n^2 * log_4(n)).
The depth of the recursion tree can be calculated from the formula n/4^i = c, so h = log_4(n).
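A quick numerical check (my own sketch, taking c = 1): evaluate the recurrence directly and watch T(n)/n^2 settle near the constant 4/3 predicted by the geometric series.

def T(n):
    if n <= 1:
        return 1                  # T(1) = c; take c = 1
    return 4 * T(n // 4) + n * n

for k in range(2, 10):
    n = 4 ** k
    print(n, T(n) / n**2)         # the ratio tends to 4/3, so T(n) = Theta(n^2)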

Why does Mahout RMSRecommenderEvaluator evaluate method sometimes results NaN?

I'm trying to evaluate my recommender system using the following code:
RecommenderEvaluator rmsEvaluator = new RMSRecommenderEvaluator();
double score = rmsEvaluator.evaluate(recommenderBuilder, null, model, 0.95, 0.05);
System.out.println("RMS Evaluator Score: " + score);
Sometimes the score is NaN.
Why does the Mahout RMSRecommenderEvaluator evaluate method sometimes return NaN?
Maybe your dataset is too small to give meaningful results. With evaluationPercentage = 0.05, only about 5% of the users are evaluated; if none of their held-out preferences can actually be estimated, the average error is taken over zero items, which comes out as NaN.

Heuristic path algorithm (Pohl) completeness

This is a homework question, exactly as follows:
The heuristic path algorithm (Pohl, 1977) is a best-first search in which the evaluation function is f(n) = (2-w)g(n) + wh(n).
For what values of w is this complete?
Here's what I know:
w = 0: f(n)=2g(n) --> Uniform Cost Search, which is complete.
w = 1: f(n)=g(n) + h(n) --> A*, which is complete.
w = 2: f(n)=2h(n) --> greedy Best First Search, which is not complete.
What about all other values of w?
Please don't just give the answer, help me get to the solution.
An interesting thing about "all other values of w" for w > 2: they all have the form f(n) = h(n) - g(n) with some positive constants in front of h and g. What impact, if any, does subtracting the path cost have on completeness? It seems you should be able to generalize from there.
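For concreteness, a minimal sketch of the algorithm under discussion (my own code; the neighbors function and heuristic h are assumed to be supplied by the caller), so you can experiment with different values of w:

import heapq
import itertools

def heuristic_path_search(start, goal, neighbors, h, w):
    # Pohl's heuristic path algorithm: best-first search ordered by
    # f(n) = (2 - w) * g(n) + w * h(n)
    tie = itertools.count()                    # tie-breaker so the heap never compares nodes
    frontier = [(w * h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for nbr, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                f2 = (2 - w) * g2 + w * h(nbr)
                heapq.heappush(frontier, (f2, next(tie), g2, nbr, path + [nbr]))
    return None                                # goal unreachable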
