Runtime of while loop pseudocode - analysis

I have a pseudocode which I'm trying to make a detailed analysis, analyze runtime, and asymptotic analysis:
sum = 0
i = 1
while (i ≤ n){
sum = sum + i
i = 2i
}
return sum
My assignment requires that I write the cost/runtime for every line, add these together, and find a Big-Oh notation for the runtime. My analysis looks like this for the moment:
sum = 0 1
long i = 1 1
while (i ≤ n){ log n + 1
sum = sum + i n log n
i = 2i n log n
}
return sum 1
=> 2 n log n + log n + 4 O(n log n)
is this correct ? Also: should I use n^2 on the while loop instead ?

Because of integer arithmetic, the runtime is
O(floor(ln(n))+1) = O(ln(n)).
Let's step through your pseudocode. Consider the case that n = 5.
iteration# i ln(i) n
-------------------------
1 1 0 5
2 2 1 5
3 4 2 5
By inspection we see that
iteration# = ln(i)+1
So in summary:
sum = 0 // O(1)
i = 1 // O(1)
while (i ≤ n) { // O(floor(ln(n))+1)
sum = sum + i // 1 flop + 1 mem op = O(1)
i = 2i // 1 flop + 1 mem op = O(1)
}
return sum // 1 mem op = O(1)

Related

Best way to parallelize computation over dask blocks that do not return np arrays?

I'd like to return a dask dataframe from an overlapping dask array computation, where each block's computation returns a pandas dataframe. The example below shows one way to do this, simplified for demonstration purposes. I've found a combination of da.overlap.overlap and to_delayed().ravel() as able to get the job done, if I pass in the relevant block key and chunk information.
Edit:
Thanks to a #AnnaM who caught bugs in the original post and then made it general! Building off of her comments, I'm including an updated version of the code. Also, in responding to Anna's interest in memory usage, I verified that this does not seem to take up more memory than naively expected.
def extract_features_generalized(chunk, offsets, depth, columns):
shape = np.asarray(chunk.shape)
offsets = np.asarray(offsets)
depth = np.asarray(depth)
coordinates = np.stack(np.nonzero(chunk)).T
keep = ((coordinates >= depth) & (coordinates < (shape - depth))).all(axis=1)
data = coordinates + offsets - depth
df = pd.DataFrame(data=data, columns=columns)
return df[keep]
def my_overlap_generalized(data, chunksize, depth, columns, boundary):
data = data.rechunk(chunksize)
data_overlapping_chunks = da.overlap.overlap(data, depth=depth, boundary=boundary)
dfs = []
for block in data_overlapping_chunks.to_delayed().ravel():
offsets = np.array(block.key[1:]) * np.array(data.chunksize)
df_block = dask.delayed(extract_features_generalized)(block, offsets=offsets,
depth=depth, columns=columns)
dfs.append(df_block)
return dd.from_delayed(dfs)
data = np.zeros((2,4,8,16,16))
data[0,0,4,2,2] = 1
data[0,1,4,6,2] = 1
data[1,2,4,8,2] = 1
data[0,3,4,2,2] = 1
arr = da.from_array(data)
df = my_overlap_generalized(arr,
chunksize=(-1,-1,-1,8,8),
depth=(0,0,0,2,2),
columns=['r', 'c', 'z', 'y', 'x'],
boundary=tuple(['reflect']*5))
df.compute().reset_index()
-- Remainder of original post, including original bugs --
My example only does xy overlaps, but it's easy to generalize. Is there anything below that is suboptimal or could be done better? Is anything likely to break because it's relying on low-level information that could change (e.g. block key)?
def my_overlap(data, chunk_xy, depth_xy):
data = data.rechunk((-1,-1,-1, chunk_xy, chunk_xy))
data_overlapping_chunks = da.overlap.overlap(data,
depth=(0,0,0,depth_xy,depth_xy),
boundary={3: 'reflect', 4: 'reflect'})
dfs = []
for block in data_overlapping_chunks.to_delayed().ravel():
offsets = np.array(block.key[1:]) * np.array(data.chunksize)
df_block = dask.delayed(extract_features)(block, offsets=offsets, depth_xy=depth_xy)
dfs.append(df_block)
# All computation is delayed, so downstream comptutions need to know the format of the data. If the meta
# information is not specified, a single computation will be done (which could be expensive) at this point
# to infer the metadata.
# This empty dataframe has the index, column, and type information we expect in the computation.
columns = ['r', 'c', 'z', 'y', 'x']
# The dtypes are float64, except for a small number of columns
df_meta = pd.DataFrame(columns=columns, dtype=np.float64)
df_meta = df_meta.astype({'c': np.int64, 'r': np.int64})
df_meta.index.name = 'feature'
return dd.from_delayed(dfs, meta=df_meta)
def extract_features(chunk, offsets, depth_xy):
r, c, z, y, x = np.nonzero(chunk)
df = pd.DataFrame({'r': r, 'c': c, 'z': z, 'y': y+offsets[3]-depth_xy,
'x': x+offsets[4]-depth_xy})
df = df[(df.y > depth_xy) & (df.y < (chunk.shape[3] - depth_xy)) &
(df.z > depth_xy) & (df.z < (chunk.shape[4] - depth_xy))]
return df
data = np.zeros((2,4,8,16,16)) # round, channel, z, y, x
data[0,0,4,2,2] = 1
data[0,1,4,6,2] = 1
data[1,2,4,8,2] = 1
data[0,3,4,2,2] = 1
arr = da.from_array(data)
df = my_overlap(arr, chunk_xy=8, depth_xy=2)
df.compute().reset_index()
First of all, thanks for posting your code. I am working on a similar problem and this was really helpful for me.
When testing your code, I discovered a few mistakes in the extract_features function that prevent your code from returning correct indices.
Here is a corrected version:
def extract_features(chunk, offsets, depth_xy):
r, c, z, y, x = np.nonzero(chunk)
df = pd.DataFrame({'r': r, 'c': c, 'z': z, 'y': y, 'x': x})
df = df[(df.y >= depth_xy) & (df.y < (chunk.shape[3] - depth_xy)) &
(df.x >= depth_xy) & (df.x < (chunk.shape[4] - depth_xy))]
df['y'] = df['y'] + offsets[3] - depth_xy
df['x'] = df['x'] + offsets[4] - depth_xy
return df
The updated code now returns the indices that were set to 1:
index r c z y x
0 0 0 0 4 2 2
1 1 0 1 4 6 2
2 2 0 3 4 2 2
3 1 1 2 4 8 2
For comparison, this is the output of the original version:
index r c z y x
0 1 0 1 4 6 2
1 3 1 2 4 8 2
2 0 0 1 4 6 2
3 1 1 2 4 8 2
It returns lines number 2 and 4, two times each.
The reason why this happens is three mistakes in the extract_features function:
You first add the offset and subtract the depth and then filter out the overlapping parts: the order needs to be swapped
df.y > depth_xy should be replaced with df.y >= depth_xy
df.z should be replaced with df.x, since it is the x dimension that has an overlap
To optimize this even further, here is a generalized version of the code that would work for an arbitrary number of dimension:
def extract_features_generalized(chunk, offsets, depth, columns):
coordinates = np.nonzero(chunk)
df = pd.DataFrame()
rows_to_keep = np.ones(len(coordinates[0]), dtype=int)
for i in range(len(columns)):
df[columns[i]] = coordinates[i]
rows_to_keep = rows_to_keep * np.array((df[columns[i]] >= depth[i])) * \
np.array((df[columns[i]] < (chunk.shape[i] - depth[i])))
df[columns[i]] = df[columns[i]] + offsets[i] - depth[i]
del coordinates
return df[rows_to_keep > 0]
def my_overlap_generalized(data, chunksize, depth, columns):
data = data.rechunk(chunksize)
data_overlapping_chunks = da.overlap.overlap(data, depth=depth,
boundary=tuple(['reflect']*len(columns)))
dfs = []
for block in data_overlapping_chunks.to_delayed().ravel():
offsets = np.array(block.key[1:]) * np.array(data.chunksize)
df_block = dask.delayed(extract_features_generalized)(block, offsets=offsets,
depth=depth, columns=columns)
dfs.append(df_block)
return dd.from_delayed(dfs)
data = np.zeros((2,4,8,16,16))
data[0,0,4,2,2] = 1
data[0,1,4,6,2] = 1
data[1,2,4,8,2] = 1
data[0,3,4,2,2] = 1
arr = da.from_array(data)
df = my_overlap_generalized(arr, chunksize=(-1,-1,-1,8,8),
depth=(0,0,0,2,2), columns=['r', 'c', 'z', 'y', 'x'])
df.compute().reset_index()

Write Recurrence for Given Function

I am trying to write the recurrence relation for the running time of the following function:
function G(n):
if n>0 then:
x=0
for i = 1 to n:
x = x + 1
G(n-1)
end if
What I came up with was:
T(n) = 1 if n <= 0
T(n) = T(n-1) + 1 if n>0
However I was told that this was incorrect and I don't know why or what the correct solution would be. Any help is greatly appreciated!
T(n) = 1 if n <= 0
T(n) = T(n-1) + O(n) if n>0
Instead of O(1), it should be O(n), because you are looping from 1 to n
If you solve the recurrence, the overall complexity will be O(n2)

#Recurrence T(n)=3T(n/3)+Ѳ(log₃n) [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 5 years ago.
Improve this question
Sorry I have tried a lot to solve this recurrence equation
T (n) = 3T (n / 3) + Ѳ (log3n)
with the replacement method but I can not get the required result:
1) T (n) = O (nlogn)
2) Induction
Base: for every n = 1 -> 1log1 + 1 = 1 = T (1)
Inductive step: T (k) = klogk + k for each k <n
Use k = n / 3
T (n) = 3T (n / 3) + Ѳ (log₃n)
1) T (n) = O (nlogn)
2) Induction
Base: for every n = 1 -> 1log1 + 1 = 1 = T (1)
Inductive step: T (k) = klogk + k for each k <n
Use k = n / 3
T (n) = 3T (n / 3) + Ѳ (log₃n)
= 3 [n / 3logn / 3 + n / 3] + (log₃n)
= nlogn / 3 + n + (log₃n)
= n(logn-log3) + n + (log₃n)
= nlogn-nlog3 + n + (log3n)
Firstly we can (eventually) ignore the base 3 in the Theta-notation, as it amounts to a multiplicative factor as is therefore irrelevant. Then we can try the following method:
1. Hypothesis by inspection:
If we re-substitute T into itself multiple times, we get:
What is the upper limit m? We need to assume that T(n) has a stopping condition, i.e. some value of n where it stops recursing. Assuming that it is n = 1 (it really doesn't matter, as long as it's a constant much smaller than n). Continuing (and briefly restoring the base 3):
Surprisingly the answer is not Ө(n log n).
2. Induction base case
We don't use induction to prove the final result, but the series result we deduced by inspecting the behaviour of the expansion.
For the base case n / 3 = 1, we have:
Which is consistent.
3. Induction recurrence
Again, consistent. Thus by induction the summation result is correct, and T(n) is indeed Ө(n).
4. Numerical tests:
Just in case you still cannot believe that it is Ө(n), here is a numerical test to prove the result.
Javascript code:
function T(n) {
return n <= 1 ? 0 : 3*T(floor(n/3)) + log(n);
}
Results:
n T(n)
--------------------------
10 5.598421959
100 66.33828212
1000 702.3597066
10000 6450.185742
100000 63745.45154
1000000 580674.1886
10000000 8924162.276
100000000 81068207.64
Graph:
The linear relationship is clear.

Is an admisible heuristic always monotone (consistent)?

For the A* search algorithm, provided an heuristic h, supose h is admisible.
That is:
h(n) ≤ h*(n) for every node n, where h* is the real cost from n to goal.
Does this ensure the heuristic is monotone?
That is:
f(n) ≤ g(n') + h(n') for every sucesor n' of n, where f(n)= h(n) + g(n) and g(n) is the accumulated cost.
No.
Assume you have three successor states s1, s2, s3 and a goal state g so that s1 -> s2 -> s3 -> g.
s1 is the starting node.
Consider also the following values for h(s) and h*(s) (i.e. true cost):
h(s1) = 3 , h*(s1) = 6
h(s2) = 4 , h*(s2) = 5
h(s3) = 3 , h*(s3) = 3
h(g) = 0 , h*(g) = 0
Following the only path to the goal we can have that:
g(s1) = 0, g(s2) = 1, g(s3) = 3, g(g) = 6, coinciding with the true cost above.
Although the heuristic function is admissible (h(s) <= h*(s)), f(n) will not be monotonic. For instance f(s1) = h(s1) + g(s1) = 3 while f(s2) = h(s2) + g(s2) = 5 with f(s1) < f(s2). Same holds between f(s2) and f(s3).
Of course this means you have a quite uninformative heuristic.

Calculated nCr mod m (n choose r) for large values of n (10^9)

Now that CodeSprint 3 is over, I've been wondering how to solve this problem. We need to simply calculate nCr mod 142857 for large values of r and n (0<=n<=10^9 ; 0<=r<=n). I used a recursive method which goes through min(r, n-r) iterations to calculate the combination. Turns out this wasn't efficient enough. I've tried a few different methods, but they all seem to not be efficient enough. Any suggestions?
For non-prime mod, factor it (142857 = 3^3 * 11 * 13 * 37) and compute C(n,k) mod p^q for each prime factor of the mod using the general Lucas theorem, and combine them using Chinese remainder theorem.
For example, C(234, 44) mod 142857 = 6084, then
C(234, 44) mod 3^3 = 9
C(234, 44) mod 11 = 1
C(234, 44) mod 13 = 0
C(234, 44) mod 37 = 16
The Chinese Remainder theorem involves finding x such that
x = 9 mod 3^3
x = 1 mod 11
x = 0 mod 13
x = 16 mod 37
The result is x = 6084.
Example
C(234, 44) mod 3^3
First convert n, k, and n-k to base p
n = 234_10 = 22200_3
k = 44_10 = 1122_3
r = n-k = 190_10 = 21001_3
Next find the number of carries
e[i] = number of carries from i to end
e 4 3 2 1 0
1 1
r 2 1 0 0 1
k 1 1 2 2
n 2 2 2 0 0
Now create the factorial function needed for general Lucas
def f(n, p):
r = 1
for i in range(1, n+1):
if i % p != 0:
r *= i
return r
Since q = 3, you will consider only three digits of the base p representation at a time
So
f(222_3, 3)/[f(210_3, 3) * f(011_3, 3)] *
f(220_3, 3)/[f(100_3, 3) * f(112_3, 3)] *
f(200_3, 3)/[f(001_3, 3) * f(122_3, 3)] = 6719344775 / 7
Now
s = 1 if p = 2 and q >= 3 else -1
Then
p^e[0] * s * 6719344775 / 7 mod 3^3
e[0] = 2
p^e[0] = 3^2 = 9
s = -1
p^e[0] * s * 6719344775 = -60474102975
Now you have
-60474102975 / 7 mod 3^3
This is a linear congruence and can be solved with
ModularInverse(7, 3^3) = 4
4 * -60474102975 mod 27 = 9
Hence C(234, 44) mod 3^3 = 9

Resources