I'm trying to compute a cross product between two vectors in R^n. Is there an optimized way to do this?
I have looked at the Accelerate library but still cannot find anything.
A cross product can exist in R^n if and only if n = 0, 1, 3 or 7.
Source: http://www.math.csusb.edu/faculty/pmclough/CP.pdf
So no, you won't find a library that does that for arbitrary n. If you meant the element-wise product, you can use Accelerate. Here is a short test:
import Accelerate
import Foundation

let n = 10_000_000
// Two arrays of random values in [0, 1)
let a = (0..<n).map { _ in Double.random(in: 0..<1) }
let b = (0..<n).map { _ in Double.random(in: 0..<1) }
print("A: [\(a.prefix(10).map { "\($0)" }.joined(separator: ", ")), ...]")
print("B: [\(b.prefix(10).map { "\($0)" }.joined(separator: ", ")), ...]")
var result = [Double](repeating: 0, count: n)
// DispatchTime reports nanoseconds directly; raw mach_absolute_time() ticks
// would first have to be converted with mach_timebase_info.
let start = DispatchTime.now().uptimeNanoseconds
// Element-wise product: result[i] = a[i] * b[i]
vDSP_vmulD(a, 1, b, 1, &result, 1, vDSP_Length(n))
let stop = DispatchTime.now().uptimeNanoseconds
let time = Double(stop - start) / Double(NSEC_PER_SEC)
print("Time: \(time) for \(n) elements")
print("Result: [\(result.prefix(10).map { "\($0)" }.joined(separator: ", ")), ...]")
Output:
A: [0.269752697849123, 0.851672558312228, 0.0668649589798564, 0.0955562389212559, 0.255900985620893, 0.93693982901446, 0.085282990495973, 0.732230591525377, 0.588338787804437, 0.952581417968632, ...]
B: [0.750105029379508, 0.0454008649209051, 0.863010750120275, 0.308104009904923, 0.700024090637459, 0.327355608653127, 0.679469040520366, 0.666848364208557, 0.0567599588671606, 0.623293806245386, ...]
Time: 0.024393279 for 10000000 elements
Result: [0.202342855345318, 0.0386666707767751, 0.0577051784059674, 0.0294412603830718, 0.179136854752495, 0.306712507998386, 0.0579471517250063, 0.488286772182162, 0.0333940853957349, 0.593738097764296, ...]
0.024 seconds for 10 million elements is pretty fast
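For the n = 3 case, where a true cross product does exist, the computation is only three multiply-subtract pairs. The following is a minimal sketch, written in Python for brevity rather than Swift; the helper names cross3 and dot are made up for this illustration and do not come from Accelerate. In Swift, the simd module provides a cross function for 3-component vectors.

```python
def cross3(a, b):
    """Cross product of two 3-component vectors given as sequences."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)


def dot(a, b):
    """Dot product, used here only to check orthogonality."""
    return sum(x * y for x, y in zip(a, b))


a = (1.0, 2.0, 3.0)
b = (4.0, 5.0, 6.0)
c = cross3(a, b)
print(c)                     # (-3.0, 6.0, -3.0)
print(dot(a, c), dot(b, c))  # 0.0 0.0
```

The result is orthogonal to both inputs, which is the property that cannot be carried over to arbitrary n.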
Parallel execution
If you are looking for the element-wise product operation
a = (1.0, 2.0)
b = (3.0, 4.0)
a * b = (a1*b1, a2*b2) = (3.0, 8.0)
and you want the fastest possible performance available on iOS, you should use the simd framework (Single Instruction, Multiple Data).
import simd
let v0 = float2(1.0, 2.0)
let v1 = float2(3.0, 4.0)
let res = v0 * v1
print(res) // float2(3.0, 8.0)
Why is simd so fast?
Without simd, calculating a * b would require two steps:
calculate a1 * b1 and put the result into res1
calculate a2 * b2 and put the result into res2
With simd, on the other hand, both operations are done in parallel. This is possible because the two steps apply the same operation to different data, which is exactly what simd allows you to do.
More
From Wikipedia
Single instruction, multiple data (SIMD), is a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.
Thus, such machines exploit data level parallelism, but not concurrency: there are simultaneous (parallel) computations, but only a single process (instruction) at a given moment.
SIMD is particularly applicable to common tasks like adjusting the contrast in a digital image or adjusting the volume of digital audio.
Most modern CPU designs include SIMD instructions in order to improve the performance of multimedia use.
I'm doing an experiment using face images in PyTorch framework. The input x is the given face image of size 5 * 5 (height * width) and there are 192 channels.
Objective: To obtain patches of x of patch_size(given as argument).
I have obtained the required result with two for loops, but I want a vectorized solution so that the computation cost is much lower than with the loops.
Used: PyTorch 0.4.1, (12 GB) Nvidia TitanX GPU.
The following is my implementation using two for loops
def extractpatches(x, patch_size):  # x is bs x 192 x 5 x 5
    patches = x.unfold(2, patch_size, 1).unfold(3, patch_size, 1)
    bs, c, pi, pj, _, _ = patches.size()  # bs,192,
    cnt = 0
    p = torch.empty((bs, pi * pj, c, patch_size, patch_size)).to(device)
    s = torch.empty((bs, pi * pj, c * patch_size * patch_size)).to(device)
    # Want a vectorized method instead of the two for loops below
    for i in range(pi):
        for j in range(pj):
            p[:, cnt, :, :, :] = patches[:, :, i, j, :, :]
            s[:, cnt, :] = p[:, cnt, :, :, :].view(-1, c * patch_size * patch_size)
            cnt = cnt + 1
    return s
Thanks for your help in advance.
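You can try the following. I used parts of your code for my experiment and it worked for me. Here l and f are lists of tensor patches, ordered the same way as your cnt counter (row index i // pj, column index i % pj); torch.stack(f, dim=1) then gives the bs x (pi*pj) x (c*patch_size*patch_size) tensor.
l = [patches[:, :, i // pj, i % pj, :, :] for i in range(pi * pj)]
f = [l[i].contiguous().view(-1, c * patch_size * patch_size) for i in range(pi * pj)]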
You can verify the above code using toy input values.
Thanks.
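A loop-free alternative is to reorder the dimensions of the unfolded tensor and flatten them in one step. The following is a minimal sketch, assuming the same unfold calls as in the question; the function name extract_patches and the toy input are made up for this illustration, and the input is deliberately non-square so that a mix-up between pi and pj would show.

```python
import torch


def extract_patches(x, patch_size):
    """x: (bs, c, h, w) -> (bs, pi*pj, c*patch_size*patch_size), no Python loops."""
    patches = x.unfold(2, patch_size, 1).unfold(3, patch_size, 1)
    bs, c, pi, pj, _, _ = patches.size()
    # Move the patch-grid dims (pi, pj) in front of the channel dim, then
    # flatten the grid into one axis and each patch into one vector.
    return (patches.permute(0, 2, 3, 1, 4, 5)
                   .reshape(bs, pi * pj, c * patch_size * patch_size))


# Toy input: batch of 2, 3 channels, 5 x 6 spatial size.
x = torch.arange(2 * 3 * 5 * 6, dtype=torch.float32).reshape(2, 3, 5, 6)
s = extract_patches(x, 3)
print(s.shape)  # torch.Size([2, 12, 27])
```

reshape copies the data when the permuted tensor is not contiguous, so no explicit .contiguous() call is needed.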
Check out this question:
Swift probability of random number being selected?
The top answer suggests to use a switch statement, which does the job. However, if I have a very large number of cases to consider, the code looks very inelegant; I have a giant switch statement with very similar code in each case repeated over and over again.
Is there a nicer, cleaner way to pick a random number with a certain probability when you have a large number of probabilities to consider? (like ~30)
This is a Swift implementation strongly influenced by the various
answers to Generate random numbers with a given (numerical) distribution.
For Swift 4.2/Xcode 10 and later (explanations inline):
func randomNumber(probabilities: [Double]) -> Int {

    // Sum of all probabilities (so that we don't have to require that the sum is 1.0):
    let sum = probabilities.reduce(0, +)
    // Random number in the range 0.0 <= rnd < sum :
    let rnd = Double.random(in: 0.0 ..< sum)
    // Find the first interval of accumulated probabilities into which `rnd` falls:
    var accum = 0.0
    for (i, p) in probabilities.enumerated() {
        accum += p
        if rnd < accum {
            return i
        }
    }
    // This point might be reached due to floating point inaccuracies:
    return (probabilities.count - 1)
}
Examples:
let x = randomNumber(probabilities: [0.2, 0.3, 0.5])
returns 0 with probability 0.2, 1 with probability 0.3,
and 2 with probability 0.5.
let x = randomNumber(probabilities: [1.0, 2.0])
returns 0 with probability 1/3 and 1 with probability 2/3.
For Swift 3/Xcode 8:
func randomNumber(probabilities: [Double]) -> Int {

    // Sum of all probabilities (so that we don't have to require that the sum is 1.0):
    let sum = probabilities.reduce(0, +)
    // Random number in the range 0.0 <= rnd < sum :
    let rnd = sum * Double(arc4random_uniform(UInt32.max)) / Double(UInt32.max)
    // Find the first interval of accumulated probabilities into which `rnd` falls:
    var accum = 0.0
    for (i, p) in probabilities.enumerated() {
        accum += p
        if rnd < accum {
            return i
        }
    }
    // This point might be reached due to floating point inaccuracies:
    return (probabilities.count - 1)
}
For Swift 2/Xcode 7:
func randomNumber(probabilities probabilities: [Double]) -> Int {

    // Sum of all probabilities (so that we don't have to require that the sum is 1.0):
    let sum = probabilities.reduce(0, combine: +)
    // Random number in the range 0.0 <= rnd < sum :
    let rnd = sum * Double(arc4random_uniform(UInt32.max)) / Double(UInt32.max)
    // Find the first interval of accumulated probabilities into which `rnd` falls:
    var accum = 0.0
    for (i, p) in probabilities.enumerate() {
        accum += p
        if rnd < accum {
            return i
        }
    }
    // This point might be reached due to floating point inaccuracies:
    return (probabilities.count - 1)
}
Is there a nicer, cleaner way to pick a random number with a certain probability when you have a large number of probabilities to consider?
Sure. Write a function that generates a number based on a table of probabilities. That's essentially what the switch statement you've pointed to is: a table defined in code. You could do the same thing with data using a table that's defined as a list of probabilities and outcomes:
probability    outcome
-----------    -------
0.4            1
0.2            2
0.1            3
0.15           4
0.15           5
Now you can pick a number between 0 and 1 at random. Starting from the top of the list, add up probabilities until you've exceeded the number you picked, and use the corresponding outcome. For example, let's say the number you pick is 0.6527637. Start at the top: 0.4 is smaller, so keep going. 0.6 (0.4 + 0.2) is smaller, so keep going. 0.7 (0.6 + 0.1) is larger, so stop. The outcome is 3.
I've kept the table short here for the sake of clarity, but you can make it as long as you like, and you can define it in a data file so that you don't have to recompile when the list changes.
Note that there's nothing particularly specific to Swift about this method -- you could do the same thing in C or Swift or Lisp.
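Since the method is language-agnostic, here is a minimal sketch of the table lookup in Python. The names TABLE and pick are made up for this illustration, and the optional r parameter exists only so the worked example above (0.6527637 gives outcome 3) can be reproduced deterministically.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# The table from the answer above as (probability, outcome) pairs.
TABLE = [(0.4, 1), (0.2, 2), (0.1, 3), (0.15, 4), (0.15, 5)]


def pick(table, r=None):
    """Return the outcome whose cumulative-probability interval contains r."""
    cumulative = list(accumulate(p for p, _ in table))
    if r is None:
        # Scale by the total so the probabilities need not sum to exactly 1.
        r = random.random() * cumulative[-1]
    # Index of the first cumulative sum that is strictly greater than r.
    idx = bisect_right(cumulative, r)
    # Guard against floating-point round-off at the very top of the range.
    idx = min(idx, len(table) - 1)
    return table[idx][1]


print(pick(TABLE, 0.6527637))  # 3
print(pick(TABLE))             # a random outcome from 1 to 5
```

With around 30 entries a linear scan would be just as good; the binary search only pays off for much longer tables, and the cumulative sums could be precomputed once if the table does not change.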
This seems like a good opportunity for a shameless plug to my small library, swiftstats:
https://github.com/r0fls/swiftstats
For example, this would generate one random value from a normal distribution with mean 0 and variance 1:
import SwiftStats
let n = SwiftStats.Distributions.Normal(0, 1.0)
print(n.random())
Supported distributions include: normal, exponential, binomial, etc...
It also supports fitting sample data to a given distribution, using the Maximum Likelihood Estimator for the distribution.
See the project readme for more info.
You could do it with exponential or quadratic functions - have x be your random number, take y as the new random number. Then, you just have to jiggle the equation until it fits your use case. Say I had (x^2)/10 + (x/300). Put your random number in, (as some floating-point form), and then get the floor with Int() when it comes out. So, if my random number generator goes from 0 to 9, I have a 40% chance of getting 0, and a 30% chance of getting 1 - 3, a 20% chance of getting 4 - 6, and a 10% chance of an 8. You're basically trying to fake some kind of normal distribution.
Here's an idea of what it would look like in Swift:
func giveY(_ x: UInt32) -> Int {
    let xD = Double(x)
    // Int() truncates, which acts as floor for these non-negative values
    return Int(xD * xD / 10 + xD / 300)
}
let ans = giveY(arc4random_uniform(10))
EDIT:
I wasn't very clear above. What I meant was that you could replace the switch statement with a function that returns numbers with a probability distribution you work out by regression, using Wolfram or something similar. So, for the question you linked to, you could do something like this:
import Foundation
func returnLevelChange() -> Double {
    return 0.06 * exp(0.4 * Double(arc4random_uniform(10))) - 0.1
}
// The result is a relative change, so scale the old level by (1 + change):
newItemLevel = oldItemLevel * (1 + returnLevelChange())
So that function returns a double somewhere between -0.04 and 2.1. That would be your "x% worse/better than current item level" figure. But, since it's an exponential function, it won't return an even spread of numbers. The arc4random_uniform(10) returns an int from 0 to 9, and each of those would result in a double like this:
0: -0.04
1: -0.01
2: 0.03
3: 0.1
4: 0.2
5: 0.34
6: 0.56
7: 0.89
8: 1.37
9: 2.1
Since each of those ints from the arc4random_uniform has an equal chance of showing up, you get probabilities like this:
40% chance of -0.04 to 0.1 (~ -5% - 10%)
30% chance of 0.2 to 0.56 (~ 20% - 55%)
20% chance of 0.89 to 1.37 (~ 90% - 140%)
10% chance of 2.1 (~ 200%)
Which is something similar to the probabilities that other person had. Now, for your function, it's much more difficult, and the other answers are almost definitely more applicable and elegant. BUT you could still do it.
Arrange the letters in order of their probability, from largest to smallest. Then get their cumulative sums, starting with 0 and leaving out the last (so probabilities of 50%, 30%, 20% become 0, 0.5, 0.8). Then multiply them up until they're integers with reasonable accuracy (0, 5, 8). Then plot them: your cumulative probabilities are your x's, and the things you want to select with a given probability (your letters) are your y's. (You obviously can't plot actual letters on the y axis, so you'd just plot their indices in some array.) Then you'd try to find some regression there, and have that be your function. For instance, trying those numbers, I got
e^0.14x - 1
and this:
let letters: [Character] = ["a", "b", "c"]
func randLetter() -> Character {
    return letters[Int(exp(0.14 * Double(arc4random_uniform(10))) - 1)]
}
returns "a" 50% of the time, "b" 30% of the time, and "c" 20% of the time. Obviously this is pretty cumbersome for more letters, it would take a while to figure out the right regression, and if you wanted to change the weightings you'd have to do it manually. BUT if you did find a nice equation that fit your values, the actual function would only be a couple of lines long, and fast.
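Because the random input takes only ten equally likely values, the claimed 50/30/20 split can be checked by enumerating them instead of sampling. A minimal sketch in Python, where the names LETTERS and letter_for are made up for this illustration:

```python
import math
from collections import Counter

LETTERS = ["a", "b", "c"]


def letter_for(x):
    """Map an integer draw x (0 through 9) to a letter via int(e^(0.14x) - 1)."""
    return LETTERS[int(math.exp(0.14 * x) - 1)]


# Each x from 0 to 9 is equally likely, so the counts are tenths of probability.
counts = Counter(letter_for(x) for x in range(10))
print(counts)  # Counter({'a': 5, 'b': 3, 'c': 2})
```

The same enumeration is a quick way to test any candidate regression before using it.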
I want to combine three plots on one graph. The data is the built-in R dataset "nottem". Can someone help me write code to put the seasonal means model, the harmonic (cosine) model, and the time series plot together using different colors? I have already written the model code; I just don't know how to combine them for comparison.
Code:
library(TSA)
nottem
month.=season(nottem)
model=lm(nottem~month.-1)
summary(model)
har.=harmonic(nottem,1)
model1=lm(nottem~har.)
summary(model1)
plot(nottem,type="l",ylab="Average monthly temperature at Nottingham castle")
points(y=nottem,x=time(nottem), pch=as.vector(season(nottem)))
Just put your time series inside a matrix:
x = cbind(serie1 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2)),
serie2 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2)))
plot(x)
Or configure the plot region:
par(mfrow = c(2, 1)) # 2 rows, 1 column
serie1 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2))
serie2 = ts(cumsum(rnorm(100)), freq = 12, start = c(2013, 2))
require(zoo)
plot(serie1)
lines(rollapply(serie1, width = 10, FUN = mean), col = 'red')
plot(serie2)
lines(rollapply(serie2, width = 10, FUN = mean), col = 'blue')
hope it helps.
PS.: zoo package is not needed in this example, you could use the filter function.
You can extract the seasonal mean with:
s.mean = tapply(serie, cycle(serie), mean)
# January, assuming serie is monthly data
print(s.mean[1])
This graph is pretty hard to read, because your three sets of values are so similar. Still, if you simply want to graph all of these on the same plot, you can do it pretty easily by using the coefficients generated by your models.
Step 1: Plot the raw data. This comes from your original code.
plot(nottem,type="l",ylab="Average monthly temperature at Nottingham castle")
Step 2: Set up x-values for the mean and cosine plots.
x <- seq(1920, (1940 - 1/12), by=1/12)
Step 3: Plot the seasonal means by repeating the coefficients from the first model.
lines(x=x, y=rep(model$coefficients, 20), col="blue")
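Step 4: Calculate the y-values for the harmonic function using the coefficients from the second model, and then plot. harmonic(nottem, 1) produces both a cosine and a sine column, so the fit has an intercept plus two coefficients.
b <- model1$coefficients
y <- b[1] + b[2] * cos(2 * pi * x) + b[3] * sin(2 * pi * x)
lines(x=x, y=y, col="red")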
ggplot variant: If you decide to switch to the popular 'ggplot2' package for your plot, you would do it like so (melt comes from the 'reshape2' package):
library(ggplot2)
library(reshape2)
x <- seq(1920, (1940 - 1/12), by=1/12)
y.seas.mean <- rep(model$coefficients, 20)
b <- model1$coefficients
y.har.cos <- b[1] + b[2] * cos(2 * pi * x) + b[3] * sin(2 * pi * x)
plot_Data <- melt(data.frame(x=x, temp=as.numeric(nottem), seas.mean=y.seas.mean, har.cos=y.har.cos), id="x")
ggplot(plot_Data, aes(x=x, y=value, col=variable)) + geom_line()
I'm experimenting with software-defined radio concepts. From this article I've tried to implement a GPU-parallel Discrete Fourier Transform.
I'm pretty sure I could pre-calculate 90 degrees of the sin(i) cos(i) and then just flip and repeat rather than what I'm doing in this code and that that would speed it up. But so far, I don't even think I'm getting correct answers. An all-zeros input gives a 0 result as I'd expect, but all 0.5 as inputs gives 78.9985886f (I'd expect a 0 result in this case too). Basically, I'm just generally confused. I don't have any good input data and I don't know what to do with the result or how to verify it.
This question is related to my other post here
open Microsoft.ParallelArrays
open System
// X64MulticoreTarget is faster on my machine, unexpectedly
let target = new DX9Target() // new X64MulticoreTarget()
ignore(target.ToArray1D(new FloatParallelArray([| 0.0f |]))) // Dummy operation to warm up the GPU
let stopwatch = new System.Diagnostics.Stopwatch() // For benchmarking
let Hz = 50.0f
let fStep = (2.0f * float32(Math.PI)) / Hz
let shift = 0.0f // offset, once we have to adjust for the last batch of samples of a stream
// If I knew that the periodic function is periodic
// at whole-number intervals, I think I could keep
// shift within a smaller range to support streams
// without overflowing shift - but I haven't
// figured that out
//let elements = 8192 // maximum for a 1D array - makes sense as 2^13
//let elements = 7240 // maximum on my machine for a 2D array, but why?
let elements = 7240
// need good data!!
let buffer : float32[,] = Array2D.init<float32> elements elements (fun i j -> 0.5f) //(float32(i * elements) + float32(j)))
let input = new FloatParallelArray(buffer)
let seqN : float32[,] = Array2D.init<float32> elements elements (fun i j -> (float32(i * elements) + float32(j)))
let steps = new FloatParallelArray(seqN)
let shiftedSteps = ParallelArrays.Add(shift, steps)
let increments = ParallelArrays.Multiply(fStep, steps)
let cos_i = ParallelArrays.Cos(increments) // Real component series
let sin_i = ParallelArrays.Sin(increments) // Imaginary component series
stopwatch.Start()
// From the documentation, I think ParallelArrays.Multiply does standard element by
// element multiplication, not matrix multiplication
// Then we sum each element for each complex component (I don't understand the relationship
// of this, or the importance of the generalization to complex numbers)
let real = target.ToArray1D(ParallelArrays.Sum(ParallelArrays.Multiply(input, cos_i))).[0]
let imag = target.ToArray1D(ParallelArrays.Sum(ParallelArrays.Multiply(input, sin_i))).[0]
printf "%A in " ((real * real) + (imag * imag)) // sum the squares for the presence of the frequency
stopwatch.Stop()
printfn "%A" stopwatch.ElapsedMilliseconds
ignore (System.Console.ReadKey())
I share your surprise that your answer is not closer to zero. I'd suggest writing naive code to perform your DFT in F# and seeing if you can track down the source of the discrepancy.
Here's what I think you're trying to do:
let N = 7240
let F = 1.0f/50.0f
let pi = single System.Math.PI
let signal = [| for i in 1 .. N*N -> 0.5f |]
let real =
    seq { for i in 0 .. N*N-1 -> signal.[i] * (cos (2.0f * pi * F * (single i))) }
    |> Seq.sum
let img =
    seq { for i in 0 .. N*N-1 -> signal.[i] * (sin (2.0f * pi * F * (single i))) }
    |> Seq.sum
let power = real*real + img*img
Hopefully you can use this naive code to get a better intuition for how the accelerator code ought to behave, which could guide you in your testing of the accelerator code. Keep in mind that part of the reason for the discrepancy may simply be the precision of the calculations - there are ~52 million elements in your arrays, so accumulating a total error of 79 may not actually be too bad. FWIW, I get a power of ~0.05 when running the above single precision code, but a power of ~4e-18 when using equivalent code with double precision numbers.
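On the question of what input data to test with, two signals have a known answer for this single-frequency sum: a constant (power close to zero when the length is a whole number of periods) and a cosine at exactly the probed frequency (power close to (N/2)^2). The following is a minimal sketch in Python using double precision; the function name bin_power and the small length N = 5000 are assumptions chosen for this illustration, not values from the question.

```python
import math


def bin_power(signal, cycles_per_sample):
    """Power of one frequency component: real^2 + imag^2 of the correlation sums."""
    real = math.fsum(x * math.cos(2 * math.pi * cycles_per_sample * k)
                     for k, x in enumerate(signal))
    imag = math.fsum(x * math.sin(2 * math.pi * cycles_per_sample * k)
                     for k, x in enumerate(signal))
    return real * real + imag * imag


F = 1.0 / 50.0  # same frequency as the naive F# code above
N = 5000        # assumed length: exactly 100 periods of 50 samples

constant = [0.5] * N
tone = [math.cos(2 * math.pi * F * k) for k in range(N)]

print(bin_power(constant, F))  # very close to 0
print(bin_power(tone, F))      # very close to (N / 2) ** 2
```

If the accelerator version agrees with these two cases at a small size, the remaining discrepancy at 7240 x 7240 is more likely accumulated single-precision error than a logic bug.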
Two suggestions:
ensure you're not somehow confusing degrees with radians
try doing it sans-parallelism, or just with F#'s asyncs for parallelism
(In F#, if you have an array of floats
let a : float[] = ...
then you can 'add a step to all of them in parallel' to produce a new array with
let aShift = a |> Array.map (fun x -> async { return x + shift })
             |> Async.Parallel |> Async.RunSynchronously
(though I expect this might be slower than just doing a synchronous loop).)