Bad access, multi-threading, GCD, Swift - iOS

I am trying to translate some sample code from Objective-C into Swift.
I got it all working except for the multithreading part, which is crucial to this simulation.
For some reason, when I start using multiple threads, I get bad access errors, specifically when getting or setting elements of the arrays.
This class is instantiated inside a static class.
var screenWidthi:Int = 0
var screenHeighti:Int = 0
var poolWidthi:Int = 0
var poolHeighti:Int = 0
var rippleSource:[GLfloat] = []
var rippleDest:[GLfloat] = []
func update()
{
    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
    dispatch_apply(Int(poolHeighti), queue, {(y: size_t) -> Void in
        //for y in 0..<poolHeighti
        //{
        let pw = self.poolWidthi
        for x in 1..<(pw - 1)
        {
            let ai:Int = (y ) * (pw + 2) + x + 1
            let bi:Int = (y + 2) * (pw + 2) + x + 1
            let ci:Int = (y + 1) * (pw + 2) + x
            let di:Int = (y + 1) * (pw + 2) + x + 2
            let me:Int = (y + 1) * (pw + 2) + x + 1
            let a = self.rippleSource[ai]
            let b = self.rippleSource[bi]
            let c = self.rippleSource[ci]
            let d = self.rippleSource[di]
            var result = (a + b + c + d) / 2.0 - self.rippleDest[me]
            result -= result / 32.0
            self.rippleDest[me] = result
        }
    }
    )
}
It is important to note that there is also another loop that should run on a different thread right after this one; it accesses the same arrays. That said, the bad access still occurs without running the second loop on another thread, so I feel it is irrelevant to show.
Could you please tell me what causes this crash to happen at seemingly random times rather than the first time through?
For reference, here is what it looked like in Objective-C:
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_apply(poolHeight, queue, ^(size_t y) {
    for (int x=0; x<poolWidth; x++)
    {
        float a = rippleSource[(y)*(poolWidth+2) + x+1];
        float b = rippleSource[(y+2)*(poolWidth+2) + x+1];
        float c = rippleSource[(y+1)*(poolWidth+2) + x];
        float d = rippleSource[(y+1)*(poolWidth+2) + x+2];
        float result = (a + b + c + d)/2.f - rippleDest[(y+1)*(poolWidth+2) + x+1];
        result -= result/32.f;
        rippleDest[(y+1)*(poolWidth+2) + x+1] = result;
    }
});
How do you ensure that variables can be accessed from different threads? How about static members?
I only know how to print out the call stack before the app crashes; after the crash, the only way I know to get to the call stack is to look at the threads. Let me know if there is a different way I should do this.
NOTE: I noticed something weird. I put a print statement in each loop so I could see which x and y coordinates it was processing, to check whether the crash was consistent. Obviously that brought the frame rate down to well under 1 fps, but I did notice it has yet to crash. The program is running perfectly so far without any bad access, just at under 1 fps.

The Apple code uses a C-style array; these are "thread safe" when used appropriately, as the Apple code does.
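As a rough illustration of what "used appropriately" can mean from Swift, here is a minimal sketch, assuming Swift 3 or later with the Dispatch overlay. The free function `rippleStep` and its parameter names are hypothetical and are not taken from the Apple sample. It gives each iteration raw pointers into the two buffers and relies on the fact that iteration y writes only row y + 1 of the destination.

```swift
import Dispatch

// Hypothetical free function; the layout (pool plus a one-cell border) follows the question.
func rippleStep(source: [Float], dest: inout [Float], poolWidth: Int, poolHeight: Int) {
    let rowLength = poolWidth + 2
    precondition(source.count == rowLength * (poolHeight + 2) && dest.count == source.count)
    source.withUnsafeBufferPointer { src in
        dest.withUnsafeMutableBufferPointer { dst in
            let out = dst // copy of the pointer value, shared by all iterations
            // Iteration y reads src freely but writes only row y + 1 of out,
            // so no two iterations ever write the same element.
            DispatchQueue.concurrentPerform(iterations: poolHeight) { y in
                for x in 0..<poolWidth {
                    let me = (y + 1) * rowLength + x + 1
                    let a = src[y * rowLength + x + 1]
                    let b = src[(y + 2) * rowLength + x + 1]
                    let c = src[me - 1]
                    let d = src[me + 1]
                    var result = (a + b + c + d) / 2 - out[me]
                    result -= result / 32
                    out[me] = result
                }
            }
        }
    }
}

// Usage: a 4 x 3 pool with a one-cell border on every side.
let w = 4, h = 3
let src = (0..<(w + 2) * (h + 2)).map { Float($0) }
var dst = [Float](repeating: 0, count: src.count)
rippleStep(source: src, dest: &dst, poolWidth: w, poolHeight: h)
```

The pointers are only valid inside the two closures, which is why the whole parallel loop sits inside them.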
Swift and Objective-C arrays are not thread-safe, and this is the cause of your issues. You need to implement some form of access control for the array.
A simple method is to associate a GCD serial queue with each array; to write to the array, dispatch asynchronously to this queue, and to read, dispatch synchronously. This is simple but reduces concurrency.
Mike Ash is good if you need to understand the issues, and for Swift code you can look at this question; read all the answers and comments.
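The queue-per-array idea might look roughly like this. It is a minimal sketch, assuming Swift 3 to 5 language mode; the `GuardedArray` type, its method names and the queue label are hypothetical and only illustrate the pattern.

```swift
import Dispatch

// Hypothetical wrapper: one serial queue guards one array.
final class GuardedArray {
    private var storage: [Float]
    private let queue = DispatchQueue(label: "example.guarded-array") // serial by default

    init(repeating value: Float, count: Int) {
        storage = [Float](repeating: value, count: count)
    }

    // Reads block the caller until every write queued earlier has run.
    subscript(index: Int) -> Float {
        return queue.sync { self.storage[index] }
    }

    // Writes are queued and return immediately.
    func set(_ value: Float, at index: Int) {
        queue.async { self.storage[index] = value }
    }
}

let cells = GuardedArray(repeating: 0, count: 8)
DispatchQueue.concurrentPerform(iterations: 8) { i in
    cells.set(Float(i) * 2, at: i)
}
print(cells[3]) // this sync read is queued behind all eight writes
```

Every access now funnels through one queue, which is the loss of concurrency mentioned above; for a per-pixel simulation loop that cost is likely to be significant.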
HTH

Related

Under what circumstances would Julia allocate memory to single digits?

Suppose I write this function
function test_function(T)
    c = 1
    d = 31
    q = 321
    b = 32121
    a = 10
    for i in 1:T
        c = d + q + b + a
    end
end
There will be no memory allocation. However, in my own code, I wrote a similar loop, but I encounter a huge amount of memory allocation. I can't share the entirety of my code, but when I used --track-allocation=user, I see the following results
80000 q = 3
- p = 0.1
- p_2 = 3
- q_2 = .2
-
240000 r = p - p_2 + q_2 - q;
The code above is in a for loop. This is just strange to me - why would Julia ever allocate memory to single digits?

Why does memkind and numactl improve program performance a lot?

According to the course https://www.coursera.org/learn/parallelism-ia/home/welcome, one example illustrates the improvement from the memkind API by using hbw_posix_memalign((void**)&pixel, 64, sizeof(P)*width*height); I think this API only provides aligned memory allocation, so I do not understand why it improves the GFLOP/s so much.
The relevant part of the code is as follows. Here the memory that stores img_in is allocated through the memkind API.
template<typename P>
void ApplyStencil(ImageClass<P> & img_in, ImageClass<P> & img_out) {
    const int width = img_in.width;
    const int height = img_in.height;
    P * in = img_in.pixel;
    P * out = img_out.pixel;
#pragma omp parallel for
    for (int i = 1; i < height-1; i++)
#pragma omp simd
        for (int j = 1; j < width-1; j++) {
            P val = -in[(i-1)*width + j-1] - in[(i-1)*width + j] - in[(i-1)*width + j+1]
                    -in[(i )*width + j-1] + 8*in[(i )*width + j] - in[(i )*width + j+1]
                    -in[(i+1)*width + j-1] - in[(i+1)*width + j] - in[(i+1)*width + j+1];
            val = (val < 0 ? 0 : val);
            val = (val > 255 ? 255 : val);
            out[i*width + j] = val;
        }
}
My questions are as follows:
Is it only because fewer memory operations are needed to fetch the data that performance improves almost 5 times?
As for numactl, according to the Linux documentation it allows us to bind processes to specific nodes or CPUs. When we use the command numactl -m 1, we get a 5 times improvement. I am not sure whether the improvement is related to NUMA communication delay.

Multiplication table in Swift ios

I am learning how to make a multiplication table in Swift and used this code:
override func viewDidLoad() {
    let n = Int(str)!
    while (i<=10) {
        let st = "\(n) * \(i) = \(n * i)"
        lbl.text = st
        i += 1
    }
}
I have a label in which I want to show the table, but the problem is that only the last result, say 2 * 10 = 20, is showing and not all the other values. I am confused about what to do; please help me display all the values.
Glad you've decided to learn Swift. You're on the right track, but as others have said, your final iteration of the loop is replacing the contents of lbl.text.
There are many ways to achieve what you want, but for problems like this I'd suggest starting off working in a playground rather than worrying about labels, viewDidLoad and suchlike.
Here's a nice Swift-y way to do what you want
let n = 12
let table = Array(0...10).map({"\(n) * \($0) = \(n * $0)"}).joinWithSeparator("\n")
print("\(table)")
Gives…
12 * 0 = 0
12 * 1 = 12
12 * 2 = 24
12 * 3 = 36
12 * 4 = 48
12 * 5 = 60
12 * 6 = 72
12 * 7 = 84
12 * 8 = 96
12 * 9 = 108
12 * 10 = 120
To break that down…
// Take numbers 0 to 10 and make an array
Array(0...10).
// use the map function to convert each member of the array to a string
// $0 represents each value in turn.
// The result is an array of strings
map({"\(n) * \($0) = \(n * $0)"}).
// Join all the members of your `String` array with a newline character
joinWithSeparator("\n")
Try it for yourself. In Xcode, File -> New -> Playground, and just paste in that code. Good luck!
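Note that joinWithSeparator is the Swift 2 spelling; from Swift 3 onward the method is called joined(separator:). A minimal sketch of the same table, assuming Swift 3 or later:

```swift
let n = 12
// map works directly on the range, so the Array(...) wrapper is optional here.
let table = (0...10).map { "\(n) * \($0) = \(n * $0)" }.joined(separator: "\n")
print(table)
```

To show this in the label rather than the console, the resulting string could be assigned to the label's text, provided the label is allowed to use multiple lines.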
That's because every time the loop iterates, it overwrites the previous value in label.text. You need to append the new value to existing string value in label, as suggested by RichardG.
let n = Int(str)!
while (i<=10) {
    let st = "\(n) * \(i) = \(n * i)"
    lbl.text = (lbl.text ?? "") + " " + st //Append new value to already existing value in label.text (text is optional, so unwrap it first)
    i += 1
}
There is also a possible UI issue. You have to set the number of lines on the label or it will mess up the display. The label also needs to be large enough to hold your data. A better option would be a UITextView, which is scrollable, if you are unwilling to handle the label's height and width yourself. But if you want to stick with UILabel, the following code (Objective-C) will resize the label to fit its text:
lbl.numberOfLines = 0; //You only need to call this once. Maybe in `viewDidLoad` or Storyboard itself.
lbl.text = @"Some long long long text"; //here you set the text to label
[lbl sizeToFit]; //You must call this method after setting text to label
You can also handle that with Auto Layout constraints.
Easy way to do it with Swift 2.0:
var tableOf = 2 //Change the table you want
for index in 1...10 {
    print("\(tableOf) X \(index) = \(index * tableOf)")
}
A repeat-while loop performs a single pass through the loop block before considering the loop's condition (exactly what a do-while loop does).
1) repeat...while Loop
var i = Int()
repeat {
    print("\(i) * 11 = \(i * 11)")
    i += 1
} while i <= 11
2) While Loop
var i = Int()
while i <= 11
{
    print("\(i) * 11 = \(i * 11)")
    i += 1
}
3) For Loop
for n in 1..<11
{
    print("\(n) * 10 = \(n * 10)")
}

F# recursive function in strange endless loop

I am very green when it comes to F#, and I have run across a small issue dealing with recursive functions that I was hoping someone could help me understand.
I have a function that is supposed to spit out the next even number:
let rec nextEven(x) =
    let y = x + 1
    if y % 2 = 0 then y
    else nextEven y
// This never returns..
nextEven 3;;
I use the 'rec' keyword so that it will be recursive, although when I use it, it will just run in an endless loop for some reason. If I rewrite the function like this:
let nextEven(x) =
    let y = x + 1
    if y % 2 = 0 then y
    else nextEven y
Then everything works fine (no rec keyword). For some reason I thought I needed 'rec' since the function is recursive (so why don't I?), and why does the first version of the function run forever?
EDIT
Turns out this was a total noob mistake. I had created multiple definitions of the function along the way, as is explained in the comments + answers.
I suspect you have multiple definitions of nextEven. That's the only explanation for your second example compiling. Repro:
module A =
    let rec nextEven(x) =
        let y = x + 1
        if y % 2 = 0 then y
        else nextEven y
open A //the function below will not compile without this
let nextEven(x) =
    let y = x + 1
    if y % 2 = 0 then y
    else nextEven y //calling A.nextEven
Try resetting your FSI session.

Unwrapping nested loops in F#

I've been struggling with the following code. It's an F# implementation of the Forward-Euler algorithm used for modelling stars moving in a gravitational field.
let force (b1:Body) (b2:Body) =
    let r = (b2.Position - b1.Position)
    let rm = (float32)r.MagnitudeSquared + softeningLengthSquared
    if (b1 = b2) then
        VectorFloat.Zero
    else
        r * (b1.Mass * b2.Mass) / (Math.Sqrt((float)rm) * (float)rm)
member this.Integrate(dT, (bodies:Body[])) =
    for i = 0 to bodies.Length - 1 do
        for j = (i + 1) to bodies.Length - 1 do
            let f = force bodies.[i] bodies.[j]
            bodies.[i].Acceleration <- bodies.[i].Acceleration + (f / bodies.[i].Mass)
            bodies.[j].Acceleration <- bodies.[j].Acceleration - (f / bodies.[j].Mass)
        bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT
        bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
While this works, it isn't exactly "functional". It also suffers from horrible performance: it's 2.5 times slower than the equivalent C# code. bodies is an array of structs of type Body.
The thing I'm struggling with is that force() is an expensive function, so usually you calculate it once for each pair and rely on the fact that Fij = -Fji. But this really messes up any loop unfolding etc.
Suggestions gratefully received! No, this isn't homework...
Thanks,
Ade
UPDATED: To clarify, Body and VectorFloat are defined as C# structs. This is because the program interops between F#/C# and C++/CLI. Eventually I'm going to get the code up on BitBucket, but it's a work in progress and I have some issues to sort out before I can put it up.
[StructLayout(LayoutKind.Sequential)]
public struct Body
{
    public VectorFloat Position;
    public float Size;
    public uint Color;
    public VectorFloat Velocity;
    public VectorFloat Acceleration;
    '''
}
[StructLayout(LayoutKind.Sequential)]
public partial struct VectorFloat
{
    public System.Single X { get; set; }
    public System.Single Y { get; set; }
    public System.Single Z { get; set; }
}
The vector defines the sort of operators you'd expect for a standard Vector class. You could probably use the Vector3D class from the .NET framework for this case (I'm actually investigating cutting over to it).
UPDATE 2: Improved code based on the first two replies below:
for i = 0 to bodies.Length - 1 do
    for j = (i + 1) to bodies.Length - 1 do
        let r = ( bodies.[j].Position - bodies.[i].Position)
        let rm = (float32)r.MagnitudeSquared + softeningLengthSquared
        let f = r / (Math.Sqrt((float)rm) * (float)rm)
        bodies.[i].Acceleration <- bodies.[i].Acceleration + (f * bodies.[j].Mass)
        bodies.[j].Acceleration <- bodies.[j].Acceleration - (f * bodies.[i].Mass)
    bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT
    bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
The branch in the force function to cover the b1 == b2 case is the worst offender. You don't need this if softeningLength is always non-zero, even if it's very small (Epsilon). This optimization was in the C# code but not the F# version (doh!).
Math.Pow(x, -1.5) seems to be a lot slower than 1 / (Math.Sqrt(x) * x). Essentially this algorithm is slightly odd in that its performance is dictated by the cost of this one step.
Moving the force calculation inline and getting rid of some divides also gives some improvement, but the performance was really being killed by the branching and is dominated by the cost of Sqrt.
WRT using classes over structs: There are cases (CUDA and native C++ implementations of this code and a DX9 renderer) where I need to get the array of bodies into unmanaged code or onto a GPU. In these scenarios being able to memcpy a contiguous block of memory seems like the way to go. Not something I'd get from an array of class Body.
I'm not sure if it's wise to rewrite this code in a functional style. I've seen some attempts to write pair interaction calculations in a functional manner and each one of them was harder to follow than two nested loops.
Before looking at structs vs. classes (I'm sure someone else has something smart to say about this), maybe you can try optimizing the calculation itself?
You're calculating two acceleration deltas, let's call them dAi and dAj:
dAi = r*m1*m2/(rm*sqrt(rm)) / m1
dAj = r*m1*m2/(rm*sqrt(rm)) / m2
[note: m1 = bodies.[i].Mass, m2 = bodies.[j].Mass]
The division by mass cancels out like this:
dAi = r*m2 / (rm*sqrt(rm))
dAj = r*m1 / (rm*sqrt(rm))
Now you only have to calculate r/(rm*sqrt(rm)) for each pair (i,j).
This can be optimized further, because 1/(rm*sqrt(rm)) = 1/(rm^1.5) = rm^-1.5, so you could let r' = r * (rm ** -1.5). (Edit: no it can't, that's premature optimization talking right there; see comment. Calculating r' = r / (rm * sqrt(rm)) is fastest.) Then:
dAi = m2 * r'
dAj = m1 * r'
Your code would then become something like
member this.Integrate(dT, (bodies:Body[])) =
    for i = 0 to bodies.Length - 1 do
        for j = (i + 1) to bodies.Length - 1 do
            let r = (bodies.[j].Position - bodies.[i].Position)
            let rm = (float32)r.MagnitudeSquared + softeningLengthSquared
            let r' = r * (rm ** -1.5f)
            bodies.[i].Acceleration <- bodies.[i].Acceleration + r' * bodies.[j].Mass
            bodies.[j].Acceleration <- bodies.[j].Acceleration - r' * bodies.[i].Mass
        bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT
        bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
Look, ma, no more divisions!
Warning: untested code. Try at your own risk.
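To sanity-check the cancellation outside the F# project, here is a minimal sketch in Swift rather than F#. The `Vec3` and `Body` types and the function name are hypothetical stand-ins for the thread's C# structs, and only the acceleration step is modelled, not the position and velocity updates.

```swift
// Hypothetical minimal types; the thread's Body and VectorFloat are C# structs.
struct Vec3 {
    var x: Float, y: Float, z: Float
    var magnitudeSquared: Float { return x * x + y * y + z * z }
    static func + (a: Vec3, b: Vec3) -> Vec3 { return Vec3(x: a.x + b.x, y: a.y + b.y, z: a.z + b.z) }
    static func - (a: Vec3, b: Vec3) -> Vec3 { return Vec3(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z) }
    static func * (a: Vec3, s: Float) -> Vec3 { return Vec3(x: a.x * s, y: a.y * s, z: a.z * s) }
}

struct Body {
    var position: Vec3
    var mass: Float
    var acceleration: Vec3
}

// Visits each unordered pair once and reuses r / rm^1.5 for both bodies.
func accumulateAccelerations(_ bodies: inout [Body], softeningSquared: Float) {
    for i in 0..<bodies.count {
        for j in (i + 1)..<bodies.count {
            let r = bodies[j].position - bodies[i].position
            let rm = r.magnitudeSquared + softeningSquared
            let scaled = r * (1 / (rm * rm.squareRoot()))
            bodies[i].acceleration = bodies[i].acceleration + scaled * bodies[j].mass
            bodies[j].acceleration = bodies[j].acceleration - scaled * bodies[i].mass
        }
    }
}

let zero = Vec3(x: 0, y: 0, z: 0)
var bodies = [
    Body(position: Vec3(x: 0, y: 0, z: 0), mass: 2, acceleration: zero),
    Body(position: Vec3(x: 2, y: 0, z: 0), mass: 4, acceleration: zero),
]
accumulateAccelerations(&bodies, softeningSquared: 0)
print(bodies[0].acceleration.x, bodies[1].acceleration.x)
```

With these two bodies the mass-weighted accelerations sum to zero, which is the Fij = -Fji property the pairwise loop relies on.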
I'd like to play around with your code, but it's difficult since the definitions of Body and VectorFloat are missing, and they also seem to be missing from the original blog post you point to.
I'd hazard a guess that you could improve your performance and rewrite in a more functional style using F#'s lazy computations:
http://msdn.microsoft.com/en-us/library/dd233247(VS.100).aspx
The idea is fairly simple: you wrap any expensive computation that could be repeatedly calculated in a lazy ( ... ) expression, then you can force the computation as many times as you like and it will only ever be calculated once.
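The compute-once behaviour can be illustrated with a minimal sketch in Swift, whose lazy stored properties are a rough analogue of F#'s lazy ( ... ) and not the same API. The `PairTerm` type and its members are hypothetical names chosen for the example.

```swift
// Hypothetical holder type; Swift's `lazy var` stands in for F#'s lazy ( ... ).
final class PairTerm {
    let rm: Double
    private(set) var evaluations = 0

    init(rm: Double) { self.rm = rm }

    // The closure runs on first access only; later reads reuse the stored result.
    lazy var factor: Double = {
        self.evaluations += 1
        return 1 / (self.rm * self.rm.squareRoot())
    }()
}

let term = PairTerm(rm: 4)
let first = term.factor
let second = term.factor
print(first, second, term.evaluations)
```

Whether this helps in the n-body loop depends on how often the same pair term is actually requested; in the nested loop above each pair is already visited only once.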

Resources