How to integrate in Swift using vDSP (iOS)

I'm trying to find a replacement for SciPy's cumtrapz function in Swift. I found something called vDSP_vtrapzD but I have no idea how to use it. This is what I've done so far:
import Accelerate
var f1: [Double] = [<some data>]
var tdata: [Double] = [<time vector>]
var output = [Double](unsafeUninitializedCapacity:Int(f1.count), initializingWith: {_, _ in})
vDSP_vtrapzD(&f1, 1, &tdata, &output, 1, vDSP_Length(f1.count))

You're close, but you're using Array.init(unsafeUninitializedCapacity:initializingWith:) incorrectly. From its documentation:
Discussion
Inside the closure, set the initializedCount parameter to the number of elements that are initialized by the closure. The memory in the range buffer[0..<initializedCount] must be initialized at the end of the closure’s execution, and the memory in the range buffer[initializedCount...] must be uninitialized. This postcondition must hold even if the initializer closure throws an error.
This API is a more unsafe (but more performant) counterpart to Array.init(repeating:count:), which allocates an array of a fixed size and spends the time to initialize all its contents. That approach has two potential drawbacks:
If the purpose of the array is to provide a buffer to write a result into, then initializing it prior to that is redundant and wasteful
If the result you put into that buffer ends up being smaller than your array, you need to remember to manually "trim" the excess off by copying it into a new array.
Array.init(unsafeUninitializedCapacity:initializingWith:) improves upon this by:
Asking you for the maximum capacity you might possibly need
Giving you a temporary buffer with the capacity
Importantly, it's uninitialized. This makes it faster, but also more dangerous (you risk reading uninitialized memory) if used incorrectly.
You then tell it exactly how much of that temporary buffer you actually used
It will automatically copy that much of the buffer into the final array, and return that as the result.
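As a tiny standalone illustration of that flow (the numbers here are arbitrary, not from the question):
import Accelerate

let squares = Array<Int>(unsafeUninitializedCapacity: 10) { buffer, count in
    for i in 0..<5 {
        buffer[i] = i * i // initialize only the first five slots
    }
    count = 5 // keep just the initialized prefix; the rest is discarded
}
// squares == [0, 1, 4, 9, 16]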
You're using Array.init(unsafeUninitializedCapacity:initializingWith:) as if it were Array.init(repeating:count:). To use it correctly, you would put your initialization logic inside the initializer parameter, like so:
let result = Array<Double>(unsafeUninitializedCapacity: f1.count, initializingWith: { resultBuffer, count in
    assert(f1.count == tdata.count)
    vDSP_vtrapzD(
        &f1,                       // Double-precision real input vector.
        1,                         // Address stride for A.
        &tdata,                    // Pointer to double-precision real input scalar: the step size.
        resultBuffer.baseAddress!, // Double-precision real output vector.
        1,                         // Address stride for C.
        vDSP_Length(f1.count)      // The number of elements to process.
    )
    count = f1.count // Tells Swift how many elements of the buffer to copy into the resulting Array.
})

FYI, there's a nice Swift version of vDSP_vtrapzD that you can see here: https://developer.apple.com/documentation/accelerate/vdsp/integration_functions. The variant that returns the result uses the unsafeUninitializedCapacity initializer.
On a related note, there's also a nice Swift API to Quadrature: https://developer.apple.com/documentation/accelerate/quadrature-smu
simon

I've been grappling with vDSP in the last week too! Alexander's solution above was helpful. However, I'd like to add a couple of things.
vDSP_vtrapzD only allows you to integrate an array of values against a fixed step increment, which is a scalar value. The function doesn't allow you to pass a varying time value for each increment.
The example solution confused me a little, as it checks that the time array is the same size as the data array. This is not necessary, as only the first value in the time vector will be used by vDSP_vtrapzD.
It's a shame that vDSP_vtrapzD only takes a scalar as a fixed time step. In my experience this doesn't reflect reality when working with time-based data from sensors that don't emit data at precisely the same increments.
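To make both points concrete, here is a minimal sketch (the sample data and the cumulativeTrapezoid helper are hypothetical, not from the posts above). vDSP_vtrapzD reads the step size from a pointer to a single scalar; for non-uniform timestamps, a hand-rolled cumulative trapezoid is one fallback:
import Accelerate

// Uniform sampling: the third argument is a pointer to one scalar step size.
let samples: [Double] = [0, 1, 4, 9, 16]
var step = 0.5 // fixed time increment between samples
var integrated = [Double](repeating: 0, count: samples.count)
vDSP_vtrapzD(samples, 1, &step, &integrated, 1, vDSP_Length(samples.count))

// Non-uniform sampling: accumulate trapezoid areas with per-sample time deltas.
func cumulativeTrapezoid(_ values: [Double], times: [Double]) -> [Double] {
    precondition(values.count == times.count)
    var result = [Double](repeating: 0, count: values.count)
    for i in 1..<values.count {
        let dt = times[i] - times[i - 1]
        result[i] = result[i - 1] + 0.5 * (values[i] + values[i - 1]) * dt
    }
    return result
}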

Related

How do I find how much memory is taken by a struct?

Simple question: is there a way to find out how much memory is taken by a particular struct?
Ideally I would like it printed to the console.
Edit: Krumelur came up with a simple solution using the sizeof function.
Unfortunately, it does not seem to work well with arrays. The following code
println("Size of int \(123) is: \(sizeofValue(123))")
println("Size of array \([0]) is: \(sizeofValue([0]))")
println("Size of array \([0, 1, 8, 20]) is: \(sizeofValue([0, 1, 8, 20]))")
Produces this output:
Size of int 123 is: 8
Size of array [0] is: 8
Size of array [0, 1, 8, 20] is: 8
So arrays of different sizes report the same size, which is surely incorrect (at least for my purpose).
The sizeof(T) operator is available in Swift. It returns the size taken up by the specified type or variable, just like in C.
Unlike C, however, there is no concept of a stack-allocated array (static array). An array is a pointer to an object, meaning that its size will always be the size of a pointer (the same as for heap-allocated arrays in C). To get the size of an array, you have to do something like
array.count * sizeof(Telement)
but even that is only true if Telement is not an object that allocates heap memory.
This appears to be supported within the Swift standard library now.
Docs
MemoryLayout.size(ofValue: self)
As Declan McKenna pointed out, MemoryLayout.size is now part of the Swift standard library.
You use this in one of two ways: either you get the size of a type via <> bracket syntax, or of a value by calling it as a function:
var someInt: Int = 0
let a = MemoryLayout.size(ofValue: someInt)
let b = MemoryLayout<Int>.size
/* a and b are equal */
Suppose you have an array arr of type T, you can get the allocated size as follows:
let size = arr.capacity * MemoryLayout<T>.size
Note that you should use arr.capacity, not arr.count. Capacity refers to how much memory has been reserved for the array, even if all the values haven't been written to.
Another thing to note is that memory allocations can be a bit tricky. If you allocate a large block of memory and never write to it, the operating system might not report the application as actually using that memory. The operating system might let you malloc an enormous block a thousand times larger than your physical memory, but if you never actually write anything to it, it won't really get allocated.
The method described in this answer is for getting the hypothetical maximum amount allocated by the array.
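Putting the pieces together, a minimal sketch (the Point struct and the printed figures are hypothetical; exact values depend on the platform):
struct Point {
    var x: Double
    var y: Double
}

print(MemoryLayout<Point>.size)                       // 16 on 64-bit platforms
print(MemoryLayout.size(ofValue: Point(x: 0, y: 0)))  // same, queried from a value

var points = [Point](repeating: Point(x: 0, y: 0), count: 100)
points.reserveCapacity(200)

// Storage reserved for the elements themselves; stride accounts for
// per-element padding in contiguous storage, which is how Array lays out elements.
print(points.capacity * MemoryLayout<Point>.stride)   // at least 200 * 16 bytes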

Objective-C: Cross correlation of two audio files

I want to perform a cross-correlation of two audio files (which are actually NSData objects). I found the vDSP_convD function in the Accelerate framework. NSData has a bytes property which returns a raw (void) pointer to the data; those pointers serve as the filter and signal vector parameters.
I struggled with the other parameters. What are the lengths of these vectors, and the length of the result vector?
My guess: it's the sum of the filter and signal vector lengths.
Could anyone give me an example of using the vDSP_convD function?
Apple reference to the function is here
Thanks
After reading the book Learning Core Audio, I have made a demo which demonstrates the delay between two audio files. I used the new iOS 8 API to get samples from the audio files, along with a good performance optimization.
Github Project.
A call would look like this:
vDSP_conv(signal*, signalStride, filter*, filterStride, result*, resultStride, resultLength, filterLength);
where we have:
signal*: a pointer to the first element of your signal array
signalStride: the, let's call it, "step size" through your signal array; 1 means every element, 2 every second element, and so on
the same applies for the filter and result arrays
resultLength and filterLength: the lengths of the result and filter arrays
How long do the arrays have to be?
As stated in the docs you linked, your signal array has to be of length resultLength + filterLength - 1, and this is where it gets a little messy. You can find a demonstration of this by Apple here, or a shorter answer by SO user Rasman here.
You have to do the zero padding of the signal array by yourself so the vector functions can apply the sliding window without preparation.
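For example, here is a minimal Swift sketch of a correlation call (the data is made up; adapt the padding to the alignment you need):
import Accelerate

let filter: [Double] = [1, 0, -1]          // filterLength P = 3
let raw: [Double] = [2, 4, 8, 16, 8, 4, 2] // original signal

// The signal buffer must hold resultLength + filterLength - 1 elements,
// so zero-pad it before calling vDSP_convD.
let resultLength = raw.count
let padded = [Double](repeating: 0, count: filter.count - 1) + raw

var result = [Double](repeating: 0, count: resultLength)
// A positive filter stride makes vDSP_convD perform correlation
// (a negative stride would perform convolution).
vDSP_convD(padded, 1, filter, 1, &result, 1,
           vDSP_Length(resultLength), vDSP_Length(filter.count))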
Note: You might consider using the Fast Fourier Transform for this, because when working with audio files I assume you have quite a lot of data, and past a certain size there is a significant performance increase from using:
FFT -> complex multiplication in the frequency domain (which results in a correlation in the time domain) -> inverse FFT
Here you can find a useful piece of code for this!

multi dimension dynamic array allocation

I want to use dynamic allocation for a multi-block CFD code, where the index ranges (i,j,k) vary between blocks. I don't know how to allocate arrays with arbitrary index ranges for n blocks and pass them to subroutines. I have given a sample code below, which produces the error message "Error: Expression at (1) must be scalar" when compiled with gfortran.
common/iteration/nb
integer, dimension (:),allocatable::nib,njb,nkb
real, dimension (:,:,:,:),allocatable::x,y,z
allocate (nib(nb),njb(nb),nkb(nb))
do l=1,nb
ni=nib(l)
nj=njb(l)
nk=nkb(l)
allocate (x(l,ni,nj,nk),y(l,ni,nj,nk),z(l,ni,nj,nk))
enddo
call gridatt (x,y,z,nib,njb,nkb)
deallocate(x,y,z,nib,njb,nkb)
end
subroutine gridatt (x,y,z,nib,njb,nkb)
common/iteration/nb
integer, dimension (nb)::nib,njb,nkb
real, dimension (nb,nib,njb,nkb)::x,y,z
do l=1,nb
read(7,*)nib(l),njb(l),nkb(l)
read(7,*)(((x(l,i,j,k),i=1,nib(l)),j=1,njb(l)),k=1,nkb(l)),
$ (((y(l,i,j,k),i=1,nib(l)),j=1,njb(l)),k=1,nkb(l)),
$ (((z(l,i,j,k),i=1,nib(l)),j=1,njb(l)),k=1,nkb(l))
enddo
return
end
The error message gfortran gives is as good as they get. It points to nib in the line
real, dimension (nb,nib,njb,nkb)::x,y,z
nib is declared as an array. This is not allowed. (What would the size of x, y, and z be in this dimension?)
Apart from this, I don't really understand your description of what it is that you are trying to do, and the sample code you show doesn't make much sense to me.
common/iteration/nb
integer, dimension (:),allocatable::nib,njb,nkb
real, dimension (:,:,:,:),allocatable::x,y,z
allocate (nib(nb),njb(nb),nkb(nb))
When writing new code, using modules to communicate between program units is highly preferred. Old style common blocks are to be avoided.
You are trying to allocate nib, njb, and nkb with size nb. The problem is that nb hasn't been given a value yet (and won't be given one anywhere in the code).
do l=1,nb
ni=nib(l)
nj=njb(l)
nk=nkb(l)
allocate (x(l,ni,nj,nk),y(l,ni,nj,nk),z(l,ni,nj,nk))
enddo
Again the problem with nb not having a value. This loop runs for an unknown number of times. You're also using the arrays nib, njb, and nkb, which don't contain any values yet.
In each iteration of the loop, x, y, and z get allocated. This will lead to a runtime error in the second iteration, because you cannot allocate an already allocated variable. Even if the allocations did work, this loop would be useless, because the three arrays would be reset in each iteration and would end up with the dimensions of the last allocation.
Now that I'm writing this, I'm starting to think that what you are trying to do is create so-called 'jagged arrays': you want the block in x(1,:,:,:) to differ in size in the second, third, and/or fourth dimension from the block in x(2,:,:,:), and so on. This is simply not possible in Fortran.
One way to achieve this would be to create a user defined type with an allocatable, three dimensional array component, and create an array of this type. You can then allocate the array component to the desired size for each element of the array of user defined type.
This would look something like the following (disclaimer: untested and just one possible way of achieving your goal).
type :: blocktype
real, dimension(:, :, :), allocatable :: x, y, z
end type blocktype
type(blocktype), dimension(nb) :: myblocks
You can then run a loop to allocate x, y, and z to a different size for each array element. This is assuming nb has been set to the required value, and nib, njb, and nkb contain the desired sizes for the different blocks.
do block = 1, nb
ni = nib(block)
nj = njb(block)
nk = nkb(block)
allocate(myblocks(block)%x(ni, nj, nk))
allocate(myblocks(block)%y(ni, nj, nk))
allocate(myblocks(block)%z(ni, nj, nk))
enddo
If you want to do it like this, you will definitely want to put your procedures in modules, because that way you automatically get explicit interfaces, which are required for passing around such arrays of user defined type.
One afterthought: Don't use implicit typing, not even in sample code. Always use implicit none.

Does erlang implement record copy-and-modify in any clever way?

given:
-record(foo, {a, b, c}).
I do something like this:
Thing = #foo{a={1,2}, b={3,4}, c={5,6}},
Thing1 = Thing#foo{a={7,8}}.
From a semantic view, Thing and Thing1 are unique entities. However, from a language implementation standpoint, making a full copy of Thing to generate Thing1 would be intensely wasteful. For example, if the record were a megabyte in size and I made a thousand "copies," each modifying a couple of bytes, I've just burned a gigabyte. If the internal structure kept track of a representation of the parent structure, and each derivative marked up that parent in a way that indicated its own change but preserved everyone else's versions, the derivatives could be created with a minimum of memory overhead.
My question is this: is Erlang doing anything clever - internally - to keep the overhead of the usual Erlang scribble:
Thing = #ridiculously_large_record,
Thing1 = make_modified_copy(Thing),
Thing2 = make_modified_copy(Thing1),
Thing3 = make_modified_copy(Thing2),
Thing4 = make_modified_copy(Thing3),
Thing5 = make_modified_copy(Thing4)
...to a minimum?
I ask because there would be a number of changes to the way that I did cross-process communications if this were the case.
The exact workings of the garbage collection and memory allocation are known only to a few. Thankfully, they are very happy to share their knowledge, and the following is based on what I have learnt from the erlang-questions mailing list and by discussing with OTP developers.
When messaging between processes, the content is always copied as there is no shared heap between processes. The only exception is binaries bigger than 64 bytes, where only a reference is copied.
When executing code in one process, only parts are updated. Let's analyze tuples, as that is the example you provided.
A tuple is actually a structure that keeps references to the actual data somewhere on the heap (except for small integers and maybe one more data type which I can't remember). When you update a tuple, using for example setelement/3, a new tuple is created with the given element replaced, however for all other elements only the reference is copied. There is one exception which I have never been able to take advantage of.
The garbage collector keeps track of each tuple and understands when it is safe to reclaim any tuple that is no longer used. It might be that the data referenced by the tuple is still in use, in which case the data itself is not collected.
As always, Erlang gives you some tools to understand exactly what is going on. The efficiency guide details how to use erts_debug:size/1 and erts_debug:flat_size/1 to understand the size of a data structure when used internally in a process and when copied. The trace tools also allow you to understand when, what, and how much was garbage collected.
The record foo is of arity four (holding four words), but the whole structure is 14 words in size. Any immediate (pids, ports, small integers, atoms, catch and nil) can be stored directly in the tuples array. Any other term which can't fit into a word, such as other tuples, are not stored directly but referenced by boxed pointers (a boxed pointer is an erlang term with a forwarding address to the real eterm ... just internals).
In your case a new tuple of the same arity is created, and the atom foo and all the pointers are copied from the previous tuple, except for index two, a, which points to the new tuple {7,8}, which constitutes 3 words. In all, 5 + 3 new words are created on the heap and only 3 words are copied from the old tuple; the other 9 words are not touched.
Excessively large tuples are not recommended. When updating a tuple, the whole tuple, i.e. the array and not the deep content, needs to be copied and then updated in order to preserve a persistent data structure. This also generates more garbage, forcing the garbage collector to heat up, which hurts performance. The dict and array modules avoid using large tuples for this reason and use a shallow tuple tree instead.
I can definitely verify what people have already pointed out:
a record is just a tuple with the record name as the first element and all the fields as the following tuple elements
when an element of a tuple is changed (updating a field in a record, in your case), only the top-level tuple is new; all the elements are just reused
This works just because we have immutable data. So in your example, each time you update a value in a #foo record, none of the data in the elements is copied; only a new 4-element tuple (5 words) is created. Erlang never does a deep copy in this type of operation or when passing arguments in function calls.
In conclusion:
Thing = #foo{a={1,2}, b={3,4}, c={5,6}},
Thing1 = Thing#foo{a={7,8}}.
Here, if Thing is not used again, it will probably be updated in place and copying of the tuple will be avoided, as the Efficiency Guide says. (Tuple and record syntax is compiled into something like setelement, I think.)
Thing = #ridiculously_large_record,
Thing1 = make_modified_copy(Thing),
Thing2 = make_modified_copy(Thing1),
...
Here the tuples are actually copied every time.
I guess that it would be theoretically possible to make an interesting optimization here. If the compiler could perform escape analysis on the return value of make_modified_copy and detect that the only reference to it is the one returned, it could save this information about the function. When it encounters a call to that function, it would know that it is safe to modify the return value in place.
This would only be possible for intra-module calls, because of the code-replacement feature.
Maybe one day we will have it.

What happens when increasing the size of a dynamic array?

I'd like to understand what happens when the size of a dynamic array is increased.
My understanding so far:
Existing array elements will remain unchanged.
New array elements are initialised to 0
All array elements are contiguous in memory.
When the array size is increased, will the extra memory be tacked onto the existing memory block, or will the existing elements be copied to an entirely new memory block?
Does changing the size of a dynamic array have consequences for pointers referencing existing array elements?
Thanks,
[edit] Incorrect assumption struck out. (New array elements are initialised to 0)
Existing array elements will remain unchanged: yes
New array elements are initialized to 0: yes (see update; originally answered "no, unless it is an array of compiler-managed types such as string, another array, or a variant")
All array elements are contiguous in memory: yes
When the array size is increased, the array will be copied. From the doc:
...memory for a dynamic array is reallocated when you assign a value to the array or pass it to the SetLength procedure.
So yes, increasing the size of a dynamic array does have consequences for pointers referencing existing array elements.
If you want to keep references to existing elements, use their index in the array (0-based).
Update
Comments by Rob and David prompted me to check the initialization of dynamic arrays in Delphi 5 (as I have that readily available anyway). First I used some code to create various types of dynamic arrays and inspected them in the debugger. They were all properly initialized, but that could still have been a result of prior initialization of the memory location where they were allocated. So I checked the RTL. It turns out D5 already has the FillChar statement in the DynArraySetLength method that Rob pointed to:
// Set the new memory to all zero bits
FillChar((PChar(p) + elSize * oldLength)^, elSize * (newLength - oldLength), 0);
In practice Embarcadero will always zero-initialise new elements, simply because doing otherwise would break so much code.
In fact it's a shame that they don't officially guarantee the zero-initialisation, because it is so useful. The point is that at the call site, when writing SetLength, you often don't know whether you are growing or shrinking the array. But the implementation of SetLength does know; clearly it has to. So it really makes sense to have a well-defined action on any new elements.
What's more, if they want people to be able to switch easily between the managed and native worlds, then zero-initialisation is desirable, since that's what fits with managed code.