It seems I cannot get C++/CLI structures aligned to less than 8 bytes. I have a struct of two Int32 values, allocate a million of them, and voilà: 16 MB of memory according to ".NET Memory Profiler" (plus the list data). I set the compiler option to /Zp4 (I also tried /Zp1), plus Minimize Size (/O1) and Small Code (/Os), and just to make sure I additionally put a "#pragma pack(1)" into my code, all to no avail. My struct still takes up 16 bytes. I changed it to a class, still the same.
Why is that?
How can I change it?
Ciao,
Eike
using namespace System;

#pragma pack(1)
ref struct myStruct
{
    Int32 a;
    Int32 b;
};

int main(array<System::String ^> ^args)
{
    System::Collections::Generic::List<myStruct^> list;
    for (int i = 0; i < 1000000; i++)
    {
        list.Add(gcnew myStruct());
    }
    // avoid optimization
    Console::WriteLine(list[333333]->a);
    return 0;
}
You need to use value types to be able to specify alignment. Beyond that, I'm not sure this is the best way to measure it. Reference types also have some small built-in overhead. Try value struct/value class instead.
I’m trying to make a basic simulation of a 16-bit computer in Swift. The computer will feature:
An ALU
2 registers
That’s all. I have enough knowledge to create these parts visually and understand how they work, but it has become increasingly difficult to make larger components with more inputs while using my current approach.
My current approach has been to wrap each component in a struct. This worked early on, but is becoming increasingly difficult to manage multiple inputs while staying true to the principles of computer science.
The primary issue is that the components aren’t updating with the clock signal. I have the output of the component updating when get is called on the output variable, c. This, however, neglects the idea of a clock signal and will likely cause further problems later on.
It’s also difficult to make getters and setters for each variable without getting errors about mutability. Although I have worked through these errors, they are annoying and slow down the development process.
The last big issue is updating the output. The output doesn’t update when the inputs change; it updates when told to do so. This isn’t accurate to the qualities of real computers and is a fundamental error.
This is an example. It is the ALU I mentioned earlier. It takes two 16-bit inputs and outputs 16 bits. It has two unary ALUs, which can make a 16-bit number zero, negate it, or both. Lastly, it either adds the inputs or does a bitwise AND, based on the f flag, and inverts the output if the no flag is selected.
struct ALU {
    //Operations are done in the order listed. For example, if zx and nx are 1, it first makes input 1 zero and then inverts it.
    var x : [Int] //Input 1
    var y : [Int] //Input 2
    var zx : Int //Make input 1 zero
    var zy : Int //Make input 2 zero
    var nx : Int //Invert input 1
    var ny : Int //Invert input 2
    var f : Int //If 0, do a bitwise AND operation. If 1, add the inputs
    var no : Int //Invert the output
    public var c : [Int] { //Output
        get {
            //Numbers first go through unary ALUs. These can negate the input (and output the value), return 0, or return the inverse of 0. They then undergo the operation specified by f, either addition or a bitwise AND operation, and are negated if no is 1.
            var ux = UnaryALU(z: zx, n: nx, x: x).c //Unary ALU. See comments for more
            var uy = UnaryALU(z: zy, n: ny, x: y).c
            var fd = select16(s: f, d1: Add16(a: ux, b: uy).c, d0: and16(a: ux, b: uy).c).c //Adds two 16 bit numbers or does a bitwise AND operation. For more on select16, see the line below.
            var out = select16(s: no, d1: not16(a: fd).c, d0: fd).c //Selects a number. If s is 1, it returns d1. If s is 0, it returns d0. d0 is the value returned by fd, while d1 is its inverse.
            return out
        }
    }
    public init(x:[Int],y:[Int],zx:Int,zy:Int,nx:Int,ny:Int,f:Int,no:Int) {
        self.x = x
        self.y = y
        self.zx = zx
        self.zy = zy
        self.nx = nx
        self.ny = ny
        self.f = f
        self.no = no
    }
}
I use c for the output variable, store values with multiple bits in Int arrays, and store single bits in Int values.
I’m doing this on Swift Playgrounds 3.0 with Swift 5.0 on a 6th generation iPad. I’m storing each component or set of components in a separate file in a module, which is why some variables and all structs are marked public. I would greatly appreciate any help. Thanks in advance.
So, I’ve completely redone my approach and found a way to bypass the issues I was facing. What I’ve done is make what I call “tracker variables” for each input. When get is called on a variable, it returns the value of the tracker assigned to it. When set is called, it stores the new value in the tracker and then calls an update() function that recomputes the output of the circuit. This essentially creates a ‘copy’ of each variable; I did this to prevent any infinite loops.
Trackers are unfortunately necessary here. I’ll demonstrate why:
var variable : Type {
    get {
        return variable //Calls the getter again, resulting in an infinite loop
    }
    set {
        //Do something
    }
}
In order to make a setter, Swift requires a getter to be made as well. In this example, calling variable simply calls get again, resulting in a never-ending cascade of calls to get. Tracker variables are a workaround that use minimal extra code.
Using an update method makes sure the output responds to a change in any input. This also works with a clock signal, due to the architecture of the components themselves; although update() appears to act as the clock, it does not.
For example, in data flip-flops the clock signal is passed into gates. All a clock signal really does is deactivate a component while the signal is off, so I can implement that inside update() while remaining faithful to reality.
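For instance, a clocked component could gate its update like this (a minimal sketch; the ClockedBuffer name and the Bool clock input are illustrative, not part of the components below):

struct ClockedBuffer {
    var clock : Bool = false //Hypothetical clock input
    private var _input : Bool = false //Tracker for the input
    public var input : Bool {
        get { return _input }
        set { _input = newValue; update() }
    }
    public private(set) var output : Bool = false
    internal mutating func update() {
        guard clock else { return } //While the clock is off, the component holds its output
        output = _input
    }
}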
Here’s an example of a half adder. Note that the tracker variables I mentioned are marked by an underscore in front of their name. It has two inputs, x and y, which are 1 bit each. It also has two outputs, high and low, also known as carry and sum. The outputs are also one bit.
struct halfAdder {
    private var _x : Bool //Tracker for x
    public var x : Bool { //Input 1
        get {
            return _x //Return the tracker’s value
        }
        set {
            _x = newValue //Store the new value in the tracker
            update() //Update the output
        }
    }
    private var _y : Bool //Tracker for y
    public var y : Bool { //Input 2
        get {
            return _y
        }
        set {
            _y = newValue
            update()
        }
    }
    public var high : Bool //High output, or ‘carry’
    public var low : Bool //Low output, or ‘sum’
    internal mutating func update(){ //Updates the output
        high = x && y //AND gate, sets the high output
        low = (x || y) && !(x && y) //XOR gate, sets the low output
    }
    public init(x:Bool, y:Bool){ //Initializer
        self.high = false //This will change when the inputs are set below, ensuring a correct output.
        self.low = false //See above
        self._x = x //Setting trackers and variables
        self._y = y
        self.x = x
        self.y = y
    }
}
This is a very clean way, save for the trackers, to accomplish this task. It can trivially be expanded to any number of bits by using arrays of Bool instead of a single value, as sketched below. It respects the clock signal, updates the output when the inputs change, and behaves much like real hardware.
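For instance, a hypothetical 16-bit AND component built with the same tracker pattern could look like this (a minimal sketch; the And16 name and the [Bool] representation are illustrative assumptions, not code from the project above):

struct And16 {
    private var _x : [Bool] //Tracker for x
    public var x : [Bool] { //Input 1, 16 bits
        get { return _x }
        set { _x = newValue; update() }
    }
    private var _y : [Bool] //Tracker for y
    public var y : [Bool] { //Input 2, 16 bits
        get { return _y }
        set { _y = newValue; update() }
    }
    public var c : [Bool] //Output, 16 bits
    internal mutating func update() { //Recompute the output whenever an input changes
        c = (0..<x.count).map { x[$0] && y[$0] } //Bit-by-bit AND (assumes both inputs have the same length)
    }
    public init(x: [Bool], y: [Bool]) {
        self.c = [Bool](repeating: false, count: x.count)
        self._x = x
        self._y = y
        self.x = x //Run the setters once so the output starts out correct
        self.y = y
    }
}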
I am trying to implement RGB histogram computation for images in Swift (I am new to iOS).
However, the computation time for a 1500x1000 image is about 66 seconds, which I consider too slow.
Are there any ways to speed up the image traversal?
P.S. The current code is the following:
func calcHistogram(image: UIImage) {
    let bins: Int = 20;
    let width = Int(image.size.width);
    let height = Int(image.size.height);
    let binStep: Double = Double(bins-1)/255.0
    var hist = Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Int())))
    for i in 0..<bins {
        for j in 0..<bins {
            for k in 0..<bins {
                hist[i][j][k] = 0;
            }
        }
    }
    var pixelData = CGDataProviderCopyData(CGImageGetDataProvider(image.CGImage))
    var data: UnsafePointer<UInt8> = CFDataGetBytePtr(pixelData)
    for x in 0..<width {
        for y in 0..<height {
            var pixelInfo: Int = ((width * y) + x) * 4
            var r = Double(data[pixelInfo])
            var g = Double(data[pixelInfo+1])
            var b = Double(data[pixelInfo+2])
            let r_bin: Int = Int(floor(r*binStep));
            let g_bin: Int = Int(floor(g*binStep));
            let b_bin: Int = Int(floor(b*binStep));
            hist[r_bin][g_bin][b_bin] += 1;
        }
    }
}
As noted in my comment on the question, there are some things you might rethink before you even try to optimize this code.
But even if you do move to a better overall solution, like GPU-based histogramming, a library, or both, there are some Swift pitfalls you're falling into here that are worth talking about so you don't run into them elsewhere.
First, this code:
var hist = Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Int())))
for i in 0..<bins {
    for j in 0..<bins {
        for k in 0..<bins {
            hist[i][j][k] = 0;
        }
    }
}
... is initializing every member of your 3D array twice, with the same result. Int() produces a value of zero, so you could leave out the triple for loop. (And possibly change Int() to 0 in your innermost repeatedValue: parameter to make it more readable.)
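In other words, that whole block can shrink to the single line below (a sketch, keeping the question's Swift 1.x syntax):

var hist = Array(count: bins, repeatedValue: Array(count: bins, repeatedValue: Array(count: bins, repeatedValue: 0)))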
Second, arrays in Swift are copy-on-write, but this optimization can break down in multidimensional arrays: changing an element of a nested array can cause the entire nested array to be rewritten instead of just the one element. Multiply that by the depth of nested arrays and number of element writes you have going on in a double for loop and... it's not pretty.
Unless there's a reason your bins need to be organized this way, I'd recommend finding a different data structure for them. Three separate arrays? One Int array where index i is red, i + 1 is green, and i + 2 is blue? One array of a custom struct you define that has separate r, g, and b members? See what conceptually fits with your tastes or the rest of your app, and profile to make sure it works well.
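For instance, a single flat array could look roughly like this (a sketch in the question's Swift 1.x syntax; the index arithmetic is just one possible layout, not the only one):

// One flat array of bins * bins * bins counters, already zero-initialized.
var hist = [Int](count: bins * bins * bins, repeatedValue: 0)

// Inside the pixel loop: map the three bin coordinates onto a single index.
let index = (r_bin * bins + g_bin) * bins + b_bin
hist[index] += 1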
Finally, some Swift style points:
pixelInfo, r, g, and b in your second loop don't change. Use let, not var, and the optimizer will thank you.
Declaring and initializing something like let foo: Int = Int(whatever) is redundant. Some people like having all their variables/constants explicitly typed, but it does make your code a tad less readable and harder to refactor.
Int(floor(x)) is redundant: converting a Double to Int already truncates toward zero, which is the same as taking the floor for the non-negative values you have here.
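Putting those points together, the body of your inner loop might read (a sketch based on the question's code):

let pixelInfo = ((width * y) + x) * 4
let r = Double(data[pixelInfo])
let g = Double(data[pixelInfo + 1])
let b = Double(data[pixelInfo + 2])
let r_bin = Int(r * binStep) // Int(...) already rounds non-negative values down, so floor() is unnecessary
let g_bin = Int(g * binStep)
let b_bin = Int(b * binStep)
hist[r_bin][g_bin][b_bin] += 1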
If you have performance issues in your code, first of all use the Time Profiler instrument from Instruments. You can start it from Xcode via Product -> Profile; when the Instruments app opens, choose Time Profiler.
Start recording and perform all the interactions in your app.
Stop recording and analyse where the "tightest" places in your code are.
Also check the options "Invert Call Tree", "Hide Missing Symbols" and "Hide System Libraries" for a better view of the profiling results.
You can also double-click any listed function to view it in code and see per-line usage percentages.
Maintaining some code from an iOS application, I came upon the following:
CLLocationCoordinate2D inputArray[size]; // CLLocationCoordinate2D is a struct containing two doubles
for (int i = 0; i < size; i++) {
    inputArray[i] = ... ; // Fill the array
}

CLLocationCoordinate2D outputArray[size];
functionThatConvertsInputToOutput(inputArray, outputArray, size);
Here we are allocating two struct arrays of dynamic size (the size cannot be determined at compile time), a so-called "variable-length array" (VLA), based on this SO question (Declare Dynamic Array).
I'm well aware that this does not compile in standard C++ (in C it is a C99 feature), and when looking at similar questions the answer is often "Use malloc" or "Use NS(Mutable)Array".
But I haven't really found the answer to the question:
What happens in Objective-C when declaring int array[size];?
The reason I'm wondering is that the piece of code I have reproduced above crashes when using VLA with reasonably large sizes (36000) and does not crash when using malloc:
CLLocationCoordinate2D *inputArray = malloc(sizeof(CLLocationCoordinate2D) * size);
CLLocationCoordinate2D *outputArray = malloc(sizeof(CLLocationCoordinate2D) * size);
EDIT #1: What Wikipedia says about VLAs: http://en.wikipedia.org/wiki/Variable-length_array
EDIT #2: The crashes are EXC_BAD_ACCESS at odd places inside functionThatConvertsInputToOutput, or on the line calling functionThatConvertsInputToOutput.
It’s very likely the compiler is placing the memory for the array on the stack, which is why you’re crashing when you blow up the stack by 36,000 * sizeof(CLLocationCoordinate2D) (36,000 * 16 = 576,000 bytes per array).
I want to do this:
typedef struct
{
    CGPoint vertices[];
    NSUInteger vertexCount;
} Polygon;
But it says Field has incomplete type CGPoint [].
You need to do one of two things:
Declare the array to be a fixed size (probably not what you want)
Make it a pointer. But then you need to properly malloc and free the array as needed.
A better choice is to not use a struct and instead create a full class. Then you can add methods and properties as well as make memory management much easier. You are working in Objective-C. Take advantage of the Object Oriented aspects of the language. Add a method to calculate the circumference and area, etc. Put the logic where it belongs.
Set the array size: CGPoint vertices[count]; (where count is a compile-time constant).
Don't you want a unique name for each element of your struct anyway? If you just want a bunch of CGPoints in numerical order, with the ability to count how many of them there are, you'd be much better served by putting them in an NSArray or NSMutableArray (stored as NSValues, of course).
The whole point of a struct is to have easy access to the values by descriptive names, i.e.:
typedef struct {
    CGPoint helpfulAndDescriptiveNameOne;
    CGPoint helpfulAndDescriptiveNameTwoWhichIsDifferentThanTheOtherName;
    // etc...
    NSUInteger vertexCount;
}
For example, a CGRect is conceptually just a struct composed of four CGFloats, each of which is descriptively and helpfully named:
typedef struct {
    CGFloat x;
    CGFloat y;
    CGFloat width;
    CGFloat height;
} CGRect;
I am using the FFmpeg library. I want to know how much memory one packet can take.
I debugged to check the members of an AVPacket, and none of them seem reasonable, e.g. AVPacket.size, etc.
If you provide your own data buffer, it needs to have a minimum size of FF_MIN_BUFFER_SIZE. You would then set AVPacket.size to the allocated size, and AVPacket.data to the memory you've allocated.
Note that FFmpeg decoding routines will simply fail if you provide your own buffer and it's too small.
The other possibility is to let FFmpeg calculate the optimal size for you.
Then do something like:
AVPacket pkt;
pkt.size = 0;
pkt.data = NULL; // <-- the critical part is there
int got_output = 0;
ret = avcodec_encode_audio2(ctx, &pkt, NULL, &got_output);
and provide this AVPacket to the encoding codec. Memory will be allocated automatically.
You will have to call av_free_packet upon return from the encoder, if got_output is set to 1.
FFmpeg will automatically free the AVPacket content in case of error.
AVPacket::size holds the size of the referenced data. Because it is a generic container for data, there can be no definite answer to the question
how much memory one packet can take
It can actually take from zero to a lot. Everything depends on data type, codec and other related parameters.
From FFmpeg examples:
static void audio_encode_example(const char *filename)
{
    // ...
    AVPacket pkt;
    // ...
    ret = avcodec_encode_audio2(c, &pkt, NULL, &got_output);
    // ...
    if (got_output) {
        fwrite(pkt.data, 1, pkt.size, f); // <<--- AVPacket.size
        av_free_packet(&pkt);
    }
    // ...
}