I'm currently trying to port my Java Android library to Swift. In my Android library I'm using a JNI wrapper for Jerasure to call the following C function:
int jerasure_matrix_decode(int k, int m, int w, int *matrix, int row_k_ones, int *erasures, char **data_ptrs, char **coding_ptrs, int size)
I have to admit that I'm relatively new to Swift, so some of my assumptions might be wrong. In my Java code, char **data_ptrs and char **coding_ptrs are two-dimensional arrays (e.g. byte[][] dataShard = new byte[3][1400]) that contain the actual video stream data. In my Swift library I store the video stream data in a [Data] array, so the question is: what is the correct way to convert the [Data] array to the C char ** type?
I already tried some things but none of them worked. Currently I have the following conversion logic, which gives me an UnsafeMutablePointer<UnsafeMutablePointer<Int8>?>? pointer (data is of type [Data]):
let ptr1 = ptrFromAddress(p: &data)
ptr1.withMemoryRebound(to: UnsafeMutablePointer<Int8>?.self, capacity: data.count) { pp in
// here pp is UnsafeMutablePointer<UnsafeMutablePointer<Int8>?>?
}
func ptrFromAddress<T>(p:UnsafeMutablePointer<T>) -> UnsafeMutablePointer<T>
{
return p
}
The expected result would be that jerasure restores the missing data shards of my [Data] array when jerasure_matrix_decode is called, but instead it completely messes up my [Data] array, and accessing it results in EXC_BAD_ACCESS. So I expect this is completely the wrong way.
The documentation in the jerasure.h header file says the following about data_ptrs:
data_ptrs = An array of k pointers to data which is size bytes
Edit:
The jerasure library defines data_ptrs like this:
#define talloc(type, num) (type *) malloc(sizeof(type)*(num))
char **data;
data = talloc(char *, k);
for (i = 0; i < k; i++) {
data[i] = talloc(char, sizeof(long)*w);
}
So what is the best option for calling the jerasure_matrix_decode method from Swift? Should I use something other than [Data]?
Possible similar question:
How to create a UnsafeMutablePointer<UnsafeMutablePointer<UnsafeMutablePointer<Int8>>>
A possible solution could be to allocate appropriate memory and fill it with the data.
Alignment
The equivalent to char ** of the C code would be UnsafeMutablePointer<UnsafeMutablePointer<CChar>?> on Swift side.
In the definition of data_ptrs that you show in your question, we see that each data block is to be allocated with malloc.
A property of C malloc is that it does not know which pointer type it will eventually be cast into. Therefore, it guarantees the strictest memory alignment:
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
see https://port70.net/~nsz/c/c11/n1570.html#7.22.3
Particularly performance-critical C routines often do not operate byte by byte, but cast to larger numeric types or use SIMD.
So, depending on your internal C library implementation, allocating with UnsafeMutablePointer<CChar>.allocate(capacity: columns) could be problematic, because
UnsafeMutablePointer provides no automated memory management or alignment guarantees.
see https://developer.apple.com/documentation/swift/unsafemutablepointer
The alternative could be to use UnsafeMutableRawPointer with an alignment parameter. You can use MemoryLayout<max_align_t>.alignment to find out the maximum alignment constraint.
Populating Data
An UnsafeMutablePointer<CChar> would have the advantage that we could use pointer arithmetic. This can be achieved by converting the UnsafeMutableRawPointer to an OpaquePointer and then to an UnsafeMutablePointer. In the code it would then look like this:
let colDataRaw = UnsafeMutableRawPointer.allocate(byteCount: cols, alignment: MemoryLayout<max_align_t>.alignment)
let colData = UnsafeMutablePointer<CChar>(OpaquePointer(colDataRaw))
for x in 0..<cols {
colData[x] = CChar(bitPattern: dataArray[y][x])
}
Complete Self-contained Test Program
Your library will probably have certain requirements for the data (e.g. supported matrix dimensions), which I don't know. These must be taken into account, of course. But for a basic technical test we can create an independent test program.
#include <stdio.h>
#include "matrix.h"
void some_matrix_operation(int rows, int cols, char **data_ptrs) {
printf("C side:\n");
for(int y = 0; y < rows; y++) {
for(int x = 0; x < cols; x++) {
printf("%02d ", (unsigned char)data_ptrs[y][x]);
data_ptrs[y][x] += 100;
}
printf("\n");
}
printf("\n");
}
It simply prints the bytes and adds 100 to each byte so that we can verify that the changes arrive on the Swift side.
The corresponding header must be included in the bridging header and looks like this:
#ifndef matrix_h
#define matrix_h
void some_matrix_operation(int rows, int cols, char **data_ptrs);
#endif /* matrix_h */
On the Swift side, we can put everything in a class called Matrix:
import Foundation
class Matrix: CustomStringConvertible {
let rows: Int
let cols: Int
let dataPtr: UnsafeMutablePointer<UnsafeMutablePointer<CChar>?>
init(dataArray: [Data]) {
guard !dataArray.isEmpty && !dataArray[0].isEmpty else { fatalError("empty data not supported") }
self.rows = dataArray.count
self.cols = dataArray[0].count
self.dataPtr = Self.copyToCMatrix(rows: rows, cols: cols, dataArray: dataArray)
}
deinit {
for y in 0..<rows {
dataPtr[y]?.deallocate()
}
dataPtr.deallocate()
}
var description: String {
var desc = ""
for data in dataArray {
for byte in data {
desc += "\(byte) "
}
desc += "\n"
}
return desc
}
var dataArray: [Data] {
var array = [Data]()
for y in 0..<rows {
if let ptr = dataPtr[y] {
array.append(Data(bytes: ptr, count: cols))
}
}
return array
}
private static func copyToCMatrix(rows: Int, cols: Int, dataArray: [Data]) -> UnsafeMutablePointer<UnsafeMutablePointer<CChar>?> {
let dataPtr = UnsafeMutablePointer<UnsafeMutablePointer<CChar>?>.allocate(capacity: rows)
for y in 0..<rows {
let colDataRaw = UnsafeMutableRawPointer.allocate(byteCount: cols, alignment: MemoryLayout<max_align_t>.alignment)
let colData = UnsafeMutablePointer<CChar>(OpaquePointer(colDataRaw))
dataPtr[y] = colData
for x in 0..<cols {
colData[x] = CChar(bitPattern: dataArray[y][x])
}
}
return dataPtr
}
}
You can call it as shown here:
let example: [[UInt8]] = [
[ 126, 127, 128, 129],
[ 130, 131, 132, 133],
[ 134, 135, 136, 137]
]
let dataArray = example.map { Data($0) }
let matrix = Matrix(dataArray: dataArray)
print("before on Swift side:")
print(matrix)
some_matrix_operation(Int32(matrix.rows), Int32(matrix.cols), matrix.dataPtr)
print("afterwards on Swift side:")
print(matrix)
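Applied back to the original question, the Matrix type above could be handed to jerasure_matrix_decode in the same way as to some_matrix_operation. The following is only a hedged sketch, not a tested jerasure call: the parameter values, the zeroed placeholder matrix and shards, and the row_k_ones argument (here 1) all have to come from your real coding setup.
let k: Int32 = 3, m: Int32 = 2, w: Int32 = 8   // placeholder coding parameters
let size = 1400                                // bytes per shard (placeholder)
let dataShards = [Data](repeating: Data(count: size), count: Int(k))    // your k data shards
let codingShards = [Data](repeating: Data(count: size), count: Int(m))  // your m coding shards
var matrix = [Int32](repeating: 0, count: Int(m * k))  // placeholder; use your real coding matrix
var erasures: [Int32] = [1, -1]                // indices of lost shards, terminated by -1
let dataMatrix = Matrix(dataArray: dataShards)
let codingMatrix = Matrix(dataArray: codingShards)
let ret = jerasure_matrix_decode(k, m, w, &matrix, 1, &erasures,
                                 dataMatrix.dataPtr, codingMatrix.dataPtr, Int32(size))
// jerasure_matrix_decode returns -1 on failure; on success the restored
// shards can be read back on the Swift side via dataMatrix.dataArray.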
Test Result
The test output is as follows and seems to show the expected result.
before on Swift side:
126 127 128 129
130 131 132 133
134 135 136 137
C side:
126 127 128 129
130 131 132 133
134 135 136 137
afterwards on Swift side:
226 227 228 229
230 231 232 233
234 235 236 237
Related
The following code compiles in gfortran, with a warning about large_array being larger than the limit for a stack variable, stating that the array will be moved to static memory and is therefore not threadsafe:
subroutine stack_size_warning
implicit none
real :: large_array(65536)
print *, large_array
end subroutine stack_size_warning
This subroutine, however, compiles with no errors or warnings, and I can call it with n values larger than 65536 without issue, at least in simple cases.
subroutine no_warning(n)
implicit none
integer :: n
real :: automatic_array(n)
print *, automatic_array
end subroutine no_warning
Is this second array threadsafe? Where is the memory allocated for automatic_array in this second subroutine? Is the memory allocated and deallocated on every call making it slower than if it was on the stack or if a preallocated array was passed in as a dummy argument?
I wrote the following program to test 3 scenarios: a subroutine with a small array on the stack, another with a large array over the stack limit and thus stored in static memory, and a third where a dummy argument specifies the size of an array defined inside the routine.
Here is that program:
program main
implicit none
call small
call large
call automatic(65536)
end program main
subroutine small
implicit none
real :: small_array(10)
small_array=1.
print *, small_array
end subroutine small
subroutine large
implicit none
real :: large_array(65536)
large_array=1.
print *, large_array
end subroutine large
subroutine automatic(n)
implicit none
integer :: n
real :: automatic_array(n)
automatic_array=1.
print *, automatic_array
end subroutine automatic
Using steve's recommendation I compiled with a tree dump as follows:
gfortran array_dim_test.f90 -o array_dim_test -fdump-tree-original
The full dump is at the end, but to summarize what I see: the automatic subroutine has a try/finally block. In the try block, a call to malloc allocates the memory, and in the finally block the memory is freed. So this memory is allocated and deallocated on the heap with every call to the subroutine. This intuitively makes sense, as how else would the program handle an array that lives only in the subroutine and whose size is defined by an argument to the subroutine, but it is interesting to see the explicit calls in the tree dump. This would appear to be thread-safe then, but perhaps not the most efficient approach if the routine is called many times with the same array size, since memory is allocated and deallocated with every call.
Here is the tree dump:
__attribute__((fn spec (". w ")))
void automatic (integer(kind=4) & restrict n)
{
void * restrict D.3964;
integer(kind=8) ubound.0;
integer(kind=8) size.1;
real(kind=4)[0:D.3961] * restrict automatic_array;
integer(kind=8) D.3961;
bitsizetype D.3962;
sizetype D.3963;
try
{
ubound.0 = (integer(kind=8)) *n;
size.1 = NON_LVALUE_EXPR <ubound.0>;
size.1 = MAX_EXPR <size.1, 0>;
D.3961 = size.1 + -1;
D.3962 = (bitsizetype) (sizetype) NON_LVALUE_EXPR <size.1> * 32;
D.3963 = (sizetype) NON_LVALUE_EXPR <size.1> * 4;
D.3964 = (void * restrict) __builtin_malloc (MAX_EXPR <(unsigned long) (size.1 * 4), 1>);
automatic_array = (real(kind=4)[0:D.3961] * restrict) D.3964;
{
integer(kind=8) D.3940;
D.3940 = ubound.0;
{
integer(kind=8) S.2;
S.2 = 1;
while (1)
{
if (S.2 > D.3940) goto L.1;
(*automatic_array)[S.2 + -1] = 1.0e+0;
S.2 = S.2 + 1;
}
L.1:;
}
}
{
struct __st_parameter_dt dt_parm.3;
dt_parm.3.common.filename = &"array_dim_test.f90"[1]{lb: 1 sz: 1};
dt_parm.3.common.line = 27;
dt_parm.3.common.flags = 128;
dt_parm.3.common.unit = 6;
_gfortran_st_write (&dt_parm.3);
{
integer(kind=8) D.3944;
struct array01_real(kind=4) parm.4;
D.3944 = ubound.0;
parm.4.span = 4;
parm.4.dtype = {.elem_len=4, .rank=1, .type=3};
parm.4.dim[0].lbound = 1;
parm.4.dim[0].ubound = D.3944;
parm.4.dim[0].stride = 1;
parm.4.data = (void *) &(*automatic_array)[0];
parm.4.offset = -1;
_gfortran_transfer_array_write (&dt_parm.3, &parm.4, 4, 0);
}
_gfortran_st_write_done (&dt_parm.3);
}
}
finally
{
__builtin_free ((void *) automatic_array);
}
}
__attribute__((fn spec (". ")))
void large ()
{
static real(kind=4) large_array[65536];
{
integer(kind=8) S.5;
S.5 = 1;
while (1)
{
if (S.5 > 65536) goto L.2;
large_array[S.5 + -1] = 1.0e+0;
S.5 = S.5 + 1;
}
L.2:;
}
{
struct __st_parameter_dt dt_parm.6;
dt_parm.6.common.filename = &"array_dim_test.f90"[1]{lb: 1 sz: 1};
dt_parm.6.common.line = 19;
dt_parm.6.common.flags = 128;
dt_parm.6.common.unit = 6;
_gfortran_st_write (&dt_parm.6);
{
struct array01_real(kind=4) parm.7;
parm.7.span = 4;
parm.7.dtype = {.elem_len=4, .rank=1, .type=3};
parm.7.dim[0].lbound = 1;
parm.7.dim[0].ubound = 65536;
parm.7.dim[0].stride = 1;
parm.7.data = (void *) &large_array[0];
parm.7.offset = -1;
_gfortran_transfer_array_write (&dt_parm.6, &parm.7, 4, 0);
}
_gfortran_st_write_done (&dt_parm.6);
}
}
__attribute__((fn spec (". ")))
void small ()
{
real(kind=4) small_array[10];
{
integer(kind=8) S.8;
S.8 = 1;
while (1)
{
if (S.8 > 10) goto L.3;
small_array[S.8 + -1] = 1.0e+0;
S.8 = S.8 + 1;
}
L.3:;
}
{
struct __st_parameter_dt dt_parm.9;
dt_parm.9.common.filename = &"array_dim_test.f90"[1]{lb: 1 sz: 1};
dt_parm.9.common.line = 12;
dt_parm.9.common.flags = 128;
dt_parm.9.common.unit = 6;
_gfortran_st_write (&dt_parm.9);
{
struct array01_real(kind=4) parm.10;
parm.10.span = 4;
parm.10.dtype = {.elem_len=4, .rank=1, .type=3};
parm.10.dim[0].lbound = 1;
parm.10.dim[0].ubound = 10;
parm.10.dim[0].stride = 1;
parm.10.data = (void *) &small_array[0];
parm.10.offset = -1;
_gfortran_transfer_array_write (&dt_parm.9, &parm.10, 4, 0);
}
_gfortran_st_write_done (&dt_parm.9);
}
}
__attribute__((fn spec (". ")))
void MAIN__ ()
{
small ();
large ();
{
static integer(kind=4) C.3993 = 65536;
automatic (&C.3993);
}
}
__attribute__((externally_visible))
integer(kind=4) main (integer(kind=4) argc, character(kind=1) * * argv)
{
static integer(kind=4) options.11[7] = {2116, 4095, 0, 1, 1, 0, 31};
_gfortran_set_args (argc, argv);
_gfortran_set_options (7, &options.11[0]);
MAIN__ ();
return 0;
}
I am trying to communicate with a Bluetooth laser tag gun that takes data in 20-byte chunks, which are broken down into 16-, 8- or 4-bit words. To do this, I made a UInt8 array and changed the values in there. The problem happens when I try to send the UInt8 array.
var bytes = [UInt8](repeating: 0, count: 20)
bytes[0] = commandID
if commandID == 240 {
commandID = 0
}
commandID += commandIDIncrement
print(commandID)
bytes[2] = 128
bytes[4] = UInt8(gunIDSlider.value)
print("Response: \(laserTagGun.writeValue(bytes, for: gunCControl, type: CBCharacteristicWriteType.withResponse))")
commandID is just a UInt8. This gives me the error Cannot convert value of type '[UInt8]' to expected argument type 'Data', which I tried to solve by doing this:
var bytes = [UInt8](repeating: 0, count: 20)
bytes[0] = commandID
if commandID == 240 {
commandID = 0
}
commandID += commandIDIncrement
print(commandID)
bytes[2] = 128
bytes[4] = UInt8(gunIDSlider.value)
print("bytes: \(bytes)")
assert(bytes.count * MemoryLayout<UInt8>.stride >= MemoryLayout<Data>.size)
let data1 = UnsafeRawPointer(bytes).assumingMemoryBound(to: Data.self).pointee
print("data1: \(data1)")
print("Response: \(laserTagGun.writeValue(data1, for: gunCControl, type: CBCharacteristicWriteType.withResponse))")
With this, data1 just prints 0 bytes, and I can see by reading data from the other characteristics that laserTagGun.writeValue isn't actually doing anything. How can I convert my UInt8 array to Data in Swift? Also, please let me know if there is a better way to handle 20 bytes of data than a UInt8 array. Thank you for your help!
It looks like you're really trying to avoid a copy of the bytes. If not, then just init a new Data with your bytes array:
let data2 = Data(bytes)
print("data2: \(data2)")
If you really want to avoid the copy, what about something like this?
let data1 = Data(bytesNoCopy: UnsafeMutableRawPointer(mutating: bytes), count: bytes.count, deallocator: .none)
print("data1: \(data1)")
I'm trying to do some binary file parsing in Swift, and although I have things working, I have a situation where I have variable fields.
I have all my parsing working in the default case. I grab:
1-bit field
1-bit field
1-bit field
11-bits field
1-bit field
(optional) 4-bit field
(optional) 4-bit field
1-bit field
2-bit field
(optional) 4-bit field
5-bit field
6-bit field
(optional) 6-bit field
(optional) 24-bit field
(junk data, padding up to the byte boundary: 0-7 bits as needed)
Most of the data uses only a certain set of optionals so I've gone ahead and started writing classes to handle that data. My general approach is to create a pointer structure and then construct a byte array from that:
let rawData: NSMutableData = NSMutableData(data: input_nsdata)
var ptr: UnsafeMutablePointer<UInt8> = UnsafeMutablePointer<UInt8>(rawData.mutableBytes)
bytes = UnsafeMutableBufferPointer<UInt8>(start: ptr, count: rawData.length - offset)
So I end up working with an array of [UInt8] and I can do my parsing in a way similar to:
let b1 = (bytes[3] & 0x01) << 5
let b2 = (bytes[4] & 0xF8) >> 3
return Int(b1 | b2)
Where I run into trouble is with the optional fields: because my data does not lie on byte boundaries, everything gets complicated. In an ideal world I would probably just work directly with the pointer and advance it by bytes as needed; however, there is no way that I'm aware of to advance a pointer by 3 bits, which brings me to my question:
What is the best approach to handle my situation?
One idea I had was to come up with various structures that reflect the optional fields, except I'm not sure how to create bit-aligned packed structures in Swift.
What is my best approach here? For clarification: the initial 1-bit fields determine which of the optional fields are set.
If the fields do not lie on byte boundaries then you'll have to keep track of both the current byte and the current bit position within a byte.

Here is a possible solution which allows reading an arbitrary number of bits from a data array and does all the bookkeeping. The only restriction is that the result of nextBits() must fit into a UInt (32 or 64 bits, depending on the platform).
struct BitReader {
private let data : [UInt8]
private var byteOffset : Int
private var bitOffset : Int
init(data : [UInt8]) {
self.data = data
self.byteOffset = 0
self.bitOffset = 0
}
func remainingBits() -> Int {
return 8 * (data.count - byteOffset) - bitOffset
}
mutating func nextBits(numBits : Int) -> UInt {
precondition(numBits <= remainingBits(), "attempt to read more bits than available")
var bits = numBits // remaining bits to read
var result : UInt = 0 // result accumulator
// Read remaining bits from current byte:
if bitOffset > 0 {
if bitOffset + bits < 8 {
result = (UInt(data[byteOffset]) & UInt(0xFF >> bitOffset)) >> UInt(8 - bitOffset - bits)
bitOffset += bits
return result
} else {
result = UInt(data[byteOffset]) & UInt(0xFF >> bitOffset)
bits = bits - (8 - bitOffset)
bitOffset = 0
byteOffset = byteOffset + 1
}
}
// Read entire bytes:
while bits >= 8 {
result = (result << UInt(8)) + UInt(data[byteOffset])
byteOffset = byteOffset + 1
bits = bits - 8
}
// Read remaining bits:
if bits > 0 {
result = (result << UInt(bits)) + (UInt(data[byteOffset]) >> UInt(8 - bits))
bitOffset = bits
}
return result
}
}
Example usage:
let data : [UInt8] = ... your data ...
var bitReader = BitReader(data: data)
let b1 = bitReader.nextBits(1)
let b2 = bitReader.nextBits(1)
let b3 = bitReader.nextBits(1)
let b4 = bitReader.nextBits(11)
let b5 = bitReader.nextBits(1)
if b1 > 0 {
let b6 = bitReader.nextBits(4)
let b7 = bitReader.nextBits(4)
}
// ... and so on ...
And here is another possible implementation, which is a bit simpler and perhaps more efficient. It collects bytes into a UInt, and then extracts the result in a single step.

Here the restriction is that numBits + 7 must be less than or equal to the number of bits in a UInt (32 or 64). (Of course UInt can be replaced by UInt64 to make it platform independent.)
struct BitReader {
private let data : [UInt8]
private var byteOffset = 0
private var currentValue : UInt = 0 // Bits which still have to be consumed
private var currentBits = 0 // Number of valid bits in `currentValue`
init(data : [UInt8]) {
self.data = data
}
func remainingBits() -> Int {
return 8 * (data.count - byteOffset) + currentBits
}
mutating func nextBits(numBits : Int) -> UInt {
precondition(numBits <= remainingBits(), "attempt to read more bits than available")
// Collect bytes until we have enough bits:
while currentBits < numBits {
currentValue = (currentValue << 8) + UInt(data[byteOffset])
currentBits = currentBits + 8
byteOffset = byteOffset + 1
}
// Extract result:
let remaining = currentBits - numBits
let result = currentValue >> UInt(remaining)
// Update remaining bits:
currentValue = currentValue & UInt(1 << remaining - 1)
currentBits = remaining
return result
}
}
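As a quick worked example (in the answer's older Swift call style used above; current Swift would require writing nextBits(numBits:)), reading 3 bits and then 11 bits from two bytes:
var reader = BitReader(data: [0b10110011, 0b01011010])
let first = reader.nextBits(3)    // top three bits 101 -> 5
let second = reader.nextBits(11)  // next eleven bits 10011010110 -> 1238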
I have to build a compressor based on the Huffman algorithm.
So far, I managed to create the tree with the frequencies of each character and generate a representation with a smaller number of bits for each character.
It's something like this, for the phrase "good this sugarplum":
'o' 000, ' ' 001, 't' 0100, 'r' 0101, 'p' 0110, 'm' 0111, 'l' 1000, 'i' 1001, 'h' 1010, 'd' 1011, 'a' 1100, 'u' 1101, 'g' 1110, 's' 1111
The problem I'm having now is finding a way to save the tree in the output file, so I can rebuild it and then decompress the file.
Any suggestions?
I did some research but found it difficult to understand, so if you can explain in detail, I would appreciate it.
The code I used to read the frequencies from the file is:
int main (int argc, char *argv[])
{
int i;
TipoSentinela *sentinela;
TipoLista *no = NULL;
Arv *arvore, *arvore2, *arvore3;
int *repete = (int *) calloc (256, sizeof(int));
if(argc == 2)
{
in = load_base(argv[1]);
le_dados_arquivo (repete); //read the frequencies from the file
sentinela = cria_lista (); //create a marker for the tree node list
for (i = 0; i < 256; i++)
{
if(repete[i] > 0 && i != 0)
{
arvore = arv_cria (Cria_info (i, repete[i])); //create a tree node with the character i and the frequence of it in the file
no = inicia_lista (arvore, no, sentinela); //create the list of tree nodes
}
}
Ordena (sentinela); //sort the tree nodes list by the frequencies
for(Seta_primeiro(sentinela); Tamanho_lista(sentinela) != 1; Move_marcador(sentinela))
{
Seta_primeiro(sentinela); //put the marker in the first element of the list
no = Retorna_marcador(sentinela);
arvore2 = Retorna_arvore (no); //return the tree represented by the list marker
Move_marcador(sentinela); //put the marker to the next element
arvore3 = Retorna_arvore (Retorna_marcador (sentinela)); //return the tree represented by the list marker
arvore = Cria_pai (arvore2, arvore3); //create a tree node that will contain the both arvore2 and arvore3
Insere_arvoreFinal (sentinela, arvore); //insert the node at the end of the list
Remove_arvore (sentinela); //remove the node arvore2 from the list
Remove_arvore (sentinela); //remove the node arvore3 from the lsit
Ordena (sentinela); //sort the list again
}
out = load_out(argv[1]); //open the output file
Codificacao (arvore); //generate the code from each node of the tree
rewind(in);
char c;
while(!feof(in))
{
c = fgetc(in);
if(c != EOF)
arvore2 = Procura_info (arvore, c); //search the character c in the tree
if(arvore2 != NULL)
imprimebit(Retorna_codigo(arvore2), out); //write the code in the file
}
fclose(in);
fclose(out);
free(repete);
arvore = arv_libera (arvore);
Libera_Lista(sentinela);
}
return 0;
}
//bit_counter and cur_byte are global variables
void write_bit (unsigned char bit, FILE *f)
{
static int k = 0;
if(k != 0)
{
if(++bit_counter == 8)
{
fwrite(&cur_byte,1,1,f);
bit_counter = 0;
cur_byte = 0;
}
}
k = 1;
cur_byte <<= 1;
cur_byte |= ('0' != bit);
}
//aux is the code of a character in the tree
void imprimebit(char *aux, FILE *f)
{
int i;
if(aux == NULL)
return;
for(i = 0; i < strlen(aux); i++)
{
write_bit(aux[i], f); //write the bits of the code in the file
}
}
With this, I can write the code of all characters in the output file, but I can't see a way to store the tree too.
You don't need to send the tree. Just send the lengths. Then establish a consistent algorithm to convert the lengths to codes on both ends. The consistency is called a "canonical" Huffman code. You sort the codes by length, and within each length, sort by the symbol. Then assign codes starting at 0; a short sketch after the table shows one way to compute this. So you would end up with (_ means space):
_ 000
o 001
a 0100
d 0101
g 0110
h 0111
i 1000
l 1001
m 1010
p 1011
r 1100
s 1101
t 1110
u 1111
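To make the rule concrete, here is a small illustrative sketch (a hypothetical helper, in Swift rather than the question's C) that derives exactly this assignment from the lengths alone: sort by (length, symbol), hand out consecutive code values, and shift left by one whenever the length increases.
// Canonical Huffman: derive the codes from the code lengths alone.
func canonicalCodes(lengths: [Character: Int]) -> [(Character, String)] {
    // Sort by length, then by symbol within each length.
    let sorted = lengths.sorted {
        $0.value != $1.value ? $0.value < $1.value : $0.key < $1.key
    }
    var code = 0
    var prevLen = sorted.first?.value ?? 0
    var result: [(Character, String)] = []
    for (sym, len) in sorted {
        code <<= (len - prevLen)   // append a 0 bit each time the length grows
        prevLen = len
        let bits = String(code, radix: 2)
        result.append((sym, String(repeating: "0", count: len - bits.count) + bits))
        code += 1
    }
    return result
}

let lengths: [Character: Int] = [" ": 3, "o": 3, "a": 4, "d": 4, "g": 4, "h": 4, "i": 4,
                                 "l": 4, "m": 4, "p": 4, "r": 4, "s": 4, "t": 4, "u": 4]
for (sym, code) in canonicalCodes(lengths: lengths) {
    print("'\(sym)' \(code)")  // reproduces the table above
}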
I found a way to store the code of each character.
For example:
I write the tree, starting at the root and going down to the left, then right.
So, if my tree was something like
          0
        /   \
       0     1
      / \   / \
    'a' 'b' 'c' 'd'
The header of my file would be something like this:
001[8 bits from 'a']1[8 bits from b]01[8 bits from c]1[8 bits from d]
With this, I would be able to rebuild my tree.
My problem now is reading the header of the file bit by bit, to know in which direction I have to create a new node.
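For the bit-by-bit reading itself, here is a hypothetical sketch (in Swift rather than the question's C, reusing the BitReader struct and call style from the earlier answer in this document): rebuilding the tree is one recursive read per node, where a 1 bit means a leaf followed by the symbol's 8 bits, and a 0 bit means an internal node followed by its two subtrees.
indirect enum HuffNode {
    case leaf(UInt8)
    case inner(HuffNode, HuffNode)
}

// Reads one node from the serialized header, recursing for internal nodes.
func readTree(_ reader: inout BitReader) -> HuffNode {
    if reader.nextBits(1) == 1 {                 // leaf marker
        return .leaf(UInt8(reader.nextBits(8)))  // the symbol's 8 bits
    }
    let left = readTree(&reader)                 // internal node: left subtree first,
    let right = readTree(&reader)                // then the right subtree
    return .inner(left, right)
}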
I have a struct, say:
type ASDF struct {
A uint64
B uint64
C uint64
D uint64
E uint64
F string
}
I create a slice of that struct: a := []ASDF{}
I do operations on that slice of the struct (adding/removing/updating structs that vary in contents); how can I get the total size in bytes (for memory) of the slice and its contents? Is there a built-in to do this or do I need to manually run a calculation using unsafe.Sizeof and then len each string?
Sum the size of all memory, excluding garbage collector and other overhead. For example,
package main
import (
"fmt"
"unsafe"
)
type ASDF struct {
A uint64
B uint64
C uint64
D uint64
E uint64
F string
}
func (s *ASDF) size() int {
size := int(unsafe.Sizeof(*s))
size += len(s.F)
return size
}
func sizeASDF(s []ASDF) int {
size := 0
s = s[:cap(s)]
size += cap(s) * int(unsafe.Sizeof(s))
for i := range s {
size += (&s[i]).size()
}
return size
}
func main() {
a := []ASDF{}
b := ASDF{}
b.A = 1
b.B = 2
b.C = 3
b.D = 4
b.E = 5
b.F = "ASrtertetetetetetetDF"
fmt.Println((&b).size())
a = append(a, b)
c := ASDF{}
c.A = 10
c.B = 20
c.C = 30
c.D = 40
c.E = 50
c.F = "ASetDF"
fmt.Println((&c).size())
a = append(a, c)
fmt.Println(len(a))
fmt.Println(cap(a))
fmt.Println(sizeASDF(a))
}
Output:
69
54
2
2
147
http://play.golang.org/p/5z30vkyuNM
I'm afraid to say that unsafe.Sizeof is the way to go here if you want to get any result at all. The in-memory size of a structure is nothing you should rely on. Note that even the result of unsafe.Sizeof is inaccurate: the runtime may add headers to the data, which you cannot observe, to aid with garbage collection.
For your particular example (finding a cache size), I suggest you go with a static size that is sensible for many processors. In almost all cases, such micro-optimizations are not going to pay off.