binary.Read() not producing expected values in struct - parsing

I'm trying to make a MOBI file parser and I'm running into a bit of an issue with trying to parse some of the binary into a struct with binary.Read().
I'm thinking it's an alignment issue, but I'm at a loss for why I'm not getting expected values. I've run the .mobi file through libmobi to test my code's output against, as well as inspected the binary of the .mobi in order to verify that I'm not crazy and the libmobi code wasn't doing something weird (which it's not).
Here's a stripped-down example:
package main
import (
"bytes"
"encoding/binary"
"fmt"
)
type Header struct {
Type [4]byte
Creator [4]byte
Uid uint32
Next uint32
RecordCount uint16
}
func main() {
testBytes := []byte{66, 79, 79, 75, 77, 79, 66, 73, 0, 0, 1, 17, 0, 0, 0, 0, 0, 136}
h := Header{}
buf := bytes.NewBuffer(testBytes)
binary.Read(buf, binary.LittleEndian, &h)
fmt.Printf("%s\n", h.Type) // BOOK, as expected
fmt.Printf("%s\n", h.Creator) // MOBI, as expected
fmt.Printf("%d\n", h.Next) // 0, as expected
fmt.Printf("%d\n", h.Uid)
// expecting Uid to be 273, but it's 285278208...
fmt.Printf("%d\n", h.RecordCount)
// expecting RecordCount to be 136, but it's 34816...
}
Any help would be greatly appreciated!
EDIT: here are the hex bytes from doing a xxd on book.mobi:
424f 4f4b 4d4f 4249 0000 0111 0000 0000 0088

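Yes, BigEndian works much better. It isn't an alignment issue: binary.Read decodes the struct fields in order with no padding. The bytes 00 00 01 11 only give 273 when they are read most-significant byte first, so the byte order was wrong.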
package main
import (
"bytes"
"encoding/binary"
"fmt"
)
type Header struct {
Type [4]byte
Creator [4]byte
Uid uint32
Next uint32
RecordCount uint16
}
func main() {
testBytes := []byte{66, 79, 79, 75, 77, 79, 66, 73, 0, 0, 1, 17, 0, 0, 0, 0, 0, 136}
h := Header{}
buf := bytes.NewBuffer(testBytes)
binary.Read(buf, binary.BigEndian, &h)
fmt.Printf("%s\n", h.Type) // BOOK, as expected
fmt.Printf("%s\n", h.Creator) // MOBI, as expected
fmt.Printf("%d\n", h.Next) // 0, as expected
fmt.Printf("%d\n", h.Uid) // 273, as expected
fmt.Printf("%d\n", h.RecordCount) // 136, as expected
}
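As a cross-check outside Go, the same 18 bytes can be unpacked both ways in Python. This is only a sketch: the struct format string is my own mapping of the Header fields (4 bytes, 4 bytes, uint32, uint32, uint16), not something taken from a MOBI spec.

```python
import struct

# The 18 header bytes from the question.
test_bytes = bytes([66, 79, 79, 75, 77, 79, 66, 73, 0, 0, 1, 17, 0, 0, 0, 0, 0, 136])

# Assumed layout mirroring the Go struct: Type, Creator, Uid, Next, RecordCount.
# A leading ">" or "<" selects the byte order and disables padding.
FIELDS = "4s4sIIH"

big = struct.unpack(">" + FIELDS, test_bytes)
little = struct.unpack("<" + FIELDS, test_bytes)

print(big)     # (b'BOOK', b'MOBI', 273, 0, 136)
print(little)  # (b'BOOK', b'MOBI', 285278208, 0, 34816)
```

The little-endian result reproduces exactly the unexpected values from the question, which confirms that only the byte order differs.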

Related

KafkaConsumer receives other data (DelphiKafkaClient)

With DelphiKafkaClient I tested this code:
ABytes := TFile.ReadAllBytes('d:\red.bmp')
FKafkaProducer.Produce(edtTopic.Text, #abytes, length(ABytes))
//bytes: 66, 77, 186, 1, ...
But in TfrmConsume.Start, line 183 the content differs:
MsgRec.Data := TKafkaUtils.PointerToBytes(Msg.payload, Msg.len)
//bytes 152, 231, 229, 8, ...
However, both data lengths match! What am I doing wrong?

How do I find an error in the while loop?

I need help. I wrote the code and checked it a hundred times, but I can't find the error. All the code before the while loop works without errors, so the error is in the loop itself. When you run it, you get an infinite loop.
I would be grateful if you could tell me where I made the mistake and why it turns into an infinite loop.
class Semis {
int i;
double k;
Semis(this.i, this.k);
}
void main() {
var p = [
0, 1, 5, 8, 9, 10, 17, 17, 20, 24, // 0X's
30, 32, 35, 39, 43, 43, 45, 49, 50, 54, // 1X's
57, 60, 65, 68, 70, 74, 80, 81, 84, 85, // 2X's
87, 91, 95, 99, 101, 104, 107, 112, 115, 116, // 3X's
119, 121, 125, 129, 131, 134, 135, 140, 143, 145, // 4X's
151
];
Function cutLog = (List p, int n) {
// Some array to store calculated values
num sum = 0;
int iter = 0;
int stock = n;
List<Semis> pL = [];
var map = Map.fromIterable(p,
key: (index) => p.indexOf(index),
value: (item) => item / (p.indexOf(item) > 0 ? p.indexOf(item) : 1));
var sortedMap = Map.fromEntries(
map.entries.toList()..sort((e1, e2) => e2.value.compareTo(e1.value)));
sortedMap.forEach(
(i, k) => pL.isEmpty || pL.last.i > i ? pL.add(Semis(i, k)) : null);
while (stock > 0) {
if ((stock - pL[iter].i) > 0) {
sum = sum + p[pL[iter].i];
stock = stock - pL[iter].i;
} else
iter++;
}
return sum; // Good luck intern!
};
print(cutLog(p, 5));
}
You get an infinite loop because the loop condition never becomes false.
The condition is stock > 0. However, what you do in the loop is:
If stock minus some value is > 0, you decrement stock by that value, so stock stays above 0.
Otherwise, you increment the iterator.
You never allow stock to be decremented far enough to reach 0. I think your if comparison should use >= 0, if that is logical for your algorithm. If not, you probably need to rework it further.
If you look at the list of Semis you create, its last element is Semis(0, 0.0).
That means that your loop will eventually reach this entry, and then
if ((stock - pL[iter].i) > 0) {
sum = sum + p[pL[iter].i];
stock = stock - pL[iter].i;
will do nothing because pL[iter].i is zero.
You probably need to bail out of the loop at this point.
Your loop will, as Lyubomir Vasilev says, never have a false condition because stock never reaches zero. Your iter is incremented, and if it hadn't been for the Semis(0, _) entry, you would eventually have run past the end of the pL list and gotten an index-out-of-range error. With that zero-length entry in the list, you will run forever.
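To make both fixes concrete, here is a minimal Python sketch of the loop with the >= comparison and a bail-out. The names cut_log, prices and candidates are made up for illustration, and the short candidates list is a hypothetical stand-in for pL (lengths in the order the loop tries them, ending in the problematic 0 entry), not the list the Dart code actually builds.

```python
def cut_log(prices, candidates, n):
    """Greedy loop from the question with a termination guard added."""
    total = 0
    stock = n
    it = 0
    # Stop when the stock is used up OR when there are no candidates left.
    while stock > 0 and it < len(candidates):
        length = candidates[it]
        # >= lets the stock reach exactly 0; length > 0 skips the zero entry.
        if length > 0 and stock - length >= 0:
            total += prices[length]
            stock -= length
        else:
            it += 1
    return total


prices = [0, 1, 5, 8, 9, 10]   # first few prices from the question
candidates = [3, 2, 1, 0]      # hypothetical order, ending in the 0 entry
print(cut_log(prices, candidates, 5))  # 13
```

With the original > 0 comparison and no guard, the same input would reach the 0 entry with stock still at 1 and spin there forever.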

How to read byte array data in Dart?

I'm connecting to a TCP socket server and sending a request. The server sends its response as a byte array. How do I read byte array data in Dart?
Socket.connect('localhost', 8081)
.then((socket) {
//Establish the onData, and onDone callbacks
socket.listen((data) {
print(new String.fromCharCodes(data).trim()); //Here data is byte[]
//How to read byte array data
},
onDone: () {
print("Done");
// socket.destroy();
},
onError: (e) {
print('Server error: $e');
});
socket.add([255, 12, 0, 11, 0, 9, 34, 82, 69, 70, 84, 65, 72, 73, 76]);
});
}
It depends on which data type was encoded to bytes. Let's suppose it's a String.
Then you can decode it with the dart:convert library.
import 'dart:convert' show utf8;
final decoded = utf8.decode(data);
It's pretty clear that there's a message structure in those bytes. You give two examples of messages:
[255, 12, 0, 11, 0, 9, 34, 82, 69, 70, 84, 65, 72, 73, 76]
and
[255, 20, 0, 11, 0, 0, 0, 15, 80, 82, 69, 77, 84, 65, 72, 73, 76, 45, 53, 53, 57, 55, 48]
Both start with 255, followed by what looks like two or three little-endian 16-bit words (12 and 11) and (20, 11 and 0), followed by a string whose length is encoded in a leading byte. If you are expected to interoperate with another system, you really need the protocol spec.
Assuming I've guessed the structure correctly, this code
import 'dart:convert' show utf8;
import 'dart:typed_data';

void main() {
  Uint8List input = Uint8List.fromList([
    255, 20, 0, 11, 0, 0, 0, 15, // marker, three 16-bit words, string length
    80, 82, 69, 77, 84, 65, 72, 73, 76, 45, 53, 53, 57, 55, 48 // string bytes
  ]);
  ByteData bd = input.buffer.asByteData();
  print(bd.getUint16(1, Endian.little)); // print the first short
  print(bd.getUint16(3, Endian.little)); // and the second
  print(bd.getUint16(5, Endian.little)); // and the third
  int stringLength = input[7]; // get the length of the string
  print(utf8.decode(input.sublist(8, 8 + stringLength))); // decode the string
}
produces
20
11
0
PREMTAHIL-55970
as expected
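If you want to check the guessed layout from the other end of the socket, the same parse can be sketched in Python. The function name parse_message and the layout itself (marker byte 255, three little-endian 16-bit words, one length byte, then the string) are assumptions carried over from the guess above, not a protocol spec.

```python
import struct


def parse_message(data: bytes):
    """Parse one message using the guessed layout (an assumption, not a spec)."""
    if len(data) < 8 or data[0] != 255:
        raise ValueError("not a message in the assumed format")
    # Three little-endian unsigned 16-bit words starting at offset 1.
    words = struct.unpack_from("<HHH", data, 1)
    # One length byte at offset 7, followed by that many string bytes.
    length = data[7]
    text = data[8:8 + length].decode("utf-8")
    return words, text


message = bytes([255, 20, 0, 11, 0, 0, 0, 15,
                 80, 82, 69, 77, 84, 65, 72, 73, 76, 45, 53, 53, 57, 55, 48])
print(parse_message(message))  # ((20, 11, 0), 'PREMTAHIL-55970')
```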

Thread 1: Fatal error: UnsafeMutablePointer.initialize overlapping range

I am trying to use
UnsafeMutablePointer<UnsafeMutablePointer<Float>?>!
as it is required as a parameter for a method I need to use. Yet I have no idea what this is or how to use it.
I created this value by doing this :
var bytes2: [Float] = [39, 77, 111, 111, 102, 33, 39, 0]
let uint8Pointer2 = UnsafeMutablePointer<Float>.allocate(capacity: 8)
uint8Pointer2.initialize(from: &bytes2, count: 8)
var bytes: [Float] = [391, 771, 1111, 1111, 1012, 331, 319, 10]
var uint8Pointer = UnsafeMutablePointer<Float>?.init(uint8Pointer2)
uint8Pointer?.initialize(from: &bytes, count: 8)
let uint8Pointer1 = UnsafeMutablePointer<UnsafeMutablePointer<Float>?>!.init(&uint8Pointer)
uint8Pointer1?.initialize(from: &uint8Pointer, count: 8)
But I get the error :
Thread 1: Fatal error: UnsafeMutablePointer.initialize overlapping range
What am I doing wrong?
You are creating bad behaviour..
var bytes2: [Float] = [39, 77, 111, 111, 102, 33, 39, 0]
let uint8Pointer2 = UnsafeMutablePointer<Float>.allocate(capacity: 8)
uint8Pointer2.initialize(from: &bytes2, count: 8)
Creates a pointer to some memory and initializes that memory to the values stored in bytes2..
So: uint8Pointer2 = [39, 77, 111, 111, 102, 33, 39, 0]
Then you decided to create a pointer that references that pointer's memory:
var uint8Pointer = UnsafeMutablePointer<Float>?.init(uint8Pointer2)
So if you printed uint8Pointer, it would have the EXACT same values as uint8Pointer2.. If you decided to change any of its values as well, it'd also change the values of uint8Pointer2..
So when you do:
var bytes: [Float] = [391, 771, 1111, 1111, 1012, 331, 319, 10]
uint8Pointer?.initialize(from: &bytes, count: 8)
It overwrote the values of uint8Pointer2 with [391, 771, 1111, 1111, 1012, 331, 319, 10]..
So far, uint8Pointer is just a shallow copy of uint8Pointer2.. Changing one affects the other..
Now you decided to do:
let uint8Pointer1 = UnsafeMutablePointer<UnsafeMutablePointer<Float>?>!.init(&uint8Pointer)
uint8Pointer1?.initialize(from: &uint8Pointer, count: 8)
Here you created a pointer (uint8Pointer1) to uint8Pointer and you said uint8Pointer1 initialize with uint8Pointer.. but you're initializing a pointer with a pointer to itself and a count of 8..
First of all, don't bother calling initialize on a pointer to pointer with a value of itself.. It's already pointing to the correct values..
What's nice is that:
uint8Pointer1?.initialize(from: &uint8Pointer, count: 1)
//Same as: `memcpy(uint8Pointer1, &uint8Pointer, sizeof(uint8Pointer))`
//However, they both point to the same memory address..
will crash, but:
uint8Pointer1?.initialize(from: &uint8Pointer)
//Same as: `uint8Pointer1 = uint8Pointer`.. Note: Just a re-assignment.
won't.. because it doesn't do a memcpy for the latter.. whereas the former does.
Hopefully I explained it correctly..
P.S. Name your variables properly!
Translation for the C++ people:
//Initial pointer to array..
float bytes2[] = {39, 77, 111, 111, 102, 33, 39, 0};
float* uint8Pointer2 = new float[8]; //Same as allocate(capacity: 8)
memcpy(uint8Pointer2, &bytes2[0], sizeof(bytes2));
//Shallow/Shadowing Pointer...
float* uint8Pointer = uint8Pointer2;
float bytes[] = {391, 771, 1111, 1111, 1012, 331, 319, 10};
memcpy(uint8Pointer, &bytes[0], sizeof(bytes));
//Pointer to pointer..
float** uint8Pointer1 = &uint8Pointer;
//Bad.. uint8Pointer1 and &uint8Pointer is the same damn thing (same memory address)..
//See the line above (float** uint8Pointer1 = &uint8Pointer)..
memcpy(uint8Pointer1, &uint8Pointer, 8 * sizeof(uint8Pointer));
//The memcpy is unnecessary because it already points to the same location.. plus it's also wrong lol.

How to convert Google spreadsheet's worksheet string id to integer index (GID)?

To export a single worksheet of a Google spreadsheet to CSV, you have to pass the worksheet's integer index (GID).
https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=%s&gid=%d&exportFormat=csv
But where is that information? With gdata.spreadsheets.client, I could only find string ids for the worksheets, like "oc6", "ocv", "odf".
client = gdata.spreadsheets.client.SpreadsheetsClient()
feed = client.GetWorksheets(spreadsheet, auth_token=auth_token)
It returns the Atom XML below (part of it):
<entry gd:etag=""URJFCB1NQSt7ImBoXhU."">
<id>https://spreadsheets.google.com/feeds/worksheets/0AvhN_YU3r5e9dGpTWGx3UVU3MTczaXJuNEFKQjMwN2c/ocw</id>
<updated>2012-06-21T08:19:46.587Z</updated>
<app:edited xmlns:app="http://www.w3.org/2007/app">2012-06-21T08:19:46.587Z</app:edited>
<category scheme="http://schemas.google.com/spreadsheets/2006" term="http://schemas.google.com/spreadsheets/2006#worksheet"/>
<title>AchievementType</title>
<content type="application/atom+xml;type=feed" src="https://spreadsheets.google.com/feeds/list/0AvhN_YU3r5e9dGpTWGx3UVU3MTczaXJuNEFKQjMwN2c/ocw/private/full"/>
<link rel="http://schemas.google.com/spreadsheets/2006#cellsfeed" type="application/atom+xml" href="https://spreadsheets.google.com/feeds/cells/0AvhN_YU3r5e9dGpTWGx3UVU3MTczaXJuNEFKQjMwN2c/ocw/private/full"/>
<link rel="http://schemas.google.com/visualization/2008#visualizationApi" type="application/atom+xml" href="https://spreadsheets.google.com/tq?key=0AvhN_YU3r5e9dGpTWGx3UVU3MTczaXJuNEFKQjMwN2c&sheet=ocw"/>
<link rel="self" type="application/atom+xml" href="https://spreadsheets.google.com/feeds/worksheets/0AvhN_YU3r5e9dGpTWGx3UVU3MTczaXJuNEFKQjMwN2c/private/full/ocw"/>
<link rel="edit" type="application/atom+xml" href="https://spreadsheets.google.com/feeds/worksheets/0AvhN_YU3r5e9dGpTWGx3UVU3MTczaXJuNEFKQjMwN2c/private/full/ocw"/>
<gs:rowCount>280</gs:rowCount>
<gs:colCount>28</gs:colCount>
</entry>
I also tried the sheet parameter, but it failed with an "Invalid Sheet" error.
https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=%s&sheet=XXX&exportFormat=csv
I guess there should be some magic function, but I could not find it. How can I convert these to integer ids? Or can I export a worksheet using its string id?
EDIT: I just made a conversion table in Python. DIRTY but working :-(
GID_TABLE = {
'od6': 0,
'od7': 1,
'od4': 2,
'od5': 3,
'oda': 4,
'odb': 5,
'od8': 6,
'od9': 7,
'ocy': 8,
'ocz': 9,
'ocw': 10,
'ocx': 11,
'od2': 12,
'od3': 13,
'od0': 14,
'od1': 15,
'ocq': 16,
'ocr': 17,
'oco': 18,
'ocp': 19,
'ocu': 20,
'ocv': 21,
'ocs': 22,
'oct': 23,
'oci': 24,
'ocj': 25,
'ocg': 26,
'och': 27,
'ocm': 28,
'ocn': 29,
'ock': 30,
'ocl': 31,
'oe2': 32,
'oe3': 33,
'oe0': 34,
'oe1': 35,
'oe6': 36,
'oe7': 37,
'oe4': 38,
'oe5': 39,
'odu': 40,
'odv': 41,
'ods': 42,
'odt': 43,
'ody': 44,
'odz': 45,
'odw': 46,
'odx': 47,
'odm': 48,
'odn': 49,
'odk': 50,
'odl': 51,
'odq': 52,
'odr': 53,
'odo': 54,
'odp': 55,
'ode': 56,
'odf': 57,
'odc': 58,
'odd': 59,
'odi': 60,
'odj': 61,
'odg': 62,
'odh': 63,
'obe': 64,
'obf': 65,
'obc': 66,
'obd': 67,
'obi': 68,
'obj': 69,
'obg': 70,
'obh': 71,
'ob6': 72,
'ob7': 73,
'ob4': 74,
'ob5': 75,
'oba': 76,
'obb': 77,
'ob8': 78,
'ob9': 79,
'oay': 80,
'oaz': 81,
'oaw': 82,
'oax': 83,
'ob2': 84,
'ob3': 85,
'ob0': 86,
'ob1': 87,
'oaq': 88,
'oar': 89,
'oao': 90,
'oap': 91,
'oau': 92,
'oav': 93,
'oas': 94,
'oat': 95,
'oca': 96,
'ocb': 97,
'oc8': 98,
'oc9': 99
}
I found your question looking for a solution to the same problem, and was surprised that those worksheet IDs actually correspond 1:1 to gids - I originally assumed they were assigned independently, instead of being an exercise in obfuscation.
I was able to find a slightly cleaner solution by reverse-engineering the formula they use to generate worksheet IDs from your table:
worksheetID = (gid xor 31578) encoded in base 36
So, some Python to go from a worksheet ID to gid:
def to_gid(worksheet_id):
    return int(worksheet_id, 36) ^ 31578
This is still dirty, but will work for GIDs higher than 99 without requiring giant tables. At least as long as they don't change the generation logic (which they probably won't, as it would break existing IDs that people already use).
This code works with the new Google Sheets.
// Conversion of Worksheet Ids to GIDs and vice versa
// od4 > 2
function wid_to_gid(wid) {
var widval = wid.length > 3 ? wid.substring(1) : wid;
var xorval = wid.length > 3 ? 474 : 31578;
return parseInt(String(widval), 36) ^ xorval;
}
// 2 > od4
function gid_to_wid(gid) {
var xorval = gid > 31578 ? 474 : 31578;
var letter = gid > 31578 ? 'o' : '';
return letter + parseInt((gid ^ xorval)).toString(36);
}
I cannot add a comment to Wasilewski's post because apparently I lack reputation so here are the two conversion functions in Javascript based on Wasilewski's answer:
// Conversion of Worksheet Ids to GIDs and vice versa
// od4 > 2
function wid_to_gid(wid) {
return parseInt(String(wid),36)^31578
}
// 2 > od4
function gid_to_wid(gid) {
// (gid xor 31578) encoded in base 36
return parseInt((gid^31578)).toString(36);
}
This is a Java adaptation of Buho's code which works with both the new Google Sheets and with the legacy Google Spreadsheets.
// "od4" to 2 (legacy style)
// "ogtw0h0" to 1017661118 (new style)
public static int widToGid(String worksheetId) {
boolean idIsNewStyle = worksheetId.length() > 3;
// if the id is in the new style, first strip the first character before converting
worksheetId = idIsNewStyle ? worksheetId.substring(1) : worksheetId;
// determine the integer to use for bitwise XOR
int xorValue = idIsNewStyle ? 474 : 31578;
// convert to gid
return Integer.parseInt(worksheetId, 36) ^ xorValue;
}
// Convert 2 to "od4" (legacy style)
// Convert 1017661118 to "ogtw0h0" (new style)
public static String gidToWid(int gid) {
boolean idIsNewStyle = gid > 31578;
// determine the integer to use for bitwise XOR
int xorValue = idIsNewStyle ? 474 : 31578;
// convert to worksheet id, prepending 'o' if it is the new style.
String encoded = Integer.toString(gid ^ xorValue, 36);
return idIsNewStyle ? "o" + encoded : encoded;
}
This is a Clojure adaptation of Buho's and Julie's code which should work with both the new Google Sheets and with the legacy Google Spreadsheets.
(defn wid->gid [wid]
(let [new-wid? (> (.length wid) 3)
wid (if new-wid? (.substring wid 1) wid)
xor-val (if new-wid? 474 31578)]
(bit-xor (Integer/parseInt wid 36) xor-val)))
(defn gid->wid [gid]
(let [new-gid? (> gid 31578)
xor-val (if new-gid? 474 31578)
letter (if new-gid? "o" "")]
(str letter (Integer/toString (bit-xor gid xor-val) 36))))
If you're using Python with gspread, here's what you do:
wid = worksheet.id
widval = wid[1:] if len(wid) > 3 else wid
xorval = 474 if len(wid) > 3 else 31578
gid = int(str(widval), 36) ^ xorval
I'll probably open a PR for this.
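For reference, both directions can be sketched in plain Python without gspread. This just restates the answers above: the XOR constants 31578 and 474, the leading 'o' for new-style ids, and the gid > 31578 heuristic all come from those posts, and none of it is documented by Google. Python has no built-in base-36 encoder, so to_base36 is a small helper of my own.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(value: int) -> str:
    """Encode a non-negative integer in lowercase base 36 (own helper)."""
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, rem = divmod(value, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def wid_to_gid(wid: str) -> int:
    # New-style ids are longer than 3 characters and carry an extra leading 'o'.
    new_style = len(wid) > 3
    digits = wid[1:] if new_style else wid
    return int(digits, 36) ^ (474 if new_style else 31578)


def gid_to_wid(gid: int) -> str:
    # Heuristic from the answers above: large gids use the new-style encoding.
    new_style = gid > 31578
    prefix = "o" if new_style else ""
    return prefix + to_base36(gid ^ (474 if new_style else 31578))


print(wid_to_gid("od4"), gid_to_wid(2))                    # 2 od4
print(wid_to_gid("ogtw0h0"), gid_to_wid(1017661118))       # 1017661118 ogtw0h0
```

The two sample pairs are the ones quoted in the Java answer, and 'od6' and 'ocw' map to 0 and 10 as in the question's table.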

Resources