Understanding LuaJIT SNAP IR instruction - lua

I am trying to track down a "register coalescing too complex" NYI in my LuaJIT code. From the IR I can see that the snapshot at the point where the NYI happens is pretty full. My plan is to trace backwards and find out what causes the snapshot to fill up.
To start with, I would like to understand what information the SNAP line gives. For example, take the SNAP lines below:
> local x = 1.2 for i=1,1e3 do x = x * -3 end
---- TRACE 1 start stdin:1
0006 MULVN 0 0 1 ; -3
0007 FORL 1 => 0006
---- TRACE 1 IR
.... SNAP #0 [ ---- ]
0001 rbp int SLOAD #2 CI
0002 xmm7 > num SLOAD #1 T
0003 xmm7 + num MUL 0002 -3
0004 rbp + int ADD 0001 +1
.... SNAP #1 [ ---- 0003 ]
0005 > int LE 0004 +1000
.... SNAP #2 [ ---- 0003 0004 ---- ---- 0004 ]
0006 ------------ LOOP ------------
0007 xmm7 + num MUL 0003 -3
0008 rbp + int ADD 0004 +1
.... SNAP #3 [ ---- 0007 ]
0009 > int LE 0008 +1000
0010 rbp int PHI 0004 0008
0011 xmm7 num PHI 0003 0007
If my understanding is correct, in the first snapshot (SNAP #1) the second position is written by the IR instruction at 0003. Going by the arguments of the IR at 0003, I guess 0002 (is this a memory location?) is x.
What I do not understand is that in the next snapshot (SNAP #2, after IR 0005) the 3rd and 6th positions are written by the IR at 0004. How is that?
Now, how can I tell which variables are held in each snapshot position in the IR above? For example, in SNAP #3 [ ---- 0007 ].
Also, what do the flags in the second argument to SLOAD signify (I, CI, CRI, T, PI, PRI, R, RI, etc.)? I have also seen SLOAD with the second argument empty.

This has been answered extensively on the LuaJIT mailing list by Peter Cawley in the following thread:
https://www.freelists.org/post/luajit/Understanding-SNAP
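As a starting point for reading the dump, here is a minimal sketch that splits a SNAP line into positions. It assumes positions are counted from zero and line up with the slot number in SLOAD #n (in the dump above, SLOAD #1 feeds 0002 and 0003, and position 1 of SNAP #1 holds 0003). The helper name parse_snap is made up for illustration and is not part of LuaJIT.

```python
import re

def parse_snap(line):
    """Return (snapshot number, {position: IR ref}) for one SNAP line of an IR dump."""
    m = re.search(r"SNAP\s+#(\d+)\s+\[(.*)\]", line)
    if m is None:
        raise ValueError("not a SNAP line: %r" % line)
    entries = m.group(2).split()
    # "----" marks a position with no IR reference in this snapshot; skip those.
    slots = {pos: ref for pos, ref in enumerate(entries) if ref != "----"}
    return int(m.group(1)), slots

num, slots = parse_snap(".... SNAP #2 [ ---- 0003 0004 ---- ---- 0004 ]")
print(num, slots)
```

Under that assumption, positions 2 and 5 of SNAP #2 (the 3rd and 6th entries) both refer to IR 0004, the incremented loop counter.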

Related

LUA bad argument #2

I am a total beginner with Lua / ESP8266 and I am trying to find out where this error comes from:
PANIC: unprotected error in call to Lua API (bad argument #2 to 'set' (index out of range))
This is the whole message in the serial monitor:
NodeMCU 2.2.0.0 built with Docker provided by frightanic.com
.branch: master
.commit: 11592951b90707cdcb6d751876170bf4da82850d
.SSL: false
.Build type: float
.LFS: disabled
.modules: adc,bit,dht,file,gpio,i2c,mqtt,net,node,ow,spi,tmr,uart,wifi
build created on 2019-12-07 23:52
powered by Lua 5.1.4 on SDK 2.2.1(6ab97e9)
> Config done, IP is 192.168.2.168
LED-Server started
PANIC: unprotected error in call to Lua API (bad argument #2 to 'set' (index out of range))
ets Jan 8 2013,rst cause:2, boot mode:(3,6)
load 0x40100000, len 27780, room 16
tail 4
chksum 0xbc
load 0x3ffe8000, len 2188, room 4
tail 8
chksum 0xba
load 0x3ffe888c, len 136, room 0
tail 8
chksum 0xf2
csum 0xf2
å¬ú‰.Éo‰ísÉÚo|Ï.å.õd$`..#íú..æÑ2rí.lúN‡.Éo„..l`.Ñ‚r€lÑ$.å...l`.Ñ‚s≤pɉ$.å....l`.Ñ‚r€l.èæ.å...$l`.{$é.êo.Ñü¬cc.ÑÑ".|l.Bè.c.‰è¬.lc‰ÚnÓ.2NN‚....å#€‚n.ÏéÑ.l..$Ïådè|Ïl.é.lÄ.o¸.Ñæ.#".llÏÑè..c...åû„åc.l.Ñb.{$r.
I uploaded this code (https://github.com/Christoph-D/esp8266-wakelight) to the ESP8266 and built the correct NodeMCU firmware with all required modules.
The serial output is OK for a couple of seconds, then I get this error and the board keeps rebooting.
Where should I start looking for the problem?
Thanks a lot!
EDIT: there are only a few places in the Lua files where "set" is used:
local function update_buffer(buffer, c)
  if not c.r_frac then c.r_frac = 0 end
  if not c.g_frac then c.g_frac = 0 end
  if not c.b_frac then c.b_frac = 0 end
  local r2 = c.r_frac >= 0 and c.r + 1 or c.r - 1
  local g2 = c.g_frac >= 0 and c.g + 1 or c.g - 1
  local b2 = c.b_frac >= 0 and c.b + 1 or c.b - 1
  local r3, g3, b3
  local set = buffer.set
  for i = 1, NUM_LEDS do
    if i > c.r_frac then r3 = c.r else r3 = r2 end
    if i > c.g_frac then g3 = c.g else g3 = g2 end
    if i > c.b_frac then b3 = c.b else b3 = b2 end
    set(buffer, i - 1, g3, r3, b3)
  end
end
Is there anything wrong?
Just above the for-loop where set is called, try adding this:
print(buffer:size(), NUM_LEDS)
If everything is OK, it should print the same number twice. If NUM_LEDS is larger, then that's your bug.
I don't really get why it uses the global variable in that place anyway; it'd make much more sense to use buffer:size() instead for exactly this reason.

How to compute memory displacement in assembly?

I've been working on yasm assembly language and I generated a listing file that contains the following. I need help understanding how the memory displacement is computed in the first column. Thanks in advance.
1 %line 1+1 memory.asm
2 [section .data]
3 00000000 04000000 a dd 4
4 00000004 CDCC8C40 b dd 4.4
5 00000008 00000000<rept> c times 10 dd 0
6 00000030 01000200 d dw 1, 2
7 00000034 FB e db 0xfb
8 00000035 68656C6C6F20776F72- f db "hello world", 0
9 00000035 6C6400
The assembler emits bytes (data or machine code) one after another, starting at some start address (here 0). The first directive, a dd 4, produces the 4 bytes 04 00 00 00, which fill addresses 0, 1, 2 and 3, so the next free slot is at address 4. That is where b dd 4.4 goes, again 4 bytes long, which makes 8 the next free address. c times 10 dd 0 is 40 bytes long, so 8 + 40 = 48 (0x30) is the next free slot, and so on: the first column is simply the running total of the sizes of everything emitted before that line.
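The running total can be reproduced with a short sketch. The sizes below are read off the directives in the listing (dd = 4 bytes, dw = 2 bytes, db = 1 byte per element); the layout helper is a made-up name for illustration.

```python
# (label, size in bytes) for each data directive in the listing.
directives = [
    ("a", 4),                       # dd 4           -> one 4-byte dword
    ("b", 4),                       # dd 4.4         -> one 4-byte float
    ("c", 10 * 4),                  # times 10 dd 0  -> ten dwords
    ("d", 2 * 2),                   # dw 1, 2        -> two 2-byte words
    ("e", 1),                       # db 0xfb        -> one byte
    ("f", len("hello world") + 1),  # db "hello world", 0
]

def layout(items, start=0):
    """Assign each label the address of the next free byte."""
    offsets = {}
    addr = start
    for label, size in items:
        offsets[label] = addr
        addr += size
    return offsets, addr

offsets, end = layout(directives)
for label, off in offsets.items():
    print(f"{off:08X} {label}")
```

The printed offsets match the first column of the listing: 00000000, 00000004, 00000008, 00000030, 00000034 and 00000035.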

Memory range calculation

I have a question about calculating memory addresses:
I am given 3 Memory blocks:
- 1x 1KByte (IC1) - 2^10 Byte
- 2x 4KByte (IC2 + IC3) 2^12 Byte
So far I calculated these memory addresses:
IC1:
0000 0000 0000 0000 (starting address)
0000 0011 1111 1111 (ending address; I got this by inverting the last 10 digits)
IC2:
0000 0100 0000 0000 (starting address: last ending address + 1)
0000 1011 1111 1111 (ending address; I got this by inverting the last 12 digits)
However, for IC3 there has to be some way to get a carry bit into my first 0000 block, as I run out of space when using only the last 3 hex digits:
IC3:
0000 1100 0000 0000 (starting address: last ending address + 1)
What is the ending address now? If I inverted the last 12 digits again, I would get a hex address which is already in use. It's pretty obvious that the next hex digit has to be increased to 1, but I can't find a rule to do this.
Any advice?
I'm not sure why you're using bit flipping for this. It looks like it would be a very efficient implementation if it worked, but it doesn't seem to.
Your IC2 block's starting address (in hex) is 400, which is 1K from the start of memory, so all good so far. But your ending address is BFF in hex when it should be 13FF (1K + 4K = 5K, minus 1), which in binary is 0001 0011 1111 1111.
Is there a reason why you cannot calculate these addresses using addition instead of bit-flipping?
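Using addition, the ranges can be sketched as follows. This assumes the three blocks are simply placed back to back from address 0, as in the question; address_ranges is a made-up helper name.

```python
def address_ranges(sizes, base=0):
    """Place blocks back to back and return (start, end) for each."""
    ranges = []
    start = base
    for size in sizes:
        end = start + size - 1       # last address that belongs to this block
        ranges.append((start, end))
        start = end + 1              # next block begins right after
    return ranges

chips = {"IC1": 1 * 1024, "IC2": 4 * 1024, "IC3": 4 * 1024}
for name, (start, end) in zip(chips, address_ranges(chips.values())):
    print(f"{name}: {start:04X} - {end:04X}  ({end:016b})")
```

This gives IC1 0000 to 03FF, IC2 0400 to 13FF and IC3 1400 to 23FF; the carry into the next hex digit falls out of the addition by itself.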

Print Bitmap to ESC/POS printer

I am trying to print to an ESC/POS compatible printer and am struggling to get my head around GS v 0. I have just connected from a Mac and am sending commands in hex via CoolTerm.
The docs say ...
GS v 0 m xL xH yL yH d1....dk
-----------------------------------------------------
[Name] Print raster bit image
[Format] ASCII GS v 0 m xL xH yL yH d1....dk
Hex 1D 76 30 m xL xH yL yH d1....dk
Decimal 29 118 48 m xL xH yL yH d1....dk
[Range] 0≤xL≤48, xH=0; 0≤yL≤255, yH=0; 0≤d≤255
k=(xL+xH×256)×(yL+yH×256)(k≠0)
[Description] Selects Raster bit-image mode. The value of m selects the mode, as follows:
+-------+---------------+----------------------+------------------------+
| m     | Mode          | Vertical dot density | Horizontal dot density |
+-------+---------------+----------------------+------------------------+
| 0, 48 | Normal        | 200 DPI              | 200 DPI                |
| 1, 49 | Double-width  | 200 DPI              | 100 DPI                |
| 2, 50 | Double-height | 100 DPI              | 200 DPI                |
| 3, 51 | Quadruple     | 100 DPI              | 100 DPI                |
+-------+---------------+----------------------+------------------------+
• xL, xH, select the number of data bits ( xL+ xH × 256) in the horizontal direction for the bit image.
• yL, yH, select the number of data bits ( yL+ yH × 256) in the vertical direction for the bit image.
• This command has no effect in all print modes (character size, emphasized, double-strike, upside-down, underline, white/black reverse printing, etc.) for raster bit image.
• The part of bit image that exceeds the printable area will not be printed.
• d indicates the bit-image data. Setting a bit to 1 prints a dot and setting it to 0 does not print a dot.
So from this I deduce I need to send the following in HEX
1D 76 30 30 20 00 00 01
Does the image data now follow this, and do I have to send a message saying the image has ended?
I remember there is an <ESC>K (not GS) command to print 8 lines of pixels; see the ESC commands for details. After K, 2 bytes must be sent giving the number of data bytes, followed by the data itself. But it is not guaranteed that every printer supports this. What is the brand and model of yours?
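The docs say:
Hex 1D 76 30 m xL xH yL yH d1....dk
The values of m in the table (0..3 or 48..51) are decimal, just like the 48 in the Decimal row that corresponds to hex 30. In hex, m is therefore 00..03 or 30..33, and
1D 76 30 30 20 00 00 01
is read as m = 0x30 (normal mode), xL = 0x20, xH = 0x00, yL = 0x00, yH = 0x01. That asks for yL + yH × 256 = 256 rows, which is outside the documented range yH = 0. For a single row you would rather send, for instance,
1D 76 30 30 20 00 01 00
followed by k = (xL + xH × 256) × (yL + yH × 256) = 32 × 1 = 32 data bytes. The format quoted above lists no terminator after d1....dk.
I don't think it makes much sense for yL to be 0.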
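To make the byte layout concrete, here is a minimal sketch that assembles a GS v 0 command. It assumes the m values in the table are decimal (48 = 0x30, as in the Decimal row of the format) and that xL/xH count bytes per row, which is what the formula for k implies. The function name gs_v_0 is made up, and nothing here has been tested against a real printer.

```python
def gs_v_0(m, width_bytes, height, data):
    """Build a GS v 0 command: 1D 76 30 m xL xH yL yH d1..dk."""
    k = width_bytes * height
    if len(data) != k:
        raise ValueError("expected k = %d data bytes, got %d" % (k, len(data)))
    header = bytes([
        0x1D, 0x76, 0x30, m,
        width_bytes % 256, width_bytes // 256,   # xL, xH
        height % 256, height // 256,             # yL, yH
    ])
    return header + bytes(data)

# One row, 32 bytes wide, every dot set.
cmd = gs_v_0(0x30, 32, 1, [0xFF] * 32)
print(cmd[:8].hex(" ").upper())
print(len(cmd))
```

The header comes out as 1D 76 30 30 20 00 01 00, and the image data follows immediately as the remaining k = 32 bytes.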

How to calculate Internet checksum?

I have a question regarding how the Internet checksum is calculated. I couldn't find any good explanation from the book, so I ask it here.
Have a look at the following example.
The following two messages are sent: 10101001 and 00111001. The checksum is calculated with 1's complement. So far I understand. But how is the sum calculated? At first I thought it maybe is XOR, but it seems not to be the case.
10101001
00111001
--------
Sum 11100010
Checksum: 00011101
And then they check whether the message arrived OK. Once again, how is the sum calculated?
10101001
00111001
00011101
--------
Sum 11111111
Complement 00000000 means that the pattern is O.K.
It uses addition, hence the name "sum". 10101001 + 00111001 = 11100010.
For example:
+------------+-----+----+----+----+---+---+---+---+--------+
| bin value | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | result |
+------------+-----+----+----+----+---+---+---+---+--------+
| value 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 169 |
| value 2 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 57 |
| sum/result | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 226 |
+------------+-----+----+----+----+---+---+---+---+--------+
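The numbers from the question can be checked with a few lines. This is only a sketch of the 8-bit example above, not a general checksum routine.

```python
a = 0b10101001
b = 0b00111001

total = (a + b) & 0xFF      # plain binary addition, kept to 8 bits
checksum = ~total & 0xFF    # 1's complement: flip every bit

print(f"{total:08b}")
print(f"{checksum:08b}")

# Receiver side: both messages plus the checksum add up to all ones.
check = (a + b + checksum) & 0xFF
print(f"{check:08b}")
```

The three printed values are the Sum (11100010), the Checksum (00011101) and the receiver's Sum (11111111) from the question.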
If by internet checksum you mean TCP Checksum there's a good explanation here and even some code.
When you're calculating the checksum remember that it's not just a function of the data but also of the "pseudo header" which puts the source IP, dest IP, protocol, and length of the TCP packet into the data to be checksummed. This ties the TCP meta data to some data in the IP header.
TCP/IP Illustrated Vol 1 is a good reference for this and explains it all in detail.
The calculation of the Internet checksum uses ones complement arithmetic. Consider the data being checksummed as a sequence of 8-bit integers. First you add them using ones complement arithmetic, then you take the ones complement of the result.
NOTE: When adding numbers in ones complement arithmetic, a carry out of the MSB needs to be added back to the result. Consider, e.g., adding the ones complements of 3 (0011) and 5 (0101), i.e. -3 and -5:
3'->1100
5'->1010
0110 with a carry of 1
Adding the carry back in gives 0111 (the 1's complement representation of -8).
The checksum is the 1's complement of the result obtained in the previous step, hence 1000. If there is no carry, we just complement the result obtained in the summing stage.
The UDP checksum is created on the sending side by summing all the 16-bit words in the segment, with any overflow wrapped around; then the 1's complement of the sum is taken and the result is placed in the checksum field of the segment.
At the receiver side, all words in the packet, including the checksum, are added up. If the result is 1111 1111 1111 1111 the segment is valid; otherwise the segment has an error.
Example:
0110 0110 0110 0000
0101 0101 0101 0101
1000 1111 0000 1100
--------------------
1 0100 1010 1100 0001 // there is an overflow, so we wrap it around, i.e. add it back into the sum
the sum = 0100 1010 1100 0010
Now let's take the 1's complement:
checksum = 1011 0101 0011 1101
At the receiver, the sum is calculated and then added to the checksum:
0100 1010 1100 0010
1011 0101 0011 1101
----------------------
1111 1111 1111 1111 // this should be the result; if it isn't, there is an error
Reference: Computer Networking: A Top-Down Approach (Kurose, Ross)
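The wrap-around step can be sketched as below, using the three 16-bit words from the example. The function name ones_complement_sum is made up for illustration.

```python
def ones_complement_sum(words, bits=16):
    """Add words, folding any overflow out of `bits` back into the low end."""
    mask = (1 << bits) - 1
    total = 0
    for w in words:
        total += w
        total = (total & mask) + (total >> bits)   # wrap the overflow around
    return total

words = [0b0110011001100000, 0b0101010101010101, 0b1000111100001100]
s = ones_complement_sum(words)
checksum = ~s & 0xFFFF

print(f"{s:016b}")
print(f"{checksum:016b}")

# Receiver: summing the words together with the checksum gives all ones.
print(f"{ones_complement_sum(words + [checksum]):016b}")
```

With bits=4 the same function reproduces the small example from the earlier answer: 1100 + 1010 wraps around to 0111.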
Here's a complete example with a real header of an IPv4 packet.
In the following example, I use bc, printf and here strings to calculate the header checksum and verify it. Consequently, it should be easy to reproduce the results on Linux by copy-pasting the commands.
These are the twenty bytes of our example packet header:
45 00 00 34 5F 7C 40 00 40 06 [00 00] C0 A8 B2 14 C6 FC CE 19
The sender hasn't calculated the checksum yet. The two bytes in square brackets are where the checksum will go. The checksum's value is initially set to zero.
We can mentally split up this header as a sequence of ten 16-bit values: 0x4500, 0x0034, 0x5F7C, etc.
Let's see how the sender of the packet calculates the header checksum:
Add all 16-bit values to get 0x42C87: bc <<< 'obase=16;ibase=16;4500 + 0034 + 5F7C + 4000 + 4006 + 0000 + C0A8 + B214 + C6FC + CE19'
The leading digit 4 is the carry count, we add this to the rest of the number to get 0x2C8B: bc <<< 'obase=16;ibase=16;2C87 + 4'
Invert¹ 0x2C8B to get the checksum: 0xD374
Finally, insert the checksum into the header:
45 00 00 34 5F 7C 40 00 40 06 [D3 74] C0 A8 B2 14 C6 FC CE 19
Now the header is ready to be sent.
The recipient of the IPv4 packet then creates the checksum of the received header in the same way:
Add all 16-bit values to get 0x4FFFB: bc <<< 'obase=16;ibase=16;4500 + 0034 + 5F7C + 4000 + 4006 + D374 + C0A8 + B214 + C6FC + CE19'
Again, there's a carry count so we add that to the rest to get 0xFFFF: bc <<< 'obase=16;ibase=16;FFFB + 4'
If the checksum is 0xFFFF, as in our case, the IPv4 header is intact.
See the Wikipedia entry for more information.
¹Inverting the hexadecimal number means converting it to binary, flipping the bits, and converting it to hexadecimal again. You can do this online or with Bash: hex_nr=0x2C8B; hex_len=$(( ${#hex_nr} - 2 )); inverted=$(printf '%X' "$(( ~ hex_nr ))"); trunc_inverted=${inverted: -hex_len}; echo $trunc_inverted
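For readers who prefer a script to bc, the same steps can be sketched in Python. The function name ipv4_header_checksum is made up, and the sketch assumes an even number of header bytes, as in the 20-byte example.

```python
def ipv4_header_checksum(header):
    """Checksum of a header given as bytes (checksum field zeroed by the sender)."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]   # big-endian 16-bit value
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)    # add the carry count back in
    return ~total & 0xFFFF                          # invert the 16-bit result

header = bytes.fromhex("45 00 00 34 5F 7C 40 00 40 06 00 00 C0 A8 B2 14 C6 FC CE 19")
csum = ipv4_header_checksum(header)
print(f"{csum:04X}")

# Recipient: run the same function over the header with the checksum filled in.
filled = header[:10] + csum.to_bytes(2, "big") + header[12:]
print(f"{ipv4_header_checksum(filled):04X}")
```

Because the function inverts at the end, an intact header yields 0000 on the receiving side, which is the complement of the 0xFFFF sum computed above.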

Resources