Why doesn't F# Compile Currying into Separate Functions?

So I'm trying to learn F#, and as I learn new things I like to look at the IL to see what's happening under the covers. I recently read about currying, an obvious fundamental of the language.
According to F# for Fun and Profit, when you create the function below:
let addItems x y = x + y
What is really happening is that two single-argument functions are being created:
let addItems x =
    let subFunction y =
        x + y
    subFunction
and when you invoke the function with addItems 5 6, the order of operations is as follows (the snippet below spells these steps out):
addItems is called with argument 5
addItems returns subFunction
subFunction is called with argument 6
subFunction has argument 5 in scope, so it adds and returns the sum of 5 and 6
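You can see the same two steps done explicitly with partial application (addFive and result are names made up for illustration):

let addItems x y = x + y

let addFive = addItems 5    // apply the first argument; get the intermediate function back
let result  = addFive 6     // apply the second argument to that function
printfn "%d" result         // prints 11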
All of this sounds fine on the surface. However, when you look at the IL for this it tells a different story.
.method public static int32 testCurry(int32 x,
int32 y) cil managed
{
.custom instance void [FSharp.Core]Microsoft.FSharp.Core.CompilationArgumentCountsAttribute::.ctor(int32[]) = ( 01 00 02 00 00 00 01 00 00 00 01 00 00 00 00 00 )
// Code size 5 (0x5)
.maxstack 8
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldarg.1
IL_0003: add
IL_0004: ret
} // end of method Sandbox::testCurry
We can clearly see in the IL that a single static method that takes two arguments and returns an int32 is created.
So my question is: why the discrepancy? This isn't the first time I've seen IL that doesn't jibe with the documentation, either...

So my question is, why the discrepancy?
The actual compiled IL doesn't need to, and really shouldn't, matter in terms of the behavioral contract. By compiling to a single static method, a fully applied call gets significantly better optimization at the JIT/runtime level.
The "what is really happening here" description isn't necessarily what is literally happening; it's more "this is how it should be reasoned about when writing and using F# code". The underlying implementation should be free to change as needed in order to make the best use of the runtime environment.

Related

Repetitive regular expression in Lua

I need to find a pattern of 6 pairs of hexadecimal numbers (without 0x), e.g.
"00 5a 4f 23 aa 89"
This pattern works for me, but the question is: is there any way to simplify it?
[%da-f][%da-f]%s[%da-f][%da-f]%s[%da-f][%da-f]%s[%da-f][%da-f]%s[%da-f][%da-f]%s[%da-f][%da-f]
Lua patterns do not support limiting quantifiers or many other features that regular expressions support (hence, Lua patterns are not even regular expressions).
You can build the pattern dynamically since you know how many times you need to repeat a part of a pattern:
local text = '00 5a 4f 23 aa 89'
local answer = text:match('[%da-f][%da-f]'..('%s[%da-f][%da-f]'):rep(5) )
print (answer)
-- => 00 5a 4f 23 aa 89
The '[%da-f][%da-f]'..('%s[%da-f][%da-f]'):rep(5) can be further shortened with %x hex char shorthand:
'%x%x'..('%s%x%x'):rep(5)
Lua supports %x for hexadecimal digits, so you can replace every [%da-f] with %x:
%x%x%s%x%x%s%x%x%s%x%x%s%x%x%s%x%x
Lua doesn't support specific quantifiers {n}. If it did, you could make it quite a lot shorter.
You can also use "one or more" with the plus sign to shorten it up (note that %x+ matches hex runs of any length, so it is a bit looser than matching exactly two digits per group):
print(('Your MAC is: 00 5a 4f 23 aa 89'):match('%x+%s%x+%s%x+%s%x+%s%x+%s%x+'))
-- Tested in Lua 5.1 up to 5.4
It is described under "Pattern Item:" in...
https://www.lua.org/manual/5.4/manual.html#6.4.1
final solution:
local text = '00 5a 4f 23 aa 89'
local pattern = '%x%x'..('%s%x%x'):rep(5)
local answer = text:match(pattern)
print (answer)

Hexadecimal Convention in Memory

A super stupid question:
I have an integer in my code which occupies 4 bytes (of course). In memory this is represented as a pack of four two-digit hexadecimal bytes, for example
int x = 1000
in memory is represented as
e8 03 00 00
where the first hex pair represents the "lowest" byte and the last is the "highest".
What is this representation called? Are there other representations? I just need the name; I'm struggling to find this information online :(
Thanks
The word you are looking for is endianness. The layout in your example, with the least significant byte stored first, is called little-endian; the opposite order is big-endian.
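As a quick sketch (F#/.NET here only because it is handy; any language that exposes raw bytes shows the same thing), you can check what your own machine does with the value 1000 from the question:

let bytes = System.BitConverter.GetBytes(1000)    // int 1000 = 0x000003E8
bytes |> Array.iter (printf "%02x ")              // on a little-endian CPU this prints: e8 03 00 00
printfn ""
printfn "little-endian? %b" System.BitConverter.IsLittleEndian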

how to use seq2seq to decode concatenated string

I am trying to decode concatenated strings like the ones below...
SQCB7A750BATWE SQ CB 7 A 750 B A T WE
PT05A1219PY023 PT 05 A 12 19 P Y 023
PT55A1019PX02 PT 55 A 10 19 P X 02
PT33SE2215SW023 PT 33 SE 22 15 S W 023
PT05A2216PW023(LC) PT 05 A 22 16 P W 023 (LC)
I am looking for a smarter way than hard-coded rules, since the input will have variations (number of characters and digits). I came across the seq2seq model and I want to know if it's possible to use it for such a problem.
I already followed some tutorials to get a taste of it, but the results weren't even close.
It also seems there are two approaches, character level and word level, as per this tutorial.
Character level:
Input sentence: SQCACA333BA71A
Decoded sentence: P 9(PDD366AZ2IDD4K )F)F(L)L)1)1)1) 6A
-
Input sentence: SQCAAC152DA71A
Decoded sentence: P 9(PDD366AZ2IDD4K )F)F(L)L)1)1)1) 6A
I am still trying to implement the word-level approach, but I'd like to know if the problem can be solved using this approach (seq2seq).

What's the representation of a set of char in PASCAL?

They asked me to represent a set of char as a memory map. Which chars are in the set? The teacher told us to use ASCII codes, in a set of 32 bytes.
I have this example, the set {'A', 'B', 'C'}
(The 7 comes from 0111)
= {00 00 00 00 00 00 00 00 70 00
00 00 00 00 00 00 00 00 00 00
00}
Sets in Pascal can be represented in memory with one bit for every element; if the bit is 1, the element is present in the set.
A "set of char" is the set of ASCII chars, where each element has an ordinal value from 0 to 255 (it should be 127 for ASCII, but this set is often extended up to a byte, so there are 256 different characters).
Hence a "set of char" is represented in memory as a block of 32 bytes which contain a total of 256 bits. The character "A" (upper case A) has an ordinal value of 65. The integer division of 65 by 8 (the number of bits a byte can hold) gives 8. So the bit representing "A" in the set resides in the byte number 8. 65 mod 8 gives 1, which is the second bit in that byte.
The byte number 8 will have the second bit ON for the character A (and the third bit for B, and the fourth for C). All the three characters together give the binary representation of 0000.1110 ($0E in hex).
To demonstrate this, I tried the following program with turbo pascal:
var
  ms : set of char;
  p  : array[0..31] of byte absolute ms;
  i  : integer;
begin
  ms := ['A'..'C'];
  for i := 0 to 31 do begin
    if i mod 8 = 0 then writeln;
    write(i, '=', p[i], ' ');
  end;
  writeln;
end.
The program prints the value of all 32 bytes in the set, thanks to the "absolute" keyword. Other versions of pascal can do it using different methods. Running the program gives this result:
0=0 1=0 2=0 3=0 4=0 5=0 6=0 7=0
8=14 9=0 10=0 11=0 12=0 13=0 14=0 15=0
16=0 17=0 18=0 19=0 20=0 21=0 22=0 23=0
24=0 25=0 26=0 27=0 28=0 29=0 30=0 31=0
where you see that the only byte different from 0 is byte number 8, and it contains 14 ($0E in hex, 0000.1110). So, your guess (70) is wrong.
That said, I must add that nobody can state this is always true, because a set in pascal is implementation dependent; so your answer could also be right. The representation used by turbo pascal (on dos/windows) is the most logical one, but this does not exclude other possible representations.

Convert DeDe constant to valid declaration or other interface extraction tool?

I am using DeDe to create an API (Interface) I can compile to. (Strictly legit: while we wait for the vendor to deliver a D2010 version in two months, we can at least get our app compiling...)
We'll stub out all methods.
DeDe emits constant declarations like these:
LTIMGLISTCLASS =
00: ÿÿÿÿ....LEADIMGL|FF FF FF FF 0D 00 00 00 4C 45 41 44 49 4D 47 4C|
10: IST32. |49 53 54 33 32 00|;
DS_PREFIX =
0: ÿÿÿÿ....DICM.|FF FF FF FF 04 00 00 00 44 49 43 4D 00|;
How would I convert these into a compilable statement?
In theory, I don't care about the actual values, since I doubt they're used anywhere, but I'd like to get their size correct. Are these integers, LongInts, or ???
Any other hints on using DeDe would be welcome.
Those are strings. The first four bytes are the reference count, which for string literals is always -1 ($FFFFFFFF). The next four bytes are the character count. Then come the characters and a null terminator.
const
LTIMGLISTCLASS = 'LEADIMGLIST32'; // 13 = $0D characters
DS_PREFIX = 'DICM'; // 4 = $04 characters
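Just to make that layout concrete, here is a small sketch that picks the fields apart from the raw bytes DeDe shows for LTIMGLISTCLASS (F#/.NET is used here purely as a convenient way to poke at bytes; any language that can read raw bytes works):

let dump : byte[] =
    [| 0xFFuy; 0xFFuy; 0xFFuy; 0xFFuy                                   // reference count: -1 (string literal)
       0x0Duy; 0x00uy; 0x00uy; 0x00uy                                   // character count: 13 ($0D), little-endian
       0x4Cuy; 0x45uy; 0x41uy; 0x44uy; 0x49uy; 0x4Duy; 0x47uy; 0x4Cuy   // "LEADIMGL"
       0x49uy; 0x53uy; 0x54uy; 0x33uy; 0x32uy                           // "IST32"
       0x00uy |]                                                        // null terminator

let refCount = System.BitConverter.ToInt32(dump, 0)                     // -1
let length   = System.BitConverter.ToInt32(dump, 4)                     // 13
let value    = System.Text.Encoding.ASCII.GetString(dump, 8, length)    // "LEADIMGLIST32"
printfn "refcount=%d length=%d value=%s" refCount length value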
You don't have to "doubt" whether those constants are used anywhere. You can confirm it empirically. Compile your project without those constants. If it compiles, then they're not used.
If your project doesn't compile, then those constants must be used somewhere in your code. Based on the context, you can provide your own declarations. If the constant is used like a string, then declare a string; if it's used like an integer, then declare an integer.
Another option is to load your project in a version of Delphi that's compatible with the DCUs you have. Use code completion to make the IDE display the constant and its type.
