Convert email attachment to base64 - f#

I want to take an attachment from an email and convert it to a base64 string so that I can store it as JSON.
In C#, I would get the attachment as a System.IO.Stream, read it into a byte array, and then use Convert.ToBase64String.
In F# though, I'm not sure how to do this (I'm a beginner), and it feels like there is probably a much more functional way of doing things...?

F# combines functional style with object-oriented style, so that you can easily call .NET libraries from F#. Sometimes there are F#-specific libraries that give you a more functional style for some tasks (like list processing), but I don't think there is anything like that for base64 encoding and streams.
So, given a stream, you can read it into a buffer and then convert to base64 using .NET types as follows:
open System
open System.IO
let stream = new MemoryStream([| 1uy; 2uy; 3uy; 4uy |]) // Some stream, for example
let buffer : byte[] = Array.zeroCreate (int stream.Length)
stream.Read(buffer, 0, buffer.Length) |> ignore // Read returns the number of bytes read
Convert.ToBase64String(buffer)
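For comparison, here is a minimal sketch of the same read-then-encode steps in Python, used only as a compact illustration. The helper name `stream_to_base64` and the JSON field name `attachment` are assumptions made for this sketch, not something from the question.

```python
import base64
import io
import json

def stream_to_base64(stream):
    """Read all remaining bytes from a binary stream and base64-encode them."""
    return base64.b64encode(stream.read()).decode("ascii")

# Same four sample bytes as the MemoryStream example.
encoded = stream_to_base64(io.BytesIO(bytes([1, 2, 3, 4])))

# "attachment" is a hypothetical field name chosen for this sketch.
document = json.dumps({"attachment": encoded})

# Parsing the JSON and decoding the base64 text gives the original bytes back.
restored = base64.b64decode(json.loads(document)["attachment"])
```

Because base64 output is plain ASCII, it can be stored in a JSON string without further escaping.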

Related

lines=True parameter for the Json Type Provider and Json.Net library?

I am working on this Kaggle competition. The Jupyter notebooks on Kaggle only support R and Python and I wanted to use F# locally. The problem is that the datasets are .json files and both the F# Json Type Provider and Newtonsoft libraries fail when trying to parse the files.
Here are examples of the code failing in F#:
open FSharp.Data
type Context = JsonProvider<"train.json">
let context = Context.
and
open System
open System.IO
open Newtonsoft.Json
open Newtonsoft.Json.Linq
let object = JObject.Parse(File.ReadAllText("train.json"));
object
This Python example uses these lines of code to parse them correctly:
train = pd.read_json('../input/stanford-covid-vaccine/train.json', lines=True)
test = pd.read_json('../input/stanford-covid-vaccine/test.json', lines=True)
In the notebook, the author says that without the "lines=True" parameter, the read_json method fails with this trailing error.
My question: assuming this is the same error, is there a way to apply that same kind of "lines=true" to the .NET libraries to parse the json?
I've seen a few datasets where the format was one valid JSON record per line:
{"event":"nothing 1"}
{"event":"nothing 2"}
{"event":"nothing 3"}
This is not valid JSON overall. I think you can either parse it line-by-line or you can turn it into valid JSON. For line-by-line parsing (which may be more efficient as you can do this in a streaming fashion), I would use:
open System.IO
open FSharp.Data
type Log = JsonProvider<"""{"event":"nothing 1"}""">
// File.ReadLines yields lines lazily, so the file is processed in a streaming fashion
for line in File.ReadLines("some.json") do
    let l = Log.Parse(line)
    printfn "%s" l.Event
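The two options mentioned above (parsing line by line, or turning the file into valid JSON) can be sketched in Python roughly as follows. The function names are made up for this illustration, and the second approach assumes every non-empty line is a complete JSON record.

```python
import json

def parse_json_lines(text):
    """Parse one JSON record per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def json_lines_to_array(text):
    """Join the records with commas and wrap them in [ ] to form one valid JSON document."""
    records = [line for line in text.splitlines() if line.strip()]
    return "[" + ",".join(records) + "]"

sample = '{"event":"nothing 1"}\n{"event":"nothing 2"}\n{"event":"nothing 3"}\n'

# Option 1: line-by-line parsing.
events = [record["event"] for record in parse_json_lines(sample)]

# Option 2: convert to a JSON array first, then parse it in one go.
as_array = json.loads(json_lines_to_array(sample))
```

The converted text from the second option could then be handed to any parser that expects a single JSON document.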

How to encode a STRING variable into a given code page

I've got a string variable containing a text that I need to encode and write to a file, in UTF-16LE code page.
Currently the following code generates a UTF-8 file and I don't see any option in the statement OPEN DATASET to generate the file in UTF-16LE.
REPORT zmyprogram.
DATA(filename) = `/tmp/myfile`.
OPEN DATASET filename IN TEXT MODE ENCODING DEFAULT FOR OUTPUT.
TRANSFER 'HELLO WORLD' TO filename.
CLOSE DATASET filename.
I guess one solution is to first encode the string in memory, then write the encoded bytes to the file.
Generally speaking, how to encode a string of characters into a given code page, in memory?
In the first part, I explain how to encode a string of characters into a given code page (all is done in memory), and in the second part, I explain specifically how to write files to the application server in a given code page.
General way (all in memory)
If a string of characters (type STRING) has to be encoded, the result has to be stored in a string of bytes, which corresponds to the built-in data type XSTRING.
There are several possibilities which depend on the ABAP version:
Since 7.53, use the class CL_ABAP_CONV_CODEPAGE:
DATA(xstring) = cl_abap_conv_codepage=>create_out( codepage = `UTF-16LE` )->convert( source = `ABCDE` ).
Since 7.02, use the class CL_ABAP_CODEPAGE:
DATA xstring TYPE xstring.
xstring = cl_abap_codepage=>convert_to( source = `ABCDE` codepage = `UTF-16LE` ).
Before 7.02, use the class CL_ABAP_CONV_OUT_CE (documentation provided with the class):
First, instantiate the conversion object, using an SAP code page number instead of the ISO name (list of values shown below):
DATA: conv TYPE REF TO CL_ABAP_CONV_OUT_CE, xstring TYPE xstring.
conv = CL_ABAP_CONV_OUT_CE=>CREATE( encoding = '4103' ). "4103 = utf-16le
Then encode the string and retrieve the bytes encoded:
conv->RESET( ).
conv->WRITE( data = `ABCDE` ).
xstring = conv->GET_BUFFER( ).
Alternatively, instead of using RESET, WRITE and GET_BUFFER, you can call the method CONVERT, which was added in 6.40 and backported:
conv->CONVERT( EXPORTING data = `ABCDE` IMPORTING buffer = xstring ).
With the class CL_ABAP_CONV_OUT_CE, you need to use the number of the SAP Code Page, not the ISO name. Here are the most common SAP code pages and their equivalent ISO names:
1100: ISO-8859-1
1101: US-ASCII
1160: Windows-1252 ("ANSI")
1401: ISO-8859-2
4102: UTF-16BE
4103: UTF-16LE
4104: UTF-32BE
4105: UTF-32LE
4110: UTF-8
Etc. (the possible values are defined in the table TCP00A, in lines with column CPATTRKIND = 'H').
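To see what the resulting byte string should look like, here is a small Python sketch that encodes the same `ABCDE` sample. The SAP code page numbers are taken from the list above; the Python codec names and the helper `encode` are assumptions of this sketch, not anything provided by ABAP.

```python
# SAP code page number -> Python codec name. The numbers come from the list
# above; the codec names are this sketch's own choice of equivalents.
SAP_TO_PYTHON = {
    "1100": "iso-8859-1",
    "1401": "iso-8859-2",
    "4102": "utf-16-be",
    "4103": "utf-16-le",
    "4110": "utf-8",
}

def encode(text, sap_code_page):
    """Encode text into the bytes of the given SAP code page."""
    return text.encode(SAP_TO_PYTHON[sap_code_page])

# Hexadecimal view of the encoded bytes, similar to how an XSTRING is displayed.
utf16le_hex = encode("ABCDE", "4103").hex().upper()
utf16be_hex = encode("ABCDE", "4102").hex().upper()
utf8_hex = encode("ABCDE", "4110").hex().upper()
```

In UTF-16LE each of these characters takes two bytes with the low byte first, so `A` becomes `4100`, whereas UTF-16BE gives `0041` and UTF-8 gives `41`.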
 
Writing a file on the application server in a given code page
In ABAP, OPEN DATASET can specify the target code page directly. Most code pages are supported, including UTF-8, but not the other UTF encodings (code pages 41xx); those can only be written with the solution explained in 2.3 below (by first encoding in memory).
2.1) IN TEXT MODE ENCODING ...
Possible ENCODING values:
UTF-8: in this mode, it's possible to add the Byte Order Mark if needed, via the option WITH BYTE-ORDER MARK.
DEFAULT: will be UTF-8 in a SAP "Unicode" system (that you can check via the menu System > Status > Unicode System Yes/No), NON-UNICODE otherwise.
NON-UNICODE: will depend on the current ABAP linguistic environment; for language English, it's the character encoding iso-8859-1, for language Polish, it's the character encoding iso-8859-2, etc. (the equivalences are shown in table TCP0C.)
Example in ABAP version 7.52 to write to UTF-8 with the byte order mark:
REPORT zmyprogram.
DATA(filename) = `/tmp/dataset_utf_8`.
OPEN DATASET filename IN TEXT MODE ENCODING UTF-8 WITH BYTE-ORDER MARK FOR OUTPUT.
TRY.
TRANSFER `Witaj świecie` TO filename.
CATCH cx_sy_conversion_codepage INTO DATA(lx).
" Character not supported in language code page
ENDTRY.
CLOSE DATASET filename.
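As a rough cross-check of what a UTF-8 file with a byte order mark should contain, the following Python sketch encodes the same text with and without the mark. It uses Python's `utf-8-sig` codec as a stand-in and makes no claim about how the ABAP runtime produces the bytes.

```python
import codecs

text = "Witaj świecie"

# "utf-8-sig" prepends the UTF-8 byte order mark; plain "utf-8" does not.
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")

# The mark is the three bytes EF BB BF at the very start of the data.
bom = with_bom[:3]
```

Opening the file written by the ABAP program in a hex viewer and comparing its first bytes against `bom` is one way to confirm that the mark was written.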
Example in ABAP version 7.52 to write to iso-8859-2 (Polish language here):
REPORT zmyprogram.
SET LOCALE LANGUAGE 'L'. " Polish
DATA(filename) = `/tmp/dataset_nonunicode_pl`.
OPEN DATASET filename IN TEXT MODE ENCODING NON-UNICODE FOR OUTPUT.
TRY.
TRANSFER `Witaj świecie` TO filename.
CATCH cx_sy_conversion_codepage INTO DATA(lx).
" Character not supported in language code page
ENDTRY.
CLOSE DATASET filename.
2.2) IN LEGACY TEXT MODE CODE PAGE ...
Use any code page number except code pages 41xx (i.e. UTF-8 and other UTF; see workaround in 2.3 below).
Example in ABAP version 7.52 to write to iso-8859-2 (code page 1401) :
REPORT zmyprogram.
DATA(filename) = `/tmp/dataset_iso_8859_2`.
OPEN DATASET filename IN LEGACY TEXT MODE CODE PAGE '1401' FOR OUTPUT. " iso-8859-2
TRY.
TRANSFER `Witaj świecie` TO filename.
CATCH cx_sy_conversion_codepage INTO DATA(lx).
" Character not supported in language code page
ENDTRY.
CLOSE DATASET filename.
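The reason the examples catch a conversion exception can be illustrated with a short Python sketch: the Polish character `ś` exists in iso-8859-2 but not in iso-8859-1. Python's `UnicodeEncodeError` is used here only as an analogy for the ABAP exception; the variable names are illustrative.

```python
text = "Witaj świecie"

# iso-8859-2 contains "ś", so the text encodes to one byte per character.
latin2 = text.encode("iso-8859-2")

# iso-8859-1 has no "ś", so the conversion fails instead of writing wrong data.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```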
2.3) UTF = general way + IN BINARY MODE
Example in ABAP version 7.52:
REPORT zmyprogram.
TRY.
DATA(xstring) = cl_abap_codepage=>convert_to( source = `Witaj świecie` codepage = `UTF-16LE` ).
CATCH cx_sy_conversion_codepage INTO DATA(lx).
" Character not supported in language code page
BREAK-POINT.
ENDTRY.
DATA(filename) = `/tmp/dataset_utf_16le`.
OPEN DATASET filename IN BINARY MODE FOR OUTPUT.
TRANSFER xstring TO filename.
CLOSE DATASET filename.
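The same two-step idea (encode in memory, then write the bytes unchanged in binary mode) looks like this as a Python sketch. A temporary file stands in for the `/tmp/dataset_utf_16le` path, and the variable names are only illustrative.

```python
import os
import tempfile

text = "Witaj świecie"

# Step 1: encode in memory. "utf-16-le" does not add a byte order mark.
encoded = text.encode("utf-16-le")

# Step 2: write the bytes as-is in binary mode, then read them back to check.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(encoded)
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

decoded = raw.decode("utf-16-le")
```

Binary mode matters here because any text-mode conversion applied on top of already encoded bytes would alter them.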

How to read .docx file using F#

How can I read a .docx file using F#? If I use
System.IO.File.ReadAllText("D:/test.docx")
it returns garbage output with beep sounds.
Here is an F# snippet that may give you a jump-start. It successfully extracts all text contents of a Word2010-created .docx file as a string of concatenated lines:
open System
open System.IO
open System.IO.Packaging
open System.Xml
let getDocxContent (path: string) =
    use package = Package.Open(path, FileMode.Open)
    let stream = package.GetPart(new Uri("/word/document.xml", UriKind.Relative)).GetStream()
    stream.Seek(0L, SeekOrigin.Begin) |> ignore
    let xmlDoc = new XmlDocument()
    xmlDoc.Load(stream)
    xmlDoc.DocumentElement.InnerText
printfn "%s" (getDocxContent @"..\..\test.docx")
To make it work, do not forget to reference WindowsBase.dll in your VS project.
.docx files follow Open Packaging Convention specifications. At the lowest level, they are .ZIP files. To read it programmatically, see example here:
A New Standard For Packaging Your Data
Packages and Parts
Using F#, it's the same story, you'll have to use classes in the System.IO.Packaging Namespace.
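To illustrate the point that a .docx file is a ZIP archive at the lowest level, here is a minimal Python sketch that opens the archive and pulls the text out of the `word/document.xml` part. The function name `docx_text` is made up for this example, and the sketch ignores everything except the raw text nodes.

```python
import zipfile
import xml.etree.ElementTree as ET

def docx_text(path):
    """Return the concatenated text of word/document.xml inside a .docx file."""
    # A .docx package can be opened like any other ZIP archive.
    with zipfile.ZipFile(path) as package:
        with package.open("word/document.xml") as part:
            root = ET.parse(part).getroot()
    # Joining all text nodes is roughly what InnerText does on an XmlDocument.
    return "".join(root.itertext())
```

Paragraph breaks are not preserved by this approach; keeping them would require walking the paragraph elements individually.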
System.IO.File.ReadAllText has type of string -> string.
Because a .docx file is a binary file, some of its bytes are probably being interpreted as the bell character, which explains the beeps. Rather than ReadAllText, look into Word automation, the Packaging API, or the OpenXML APIs.
Try using the OpenXML SDK from Microsoft.
Also on the linked page is the Microsoft tool that you can use to decompile the office 2007 files. The decompiled code can be quite lengthy even for simple documents though so be warned. There is a big learning curve associated with OpenXML SDK. I'm finding it quite difficult to use.

Backslash read and write and F# interactive console

Edit: what's the difference between reading a backslash from a file and writing it to the interactive window vs. writing the string directly to the interactive window?
For example
let toto = "Adelaide Gu\u00e9nard"
toto;;
the interactive window prints "Adelaide Guénard".
Now suppose I save a txt file with the single line Adelaide Gu\u00e9nard and read it in:
System.IO.File.ReadAllLines(@"test.txt")
The interactive window prints [|"Adelaide Gu\u00e9nard"|]
What is the difference between these 2 statements in terms of the interactive window printing ?
As far as I know, there is no library that would decode the F#/C# escaping of string for you, so you'll have to implement that functionality yourself. There was a similar question on how to do that in C# with a solution using regular expressions.
You can rewrite that to F# like this:
open System
open System.Globalization
open System.Text.RegularExpressions
let regex = new Regex (@"\\[uU]([0-9A-F]{4})", RegexOptions.IgnoreCase)
let line = "Adelaide Gu\\u00e9nard"
let line = regex.Replace(line, fun (m:Match) ->
    (char (Int32.Parse(m.Groups.[1].Value, NumberStyles.HexNumber))).ToString())
(If you write "some\\u00e9etc", you create a string that contains the same thing you would read from the text file; if you use a single backslash, the F# compiler interprets the escape.)
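The same regular-expression approach can be sketched in Python to make the difference between the two strings visible. The function name `decode_unicode_escapes` is an assumption of this sketch; it only handles four-digit `\u` escapes, like the pattern above.

```python
import re

# A backslash, then u or U, then exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\[uU]([0-9A-Fa-f]{4})")

def decode_unicode_escapes(text):
    """Replace each literal \\uXXXX sequence with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

# What the text file contains: a real backslash followed by "u00e9".
from_file = "Adelaide Gu\\u00e9nard"

# After decoding, the six-character escape has become the single character é.
decoded = decode_unicode_escapes(from_file)
```

The string read from the file is five characters longer than the decoded one, which is the difference the interactive window is showing.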
F# Interactive uses the StructuredFormat stuff from the F# PowerPack to print values. For your string, it's effectively doing printfn toto;;.
You can achieve the same behaviour in a text file as follows:
open System.IO;;
File.WriteAllText("toto.txt", toto);;
The default encoding used by File.WriteAllText is UTF-8. You should be able to open toto.txt in Notepad or Visual Studio and see the é correctly.
Edit: If I wanted to write the content of test.txt to another file in the clean form that F# interactive prints, how would I proceed?
It looks like fsi is being too clever when printing the contents of test.txt. It's formatting it as a valid F# expression, complete with quotes, [| |] brackets, and a Unicode character escape. The string returned by File.ReadAllLines doesn't contain any of these things; it just contains the words Adelaide Guénard.
You should be able to take the array returned by File.ReadAllLines and pass it to File.WriteAllLines, without the contents being mangled.

delphi blowfish mode ecb (python converter to delphi)

I know it is rare for a programmer to ask this, but I really need it and cannot manage it myself: could someone convert this small Python cryptography function to Delphi?
Function:
from Crypto.Cipher import Blowfish

# Renamed from Blowfish so the class does not shadow the imported module
class BlowfishCipher(object):
    cipher = None

    def __init__(self, key, mode=Blowfish.MODE_ECB):
        self.cipher = Blowfish.new(key, mode)

    def encrypt(self, texto):
        encriptar = self.cipher.encrypt(texto)
        return encriptar
One example:
key = 123key
text = hi man
result = ìûÕ]–•¢
I have tried many times to do this in Delphi and it always gives me different results, so I thought it better to ask someone who understands both Python and Delphi.
Thanks so much!
Regarding the comment on DCPcrypt: maybe your Python library returns the raw encrypted bytes, while DCPcrypt (or another Delphi library like Turbo Lockbox) gives you the result encoded in something like UU64 or MIME (this is done to make the result easy to transfer or store).
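If the difference really is only the output encoding, the two results can be compared by bringing them to the same representation first. The Python sketch below assumes the Delphi side produces base64 text, which is only a guess here; the sample bytes are hypothetical and are not the output of any real Blowfish run.

```python
import base64

def same_ciphertext(raw_bytes, encoded_text):
    """Compare raw bytes with a base64 string by decoding the string first."""
    return base64.b64decode(encoded_text) == raw_bytes

# Hypothetical 8-byte ciphertext block, chosen only for illustration.
raw = bytes([0x10, 0x9F, 0xD5, 0x5D, 0x00, 0x7A, 0xA2, 0xFF])

# The same bytes as they would look after base64 or hexadecimal encoding.
as_base64 = base64.b64encode(raw).decode("ascii")
as_hex = raw.hex()
```

Printing raw cipher bytes as text (as in the example result in the question) depends on the console code page, so comparing hexadecimal or base64 forms is usually less misleading.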
If you just want to implement Blowfish algorithm in Delphi, try DCPcrypt.
@Mili, you can't translate this code to Delphi directly because there is no RTL library (or function) in Delphi with Blowfish support, so you need to use a third-party component for this. I recommend the
Delphi Encryption Compendium Part I v.5.2. You can try out this link for more components.
You can also try TurboPower LockBox 3.1.0 at http://lockbox.seanbdurkin.id.au/ .
This library also implements Blowfish.

Resources