Putting/Getting compressed data in SQLite with F#

I am attempting to port an existing project of mine (a web scraper) from Python to F# in order to learn F#. One component of the program compresses large strings (raw HTML) using LZMA and stores them in SQLite in a makeshift key-value table. The HTML string should always be Unicode.
Because I am an F# beginner and this requires a lot of .NET interop, I am unsure how to accomplish this.
I would like to know how to do this properly in F#, ideally using LZMA rather than GZip.
Edit
I had difficulty finding an LZMA2-compatible .NET library, as the LZMA SDK uses LZMA1, which would not have been compatible with my existing data compressed using LZMA2. Therefore, with help from the comments, I went ahead and implemented this using Gzip.

This uses Gzip for compression and is compatible with the gzip.compress/gzip.decompress functions in Python 3.5.
#if INTERACTIVE
#r "../packages/System.Data.SQLite.Core/lib/net46/System.Data.SQLite.dll"
#endif
open System.IO
open System.IO.Compression
open System.Data.SQLite

// Encode the string as UTF-8 and gzip it
let compressString (s:string) =
    let bs = System.Text.Encoding.UTF8.GetBytes(s)
    use outStream = new MemoryStream()
    use gzOutStream = new GZipStream(outStream, CompressionMode.Compress, false)
    gzOutStream.Write(bs, 0, bs.Length)
    // Close the GZipStream so the final block and trailer are written before the buffer is read
    gzOutStream.Close()
    outStream.ToArray()

// Gunzip the bytes and decode them as UTF-8 (the StreamReader default)
let decompressString (bs:byte[]) =
    use newInStream = new MemoryStream(bs)
    use gzInStream = new GZipStream(newInStream, CompressionMode.Decompress, false)
    use sr = new StreamReader(gzInStream)
    sr.ReadToEnd()

// Compress the value and store it under the key; returns the number of rows affected
let insert dbc (key:string) (value:string) =
    let compressed = compressString value
    use cmd = new SQLiteCommand("INSERT into kvt (key, value) VALUES (@key, @value)", dbc)
    cmd.Parameters.Add(new SQLiteParameter("@key", key)) |> ignore
    cmd.Parameters.Add(new SQLiteParameter("@value", compressed)) |> ignore
    cmd.ExecuteNonQuery()

// Read the BLOB stored under the key and decompress it
let fetch dbc (key:string) =
    use cmd = new SQLiteCommand("SELECT value FROM kvt WHERE key = @key", dbc)
    cmd.Parameters.Add(new SQLiteParameter("@key", key)) |> ignore
    use reader = cmd.ExecuteReader()
    reader.Read() |> ignore
    let compressed = unbox<byte[]> reader.["value"]
    decompressString compressed

// Create the database file and the key-value table; returns the open connection
let create() =
    System.Data.SQLite.SQLiteConnection.CreateFile("mydb.sqlite")
    let dbc = new SQLiteConnection("Data Source=mydb.sqlite;Version=3;")
    dbc.Open()
    use cmd = new SQLiteCommand("CREATE TABLE kvt (key TEXT PRIMARY KEY, value BLOB)", dbc)
    cmd.ExecuteNonQuery() |> ignore
    dbc
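One pitfall worth illustrating: GZipStream only writes its final block and trailer when it is closed, so reading the backing MemoryStream before that point can return truncated data. The following is a minimal sketch of the compression round trip using only System.IO.Compression. The names gzipString and gunzipString are illustrative and not part of the code above.

```fsharp
open System.IO
open System.IO.Compression
open System.Text

// Compress a string to gzip bytes. `using` disposes the GZipStream
// (writing the final block and trailer) before the buffer is read.
let gzipString (s: string) : byte[] =
    let bytes = Encoding.UTF8.GetBytes s
    use buffer = new MemoryStream()
    using (new GZipStream(buffer, CompressionMode.Compress, true)) (fun gz ->
        gz.Write(bytes, 0, bytes.Length))
    buffer.ToArray()

// Decompress gzip bytes back into a string, decoding as UTF-8.
let gunzipString (data: byte[]) : string =
    use input = new MemoryStream(data)
    use gz = new GZipStream(input, CompressionMode.Decompress)
    use reader = new StreamReader(gz, Encoding.UTF8)
    reader.ReadToEnd()

// Round trip: the decompressed text should equal the original.
let original = "<html><body>héllo</body></html>"
let roundTripped = gunzipString (gzipString original)
```

The byte array returned by gzipString is what would be bound to the BLOB parameter when inserting into the key-value table.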

Related

Populate list with Types

I'm trying to populate a list with my own type.
let getUsers =
    use connection = openConnection()
    let getString = "select * from Accounts"
    use sqlCommand = new SqlCommand(getString, connection)
    try
        let usersList = [||]
        use reader = sqlCommand.ExecuteReader()
        while reader.Read() do
            let floresID = reader.GetString 0
            let exName = reader.GetString 1
            let exPass = reader.GetString 2
            let user = [floresID=floresID; exName=exName; exPass=exPass]
            // what here?
            ()
    with
    | :? SqlException as e -> printfn "A connection-level error occurred:\n %s" e.Message
    | _ -> printfn "Unknown exception."
In C# I would just add a new object to usersList. How can I add a new user to the list? Or is there a better approach to getting a list of data from the database?
The easiest way to do this is with a type provider, so you can abstract away the database. You can use SqlDataConnection for SQL Server, SQLProvider for everything (including SQL Server), or SqlClient for SQL Server.
Here is an example using the PostgreSQL dvdrental sample database with SQLProvider:
#r @"..\packages\SQLProvider.1.0.33\lib\FSharp.Data.SqlProvider.dll"
#r @"..\packages\Npgsql.3.1.8\lib\net451\Npgsql.dll"
open System
open FSharp.Data.Sql
open Npgsql
open NpgsqlTypes
open System.Linq
open System.Xml
open System.IO
open System.Data
let [<Literal>] dbVendor = Common.DatabaseProviderTypes.POSTGRESQL
let [<Literal>] connString1 = @"Server=localhost;Database=dvdrental;User Id=postgres;Password=root"
let [<Literal>] resPath = @"C:\Users\userName\Documents\Visual Studio 2015\Projects\Postgre2\packages\Npgsql.3.1.8\lib\net451"
let [<Literal>] indivAmount = 1000
let [<Literal>] useOptTypes = true
//create the type for the database, based on the connection string, etc. parameters
type sql = SqlDataProvider<dbVendor,connString1,"",resPath,indivAmount,useOptTypes>
//set up the datacontext, ideally you would use `use` here :-)
let ctx = sql.GetDataContext()
let actorTbl = ctx.Public.Actor //alias the table
//set up the type, in this case Records:
type ActorName = {
    firstName:string
    lastName:string}
//extract the data with a query expression, this gives you type safety and intellisense over SQL (but also see the SqlClient type provider above):
let qry = query {
    for row in actorTbl do
    select ({firstName=row.FirstName;lastName=row.LastName})
    }
//seq is lazy so do all kinds of transformations if necessary then manifest it into a list or array:
qry |> Seq.toArray
The two important parts are defining the ActorName record and then, in the query, extracting the fields into a sequence of ActorName records. You can then manifest it into a list or array if necessary.
But you can also stick to your original solution. In that case, just wrap the .Read() loop in a seq.
First define the type:
type User = {
    floresID: string
    exName: string
    exPass: string
}
Then extract the data:
let recs = cmd.ExecuteReader() // execute the SQL Command
//extract the users into a sequence of records:
let users =
    seq {
        while recs.Read() do
            yield {floresID=recs.[0].ToString()
                   exName=recs.[1].ToString()
                   exPass=recs.[2].ToString()
                  }
    } |> Seq.toArray
Taking your code, you can use a list expression:
let getUsers =
    use connection = openConnection()
    let getString = "select * from Accounts"
    use sqlCommand = new SqlCommand(getString, connection)
    try
        [
            use reader = sqlCommand.ExecuteReader()
            while reader.Read() do
                let floresID = reader.GetString 0
                let exName = reader.GetString 1
                let exPass = reader.GetString 2
                // assumes a record type with these three string fields is in scope
                let user = {floresID=floresID; exName=exName; exPass=exPass}
                yield user
        ]
    with
    | :? SqlException as e -> failwithf "A connection-level error occurred:\n %s" e.Message
    | _ -> failwithf "Unknown exception."
That being said, I'd use the FSharp.Data.SqlClient library, so all of that boilerplate becomes a single line with the added benefit of type safety (if you change the query, the code will produce compile-time errors that are obvious to fix).
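The pattern of wrapping reader.Read() in a list expression can be tried without a database. The sketch below uses an in-memory DataTable as a hypothetical stand-in for the Accounts table; readUsers and the sample rows are assumptions made for illustration, while the record fields follow the User type from the answers above.

```fsharp
open System.Data

// Record type mirroring the three columns read in the answers above.
type User = { floresID: string; exName: string; exPass: string }

// Read every remaining row from any IDataReader into a list of records.
let readUsers (reader: IDataReader) : User list =
    [ while reader.Read() do
        yield
            { floresID = reader.GetString 0
              exName = reader.GetString 1
              exPass = reader.GetString 2 } ]

// Hypothetical in-memory data standing in for the Accounts table.
let table = new DataTable("Accounts")
for name in [ "floresID"; "exName"; "exPass" ] do
    table.Columns.Add(name, typeof<string>) |> ignore
table.Rows.Add([| box "f1"; box "alice"; box "secret1" |]) |> ignore
table.Rows.Add([| box "f2"; box "bob"; box "secret2" |]) |> ignore

// The reader is disposed as soon as the list has been built.
let users =
    use reader = table.CreateDataReader()
    readUsers reader
```

Because the list is fully built before the reader is disposed, the result can be used safely after the connection is closed, which is not true of a lazy seq that is enumerated later.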

CSV Type Provider cannot find column in F# interactive

So let's say I have a CSV file with a header containing columns Population and Profit, and I'd like to work with it in F# interactive. I have the following code:
#r "../packages/FSharp.Data.1.1.10/lib/net40/FSharp.Data.dll"
open FSharp.Data
// load csv header
let cities = new CsvProvider<"cities.csv">()
// how to reach data
let firstRow = cities.Data |> Seq.head
let firstPopulation = firstRow.Population
let firstProfit = firstRow.Profit
I get an error from F# interactive:
error FS0039: The field, constructor or member 'Population' is not defined
This seems confusing to me, because intellisense in VS has no problem picking up this column from my data via a CSV type provider.
Also, I tried creating a program with the same type provider and it all works just fine. Like this:
open FSharp.Data
[<EntryPoint>]
let main argv =
    use file = System.IO.File.CreateText("result.txt")
    let csv = new CsvProvider<"cities.csv">()
    for record in csv.Data do
        fprintfn file "%A" record.Population
    0
Am I missing something? Thanks for any answer.
Try this code
type Cities = CsvProvider<"cities.csv">
let cities = new Cities()
let firstRow = cities.Rows |> Seq.head

Converting string to UTF8Type in FluentCassandra

I am working with FluentCassandra in F# and attempting to convert a string to a UTF8Type in order to use the ExecuteNonQuery method. Has anyone been successful doing this?
Thanks,
Tom
Thank you Jack P. and Daniel for pointing me in the right direction.
To provide more examples so others can benefit, I am writing a wrapper on top of FluentCassandra in F# to make CRUD functionality much simpler by utilizing the succinctness of F#. I am using Nick Berardi's code as an example for this wrapper:
https://github.com/fluentcassandra/fluentcassandra/blob/master/test/FluentCassandra.Sandbox/Program.cs
For example, calling KeyspaceExists(keyspaceName) checks whether a keyspace exists, CreateKeyspace(keyspaceName) creates one, and so on. An example of the library I am creating is here:
namespace Test
open System
open System.Collections.Generic
open System.Configuration
open System.Linq
open System.Text
open System.Windows
open FluentCassandra
open FluentCassandra.Connections
open FluentCassandra.Types
open FluentCassandra.Linq
module Cassandra =
    let GetAppSettings (key : string) = ConfigurationManager.AppSettings.Item(key)
    let KeyspaceExists keyspaceName =
        let server = new Server(GetAppSettings("Server"))
        let db = new CassandraContext(keyspaceName, server)
        let keyspaceNameExists = db.KeyspaceExists(keyspaceName)
        db.Dispose()
        keyspaceNameExists
    let CreateKeyspace keyspaceName =
        let server = new Server(GetAppSettings("Server"))
        let db = new CassandraContext(keyspaceName, server)
        let schema = new CassandraKeyspaceSchema(Name=keyspaceName)
        let keyspace = new CassandraKeyspace(schema,db)
        if KeyspaceExists(keyspaceName)=false then keyspace.TryCreateSelf()
        db.Dispose()
    let DropKeyspace (keyspaceName : string ) =
        let server = new Server(GetAppSettings("Server"))
        let db = new CassandraContext(keyspaceName, server)
        match db.KeyspaceExists(keyspaceName)=true with
        // value has "ignore" to ignore the string returned from FluentCassandra
        | true -> db.DropKeyspace(keyspaceName) |> ignore
        | _ -> ()
        db.Dispose()
    let ColumnFamilyExists (keyspaceName, columnFamilyName : string) =
        let server = new Server(GetAppSettings("Server"))
        let db = new CassandraContext(keyspaceName, server)
        let schema = new CassandraKeyspaceSchema(Name=keyspaceName)
        let keyspace = new CassandraKeyspace(schema,db)
        let columnFamilyNameExists = db.ColumnFamilyExists(columnFamilyName)
        db.Dispose()
        columnFamilyNameExists
    let CreateColumnFamily (keyspaceName, columnFamilyName: string) =
        if ColumnFamilyExists(keyspaceName,columnFamilyName)=false then
            let server = new Server(GetAppSettings("Server"))
            let db = new CassandraContext(keyspaceName, server)
            let schema = new CassandraKeyspaceSchema(Name=keyspaceName)
            let keyspace = new CassandraKeyspace(schema,db)
            if ColumnFamilyExists(keyspaceName,columnFamilyName)=false then
                keyspace.TryCreateColumnFamily(new CassandraColumnFamilySchema(FamilyName = columnFamilyName, KeyValueType = CassandraType.AsciiType, ColumnNameType = CassandraType.IntegerType, DefaultColumnValueType = CassandraType.UTF8Type))
    let ExecuteNonQuery(keyspaceName, query: string) =
        let server = new Server(GetAppSettings("Server"))
        let db = new CassandraContext(keyspaceName, server)
        let schema = new CassandraKeyspaceSchema(Name=keyspaceName)
        let keyspace = new CassandraKeyspace(schema,db)
        let queryUTF8 = FluentCassandra.Types.UTF8Type.op_Implicit query
        try
            db.ExecuteNonQuery(queryUTF8)
            true
        with
        | _ -> false
This library allows for very easy one line commands to utilize the FluentCassandra functionality. Of course this is just the start and I plan on amending the above library further.
open System
open System.Linq
open System.Collections.Generic
open System.Configuration
open FluentCassandra.Connections
open FluentCassandra.Types
open FluentCassandra.Linq
open Test.Cassandra
[<EntryPoint>]
let main argv =
    CreateKeyspace("test1")
    printfn "%s" (ColumnFamilyExists("test1", "table1").ToString())
    printfn "%s" (KeyspaceExists("test1").ToString())
    CreateColumnFamily("test1","table1")
    printfn "%s" (ColumnFamilyExists("test1", "table1").ToString())
    let result = ExecuteNonQuery("test1", "CREATE TABLE table2 (id bigint primary key, name varchar)")
    printfn "%s" (result.ToString())
    let wait = System.Console.ReadLine()
    0
Specifically with converting the query string to a UTF8Type, Daniel's approach of utilizing UTF8Type.op_Implicit str worked. You can see how I applied it in the ExecuteNonQuery function above. Thanks again for your help!
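For readers unfamiliar with op_Implicit: a C# implicit conversion operator is compiled to a static method named op_Implicit, which F# code can call by name. The sketch below shows this with a hypothetical Utf8Text type standing in for FluentCassandra's UTF8Type (whose internals are not shown here), plus a commonly used inline helper. The names Utf8Text and convert are assumptions made for illustration.

```fsharp
// Hypothetical stand-in for a library type that defines an implicit
// conversion from string (as UTF8Type does, according to the post above).
type Utf8Text(value: string) =
    member this.Value = value
    static member op_Implicit (s: string) : Utf8Text = Utf8Text(s)

// Option 1: call the conversion method by name.
let direct : Utf8Text = Utf8Text.op_Implicit "SELECT * FROM users"

// Option 2: a generic helper that resolves op_Implicit at compile time
// from the source type and the expected result type.
let inline convert (x: ^a) : ^b =
    ((^a or ^b) : (static member op_Implicit : ^a -> ^b) x)

let viaHelper : Utf8Text = convert "SELECT * FROM users"
```

The helper needs the target type to be known from context (here, the type annotation on viaHelper). Calling op_Implicit by name is more explicit and is what the ExecuteNonQuery wrapper above does.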

Is it possible to create an object expression at runtime? [duplicate]

How do I execute F# code from a string in a compiled F# program?
Here's a little script that uses the FSharp CodeDom to compile a string into an assembly and dynamically load it into the script session.
It uses a type extension simply to allow useful defaults on the arguments (hopefully let bound functions will support optional, named and params arguments in the near future.)
#r "FSharp.Compiler.dll"
#r "FSharp.Compiler.CodeDom.dll"
open System
open System.IO
open System.CodeDom.Compiler
open Microsoft.FSharp.Compiler.CodeDom
let CompileFSharpString(str, assemblies, output) =
    use pro = new FSharpCodeProvider()
    let opt = CompilerParameters(assemblies, output)
    let res = pro.CompileAssemblyFromSource( opt, [|str|] )
    if res.Errors.Count = 0 then
        Some(FileInfo(res.PathToAssembly))
    else None
let (++) v1 v2 = Path.Combine(v1, v2)
let defaultAsms = [|"System.dll"; "FSharp.Core.dll"; "FSharp.Powerpack.dll"|]
let randomFile() = __SOURCE_DIRECTORY__ ++ Path.GetRandomFileName() + ".dll"
type System.CodeDom.Compiler.CodeCompiler with
    static member CompileFSharpString (str, ?assemblies, ?output) =
        let assemblies = defaultArg assemblies defaultAsms
        let output = defaultArg output (randomFile())
        CompileFSharpString(str, assemblies, output)
// Our set of library functions.
let library = "
module Temp.Main
let f(x,y) = sin x + cos y
"
// Create the assembly
let fileinfo = CodeCompiler.CompileFSharpString(library)
// Import metadata into the FSharp typechecker
#r "0lb3lphm.del.dll"
let a = Temp.Main.f(0.5 * Math.PI, 0.0) // val a : float = 2.0
// Purely reflective invocation of the function.
let asm = Reflection.Assembly.LoadFrom(fileinfo.Value.FullName)
let mth = asm.GetType("Temp.Main").GetMethod("f")
// Wrap weakly typed function with strong typing.
let f(x,y) = mth.Invoke(null, [|box (x:float); box (y:float)|]) :?> float
let b = f (0.5 * Math.PI, 0.0) // val b : float = 2.0
To use this in a compiled program you would need the purely reflective invocation.
Of course this is a toy compared to a full scripting API that many of us in the community have urgently requested.
best of luck,
Danny
Are you looking for an Eval function?
You might want to try looking at this blog post:
http://fsharpnews.blogspot.com/2007/02/symbolic-manipulation.html
If you read in your expressions into these kind of symbolic datastructures, they are pretty easy to evaluate.
Or, perhaps you are looking for scripting support:
http://blogs.msdn.com/chrsmith/archive/2008/09/12/scripting-in-f.aspx
If you really want dynamic compilation, you could do it with the F# CodeDom provider.
There has been movement on this front: you can now compile using FSharp.Compiler.Service.
Here is a simple sample using FSharp.Compiler.Service 5.0.0 from NuGet:
open System.IO
open Microsoft.FSharp.Compiler.SimpleSourceCodeServices
let compile (codeText:string) =
    let scs = SimpleSourceCodeServices()
    // derive matching .fs and .dll paths from a temporary file name
    let src,dllPath =
        let fn = Path.GetTempFileName()
        let fn2 = Path.ChangeExtension(fn, ".fs")
        let fn3 = Path.ChangeExtension(fn, ".dll")
        fn2,fn3
    File.WriteAllText(src,codeText)
    let errors, exitCode = scs.Compile [| "fsc.exe"; "-o"; dllPath; "-a";src; "-r"; "WindowsBase"; "-r" ;"PresentationCore"; "-r"; "PresentationFramework" |]
    match errors,exitCode with
    | [| |],0 -> Some dllPath
    | _ ->
        // Dump is a LINQPad extension method
        (errors,exitCode).Dump("Compilation failed")
        File.Delete src
        File.Delete dllPath
        None
Then it's a matter of calling Assembly.LoadFrom(dllPath) to get it into the current app domain, followed by reflection-based calls into the DLL (or possibly Activator.CreateInstance).
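The reflection step can be sketched without compiling anything. In the example below, the already-loaded core library stands in for the result of Assembly.LoadFrom(dllPath), and System.Math.Pow stands in for a function from the compiled code. Both substitutions are assumptions made so that the snippet runs on its own.

```fsharp
open System
open System.Reflection

// Stand-in for Assembly.LoadFrom(dllPath): the core library, which is already loaded.
let asm : Assembly = typeof<string>.Assembly

// Look up a static method by type name, method name, and parameter types.
let powMethod : MethodInfo =
    asm.GetType("System.Math").GetMethod("Pow", [| typeof<float>; typeof<float> |])

// Wrap the weakly typed Invoke call in a strongly typed F# function.
let pow (x: float) (y: float) : float =
    powMethod.Invoke(null, [| box x; box y |]) :?> float

let result = pow 2.0 10.0
```

For a dynamically compiled F# module, the type name passed to GetType would be the module's full name (such as "Temp.Main" in the CodeDom answer above), and the first argument to Invoke stays null because module functions compile to static methods.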
Sample LinqPad Usage

How to use OpenXML SDK with F# and MemoryStreams

This article says that you need to use resizable MemoryStreams when working with the OpenXML SDK, and the sample code works fine.
However, when I translate the sample C# code into F#, the document remains unchanged:
open System.IO
open DocumentFormat.OpenXml.Packaging
open DocumentFormat.OpenXml.Wordprocessing
[<EntryPoint>]
let Main args =
    let byteArray = File.ReadAllBytes "Test.docx"
    use mem = new MemoryStream()
    mem.Write(byteArray, 0, (int)byteArray.Length)
    let para = new Paragraph()
    let run = new Run()
    let text = new Text("Newly inserted paragraph")
    run.InsertAt(text, 0) |> ignore
    para.InsertAt(run, 0) |> ignore
    use doc = WordprocessingDocument.Open(mem, true)
    doc.MainDocumentPart.Document.Body.InsertAt(para, 0) |> ignore
    // no change to the document
    use fs = new FileStream("Test2.docx", System.IO.FileMode.Create)
    mem.WriteTo(fs)
    0
It works fine when I use WordprocessingDocument.Open("Test1.docx", true), but I want to use a MemoryStream. What am I doing wrong?
The changes you make to doc are not reflected in the MemoryStream mem until you close doc. Placing doc.Close() as below
...
doc.MainDocumentPart.Document.Body.InsertAt(para, 0) |> ignore
doc.Close()
...
fixes the problem, and you'll get the text "Newly inserted paragraph" at the top of your Test2.docx.
Also, your snippet is missing one required reference:
open System.IO.Packaging
from WindowsBase.dll.
EDIT: as ildjarn pointed out, a more idiomatic F# approach would be the following refactoring:
open System.IO
open System.IO.Packaging
open DocumentFormat.OpenXml.Packaging
open DocumentFormat.OpenXml.Wordprocessing
[<EntryPoint>]
let Main args =
    let byteArray = File.ReadAllBytes "Test.docx"
    use mem = new MemoryStream()
    mem.Write(byteArray, 0, (int)byteArray.Length)
    do
        use doc = WordprocessingDocument.Open(mem, true)
        let para = new Paragraph()
        let run = new Run()
        let text = new Text("Newly inserted paragraph")
        run.InsertAt(text, 0) |> ignore
        para.InsertAt(run, 0) |> ignore
        doc.MainDocumentPart.Document.Body.InsertAt(para, 0) |> ignore
    use fs = new FileStream("Test2.docx", FileMode.Create)
    mem.WriteTo(fs)
    0
