Lately, I have been interested in using Racket for data manipulation and storage.
One thing that keeps coming back to me is the memory usage when we create a custom parser.
For example, this is some code from a FOSDEM 2019 presentation:
ranch.rkt
#lang s-exp workshop/expander
(ranch
 (ponies
  (pony firestorm "Hyaaaaar")
  (pony rarity "I'm so shiiinyyy")
  (pony dash "Look at dat speed!")
  (pony dog "Guys, I'm not a pony.")))
expander.rkt
#lang racket/base
(provide
 (except-out (all-from-out racket/base) #%module-begin)
 (rename-out [module-begin #%module-begin])
 ranch)
(require
 (for-syntax
  racket/base
  syntax/stx
  syntax/parse))
(define-syntax-rule (module-begin expr)
  (#%module-begin
   (provide the-ranch)
   (define the-ranch expr)))
(define-syntax (ranch stx)
  (syntax-parse stx
    #:datum-literals (ponies pony)
    [(_ (ponies (pony name:id cry:str) ...))
     #'(lambda (pony-name)
         (cond
           [(eq? pony-name 'name) cry] ...
           (else "This pony does not exist!")))]))
This will create a data structure for the ranch, create a module, and then modify it to replace the nodes with the lambda that takes pony-name.
We can confirm that in the macro stepper; at the end of the expansion we get:
(module ranch workshop/expander
(#%module-begin
(module configure-runtime '#%kernel (#%module-begin (#%require racket/runtime-config) (#%app configure (quote #f))))
(#%provide the-ranch)
(define-values (the-ranch)
(lambda (pony-name)
(if (#%app eq? pony-name 'firestorm)
(let-values () (quote "Hyaaaaar"))
(if (#%app eq? pony-name 'rarity)
(let-values () (quote "I'm so shiiinyyy"))
(if (#%app eq? pony-name 'dash)
(let-values () (quote "Look at dat speed!"))
(if (#%app eq? pony-name 'dog) (let-values () (quote "Guys, I'm not a pony.")) (let-values () (quote "This pony does not exist!"))))))))))
My question is: how is this different in memory usage from classic OOP? If we duplicate the instructions for each piece of data (here, each pony), don't we get the same memory usage as OOP when creating class instances?
Also, wouldn't it be more efficient to have a single function iterating recursively over the nodes, instead of using a macro to transform the data into executable instructions?
I know I have a lot of misunderstandings about functional programming, and certainly about the practical goal of the code above. I would be happy to understand better what is going on here and to have your feedback :).
The link to the FOSDEM presentation: https://archive.fosdem.org/2019/schedule/event/makeownlangracket/
Related
Can someone explain the following behavior? Specifically, why does the function return a different list every time? Why isn't some-list initialized to '(0 0 0) every time the function is called?
(defun foo ()
  (let ((some-list '(0 0 0)))
    (incf (car some-list))
    some-list))
Output:
> (foo)
(1 0 0)
> (foo)
(2 0 0)
> (foo)
(3 0 0)
> (foo)
(4 0 0)
Thanks!
EDIT:
Also, what is the recommended way of implementing this function, assuming I want the function to output '(1 0 0) every time?
'(0 0 0) is a literal object, which is assumed to be a constant (albeit not protected from modification). So you're effectively modifying the same object every time. To create different objects at each function call use (list 0 0 0).
So unless you know what you're doing, you should only ever use literal lists (like '(0 0 0)) as constants.
On a side note, if you define this function in the SBCL REPL, you get the following warning:
caught WARNING:
Destructive function SB-KERNEL:%RPLACA called on constant data.
See also:
The ANSI Standard, Special Operator QUOTE
The ANSI Standard, Section 3.2.2.3
Which gives a good hint towards the problem at hand.
'(0 0 0) in code is literal data. Modifying this data has undefined behavior. Common Lisp implementations may not detect it at runtime (unless data is for example placed in some read-only memory space). But it can have undesirable effects.
- You see that this data may be (and often is) shared across various invocations of the same function.
- One of the more subtle possible errors is this: Common Lisp has been defined with various compiler optimizations in mind. For example, a compiler is allowed to reuse data:
Example:
(let ((a '(1 2 3))
      (b '(1 2 3)))
  (list a b))
In the above code snippet the compiler may detect that the literal data of a and b is EQUAL. It may then have both variables point to the same literal data. Modifying it may work, but the change is then visible from both a and b.
Summary: Modification of literal data is a source of several subtle bugs. Avoid it if possible; instead, cons new data objects. Consing in general means the allocation of fresh, new data structures at runtime.
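The sharing effect is not specific to Lisp. The following is only a Python analogy, not Common Lisp semantics: SHARED stands in for the single literal object, and the names foo_literal and foo_fresh are made up for illustration.

```python
# Analogy only: SHARED plays the role of the literal '(0 0 0),
# which exists once and is reused by every call.
SHARED = [0, 0, 0]

def foo_literal():
    some_list = SHARED        # no copy: the same object on every call
    some_list[0] += 1         # destructive update, like (incf (car some-list))
    return some_list

def foo_fresh():
    some_list = [0, 0, 0]     # fresh allocation per call, like (list 0 0 0)
    some_list[0] += 1
    return some_list

first = list(foo_literal())   # snapshot copies of each result
second = list(foo_literal())
fresh_a = foo_fresh()
fresh_b = foo_fresh()
```

The first two results differ because both calls updated the same object, while the fresh variant returns an equal but distinct list each time.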
Wanted to write one myself, but I found a good one online:
CommonLisp has first class functions, i.e. functions are objects which can be created at runtime, and passed as arguments to other functions.
--AlainPicard
These first-class functions also have their own state, so they are functors. All Lisp functions are functors; there is no separation between functions that are "just code" and "function objects". The state takes the form of captured lexical variable bindings. You don't need to use LAMBDA to capture bindings; a top-level DEFUN can do it too:
(let ((private-variable 42))
  (defun foo ()
    ...))
The code in the place of ... sees private-variable in its lexical scope. There is one instance of this variable associated with the one and only function object that is globally tied to the symbol FOO; the variable is captured at the time the DEFUN expression is evaluated. This variable then acts something like a static variable in C. Or, alternately, you can think of FOO as a "singleton" object with an "instance variable".
--KazKylheku
Ref
http://c2.com/cgi/wiki?CommonLisp
For one of my projects, I am trying to use Common Lisp, specifically SBCL (learning it in the process is one of my motivations).
I need to read a file with questions and answers, basically like a standardized test made up mainly of multiple-choice questions.
I have some sample questions marked with section markers such as "|" for the start and "//s" for the end of a section. The question paper will have a hierarchical structure like this: Section -> multiple sub-sections -> each sub-section with multiple questions -> each question with multiple answers, one of them being correct.
This hierarchical structure needs to be converted into a json file finally and pushed to an android app for downstream consumption.
STEP-1: After reading from the source Test paper, this is how my list will look like:
(("Test" . "t")
("0.1" . "v")
("today" . "d")
("General Knowledge" . "p")
("Science" . "s")
("what is the speed of light in miles per second?" . "q")
("Choose the best answer from the following" . "i")
("MCQ question" . "n")
("186000" . "c")
("286000" . "w")
("200000" . "w"))
[PS.1] See the legend at the end of the post for an explanation of the cdar values such as h, p, t, v, etc.
[PS.2] A sample of the source file is attached at the end of this post.
The car of each consed pair holds the content and the cdr holds the marker, which corresponds to a section, a sub-section, a question, etc.
STEP-2: Finally I need to convert this into the following format - an alist -
((:QANDA . "Test") (:VERSION . "0.1") (:DATE . "today")
(:SECTION
((:TITLE . "General Knowledge")
(:SUBSECTION
((:SSTITLE . "Science")
(:QUESTION
((:QUESTION . "what is the speed of light in miles per second?")
(:DIRECTIONS . "Choose the best answer from the following")
(:TYPE . "MCQ question")
(:CHOICES ((:CHOICE . "186000") (:CORRECT . "Y"))
((:CHOICE . "286000") (:CORRECT . "N"))
((:CHOICE . "200000") (:CORRECT . "N"))))))))))
to be consumed by cl-json.
STEP-3: cl-json will produce an appropriate json from this.
The json will look like this:
{
  "qanda": "Test",
  "version": "0.1",
  "date": "today",
  "section": [
    {
      "title": "General Knowledge",
      "subsection": [
        {
          "sstitle": "Science",
          "question": [
            {
              "question": "what is the speed of light in miles per second?",
              "directions": "Choose the best answer from the following",
              "type": "MCQ question",
              "choices": [
                {
                  "choice": "186000",
                  "correct": "Y"
                },
                {
                  "choice": "286000",
                  "correct": "N"
                },
                {
                  "choice": "200000",
                  "correct": "N"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
I've been successful in reading the source file, generating the consed pair list. Where I am struggling is to create this nested list as shown above to feed it to cl-json.
I realized after a bit of struggle that this is more or less like an n-ary tree problem.
Here are my questions:
a) What is the right way to construct such an n-ary tree representation of the Test paper source file?
b) Or is there a better or easier data structure to represent this?
Here is what I tried, where qtree will be '() initially and kvlist is the consed pair list shown above. This code is incomplete, as I tried push, consing and nconc (with unreliable results).
Step 1 and Step 3 are fine. Step 2 is where I need help. The problem is how to add child nodes successively while iterating through the kvlist, and how to find the right parent for a child when there is more than one parent (e.g., adding a question to the second sub-section):
(defun build-qtree (qtree kvlist)
(cond
((eq '() kvlist) qtree)
((equal "h" (cdar kvlist))
(push (car kvlist) qtree)
(build-qtree qtree (cdr kvlist)))
((equal "p" (cdar kvlist))
(nconc (last qtree) '((:SECTION))))
(t
(qtree))))
[PS.1] Legend: This will be used in the condition branches, or maybe in a defstruct or a dictionary type of list, etc.
t - title, v - version, d - date, p - section, s - sub section, q - question, i - instructions, n - type of question, c - correct answer, w - wrong answer
[PS.2] Source file:
|Test//t
|0.1//v
|today//d
|General Knowledge//p
|Science//s
|what is the speed of light in miles per second?//q
|Choose the best answer from the following//i
|MCQ question//n
|186000//c
|286000//w
|200000//w
You have a simple problem with a complex example. It might be just a simple parsing problem: What you need is a grammar and a parser for it.
Example grammar. Terminal items are in upper case. * means one or more.
s = q
q = Q i*
i = I w*
w = W
Simple parser:
(defun example (sentence)
  (labels ((next-item ()
             (pop sentence))
           (empty? ()
             (null sentence))
           (peek-item ()
             (unless (empty?)
               (first sentence)))
           (expect-item (sym)
             (let ((item (next-item)))
               (if (eq item sym)
                   sym
                   (error "Parser error: next ~a, expected ~a" item sym))))
           (star (sym fn)
             (cons (funcall fn)
                   (loop while (eq (peek-item) sym)
                         collect (funcall fn)))))
    (labels ((s ()
               (q))
             (q ()
               (list (expect-item 'q) (star 'i #'i)))
             (i ()
               (list (expect-item 'i) (star 'w #'w)))
             (w ()
               (expect-item 'w)))
      (s))))
Example:
CL-USER 10 > (example '(q i w w w i w w))
(Q
 ((I (W
      W
      W))
  (I (W
      W))))
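For readers who want to experiment outside Lisp, the same recursive-descent idea can be sketched in Python. This is an illustrative port, not part of the original answer: the function name parse and the use of lowercase strings as symbols are assumptions, and like the Lisp version it does not check for leftover input.

```python
def parse(sentence):
    """Parse a list of symbols for the grammar s = q, q = Q i*, i = I w*, w = W."""
    items = list(sentence)
    pos = 0

    def peek():
        # Next unread symbol, or None at end of input.
        return items[pos] if pos < len(items) else None

    def expect(sym):
        # Consume the next symbol, which must equal sym.
        nonlocal pos
        item = peek()
        if item != sym:
            raise ValueError(f"Parser error: next {item}, expected {sym}")
        pos += 1
        return sym

    def star(sym, fn):
        # One or more: parse one item, then continue while the next symbol is sym.
        results = [fn()]
        while peek() == sym:
            results.append(fn())
        return results

    def w():
        return expect("w")

    def i():
        return [expect("i"), star("w", w)]

    def q():
        return [expect("q"), star("i", i)]

    return q()

tree = parse(["q", "i", "w", "w", "w", "i", "w", "w"])
```

Each grammar rule becomes one small function, so extending the grammar to sections, sub-sections, questions and choices means adding one function per level.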
in-range in Racket returns a stream. There are plenty of functions defined on streams in the racket/stream library. However, I can't use the stream-take function from srfi/41 on them. I wanted to execute
(stream-take 5 (in-range 10))
It complained: stream-take: non-stream argument.
(stream->list (stream-cons 10 (in-range 10)))
The above throws the following error:
stream-promise: contract violation;
given value instantiates a different structure type with the same name
expected: stream?
given: #<stream>
However:
(stream->list (stream-cons 10 stream-null)) ;; works
(stream->list (stream-cons 10 empty-stream)) ;; works
both work fine.
Does the above mean that streams from racket/stream and srfi/41 are incompatible? How can I take 10 items from a racket/stream stream without reinventing the wheel?
Racket 5.3.3
Yes, the kind of stream that (in-range 10) produces is different from srfi/41 streams. In general, you can't expect srfi/41 functions to work on all streams in Racket because a Racket "stream" is actually a generic datatype that dispatches to different method implementations (see gen:stream). In contrast, srfi/41 expects only its own datatype.
(stream-take should probably be added to racket/stream though)
If you want to take 10 items from racket/stream, use (for/list ([x some-stream] [e 10]) x).
So I basically want to print BSTs. Here is a little more detail:
Provide a function (printbst t) that prints a BST constructed from BST as provided by bst.rkt in the following format:
- Each node in the BST should be printed on a separate line;
- The left subtree should be printed after the root;
- The right subtree should be printed before the root;
- The key value should be indented by 2d spaces, where d is its depth, or distance from the root. That is, the root should not be indented, the keys in its subtrees should be indented 2 spaces, the keys in their subtrees 4 spaces, and so on.
For example, the complete tree containing {1,2,3,4,5,6} would be printed like this:
6
5
4
3
2
1
Observe that if you rotate the output clockwise and connect each node to its subtrees, you arrive at the conventional graphical representation of the tree. Do not use mutation.
Here is what I have so far:
#lang racket
;; Note: struct-out exports all functions associated with the structure
(provide (struct-out BST))
(define-struct BST (key left right) #:transparent)
(define (depth key bst)
  (cond
    [(or (empty? bst) (= key (BST-key bst))) 0]
    [else (+ 1 (depth key (BST-right bst)) (depth key (BST-left bst)))]))
(define (indent int)
  (cond
    [(= int 0) ""]
    [else " " (indent (sub1 int))]))
(define (printbst t)
  (cond
    [(empty? t) (newline)]
    [(and (empty? (BST-right t)) (empty? (BST-left t)))
     (printf "~a~a" (indent (depth (BST-key t) t)) (BST-key t))]))
My printbst only prints a tree with one node, though. I have an idea, but it involves mutation, which I can't use :( Any suggestions? Should I change my approach to the problem altogether?
Short answer: yes, you're going to want to restructure this more or less completely.
On the bright side, I like your indent function :)
The easiest way to write this problem involves making recursive calls on the subtrees. I hope I'm not giving away too much when I tell you that in order to print a subtree, there's one extra piece of information that you need.
...
Based on our discussion below, I'm going to first suggest that you develop the closely related recursive program that prints out the desired numbers with no indentation. So then the correct output would be:
6
5
4
3
2
1
Updating that program to the one that handles indentation is just a question of passing along a single extra piece of information.
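To make the shape of that recursion concrete without writing the Racket for you, here is a rough Python sketch. It assumes the extra piece of information is the current depth, and it uses a hypothetical namedtuple in place of the BST struct, with None playing the role of empty.

```python
from collections import namedtuple

# Hypothetical stand-in for the BST struct; None plays the role of empty.
BST = namedtuple("BST", ["key", "left", "right"])

def bst_lines(t, depth=0):
    """Return the output lines: right subtree, then the root, then the left subtree."""
    if t is None:
        return []
    return (bst_lines(t.right, depth + 1)
            + [" " * (2 * depth) + str(t.key)]
            + bst_lines(t.left, depth + 1))

tree = BST(2, BST(1, None, None), BST(3, None, None))
print("\n".join(bst_lines(tree)))
```

Nothing is mutated here: the depth is passed down as an argument, and each call builds its result from the results of its recursive calls.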
P.S.: Questions like this that produce output are almost impossible to write good test cases for, and consequently not great for homework. I hope for your sake that you have lots of other problems that don't involve output....
I just asked a question about how the Erlang compiler implements pattern matching, and I got some great responses, one of which is the compiled bytecode (obtained with a parameter passed to the c() directive):
{function, match, 1, 2}.
{label,1}.
{func_info,{atom,match},{atom,match},1}.
{label,2}.
{test,is_tuple,{f,3},[{x,0}]}.
{test,test_arity,{f,3},[{x,0},2]}.
{get_tuple_element,{x,0},0,{x,1}}.
{test,is_eq_exact,{f,3},[{x,1},{atom,a}]}.
return.
{label,3}.
{badmatch,{x,0}}
It's all just plain Erlang tuples. I was expecting some cryptic binary thingy; guess not. I am asking this on impulse here (I could look at the compiler source, but asking questions always ends up better, with extra insight): how is this output translated at the binary level?
Say {test,is_tuple,{f,3},[{x,0}]} for example. I am assuming this is one instruction, called 'test'... anyway, so this output would essentially be the AST of the bytecode level language, from which the binary encoding is just a 1-1 translation?
This is all so exciting. I had no idea that I could so easily see what the Erlang compiler breaks things into.
OK, so I dug into the compiler source code to find the answer. To my surprise, the asm file produced with the 'S' parameter to the compile:file() function is actually consulted as is (file:consult()), and then the tuples are checked one by one for further action (line 661, beam_consult_asm(St) ->, in compile.erl). Further on, there is a generated mapping table (in the compile folder of the Erlang source) that shows the serial number of each bytecode label, and I'm guessing this is used to generate the actual binary signature of the bytecode.
Great stuff. But you just gotta love the consult() function: you can almost have a Lispy syntax for an arbitrary language, avoid the need for a parser/lexer entirely, and just consult source code into the compiler and do stuff with it... code as data, data as code...
The compiler has a so-called pattern match compiler which will take a pattern and compile it down to what is essentially a series of branches, switches and such. The code for Erlang is in v3_kernel.erl in the compiler. It uses Simon Peyton Jones, "The Implementation of Functional Programming Languages", available online at
http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/
Another worthy paper is the one by Peter Sestoft,
http://www.itu.dk/~sestoft/papers/match.ps.gz
which derives a pattern match compiler by inspecting partial evaluation of a simpler system. It may be an easier read, especially if you know ML.
The basic idea is that if you have, say:
% 1
f(a, b) ->
% 2
f(a, c) ->
% 3
f(b, b) ->
% 4
f(b, c) ->
Suppose now we have a call f(X, Y). Say X = a. Then only 1 and 2 are applicable. So we check Y = b and then Y = c. If on the other hand X /= a then we know that we can skip 1 and 2 and begin testing 3 and 4. The key is that if something does not match it tells us something about where the match can continue as well as when we do match. It is a set of constraints which we can solve by testing.
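The saving can be illustrated with a small, hand-written Python sketch. It is not what v3_kernel.erl generates; the function names and the way tests are counted are assumptions made for the example. One function tries clauses 1 to 4 in order, the other tests X once and then only considers the clauses that are still possible.

```python
def match_sequential(x, y):
    """Try clauses 1-4 in order, counting every atom comparison."""
    clauses = [("a", "b", 1), ("a", "c", 2), ("b", "b", 3), ("b", "c", 4)]
    tests = 0
    for cx, cy, number in clauses:
        tests += 1                 # compare X against this clause
        if x != cx:
            continue
        tests += 1                 # compare Y against this clause
        if y == cy:
            return number, tests
    return None, tests

def match_tree(x, y):
    """Test X first, then test Y only among the clauses still possible."""
    tests = 1                      # is X equal to a?
    if x == "a":
        candidates = [("b", 1), ("c", 2)]
    else:
        tests += 1                 # is X equal to b?
        if x != "b":
            return None, tests
        candidates = [("b", 3), ("c", 4)]
    for cy, number in candidates:
        tests += 1                 # compare Y against a remaining clause
        if y == cy:
            return number, tests
    return None, tests

sequential_result = match_sequential("b", "c")
tree_result = match_tree("b", "c")
```

For f(b, c) both functions select clause 4, but the sequential version performs 6 comparisons and the tree version 4, because a failed test on X rules out clauses 1 and 2 at once.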
Pattern match compilers seek to minimize the number of tests, so that there are as few as possible before we reach a conclusion. Statically typed languages have some advantages here since they may know that:
-type foo() :: a | b | c.
and then if we have
-spec f(foo()) -> any().
f(a) ->
f(b) ->
f(c) ->
and we did not match f(a) or f(b), then f(c) must match. Erlang has to check and then fail if it doesn't match.