Parsing strings with Scheme

I am trying to write a simple parser that creates an SXML expression from a string, e.g.
"This is a [Test]" ===> (item "This is a" (subitem "Test"))
Anybody wondering about the square brackets in this example may want to look at the so-called Leiden conventions.
This is the code I have written so far:
(define my-sequence '("this" "[" "is" "a" "]" "test"))
(define (left-square-bracket? item)
  (or (equal? item "[")
      (eq? item #\x005b)))
(define (right-square-bracket? item)
  (or (equal? item "]")
      (eq? item #\x005d)))
(define (parse-sequence sequence)
  (cond ((null? sequence) '())
        ((left-square-bracket? (car sequence))
         (let ((subsequence (get-subsequence (cdr sequence))))
           (list subsequence)))
        (else
         (cons (car sequence)
               (parse-sequence (cdr sequence))))))
(define (get-subsequence sequence)
  (if (right-square-bracket? (car sequence))
      '()
      (cons (car sequence)
            (get-subsequence (cdr sequence)))))
Evaluating (parse-sequence my-sequence) yields ("this" ("is" "a")). A nested expression has been created, but the program finished without having evaluated the last item "test". The question is, how do I return from get-subsequence to parse-sequence?
Any help is appreciated, many thanks in advance! :)

To address your initial question of how to return more than one value: use the "values" form. Here is an example implementation in which the inner procedure returns both the remaining list still to be processed and the result so far. It recurses on each opening bracket.
(define (parse-sequence lst)
  (define (parse-seq lst)
    (let loop ((lst lst) (res null))
      (cond
        ((null? lst) (values null res))
        ((string=? (car lst) "[")
         (let-values ([(lst2 res2) (parse-seq (cdr lst))])
           (loop lst2 (append res (list res2)))))
        ((string=? (car lst) "]")
         (values (cdr lst) res))
        (else
         (loop (cdr lst) (append res (list (car lst))))))))
  (let-values ([(lst res) (parse-seq lst)])
    res))
then
(parse-sequence '("this" "is" "a" "test"))
(parse-sequence '("this" "[" "is" "a" "]" "test"))
(parse-sequence '("this" "[" "is" "[" "a" "]" "]" "test"))
will yield
'("this" "is" "a" "test")
'("this" ("is" "a") "test")
'("this" ("is" ("a")) "test")
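To get from nested lists like these to the tagged form asked for in the question, one option is to tag each bracketed group while parsing. The following is only a sketch in Racket: the names tokenize, parse-tokens and string->sxml are made up for this example, it splits on whitespace (so "This is a" comes out as three separate strings rather than one), and it does not report unbalanced brackets.

```racket
#lang racket

;; Hypothetical helper: pad brackets with spaces, then split on whitespace.
(define (tokenize str)
  (string-split
   (string-replace (string-replace str "[" " [ ") "]" " ] ")))

;; Returns two values: the parsed items and the tokens left over.
;; Items are consed onto an accumulator and reversed once at the end,
;; which avoids calling append on every step.
(define (parse-tokens tokens)
  (let loop ([ts tokens] [acc '()])
    (cond
      [(null? ts) (values (reverse acc) '())]
      [(string=? (car ts) "]") (values (reverse acc) (cdr ts))]
      [(string=? (car ts) "[")
       (let-values ([(sub remaining) (parse-tokens (cdr ts))])
         (loop remaining (cons (cons 'subitem sub) acc)))]
      [else (loop (cdr ts) (cons (car ts) acc))])))

;; Wrap the top-level items in an (item ...) element.
(define (string->sxml str)
  (let-values ([(items remaining) (parse-tokens (tokenize str))])
    (cons 'item items)))

(string->sxml "this [is a] test")
;; => '(item "this" (subitem "is" "a") "test")
```

Returning the leftover tokens alongside the parsed items is what lets the caller continue after a closing bracket, which is the step missing from the original get-subsequence.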

I made some progress by using open-input-string in combination with read-char:
(define my-sequence (open-input-string "this [is a] test"))
(define (parse-sequence sequence)
  `(item
    ,@(let loop ((next-char (read-char sequence)))
        (cond ((eof-object? next-char) '())
              ((left-square-bracket? next-char)
               (let ((subsequence (get-subsequence sequence)))
                 (cons subsequence
                       (loop (read-char sequence)))))
              (else
               (cons next-char
                     (loop (read-char sequence))))))))
(define (get-subsequence sequence)
  `(subitem
    ,@(let loop ((next-char (read-char sequence)))
        (if (right-square-bracket? next-char)
            '()
            (cons next-char
                  (loop (read-char sequence)))))))
(parse-sequence my-sequence)
===> (item #\t #\h #\i #\s #\space (subitem #\i #\s #\space #\a) #\space #\t #\e #\s #\t)
Now work goes on, step by step. :)
Any comments and suggestions are still appreciated. :)

Related

Parsing concrete syntax in Scheme

I wrote a procedure that takes a valid prefix list for subtraction (e.g., "(- 6 5)" for what we know as "6-5"). Here is my code:
(define parse-diff-list
  (lambda (datum)
    (cond
      ((number? datum) (const-exp datum)) ;; if datum is a number, return const-exp
      ((pair? datum) ;; if datum is a pair:
       (let ((sym (car datum))) ;; let sym be the first of the pair
         (cond
           ((eqv? sym '-) ;; if sym is minus:
            (let ((lst1 (parse-diff-list (cdr datum)))) ;; parse first operand of subtraction
              (let ((lst2 (parse-diff-list (cdr lst1)))) ;; parse second operand from what remains
                (cons (diff-exp (car lst1) (car lst2)) (cdr lst2))))) ;; build the diff-exp, keep the remainder
           ((number? sym) ;; if sym is number:
            (cons (const-exp sym) (cdr datum))) ;; return const-exp with the remainder of the list, yet to be processed
           (else (eopl:error 'parse-diff-list "bad prefix-expression, expected - ~s" sym)))))
      (else (eopl:error 'parse-diff-list "bad prefix-expression ~s" datum))))) ;; else is required here, otherwise the clause is never an error
(define parse-prefix
  (lambda (lst)
    (car (parse-diff-list lst))))
It works fine logically, but I don't understand the logic of the indentation in printing. For the input:
(parse-prefix '(- - 1 2 - 3 - 4 5))
It prints:
#(struct:diff-exp
#(struct:diff-exp #(struct:const-exp 1) #(struct:const-exp 2))
#(struct:diff-exp #(struct:const-exp 3) #(struct:diff-exp #(struct:const-exp 4) #(struct:const-exp 5)))
While I would want the following print style:
#(struct:diff-exp
#(struct:diff-exp
#(struct:const-exp 1)
#(struct:const-exp 2))
#(struct:diff-exp
#(struct:const-exp 3)
#(struct:diff-exp
#(struct:const-exp 4)
#(struct:const-exp 5)))
It's more than a petty question for me, as it does create indentations but I don't know how it does it.
Thanks a lot!
Take a look at racket/pretty, the pretty-printing library.
In particular, note the parameter pretty-print-columns, which
you can set like this:
`(pretty-print-columns 40)`
in order to avoid long lines.
http://docs.racket-lang.org/reference/pretty-print.html
(I am guessing you are using DrRacket, based on the way the structures are printed.)
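Because pretty-print-columns is a parameter, it can also be changed for a single call with parameterize instead of globally. The sketch below assumes plain nested lists standing in for the diff-exp structures, and format-narrow is a name invented for this example. The exact line breaks depend on the pretty printer, so no output is shown.

```racket
#lang racket
(require racket/pretty)

;; Stand-in data: nested lists shaped like the parsed expression.
(define tree
  '(diff-exp (diff-exp (const-exp 1) (const-exp 2))
             (diff-exp (const-exp 3)
                       (diff-exp (const-exp 4) (const-exp 5)))))

;; Hypothetical helper: pretty-format v with a temporary column limit.
;; pretty-format returns the text as a string instead of printing it.
(define (format-narrow v width)
  (parameterize ([pretty-print-columns width])
    (pretty-format v)))

(display (format-narrow tree 40))
(newline)
```

A narrower limit forces more line breaks and therefore more nesting-based indentation, which is why the printed shape changes with the width.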

An iterative program for appending lists in scheme

I am reading Section 2.2 of SICP, where the book introduces the procedure for appending two lists. I am trying to implement append using iteration.
This is my code:
(define (append list1 list2)
  (define (append-iter item1 reversed-item1 result)
    (if (null? item1)
        (if (null? reversed-item1)
            result
            (append-iter item1
                         (cdr reversed-item1)
                         (cons (car reversed-item1) result)))
        (append-iter (cdr item1)
                     (cons (car item1) reversed-item1)
                     result)))
  (append-iter list1 '() list2))
It works, but the number of iterations is double the length of list1. Is there a solution whose number of iterations equals the length of list1 (without using any fold function)?
Basically, your procedure works like this:
(define (append l1 l2)
  (define (reverse-append rev app)
    (if (null? rev)
        app
        (reverse-append (cdr rev)
                        (cons (car rev) app))))
  (reverse-append (reverse l1) l2))
It's O(N), but it wastes some memory, since the list built by (reverse l1) is used only for the iteration and then discarded. If you really need to avoid that, you have to use mutation:
(define (append-iter . rest)
  (let ((result (list 1)))
    (let loop ((p result) (lst '()) (rest rest))
      (cond
        ((not (null? lst))
         (set-cdr! p (list (car lst)))
         (loop (cdr p) (cdr lst) rest))
        ((null? rest) (cdr result))
        ((null? (cdr rest))
         (set-cdr! p (car rest))
         (cdr result))
        (else (loop p (car rest) (cdr rest)))))))

Given a stream of numbers in Scheme, I need to print n numbers separated by commas, like (1, 2, 3, 4, ..)

I can get the first n numbers as a list with the code below:
(define (print-first-n stream1 n)
  (cond ((= n 0) '())
        (else (cons (stream-car stream1)
                    (print-first-n (stream-cdr stream1) (- n 1))))))
But I have no idea how to add the commas.
You can't print a comma in a normal list, but we can build a string with the contents of the stream, separated by commas. This will work, assuming that the stream contains numbers and that n is at least 1:
(define (print-first-n stream1 n)
  (cond ((= n 1)
         (number->string (stream-car stream1)))
        (else
         (string-append
          (number->string (stream-car stream1)) ", "
          (print-first-n (stream-cdr stream1) (- n 1))))))
The above solution is fine for a small value of n, but terribly inefficient for large values (lots of temporary strings will be created, with O(n^2) complexity for the append operation). For a more efficient implementation, consider using SRFI-13's concatenation procedures, like this:
(require srfi/13)
(define (print-first-n stream1 n)
  (let loop ((strm stream1) (n n) (acc '()))
    (if (= n 1)
        (string-concatenate-reverse
         (cons (number->string (stream-car strm)) acc))
        (loop (stream-cdr strm)
              (sub1 n)
              (list* ", " (number->string (stream-car strm)) acc)))))
Either way: let's say that integers is an infinite stream of integers starting at 1, this is how it would look:
(print-first-n integers 5)
=> "1, 2, 3, 4, 5"
If the stream contains some other data type, use the appropriate procedure to convert each element to a string.
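If you are using Racket's own racket/stream library rather than SICP-style stream-car and stream-cdr, the same string can be built with string-join. This is only a sketch under that assumption; first-n->string and integers-from are names invented for the example.

```racket
#lang racket
(require racket/stream)

;; Hypothetical infinite stream of integers k, k+1, k+2, ...
;; stream-cons delays both of its arguments, so this terminates.
(define (integers-from k)
  (stream-cons k (integers-from (+ k 1))))

;; Take at most n elements, convert each to a string, join with ", ".
;; The two for clauses advance in parallel and stop when either ends.
(define (first-n->string strm n)
  (string-join
   (for/list ([i (in-range n)]
              [x (in-stream strm)])
     (number->string x))
   ", "))

(first-n->string (integers-from 1) 5)
;; => "1, 2, 3, 4, 5"
```

Unlike the recursive versions, this one also accepts n = 0 and then returns the empty string.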
If your function just prints the stream contents, and doesn't need to build a string (like Óscar's answer), here's my take on it (uses SRFI 41 streams):
(define (print-first-n stream n)
  (stream-for-each (lambda (delim item)
                     (display delim)
                     (display item))
                   (stream-cons "" (stream-constant ", "))
                   (stream-take n stream)))
Example:
> (define natural (stream-cons 1 (stream-map (lambda (x) (+ x 1)) natural)))
> (print-first-n natural 10)
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
To output to a string (like Óscar's answer), just wrap the whole thing in a string port:
(define (print-first-n stream n)
  (call-with-output-string
   (lambda (out)
     (stream-for-each (lambda (delim item)
                        (display delim out)
                        (display item out))
                      (stream-cons "" (stream-constant ", "))
                      (stream-take n stream)))))

Scheme Text Alignment Program Help [Just need help understanding how to write a function]

I'm working on a text alignment program in Scheme that reads text from a file, converts it to a stream, transforms the stream to remove extra spaces and newlines, and then aligns the text to the right. This is part of an assignment that I'm almost finished with, so I just need to understand conceptually how to do it.
The function I'm having trouble with is called 'right-justify'.
The main function is called at the end of the program and uses a file called 'hollow.txt'. I can supply that file if you want, but I think I just need help with how to start right-justify.
Thanks very much in advance.
#lang racket
(require racket/stream)
(define file->stream
  (lambda (filename)
    (let ((in-port (open-input-file filename)))
      (letrec
          ((build-input-stream
            (lambda ()
              (let ((ch (read-char in-port)))
                (if (eof-object? ch)
                    (begin
                      (close-input-port in-port)
                      (stream))
                    (stream-cons ch (build-input-stream)))))))
        (build-input-stream)))))
;main function
(define formatter ;from homework page
  (lambda (input-filename output-filename line-length)
    (stream->file output-filename
                  ; (right-justify
                  (insert-newlines line-length
                                   (remove-extra-spaces ;str ;stream argument given.
                                    (remove-newlines ;str ;stream argument given.
                                     (file->stream input-filename)))))));)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;MORPHING STREAMS.
(define right-justify ;not fully sure how to solve this one.
(lambda (str)
(cond
[ (stream-empty?) str ]
[ (char=? #\newline (stream-rest str))((stream-cons #\space (stream-rest str))) ]
[
(define remove-newlines
  (lambda (str)
    (cond
      [(stream-empty? str) str]
      [(char=? #\newline (stream-first str))
       (stream-cons #\space (remove-newlines (stream-rest str)))]
      [else (stream-cons (stream-first str) (remove-newlines (stream-rest str)))])))
(define remove-extra-spaces ;Collapse each run of spaces in the stream into a single space.
  (lambda (str)
    (cond
      [(stream-empty? str) str]
      [(char=? #\space (stream-first str))
       (stream-cons #\space (remove-extra-spaces (space-helper (stream-rest str))))]
      [else (stream-cons (stream-first str) (remove-extra-spaces (stream-rest str)))])))
(define space-helper ;A helper for remove-extra-spaces: skips leading spaces.
  (lambda (str)
    (cond
      [(stream-empty? str) str]
      [(char=? #\space (stream-first str)) (space-helper (stream-rest str))]
      [else str])))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; THESE FUNCTIONS MAY NEED WORK.
(define stream->file
  (lambda (filename stream-in)
    (let ((out-port (open-output-file filename #:exists 'replace)))
      (letrec
          ((build-output-stream
            (lambda (str) ;stream
              (if (stream-empty? str)
                  (close-output-port out-port)
                  (begin
                    (write-char (stream-first str) out-port)
                    (build-output-stream (stream-rest str)))))))
        (build-output-stream stream-in)))))
(define insert-newlines
  (lambda (line-length str)
    (letrec
        ((insert
          (lambda (str count)
            (if (stream-empty? str)
                str
                (let ((n (count-chars-to-next-space str)))
                  (if (and (< count line-length)
                           (<= (+ n count) line-length))
                      (stream-cons
                       (stream-first str)
                       (insert (stream-rest str) (+ count 1)))
                      (stream-cons
                       #\newline
                       (insert (trim-spaces str) 0))))))))
      (insert (trim-spaces str) 0))))
(define trim-spaces
  (lambda (str)
    (cond ((stream-empty? str) (stream))
          ((char=? (stream-first str) #\space)
           (trim-spaces (stream-rest str)))
          (else str))))
(define count-chars-to-next-space
  (lambda (str)
    (letrec
        ((count-ahead
          (lambda (str count)
            (cond ((stream-empty? str) count)
                  ((char=? (stream-first str) #\space) count)
                  (else (count-ahead (stream-rest str) (+ count 1)))))))
      (count-ahead str 0))))
(formatter "hollow.txt" "h.txt" 30) ;calling the function at the end so that it will run automatically.
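One possible way to think about right-justify: once insert-newlines has run, the stream is a sequence of lines, so each line can be collected up to its newline and emitted again with enough leading spaces to reach the line length. The sketch below is only one way to do that and makes several assumptions. It takes line-length as an extra argument (the commented-out call in formatter passes only the stream), take-line is a helper invented for the example, it processes the whole stream eagerly rather than lazily, and it always ends each line with a newline.

```racket
#lang racket
(require racket/stream)

;; Hypothetical helper: collect the characters up to (not including) the
;; next newline. Returns the line as a list plus the rest of the stream.
(define (take-line str)
  (let loop ([s str] [acc '()])
    (cond
      [(stream-empty? s) (values (reverse acc) s)]
      [(char=? (stream-first s) #\newline)
       (values (reverse acc) (stream-rest s))]
      [else (loop (stream-rest s) (cons (stream-first s) acc))])))

;; Pad every line on the left so that it is line-length characters wide.
;; Lines that are already long enough are left unpadded.
(define (right-justify line-length str)
  (if (stream-empty? str)
      str
      (let-values ([(line remaining) (take-line str)])
        (let* ([pad (max 0 (- line-length (length line)))]
               [padded (append (make-list pad #\space)
                               line
                               (list #\newline))])
          (stream-append padded
                         (right-justify line-length remaining))))))

;; Racket lists are also streams, so a list of characters works as input.
(list->string
 (stream->list (right-justify 5 (string->list "ab\nc"))))
;; => "   ab\n    c\n"
```

Working one line at a time is the key design choice: the amount of padding cannot be known until the whole line has been seen, so a purely character-by-character transformation like remove-newlines is not enough here.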

Recursively parse org-mode hierarchy

I'm trying to parse org-mode text in this way:
* head
** sub-head
- word :: description
** sub-head
- word :: description
- some notes
* head2
** sub-head2
- some more notes
I am trying to capture the data (such as "word :: description" and "some notes") in such a way that each piece of data preserves what its parent headers are and what the parent's parents are, etc. I envision the data coming out in such a form in elisp:
(
 ("head"
  ("sub-head" ("word :: description"))
  ("sub-head" ("word :: description" "some notes"))
 )
 ("head2"
  ("sub-head2" ("some more notes"))
 )
)
I am guessing there is an elegant solution using recursion. I'm open to structuring the data in elisp a different way, if there's a better way to do it.
The function org-element-parse-buffer should help. It parses the whole org-mode buffer into a Lisp list. You will get more properties than you need.
http://orgmode.org/worg/exporters/org-element-docstrings.html#sec-10
Here's a recursive solution:
(defun org-splitter (str lvl)
  (let* ((lst (split-string
               str
               (concat lvl " ")))
         (out (unless (= (length (car lst))
                         (length str))
                (mapcar
                 (lambda (s)
                   (and
                    (string-match "\\([^\n]+\\)\n\\(.*\\)" s)
                    (list (match-string 1 s)
                          (org-splitter
                           (substring-no-properties
                            s (match-beginning 2))
                           (concat lvl "\\*")))))
                 (cdr lst)))))
    (if (string= (car lst) "")
        out
      (cons (car lst) out))))
(defun org-recurse-all ()
  (let ((str (buffer-substring-no-properties
              (point-min) (point-max))))
    (org-splitter str "^\\*")))
