I've got a string like "foo%20bar" and I want "foo bar" out of it.
I know there's got to be a built-in function to decode a URL-encoded string (query string) in Emacs Lisp, but for the life of me I can't find it today, either in my lisp/ folder or with Google.
What is it called?
url-unhex-string
In my case I needed to do this interactively. The previous answers gave me the right functions to call; after that it was just a matter of wrapping them a little to make them interactive:
(defun func-region (start end func)
  "Run FUNC over the region between START and END in the current buffer."
  (save-excursion
    (let ((text (delete-and-extract-region start end)))
      ;; Insert at START so this also works when point is outside the region.
      (goto-char start)
      (insert (funcall func text)))))
(defun hex-region (start end)
  "URL-encode the region between START and END in the current buffer."
  (interactive "r")
  (func-region start end #'url-hexify-string))
(defun unhex-region (start end)
  "URL-decode the region between START and END in the current buffer."
  (interactive "r")
  (func-region start end #'url-unhex-string))
Add salt, I mean bind to keys according to taste.
Emacs is shipped with a URL library that provides a bunch of URL parsing functions—as huaiyuan and Charlie Martin already pointed out. Here is a small example that'd give you an idea how to use it:
(let ((url "http://www.google.hu/search?q=elisp+decode+url&btnG=Google+keres%E9s&meta="))
  ;; Return list of arguments and values
  (url-parse-query-string
   ;; Decode hexas
   (url-unhex-string
    ;; Retrieve argument list
    (url-filename
     ;; Parse URL, return a struct
     (url-generic-parse-url url)))))
=> (("meta" "") ("btnG" "Google+keresés") ("/search?q" "elisp+decode+url"))
I think it is better to rely on it rather than on Org-mode, since parsing URLs is its main purpose.
org-link-unescape does the job for very simple cases ... w3m-url-decode-string is better, but it isn't built in and the version I have locally isn't working with Emacs 23.
You can grab urlenc from MELPA and use urlenc:decode-region for a region or urlenc:decode-insert to insert your text interactively.
I think you're making it a little too hard: split-string will probably do most of what you want. For fancier stuff, have a look at the functions in url-expand.el; unfortunately, many of them don't have doc-strings, so you may have to read code.
url-generic-parse-url looks like a potential winner.
I am using the Emacs editor together with org-mode and evil-mode, mainly for text handling and documentation. Often a single topic has several different website URLs associated with it.
Example: I have a text snippet on how to install Emacs:
*** install emacs
emacs - I want to try org-mode. What's the shortest path from zero to typing? - Stack Overflow
https://stackoverflow.com/questions/4940680/i-want-to-try-org-mode-whats-the-shortest-path-from-zero-to-typing
Index of /gnu/emacs/windows/emacs-26
http://ftp.gnu.org/gnu/emacs/windows/emacs-26/emacs-26.3-x86_64.zip
Installation target:
file://C:\Lupo_Pensuite\MyApps\emacs
How to
file://C:\Lupo_Pensuite\MyDocs\howto.txt
Is it possible to select the region and have all the URLs opened in my default web browser, the file link opened by Windows Explorer, and the text file opened with its associated editor?
Or even better: Emacs is aware that the above-mentioned text snippet is actually an org-mode chapter, so regardless of where within that chapter the cursor is positioned, something like M-x open-all-links-in-chapter opens all the links mentioned in the current chapter.
Prio 1: does something like that already exist in emacs/org-mode/evil-mode?
Prio 2: do you know of an elisp function that can achieve this use case?
Environment: Cygwin under Windows 10, emacs 26.3, org-mode 9.1.9
It turns out that org-mode already has this built in!
Today I was browsing the documentation of org-mode, wondering how exactly C-c C-o works. That key combination calls the org-mode function org-open-at-point, which opens the URL at the cursor position (in Emacs-speak: point).
Now, if C-c C-o is pressed on a heading, all URLs beneath that heading are opened! Which is exactly what I asked for from the beginning. Thanks a lot, NickD, for your constructive contributions!
Here is the original help text:
When point is on a headline, display a list of every link in the entry, so it is possible to pick one, or all, of them.
Warning: used without thought, the following can bring your machine to its knees. I will add some more specific warnings at the end, but be careful!
The basic idea of the code below is to parse the buffer of an Org mode file, in order to get a parse tree of the buffer: that is done by org-element-parse-buffer. We can then use org-element-map to walk the parse tree and select only nodes of type link, applying a function to each one as we go. The function we apply, get-link, munges through the contents of the link node, extracting the type and path and returning a list of those two. Here's how it looks so far:
(defun get-link (x)
  (let* ((link (cadr x))
         (type (plist-get link :type))
         (path (plist-get link :path)))
    (if (or (string= type "http") (string= type "https"))
        (list type path))))
(defun visit-all-http-links ()
  (interactive)
  (let* ((parse-tree (org-element-parse-buffer))
         (links (org-element-map parse-tree 'link #'get-link)))
    links))
Note that I only keep http and https links: you may want to add extra types.
This already goes a long way towards getting you what you want. In fact, if you load the file with the two functions above, you can try it on the following sample Org mode file:
* foo
** foo 1
http://www.google.com
https://redhat.com
* bar
** bar 2
[[https://gnome.org][Gnome]] is a FLOSS project. So is Fedora: https://fedoraproject.org.
* Code
#+begin_src emacs-lisp :results value verbatim :wrap example
(visit-all-http-links)
#+end_src
#+RESULTS:
#+begin_example
(("http" "//www.google.com") ("https" "//redhat.com") ("https" "//gnome.org") ("https" "//fedoraproject.org"))
#+end_example
Evaluating the source block with C-c C-c produces the results shown.
Now all we need to do is convert each (TYPE PATH) pair in the result list to a real URL and then visit it - here's the final version of the code:
(defun get-link (x)
  "Assuming x is a LINK node in an Org mode parse tree,
return a list consisting of its type (e.g. \"http\")
and its path."
  (let* ((link (cadr x))
         (type (plist-get link :type))
         (path (plist-get link :path)))
    (if (or (string= type "http") (string= type "https"))
        (list type path))))
(defun format-url (x)
  "Take a (TYPE PATH) list and return a proper URL. Note
the following works for http- and https-type links, but
might need modification for other types."
  (format "%s:%s" (nth 0 x) (nth 1 x)))
(defun visit-all-http-links ()
  (interactive)
  (let* ((parse-tree (org-element-parse-buffer))
         (links (org-element-map parse-tree 'link #'get-link)))
    (mapcar #'browse-url (mapcar #'format-url links))))
We add a function format-url that does this: ("http" "//example.com") --> "http://example.com" and map it over the links list, producing a new list of URLs. Then we map the function browse-url (which is provided by Emacs) over the resulting list and watch the browser open them all.
WARNINGS:
If you have hundreds or thousands of links in the file, then you are going to create hundreds or thousands of tabs in your browser. Are you SURE your machine can take it?
If your links point to big objects, that's going to put another kind of memory pressure on your system. Are you SURE your machine can take it?
If your Org mode buffer is big, then org-element-parse-buffer can take a LONG time to process it. Moreover, even though there is a caching mechanism, it is not enabled by default because of bugs, so every time you execute the function you are going to parse the buffer AGAIN from scratch.
Every time you execute the function, you are going to open NEW tabs in your browser.
EDIT in response to questions in comments:
Q1: "visit-all-http-links opens all URLs in the file. My original question was, whether it is possible to open only the URLs which are being found in the current org-mode chapter."
A1: Doing just a region is a bit harder but possible, if you guarantee that the region is syntactically correct Org mode (e.g. a collection of headlines and their contents). You just write the region to a temporary buffer and then do what I did on the temp buffer instead of the original.
Here's the modified code using the visit-url function from Question 2:
(defun visit-all-http-links-in-region (beg end)
  (interactive "r")
  (let ((s (buffer-substring beg end)))
    (with-temp-buffer
      (insert s)
      (let* ((parse-tree (org-element-parse-buffer))
             (links (org-element-map parse-tree 'link #'get-link)))
        (mapcar #'visit-url (mapcar #'format-url links))))))
(defun visit-all-http-links ()
  (interactive)
  (visit-all-http-links-in-region (point-min) (point-max)))
Very lightly tested.
Q2: "Every time I execute your function with your example URLs, the URLs are being opened with a different sequence - is it possible to open the URLs in that very sequence which is found in the org file?"
A2: The links are found deterministically in the order that they occur in the file. But the moment you call browse-url, all bets are off, because the URL now belongs to the browser, which will try to open each URL it receives in a separate tab and using a separate thread - in other words asynchronously. You might try introducing a delay between calls, but there are no guarantees:
(defun visit-url (url)
  (browse-url url)
  (sit-for 1 t))
and then use visit-url instead of browse-url in visit-all-http-links.
4> abs(1).
1
5> X = abs.
abs
6> X(1).
** exception error: bad function abs
7> erlang:X(1).
1
8>
Is there any particular reason why I have to use the module name when I invoke a function with a variable? This isn't going to work for me because, well, for one thing it is just way too much syntactic garbage and makes my eyes bleed. For another thing, I plan on invoking functions out of a list, something like (off the top of my head):
[X(1) || X <- [abs, f1, f2, f3...]].
Attempting to tack on various module names here is going to make the verbosity go through the roof, when the whole point of what I am doing is to reduce verbosity.
EDIT: Look here: http://www.erlangpatterns.org/chain.html The guy has made some pipe-forward function. He is invoking functions the same way I want to above, but his code doesn't work when I try to use it. But from what I know, the guy is an experienced Erlang programmer - I saw him give some keynote or whatever at a conference (well I saw it online).
Did this kind of thing used to work but not anymore? Surely there is a way I can do what I want - invoke these functions without all the verbosity and boilerplate.
EDIT: If I am reading the documentation right, it seems to imply that my example at the top should work (section 8.6) http://erlang.org/doc/reference_manual/expressions.html
I know abs is an atom, not a function. [...] Why does it work when the module name is used?
The documentation explains that (slightly reorganized):
ExprM:ExprF(Expr1,...,ExprN)
Each of ExprM and ExprF must be an atom or an expression that evaluates to an atom. The function is said to be called by using the fully qualified function name.
ExprF(Expr1,...,ExprN)
ExprF must be an atom or evaluate to a fun. If ExprF is an atom, the function is said to be called by using the implicitly qualified function name.
When using fully qualified function names, Erlang expects atoms or expressions that evaluate to atoms. In other words, you have to bind X to an atom: X = atom. That's exactly what you provide.
But in the second form, Erlang expects either an atom or an expression that evaluates to a function. Notice that last word. In other words, if you do not use fully qualified function name, you have to bind X to a function: X = fun module:function/arity.
In the expression X = abs, abs is not a function but an atom. If you want to define a function instead, you can do it like this:
D = fun erlang:abs/1.
or so:
X = fun(X)->abs(X) end.
Try:
X = fun(Number) -> abs(Number) end.
Updated:
After looking at the discussion more, it seems like you're wanting to apply multiple functions to some input.
There are two projects that I haven't used personally, but that I've starred on GitHub, which may be what you're looking for.
Both of these projects use parse transforms:
fun_chain https://github.com/sasa1977/fun_chain
pipeline https://github.com/stolen/pipeline
Pipeline is unique because it uses a special syntax:
Result = [fun1, mod2:fun2, fun3] (Arg1, Arg2).
Of course, it could also be possible to write your own function to do this using a list of {module, function} tuples and applying the function to the previous output until you get the result.
I've been working with parsec and I have trouble debugging my code. For example, I can set a breakpoint in ghci, but I'm not sure how to see how much of the input has been consumed, or things like that.
Are there tools / guidelines to help with debugging parsec code?
This page might help.
Debug.Trace is your friend: its trace function lets you do what is essentially printf debugging. trace prints its first argument (a String) and then returns its second. So if you have something like
foo :: Show a => a -> a
foo = bar . quux
You can debug the 'value' of foo's parameter by changing foo to the following:
import Debug.Trace(trace)
foo :: Show a => a -> a
foo x = bar $ quux $ trace ("x is: " ++ show x) x
foo will now work the same way as it did before, but when you call foo 1 it will now print x is: 1 to stderr when evaluated.
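For a version that can be compiled as-is, here is a minimal sketch. The bodies of bar and quux are placeholders invented for illustration (the original does not say what they do), and the types are fixed to Int to keep things simple.

```haskell
import Debug.Trace (trace)

-- Placeholder stages (assumptions for illustration only).
quux :: Int -> Int
quux = (* 2)

bar :: Int -> Int
bar = (+ 1)

-- trace prints its first argument to stderr when the result is demanded,
-- then returns its second argument unchanged.
foo :: Int -> Int
foo x = bar $ quux $ trace ("x is: " ++ show x) x

main :: IO ()
main = print (foo 1)  -- stdout: 3 (stderr also shows "x is: 1")
```

Because of laziness, the trace message only appears when the traced value is actually demanded, so the order of messages may differ from the order in the source.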
For more in-depth debugging, you'll want to use GHCi's debugging commands. Specifically, it sounds like you're looking for the :force command, which forces the evaluation of a variable and prints it out. (The alternative is the :print command, which prints as much of the variable as has been evaluated, without evaluating any more.)
Note that :force is more helpful in figuring out the contents of a variable, but may also change the semantics of your program (if your program depends upon laziness).
A general GHCi debugging workflow looks something like this:
Use :break to set breakpoints
Use :list and :show context to check where you are in the code
Use :show bindings to check the variable bindings
Try using :print to see what's currently bound
Use :force if necessary to check your bindings
If you're trying to debug an infinite loop, it also helps to use
:set -fbreak-on-error
:trace myLoopingFunc x y
Then you can hit Ctrl-C during the loop and use :history to see what's looping.
You might be able to use the <?> operator in Text.Parsec.Prim to make better error messages for you and your users. There are some examples in Real World Haskell. If your parser has good sub-parts then you could set up a few simple tests (or use HUnit) to ensure they work separately as expected.
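As a rough sketch of both ideas, the following labels a small sub-parser with <?> and runs it on two inputs by hand. It assumes the parsec package is installed; the number parser and the "<test>" source name are made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical sub-parser: one or more digits, labelled for error messages.
number :: Parser Int
number = (read <$> many1 digit) <?> "a number"

main :: IO ()
main = do
  print (parse number "<test>" "123")  -- Right 123
  -- On failure the error should mention the label ("expecting a number")
  -- rather than the lower-level "expecting digit".
  print (parse number "<test>" "abc")
```

Checking small parsers like this in isolation tends to narrow down where a larger parser goes wrong before reaching for the debugger.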
Another useful trick:
_ <- many anyChar >>= fail
This will generate an error (Left) of:
unexpected end of input
the remaining 'string'
I think the parserTrace and parserTraced functions mentioned here http://hackage.haskell.org/package/parsec-3.1.13.0/docs/Text-Parsec-Combinator.html#g:1 do something similar to the above.
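A small sketch of how those two functions might be used, assuming parsec 3.1.12 or later (where they were added). The pair parser, its labels, and the input string are hypothetical.

```haskell
import Text.Parsec
import Text.Parsec.Combinator (parserTrace, parserTraced)
import Text.Parsec.String (Parser)

-- Hypothetical toy parser: a key, an '=' sign, then a number.
pair :: Parser (String, Int)
pair = do
  key <- many1 letter
  -- Prints the label and the not-yet-consumed input to stderr.
  parserTrace "after key"
  _ <- char '='
  -- Traces the remaining input, then runs the wrapped parser.
  val <- parserTraced "value" (many1 digit)
  return (key, read val)

main :: IO ()
main = print (parse pair "<example>" "width=42")  -- Right ("width",42)
```

The trace lines go to stderr, so they do not interfere with the parse result printed on stdout.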
Is there a good reason why the type of Prelude.read is
read :: Read a => String -> a
rather than returning a Maybe value?
read :: Read a => String -> Maybe a
Since the string might fail to be parseable Haskell, wouldn't the latter be more natural?
Or even an Either String a, where Left would contain the original string if it didn't parse, and Right the result if it did?
Edit:
I'm not trying to get others to write a corresponding wrapper for me. Just seeking reassurance that it's safe to do so.
Edit: As of GHC 7.6, readMaybe is available in the Text.Read module in the base package, along with readEither: http://hackage.haskell.org/packages/archive/base/latest/doc/html/Text-Read.html#v:readMaybe
Great question! The type of read itself isn't changing anytime soon because that would break lots of things. However, there should be a maybeRead function.
Why isn't there? The answer is "inertia". There was a discussion in '08 which got derailed by a discussion over "fail."
The good news is that folks were sufficiently convinced to start moving away from fail in the libraries. The bad news is that the proposal got lost in the shuffle. There should be such a function, although one is easy to write (and there are zillions of very similar versions floating around many codebases).
See also this discussion.
Personally, I use the version from the safe package.
Yeah, it would be handy to have a read function that returns Maybe. You can make one yourself:
readMaybe :: (Read a) => String -> Maybe a
readMaybe s = case reads s of
  [(x, "")] -> Just x
  _ -> Nothing
Apart from inertia and/or changing insights, another reason might be that it's aesthetically pleasing to have a function that can act as a kind of inverse of show. That is, you want that read . show is the identity (for types which are an instance of Show and Read) and that show . read is the identity on the range of show (i.e. show . read . show == show)
Having a Maybe in the type of read breaks the symmetry with show :: a -> String.
As @augustss pointed out, you can make your own safe read function. However, his readMaybe isn't completely consistent with read, as it doesn't ignore whitespace at the end of a string. (I made this mistake once, I don't quite remember the context)
Looking at the definition of read in the Haskell 98 report, we can modify it to implement a readMaybe that is perfectly consistent with read, and this is not too inconvenient because all the functions it depends on are defined in the Prelude:
readMaybe :: (Read a) => String -> Maybe a
readMaybe s = case [x | (x,t) <- reads s, ("","") <- lex t] of
  [x] -> Just x
  _ -> Nothing
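To make the difference between the two definitions concrete, here is a sketch that puts them side by side. The names readStrict and readLenient are invented here so that both can live in one file; the inputs are arbitrary examples.

```haskell
-- Accepts only a complete parse with nothing left over.
readStrict :: Read a => String -> Maybe a
readStrict s = case reads s of
  [(x, "")] -> Just x
  _         -> Nothing

-- Mirrors the Haskell 98 definition of read: trailing whitespace is allowed.
readLenient :: Read a => String -> Maybe a
readLenient s = case [x | (x, t) <- reads s, ("", "") <- lex t] of
  [x] -> Just x
  _   -> Nothing

main :: IO ()
main = do
  print (readStrict  "42"  :: Maybe Int)  -- Just 42
  print (readStrict  "42 " :: Maybe Int)  -- Nothing
  print (readLenient "42 " :: Maybe Int)  -- Just 42
  print (readLenient "4x2" :: Maybe Int)  -- Nothing
```

The only input on which the two disagree here is the one with trailing whitespace, which is exactly the case read itself accepts.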
This function (called readMaybe) is now in the base package, in the Text.Read module (as of base 4.6). Note that it is not exported by the Prelude, so it has to be imported explicitly.
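A short usage sketch of the library version, assuming base 4.6 or later; the input strings are made up for illustration.

```haskell
import Text.Read (readEither, readMaybe)

main :: IO ()
main = do
  print (readMaybe "42"  :: Maybe Int)           -- Just 42
  print (readMaybe "42 " :: Maybe Int)           -- Just 42
  print (readMaybe "4x2" :: Maybe Int)           -- Nothing
  print (readEither "4x2" :: Either String Int)  -- Left "Prelude.read: no parse"
```

Like read, the library readMaybe tolerates surrounding whitespace, and readEither carries an error message in the Left case.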
In Pascal, I have write and writeln. Apparently Lua's print is similar to writeln of Pascal. Do we have something similar to write of Pascal? How can consecutive print commands send their output to the same line?
print("Hello")
print("World")
Output:
Hello
World
I want to have this:
Hello World
Use io.write instead of print, which is meant for simple uses, like debugging, anyway.
Expanding on lhf's correct answer, the io library is preferred for production use.
The print function in the base library is implemented as a primitive capability. It allows for quick and dirty scripts that compute something and print an answer, with little control over its presentation. Its principal benefits are that it coerces all arguments to strings, separates the arguments in the output with tabs, and supplies a trailing newline.
Those advantages quickly become defects when detailed control of the output is required. For that, you really need to use io.write. If you mix print and io.write in the same program, you might trip over another defect. print uses the C stdout file handle explicitly. This means that if you use io.output to change the output file handle, io.write will do what you expect but print won't.
A good compromise can be to implement a replacement for print in terms of io.write. It could look as simple as this untested sample where I've tried to write clearly rather than optimally and still handle nil arguments "correctly":
local write = io.write
function print(...)
  local n = select("#",...)
  for i = 1,n do
    local v = tostring(select(i,...))
    write(v)
    if i~=n then write'\t' end
  end
  write'\n'
end
Once you are implementing your own version of print, then it can be tempting to improve it in other ways for your application. Using something with more formatting control than offered by tostring() is one good idea. Another is considering a separator other than a tab character.
As an alternative, just build up your string and then write it out with a single print.
You may not always have access to the io library.
You could use variables for "Hello" and "World". Then concatenate them later. Like this:
local h = "Hello"
local w = "World"
print(h..w)
It will be displayed, in this case, as "HelloWorld". But that's easy to fix. Hope this helped!
Adding on to @Searous's answer, try the following.
local h = "hello"
local w = "world"
print(h.." "..w)
You can concatenate both together, just concatenate a space between both variables.
local h = "Hello"
local w = "World!"
print(h, w)