Is there a way in Go to combine URL paths, similar to how we can combine file paths with path.Join()?
For example, see Combine absolute path and relative path to get a new absolute path.
When I use path.Join("http://foo", "bar"), I get http:/foo/bar.
See in Golang Playground.
The function path.Join expects a path, not a URL. Parse the URL to get a path and join with that path:
u, err := url.Parse("http://foo")
if err != nil { log.Fatal(err) }
u.Path = path.Join(u.Path, "bar.html")
s := u.String()
fmt.Println(s) // prints http://foo/bar.html
Use the url.JoinPath function in Go 1.19 or later:
s, err := url.JoinPath("http://foo", "bar.html")
if err != nil { log.Fatal(err) }
fmt.Println(s) // prints http://foo/bar.html
Use ResolveReference if you are resolving a URI reference from a base URL. This operation is different from a simple path join: an absolute path in the reference replaces the entire base path; the base path is trimmed back to the last slash before the join operation.
base, err := url.Parse("http://foo/quux.html")
if err != nil {
log.Fatal(err)
}
ref, err := url.Parse("bar.html")
if err != nil {
log.Fatal(err)
}
u := base.ResolveReference(ref)
fmt.Println(u.String()) // prints http://foo/bar.html
Notice how quux.html in the base URL does not appear in the resolved URL.
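The "trimmed back to the last slash" rule means a trailing slash on the base changes the result. A minimal sketch, assuming a hypothetical resolve helper and made-up URLs:

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// resolve is a hypothetical wrapper around ResolveReference that
// works on strings so the cases below are easy to compare.
func resolve(base, ref string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	r, err := url.Parse(ref)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(r).String(), nil
}

func main() {
	cases := []struct{ base, ref string }{
		{"http://foo/dir/", "bar.html"},    // http://foo/dir/bar.html
		{"http://foo/dir", "bar.html"},     // http://foo/bar.html ("dir" is trimmed)
		{"http://foo/dir/", "/bar.html"},   // http://foo/bar.html (absolute path wins)
		{"http://foo/dir/", "../bar.html"}, // http://foo/bar.html
	}
	for _, c := range cases {
		s, err := resolve(c.base, c.ref)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s + %s = %s\n", c.base, c.ref, s)
	}
}
```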
ResolveReference() in net/url package
The accepted answer will not work for relative URL paths containing file endings like .html or .img. The ResolveReference() function is the correct way to join URL paths in Go.
package main
import (
"fmt"
"log"
"net/url"
)
func main() {
u, err := url.Parse("../../..//search?q=dotnet")
if err != nil {
log.Fatal(err)
}
base, err := url.Parse("http://example.com/directory/")
if err != nil {
log.Fatal(err)
}
fmt.Println(base.ResolveReference(u))
}
A simple approach to this would be to trim the /'s you don't want and join. Here is an example function:
func JoinURL(base string, paths ...string) string {
p := path.Join(paths...)
return fmt.Sprintf("%s/%s", strings.TrimRight(base, "/"), strings.TrimLeft(p, "/"))
}
Usage would be
b := "http://my.domain.com/api/"
u := JoinURL(b, "/foo", "bar/", "baz")
fmt.Println(u)
This removes the need for checking/returning errors.
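For reference, here is the same JoinURL function as a complete program, with the output of the usage example and one caveat worth knowing. The second call is my own illustration: because the base is never parsed, ".." elements are not resolved against it.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// JoinURL is the function from above, unchanged.
func JoinURL(base string, paths ...string) string {
	p := path.Join(paths...)
	return fmt.Sprintf("%s/%s", strings.TrimRight(base, "/"), strings.TrimLeft(p, "/"))
}

func main() {
	b := "http://my.domain.com/api/"
	fmt.Println(JoinURL(b, "/foo", "bar/", "baz"))
	// http://my.domain.com/api/foo/bar/baz

	// Caveat: ".." is only cleaned within the joined elements,
	// so it can survive into the final string.
	fmt.Println(JoinURL("http://my.domain.com/api", "..", "admin"))
	// http://my.domain.com/api/../admin
}
```

So this is fine for trusted, simple path segments, but anything user-supplied is probably better handled by parsing the URL first.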
Go 1.19 added a new function to the standard library that solves this very neatly.
u, err := url.JoinPath("http://host/foo", "bar/")
https://go.dev/play/p/g422ockBq0q?v=gotip
To join a URL with another URL or a path, there is URL.Parse():
func (u *URL) Parse(ref string) (*URL, error)
Parse parses a URL in the context of the receiver. The provided URL
may be relative or absolute. Parse returns nil, err on parse failure,
otherwise its return value is the same as ResolveReference.
func TestURLParse(t *testing.T) {
baseURL, _ := url.Parse("http://foo/a/b/c")
url1, _ := baseURL.Parse("d/e")
require.Equal(t, "http://foo/a/b/d/e", url1.String())
url2, _ := baseURL.Parse("../d/e")
require.Equal(t, "http://foo/a/d/e", url2.String())
url3, _ := baseURL.Parse("/d/e")
require.Equal(t, "http://foo/d/e", url3.String())
}
I wrote this utility function that works for my purposes:
func Join(basePath string, paths ...string) (*url.URL, error) {
u, err := url.Parse(basePath)
if err != nil {
return nil, fmt.Errorf("invalid url: %w", err)
}
p2 := append([]string{u.Path}, paths...)
result := path.Join(p2...)
u.Path = result
return u, nil
}
https://play.golang.org/p/-QNVvyzacMM
I'm writing a little web crawler, and a lot of the links on sites I'm crawling are relative (so they're /robots.txt, for example). How do I convert these relative URLs to absolute URLs (so /robots.txt => http://google.com/robots.txt)? Does Go have a built-in way to do this?
Yes, the standard library can do this with the net/url package. Example (from the standard library):
package main
import (
"fmt"
"log"
"net/url"
)
func main() {
u, err := url.Parse("../../..//search?q=dotnet")
if err != nil {
log.Fatal(err)
}
base, err := url.Parse("http://example.com/directory/")
if err != nil {
log.Fatal(err)
}
fmt.Println(base.ResolveReference(u))
}
Notice that you only need to parse the absolute URL once and then you can reuse it over and over.
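For a crawler that means parsing the page URL once and resolving every href against it. A rough sketch; the absolutize helper and the sample links are assumptions for illustration, not part of the standard library:

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// absolutize is a hypothetical helper: it parses the page URL once
// and resolves each href against it. Hrefs that fail to parse are skipped.
func absolutize(page string, hrefs []string) ([]string, error) {
	base, err := url.Parse(page)
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(hrefs))
	for _, h := range hrefs {
		ref, err := url.Parse(h)
		if err != nil {
			continue
		}
		out = append(out, base.ResolveReference(ref).String())
	}
	return out, nil
}

func main() {
	links, err := absolutize("http://google.com/search/", []string{
		"/robots.txt",              // absolute path
		"about",                    // relative to the page
		"https://example.org/x",    // already absolute, left alone
		"//cdn.example.com/lib.js", // scheme-relative
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, l := range links {
		fmt.Println(l)
	}
	// http://google.com/robots.txt
	// http://google.com/search/about
	// https://example.org/x
	// http://cdn.example.com/lib.js
}
```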
Building on @Not_a_Golfer's solution:
You can also use the base URL's Parse method to resolve a relative or absolute URL.
package main
import (
"fmt"
"log"
"net/url"
)
func main() {
// parse only base url
base, err := url.Parse("http://example.com/directory/")
if err != nil {
log.Fatal(err)
}
// and then use it to parse relative URLs
u, err := base.Parse("../../..//search?q=dotnet")
if err != nil {
log.Fatal(err)
}
fmt.Println(u.String())
}
Try it on Go Playground.
I think you are looking for the ResolveReference method.
package main
import (
"fmt"
"log"
"net/url"
)
func main() {
u, err := url.Parse("../../..//search?q=dotnet")
if err != nil {
log.Fatal(err)
}
base, err := url.Parse("http://example.com/directory/")
if err != nil {
log.Fatal(err)
}
fmt.Println(base.ResolveReference(u))
}
// gives: http://example.com/search?q=dotnet
I use it for my crawler as well and it works like a charm!
I need to send an HTTP request to https://some-domain.com/getsomething/?id=myID
I have a URL and need to add a query parameter to it. Here is my Go code:
baseUrl := "https://some-domain.com"
relativeUrl := "/getsomething/"
url, _ := url.Parse(baseUrl)
url.Path = path.Join(url.Path, relativeUrl)
// add parameter to query string
queryString := url.Query()
queryString.Set("id", "1")
// add query to url
url.RawQuery = queryString.Encode()
// print it
fmt.Println(url.String())
In output I see this url: https://some-domain.com/getsomething?id=1
And this one is required: https://some-domain.com/getsomething/?id=1
You can see that there is no / character before ?.
Do you know how to fix it without manual string manipulations?
https://play.golang.org/p/HsiTzHcvlQ
You can use ResolveReference.
package main
import (
"fmt"
"log"
"net/url"
)
func main() {
relativeUrl := "/getsomething/"
u, err := url.Parse(relativeUrl)
if err != nil {
log.Fatal(err)
}
queryString := u.Query()
queryString.Set("id", "1")
u.RawQuery = queryString.Encode()
baseUrl := "https://some-domain.com"
base, err := url.Parse(baseUrl)
if err != nil {
log.Fatal(err)
}
fmt.Println(base.ResolveReference(u))
}
https://play.golang.org/p/BIU29R_XBM
I'm generating some source code using the templates package (is there a better method?), and as part of the testing I need to check whether the output matches the expected source code.
I tried a string comparison, but it fails due to the extra spaces / new lines generated by the templates package. I've also tried format.Source with no success. (FAIL)
I tried to parse the AST of both sources (see below), but the ASTs don't match either, even though the code is basically the same except for the new lines / spaces. (FAIL)
package main
import (
"fmt"
"go/parser"
"go/token"
"reflect"
)
func main() {
stub1 := `package main
func myfunc(s string) error {
return nil
}`
stub2 := `package main
func myfunc(s string) error {
return nil
}`
fset := token.NewFileSet()
r1, err := parser.ParseFile(fset, "", stub1, parser.AllErrors)
if err != nil {
panic(err)
}
fset = token.NewFileSet()
r2, err := parser.ParseFile(fset, "", stub2, parser.AllErrors)
if err != nil {
panic(err)
}
if !reflect.DeepEqual(r1, r2) {
fmt.Printf("e %v, r %s, ", r1, r2)
}
}
Well, one simple way to achieve this is to use the go/printer library, that gives you better control of output formatting, and is basically like running gofmt on the source, normalizing both trees:
package main
import (
"fmt"
"go/parser"
"go/token"
"go/printer"
//"reflect"
"bytes"
)
func main() {
stub1 := `package main
func myfunc(s string) error {
return nil
}`
stub2 := `package main
func myfunc(s string) error {
return nil
}`
fset1 := token.NewFileSet()
r1, err := parser.ParseFile(fset1, "", stub1, parser.AllErrors)
if err != nil {
panic(err)
}
fset2 := token.NewFileSet()
r2, err := parser.ParseFile(fset2, "", stub2, parser.AllErrors)
if err != nil {
panic(err)
}
// we create two output buffers for each source tree
out1 := bytes.NewBuffer(nil)
out2 := bytes.NewBuffer(nil)
// we use the same printer config for both
conf := &printer.Config{Mode: printer.TabIndent | printer.UseSpaces, Tabwidth: 8}
// print to both outputs
if err := conf.Fprint(out1, fset1, r1); err != nil {
panic(err)
}
if err := conf.Fprint(out2, fset2, r2); err != nil {
panic(err)
}
// they should be identical!
if string(out1.Bytes()) != string(out2.Bytes()) {
panic(string(out1.Bytes()) +"\n" + string(out2.Bytes()))
} else {
fmt.Println("A-OKAY!")
}
}
Of course this code needs to be refactored to not look as stupid. Another approach is instead of using DeepEqual, create a tree comparison function yourself, that skips irrelevant nodes.
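One cheap way to approximate such a tree comparison, without writing a full walker, is to dump both ASTs with ast.Fprint and a field filter that leaves out every token.Pos field, then compare the dumps. This is only a sketch under my own assumptions: the names skipPositions, dump and sameCode are made up, comments are not parsed at all, and parser.SkipObjectResolution needs Go 1.17 or later.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"log"
	"reflect"
)

var posType = reflect.TypeOf(token.NoPos)

// skipPositions is an ast.FieldFilter that drops all token.Pos fields,
// so the dump no longer depends on where the code sits in the file.
func skipPositions(name string, v reflect.Value) bool {
	return v.Type() != posType
}

// dump parses src and returns a position-free textual dump of its AST.
func dump(src string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "", src, parser.SkipObjectResolution)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := ast.Fprint(&buf, nil, f, skipPositions); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// sameCode reports whether a and b parse to the same tree once
// positions are ignored.
func sameCode(a, b string) (bool, error) {
	da, err := dump(a)
	if err != nil {
		return false, err
	}
	db, err := dump(b)
	if err != nil {
		return false, err
	}
	return da == db, nil
}

func main() {
	stub1 := "package main\nfunc myfunc(s string) error {\n\treturn nil\n}"
	stub2 := "package main\n\n\nfunc   myfunc( s string )   error {\n\n   return nil\n\n}\n"
	same, err := sameCode(stub1, stub2)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(same)
}
```

Unlike the printer approach, this does not care about blank lines inside function bodies, because blank lines only show up in the tree as position differences.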
This was easier than I thought. All I had to do was remove the empty lines (after formatting). Below is the code.
package main
import (
"fmt"
"go/format"
"strings"
)
func main() {
// stub1 and stub2 are the two source strings from the question.
a, err := fmtSource(stub1)
if err != nil {
panic(err)
}
b, err := fmtSource(stub2)
if err != nil {
panic(err)
}
if a != b {
fmt.Printf("a %v, \n b %v", a, b)
}
}
func fmtSource(source string) (string, error) {
if !strings.Contains(source, "package") {
source = "package main\n" + source
}
b, err := format.Source([]byte(source))
if err != nil {
return "", err
}
// cleanLine collapses each run of whitespace into a single space and trims both ends
cleanLine := func(s string)string{
sa := strings.Fields(s)
return strings.Join(sa, " ")
}
lines := strings.Split(string(b), "\n")
n := 0
for _, line := range lines {
// Keep only non-empty lines, with their whitespace normalized.
if line != "" {
lines[n] = cleanLine(line)
n++
}
}
lines = lines[:n]
// Add final "" entry to get trailing newline from Join.
if n > 0 && lines[n-1] != "" {
lines = append(lines, "")
}
// Make it pretty
b, err = format.Source([]byte(strings.Join(lines, "\n")))
if err != nil {
return "", err
}
return string(b), nil
}
It seems that URL does not support matrix parameters
// From net/url
type URL struct {
Scheme string
Opaque string // encoded opaque data
User *Userinfo // username and password information
Host string // host or host:port
Path string
RawQuery string // encoded query values, without '?'
Fragment string // fragment for references, without '#'
}
Why?
How can I extract matrix parameters from a URL, and when should I use them instead of request parameters embedded in the request.URL.RawQuery part of the URL?
The parameters end up getting put in url.Path. Here's a function which can put them in the Query for you:
func ParseWithMatrix(u string) (*url.URL, error) {
parsed, err := url.Parse(u)
if err != nil {
return nil, err
}
if strings.Contains(parsed.Path, ";") {
q := parsed.Path[strings.Index(parsed.Path, ";")+1:]
m, err := url.ParseQuery(q)
if err != nil {
return nil, err
}
for k, vs := range parsed.Query() {
for _, v := range vs {
m.Add(k, v)
}
}
parsed.Path = parsed.Path[:strings.Index(parsed.Path, ";")]
parsed.RawQuery = m.Encode()
}
return parsed, nil
}
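One thing to watch for: since Go 1.17, url.ParseQuery rejects unescaped semicolons as separators, so the function above may return an error on newer toolchains when there is more than one matrix parameter. A sketch that splits on ";" by hand instead, assuming (like the function above) that everything after the first ";" in the path is parameters; splitMatrix is a hypothetical name and strings.Cut needs Go 1.18 or later:

```go
package main

import (
	"fmt"
	"log"
	"net/url"
	"strings"
)

// splitMatrix is a hypothetical helper: it returns the path without
// matrix parameters, plus the parameters found after the first ";".
func splitMatrix(raw string) (string, url.Values, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", nil, err
	}
	p, rest, found := strings.Cut(u.Path, ";")
	params := url.Values{}
	if found {
		for _, pair := range strings.Split(rest, ";") {
			if pair == "" {
				continue // tolerate ";;" and a trailing ";"
			}
			k, v, _ := strings.Cut(pair, "=")
			params.Add(k, v)
		}
	}
	return p, params, nil
}

func main() {
	p, params, err := splitMatrix("http://example.com/cars;color=red;year=2012?sort=asc")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(p)                   // /cars
	fmt.Println(params.Get("color")) // red
	fmt.Println(params.Get("year"))  // 2012
}
```

The regular query string (sort=asc here) is left alone and is still available through the parsed URL's Query method if you need to merge the two.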
How do we read from a URL resource? I have used the https://github.com/Kissaki/rest.go API in the following example. Below is the example I use to write a string to the URL http://localhost:8080/cool
But now I need to retrieve the data from the URL. How do I read it back?
package main
import (
"fmt"
"net/http"
"github.com/nathankerr/rest.go"
)
type FileString struct {
value string
}
func (t *FileString) Index(w http.ResponseWriter) {
fmt.Fprintf(w, "%v", t.value)
}
func main(){
rest.Resource("cool", &FileString{value: "this is file"})
http.ListenAndServe(":3000", nil)
}
If you just want to fetch a file over HTTP, you could do something like this, I guess:
resp, err := http.Get("http://your.url.com/whatever.html")
check(err) // does some error handling
// read from resp.Body which is a ReadCloser
response, err := http.Get(URL) //use package "net/http"
if err != nil {
fmt.Println(err)
return
}
defer response.Body.Close()
// Copy data from the response to standard output
n, err1 := io.Copy(os.Stdout, response.Body) //use package "io" and "os"
if err1 != nil {
fmt.Println(err1)
return
}
fmt.Println("Number of bytes copied to STDOUT:", n)
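If you want the body as a string rather than copied to standard output, something along these lines should work. It is only a sketch: fetch and demo are hypothetical names, and the throwaway httptest server stands in for the real resource at /cool so the example runs without any external service. io.ReadAll needs Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// fetch is a hypothetical helper: it GETs url and returns the
// response body as a string, treating non-200 statuses as errors.
func fetch(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// demo starts a local test server that stands in for the real
// resource and reads the string back from it.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "this is file")
	}))
	defer srv.Close()
	return fetch(srv.URL + "/cool")
}

func main() {
	s, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(s) // this is file
}
```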