Context: Joe Armstrong's "Programming Erlang", 2nd edition, chapter 16 on files, page 256; the example parses URLs out of a binary.
The steps suggested (after writing the code for the scavenge_urls module) are these:
B = socket_examples:nano_get_url("www.erlang.org"),
L = scavenge_urls:bin2urls(B),
scavenge_urls:urls2htmlFile(L,"gathered.html").
And that fails (subtly) - the list L ends up being empty. Running the first step on its own reveals something strange: it does return a binary, but not the binary I was looking for:
9> B.
<<"HTTP/1.1 404 Not Found\r\nServer: nginx\r\nDate: Sun, 19 Nov 2017 01:57:07 GMT\r\nContent-Type: text/html; charset=UTF-8\r\n"...>>
The 404 status line shows that this is where the problem lies.
Yet in the browser all's good with the mothership! I was able to complete the exercise by replacing the call to socket_examples:nano_get_url/1 with curl-ing the same URL, dumping the output to a file, and reading it back with file:read_file/1. The next steps all ran fine.
Peeking inside the socket_examples module, I see this:
nano_get_url(Host) ->
    {ok, Socket} = gen_tcp:connect(Host, 80, [binary, {packet, 0}]), %% (1)
    ok = gen_tcp:send(Socket, "GET / HTTP/1.0\r\n\r\n"),             %% (2)
    receive_data(Socket, []).

receive_data(Socket, SoFar) ->
    receive
        {tcp, Socket, Bin} ->                  %% (3)
            receive_data(Socket, [Bin|SoFar]);
        {tcp_closed, Socket} ->                %% (4)
            list_to_binary(reverse(SoFar))     %% (5)
    end.
Nothing looks suspicious. First it establishes the connection, next it fires a GET, and then it receives the response. I've never before had to explicitly connect first and fire a GET second; my HTTP client libraries hid that from me. So maybe I don't know what to look for... and I sure trust Joe's code doesn't have any glaring mistakes! =) Still, the lines marked (3), (4) and (5) aren't something I fully understand.
So, any ideas, fellow Erlangers?
Thanks a bunch!
The problem is not Erlang. It looks like the server running erlang.org requires a Host header as well:
$ nc www.erlang.org 80
GET / HTTP/1.0
HTTP/1.1 404 Not Found
Server: nginx
Date: Sun, 19 Nov 2017 05:51:39 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 162
Connection: close
Vary: Accept-Encoding
<html>
<head><title>404 Not Found</title></head>
<body bgcolor="white">
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
$ nc www.erlang.org 80
GET / HTTP/1.0
Host: www.erlang.org
HTTP/1.1 200 OK
Server: nginx
Date: Sun, 19 Nov 2017 05:51:50 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 12728
Connection: close
Vary: Accept-Encoding
<!DOCTYPE html>
<html>
...
Your Erlang code also works once a Host header is added after the GET / HTTP/1.0\r\n request line:
1> Host = "www.erlang.org".
"www.erlang.org"
2> {ok, Socket} = gen_tcp:connect(Host, 80, [binary, {packet, 0}]).
{ok,#Port<0.469>}
3> ok = gen_tcp:send(Socket, "GET / HTTP/1.0\r\nHost: www.erlang.org\r\n\r\n").
ok
4> flush().
Shell got {tcp,#Port<0.469>,
<<"HTTP/1.1 200 OK\r\nServer: nginx\r\n...>>
Shell got {tcp_closed,#Port<0.469>}
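For completeness, here is a minimal sketch of nano_get_url/1 with the Host header baked in. It is only verified against the shell session above, and receive_data/2 is reused unchanged from the book (gen_tcp:send/2 accepts iodata, so the request can be built as a list):

nano_get_url(Host) ->
    {ok, Socket} = gen_tcp:connect(Host, 80, [binary, {packet, 0}]),
    %% HTTP/1.0 request with an explicit Host header, as tested above
    Request = ["GET / HTTP/1.0\r\nHost: ", Host, "\r\n\r\n"],
    ok = gen_tcp:send(Socket, Request),
    receive_data(Socket, []).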
Related
I'm having trouble deciphering an error message that I get when hitting a simple endpoint created with Cowboy. I created a simple app with Cowboy (https://github.com/overture8/cow_app) and then started it using rebar3 shell (not sure if this is correct?). Anyway, this is the error I get when hitting the endpoint:
Error in process <0.232.0> with exit value:
{[{reason,undef},
{mfa,{hello_handler,init,3}},
{stacktrace,
[{hello_handler,init,
[{tcp,http},
{http_req,#Port<0.7138>,ranch_tcp,keepalive,<0.232.0>,<<"GET">>,
'HTTP/1.1',
{{127,0,0,1},49651},
<<"127.0.0.1">>,undefined,8010,<<"/">>,undefined,<<>>,
undefined,[],
[{<<"host">>,<<"127.0.0.1:8010">>},
{<<"connection">>,<<"keep-alive">>},
{<<"cache-control">>,<<"max-age=0">>},
{<<"upgrade-insecure-requests">>,<<"1">>},
{<<"user-agent">>,
<<"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36">>},
{<<"accept">>,
<<"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8">>},
{<<"dnt">>,<<"1">>},
{<<"accept-encoding">>,<<"gzip, deflate, sdch">>},
{<<"accept-language">>,
<<"en-GB,en;q=0.8,en-US;q=0.6,fr;q=0.4">>}],
[{<<"connection">>,[<<"keep-alive">>]}],
undefined,[],waiting,<<>>,undefined,false,waiting,[],<<>>,
undefined},
[]],
[]},
.
.
.
Maybe I'm just doing something completely wrong - this is my first experience with using Erlang.
Any help would be greatly appreciated.
Your rebar.lock is out of sync with rebar.config: it pins cowboy to version 1.0.1, which requires handlers to export init/3 rather than the init/2 you have. That is what the error ... {reason,undef}, {mfa,{hello_handler,init,3}}, ... means.
To fix it, run rebar3 upgrade cowboy and then rebar3 shell. After doing that, the application works fine for me:
$ curl -i http://localhost:8010/
HTTP/1.1 200 OK
server: Cowboy
date: Wed, 07 Sep 2016 09:57:22 GMT
content-length: 13
content-type: text/plain
Hello Erlang!
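For reference, a hedged sketch of the two callback shapes (the state values and reply body here are illustrative, not taken from the repo). Cowboy 1.x calls init/3 and then handle/2, while Cowboy 2.x calls init/2 and lets it reply directly:

%% cowboy 1.x: init/3, with the actual work done in a separate handle/2
init(_Transport, Req, _Opts) ->
    {ok, Req, undefined}.

%% cowboy 2.x: init/2 can reply on its own
init(Req0, State) ->
    Req = cowboy_req:reply(200,
                           #{<<"content-type">> => <<"text/plain">>},
                           <<"Hello Erlang!">>, Req0),
    {ok, Req, State}.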
I am using Hackney, an Erlang REST client. I followed the steps provided in its README.md but I am getting the following error:
17> Method = get.
get
18> URL = <<"www.google.com">>.
<<"www.google.com">>
19> Headers = [].
[]
20> Payload = <<>>.
<<>>
21> Options = [].
[]
22> Test = hackney:request(Method, URL, Headers, Payload, Options).
{error,connect_timeout}
I fetched the same URL with curl and wget and both work. Is there an issue with Erlang's ssl, or with TLS? I have edited the question for better understanding.
EDIT 1 (using curl -vv google.com)
curl -vv google.com
* About to connect() to proxy <<ip>> port 8080 (#0)
* Trying <<ip>>... connected
* Connected to <<ip>> (<<ip>>) port 8080 (#0)
* Proxy auth using Basic with user '<<user>>'
> GET http://google.com HTTP/1.1
> Proxy-Authorization: <<proxy authorization>>
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: google.com
> Accept: */*
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Tue, 07 Jun 2016 03:49:43 GMT
< Expires: Thu, 07 Jul 2016 03:49:43 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 219
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
< Proxy-Connection: Keep-Alive
< Connection: Keep-Alive
< Age: 2223
<
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
* Connection #0 to host <<ip>> left intact
* Closing connection #0
Hackney does not pick up the system proxy settings automatically, so you have to take care of them yourself.
According to the documentation, you should provide the following options:
{proxy, {Host, Port}} %% if http proxy is used
{proxy_auth, {User, Password}}. %% if proxy requires authentication
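Putting that together, a sketch of the call (the proxy host, port and credentials below are placeholders, not values taken from your environment):

Method = get,
URL = <<"http://www.google.com">>,
%% Hypothetical proxy settings - substitute your own
Options = [{proxy, {"proxy.example.com", 8080}},
           {proxy_auth, {<<"user">>, <<"password">>}}],
{ok, StatusCode, RespHeaders, ClientRef} =
    hackney:request(Method, URL, [], <<>>, Options).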
What do you get when you use the httpc module to make a request from the Erlang shell?
First start inets:
inets:start().
Then try:
{ok, Response} = httpc:request("https://www.google.com").
or
{ok, Response} = httpc:request("http://www.google.com").
If both of these fail to connect, odds are the issue is not Hackney-related but an issue with your Erlang setup as a whole.
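For reference, a successful httpc call (assuming the default profile) returns a tuple you can match on; the body is a string unless you ask for a binary:

inets:start(),
%% {{Version, StatusCode, ReasonPhrase}, Headers, Body}
{ok, {{_Version, 200, _Reason}, Headers, Body}} =
    httpc:request("http://www.google.com"),
io:format("~p headers, ~p byte body~n", [length(Headers), length(Body)]).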
Your error is not a connect_timeout. You are getting a "no match of right hand side value" exception because your last command does not pattern match the tuple hackney returns.
Just change it to
{ok, StatusCode, RespHeaders, ClientRef} = hackney:request(Method, URL, Headers, Payload, Options).
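Once that matches, the response body can be read from the client reference with hackney:body/1 (a short sketch reusing the variables bound above):

{ok, Body} = hackney:body(ClientRef).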
If you use "Follow TCP Stream" in Wireshark you get a very nice display of the client/server dialogue: one colour is the client, the other is the server. Is there a way to dump this to ASCII without losing who said what?
For example:
server> 220 "Welcome to FTP service for foo-server."
client> USER baruser
server> 331 Please specify the password.
client> supersecret
I want to avoid screenshots. Adding "server>" and "client>" to the lines by hand is error-prone.
It may not be possible with the GUI version, but it's achievable with the console version tshark:
tshark -r capture.pcap -qz follow,tcp,ascii,<stream_id> > stream.txt
Replace <stream_id> with an actual stream ID (e.g. 1; see the note at the end for listing the IDs present in a capture):
tshark -r capture.pcap -qz follow,tcp,ascii,1 > stream.txt
This will output an ASCII file. How is it better than saving it directly from the GUI version? Well:
The data sent by the second node is prefixed with a tab to differentiate it from the data sent by the first node.
Since the output in ASCII mode may contain newlines, each section of output is preceded by its length plus a newline.
This makes the file easily parsable. Example output:
===================================================================
Follow: tcp,ascii
Filter: tcp.stream eq 1
Node 0: xxx.xxx.xxx.xxx:51343
Node 1: yyy.yyy.yyy.yyy:80
786
GET ...
Host: ...
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Accept: */*
User-Agent: ...
Referer: ...
Accept-Encoding: ...
Accept-Language: ...
Cookie: ...
235
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store
Pragma: no-cache
Content-Type: ...
Expires: -1
X-Request-Guid: ...
Date: Mon, 31 Aug 2015 10:55:46 GMT
Content-Length: 0
===================================================================
786\n is the length of the first output section, from Node 0.
\t235\n is the length of the response section, from Node 1, and so on.
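If you don't know which stream IDs exist in a capture beforehand, you can enumerate them first; this one-liner relies only on tshark's standard field output (adjust the file name to your capture):

tshark -r capture.pcap -T fields -e tcp.stream | sort -n | uniq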
I have a WebAPI service deployed on a server. There is also an MVC app for testing the API. One such tester runs on my dev machine and another copy (same version) runs on the server where the API resides.
The MVC tester app supports calling the API directly and also through a built-in 'proxy' (an HTTP handler) to bypass "Access-Control-Allow-Origin" errors. So, for example, if I run the tester app on my dev machine and want to receive data from the server, I must use the proxy. This setup works nicely: all of the calls go through and data is passed properly.
The problem occurs when I deliberately don't provide enough inputs and the API generates a "400 Bad Request" error. It only happens when I make the call from my dev machine to the remote machine; it works fine if I make the same call on the server.
I tested with Postman on the remote server:
post directly to API: POST to 177.77.77.77/v1/feature {JSON payload}
the response I get back contains proper headers and a body with a JSON object describing the error. The same thing happens when I send the exact same command through the server's proxy:
post through proxy: POST to 177.77.77.77/proxy/feature {JSON payload}
Both results are identical and this is the expected behavior, which I think supports the conclusion that both the proxy and the API are working.
When I go back to my dev machine and try the same calls, posting directly to the API gives the same result as above, but using the server's proxy gives something else. Here's the output from Fiddler for the working case (dev machine directly to the API):
POST http://177.77.77.77/v1/feature HTTP/1.1
Host: 177.77.77.77
Connection: keep-alive
Content-Length: 514
User-Agent: Mozilla/5.0 xx..
Cache-Control: no-cache
Origin: chrome-extension://xx-postman-xx
Authorization: Basic pwd=
Content-Type: application/json
Accept: */*
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8
{payload}
response:
HTTP/1.1 400 Bad Request
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Expires: -1
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Mon, 30 Mar 2015 15:48:52 GMT
Content-Length: 139
{"code":"INVALID_REQUEST","message":"proper error message"}
This is the correct behavior, with the expected JSON message body and correct Content-Length. If I make the request through the server's proxy instead, it becomes:
POST 177.77.77.77/proxy/feature HTTP/1.1
Host: 177.77.77.77
Connection: keep-alive
Content-Length: 514
User-Agent: Mozilla/5.0 xx..
Cache-Control: no-cache
Origin: chrome-extension://xx-postman-xx
Authorization: Basic pwd=
Content-Type: application/json
Accept: */*
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8
Cookie: appSettings=xx
{payload}
then the response I get is:
HTTP/1.1 400 Bad Request
Cache-Control: private
Content-Type: text/html
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Mon, 30 Mar 2015 15:51:37 GMT
Content-Length: 11
Bad Request
The body of the response is lost (replaced with the bare status text), the Content-Type changed from JSON to text/html, and Cache-Control became private.
The cookie in the non-working case plays no role; I tried removing it and the result is still wrong.
Why do you think this is happening? How can I troubleshoot this and find what portion of code (or maybe IIS setting?) is responsible for sending a different response to what seem to be the same requests? After all, everything works fine when I run only on dev or only on the remote machine. Could this be a permission issue or something in IIS on the server?
I tried remote-debugging the proxy code, stepping through it while sending POST commands (with the same payload) through Postman. In one scenario I ran Postman on the server and made requests to the server's proxy; in another I ran Postman on dev, also against the server's proxy. In VS I can see that both scenarios go through the same code path and that the API's response is the same.
Thanks for reading and any attempts to help will be greatly appreciated!
Added this to web.config:
<system.webServer>
<httpErrors existingResponse="PassThrough"></httpErrors>
</system.webServer>
Found the solution here (with existingResponse="PassThrough", IIS stops replacing the application's error response body with its own):
In IIS7.5 what module removes the body of a 400 Bad Request
I'm using a CXF client to communicate with a .NET web service running on IIS 6.
This request (anonymised):
POST /EngineWebService_v1/EngineWebService_v1.asmx HTTP/1.1
Content-Type: text/xml; charset=UTF-8
SOAPAction: "http://.../Report"
Accept: */*
User-Agent: Apache CXF 2.2.5
Cache-Control: no-cache
Pragma: no-cache
Host: uat9.gtios.net
Connection: keep-alive
Transfer-Encoding: chunked
followed by 7 chunks of 4089 bytes and one of 369 bytes, generates the following output after the first chunk has been sent:
HTTP/1.1 404 Not Found
Content-Length: 103
Date: Wed, 10 Feb 2010 13:00:08 GMT
Connection: Keep-Alive
Content-Type: text/html
Anyone know how to get IIS to accept chunked input for a POST?
Thanks
Chunked encoding should be enabled by default. You can check your setting with:
C:\Inetpub\AdminScripts>cscript adsutil.vbs get /W3SVC/AspEnableChunkedEncoding
The 404 makes me wonder if it's really a problem with the chunked encoding. Did you triple-check the URL?
You may well have URLScan running on your server. By default URLScan is configured to reject requests that have a Transfer-Encoding: header, and URLScan sends 404 errors rather than a proper server error, which makes the cause easy to miss:
UrlScan v3.1 failures result in 404 errors and not 500 errors.
Searching for 404 errors in your W3SVC log will include failures due
to UrlScan blocking.
You will need to look at the file located at (the path may differ) C:\Windows\System32\inetsrv\URLScan\URLScan.ini. Somewhere in there you will find a [DenyHeaders] section that will look a bit like this (it will probably have more headers listed):
[DenyHeaders]
transfer-encoding:
Remove transfer-encoding: from this list and it should fix your problem.