Make an HTTP request from Lua before HAProxy routes the request - lua

I have a Lua proxy that needs to route requests. Each request's destination is determined by the response to another HTTP request, made with a header taken from the initial request. My understanding is that HAProxy is event-driven software, so blocking system calls are absolutely forbidden, and my code is blocking because it performs an HTTP request.
I read about yielding after the request, but I think it won't help since the HTTP request has already started. The library I use for the request is https://github.com/JakobGreen/lua-requests#simple-requests
local requests = require('requests')

core.register_fetches('http_backend', function(txn)
    local dest = txn.sf:req_fhdr('X-dest')
    local url = "http://127.0.0.1:8080/service"
    local response = requests.get(url .. "/" .. dest)
    local json = response.json()
    return json.field
end)
How do I convert my code to be non-blocking?

You should consider using HAProxy's SPOE (Stream Processing Offload Engine), which was created exactly for these blocking scenarios.

I managed to do it using Lua. The thing I was doing wrong was using require('requests'), which is blocking. Ideally, never use an external Lua library with HAProxy. You have to deal with plain sockets, build the HTTP request yourself, and, very importantly, use HAProxy's core method core.tcp() instead of Lua sockets.
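For illustration, a minimal sketch of that approach (the service address, header name and JSON field come from the question; the hand-rolled header skipping and JSON field extraction are my own assumptions, not production code):

core.register_fetches('http_backend', function(txn)
    local dest = txn.sf:req_fhdr('X-dest')
    -- core.tcp() returns a socket that yields back to HAProxy's scheduler
    -- while waiting for the peer, instead of blocking the worker.
    local sock = core.tcp()
    sock:settimeout(2)
    if sock:connect('127.0.0.1', 8080) then
        sock:send('GET /service/' .. dest .. ' HTTP/1.1\r\n' ..
                  'Host: 127.0.0.1:8080\r\nConnection: close\r\n\r\n')
        sock:receive('*l')                -- status line, e.g. "HTTP/1.1 200 OK"
        local line = sock:receive('*l')
        while line and line ~= '' do      -- skip the remaining response headers
            line = sock:receive('*l')
        end
        local body = sock:receive('*a')   -- body, read until the server closes
        sock:close()
        -- crude field extraction; a bundled pure-Lua JSON decoder would be safer
        return body and body:match('"field"%s*:%s*"([^"]*)"')
    end
end)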

Related

Is there any way to get the redirect URI?

Background:
Suppose we have a WebAssembly (wasm) app built from .NET code.
This wasm uses HttpClient and HttpClientHandler to access a backend API at https://api.uri.
The actual backend API location might change over time (e.g. to https://api.uri/version-5), but there is still this fixed endpoint, which provides a redirection (3xx response) to the current location (which is in the same domain).
The API allows CORS, meaning it sends e.g. Access-Control-Allow-Origin: * headers in the responses.
In the normal (non-wasm) world, one just:
Plainly GETs https://api.uri with no additional headers (CORS-safe).
Retrieves the Location: header (containing e.g. https://api.uri/version-5) from the 3xx response as the final URI.
GETs/POSTs the final URI with the additional headers (as needed, e.g. custom, auth, etc.).
Note: In an ideal world, the redirection is handled transparently and the first two steps can just be omitted.
In the wasm world, though:
You are not allowed to (let the wasm/browser) send the OPTIONS pre-flight request to a redirecting endpoint (https://api.uri).
You can't send any non-CORS-safelisted headers when you want to avoid pre-flight requests (the reason for the two stages, plain and full, described above).
You can't see the Location: header value (like https://api.uri/version-5) when trying manual redirection (HttpClientHandler.AllowAutoRedirect = false), because the response is just artificially crafted with an HTTP status code of 0 and ReasonPhrase == "opaqueredirect" - an adaptation to the browser's Fetch API. What nonsense! #1...
You can't see the auto-followed Location: header value in response.RequestMessage?.RequestUri when using the (default) automatic redirection (HttpClientHandler.AllowAutoRedirect = true), because it still contains the original URI (https://api.uri) instead of the expected auto-followed one (https://api.uri/version-5). What nonsense! #2...
You can't send the full-blown request with all the headers and rely on automatic redirection, because it would trigger a pre-flight, which is still not allowed on the redirecting endpoint.
So, the obvious question is:
Is there ANY way to handle such a simple scenario from WebAssembly?
(and not crash on CORS)
GET https://api.uri => 3xx, Location: https://api.uri/version-5
GET https://api.uri/version-5, Authorization: Basic BlaBlaBase64= ; Custom: Cool-Value => 200
Note: All this has been discovered within the Uno Platform wasm head, but I believe it applies to any .NET wasm.
Note: I also guess that "disabled" CORS (on the request side, via Sec-Fetch-Mode: no-cors) wouldn't help either, as such a request is then not allowed to have additional headers/methods, right?

How to read POST requests in Lua?

I have this Telegram bot written in Lua that I am working on as a hobby for a language network, and so far I have been reading new messages via the getUpdates API call all the time. Now I want to rewrite it to use webhooks, but I have no experience with that whatsoever. I have googled but didn't find anything definite. I kinda feel that WSAPI is the library to use, but I am not sure. Moreover, I am not really sure I need any special library just for reading POST requests (which is all the Telegram bot API uses). I tried using sockets:
socket = require 'socket'
server = assert(socket.bind("*", 9000))

function read(client, pattern, prefix)
    local data, emsg, partial = client:receive(pattern, prefix)
    if data then
        return data
    end
    if partial and #partial > 0 then
        return partial
    end
    return nil, emsg
end

while true do
    local client = server:accept()
    client:settimeout(3)
    local msg, err = read(client, '*a')
    if not err then
        print(msg)
        client:close()
    end
end
The print(msg) here gives me the full POST request, including headers, which I can probably parse (the body is supposed to always be JSON). I am not really that familiar with HTTP requests, though, and I'm not sure I can just throw away everything that comes before the first {.
My setup is Lua 5.2, Ubuntu x64 16.04 and Nginx. What I need to do is to receive and read POST requests, nothing more.
TL;DR: is it okay to parse the POST request I receive from the code above or am I missing something, like a library that'd make my life easier?
Thanks!
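For what it's worth, here is a minimal sketch of how the raw request captured by the read() loop above could be split into headers and body, rather than cutting at the first { (it assumes the sender provides a Content-Length header, as the Telegram Bot API does):

-- Split a raw HTTP request into header block and body at the blank line.
-- Assumes the sender provides Content-Length (the Telegram Bot API does).
local function parse_request(raw)
    local headers, body = raw:match('^(.-)\r\n\r\n(.*)$')
    if not headers then
        return nil, 'incomplete request'
    end
    local length = tonumber(headers:match('[Cc]ontent%-[Ll]ength:%s*(%d+)'))
    if length then
        body = body:sub(1, length)  -- drop anything past the declared length
    end
    return body
end

-- usage with the loop above:
-- local body = parse_request(msg)   -- body is the JSON payload, if any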

Script on esp8266 using nodeMCU constantly runs

I am using LuaLoader, and I loaded the following script from the webserver example:
-- a simple http server
srv = net.createServer(net.TCP)
srv:listen(80, function(conn)
    conn:on("receive", function(sck, payload)
        print(payload)
        sck:send("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<h1> Hello, NodeMCU.</h1>")
    end)
    conn:on("sent", function(sck) sck:close() end)
end)
I saved it in a file, loaded it in LuaLoader and then did dofile. Whenever I send an HTTP request to the esp8266, it serves the webpage, even after running other scripts. From reading the script, it looks like it can only handle one HTTP request. Why does it keep handling new HTTP requests?
From reading the script it looks like it can only handle one HTTP request.
Not sure what you mean by that. Do you maybe refer to http://nodemcu.readthedocs.io/en/latest/en/modules/http/? That is about sending out requests, and it supports only one concurrent request.
Why does it keep handling new http requests?
The server keeps listening until you close it.
srv:close()
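If you only want it to answer a single request, a minimal sketch (built from the same NodeMCU calls as the example above) would close the server once the response has been sent:

-- serve exactly one request, then stop listening
srv = net.createServer(net.TCP)
srv:listen(80, function(conn)
    conn:on("receive", function(sck, payload)
        sck:send("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>Served once.</h1>")
    end)
    conn:on("sent", function(sck)
        sck:close()   -- close this client connection
        srv:close()   -- stop accepting further connections
    end)
end)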

Python: how to fetch a URL? (with improper response headers)

I want to build a small script in Python which needs to fetch a URL. The server is kind of crappy, though: it replies with pure ASCII, without any headers.
When I try:
import urllib.request
response = urllib.request.urlopen(url)
print(response.read())
I get an http.client.BadStatusLine: 100 error because this isn't a properly formatted HTTP response.
Is there another way to fetch a URL and get the raw content, without trying to parse the response?
Thanks
It's difficult to answer your direct question without a bit more information, not knowing exactly how the (web) server in question is broken.
That said, you might try using something a bit lower-level, a socket for example. Here's one way (python2.x style, and untested):
#!/usr/bin/env python
import socket
from urlparse import urlparse

def geturl(url, timeout=10, receive_buffer=4096):
    parsed = urlparse(url)
    try:
        host, port = parsed.netloc.split(':')
    except ValueError:
        host, port = parsed.netloc, 80
    sock = socket.create_connection((host, port), timeout)
    sock.sendall('GET %s HTTP/1.0\r\n\r\n' % parsed.path)
    response = [sock.recv(receive_buffer)]
    while response[-1]:
        response.append(sock.recv(receive_buffer))
    return ''.join(response)

print geturl('http://www.example.com/')  # <- the trailing / is needed if no other path element is present
And here's a stab at a python3.2 conversion (you may not need to decode from bytes, if writing the response to a file for example):
#!/usr/bin/env python
import socket
from urllib.parse import urlparse

ENCODING = 'ascii'

def geturl(url, timeout=10, receive_buffer=4096):
    parsed = urlparse(url)
    try:
        host, port = parsed.netloc.split(':')
    except ValueError:
        host, port = parsed.netloc, 80
    sock = socket.create_connection((host, port), timeout)
    request = 'GET %s HTTP/1.0\r\n\r\n' % parsed.path
    sock.sendall(bytes(request, ENCODING))
    response = [sock.recv(receive_buffer)]
    while response[-1]:
        response.append(sock.recv(receive_buffer))
    return ''.join(r.decode(ENCODING) for r in response)

print(geturl('http://www.example.com/'))
HTH!
Edit: You may need to adjust what you put in the request, depending on the web server in question. Guanidene's excellent answer provides several resources to guide you on that path.
What you need to do in this case is send a raw HTTP request using sockets.
You would need to do a bit of low-level network programming using the socket Python module in this case. (Network sockets return all the information sent by the server as it is, so you can interpret the response however you wish. For example, the HTTP protocol frames requests and responses in terms of standard methods and headers - GET, POST, HEAD, etc. The high-level urllib module hides this header information from you and just returns the data.)
You also need some basic information about HTTP headers. For your case, you just need to know about the GET HTTP request. See its definition here - http://djce.org.uk/dumprequest, and see an example of it here - http://en.wikipedia.org/wiki/HTTP#Example_session. (If you wish to capture live traces of HTTP requests sent from your browser, you would need packet-sniffing software like Wireshark.)
Once you know the basics of the socket module and HTTP headers, you can go through this example - http://coding.debuntu.org/python-socket-simple-tcp-client - which shows how to send an HTTP request over a socket to a server and read the reply back. You can also refer to similar questions on SO.
(You can google python socket http to get more examples.)
(Tip: I am not a Java fan, but still, if you can't find enough convincing examples on this topic for Python, try finding them for Java and then translate them to Python accordingly.)
urllib.urlretrieve('http://google.com/abc.jpg', 'abc.jpg')

Supporting the "Expect: 100-continue" header with ASP.NET MVC

I'm implementing a REST API using ASP.NET MVC, and a little stumbling block has come up in the form of the Expect: 100-continue request header for requests with a POST body.
RFC 2616 states that:
Upon receiving a request which includes an Expect request-header field with the "100-continue" expectation, an origin server MUST either respond with 100 (Continue) status and continue to read from the input stream, or respond with a final status code. The origin server MUST NOT wait for the request body before sending the 100 (Continue) response. If it responds with a final status code, it MAY close the transport connection or it MAY continue to read and discard the rest of the request. It MUST NOT perform the requested method if it returns a final status code.
This sounds to me like I need to make two responses to the request, i.e. it needs to immediately send an HTTP 100 Continue response, then continue reading from the original request stream (i.e. HttpContext.Request.InputStream) without ending the request, and finally send the resultant status code (for the sake of argument, let's say it's a 204 No Content result).
So, questions are:
Am I reading the specification right, that I need to make two responses to a request?
How can this be done in ASP.NET MVC?
w.r.t. (2) I have tried using the following code before proceeding to read the input stream...
HttpContext.Response.StatusCode = 100;
HttpContext.Response.Flush();
HttpContext.Response.Clear();
...but when I try to set the final 204 status code I get the error:
System.Web.HttpException: Server cannot set status after HTTP headers have been sent.
The .NET Framework by default sends the Expect: 100-continue header for every HTTP 1.1 POST. This behavior can be controlled programmatically per request via the System.Net.ServicePoint.Expect100Continue property, like so:
HttpWebRequest httpReq = GetHttpWebRequestForPost();
httpReq.ServicePoint.Expect100Continue = false;
It can also be globally controlled programmatically:
System.Net.ServicePointManager.Expect100Continue = false;
...or globally through configuration:
<system.net>
  <settings>
    <servicePointManager expect100Continue="false"/>
  </settings>
</system.net>
Thank you Lance Olson and Phil Haack for this info.
100-continue should be handled by IIS. Is there a reason why you want to do this explicitly?
IIS handles the 100.
That said, no, it's not two responses. In HTTP, when Expect: 100-continue comes in as part of the message headers, the client should wait until it receives the response before sending the content.
Because of the way ASP.NET is architected, you have very little control over the output stream. Any data that gets written to the stream is automatically put into a 200 response with chunked encoding whenever you flush, whether you're in buffered mode or not.
Sadly, all this stuff is hidden away in internal methods all over the place, and the result is that if you rely on ASP.NET, as MVC does, you're pretty much unable to bypass it.
Wait till you try to access the input stream in a non-buffered way. A whole load of pain.
Seb
