Erlang Twitter streaming client - handling of chunked responses - twitter

I'm connecting to the Twitter firehose using the Erlang code at the bottom.
Now I am receiving a stream of data fine, but I am wondering whether the Erlang httpc client is handling the CRLF ('\r\n') chunked-response delimiter properly, because I was expecting a series of calls to the three handle_info clauses in turn (stream_start, stream*, stream_end):
handle_info({http, {_RequestId, stream_start, _Headers}}, State) ->
    io:format("start~n"),
    {noreply, State};
handle_info({http, {_RequestId, stream, Data}}, State) ->
    io:format("~p~n", [Data]),
    {noreply, State};
handle_info({http, {_RequestId, stream_end, _Headers}}, State) ->
    io:format("end~n"),
    {noreply, State};
but instead what happens is that the 'stream_start' clause is called once at the outset, and then all subsequent data is handled by the 'stream' clause; 'stream_end' is never called.
However, when I look at the data chunks handled by the 'stream' clause, a very large number have the CRLF delimiter as a suffix.
So I am wondering whether the httpc client is handling chunk termination correctly, or maybe I haven't configured it properly?
TIA
%% https://dev.twitter.com/streaming/reference/post/statuses/filter
-module(twitter_streaming_demo).
-behaviour(gen_server).
%% API.
-export([start_link/5]).
%% gen_server.
-export([init/1]).
-export([handle_call/3]).
-export([handle_cast/2]).
-export([handle_info/2]).
-export([terminate/2]).
-export([code_change/3]).
-define(METHOD, "POST").
-define(URL, "https://stream.twitter.com/1.1/statuses/filter.json").
-define(APPLICATION_FORM_URLENCODED, "application/x-www-form-urlencoded").
-define(TRACK, "track").
-record(state, {consumer,
                tokens,
                url,
                query,
                request_id}).

%% API.

%% twitter_streaming_demo:start_link("", "", "", "", "").
start_link(ConsumerKey, ConsumerSecret, Token, TokenSecret, Query) ->
    gen_server:start_link(?MODULE, [ConsumerKey, ConsumerSecret, Token, TokenSecret, Query], []).

%% gen_server.

init([ConsumerKey, ConsumerSecret, Token, TokenSecret, Query]) ->
    Consumer = {ConsumerKey, ConsumerSecret, hmac_sha1},
    Tokens = {Token, TokenSecret},
    {ok, #state{consumer=Consumer,
                tokens=Tokens,
                url=?URL,
                query=Query}, 0}.

handle_call(_Request, _From, State) ->
    {reply, ignored, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(timeout, #state{consumer=Consumer, tokens=Tokens, url=Url, query=Query}=State) ->
    {Token, TokenSecret} = Tokens,
    Params = [{?TRACK, Query}],
    Signed = oauth:sign("POST", Url, Params, Consumer, Token, TokenSecret),
    {AuthorizationParams, _QueryParams} =
        lists:partition(fun({K, _}) -> lists:prefix("oauth_", K) end, Signed),
    Request = {oauth:uri(Url, []), %% it's a POST request :-)
               [oauth:header(AuthorizationParams)],
               ?APPLICATION_FORM_URLENCODED,
               ?TRACK ++ "=" ++ Query},
    {ok, RequestId} = httpc:request(post, Request, [], [{sync, false}, {stream, self}]),
    {noreply, State#state{request_id=RequestId}};
handle_info({http, {_RequestId, stream_start, _Headers}}, State) ->
    io:format("start~n"),
    {noreply, State};
handle_info({http, {_RequestId, stream, Data}}, State) ->
    io:format("~p~n", [Data]),
    {noreply, State};
handle_info({http, {_RequestId, stream_end, _Headers}}, State) ->
    io:format("end~n"),
    {noreply, State};
handle_info(Info, State) ->
    io:format("~p~n", [Info]),
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

It looks like the httpc client in OTP should handle chunked encoding (bugs notwithstanding).
CRLF sequences are valid inside a chunk as long as they fall within the chunk's declared length, i.e. they are part of the chunk data rather than the chunk framing. Perhaps those are the line feeds you are seeing? Example from Wikipedia:
e\r\n
in\r\n\r\nchunks.\r\n
Here, the declared length (hex e, i.e. 14) encompasses the CRLFs between "in" and "chunks." (each CRLF is 2 bytes); the trailing CRLF that terminates the chunk is never counted towards the length.
As for the Twitter API (which I'm not familiar with), it may simply never terminate the stream and just keep sending chunks forever, which would explain why stream_end is never called.
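If that is the case, the CRLFs you are seeing are most likely part of the decoded payload itself (the Twitter streaming API reportedly delimits individual messages with '\r\n'), not leftover chunk framing. Below is a minimal sketch of how the 'stream' clause could buffer the data and split it on that delimiter; it assumes a buffer field (initialised to <<>>) is added to #state{} and that split_messages/1 is a new helper, neither of which is in the original module:

%% Sketch: accumulate decoded chunk data and split complete messages on CRLF.
handle_info({http, {_RequestId, stream, Data}}, #state{buffer=Buffer}=State) ->
    {Messages, Rest} = split_messages(<<Buffer/binary, Data/binary>>),
    lists:foreach(fun(Msg) -> io:format("message: ~p~n", [Msg]) end, Messages),
    {noreply, State#state{buffer=Rest}};

%% Split a binary on "\r\n"; the last element is the (possibly empty)
%% unterminated remainder that stays in the buffer. Empty pieces
%% (keep-alive newlines) are dropped.
split_messages(Bin) ->
    Parts = binary:split(Bin, <<"\r\n">>, [global]),
    {Messages, [Rest]} = lists:split(length(Parts) - 1, Parts),
    {[M || M <- Messages, M =/= <<>>], Rest}.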

Related

process_not_owner_of_odbc_connection :ODBC connection

Has anyone run across this error when querying with Erlang odbc:
{:error, :process_not_owner_of_odbc_connection}
I am writing a connection pool, and the :odbc.connect('conn string') call is run inside the GenServer with the returned pid put into the state. But when I get that pid back out of the GenServer, I get the above error when trying to run an :odbc.param_query with it. How are we supposed to create a connection pool if we can't use pids created in a GenServer?
Anyone have any ideas?
GenServer:
defmodule ItemonboardingApi.ConnectionPoolWorker do
  use GenServer

  def start_link([]) do
    IO.puts "Starting Connection Pool"
    GenServer.start_link(__MODULE__, nil, [])
  end

  def init(_) do
    :odbc.start
    {:ok, pid} = :odbc.connect('DSN=;UID=;PWD=', [])
    IO.inspect pid
    {:ok, pid}
  end

  def handle_call(:get_connection, _from, state) do
    IO.puts "Inspecting State:"
    IO.inspect state
    IO.puts("process #{inspect(self())} getting odbc connection")
    {:reply, state, state}
  end
end
Calling Function:
def get_standards_like(standard) do
  :poolboy.transaction(
    :connection_pool,
    fn pid ->
      connection = GenServer.call(pid, :get_connection)
      IO.inspect connection

      val =
        String.upcase(standard)
        |> to_char_list

      # Call Stored Proc Statement
      spstmt = '{call invlibr.usp_vm_getStandards (?)}'

      case :odbc.param_query(connection, spstmt, [
             {{:sql_char, 50}, [val]}
           ])
           |> map_to_type(ItemonboardingCore.Standard) do
        {:ok, mapped} ->
          {:ok, mapped}

        {:updated, affected_count} ->
          {:ok, affected_count}

        {:executed, o, a} ->
          {:ok, o, a}

        {:error, _} ->
          {:bad}
      end
    end
  )
end
I have confirmed that the pids in the GenServer are in fact the correct odbc pids and that GenServer.call returns one of them.
****EDIT****
Here is what I did to fix the issue. I didn't know that the process that created the connection has to be the process that runs the query. A slight change to my worker, passing the query into it, has fixed my issue. This is a rough first pass; a few things still need to be done to the worker.
defmodule ItemonboardingApi.ConnectionPoolWorker do
  use GenServer

  def start_link([]) do
    IO.puts "Starting Connection Pool"
    GenServer.start_link(__MODULE__, nil, [])
  end

  def init(_) do
    :odbc.start()
    {:ok, pid} = :odbc.connect('DSN=;UID=;PWD=', [])
    IO.inspect pid
    {:ok, pid}
  end

  def handle_call(:get_connection, _from, state) do
    {:reply, state, state}
  end

  def handle_call({:query, %{statement: statement, params: params}}, _from, state) do
    # TODO: Check if pid is alive and start if needed.
    {:reply, :odbc.param_query(state, to_charlist(statement), params), state}
  end
end
That's going to be a problem; as Kevin pointed out, you can't transfer connection ownership for odbc, and the same goes for other drivers like mysql/otp.
If you want to use a connection pool, take a look at this instead: https://github.com/mysql-otp/mysql-otp-poolboy
Otherwise, you can use any pool, but the process that executes the SQL queries has to be the one that opened the connections.
Example in Erlang:
sql_query_priv(Conn, Conn_Params, {SQL, Params}) ->
    lager:debug("~p:sql_query_priv trying to execute query: ~p, with params: ~p, conn = ~p~n",
                [?MODULE, SQL, Params, Conn]),
    case Conn of
        null ->
            try mysql:start_link(Conn_Params) of
                {ok, NewConn} ->
                    lager:info("~p:sql_query_priv Connection to DB Restored", [?MODULE]),
                    try mysql:query(NewConn, SQL, Params) of
                        ok -> {ok, NewConn, []};
                        {ok, _Columns, Results} -> {ok, NewConn, Results};
                        {error, Reason} ->
                            lager:error("~p:sql_query_priv Connection To DB Failed, Reason: ~p~n",
                                        [?MODULE, {error, Reason}]),
                            exit(NewConn, normal),
                            {error, null, Reason}
                    catch
                        Exception:Reason ->
                            lager:error("~p:sql_query_priv Connection To DB Failed, Exception:~p Reason: ~p~n",
                                        [?MODULE, Exception, Reason]),
                            {error, null, {Exception, Reason}}
                    end;
                {error, Reason} ->
                    lager:error("~p:sql_query_priv Connection To DB Failed, Reason: ~p~n",
                                [?MODULE, {error, Reason}]),
                    {error, Conn, Reason}
            catch
                Exception:Reason ->
                    lager:error("~p:sql_query_priv Connection To DB Failed, Exception:~p Reason: ~p~n",
                                [?MODULE, Exception, Reason]),
                    {error, Conn, {Exception, Reason}}
            end;
        Conn ->
            try mysql:query(Conn, SQL, Params) of
                ok -> {ok, Conn, []};
                {ok, _Columns, Results} -> {ok, Conn, Results};
                {error, Reason} ->
                    try exit(Conn, normal) of
                        _Any -> ok
                    catch
                        _E:_R -> ok
                    end,
                    lager:error("~p:sql_query_priv Connection To DB Failed, Reason: ~p~n",
                                [?MODULE, {error, Reason}]),
                    {error, null, Reason}
            catch
                Exception:Reason ->
                    try exit(Conn, normal) of
                        _Any -> ok
                    catch
                        _E:_R -> ok
                    end,
                    lager:error("~p:sql_query_priv Connection To DB Failed, Exception:~p Reason: ~p~n",
                                [?MODULE, Exception, Reason]),
                    {error, null, {Exception, Reason}}
            end
    end.
According to the relevant documentation, an odbc connection is private to the GenServer process that created it:
Opens a connection to the database. The connection is associated with
the process that created it and can only be accessed through it. This
function may spawn new processes to handle the connection. These
processes will terminate if the process that created the connection
dies or if you call disconnect/1.
You can compare this to ETS tables that can be set to private, except that ETS tables can also be made public, which is not possible with odbc connections. As such, you should move your query code inside the GenServer that owns the connection.
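To make the rule concrete, here is a minimal Erlang sketch of the same pattern as the Elixir worker above: the gen_server that opened the connection also runs every query, so callers never touch the connection pid directly. The module name and connection string are placeholders.

%% Minimal sketch: the gen_server owns the odbc connection and runs
%% every query itself on behalf of its callers.
-module(odbc_worker_sketch).
-behaviour(gen_server).
-export([start_link/0, param_query/3]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Public API: ship the query to the owning process.
param_query(Worker, Statement, Params) ->
    gen_server:call(Worker, {param_query, Statement, Params}).

init([]) ->
    ok = application:ensure_started(odbc),
    {ok, Conn} = odbc:connect("DSN=;UID=;PWD=", []),
    {ok, Conn}.

%% The query runs here, inside the process that created the connection.
handle_call({param_query, Statement, Params}, _From, Conn) ->
    {reply, odbc:param_query(Conn, Statement, Params), Conn}.

handle_cast(_Msg, State) ->
    {noreply, State}.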

Can't get request body in onresponse hook

I want to log all requests along with their responses to the DB. I'm using hooks for that, but it looks like I can't get the request body in the 'onresponse' hook; it's always <<>>. In the 'onrequest' hook I can get the request body.
My hooks defined as:
request_hook(Req) ->
    %% All is OK: ReqBody contains what I sent:
    {ok, ReqBody, Req2} = cowboy_req:body(Req),
    io:format("request_hook: body = ~p", [ReqBody]),
    Req2.

response_hook(_Status, _Headers, _Body, Req) ->
    %% ReqBody is always <<>> at this point. Why?
    {ok, ReqBody, Req2} = cowboy_req:body(Req),
    io:format("response_hook: body = ~p", [ReqBody]),
    Req2.
Is this a bug in cowboy or normal behaviour?
I'm using the latest cowboy available at the time of writing this post (commit: aab63d605c595d8d0cd33646d13942d6cb372b60).
Recent versions of Cowboy (from v0.8.2, as far as I know) use the following approach to increase performance: cowboy_req:body(Req) returns the Body together with a NewReq structure that no longer carries the request body. In other words, this is normal behaviour and you are able to retrieve the request body only once.
Cowboy does not read the request body up front, as it can be huge; the body stays in the socket until it becomes necessary (i.e. until the cowboy_req:body/1 call).
Also, once you have retrieved the body in a hook, it is no longer available in the handler.
So if you want to implement logging and still have the body available in the handler, you can save the body to a shared location in the request hook and explicitly remove it afterwards.
request_hook(Req) ->
    %% limit max body length for security reasons;
    %% here we expect that the body is less than 80000 bytes
    {ok, Body, Req2} = cowboy_req:body(80000, Req),
    put(req_body, Body), %% put body into the process dict
    Req2.

response_hook(_RespCode, _RespHeaders, _RespBody, Req) ->
    ReqBody = get(req_body),
    io:format("response_hook: body = ~p", [ReqBody]),
    Req.

%% Need to clean up the body entry in the proc dict
%% since cowboy uses one process for several
%% requests in keepalive mode
terminate(_Reason, _Req, _St) ->
    put(req_body, undefined),
    ok.
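For completeness, the hooks themselves are installed through the protocol options when starting the listener. A sketch against the Cowboy 0.8.x-era API in use here (listener name, port, handler module, and dispatch rules are placeholders; later Cowboy versions changed this API):

%% Sketch: wiring the onrequest/onresponse hooks into the listener.
start() ->
    Dispatch = cowboy_router:compile([
        {'_', [{'_', my_handler, []}]}
    ]),
    {ok, _} = cowboy:start_http(my_http_listener, 100,
        [{port, 8080}],
        [{env, [{dispatch, Dispatch}]},
         {onrequest, fun request_hook/1},
         {onresponse, fun response_hook/4}]).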

Erlang: gen_tcp:recv() does not receive packet sent from client?

I modified this server to use gen_tcp:recv/2 in order to limit the number of bytes for a packet to 50. I commented out the line inet:setopts(Socket, [{active, once}]), because gen_tcp:recv is meant to be used with {active, false}. This is the client-side erl shell:
2> cp3:client().
exit
3>
and this is the server-side erl shell:
4> cp3:server().
Started Server:
<0.46.0>
Accept Server:
Pid <0.48.0>
Connection accepted
Accept Server:
Loop Server:
5>
I also wondered: how can I know that the socket has closed, given that gen_tcp:recv doesn't deliver a {tcp_closed, Socket} message?
-module(cp3).
-export([client/0, server/0, start/0, accept/1, enter_loop/1, loop/1]).

client() ->
    {ok, Socket} = gen_tcp:connect("localhost", 4001, [list, {packet, 0}]),
    ok = gen_tcp:send(Socket, "packet"),
    receive
        {tcp, Socket, String} ->
            io:format("Client received = ~p~n", [String]),
            io:format("Client result = ~p~n", [String]),
            gen_tcp:close(Socket)
    after 1000 ->
        exit
    end.

server() ->
    Pid = spawn(fun() -> start() end),
    Pid.

start() ->
    io:format("Started Server:~n"),
    {ok, Socket} = gen_tcp:listen(4001, [binary, {packet, 0}, {reuseaddr, true}, {active, false}]),
    accept(Socket).

accept(ListenSocket) ->
    io:format("Accept Server:~n"),
    case gen_tcp:accept(ListenSocket) of
        {ok, Socket} ->
            Pid = spawn(fun() ->
                io:format("Connection accepted ~n", []),
                enter_loop(Socket)
            end),
            io:format("Pid ~p~n", [Pid]),
            gen_tcp:controlling_process(Socket, Pid),
            Pid ! ack,
            accept(ListenSocket);
        Error ->
            exit(Error)
    end.

enter_loop(Socket) ->
    %% make sure to acknowledge owner rights transmission finished
    receive ack -> ok end,
    loop(Socket).

loop(Socket) ->
    %% set socket options to receive messages directly into itself
    %%inet:setopts(Socket, [{active, once}]),
    io:format("Loop Server:~n"),
    case gen_tcp:recv(Socket, 50) of
        {ok, Data} ->
            case Data of
                <<"packet">> ->
                    io:format("Server replying = ~p~n", [Data]),
                    gen_tcp:send(Socket, Data),
                    loop(Socket)
            end;
        {error, Reason} ->
            io:format("Error on socket ~p reason: ~p~n", [Socket, Reason])
    end.
I am not very clear about your question, but the above code does not work as you expect. Hope the following answers your problem. Your receive call, case gen_tcp:recv(Socket, 50) of, has one error: it waits until 50 bytes have been read. Check the documentation of gen_tcp:recv/2. Change the Length argument (the packet "packet" is only 6 bytes long) to 0 to receive all available bytes.
The Length value does not limit the size of the data; the server simply will not send anything back until it has received 50 bytes. Instead, you may need to accept whatever arrives and then check its size.
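Here is a minimal sketch of the loop with Length 0, receiving whatever is available and checking the size afterwards (the 50-byte limit check reflects what I assume the original intent was). It also shows the answer to your second question: in passive mode a closed socket shows up as {error, closed} from gen_tcp:recv/2 rather than as a {tcp_closed, Socket} message.

loop(Socket) ->
    io:format("Loop Server:~n"),
    %% Length 0: return as soon as any data is available.
    case gen_tcp:recv(Socket, 0) of
        {ok, Data} when byte_size(Data) =< 50 ->
            io:format("Server replying = ~p~n", [Data]),
            gen_tcp:send(Socket, Data),
            loop(Socket);
        {ok, Data} ->
            io:format("Packet too large: ~p bytes~n", [byte_size(Data)]),
            loop(Socket);
        {error, closed} ->
            %% Peer closed the connection; no {tcp_closed, Socket}
            %% message is delivered in passive mode.
            io:format("Socket ~p closed by peer~n", [Socket]);
        {error, Reason} ->
            io:format("Error on socket ~p reason: ~p~n", [Socket, Reason])
    end.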

problems while sending messages using ssl:send

I am writing some code which sends data over ssl sockets.
The sending part is inside a gen_server:call/3 as:
handle_call({send, Data}, _From, #state{socket=Socket} = State) ->
    Reply = case ssl:send(Socket, Data) of
                ok ->
                    ok;
                {error, Error} ->
                    {error, Error}
            end,
    {reply, Reply, State}.
The problem is that if I kill the application which behaves as the server at the other side of the connection, the result of the call is 'ok' but the Data is not sent. Does that mean that the socket is viewed as alive until {ssl_closed, S} is received by the process?
It was my mistake: the data is actually sent, but never retrieved by the peer.
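For reference, the asynchronous close notification arrives later as a message to the controlling process. A minimal sketch of handling it in the same gen_server (assuming it is the controlling process of the socket and the socket is in active mode):

%% Sketch: reacting to the peer closing the TLS connection.
handle_info({ssl_closed, Socket}, #state{socket=Socket} = State) ->
    %% ssl:send/2 may still have returned ok for data the peer never read;
    %% only this message tells us the connection is really gone.
    {stop, normal, State};
handle_info({ssl_error, Socket, Reason}, #state{socket=Socket} = State) ->
    {stop, {ssl_error, Reason}, State}.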

Messages received from port in erlang-sqlite3

Erlang-sqlite3 uses a port driver to connect with the SQLite database, and receives messages from the port:
wait_result(Port) ->
    receive
        {Port, Reply} ->
            % io:format("Reply: ~p~n", [Reply]),
            Reply;
        {error, Reason} ->
            io:format("Error: ~p~n", [Reason]),
            {error, Reason};
        _Else ->
            io:format("Else: ~p~n", [_Else]),
            _Else
    end.
I thought that messages from ports should look like this:
{Port,{data,Data}} Data is received from the external program.
{Port,closed} Reply to Port ! {Pid,close}.
{Port,connected} Reply to Port ! {Pid,{connect,NewPid}}
{'EXIT',Port,Reason} If the port has terminated for some reason.
So, when uncommenting the io:format line in the {Port, Reply} clause, I should expect to see {data, ...} for actual replies. I don't; instead I see this (for test.erl):
Reply: {ok,101}
Reply: [{columns,["name"]},{rows,[{<<"user">>}]}]
Reply: [{columns,["sql"]},
{rows,[{<<"CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, wage INTEGER)">>}]}]
Reply: {id,1}
Reply: {id,2}
Reply: [{columns,["id","name","age","wage"]},
{rows,[{1,<<"abby">>,20,2000},{2,<<"marge">>,30,2000}]}]
Reply: [{columns,["id","name","age","wage"]},{rows,[{1,<<"abby">>,20,2000}]}]
Reply: [{columns,["id","name","age","wage"]},
{rows,[{1,<<"abby">>,20,2000},{2,<<"marge">>,30,2000}]}]
Reply: {ok,101}
Reply: [{columns,["id","name","age","wage"]},{rows,[{1,<<"abby">>,20,2000}]}]
Reply: {ok,101}
Where am I going wrong?
Will messages I get on a port error look like {'EXIT',Port,Reason} or not?
It seems that there is another process involved between your process and the port, one which decodes the real port messages. Are you sure that Port is really a port? Try io:format("Port: ~p~n", [Port]). If you see something like #Port<0.500>, it is a port; if it looks like <0.38.0>, there is a man in the middle.
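A quick way to check this programmatically (check_port/1 is a hypothetical helper, not part of erlang-sqlite3):

%% Hypothetical helper: print what we were handed and whether it is a
%% real port or a pid sitting in the middle.
check_port(Port) ->
    io:format("Port: ~p, is_port: ~p, is_pid: ~p~n",
              [Port, erlang:is_port(Port), is_pid(Port)]),
    Port.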
The relevant example in http://www.erlang.org/doc/apps/erts/driver.html is the last one. It turns out that when using driver_output_term, the term is sent on its own, without the {Port, {data, ...}} wrapper:
receive
    Result ->
        Result
end.
instead of
receive
    {Port, {data, Result}} ->
        Result
end.

Resources