Cannot get string from StringStream - delphi

I am sending an HTTP Get request to Google's Map API, and I fill my StringStream with the response. However, when I try to read from the stream, I am just presented with an empty string ''.
{ Attempts to get JSON back from Google's Directions API }
function GetJSONString_OrDie(url : string) : string;
var
lHTTP: TIdHTTP;
SSL: TIdSSLIOHandlerSocketOpenSSL;
Buffer: TStringStream;
begin
{Sets up SSL}
SSL := TIdSSLIOHandlerSocketOpenSSL.Create(nil);
{Creates an HTTP request}
lHTTP := TIdHTTP.Create(nil);
{Sets the HTTP request to use SSL}
lHTTP.IOHandler := SSL;
{Set up the buffer}
Buffer := TStringStream.Create(Result);
{Attempts to get JSON back from Google's Directions API}
try
lHTTP.Get(url, Buffer);
Result:= Buffer.ReadString(Buffer.Size); //An empty string is put into Result
finally
{Frees up the HTTP object}
lHTTP.Free;
{Frees up the SSL object}
SSL.Free;
end;
end;
Why am I getting an empty string back when I can see that the StringStream Buffer has plenty of data (a size of 32495 after Get is called)?
I've tested the call, and it returns valid JSON.

First, you are using TStringStream to receive the response data. If you are using Delphi 2009+, DO NOT do that! TStringStream is tied to a specific encoding that has to be declared in the constructor before the stream is populated with data, and it cannot be changed dynamically. The default encoding is TEncoding.Default, which represents the OS default encoding. If the HTTP response uses a different encoding, the data will not decode to a String correctly.
Second, you are not seeking the stream's Position back to 0 before calling ReadString(). An easier way to retrieve a TStringStream's content as a decoded String is to use the DataString property instead, which ignores the Position property and returns the entire stream content as a whole:
Result := Buffer.DataString;
Third, you are doing too much manual work. TIdHTTP.Get() has an overloaded version that returns a decoded String. The benefit of using this method is that it uses the actual charset of the response, rather than the charset of a TStringStream:
function GetJSONString_OrDie(const URL: string): string;
var
lHTTP: TIdHTTP;
begin
{Creates an HTTP request}
lHTTP := TIdHTTP.Create(nil);
try
{Sets the HTTP request to use SSL}
lHTTP.IOHandler := TIdSSLIOHandlerSocketOpenSSL.Create(lHTTP);
{Attempts to get JSON back from Google's Directions API}
Result := lHTTP.Get(URL);
finally
{Frees up the HTTP object}
lHTTP.Free;
end;
end;
Which can be simplified further if you are using an up-to-date version of Indy (see this blog post for details):
function GetJSONString_OrDie(const URL: string): string;
var
lHTTP: TIdHTTP;
begin
{Creates an HTTP request}
lHTTP := TIdHTTP.Create(nil);
try
{Attempts to get JSON back from Google's Directions API}
Result := lHTTP.Get(URL);
finally
{Frees up the HTTP object}
lHTTP.Free;
end;
end;

Maybe first set Buffer.Position := 0?
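To illustrate the Position issue without a network call, here is a minimal console sketch. It assumes a Unicode Delphi with unit scope names (XE2 or later; on Delphi 2009/2010 the units would be Classes and SysUtils), and it writes a made-up JSON string into the stream in place of the HTTP response.

```delphi
program StreamPositionDemo;
{$APPTYPE CONSOLE}

uses
  System.Classes, System.SysUtils;

var
  Buffer: TStringStream;
begin
  Buffer := TStringStream.Create('', TEncoding.UTF8);
  try
    // Stand-in for lHTTP.Get(url, Buffer): writing advances Position to the end.
    Buffer.WriteString('{"status":"OK"}');

    // Position is at the end, so there is nothing left to read: empty string.
    Writeln('Before reset: "', Buffer.ReadString(Buffer.Size), '"');

    // Rewind, then read the whole content.
    Buffer.Position := 0;
    Writeln('After reset:  "', Buffer.ReadString(Buffer.Size), '"');

    // DataString ignores Position and returns the entire content.
    Writeln('DataString:   "', Buffer.DataString, '"');
  finally
    Buffer.Free;
  end;
end.
```

In the original function, the equivalent change would be to set Buffer.Position := 0 between the Get call and ReadString, or to read Buffer.DataString instead.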

Related

Upload Word File to extract Text via TIKA REST

I am trying to call Apache-TIKA via their REST API. I have successfully been able to upload a PDF document and return the document's text via CURL
curl -X PUT --data-binary @<filename>.pdf http://localhost:9998/tika --header "Content-type: application/pdf"
That translated to INDY like so:
function GetPDFText(const FileName: String): String;
var
IdHTTP: TIdHTTP;
Params: TIdMultiPartFormDataStream;
begin
IdHTTP := TIdHTTP.Create;
try
Params := TIdMultiPartFormDataStream.Create;
try
Params.Add('file', FileName, 'application/pdf');
Result := IdHTTP.PUT('http://localhost:9998/tika', Params);
finally
Params.Free;
end;
finally
IdHTTP.Free;
end;
end;
Now I want to upload a word document (.docx)
I assumed that all I would need to do is change the content Type when I add my file to Params, but that doesn't seem to produce any results, although I get no error reported back. I was able to get the following CURL command to work correctly
CURL -T <myDOCXfile>.docx http://localhost:9998/tika --header "Content-type: application/vnd.openxmlformats-officedocument.wordprocessingml.document"
How do I modify my HTTP call from CURL -X PUT to CURL -T?
There are at least two issues in your implementation:
Your translation from curl -X PUT to TIdHTTP is wrong.
You don't specify an Accept HTTP header to retrieve the extracted text in a specific format.
How to translate curl -X PUT to Indy?
First, let's make it clear that curl -X PUT --data-binary @<filename> <url> is the same as curl -T <filename> <url> when:
<url>'s scheme is HTTP or HTTPS
<url> does not end with /
Therefore using one or the other shouldn't matter in your case. See also curl documentation.
Secondly, TIdMultiPartFormDataStream is designed for use with the POST verb. Nothing stops you from passing it to TIdHTTP.Put, however, because it is indirectly derived from TStream. There is even a dedicated overload of the TIdHTTP.Post method that accepts a TIdMultiPartFormDataStream:
function Post(AURL: string; ASource: TIdMultiPartFormDataStream): string; overload;
To upload a file to the service, just use the TIdHTTP.Put method with a TFileStream as the argument, and provide the proper content type of the uploaded file in the HTTP header.
Finally, you're trying to extract plain text from the document, but you didn't specify the content type that the service should return. This is done via the Accept HTTP header. A default instance of TIdHTTP has the property IdHTTP.Request.Accept initialized to 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' (this may vary depending on the Indy version). Therefore, by default Tika will return HTML-formatted text. To get plain text, change it to 'text/plain; charset=utf-8'.
Fixed implementation:
uses IdGlobal, IdHTTP;
function GetDocumentText(const FileName, ContentType: string): string;
var
IdHTTP: TIdHTTP;
Stream: TIdReadFileExclusiveStream;
begin
IdHTTP := TIdHTTP.Create;
try
IdHTTP.Request.Accept := 'text/plain; charset=utf-8';
IdHTTP.Request.ContentType := ContentType;
Stream := TIdReadFileExclusiveStream.Create(FileName);
try
Result := IdHTTP.Put('http://localhost:9998/tika', Stream);
finally
Stream.Free;
end;
finally
IdHTTP.Free;
end;
end;
function GetPDFText(const FileName: string): string;
const
PDFContentType = 'application/pdf';
begin
Result := GetDocumentText(FileName, PDFContentType);
end;
function GetDOCXText(const FileName: string): string;
const
DOCXContentType = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';
begin
Result := GetDocumentText(FileName, DOCXContentType);
end;
According to Tika's documentation, it also supports posting multipart form data. If you insist on using this approach, change the target resource to /tika/form and switch to the Post method in your implementation:
function GetDocumentText(const FileName, ContentType: string): string;
var
IdHTTP: TIdHTTP;
FormData: TIdMultiPartFormDataStream;
begin
IdHTTP := TIdHTTP.Create;
try
IdHTTP.Request.Accept := 'text/plain; charset=utf-8';
FormData := TIdMultiPartFormDataStream.Create;
try
FormData.AddFile('file', FileName, ContentType); { older Indy versions: FormData.Add(...) }
Result := IdHTTP.Post('http://localhost:9998/tika/form', FormData);
finally
FormData.Free;
end;
finally
IdHTTP.Free;
end;
end;
Why does the original implementation in question work with PDF files?
When you Post multipart form data via TIdHTTP, Indy automatically sets the content type of the request to 'multipart/form-data; boundary=...whatever...'. This is not the case when you Put data (unless you set it manually before performing the request), so TIdHTTP.Request.ContentType remains blank. I can only guess that when Tika sees an empty content type, it falls back to some default type, which could be PDF, and is still somehow able to read the document from the multipart request.

Delphi indy get page content

I have seen a lot of examples online, but I cannot understand why my code doesn't work.
I have an url that looks like this:
http://www.domain.com/confirm.php?user=USERNAME&id=THEID
confirm.php is a page that does some checks on a MySQL database and then the only output of the page is a 0 or a -1 (true or false):
<?php
//long code...
if ( ... ) {
echo "0"; // success!
die();
} else {
echo "-1"; // fail!
die();
}
?>
My Delphi FireMonkey app has to open the URL above, passing the username and the id, and then read the result of the page. The result is only a -1 or a 0. This is the code.
//I have created a subclass of TThread
procedure TRegister.Execute;
var
conn: TIdHTTP;
res: string;
begin
inherited;
Queue(nil,
procedure
begin
ProgressLabel.Text := 'Connecting...';
end
);
//get the result -1 or 0
try
conn := TIdHTTP.Create(nil);
try
res := conn.Get('http://www.domain.com/confirm.php?user='+FUsername+'&id='+FPId);
finally
conn.Free;
end;
except
res := 'error!!';
end;
Queue(nil,
procedure
begin
ProgressLabel.Text := res;
end
);
end;
The value of res is always error!! and never -1 or 0. Where is my code wrong? The error caught from on E: Exception do is:
HTTP/1.1 406 not acceptable
I have found a solution using System.Net.HttpClient. I can simply use this function
function GetURL(const AURL: string): string;
var
HttpClient: THttpClient;
HttpResponse: IHttpResponse;
begin
HttpClient := THTTPClient.Create;
try
HttpResponse := HttpClient.Get(AURL);
Result := HttpResponse.ContentAsString();
finally
HttpClient.Free;
end;
end;
This works and gives me -1 and 0 as I expected. As a working example, I tested this:
procedure TForm1.Button1Click(Sender: TObject);
function GetURL(const AURL: string): string;
var
HttpClient: THttpClient;
HttpResponse: IHttpResponse;
begin
HttpClient := THTTPClient.Create;
try
HttpResponse := HttpClient.Get(AURL);
Result := HttpResponse.ContentAsString();
finally
HttpClient.Free;
end;
end;
function GetURLAsString(const aURL: string): string;
var
lHTTP: TIdHTTP;
begin
lHTTP := TIdHTTP.Create;
try
Result := lHTTP.Get(aURL);
finally
lHTTP.Free;
end;
end;
begin
Memo1.Lines.Add(GetURL('http://www.domain.com/confirm.php?user=user&id=theid'));
Memo1.Lines.Add(GetURLAsString('http://www.domain.com/confirm.php?user=user&id=theid'))
end;
end.
The first function works perfectly but Indy raises the exception HTTP/1.1 406 not acceptable. It seems that Indy cannot automatically handle the content type of the page. Here you can see the REST Debugger log:
HTTP error 406 Not Acceptable typically means that the server cannot respond with a content type the client asked for. The server and the client need to agree on the MIME type. Your client's Accept headers should list the desired type of response, and your server should respond with that same type. In your case, the Content-Type will most likely be text/plain.
So long story short, your client is expecting a MIME type which the server does not explicitly return in its response. The problem could be on either side, or perhaps both.
Your client's Accept headers must list the MIME type(s) you expect and need, specifically Accept, Accept-Charset, Accept-Language, and Accept-Encoding. By default, Indy's TIdHTTP sets these headers to accept essentially anything, assuming they haven't been overwritten. The Accept header defaults to text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, where the */* opens the door for any MIME type.
Your Server's Response's Content-Type must be one of the provided MIME types, as well as the format of the response as also desired by the client. It is likely that your HTTP server is not providing the appropriate Content-Type in its response. If the server responds with anything in the */* filter (which should mean everything), then the client will accept it (assuming the server responds with text/plain). If the server responds with an invalid content type (such as just text or plain), then it could be rejected.
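As a way to test the Accept-header explanation above, here is a minimal sketch that sets the header explicitly and reports the status code and response body when the request fails. The function name GetWithExplicitAccept and the Accept value are illustrative assumptions, not something taken from the question, and whether this resolves the 406 depends on what the server actually objects to.

```delphi
// Requires SysUtils and IdHTTP in the uses clause of the enclosing unit.

// Hypothetical helper: returns the page body, or a diagnostic string
// describing the HTTP error if the server rejects the request.
function GetWithExplicitAccept(const AURL: string): string;
var
  HTTP: TIdHTTP;
begin
  HTTP := TIdHTTP.Create(nil);
  try
    // Assumption: the script returns plain text; */* keeps other types acceptable.
    HTTP.Request.Accept := 'text/plain, */*;q=0.8';
    try
      Result := HTTP.Get(AURL);
    except
      on E: EIdHTTPProtocolException do
        // ErrorCode holds the HTTP status (for example 406);
        // ErrorMessage holds the body the server sent with the error.
        Result := Format('HTTP %d: %s', [E.ErrorCode, E.ErrorMessage]);
    end;
  finally
    HTTP.Free;
  end;
end;
```

Logging ErrorMessage is often more informative than the bare status line, because the server's error page may say which part of the request it rejected.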

How to set body data indy HTTPS post

So I've looked around, and the only question describing my problem is 6 years old with 0 answers, so I guess I will try again.
I am using Delphi 2009 with Indy 10.
I am trying to post JSON to an api using HTTPS.
Instance.FHTTP := TIdHTTP.Create;
Instance.FHTTP.IOHandler := TIdSSLIOHandlerSocketOpenSSL.Create(Instance.FHTTP);
{$IFDEF DEBUG}
Instance.FHTTP.ProxyParams.ProxyPort := 8888;
Instance.FHTTP.ProxyParams.ProxyServer := '127.0.0.1';
{$ENDIF}
Instance.FHTTP.Request.ContentType := 'application/json';
Instance.FAccessToken := Instance.FHTTP.Post('https://somedomain.com/api/endpoint', '{JSONName: JSONValue }' );
I have seen many answers suggesting that the JSON payload should be given as a string parameter to the TIdHTTP.Post method, but when I try that, it expects a file path and throws an error saying:
'Cannot open file "[path to project{JSONName:JSONValue }]". The specified file was not found'.
If I add my JSON to a TStringList and pass that as the parameter, it simply adds the JSON to the header of the request.
Any help is greatly appreciated.
The Post overload that takes a second string indeed interprets it as a filename:
function Post(AURL: string; const ASourceFile: String): string; overload;
That's why this doesn't work. You need to instead use the overload that takes a TStream:
function Post(AURL: string; ASource: TStream): string; overload;
You can put your JSON in a TStringStream:
StringStream := TStringStream.Create('{JSONName: JSONValue }', TEncoding.UTF8);
try
Instance.FAccessToken := Instance.FHTTP.Post('https://somedomain.com/api/endpoint', StringStream);
finally
StringStream.Free;
end;

Get string from a idhttp get

Currently I am able to run a command, but I can't figure out how to get the result into a string.
I do a Get like so:
idhttp1.get('http://codeelf.com/games/the-grid-2/grid/',TStream(nil));
and everything seems to run OK; in Wireshark I can see the results from that command. Now if I do
HTML := idhttp1.get('http://codeelf.com/games/the-grid-2/grid/');
it freezes the app. In Wireshark I can see that it sent the GET and got a response, but I don't know why it freezes. HTML is just a string variable.
EDIT FULL CODE
BUTTON CLICK
login(EUserName.Text,EPassWord.Text);
procedure TForm5.Login(name: string; Pass: string);
var
Params: TStringList;
html : string;
begin
Params := TStringList.Create;
try
Params.Add('user='+name);
Params.Add('pass='+pass);
Params.Add('sublogin=Login');
//post password/username
IdHTTP1.Post('http://codeelf.com/games/the-grid-2/grid/', Params);
//get the grid source
HTML := idhttp1.Get('http://codeelf.com/games/the-grid-2/grid/');
finally
Params.Free;
end;
llogin.Caption := 'Logged In';
end;
RESPONSE
The response I get says Transfer-Encoding: chunked\r\n and Content-Type: text/html\r\n; I don't know if that matters.
Thanks
Indy has support for some types of streamed HTTP responses (see New TIdHTTP hoNoReadMultipartMIME flag), but this will only help if the server uses multipart/* responses. The linked blog article explains the details further and also shows how the Indy HTTP component can feed a MIME decoder with a continuous response stream.
If this is not applicable to your case, a workaround is to go down to the "raw" TCP layer, which means send the HTTP request using a TIdTCPClient component, and then read the response line by line (or byte by byte) from the IOHandler. This gives total control over response handling. Request and Response should be processed in a thread to decouple it from the main thread.
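The raw TCP workaround could look roughly like the sketch below. It is a simplified illustration under several assumptions: the function name RawHttpGet is hypothetical, the request is plain HTTP on port 80, and it sends an HTTP/1.0 request so that the server should not reply with a chunked body. It reads the status line and headers line by line, then takes everything up to the connection close as the body.

```delphi
// Requires IdTCPClient in the uses clause of the enclosing unit.

// Hypothetical helper: performs a bare HTTP/1.0 GET and returns the body.
function RawHttpGet(const AHost, APath: string): string;
var
  Client: TIdTCPClient;
  Line: string;
begin
  Client := TIdTCPClient.Create(nil);
  try
    Client.Host := AHost;
    Client.Port := 80;
    Client.Connect;
    try
      // Send the request line and headers, ending with an empty line.
      Client.IOHandler.WriteLn('GET ' + APath + ' HTTP/1.0');
      Client.IOHandler.WriteLn('Host: ' + AHost);
      Client.IOHandler.WriteLn('Connection: close');
      Client.IOHandler.WriteLn('');

      // Read the status line and headers one line at a time;
      // an empty line marks the end of the header block.
      repeat
        Line := Client.IOHandler.ReadLn;
      until Line = '';

      // Everything that follows, until the server closes, is the body.
      Result := Client.IOHandler.AllData;
    finally
      Client.Disconnect;
    end;
  finally
    Client.Free;
  end;
end;
```

Because every call here blocks, it should run in a worker thread rather than in a button click handler, as the answer suggests.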
TIdHTTP.Post() returns the response data, so you should not call TIdHTTP.Get() to retrieve it separately:
procedure TForm5.Login(name: string; Pass: string);
var
Params: TStringList;
html : string;
begin
Params := TStringList.Create;
try
Params.Add('user='+name);
Params.Add('pass='+pass);
Params.Add('sublogin=Login');
//post password/username
HTML := IdHTTP1.Post('http://codeelf.com/games/the-grid-2/grid/', Params);
finally
Params.Free;
end;
llogin.Caption := 'Logged In';
end;

Delphi: Using Google URL Shortener with IdHTTP - 400 Bad Request

I'm trying to access the URL Shortener ( http://goo.gl/ ) via its API from within Delphi.
However, the only result I get is: HTTP/1.0 400 Bad Request (reason: parseError)
Here is my code (on a form with a Button1, Memo1 and IdHTTP1 that has IdSSLIOHandlerSocketOpenSSL1 as its IOHandler. I got the necessary 32-bit OpenSSL DLLs from http://indy.fulgan.com/SSL/ and put them in the .exe's directory):
procedure TFrmMain.Button1Click(Sender: TObject);
var html, actionurl: String;
makeshort: TStringList;
begin
try
makeshort := TStringList.Create;
actionurl := 'https://www.googleapis.com/urlshortener/v1/url';
makeshort.Add('{"longUrl": "http://slashdot.org/stories"}');
IdHttp1.Request.ContentType := 'application/json';
//IdHTTP1.Request.ContentEncoding := 'UTF-8'; //Using this gives error 415
html := IdHTTP1.Post(actionurl, makeshort);
memo1.lines.add(idHTTP1.response.ResponseText);
except on e: EIdHTTPProtocolException do
begin
memo1.lines.add(idHTTP1.response.ResponseText);
memo1.lines.add(e.ErrorMessage);
end;
end;
memo1.Lines.add(html);
makeshort.Free;
end;
Update: I have left off my API key in this example (should usually work well without one for a few tries), but if you want to try it with your own, you can substitute the actionurl string with
'https://www.googleapis.com/urlshortener/v1/url?key=<yourapikey>';
The ParseError message leads me to believe that there might be something wrong with the encoding of the longurl when it gets posted but I wouldn't know what to change.
I've been fussing over this for quite a while now and I'm sure the mistake is right before my eyes - I'm just not seeing it right now.
Any help is therefore greatly appreciated!
Thanks!
As you discovered, the TStrings overloaded version of the TIdHTTP.Post() method is the wrong method to use. It sends an application/x-www-form-urlencoded formatted request, which is not appropriate for a JSON formatted request. You have to use the TStream overloaded version of the TIdHTTP.Post() method instead, e.g.:
procedure TFrmMain.Button1Click(Sender: TObject);
var
html, actionurl: String;
makeshort: TMemoryStream;
begin
try
makeshort := TMemoryStream.Create;
try
actionurl := 'https://www.googleapis.com/urlshortener/v1/url';
WriteStringToStream(makeshort, '{"longUrl": "http://slashdot.org/stories"}', IndyUTF8Encoding);
makeshort.Position := 0;
IdHTTP1.Request.ContentType := 'application/json';
IdHTTP1.Request.Charset := 'utf-8';
html := IdHTTP1.Post(actionurl, makeshort);
finally
makeshort.Free;
end;
Memo1.Lines.Add(IdHTTP1.Response.ResponseText);
Memo1.Lines.Add(html);
except
on e: Exception do
begin
Memo1.Lines.Add(e.Message);
if e is EIdHTTPProtocolException then
Memo1.lines.Add(EIdHTTPProtocolException(e).ErrorMessage);
end;
end;
end;
From the URL shortener API docs:
Every request your application sends to the Google URL Shortener API
needs to identify your application to Google. There are two ways to
identify your application: using an OAuth 2.0 token (which also
authorizes the request) and/or using the application's API key.
Your example does not contain code for OAuth or API key authentication.
To authenticate with an API key, the docs are clear:
After you have an API key, your application can append the query
parameter key=yourAPIKey to all request URLs.

Resources