How do I extract the target URL from a Google search result? - delphi

I am trying to extract URLs from Google search results. I use Indy IdHTTP to get HTML results from Google, and I use Achmad Z's code for getting the link hrefs from the page. How can I get the real link target for each URL instead of the one that goes through Google's redirector?
I tried that but I get an "Operand no applicable" error in this part of the code:
function ToUTF8Encode(str: string): string;
var
b: Byte;
begin
for b in BytesOf(UTF8Encode(str)) do
begin
Result := Format('%s%s%.2x', [Result, '%', b]);
end;
end;
I use Delphi 7 with Indy 9.00.10. Maybe indy update will help ?

If I get it right you are trying to fetch the Google search results using TIdHTTP.Get method. If so, thenyou should definitely focus on some Google Search API implementation because
it's impossible to fetch the results this way because you don't have any access to the document inside the iframe in which the search results are, so you won't get any search results by using HTTP GET in this case (or at least I haven't heard about the request which can do that)
it's against Google policies and you should use proper Google Search API instead, for instance Google SOAP Search API, there are available also several types of Google Search API's for various purposes
You can find e.g. here the Delphi wrapper with the demo code for Google Search API. I've tested it with Delphi 2009 on Windows 7/64 and it works fine for me.

In the previous post here I've tried to explain why you should use Google Search API, in this one I'll try to provide you an example with a hope it will work in your Delphi 7.
You need to have the SuperObject (JSON parser for Delphi), I've used this version (latest at this time). Then you need Indy; the best would be to upgrade to the latest version if possible. I've used the one shipped with Delphi 2009, but I think the TIdHTTP.Get method is so important that it must work fine also in your 9.00.10 version.
Now you need a list box and a button on your form, the following piece of code and a bit of luck (for compatibility :)
The URL request building you can see for instance in the DxGoogleSearchApi.pas mentioned before but the best is to follow the Google Web Search API reference. In DxGoogleSearchApi.pas you can take the inspiration e.g. how to fetch several pages.
So take this as an inspiration
uses
IdHTTP, IdURI, SuperObject;
const
GSA_Version = '1.0';
GSA_BaseURL = 'http://ajax.googleapis.com/ajax/services/search/';
procedure TForm1.GoogleSearch(const Text: string);
var
I: Integer;
RequestURL: string;
HTTPObject: TIdHTTP;
HTTPStream: TMemoryStream;
JSONResult: ISuperObject;
JSONResponse: ISuperObject;
begin
RequestURL := TIdURI.URLEncode(GSA_BaseURL + 'web?v=' + GSA_Version + '&q=' + Text);
HTTPObject := TIdHTTP.Create(nil);
HTTPStream := TMemoryStream.Create;
try
HTTPObject.Get(RequestURL, HTTPStream);
JSONResponse := TSuperObject.ParseStream(HTTPStream, True);
if JSONResponse.I['responseStatus'] = 200 then
begin
ListBox1.Items.Add('Search time: ' + JSONResponse.S['responseData.cursor.searchResultTime']);
ListBox1.Items.Add('Fetched count: ' + IntToStr(JSONResponse['responseData.results'].AsArray.Length));
ListBox1.Items.Add('Total count: ' + JSONResponse.S['responseData.cursor.resultCount']);
ListBox1.Items.Add('');
for I := 0 to JSONResponse['responseData.results'].AsArray.Length - 1 do
begin
JSONResult := JSONResponse['responseData.results'].AsArray[I];
ListBox1.Items.Add(JSONResult.S['unescapedUrl']);
end;
end;
finally
HTTPObject.Free;
HTTPStream.Free;
end;
end;
procedure TForm1.Button1Click(Sender: TObject);
begin
GoogleSearch('Delphi');
end;

Answer to my question , maybe it can help someone:
Fetching web page :
memo1.Lines.Text := idhttp1.Get('http://ajax.googleapis.com/aja...tart=1&rsz=large&q=max');
extracting URL's :
function ExtractText(const Str, Delim1, Delim2: string; PosStart: integer; var PosEnd: integer): string;
var
pos1, pos2: integer;
begin
Result := '';
pos1 := PosEx(Delim1, Str, PosStart);
if pos1 > 0 then
begin
pos2 := PosEx(Delim2, Str, pos1 + Length(Delim1));
if pos2 > 0 then
begin
PosEnd := pos2 + Length(Delim2);
Result := Copy(Str, pos1 + Length(Delim1), pos2 - (pos1 + Length(Delim1)));
end;
end;
end;
And on Button1 just put :
procedure TForm1.Button1Click(Sender: TObject);
var Pos: integer;
sText: string;
begin
sText := ExtractText(Memo1.Lines.Text, '"url":"', '","visibleUrl"', 1, Pos);
while sText <> '' do
begin
Memo2.Lines.Add(sText);
sText := ExtractText(Memo1.Lines.Text, '"url":"', '","visibleUrl"', Pos, Pos);
end;
end;
www.delphi.about.com has nice documentation on string manipulation , Zarko Gajic does great job on that site :)
NOTE: if google changes it's source this will be useless.

Related

Fetch the content of a web page with DELPHI

I am trying to retrieve the <table><tbody> section of this page:
http://www.mfinante.ro/infocodfiscal.html?captcha=null&cod=18505138
I am using Delphi XE7.
I tried using IXMLHttpRequest, WinInet (InternetOpenURL(), InternetReadFile()), TRestClient/TRestRequest/TRestResponse, TIdHTTP.Get(), but all they retrieve is some gibberish, like this:
<html><head><meta http-equiv="Pragma" content="no-cache"/>'#$D#$A'<meta http-equiv="Expires" content="-1"/>'#$D#$A'<meta http-equiv="CacheControl" content="no-cache"/>'#$D#$A'<script>'#$D#$A'(function(){p={g:"0119a4477bb90c7a81666ed6496cf13b5aad18374e35ca73f205151217be1217a93610c5877ece5575231e088ff52583c46a8e8807483e7185307ed65e",v:"87696d3d40d846a7c63fa2d10957202e",u:"1",e:"1",d:"1",a:"challenge etc.
Look at this code for example:
program htttpget;
{$APPTYPE CONSOLE}
{$R *.res}
uses
SysUtils, HTTPApp, IdHTTP, ActiveX;
var
CoResult: Integer;
HTTP: TIdHTTP;
Query: String;
Buffer: String;
begin
try
CoResult := CoInitializeEx(nil, COINIT_MULTITHREADED);
if not((CoResult = S_OK) or (CoResult = S_FALSE)) then
begin
Writeln('Failed to initialize COM library.');
Exit;
end;
HTTP := TIdHTTP.Create;
Query := 'http://www.mfinante.ro/infocodfiscal.html?captcha=null' +
'&cod=18505138';
Buffer := HTTP.Get(Query);
writeln(Buffer);
HTTP.Destroy;
except
end;
end.
What is wrong with this page? I haven not done very many "get" functions in my life, but other websites return normal responses. Can someone at least clarify to me why this isn't working?
Are there other ways to get the content of this web page? Are there other programming languages (Java, scripting, etc) that can do this without third party software (like using Firefox source code to emulate a browser, fetch the page, without showing the window, and then copy the content).
You can use TWebBrowser for this.
See this post: How can I get HTML source code from TWebBrowser
The answer by RRUZ, which you can find in many places on the internet, is not what you are looking for. This gives you are original html source, as would IdHttp.Get().
However, the answer by Mehmet Fide will give you the HTML source of the DOM, which is what you are looking for.
I offer a variation here. (It includes some hacks that were required at the time to get full DOCTYPE. Not sure if they are still needed...)
function EndStr(const S: String; const Count: Integer): String;
var
I: Integer;
Index: Integer;
begin
Result := '';
for I := 1 to Count do
begin
Index := Length(S)-I+1;
if Index > 0 then
Result := S[Index] + Result;
end;
end;
function GetHTMLDocumentSource(WebBrowser: TWebBrowser; var Charset: String):
String;
var
Element: IHTMLElement;
Node: IHTMLDomNode;
Document: IHTMLDocument2;
I: Integer;
S: String;
begin
Result := '';
Document := WebBrowser.Document as IHTMLDocument2;
For I := 0 to Document.all.length -1 do
begin
Element := Document.all.item(I, 0) as IHTMLElement;
If Element.tagName = '!' Then
begin
Node := Element as IHTMLDomNode;
If (Node <> nil) and (Pos('CTYPE', UpperCase(Node.nodeValue)) > 0) Then
begin
S := VarToStr(Node.nodeValue); { don't change case of result }
if Copy(Uppercase(S), 1, 5) = 'CTYPE' then
S := 'DO' + S;
if Copy(Uppercase(S), 1, 7) = 'DOCTYPE' then
S := '<!' + S;
if Uppercase(S) = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//E' then
S := S +'N">';
if EndStr(Lowercase(S), 3) = '.dt' then
S := S + 'd"';
if EndStr(Lowercase(S), 5) = '.dtd"' then
S := S + '>';
Result := Result + S;
end;
end
Else
Result := Result + Element.outerHTML;
If Element.tagName = 'HTML' Then
Break;
end;
Charset := Document.charset;
end;
So call WebBrowser.Navigate(URL), then in OnDocumentComplete event retrieve the Html Source.
However, with your URL you will see the OnDocumentComplete event fires twice :(, so you need to get the Html from the last fire.
You can refer to this post How do I avoid the OnDocumentComplete event for embedded iframe elements? for info on how to get the final OnDocumentComplete event. However, I tried it and it was not working for me. You may need to use some other strategy to get the last event.
Not sure of your needs, but you may also optimize this process by disabling WebBrowser from downloading images. I believe that is possible.
This is normal, you have indeed retrieved the content correctly. What happens in your browser is that the script is executed and the page gets built client side. If you wish to replicate that in your code, then you will need to do the same. Execute the script exactly as the browser would.
What you are really looking for here is what is known as a headless browser. Integrate one of those into your program. Then get the headless browser to process the request, including executing scripts. When it has done executing scripts, read the modified content of the page.

How to download ask.fm specific user wall source code with all questions and answers?

I'm trying to download all the questions and answers form users profile , but there is a problem , if user has big number of questions that I have to click on "Show more" to expand that list.If I try to download for example this persons questions and answers : http://ask.fm/UnaRamekic (random choice) , I'll get only those that are shown , those that are displayed after I click show more are not retrieved with get request.How can I get all the questions with ICS or Indy components. Thanks.
My code:
procedure TForm1.sButton1Click(Sender: TObject);
begin
With HttpCli1 do begin
URL := sedit1.Text;
RequestVer := '1.1';
RcvdStream := TMemoryStream.Create;
try
Get;
except
ShowMessage('There has been an error , check your internet connection !');
RcvdStream.Free;
Exit;
end;
RcvdStream.Seek(0,0);
Memo1.Lines.LoadFromStream(RcvdStream);
RcvdStream.Free;
end;
end;
Warning:
This approach is lame and quite dangerous. It's posting the form data similarly like the Show more button does, but it uses a while loop (to receive all pages), which repeats until the exact constant in response is found (in code it's the LastPageResponse constant), so when the response content of the page will be changed some time and that constant won't be in the response, you will find yourself in the infinite loop.
In the GetAllQuestions function you can specify:
AUser - is the user name after the slash from the URL
AFromDate - is a starting date time from which you want to get results
AStartPage - is a starting page from the AFromDate date time from which you want to get results
The GetAllQuestions function returns a base user's page, followed by line breaks separated content in a range from the base page to all pages from the time and page you specify. Forgot to notice, that the additional content you'll need to parse in a different way than a base page, since it's not a HTML content.
uses
IdHTTP;
implementation
function GetAllQuestions(const AUser: string; AFromDate: TDateTime;
AStartPage: Integer = 1): string;
var
Response: string;
LastPage: Integer;
TimeString: string;
HTTPClient: TIdHTTP;
Parameters: TStrings;
const
LineBreaks = sLineBreak + sLineBreak;
LastPageResponse = '$("#more-container").hide();';
begin
Result := '';
HTTPClient := TIdHTTP.Create(nil);
try
Result := HTTPClient.Get('http://ask.fm/' + AUser) + LineBreaks;
Parameters := TStringList.Create;
try
LastPage := AStartPage;
TimeString := FormatDateTime('ddd mmm dd hh:nn:ss UTC yyyy', AFromDate);
Parameters.Add('time=' + TimeString);
Parameters.Add('page=' + IntToStr(LastPage));
while LastPage <> -1 do
begin
Parameters[1] := 'page=' + IntToStr(LastPage);
Response := HTTPClient.Post('http://ask.fm/' + AUser + '/more',
Parameters);
if Copy(Response, Length(Response) - Length(LastPageResponse) + 1,
MaxInt) = LastPageResponse
then
LastPage := -1
else
LastPage := LastPage + 1;
Result := Result + Response + LineBreaks;
end;
finally
Parameters.Free;
end;
finally
HTTPClient.Free;
end;
end;
And the usage:
procedure TForm1.Button1Click(Sender: TObject);
begin
try
Memo1.Text := GetAllQuestions('TLama', Now);
except
on E: Exception do
ShowMessage(E.Message);
end;
end;
You're not going to be able to do that with Indy or ICS alone. What you initially see is exactly what is being downloaded when you pull the HTTP request.
If you look at the HTML source of the page, you'll see that the "View More" button has a JavaScript event handler attached to it that makes an AJAX request to the server, pulls more data from it, and applies it to the page. If you want to do the same, your code needs to parse things out at least enough to get the right AJAX parameters, then make the request to the server from your Indy or ICS code like any other HTTP request, and deal with the data that comes back.

SOAP client in Delphi "The handle is in the wrong state for the requested operation"

I have built the worlds dumbest and most simple SOAP server, in about 3 clicks, in visual studio. The exact steps in visual studio 2010: First create a new project as a web application, Then add a new item of type web service. (See accepted answer here for picture.) That soap server service Service1 has a simple method GetData:
A snippet from clientService1.pas, created using WSDL importer...
IService1 = interface(IInvokable)
['{967498E8-4F67-AAA5-A38F-F74D8C7E346A}']
function GetData(const value: Integer): string; stdcall;
function GetDataUsingDataContract(const composite: CompositeType2): CompositeType2; stdcall;
end;
When I try to run this method, like this:
procedure TForm3.Button1Click(Sender: TObject);
var
rio : THTTPRIO;
sv:IService1;
addr : string;
data : string;
begin
//addr := '....'; // url from visual studio 2010 live debug instance.
rio := THTTPRIO.Create(nil);
sv := GetIService1( true, addr, rio );
try
data := sv.GetData( 0);
Button1.Caption := data;
finally
sv := nil;
rio.Free;
end;
end;
The error I get is this:
ESOAPHTTPException:
The handle is in the wrong state for the requested operation -
URL:http://localhost:8732/Design_Time_Addresses/WcfServiceLibrary1/Service1/ -
SOAPAction:http://tempuri.org/IService1/GetData'.
The URL works fine when I paste the url above into a web browser, so the usual answer that the SOAP code in Delphi has the tendency to not notice an HTTP failure, does not seem likely. Rather it seems that I am either (a) experiencing breakage in WinInet (known to happen in some versions of windows), or (b) doing something wrong?
It seems to me that anybody who has visual studio and delphi both installed, should be able to try to get the dummy starter Soap server in Visual Studio talking to the soap client in Delphi, without any effort at all. But I can not figure out the simplest things.
At one time there was a discussion about the error in a conversation now long since deleted from Embarcadero forums, by Bruneau Babet, an embarcadero staffer.
Bruno said:
Hello,
I've posted a patched version of SOAPHTTPTrans.pas that contains a fix
for this issue here:
[forum link redacted, it didn't work anymore anyways, the post is gone]
You may still override the event as described in the C++Builder
section referred; or, much simpler, at least for Delphi users, simply
add the updated SOAPHTTPTrans.pas to your app's project. Let us know
if that does not work for you.
Cheers,
Bruneau
You can get the repair and the notes about it in its original forum formatting from the following pastebin link and on bitbucket so you don't have to extract the file from the surrounding text.
Warren Update 2016: I have been informed by someone who tried to use the fix on Delphi XE that this fix does NOT work for them in Delphi XE. Any further updates to the code in bitbucket that resolve the remaining bugs would be appreciated.
I ran into the The handle is in the wrong state for the requested operation issue in November 2018 using Delphi Tokyo 10.2.3, then looked at the code patch in the pastebin link under Arjen's answer.
That code is very old and the test code no longer works (SOAP service unavailable). Also, it is unclear from Bruneau's code what he patched exactly.
Comparing that source and the one from my Delphi version it seems that these are the (two) required modifications in the HandleWinInetError procedure ('PATCH HERE'):
function THTTPReqResp.HandleWinInetError(LastError: DWord;
Request: HINTERNET;
RaiseError: Boolean): DWord;
function CallInternetErrorDlg: DWord;
var
P: Pointer;
begin
Result := InternetErrorDlg(GetDesktopWindow(), Request, LastError,
FLAGS_ERROR_UI_FILTER_FOR_ERRORS or
FLAGS_ERROR_UI_FLAGS_CHANGE_OPTIONS or
FLAGS_ERROR_UI_FLAGS_GENERATE_DATA, P);
{ After selecting client certificate send request again,
Note: InternetErrorDlg always returns ERROR_SUCCESS when called with
ERROR_INTERNET_CLIENT_AUTH_CERT_NEEDED }
if LastError = ERROR_INTERNET_CLIENT_AUTH_CERT_NEEDED then
Result := ERROR_INTERNET_FORCE_RETRY;
end;
const
{ Missing from our WinInet currently }
INTERNET_OPTION_CLIENT_CERT_CONTEXT = 84;
var
Flags, FlagsLen, DWCert, DWCertLen: DWord;
ClientCertInfo: IClientCertInfo;
CertSerialNum: string;
{$IFDEF CLIENT_CERTIFICATE_SUPPORT}
hStore: HCERTSTORE;
CertContext: PCERT_CONTEXT;
{$ENDIF}
begin
{ Dispatch to custom handler, if there's one }
if Assigned(FOnWinInetError) then
Result := FOnWinInetError(LastError, Request)
else
begin
Result := ERROR_INTERNET_FORCE_RETRY;
{ Handle INVALID_CA discreetly }
if (LastError = ERROR_INTERNET_INVALID_CA) and (soIgnoreInvalidCerts in InvokeOptions) then
begin
FlagsLen := SizeOf(Flags);
InternetQueryOption(Request, INTERNET_OPTION_SECURITY_FLAGS, Pointer(#Flags), FlagsLen);
Flags := Flags or SECURITY_FLAG_IGNORE_UNKNOWN_CA;
InternetSetOption(Request, INTERNET_OPTION_SECURITY_FLAGS, Pointer(#Flags), FlagsLen);
end
else if (LastError = ERROR_INTERNET_SEC_CERT_REV_FAILED) and (soIgnoreInvalidCerts in InvokeOptions) then
begin
FlagsLen := SizeOf(Flags);
InternetQueryOption(Request, INTERNET_OPTION_SECURITY_FLAGS, Pointer(#Flags), FlagsLen);
Flags := Flags or SECURITY_FLAG_IGNORE_REVOCATION;
InternetSetOption(Request, INTERNET_OPTION_SECURITY_FLAGS, Pointer(#Flags), FlagsLen);
end
{$IFDEF CLIENT_CERTIFICATE_SUPPORT}
else if (LastError = ERROR_INTERNET_CLIENT_AUTH_CERT_NEEDED) and
Supports(Self, IClientCertInfo, ClientCertInfo) and
(ClientCertInfo.GetCertSerialNumber <> '') then
begin
CertSerialNum := ClientCertInfo.GetCertSerialNumber();
hStore := ClientCertInfo.GetCertStore();
if hStore = nil then
begin
hStore := CertOpenSystemStore(0, PChar('MY'));
ClientCertInfo.SetCertStore(hStore);
end;
CertContext := FindCertWithSerialNumber(hStore, CertSerialNum);
if CertContext <> nil then
begin
ClientCertInfo.SetCertContext(CertContext);
InternetSetOption(Request, INTERNET_OPTION_CLIENT_CERT_CONTEXT,
CertContext, SizeOf(CERT_CONTEXT));
end
else
begin
if RaiseError then RaiseCheck(LastError); // PATCH HERE
Result := CallInternetErrorDlg;
end;
end
{$ENDIF}
else if (LastError = ERROR_INTERNET_CLIENT_AUTH_CERT_NEEDED) and (soPickFirstClientCertificate in InvokeOptions) then
begin
{ This instructs WinInet to pick the first (a random?) client cerficate }
DWCertLen := SizeOf(DWCert);
DWCert := 0;
InternetSetOption(Request, INTERNET_OPTION_SECURITY_SELECT_CLIENT_CERT,
Pointer(#DWCert), DWCertLen);
end
else
begin
if RaiseError then RaiseCheck(LastError); // PATCH HERE
Result := CallInternetErrorDlg;
end;
end;
end;
Note that the RaiseError procedure parameter was not even used before this patch ;-)
Here is some test code using the SOAP service from NOAA's National Digital Forecast Database (NDFD) SOAP Web Service:
Uses SOAP.SOAPHTTPTrans;
const Request2 =
'<soapenv:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ndf="http://graphical.weather.gov/xml/DWMLgen/wsdl/ndfdXML.wsdl">' +
' <soapenv:Header/>' +
' <soapenv:Body>' +
' <ndf:NDFDgenByDay soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">' +
' <latitude xsi:type="xsd:decimal">38.9936</latitude>' +
' <longitude xsi:type="xsd:decimal">-77.0224</longitude>' +
' <startDate xsi:type="xsd:date">%tomorrow%</startDate>' +
' <numDays xsi:type="xsd:integer">5</numDays>' +
' <Unit xsi:type="dwml:unitType" xmlns:dwml="http://graphical.weather.gov/xml/DWMLgen/schema/DWML.xsd">e</Unit>' +
' <format xsi:type="dwml:formatType" xmlns:dwml="http://graphical.weather.gov/xml/DWMLgen/schema/DWML.xsd">12 hourly</format>' +
' </ndf:NDFDgenByDay>' +
' </soapenv:Body>' +
'</soapenv:Envelope>';
const URL2= 'https://graphical.weather.gov:443/xml/SOAP_server/ndfdXMLserver.php';
procedure TFrmHandleWinINetError.Button1Click(Sender: TObject);
var
RR: THTTPReqResp;
Response: TMemoryStream;
U8: UTF8String;
begin
RR := THTTPReqResp.Create(nil);
try
try
RR.URL := URL2;
RR.UseUTF8InHeader := True;
RR.SoapAction := 'NDFDgenByDay';
Response := TMemoryStream.Create;
RR.Execute(Request2, Response);
SetLength(U8, Response.Size);
Response.Position := 0;
Response.Read(U8[1], Length(U8));
ShowMessage(String(U8));
except
on E:Exception do ShowMessage('ERROR CAUGHT: ' + e.message);
end;
finally
Response.Free;
RR.Free;
end;
end;
end;
Without the patch errors in the tail end of the URL are caught, but errors in the domain name just trigger an empty error message.
With the patch those are also caught.
I have a reported the issue in the RAD Studio Quality Portal under number RSP-21862
Use at your own risk and please report any additional findings.
Addition: The issue was fixed in Dec 2018 in Delphi 10.3 Rio and the Quality Portal issue was closed with the following remark:
In RAD Studio 10.3 the implementation of THTTPReqResp was changed and replaced with THTTPClient. So, this issue no longer applies.
I have not verified this.

How to implement Drag & Drop from outlook mail or thunderbird to a delphi form?

Does anyone already implemented Drag & Drop of email messages from Outlook and/or Thunderbird (from now on "OT") to a Delphi form.
I need to give the user a way to store important emails in my application database without writing OT plugins. Currently they use this technique:
from OT they click on an email,
save as...
save on desktop or temp folder,
drag and drop the saved file on the Delphi form.
While after the modification I want to do:
from OT they click on an email,
drag and drop the saved file on the Delphi form.
So basically I implemented drag & drop from explorer. I need an extra layer that allows my application to see the email originally on OT as a normal file, so I can drag from OT as if it was a normal windows explorer window.
Note: I don't need to support all OT versions. I can accept not to support Outlook 2003 (for example) but not 2010. So in case the technique will not work automatically for all OT versions I will prefer the one that works with the latest.
Final note: It is obvious anyway I am interested only in dragging & dropping emails (and not Outlook Calendar items, for example). An idea would be dragging and Dropping attachments too. But this could be an extra improvement for the future.
First of all, if you can find a ready made library that does this out of the box (like the one suggested by ldsandon) use it, because doing all of this by hand is painful and frustrating. The documentation is at times incomplete and might contain bugs: you'll end up doing stuff by trial and error and Google will not save you because not a lot of people delve into the depths of Ole drag-and-drop, and most of those that do will probably use ready-made code.
How to do this in plain Pascal
Theoretically the API that's used to make your application handle OLE drops is very simple. All you need to do is provide a implementation of the IDropTarget interface that does what you need and call RegisterDragDrop providing the handle for your application's window and the interface.
Here's how my implementation looks like:
TDropTargetImp = class(TInterfacedObject, IDropTarget)
public
function DragEnter(const dataObj: IDataObject; grfKeyState: Longint; pt: TPoint; var dwEffect: Longint): HResult; stdcall;
function DragOver(grfKeyState: Longint; pt: TPoint; var dwEffect: Longint): HResult; stdcall;
function DragLeave: HResult; stdcall;
function Drop(const dataObj: IDataObject; grfKeyState: Longint; pt: TPoint; var dwEffect: Longint): HResult; stdcall;
end;
The implementation of DragEnter, DragOver and DragLeave is trivial, considerring I'm doing this for an experiment: I'll just accept everything:
function TDropTargetImp.DragEnter(const dataObj: IDataObject; grfKeyState: Integer; pt: TPoint; var dwEffect: Integer): HResult;
begin
dwEffect := DROPEFFECT_COPY;
Result := S_OK;
end;
function TDropTargetImp.DragLeave: HResult;
begin
Result := S_OK;
end;
function TDropTargetImp.DragOver(grfKeyState: Integer; pt: TPoint; var dwEffect: Integer): HResult;
begin
dwEffect := DROPEFFECT_COPY;
Result := S_OK;
end;
The real work will be done in TDropTargetImp.Drop.
function TDropTargetImp.Drop(const dataObj: IDataObject; grfKeyState: Integer; pt: TPoint; var dwEffect: Integer): HResult;
var iEnum: IEnumFORMATETC;
DidRead:LongInt;
F: TFormatEtc;
STG:STGMEDIUM;
Response:Integer;
Stream:IStream;
Storage: IStorage;
EnumStg: IEnumStatStg;
ST_TAG: STATSTG;
FileStream: TFileStream;
Buff:array[0..1023] of Byte;
begin
if dataObj.EnumFormatEtc(DATADIR_GET, iEnum) = S_OK then
begin
{
while (iEnum.Next(1, F, #DidRead) = S_OK) and (DidRead > 0) do
begin
GetClipboardFormatName(F.cfFormat, FormatName, SizeOf(FormatName));
ShowMessage(FormatName + ' : ' + IntToHex(F.cfFormat,4) + '; lindex=' + IntToStr(F.lindex));
end;
}
ZeroMemory(#F, SizeOf(F));
F.cfFormat := $C105; // CF_FILECONTENTS
F.ptd := nil;
F.dwAspect := DVASPECT_CONTENT;
F.lindex := 0{-1}; // Documentation says -1, practice says "0"
F.tymed := TYMED_ISTORAGE;
Response := dataObj.GetData(F, STG);
if Response = S_OK then
begin
case STG.tymed of
TYMED_ISTORAGE:
begin
Storage := IStorage(STG.stg);
if Storage.EnumElements(0, nil, 0, EnumStg) = S_OK then
begin
while (EnumStg.Next(1, ST_TAG, #DidRead) = S_OK) and (DidRead > 0) do
begin
if ST_TAG.cbSize > 0 then
begin
Response := Storage.OpenStream(ST_TAG.pwcsName, nil, STGM_READ or STGM_SHARE_EXCLUSIVE, 0, Stream);
if Response = S_OK then
begin
// Dump the stored stream to a file
FileStream := TFileStream.Create('C:\Temp\' + ST_TAG.pwcsName + '.bin', fmCreate);
try
while (Stream.Read(#Buff, SizeOf(Buff), #DidRead) = S_OK) and (DidRead > 0) do
FileStream.Write(Buff, DidRead);
finally FileStream.Free;
end;
end
else
case Response of
STG_E_ACCESSDENIED: ShowMessage('STG_E_ACCESSDENIED');
STG_E_FILENOTFOUND: ShowMessage('STG_E_FILENOTFOUND');
STG_E_INSUFFICIENTMEMORY: ShowMessage('STG_E_INSUFFICIENTMEMORY');
STG_E_INVALIDFLAG: ShowMessage('STG_E_INVALIDFLAG');
STG_E_INVALIDNAME: ShowMessage('STG_E_INVALIDNAME');
STG_E_INVALIDPOINTER: ShowMessage('STG_E_INVALIDPOINTER');
STG_E_INVALIDPARAMETER: ShowMessage('STG_E_INVALIDPARAMETER');
STG_E_REVERTED: ShowMessage('STG_E_REVERTED');
STG_E_TOOMANYOPENFILES: ShowMessage('STG_E_TOOMANYOPENFILES');
else
ShowMessage('Err: #' + IntToHex(Response, 4));
end;
end;
end;
end;
end
else
ShowMessage('TYMED?');
end;
end
else
case Response of
DV_E_LINDEX: ShowMessage('DV_E_LINDEX');
DV_E_FORMATETC: ShowMessage('DV_E_FORMATETC');
DV_E_TYMED: ShowMessage('DV_E_TYMED');
DV_E_DVASPECT: ShowMessage('DV_E_DVASPECT');
OLE_E_NOTRUNNING: ShowMessage('OLE_E_NOTRUNNING');
STG_E_MEDIUMFULL: ShowMessage('STG_E_MEDIUMFULL');
E_UNEXPECTED: ShowMessage('E_UNEXPECTED');
E_INVALIDARG: ShowMessage('E_INVALIDARG');
E_OUTOFMEMORY: ShowMessage('E_OUTOFMEMORY');
else
ShowMessage('Err = ' + IntToStr(Response));
end;
end;
Result := S_OK;
end;
This code accepts the "Drop", looks for some CF_FILECONTENTS, opens it up as TYMED_ISTORAGE, drops every single stream in that storage to a file in C:\Temp\<stream_name>.bin; I tried this with Delphi 2010 and Outlook 2007, it works all right: Opening up those saved files (lots of them!) I can find everything from the email message, in unexpected ways. I'm sure there's documentation somewhere that explains exactly what every one of those files is supposed to contain, but I don't really care about accepting drag-and-dropped files from Outlook so I didn't look to far. Again, ldsandon's link looks promising.
This codes looks fairly short, but that's not the source of the difficulties. The documentation for this was really lacking; I hit road blocks at every corner, starting with this:
F.lindex := 0{-1}; // Documentation says -1, practice says "0"
Msdn's documentation clear says the only valid value for "lindex" is -1: guess what, -1 doesn't work, 0 does!
Then there's this short line of code:
Response := Storage.OpenStream(ST_TAG.pwcsName, nil, STGM_READ or STGM_SHARE_EXCLUSIVE, 0, Stream);
specifically, those two consts:
STGM_READ or STGM_SHARE_EXCLUSIVE
getting that combination was a matter of trial-and-error. I don't like trial and error: Is that the optimal combination of flags for what I want? Will that work on every platform? I don't know...
Then there's the matter of making heads or tail of the actual content received from Outlook. For example the SUBJECT of the email was found in this stream: __substg1.0_800A001F. The body of the message was found in this stream: __substg1.0_1000001F. For an simple email message I got 59 streams of non-zero size.
You have to use OLE Drag&Drop, but then you have to be able to process the data you receive, because each application can store data in its own format. You can find a nice Delphi implementation of OLE Drag&Drop here
Outlook Express and Thunderbird should transfer you data in RFC2822 format or something alike, Outlook will probably transfer data in its own message format, it shuould have been documented as part of the Microsoft Open Specification program

how to retrieve text from website text boxes

how can i retrieve text from a web sites text boxes in delphi for example suppose i type
''tiger'' in google's search box how do i retrieve text from that search box would wm_gettext or getwindowtext work?
i am using delphi 7
try this code.
Only works with internet explorer. tested in Windows Vista,IE8 y Delphi 2007.
Uses
SHDocVw,
mshtml;
procedure GetTextFromEditIExplorer(ListStr: TStringList);
var
ShellWindow : IShellWindows;
Web_Browser : IWebbrowser2;
reg_Shell_window : IDispatch;
Dummy : IHTMLDocument2;
ovElements : OleVariant;
Document : Variant;
WindowsCount : Integer;
ElementsCount : Integer;
FormsCount : Integer;
begin
ShellWindow := CoShellWindows.Create; //Provides access to the collection of open Shell windows
for WindowsCount := 0 to ShellWindow.Count do //iterate through number of windows in the Shell windows collection
begin
reg_Shell_window := ShellWindow.Item(WindowsCount); //Returns the registered Shell window for a specified index.
if reg_Shell_window = nil then Continue; //go to next reg window
reg_Shell_window.QueryInterface(iWebBrowser2, Web_Browser); // determines if an interface can be used with an object
if Web_Browser <> nil then
begin
Web_Browser.Document.QueryInterface(IHTMLDocument2, Dummy);
if Dummy <> nil then
begin
Web_Browser := ShellWindow.Item(WindowsCount) as IWebbrowser2;
Document := Web_Browser;
for FormsCount := 0 to Document.forms.Length - 1 do
begin
ovElements := Document.forms.Item(FormsCount).elements;
for ElementsCount := 0 to ovElements.Length - 1 do
begin
try
if (CompareText(ovElements.item(ElementsCount).tagName, 'INPUT') = 0) and (CompareText(ovElements.item(ElementsCount).type, 'text') = 0) then
ListStr.Add('Control Name ['+ovElements.item(ElementsCount).Name+']'+' Type -> '+ovElements.item(ElementsCount).Type+' -> Value ['+ovElements.item(ElementsCount).Value+']');
except
ListStr.Add('Error Reading element n° '+IntToStr(ElementsCount));
end;
end;
end;
end;
end;
end;
end;
procedure TForm1.btn1Click(Sender: TObject);
var
List : TStringList;
begin
List:=TStringList.Create;
GetTextFromEditIExplorer(List);
ShowMessage(List.Text);
end;
Edit :
Omar unfortunately there is no simple solution to your problem. This is because each browser uses a different interface to interact with the information.
Here are a couple of suggestions
Firefox uses XPCOM, you can research about that.
Try using DDE (dynamic data exchange).
You can use WatiN, is a .net library wich work with firefox and iexplorer. you can see this article to learn how interact with a .net Assembly in delphi win32.
Bye.
The simplest way would be by using a regular expression on the original HTML code from the page.

Resources