I am trying to retrieve the <table><tbody> section of this page:
http://www.mfinante.ro/infocodfiscal.html?captcha=null&cod=18505138
I am using Delphi XE7.
I tried using IXMLHttpRequest, WinInet (InternetOpenURL(), InternetReadFile()), TRestClient/TRestRequest/TRestResponse, TIdHTTP.Get(), but all they retrieve is some gibberish, like this:
<html><head><meta http-equiv="Pragma" content="no-cache"/>'#$D#$A'<meta http-equiv="Expires" content="-1"/>'#$D#$A'<meta http-equiv="CacheControl" content="no-cache"/>'#$D#$A'<script>'#$D#$A'(function(){p={g:"0119a4477bb90c7a81666ed6496cf13b5aad18374e35ca73f205151217be1217a93610c5877ece5575231e088ff52583c46a8e8807483e7185307ed65e",v:"87696d3d40d846a7c63fa2d10957202e",u:"1",e:"1",d:"1",a:"challenge etc.
Look at this code for example:
program htttpget;
{$APPTYPE CONSOLE}
{$R *.res}
uses
SysUtils, HTTPApp, IdHTTP, ActiveX;
var
CoResult: Integer;
HTTP: TIdHTTP;
Query: String;
Buffer: String;
begin
try
CoResult := CoInitializeEx(nil, COINIT_MULTITHREADED);
if not((CoResult = S_OK) or (CoResult = S_FALSE)) then
begin
Writeln('Failed to initialize COM library.');
Exit;
end;
HTTP := TIdHTTP.Create;
Query := 'http://www.mfinante.ro/infocodfiscal.html?captcha=null' +
'&cod=18505138';
Buffer := HTTP.Get(Query);
writeln(Buffer);
HTTP.Destroy;
except
end;
end.
What is wrong with this page? I haven not done very many "get" functions in my life, but other websites return normal responses. Can someone at least clarify to me why this isn't working?
Are there other ways to get the content of this web page? Are there other programming languages (Java, scripting, etc) that can do this without third party software (like using Firefox source code to emulate a browser, fetch the page, without showing the window, and then copy the content).
You can use TWebBrowser for this.
See this post: How can I get HTML source code from TWebBrowser
The answer by RRUZ, which you can find in many places on the internet, is not what you are looking for. This gives you are original html source, as would IdHttp.Get().
However, the answer by Mehmet Fide will give you the HTML source of the DOM, which is what you are looking for.
I offer a variation here. (It includes some hacks that were required at the time to get full DOCTYPE. Not sure if they are still needed...)
function EndStr(const S: String; const Count: Integer): String;
var
I: Integer;
Index: Integer;
begin
Result := '';
for I := 1 to Count do
begin
Index := Length(S)-I+1;
if Index > 0 then
Result := S[Index] + Result;
end;
end;
function GetHTMLDocumentSource(WebBrowser: TWebBrowser; var Charset: String):
String;
var
Element: IHTMLElement;
Node: IHTMLDomNode;
Document: IHTMLDocument2;
I: Integer;
S: String;
begin
Result := '';
Document := WebBrowser.Document as IHTMLDocument2;
For I := 0 to Document.all.length -1 do
begin
Element := Document.all.item(I, 0) as IHTMLElement;
If Element.tagName = '!' Then
begin
Node := Element as IHTMLDomNode;
If (Node <> nil) and (Pos('CTYPE', UpperCase(Node.nodeValue)) > 0) Then
begin
S := VarToStr(Node.nodeValue); { don't change case of result }
if Copy(Uppercase(S), 1, 5) = 'CTYPE' then
S := 'DO' + S;
if Copy(Uppercase(S), 1, 7) = 'DOCTYPE' then
S := '<!' + S;
if Uppercase(S) = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//E' then
S := S +'N">';
if EndStr(Lowercase(S), 3) = '.dt' then
S := S + 'd"';
if EndStr(Lowercase(S), 5) = '.dtd"' then
S := S + '>';
Result := Result + S;
end;
end
Else
Result := Result + Element.outerHTML;
If Element.tagName = 'HTML' Then
Break;
end;
Charset := Document.charset;
end;
So call WebBrowser.Navigate(URL), then in OnDocumentComplete event retrieve the Html Source.
However, with your URL you will see the OnDocumentComplete event fires twice :(, so you need to get the Html from the last fire.
You can refer to this post How do I avoid the OnDocumentComplete event for embedded iframe elements? for info on how to get the final OnDocumentComplete event. However, I tried it and it was not working for me. You may need to use some other strategy to get the last event.
Not sure of your needs, but you may also optimize this process by disabling WebBrowser from downloading images. I believe that is possible.
This is normal, you have indeed retrieved the content correctly. What happens in your browser is that the script is executed and the page gets built client side. If you wish to replicate that in your code, then you will need to do the same. Execute the script exactly as the browser would.
What you are really looking for here is what is known as a headless browser. Integrate one of those into your program. Then get the headless browser to process the request, including executing scripts. When it has done executing scripts, read the modified content of the page.
Related
I've been working with Word2010.pas for the past week and everything went well, until I found out that if you open a document manually, edit it (but don't save), press Alt+F4, a prompt will show up saying if you want to save your document or not, leave it like that. Go into code and try to access that document, all calls will result in EOleException: Call was rejected by callee. Once you cancel that Word save prompt, everything works fine.
I came across this while writing code that periodically checks if a document is open. Here is the function that checks if the document is open: (function runs in a timer every 2 seconds)
function IsWordDocumentOpen(FileName: string): Boolean;
var
WordApp: TWordApplication;
I: Integer;
begin
Result := False;
try
WordApp := TWordApplication.Create(nil);
try
WordApp.Connect;
for I := 1 to WordApp.Documents.Count do
begin
try
if WordApp.Documents.Item(I).FullName = FileName then
begin
Result := True;
System.Break;
end;
except
on E: EOleException do
// I always end up here while the document has the prompt
end;
end;
finally
FreeAndNil(WordApp);
end;
finally
//
end;
end;
Does anyone have any experience with this? Is there some sort of a lock that I'm not aware of?
UPDATE #1: So far the only solution I could find was to implement IOleMessageFilter, this way I do not receive any exceptions but the program stops and waits on the line WordApp.Documents.Item(I).FullName, but that is not what I want. Implementation of IOleMessageFilter goes like this:
type
IOleMessageFilter = class(TInterfacedObject, IMessageFilter)
public
function HandleInComingCall(dwCallType: Longint; htaskCaller: HTask;
dwTickCount: Longint; lpInterfaceInfo: PInterfaceInfo): Longint;stdcall;
function RetryRejectedCall(htaskCallee: HTask; dwTickCount: Longint;
dwRejectType: Longint): Longint;stdcall;
function MessagePending(htaskCallee: HTask; dwTickCount: Longint;
dwPendingType: Longint): Longint;stdcall;
procedure RegisterFilter();
procedure RevokeFilter();
end;
implementation
function IOleMessageFilter.HandleInComingCall(dwCallType: Integer; htaskCaller: HTask; dwTickCount: Integer; lpInterfaceInfo: PInterfaceInfo): Longint;
begin
Result := 0;
end;
function IOleMessageFilter.MessagePending(htaskCallee: HTask; dwTickCount, dwPendingType: Integer): Longint;
begin
Result := 2 //PENDINGMSG_WAITDEFPROCESS
end;
procedure IOleMessageFilter.RegisterFilter;
var
OldFilter: IMessageFilter;
NewFilter: IMessageFilter;
begin
OldFilter := nil;
NewFilter := IOleMessageFilter.Create;
CoRegisterMessageFilter(NewFilter,OldFilter);
end;
function IOleMessageFilter.RetryRejectedCall(htaskCallee: HTask; dwTickCount, dwRejectType: Integer): Longint;
begin
Result := -1;
if dwRejectType = 2 then
Result := 99;
end;
procedure IOleMessageFilter.RevokeFilter;
var
OldFilter: IMessageFilter;
NewFilter: IMessageFilter;
begin
OldFilter := nil;
NewFilter := nil;
CoRegisterMessageFilter(NewFilter,OldFilter);
end;
end;
BEST SOLUTION SO FAR: I used IOleMessageFilter implementation like this: (remember this will stop and wait on the line where I previously got an exception)
function IsWordDocumentOpen(FileName: string): Boolean;
var
OleMessageFilter: IOleMessageFilter;
WordApp: TWordApplication;
I: Integer;
begin
Result := False;
try
OleMessageFilter := IOleMessageFilter.Create;
OleMessageFilter.RegisterFilter;
WordApp := TWordApplication.Create(nil);
try
WordApp.Connect;
for I := 1 to WordApp.Documents.Count do
begin
if WordApp.Documents.Item(I).FullName = FileName then
begin
Result := True;
System.Break;
end;
end;
finally
OleMessageFilter.RevokeFilter;
FreeAndNil(WordApp);
FreeAndNil(OleMessageFilter);
end;
finally
//
end;
end;
Actually, I think that the problem is simply that Word is busy doing a modal dialog and so can't respond to external COM calls. This trivial code produces the same error:
procedure TForm1.Button1Click(Sender: TObject);
begin
Caption := MSWord.ActiveDocument.Name;
end;
Probably the simplest way to avoid this problem is to head it off before if happens. If you are using the TWordApplication server that comes with Delphi (on the Servers components tab), you can attach an event handler to its OnDocumentBeforeClose and use that to present your own "Save Y/N?" dialog and set the event's Cancel param to True to prevent Word's dialog from appearing.
Update: If you try experimenting with this code while the Save dialog is popped up
procedure TForm1.Button1Click(Sender: TObject);
var
vWin,
vDoc,
vApp : OleVariant;
begin
vWin := MSWord.ActiveWindow;
Caption := vWin.Caption;
vDoc := vWin.Document;
vApp := vDoc.Application; // Attempt to read Word Document property
Caption := vDoc.Path + '\';
Caption := Caption + vDoc.Name;
end;
I think you'll find that any attempt to read from the vDoc object will result in the "Call was rejected ..." message, so I am beginning to think that this behaviour is by design - it's telling you that the object is not in a state that it can be interacted with.
Interestingly, it is possible to read the Caption property of the vWin Window object, which will tell you the filename of the file but not the file's path.
Realistically, I still think your best option is to try and get the OnDocumentBeforeClose event working. I don't have Word 2010 installed on this machine by Word 2007 works fine with the Word server objects derived from Word2000.Pas so you might try those instead of Word2010.Pas, just to see.
Another possibility is simply to catch the "Call was rejected ..." exception, maybe return "Unavailable" as the document FullName, and try again later.
If you're not using TWordApplication and don't know how to catch the OnDocumentBeforeClose for the method your using to access Word, let me know how you are accessing it and I'll see if I can dig out some code to do it.
I vaguely recall there's a way of detecting that Word is busy with a modal dialog - I'll see if I can find where I saw that a bit later if you still need it. Your IOleMessageFilter looks more promising than anything I've found as yet, though.
I subscribe to a secure https web page containing a button that downloads some data as csv. I am trying to automate the download without the 'save as' dialog appearing but always seem to get an empty file downloaded. I suspect it has something to do with file type I'm using with IdHttp as most of my code works correctly.
Please can anyone help with my use of IdHttp or see where else I am going wrong?
The download button on the site calls some javascript to perform the download as follows
<a class="dlCSV" href="javascript:void(0);" onclick="dl_module.DownloadCsv();return false;">Download in CSV format…</a>
In Delphi I use a TWeb browser to log on securely and navigate to the page.
Clicking the download button in the TwebBrowser by hand shows the 'save as' dialog and then correctly downloads the csv data, defaulting to the filename 'data.csv'.
Automating clicking the button using execScript (below) also works, again showing the 'save as' dialog and correctly downloading the data with the same default filename.
procedure TForm1.BtnClickDownloadbuttonClick(Sender: TObject);
var TheDocument : IHTMLDocument2; // current HTML document
HTMLWindow: IHTMLWindow2; // parent window of current HTML document
begin
TheDocument := WebBrowser1.Document as IHTMLDocument2; // Get reference to current document
if not Assigned(TheDocument) then
Exit;
HTMLWindow := TheDocument.parentWindow; // Get parent window of current document
if Assigned(HTMLWindow) then
try
HTMLWindow.execScript('dl_module.DownloadCsv()', 'JavaScript'); // execute JS function to do download
except
on E : Exception do
begin
showmessage ('Exception class name = '+E.ClassName+ slinebreak
+ 'Exception message = '+E.Message);
end //on E
end;
end;
Then I added TLama's code from here How do I keep an embedded browser from prompting where to save a downloaded file? to use IDownloadManager to intercept the download and prevent the 'save as' dialog. This is where it seems to go wrong as I then get an empty file downloaded, and not with the name data.csv.
My code for function TWebBrowser.Download, TWebBrowser.InvokeEvent, function TWebBrowser.QueryService and TForm1.FormCreate are identical to that provided by TLama in the link above.
My procedure TForm1.Button1Click is the same except that I changed the download function being called to the one on my page by changing the line
HTMLWindow.execScript('SRT_stocFund.Export()', 'JavaScript');
to
HTMLWindow.execScript('dl_module.DownloadCsv()', 'JavaScript');
and my procedure TForm1.BeforeFileDownload is identical except that because I'm on a secure site I added the variable
var
LHandler: TIdSSLIOHandlerSocketOpenSSL; //<< on a https site
and after creating the Filestream I added the lines
LHandler := TIdSSLIOHandlerSocketOpenSSL.Create(nil);
IdHTTP.IOHandler := LHandler;
The issue seems to be in procedure TForm1.BeforeFileDownload where I note that the value of FileSource is
https://www.the_web_site_name/Ashx/GenericCSV.ashx.
There is a short delay while IdHTTP.Get(FileSource, FileStream); executes and then a file is created on my hard disc but called 'GenericCSV.ashx' (not data.csv) and the file is zero bytes long and completely empty.
Any ideas why its not downloading the file called data.csv (Do I somehow have to execute GenericCSV.ashx as well? if so how?)
For info here is my version of procedure TForm1.BeforeFileDownload
procedure TForm1.BeforeFileDownload(Sender: TObject; const FileSource: WideString; var Allowed: Boolean);
var
IdHTTP: TIdHTTP;
FileTarget: string;
FileStream: TMemoryStream;
LHandler: TIdSSLIOHandlerSocketOpenSSL; // added as its a https site
begin
FileSourceEdit.Text := FileSource;
Allowed := ShowDialogCheckBox.Checked;
if not Allowed then
try
IdHTTP := TIdHTTP.Create(nil);
try
FileStream := TMemoryStream.Create;
LHandler := TIdSSLIOHandlerSocketOpenSSL.Create(nil); //<<< added as its a https site
IdHTTP.IOHandler := LHandler; //<<< added as its a https site
try
IdHTTP.HandleRedirects := True;
IdHTTP.Get(FileSource, FileStream);
FileTarget := IdHTTP.URL.Document;
if FileTarget = '' then
FileTarget := 'File';
FileTarget := ExtractFilePath(ParamStr(0)) + FileTarget;
FileStream.SaveToFile(FileTarget);
finally
FileStream.Free;
end;
finally
IdHTTP.Free;
end;
ShowMessage('Downloading finished! File has been saved as:' + sLineBreak +
FileTarget);
except
on E: Exception do
ShowMessage(E.Message);
end;
end;
After you login, you can use this code to retrieve cookies from TWebBrowser
procedure GetHttpOnlyCookie(const AUrl: string; var ACookies: string);
const
INTERNET_COOKIE_HTTPONLY = 8192;
var
i: Integer;
hModule: THandle;
InternetGetCookieEx: function(lpszUrl, lpszCookieName, lpszCookieData
: PAnsiChar; var lpdwSize: DWORD; dwFlags: DWORD; lpReserved: pointer)
: BOOL; stdCall;
CookieSize: DWORD;
CookieData: PAnsiChar;
begin
LoadLibrary('wininet.dll');
hModule := GetModuleHandle('wininet.dll');
if (hModule <> 0) then
begin
#InternetGetCookieEx := GetProcAddress(hModule, 'InternetGetCookieExA');
if (#InternetGetCookieEx <> nil) then
begin
CookieSize := 1024;
Cookiedata := AllocMem(CookieSize);
try
if InternetGetCookieEx(PAnsiChar(AUrl), nil, Cookiedata, CookieSize, INTERNET_COOKIE_HTTPONLY, nil) then
begin
ACookies:=CookieData;
end;
finally
FreeMem(Cookiedata);
end;
end;
end;
end;
Then you just parse your cookies and add them (you have to create CookieManager in IdHTTP first)
IdHTTP1.CookieManager.AddServerCookie();
Then you start your download and it should work if you passed all parameters correctly (unfortunately, it is not possible to find out what your site requires).
Thank you smooty86 but I think its time I gave up trying to doing it this way and simply parse the page I can see.
I don't mind trying to understand code and adapting it to my needs but its so much harder trying to follow hints and suggestions when I'm working in the dark and especially don't know what parameters are needed everywhere. (I'm not daft, I've been programming for nearly 30 years and have spent over 4 years developing this particular data processing application but rarely touch web stuff)
However, the progress so far is...
Running your GetHttpOnlyCookie code after a successful login using automated filling in of the fields and clicking the login button returned an empty string so I used this code instead that at least seemed to return something that looked a little similar to your cookie string, ie seveveral strings separated by semicolons, most being name=value. (IdCookieManager1 is connected to IdHttp)
CookieList := Tstringlist.Create ;
try
CookieList.Delimiter := ';' ;
document := WebBrowser1.Document as IHTMLDocument2;
CookieList.DelimitedText := document.cookie;
for i := 0 to CookieList.Count-1 do
IdCookieManager1.AddCookie(CookieList[i],LOGIN_URL)
finally
CookieList.Free;
end;
Then in my original procedure BeforeFileDownload I try to log IdHttp into the site as well using code I adapted from here Log in to website from Delphi and the the cookies held in the cookie manager.
Displaying the string returned showed lots of HTML that appeared to represent the oringinal log in page and not the page you see after log in
procedure TFrmInportGrades.BeforeFileDownload(Sender: TObject; const FileSource: WideString; var Allowed: Boolean);
var
FileTarget: string;
FileStream: TMemoryStream;
request : Tstringlist;
s : string;
begin
FileSourceEdit.Text := FileSource;
Allowed := ShowDialogCheckBox.Checked;
if not Allowed then
begin
try
FileStream := TMemoryStream.Create;
IdHTTP.CookieManager := IdCookieManager1;
s := LogInIdHttp; //<<<< log in the IdHttp
showmessage(s); //<<<< debug
IdHTTP.Get(FileSource, FileStream);
FileTarget := IdHTTP.URL.Document;
if FileTarget = '' then
FileTarget := 'File';
FileTarget := ExtractFilePath(ParamStr(0)) + FileTarget;
FileStream.SaveToFile(FileTarget);
finally
FileStream.Free;
end;
ShowMessage('Downloading finished! File has been saved as:' + sLineBreak +
FileTarget);
end;
end;
The login code I used is below but I don't really know what I am doing here or what needs to be put into the Request.Add() parameters. I used 'Inspect element' from firefox to get the name of the user and password boxes and put the correct users name and password after the '=' sign in lines {3} and {4}. In lines {2},{6} and {7} I put the url of the log in site. I've no idea what lines {1}, {2}, {5} do or even if they are correct or necessary
function TFrmInportGrades.LogInIdHttp: string;
var
Request: TStringList;
Response: TMemoryStream;
LHandler: TIdSSLIOHandlerSocketOpenSSL; // added as its a https site
begin
Result := '';
try
Response := TMemoryStream.Create;
try
Request := TStringList.Create;
try
{1} Request.Add('op=login');
{2} Request.Add('redirect=https://www.thewebsite.com/Login.aspx' );
{3} Request.Add('ctl00$ctl00$Body$Body$loginManager$ctl00$loginEmailInput=usernme');
{4} Request.Add('ctl00$ctl00$Body$Body$loginManager$ctl01$passwordInput=password'});
LHandler := TIdSSLIOHandlerSocketOpenSSL.Create(nil); //<<< added as its a https site
IdHTTP.IOHandler := LHandler; //<<< added as its a https site
IdHTTP.AllowCookies := True;
IdHTTP.HandleRedirects := True;
{5} IdHTTP.Request.ContentType := 'application/x-www-form-urlencoded';
{6} IdHTTP.Post('https://www.thewebsite.com/Login.aspx', Request, Response);
{7} Result := IdHTTP.Get('https://www.thewebsite.com/Login.aspx');
finally
Request.Free;
end;
finally
Response.Free;
end;
except
on E: Exception do
ShowMessage(E.Message);
end;
end;
The net result of all this is that I don't get a file created at all now, not even a zero byte one. This all seems very overcomplicated simply to avoid or automate the 'Save As' dialog and is requiring lots of code that I won't be able to maintan afterwards. Unless somebody has a simpler solution I'll just parse what I can see (BTW I tried TEmbeddedWebBrowser but there is so little documentation for it I couldn't see how to make it download correctly. Might try again later.) Thank you for trying to help!
I am trying to extract URLs from Google search results. I use Indy IdHTTP to get HTML results from Google, and I use Achmad Z's code for getting the link hrefs from the page. How can I get the real link target for each URL instead of the one that goes through Google's redirector?
I tried that but I get an "Operand no applicable" error in this part of the code:
function ToUTF8Encode(str: string): string;
var
b: Byte;
begin
for b in BytesOf(UTF8Encode(str)) do
begin
Result := Format('%s%s%.2x', [Result, '%', b]);
end;
end;
I use Delphi 7 with Indy 9.00.10. Maybe indy update will help ?
If I get it right you are trying to fetch the Google search results using TIdHTTP.Get method. If so, thenyou should definitely focus on some Google Search API implementation because
it's impossible to fetch the results this way because you don't have any access to the document inside the iframe in which the search results are, so you won't get any search results by using HTTP GET in this case (or at least I haven't heard about the request which can do that)
it's against Google policies and you should use proper Google Search API instead, for instance Google SOAP Search API, there are available also several types of Google Search API's for various purposes
You can find e.g. here the Delphi wrapper with the demo code for Google Search API. I've tested it with Delphi 2009 on Windows 7/64 and it works fine for me.
In the previous post here I've tried to explain why you should use Google Search API, in this one I'll try to provide you an example with a hope it will work in your Delphi 7.
You need to have the SuperObject (JSON parser for Delphi), I've used this version (latest at this time). Then you need Indy; the best would be to upgrade to the latest version if possible. I've used the one shipped with Delphi 2009, but I think the TIdHTTP.Get method is so important that it must work fine also in your 9.00.10 version.
Now you need a list box and a button on your form, the following piece of code and a bit of luck (for compatibility :)
The URL request building you can see for instance in the DxGoogleSearchApi.pas mentioned before but the best is to follow the Google Web Search API reference. In DxGoogleSearchApi.pas you can take the inspiration e.g. how to fetch several pages.
So take this as an inspiration
uses
IdHTTP, IdURI, SuperObject;
const
GSA_Version = '1.0';
GSA_BaseURL = 'http://ajax.googleapis.com/ajax/services/search/';
procedure TForm1.GoogleSearch(const Text: string);
var
I: Integer;
RequestURL: string;
HTTPObject: TIdHTTP;
HTTPStream: TMemoryStream;
JSONResult: ISuperObject;
JSONResponse: ISuperObject;
begin
RequestURL := TIdURI.URLEncode(GSA_BaseURL + 'web?v=' + GSA_Version + '&q=' + Text);
HTTPObject := TIdHTTP.Create(nil);
HTTPStream := TMemoryStream.Create;
try
HTTPObject.Get(RequestURL, HTTPStream);
JSONResponse := TSuperObject.ParseStream(HTTPStream, True);
if JSONResponse.I['responseStatus'] = 200 then
begin
ListBox1.Items.Add('Search time: ' + JSONResponse.S['responseData.cursor.searchResultTime']);
ListBox1.Items.Add('Fetched count: ' + IntToStr(JSONResponse['responseData.results'].AsArray.Length));
ListBox1.Items.Add('Total count: ' + JSONResponse.S['responseData.cursor.resultCount']);
ListBox1.Items.Add('');
for I := 0 to JSONResponse['responseData.results'].AsArray.Length - 1 do
begin
JSONResult := JSONResponse['responseData.results'].AsArray[I];
ListBox1.Items.Add(JSONResult.S['unescapedUrl']);
end;
end;
finally
HTTPObject.Free;
HTTPStream.Free;
end;
end;
procedure TForm1.Button1Click(Sender: TObject);
begin
GoogleSearch('Delphi');
end;
Answer to my question , maybe it can help someone:
Fetching web page :
memo1.Lines.Text := idhttp1.Get('http://ajax.googleapis.com/aja...tart=1&rsz=large&q=max');
extracting URL's :
function ExtractText(const Str, Delim1, Delim2: string; PosStart: integer; var PosEnd: integer): string;
var
pos1, pos2: integer;
begin
Result := '';
pos1 := PosEx(Delim1, Str, PosStart);
if pos1 > 0 then
begin
pos2 := PosEx(Delim2, Str, pos1 + Length(Delim1));
if pos2 > 0 then
begin
PosEnd := pos2 + Length(Delim2);
Result := Copy(Str, pos1 + Length(Delim1), pos2 - (pos1 + Length(Delim1)));
end;
end;
end;
And on Button1 just put :
procedure TForm1.Button1Click(Sender: TObject);
var Pos: integer;
sText: string;
begin
sText := ExtractText(Memo1.Lines.Text, '"url":"', '","visibleUrl"', 1, Pos);
while sText <> '' do
begin
Memo2.Lines.Add(sText);
sText := ExtractText(Memo1.Lines.Text, '"url":"', '","visibleUrl"', Pos, Pos);
end;
end;
www.delphi.about.com has nice documentation on string manipulation , Zarko Gajic does great job on that site :)
NOTE: if google changes it's source this will be useless.
Does anyone already implemented Drag & Drop of email messages from Outlook and/or Thunderbird (from now on "OT") to a Delphi form.
I need to give the user a way to store important emails in my application database without writing OT plugins. Currently they use this technique:
from OT they click on an email,
save as...
save on desktop or temp folder,
drag and drop the saved file on the Delphi form.
While after the modification I want to do:
from OT they click on an email,
drag and drop the saved file on the Delphi form.
So basically I implemented drag & drop from explorer. I need an extra layer that allows my application to see the email originally on OT as a normal file, so I can drag from OT as if it was a normal windows explorer window.
Note: I don't need to support all OT versions. I can accept not to support Outlook 2003 (for example) but not 2010. So in case the technique will not work automatically for all OT versions I will prefer the one that works with the latest.
Final note: It is obvious anyway I am interested only in dragging & dropping emails (and not Outlook Calendar items, for example). An idea would be dragging and Dropping attachments too. But this could be an extra improvement for the future.
First of all, if you can find a ready made library that does this out of the box (like the one suggested by ldsandon) use it, because doing all of this by hand is painful and frustrating. The documentation is at times incomplete and might contain bugs: you'll end up doing stuff by trial and error and Google will not save you because not a lot of people delve into the depths of Ole drag-and-drop, and most of those that do will probably use ready-made code.
How to do this in plain Pascal
Theoretically the API that's used to make your application handle OLE drops is very simple. All you need to do is provide a implementation of the IDropTarget interface that does what you need and call RegisterDragDrop providing the handle for your application's window and the interface.
Here's how my implementation looks like:
TDropTargetImp = class(TInterfacedObject, IDropTarget)
public
function DragEnter(const dataObj: IDataObject; grfKeyState: Longint; pt: TPoint; var dwEffect: Longint): HResult; stdcall;
function DragOver(grfKeyState: Longint; pt: TPoint; var dwEffect: Longint): HResult; stdcall;
function DragLeave: HResult; stdcall;
function Drop(const dataObj: IDataObject; grfKeyState: Longint; pt: TPoint; var dwEffect: Longint): HResult; stdcall;
end;
The implementation of DragEnter, DragOver and DragLeave is trivial, considerring I'm doing this for an experiment: I'll just accept everything:
function TDropTargetImp.DragEnter(const dataObj: IDataObject; grfKeyState: Integer; pt: TPoint; var dwEffect: Integer): HResult;
begin
dwEffect := DROPEFFECT_COPY;
Result := S_OK;
end;
function TDropTargetImp.DragLeave: HResult;
begin
Result := S_OK;
end;
function TDropTargetImp.DragOver(grfKeyState: Integer; pt: TPoint; var dwEffect: Integer): HResult;
begin
dwEffect := DROPEFFECT_COPY;
Result := S_OK;
end;
The real work will be done in TDropTargetImp.Drop.
function TDropTargetImp.Drop(const dataObj: IDataObject; grfKeyState: Integer; pt: TPoint; var dwEffect: Integer): HResult;
var iEnum: IEnumFORMATETC;
DidRead:LongInt;
F: TFormatEtc;
STG:STGMEDIUM;
Response:Integer;
Stream:IStream;
Storage: IStorage;
EnumStg: IEnumStatStg;
ST_TAG: STATSTG;
FileStream: TFileStream;
Buff:array[0..1023] of Byte;
begin
if dataObj.EnumFormatEtc(DATADIR_GET, iEnum) = S_OK then
begin
{
while (iEnum.Next(1, F, #DidRead) = S_OK) and (DidRead > 0) do
begin
GetClipboardFormatName(F.cfFormat, FormatName, SizeOf(FormatName));
ShowMessage(FormatName + ' : ' + IntToHex(F.cfFormat,4) + '; lindex=' + IntToStr(F.lindex));
end;
}
ZeroMemory(#F, SizeOf(F));
F.cfFormat := $C105; // CF_FILECONTENTS
F.ptd := nil;
F.dwAspect := DVASPECT_CONTENT;
F.lindex := 0{-1}; // Documentation says -1, practice says "0"
F.tymed := TYMED_ISTORAGE;
Response := dataObj.GetData(F, STG);
if Response = S_OK then
begin
case STG.tymed of
TYMED_ISTORAGE:
begin
Storage := IStorage(STG.stg);
if Storage.EnumElements(0, nil, 0, EnumStg) = S_OK then
begin
while (EnumStg.Next(1, ST_TAG, #DidRead) = S_OK) and (DidRead > 0) do
begin
if ST_TAG.cbSize > 0 then
begin
Response := Storage.OpenStream(ST_TAG.pwcsName, nil, STGM_READ or STGM_SHARE_EXCLUSIVE, 0, Stream);
if Response = S_OK then
begin
// Dump the stored stream to a file
FileStream := TFileStream.Create('C:\Temp\' + ST_TAG.pwcsName + '.bin', fmCreate);
try
while (Stream.Read(#Buff, SizeOf(Buff), #DidRead) = S_OK) and (DidRead > 0) do
FileStream.Write(Buff, DidRead);
finally FileStream.Free;
end;
end
else
case Response of
STG_E_ACCESSDENIED: ShowMessage('STG_E_ACCESSDENIED');
STG_E_FILENOTFOUND: ShowMessage('STG_E_FILENOTFOUND');
STG_E_INSUFFICIENTMEMORY: ShowMessage('STG_E_INSUFFICIENTMEMORY');
STG_E_INVALIDFLAG: ShowMessage('STG_E_INVALIDFLAG');
STG_E_INVALIDNAME: ShowMessage('STG_E_INVALIDNAME');
STG_E_INVALIDPOINTER: ShowMessage('STG_E_INVALIDPOINTER');
STG_E_INVALIDPARAMETER: ShowMessage('STG_E_INVALIDPARAMETER');
STG_E_REVERTED: ShowMessage('STG_E_REVERTED');
STG_E_TOOMANYOPENFILES: ShowMessage('STG_E_TOOMANYOPENFILES');
else
ShowMessage('Err: #' + IntToHex(Response, 4));
end;
end;
end;
end;
end
else
ShowMessage('TYMED?');
end;
end
else
case Response of
DV_E_LINDEX: ShowMessage('DV_E_LINDEX');
DV_E_FORMATETC: ShowMessage('DV_E_FORMATETC');
DV_E_TYMED: ShowMessage('DV_E_TYMED');
DV_E_DVASPECT: ShowMessage('DV_E_DVASPECT');
OLE_E_NOTRUNNING: ShowMessage('OLE_E_NOTRUNNING');
STG_E_MEDIUMFULL: ShowMessage('STG_E_MEDIUMFULL');
E_UNEXPECTED: ShowMessage('E_UNEXPECTED');
E_INVALIDARG: ShowMessage('E_INVALIDARG');
E_OUTOFMEMORY: ShowMessage('E_OUTOFMEMORY');
else
ShowMessage('Err = ' + IntToStr(Response));
end;
end;
Result := S_OK;
end;
This code accepts the "Drop", looks for some CF_FILECONTENTS, opens it up as TYMED_ISTORAGE, drops every single stream in that storage to a file in C:\Temp\<stream_name>.bin; I tried this with Delphi 2010 and Outlook 2007, it works all right: Opening up those saved files (lots of them!) I can find everything from the email message, in unexpected ways. I'm sure there's documentation somewhere that explains exactly what every one of those files is supposed to contain, but I don't really care about accepting drag-and-dropped files from Outlook so I didn't look to far. Again, ldsandon's link looks promising.
This codes looks fairly short, but that's not the source of the difficulties. The documentation for this was really lacking; I hit road blocks at every corner, starting with this:
F.lindex := 0{-1}; // Documentation says -1, practice says "0"
Msdn's documentation clear says the only valid value for "lindex" is -1: guess what, -1 doesn't work, 0 does!
Then there's this short line of code:
Response := Storage.OpenStream(ST_TAG.pwcsName, nil, STGM_READ or STGM_SHARE_EXCLUSIVE, 0, Stream);
specifically, those two consts:
STGM_READ or STGM_SHARE_EXCLUSIVE
getting that combination was a matter of trial-and-error. I don't like trial and error: Is that the optimal combination of flags for what I want? Will that work on every platform? I don't know...
Then there's the matter of making heads or tail of the actual content received from Outlook. For example the SUBJECT of the email was found in this stream: __substg1.0_800A001F. The body of the message was found in this stream: __substg1.0_1000001F. For an simple email message I got 59 streams of non-zero size.
You have to use OLE Drag&Drop, but then you have to be able to process the data you receive, because each application can store data in its own format. You can find a nice Delphi implementation of OLE Drag&Drop here
Outlook Express and Thunderbird should transfer you data in RFC2822 format or something alike, Outlook will probably transfer data in its own message format, it shuould have been documented as part of the Microsoft Open Specification program
how can i retrieve text from a web sites text boxes in delphi for example suppose i type
''tiger'' in google's search box how do i retrieve text from that search box would wm_gettext or getwindowtext work?
i am using delphi 7
try this code.
Only works with internet explorer. tested in Windows Vista,IE8 y Delphi 2007.
Uses
SHDocVw,
mshtml;
procedure GetTextFromEditIExplorer(ListStr: TStringList);
var
ShellWindow : IShellWindows;
Web_Browser : IWebbrowser2;
reg_Shell_window : IDispatch;
Dummy : IHTMLDocument2;
ovElements : OleVariant;
Document : Variant;
WindowsCount : Integer;
ElementsCount : Integer;
FormsCount : Integer;
begin
ShellWindow := CoShellWindows.Create; //Provides access to the collection of open Shell windows
for WindowsCount := 0 to ShellWindow.Count do //iterate through number of windows in the Shell windows collection
begin
reg_Shell_window := ShellWindow.Item(WindowsCount); //Returns the registered Shell window for a specified index.
if reg_Shell_window = nil then Continue; //go to next reg window
reg_Shell_window.QueryInterface(iWebBrowser2, Web_Browser); // determines if an interface can be used with an object
if Web_Browser <> nil then
begin
Web_Browser.Document.QueryInterface(IHTMLDocument2, Dummy);
if Dummy <> nil then
begin
Web_Browser := ShellWindow.Item(WindowsCount) as IWebbrowser2;
Document := Web_Browser;
for FormsCount := 0 to Document.forms.Length - 1 do
begin
ovElements := Document.forms.Item(FormsCount).elements;
for ElementsCount := 0 to ovElements.Length - 1 do
begin
try
if (CompareText(ovElements.item(ElementsCount).tagName, 'INPUT') = 0) and (CompareText(ovElements.item(ElementsCount).type, 'text') = 0) then
ListStr.Add('Control Name ['+ovElements.item(ElementsCount).Name+']'+' Type -> '+ovElements.item(ElementsCount).Type+' -> Value ['+ovElements.item(ElementsCount).Value+']');
except
ListStr.Add('Error Reading element n° '+IntToStr(ElementsCount));
end;
end;
end;
end;
end;
end;
end;
procedure TForm1.btn1Click(Sender: TObject);
var
List : TStringList;
begin
List:=TStringList.Create;
GetTextFromEditIExplorer(List);
ShowMessage(List.Text);
end;
Edit :
Omar unfortunately there is no simple solution to your problem. This is because each browser uses a different interface to interact with the information.
Here are a couple of suggestions
Firefox uses XPCOM, you can research about that.
Try using DDE (dynamic data exchange).
You can use WatiN, is a .net library wich work with firefox and iexplorer. you can see this article to learn how interact with a .net Assembly in delphi win32.
Bye.
The simplest way would be by using a regular expression on the original HTML code from the page.