How to use mark.js (or any plugin) with WebDriver through Java to highlight all instances of a specified word in a web page

I want to write an automated test that checks spelling on web pages using a web service's API and highlights the words detected as incorrect, so that a screenshot saved to a file can serve as proof in a bug report.
From searching the web, I understand that the only way to highlight an individual word is to use JavaScript via JavascriptExecutor.
I have found out how to highlight web elements that contain incorrect words, but I am unable to highlight individual words.
So, after collecting some information, I took the following steps:
uploaded the mark.js plugin to GitHub
injected the external .js file into the DOM
set a string with the word
called mark.js on the HTML body of the page:
WebDriver driver = new FirefoxDriver();
driver.manage().window().maximize();
driver.get("http://stackoverflow.com/questions/16251505/how-to-highlight-all-text-occurrences-in-a-html-page-with-javascript");
((JavascriptExecutor) driver)
.executeScript("var addscript=window.document.createElement('script');addscript.type='text/javascript';addscript.src='http://github.com/my3tahk/codekeep/blob/master/mark.min.js';document.getElementsByTagName('body')[0].appendChild(addscript);");
((JavascriptExecutor) driver)
.executeScript("return typeof(somefunc)").toString().equals("function");
String word = "text";
((JavascriptExecutor) driver)
.executeScript("var instance = new Mark(document.querySelector('body.context'));instance.mark('"+ word +"', {'element': 'span','className': 'highlight'});");
The console returns:
Exception in thread "main" org.openqa.selenium.WebDriverException: Mark is not defined
My question: how do I correctly use mark.js (or another suggested plugin) in this case?
P.S.: Since I'm a newbie, please give a detailed description with full explanations.

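I used cdn.jsdelivr.net instead of the .js file posted on GitHub, because a GitHub blob URL returns an HTML page rather than the raw script (thanks to @dude).
document.querySelectorAll('body') must be used instead of document.querySelector('body.context'), because 'body.context' matches only a body element that has the class "context".
The resulting working code: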
WebDriver driver = new FirefoxDriver();
driver.manage().window().maximize();
driver.get("http://stackoverflow.com/questions/16251505/how-to-highlight-all-text-occurrences-in-a-html-page-with-javascript");
// Inject mark.js from the CDN by appending a <script> element to the page body.
((JavascriptExecutor) driver)
.executeScript("var addscript=window.document.createElement('script');addscript.type='text/javascript';addscript.src='https://cdn.jsdelivr.net/mark.js/7.0.2/mark.min.js';document.getElementsByTagName('body')[0].appendChild(addscript);");
// Evaluates to true once the global Mark constructor is defined. The result is discarded here, so assert or wait on it if the script may still be loading.
((JavascriptExecutor) driver)
.executeScript("return typeof(Mark)").toString().equals("function");
String word = "text";
// Wrap every occurrence of the word inside <body> in <span class="highlight">.
((JavascriptExecutor) driver)
.executeScript("var instance = new Mark(document.querySelectorAll('body'));instance.mark('"+ word +"', {'element': 'span','className': 'highlight'});");
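The working code above concatenates the word directly into the JavaScript source, so a misspelled word that contains an apostrophe or a backslash would break the script. A minimal sketch of one way around that, using only the Java standard library: the class name MarkScripts and its methods are hypothetical helpers, not part of Selenium or mark.js, and the escaping covers only the common cases.

```java
final class MarkScripts {
    // CDN URL taken from the working code above.
    static final String MARK_JS_URL = "https://cdn.jsdelivr.net/mark.js/7.0.2/mark.min.js";

    // Wraps a Java string in single quotes and escapes the characters that
    // would otherwise terminate or corrupt a JavaScript string literal.
    static String quoteJs(String s) {
        StringBuilder sb = new StringBuilder("'");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\'': sb.append("\\'"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.append("'").toString();
    }

    // Script that appends a <script> element pointing at the given URL.
    static String injectScript(String url) {
        return "var s=document.createElement('script');"
             + "s.type='text/javascript';"
             + "s.src=" + quoteJs(url) + ";"
             + "document.body.appendChild(s);";
    }

    // Script that highlights every occurrence of the word in <body>,
    // using the same mark.js options as the working code above.
    static String markScript(String word) {
        return "var instance = new Mark(document.querySelectorAll('body'));"
             + "instance.mark(" + quoteJs(word)
             + ", {'element': 'span','className': 'highlight'});";
    }

    public static void main(String[] args) {
        System.out.println(injectScript(MARK_JS_URL));
        System.out.println(markScript("don't"));
    }
}
```

The returned strings can be passed to executeScript exactly as in the working code. Another option is to let Selenium do the passing: executeScript accepts extra arguments after the script, and they are available inside the script as arguments[0], arguments[1], and so on, which avoids escaping altogether.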

Related

Appium Android: How to test pdf displayed inside WebView

On Android, I want to test a PDF that contains terms and conditions, but it is displayed inside a WebView. I am able to switch to the WebView using the code below.
String strWebContextName = getContexts().stream().filter(ctx -> ctx.contains("WEBVIEW_")).findAny().orElse(null);
if (Objects.nonNull(strWebContextName)) {
((AndroidDriver) getBaseMobileDriver()).context(strWebContextName);
}
Then I locate the script tag and get its content:
@FindBy(xpath = "//script[@type='text/javascript' and contains(text(),'_init')]")
private WebElement webElementPdfPath;
String htmlCode = (String) ((JavascriptExecutor) getBaseMobileDriver()).executeScript("return arguments[0].innerHTML;", webElementPdfPath);
After this I don't know how to proceed. Please help.
In my experience with verifying PDFs in a WebView, there is only a limited set of things you can search for with selectors. I usually have only the class or type attributes of the PDF container to work with. I have never been able to search for specific text in a PDF with XPath (PDFs are not part of the HTML; they are opened by an extension that renders the document).
Try a simpler XPath: //script[@type='text/javascript']. This way you know the PDF is opened, but that's all.
I've done this with desktop browsers as well. For browsers, there is no way to identify inner PDF elements; you are limited to something like //embed[@type='application/x-google-chrome-pdf']. When I needed to verify the PDF contents against conditions, I used SikuliX image recognition, for instance.
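If the goal is only to find out which PDF the WebView loaded, one possible next step from the htmlCode string in the question is to search the script text for a quoted path ending in .pdf. This is a sketch under a loud assumption: the actual script content is not shown in the question, so the sample string, the class name PdfPathExtractor and the method findPdfPath are all hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PdfPathExtractor {
    // Matches a single- or double-quoted string literal that ends in ".pdf",
    // optionally followed by a query string. Group 2 is the path itself.
    private static final Pattern PDF_LITERAL = Pattern.compile(
            "([\"'])([^\"']+?\\.pdf(?:\\?[^\"']*)?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the first quoted PDF path found in the script source, if any.
    static Optional<String> findPdfPath(String scriptSource) {
        if (scriptSource == null) {
            return Optional.empty();
        }
        Matcher m = PDF_LITERAL.matcher(scriptSource);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical script body; the real one comes from htmlCode.
        String sample = "viewer._init({ url: '/docs/terms.pdf', page: 1 });";
        System.out.println(findPdfPath(sample).orElse("no PDF path found"));
    }
}
```

With a path in hand, the file could be fetched and checked outside the WebView, which sidesteps the selector limits described above. Whether this works depends entirely on how the page embeds the PDF location.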

Develop Printer Driver which can Read file and Write extra data

I need to develop a printer driver which can:
Read the printed file (i.e. know the data inside the file)
Write extra information to the end of the printed file (e.g. a barcode or QR code)
I plan to use the V4 printer driver as a template to start my development. I have already tried to build this V4 printer driver in Visual Studio.
(Screenshot: V4 printer driver in Solution Explorer)
Understanding the architecture of the V4 printer driver may take a lot of time. Besides that, I am still new to driver development, so it is hard for me to understand the documentation provided by Microsoft.
Can anyone suggest where I should start coding and recommend any useful methods, functions or libraries? It would also help if anyone could recommend related reading material and tell me what basic knowledge I need.
See the Microsoft sample code here.
Create a "Render Filter" project (a C++ project) in your "V4 Printer Driver" solution and add the sample code to the "StartOperation_throws" method of the newly created render filter.
Then use the following sample code to add custom content to your file:
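// Text color: fully opaque white in sRGB (change it if the text must be visible on a white page).
XPS_COLOR testColor;
testColor.value.sRGB.alpha=0xFF;
testColor.value.sRGB.red=0xFF;
testColor.value.sRGB.green=0xFF;
testColor.value.sRGB.blue=0xFF;
testColor.colorType = XPS_COLOR_TYPE_SRGB;
// Font size and origin point of the text on the page.
FLOAT Font_Size = 14;
XPS_POINT OrgPoint = {123,123};
// LPCWSTR is a wide string, so use L"..." literals (_T only expands to wide literals in Unicode builds).
LPCWSTR TestStr = L"Sample Text";
LPCWSTR Name_fnt = L"SampleFontFile.TTF";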
Finally, call "AddCustomTextToXpsDoc" with the parameters above to add your text to the printable XPS file.

Programmatically open a UserForm from an external script / program / host application

Given that I have a UserForm embedded in one of the following:
an Excel workbook named c:\myBook.xlsm
a Word document named c:\myDocument.docm, or
a PowerPoint presentation named c:\myPresentation.pptm
What Automation properties / methods do I need to use in order to open and display the UserForm from an external script / host application / program?
For example, let's say I have the following JScript running under WSH:
var app = new ActiveXObject('Excel.Application');
app.Visible = true;
var book = app.Workbooks.Open('c:\\myBook.xlsm'); // backslash must be escaped in a JScript string literal
// open UserForm here
How would I proceed to open the UserForm?
Note: I am looking for a solution that would work with an arbitrary document. This precludes manually (but not programmatically as part of the script) adding a Sub to show the UserForm, which can be called from the external script.
One idea is to handle the document-open events, at least for Microsoft Word and Excel:
' Word
Private Sub Document_Open()
UserForm1.Show
End Sub
' Excel
Private Sub Workbook_Open()
UserForm1.Show
End Sub
When the Word document or Excel workbook is opened, the dialog is shown.
For PowerPoint it's a bit more complicated:
How to auto execute a macro when opening a Powerpoint presentation?
Update
Given the additional information in the question, this is no longer a solution. A starting point for creating code and adding it to a VBA project can be found here: https://stackoverflow.com/a/34838194/1306012
This website also provides additional information: http://www.cpearson.com/excel/vbe.aspx

"document" in mozilla extension js modules?

I am building a Firefox extension that creates a single XMPP chat connection that can be accessed from all tabs and windows, so I figured that the only way to do this is to create the connection in a JavaScript module and include it in every browser window. Correct me if I am wrong...
EDIT: I am building a traditional extension with XUL overlays, not using the SDK, and I am talking about these modules: https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules
So I copied Strophe.js into a JS module. Strophe.js uses code like this:
/*_Private_ function that creates a dummy XML DOM document to serve as
* an element and text node generator.
*/
[---]
if (document.implementation.createDocument === undefined) {
doc = this._getIEXmlDom();
doc.appendChild(doc.createElement('strophe'));
} else {
doc = document.implementation
.createDocument('jabber:client', 'strophe', null);
}
and later uses doc.createElement() to create XML (or HTML?) nodes.
All worked fine, but in the module I got the error "Error: ReferenceError: document is not defined".
How can I get around this?
(Larger piece of exact code: http://pastebin.com/R64gYiKC )
Use the hiddenDOMWindow:
Cu.import("resource://gre/modules/Services.jsm");
var doc = Services.appShell.hiddenDOMWindow.document;
It sounds like you might not be correctly attaching your content script to the worker page. Make sure that you're using something like tabs.attach() to attach one or more content scripts to the worker page (see documentation here).
Otherwise you may need to wait for the DOM, i.e. wait for the entire page to load:
window.onload = function ()
{
// JavaScript code goes here
};
That should at least help diagnose the issue (even if the above isn't the best method to use in production). But if I had to wager, I'd say that you're not attaching the content script.

tika returning incorrect line of text for pdf with lots of tables

I am using Tika to extract text from a PDF file that has a lot of tables.
java -jar tika-app-0.9.jar -t https://s3.amazonaws.com/centraldoc/alg1.pdf
It returns some invalid text, and sometimes it drops the white space between two words; for example, it returns
"qu inakli fmyathematical ideas to the real world" instead of "Link mathematical ideas to the real world".
Is there a way to minimize this kind of error, or is there another library that I can use? Does it make sense to use OCR to process this kind of PDF?
Try to control the order when using the PDFBox parser: PDFTextStripper has a flag that controls the order of lines in the document. By default (in PDFBox) it's set to false for performance reasons (no order preserved), but Tika has changed its behavior between releases, switching this flag on and off.
There are more details on exactly this problem in my blog post Extracting text from PDF files with Apache Tika 0.9 (and PDFBox under the hood).
To get the text from the PDF in the right order, I had to set the SortByPosition flag to true (tested with tika-app-1.19.jar):
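import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.parser.pdf.PDFParserConfig;
import org.apache.tika.sax.BodyContentHandler;
// "is" is an InputStream opened on the PDF file.
BodyContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
ParseContext context = new ParseContext();
PDFParser pdfParser = new PDFParser();
PDFParserConfig config = pdfParser.getPDFParserConfig();
config.setSortByPosition(true); // needed for text in correct order
pdfParser.setPDFParserConfig(config);
pdfParser.parse(is, handler, metadata, context);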
