Is it possible to parse the source code of this webpage? - html-parsing

I'm trying to write a program that fetches information from this webpage, www.sio.no, but when I view the source code in my web browser I only see some JavaScript commands.
Is there a way around this so I can access the text on the webpage?

Definitely possible. The "head" does contain a lot of "script" tags, but if you scroll to the bottom of the source you should see the HTML, spaced out by a great many blank lines.
You should be able to parse the HTML from this page without problems. I used Firefox to check that.
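
As a minimal sketch of that idea, the following uses only Python's standard library to pull the visible text out of a page while ignoring the contents of "script" and "style" tags. The names TextExtractor, extract_text and fetch_text are made up for this example, and the choice of Python is an assumption, since the question does not name a language.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()  # drops the runs of blank lines between elements
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the non-empty text fragments found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def fetch_text(url):
    """Download a page and return its text fragments (hypothetical helper)."""
    with urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_text(html)
```

Calling fetch_text("http://www.sio.no") would then return the text fragments in document order (the http:// scheme is an assumption). Note that this only sees what is in the downloaded source; text that the page builds with JavaScript after loading will not appear.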

Related

Unable to customize specflow report with screenshots path

I am trying to generate a SpecFlow MSTest execution report in which the screenshot paths appear as links. I used Console.WriteLine() to output the screenshot path, but it is displayed as plain text in the report. Please share any suggestions.
The HTML report treats everything as plain text, even if you use HTML tags. This is by design. You can change the behaviour of specflow.exe yourself, as it is an open source project on GitHub.
If you don't want to dive too deep into that, you need an uglier workaround: you could mark your links with a different token (like {img} instead of the actual HTML tag) and then search the .html file and replace every {img} with the real tag.
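
The search-and-replace workaround could be sketched as a small post-processing script that runs after the report has been generated. Everything specific here is an assumption made for illustration: the token format {img:path}, the function names, the UTF-8 encoding of the report, and the use of Python rather than whatever your build uses.

```python
import html
import re

# Hypothetical token written by the tests, e.g. {img:screenshots/login.png}
TOKEN = re.compile(r"\{img:([^}]+)\}")


def link_screenshots(report_html):
    """Replace every {img:path} token with a clickable link to that path."""

    def to_link(match):
        # Escape the path so it is safe inside an attribute and as link text.
        path = html.escape(match.group(1).strip(), quote=True)
        return '<a href="{0}">{0}</a>'.format(path)

    return TOKEN.sub(to_link, report_html)


def rewrite_report(path):
    """Rewrite a generated report file in place (assumes UTF-8 encoding)."""
    with open(path, encoding="utf-8") as handle:
        content = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(link_screenshots(content))
```

In the tests you would then write the token instead of a tag, for example Console.WriteLine("{img:" + screenshotPath + "}"), and call rewrite_report on the generated .html file as a last build step.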

Using script or Automator to set page settings, margins and page wrap automatically on Text Edit files

I'm putting together an installation using Processing, where users type and their text is printed on a receipt printer.
I've got Processing saving out time-stamped text files to a folder, and a folder action in Automator watching that folder and sending to print.
My problem is that these .txt files need some intervention...
Format > Wrap to page
Change margins
Select 80mm receipt roll in Page Setup
I think I have the margins thing figured out by adding some code to the file header on the Processing side. With the rest, I'm drawing a complete blank.
I've tried setting the receipt roll as the default page size in 'Print and scan' in system prefs, but the receipt page size doesn't show in the list in system prefs, only shows on the page size list from within Text Edit application.
I suppose what I'm asking - is there a way of setting TextEdit's default to page wrap, certain page size, certain printer - then a folder action can just print away (I hope).
The idea is that these text files spit out of the receipt printer automatically with no intervention. Does anyone have any ideas? Thanks in advance.
Have you experimented with the settings available for TextEdit in AppleScript? If you look under the print settings section (in TextEdit's Script dictionary), there are a number of options available, which may help you achieve something pretty close to what you want. You could then drop the AppleScript into a Run AppleScript action in your Automator folder action.
Alternatively, you could go completely nuts and design a template in Pages that meets your criteria, and then extract your text, paste into your Pages template, and print that out. A whole lot more work, but once it became functional, you would only need to change the Pages template in the future to meet changing needs.

Highlighting text in an EPUB opened in UIWebView

Can anyone help me with highlighting in an EPUB? I open the EPUB in a UIWebView and let the user highlight text, which they should be able to see again later. I have a database to store the highlight details, and I am able to highlight using JavaScript. But since that relies on ranges etc., the highlight only remains until the page is changed or reloaded.
If you do not know what EPUB is, a suggestion that works on a simple HTML page is fine.
Can anyone suggest how to implement highlighting the way iBooks does?
Any ideas on the logic?
Thanks in advance
See the following for the Javascript required:
https://github.com/fedefrappi/AePubReader/blob/master/AePubReader/SearchWebView.js
Download entire project to see how it's used.

textarea that was using plain text with option of markdown or textile filter now needs images

My clients can enter text into textarea and have the option to use the markdown or textile filters for each textarea.
With some models (articles, newsletters, etc.) they can upload images to associate with the model, which are displayed in a column next to the text.
This worked fine for a while, but they have now told me that they want the ability to put the images INSIDE the text at specific positions.
What is the best way to go about this? I suppose I may have to use a wysiwyg for this, but would rather not. And how would this work for images which are not yet on the server, etc?
There are a few directions you could take:
1. Follow the path of Confluence, which in version 4.0 released a rewritten WYSIWYG editor that stores XHTML as its source format instead of wiki markup.
   - This leads to an update of all pages when migrating.
   - It was pretty difficult to implement. I do not know whether they still use the TinyMCE editor of previous versions.
2. Follow the way lightweight markup formats include images in the source text (the !image.png! form is Textile's image syntax; Markdown uses ![alt](image.png)). So by typing: This is my text. !image.png! The inline image shows ..., you will have a format that is understandable.
   - You have to extend the interpretation so that !<filename>! is mapped to the image, which is stored locally anyway.
   - You have to add clean-up dialogs for images that are not yet known, e.g. by doing bulk uploads ...
   - You may provide a drag area in your view that then shows the filename and gives examples of how to include it inside the text area.
3. Go for something in between by allowing users to drag images into the editor. There are JavaScript plugins that allow you to do that, e.g. UI Draggable for jQuery.
   - I have no idea how to integrate that image inside the text editor. Overlay?
So the second one is the easiest, and the user knows how to do it. If only they decide that this is the solution they want to have :-)
I think I'm going to use a combination of #2 above, and the Liquid templating engine.
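
A rough sketch of the second option, mapping !filename! tokens to images that are already stored and collecting the names that are not yet known so the user can be asked to upload them. The function name embed_images, the uploaded dictionary, the /uploads/ URL and the list of file extensions are all assumptions for illustration. It is written in Python only to show the logic, and it assumes the step runs where inline HTML survives (for example on the filter's output).

```python
import html
import re

# Matches !name.ext! for a few common image extensions (assumed list).
IMAGE_TOKEN = re.compile(r"!([^!\s]+\.(?:png|jpe?g|gif))!", re.IGNORECASE)


def embed_images(text, uploaded):
    """Replace !filename! tokens with <img> tags.

    uploaded maps a filename to the URL where that image is stored.
    Returns the new text and the list of filenames that were not found.
    """
    missing = []

    def replace(match):
        name = match.group(1)
        url = uploaded.get(name)
        if url is None:
            # Unknown image: leave the token in place and report it once.
            if name not in missing:
                missing.append(name)
            return match.group(0)
        return '<img src="{}" alt="{}">'.format(
            html.escape(url, quote=True), html.escape(name, quote=True)
        )

    return IMAGE_TOKEN.sub(replace, text), missing
```

For example, embed_images("This is my text. !image.png! The inline image shows ...", {"image.png": "/uploads/1/image.png"}) returns the text with the token replaced by an img tag and an empty list of missing files. A non-empty list is what a clean-up dialog would work from.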

Workaround to hide browser headers when printing from browsers?

I need to allow printing from my web app that hides default browser headers (eg. the URL and "Page x of y"). I specifically need a workaround that does not require a) browser settings modifications by the user or b) first outputting to a PDF. Neither of these extra steps are acceptable for my situation.
As you can see in the screenshot below, it says "print test - Google Docs" along with the URL. These are the headers. I do not want these to show up.
Is there a way I can a) hide headers via Javascript, b) print via an embedded Flash SWF or Java Applet, or c) something else?
Without any more info, I can only suggest this:
<a href="javascript:window.print()">
<img src="print_image.gif" border="0"></a>
Take a look here: http://www.htmlgoodies.com/beyond/javascript/article.php/3471121/Print-a-Web-Page-Using-JavaScript.htm. If it doesn't fit your needs, please give us more details.
