In my browser.xul code, what I am trying to do is fetch data from an HTML file and insert it into my div element.
I am trying to use div.innerHTML but I am getting an exception:
Component returned failure code: 0x804e03f7
[nsIDOMNSHTMLElement.innerHTML]
I tried to parse the HTML using Components.interfaces.nsIScriptableUnescapeHTML and to append the parsed HTML to my div, but my problem is that style (both the attribute and the tag) and script aren't parsed.
First a warning: if your HTML data comes from the web then you are trying to build a security hole into your extension. HTML code from the web should never be trusted (even when it comes from your own web server and via HTTPS) and you should really use nsIScriptableUnescapeHTML. Styles should be part of your extension; using styles from the web isn't safe. For more information: https://developer.mozilla.org/En/Displaying_web_content_in_an_extension_without_security_issues
As to your problem, this error code is NS_ERROR_HTMLPARSER_STOPPARSING, which seems to mean a parsing error. I guess that you are trying to feed it regular HTML code rather than XHTML (which would be XML-compliant). Either way, a better way to parse XHTML code is DOMParser; it gives you a document that you can then insert into the right place (see the sketch below).
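A minimal sketch of that route, assuming xhtmlData holds your XHTML string and div is your target element:
// Parse the XHTML string into a standalone document
var parser = new DOMParser();
var doc = parser.parseFromString(xhtmlData, "application/xhtml+xml");
// Import the parsed root element into your own document and insert it
div.appendChild(document.importNode(doc.documentElement, true));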
If the point is really to parse HTML code (not XHTML) then you have two options. One is using an <iframe> element and displaying your data there. You can generate a data: URL from your HTML data:
frame.src = "data:text/html;charset=utf-8," + encodeURIComponent(htmlData);
If you don't want to display the data in a frame you will still need a frame (it can be hidden) that has an HTML document loaded (about:blank will do). You then use Range.createContextualFragment() to parse your HTML string:
// Create a range in the frame's HTML document so the string is parsed in an HTML context
var range = frame.contentDocument.createRange();
range.selectNode(frame.contentDocument.documentElement);
var fragment = range.createContextualFragment(htmlData);
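The resulting fragment belongs to the frame's document; to use it in your own document you would import it first, e.g. (assuming div is your target element):
var imported = document.importNode(fragment, true);
div.appendChild(imported);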
XML documents don't have innerHTML, and nsIScriptableUnescapeHTML is one way to get the HTML parsed, but it's designed for uses where the HTML might not be safe; as you've found out, it throws away the script nodes (and a few other things).
There are a couple of alternatives, however. You can use the responseXML property, although this may be suboptimal unless you're receiving XHTML content.
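A rough sketch of that approach (url and div here are assumptions, not part of your code):
var xhr = new XMLHttpRequest();
xhr.open("GET", url);
xhr.onload = function() {
  // responseXML is a parsed Document, or null if parsing failed
  var doc = xhr.responseXML;
  if (doc)
    div.appendChild(document.importNode(doc.documentElement, true));
};
xhr.send();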
You could also use an iframe. It may seem old-fashioned, but an iframe's job is to take a URL (the src property) and render the content it receives, which necessarily means parsing it and building a DOM. In general, when an extension running as chrome does this, it will have to take care not to give the remote content the same chrome privileges. Luckily that's easily managed; just put type="content" on the iframe. However, since you're looking to import the DOM into your XUL document wholesale, you must have already ensured that this remote content will always be safe. You're evidently using an HTTPS connection, and you've taken extra care to verify the identity of the server by making sure it sends the right certificate. You've also verified that the server hasn't been hacked and isn't delivering malicious content.
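For instance, a content-privileged frame can be created from chrome code like this (a sketch; the "container" element id is hypothetical):
var iframe = document.createElement("iframe");
// type="content" keeps the loaded page from running with chrome privileges
iframe.setAttribute("type", "content");
iframe.setAttribute("src", url);
document.getElementById("container").appendChild(iframe);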
The picture shows the RSSI value. There is Lua and sh code that displays this value: the sh script writes the value to a file, and in the Lua code we read it from the file into a variable and assign it to the label element:
os.execute('/bin/rssi')  -- run the sh script that writes the RSSI value to /tmp/rssi
file = io.open("/tmp/rssi", "r");
d:option(DummyValue, "label", "rssi: "..(file:read("*line")));  -- show the value in the label element
file:close();
Everything works, but I want the web interface to refresh this information every N seconds. I will be grateful for your help.
It's something that has to be implemented on the frontend (HTML, JS, PHP), not in the Lua backend. I don't know how the kids do it these days, but from what I know, you'd need to use JavaScript to refresh that part of the HTML document every few seconds, as you want.
The way it works is as follows:
1. The client (Chrome/Firefox/Opera/Edge) requests a web page;
2. The web server opens the requested file; if the file is a script, the script runs;
3. The script retrieves data from systems, databases, etc.:
3.1 The Lua script runs and returns a value (the RSSI, in your case);
3.2 The script replaces each variable by the value returned by the Lua script;
3.3 The script returns HTML code where the variables have been replaced by values from databases, systems, etc.;
4. The web server sends the data over the network;
5. The client web browser displays the data, usually as an HTML document formatted by CSS, with JavaScript interactivity and automatic activities.
In your case, you'd want this:
JavaScript in the client browser refreshes part of the document, essentially going through steps 1-5, but only replacing a portion of what's being displayed (an HTML element). A sketch of the idea:
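(Plain JavaScript, no framework; the /cgi-bin/rssi endpoint and the "rssi" element id are assumptions, not part of your existing setup.)
setInterval(function() {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/cgi-bin/rssi");
  xhr.onload = function() {
    // Replace just this one element's text with the fresh value
    document.getElementById("rssi").textContent = "rssi: " + xhr.responseText;
  };
  xhr.send();
}, 5000); // N = 5 seconds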
Quoting from the official Scrapy documentation:
Scrapy comes with its own mechanism for extracting data. They’re called selectors because they “select” certain parts of the HTML document specified either by XPath or CSS expressions.
After reading this, I'm still not sure whether Scrapy works by directly selecting parts of the HTML document using XPath/CSS expressions or by selecting nodes from a DOM tree rendered by a browser.
Still confused about whether DOM parsing and HTML parsing are the same or not...
After reading this, I'm still not sure whether Scrapy works by directly selecting parts of the HTML document using XPath/CSS expressions or by selecting nodes from a DOM tree rendered by a browser.
For sure the former, as there is definitely no browser involved. Even the "CSS" part is just syntactic sugar for the XPath part -- which one can see by printing out an "in progress" Selector:
>>> print(Selector(text="<html><div class='foo'></div></html>").css(".foo"))
[<Selector xpath="descendant-or-self::*[@class and contains(concat(' ', normalize-space(@class), ' '), ' foo ')]" data='<div class="foo"></div>'>]
Still confused about whether DOM parsing and HTML parsing are the same or not...
Strictly speaking, I believe they are different. For example, lxml is able to parse HTML, but it does so in its own way, and materializes an object tree that is xml.etree-compatible, not the DOM. There is a minimal DOM library that html5lib can target, which is about the closest you'll get to "what a browser would build".
I'm not clear on when (or if) I should use multiple Grails encodeAsXXX calls.
This reference says you need to encodeAsURL and then encodeAsJavaScript: http://grailsrocks.com/blog/2013/4/19/can-i-pwn-your-grails-application
It also says you need to encodeAsURL and then encodeAsHTML. I don't understand why this is necessary in the case shown but not all the time.
Are there other cases where I should be using multiple chained encoders?
If I'm rendering a URL into an HTML attribute, should I encodeAsURL and then encodeAsHTML?
If I'm rendering a URL into a JavaScript variable sent as part of an HTML document (via a SCRIPT element), should I encodeAsURL, then encodeAsJavaScript, then encodeAsHTML?
If I'm rendering a string into a JavaScript variable sent as part of an HTML document, should I encodeAsJavaScript and then encodeAsHTML?
The official docs - https://docs.grails.org/latest/guide/security.html - don't show any examples of multiple chained encoders.
I can't see how to work out what to do here except by finding the source for all the encoders and looking at what they encode and what's valid on the receiving end - but it shouldn't be that hard for a developer, and there is probably something simple I'm missing or some instructions I haven't found.
FWIW, I think the encoders I'm talking about are these ones:
https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/web/util/JavaScriptUtils.html#javaScriptEscape-java.lang.String-
https://docs.oracle.com/javase/7/docs/api/java/net/URLEncoder.html#encode(java.lang.String,%20java.lang.String)
https://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/web/util/HtmlUtils.html#htmlEscape-java.lang.String-
It is certainly important to always consider XSS, but in reading your question I think you are overestimating what you need to do. As long as you're using Grails 2.3 or higher and your grails.views.default.codec is set to html (which it is by default), everything rendered in your GSP with ${} will be escaped properly for you.
It is only when you are intentionally bypassing the escaping - such as when you need to get sanitized user input back into valid JavaScript within your GSP for some reason - that you would need to use the encodeAsXXX methods or similar.
I would argue (and the article mentions this as well) that this should raise a smell anyway, as you probably should have that JavaScript encapsulated in a different file or TagLib where the escaping is handled.
Bottom line: use the encoding methods only if you are overriding the default HTML encoding; otherwise ${} handles it for you. If you do end up encoding by hand for one of those nested contexts, the rule of thumb is to encode for the innermost context first and work outward, as the sketch below illustrates.
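A plain-JavaScript illustration of the idea (not Grails code; the names here are made up):
// A user-supplied value goes into a URL query string, and that URL goes
// into an HTML attribute: encode innermost-first (URL, then HTML).
var userInput = 'rock & roll "classics"';
var url = "https://example.com/search?q=" + encodeURIComponent(userInput);
function encodeAsHTML(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;")
          .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}
var markup = '<a href="' + encodeAsHTML(url) + '">search</a>';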
I have a Rails/Ember one-page app. Burp reports that
The value of the 'content_type' JSON parameter is copied into the HTML document as plain text between tags. The payload da80b<script>alert(1)</script>4f31e was submitted in the content_type JSON parameter. This input was echoed unmodified in the application's response.
I can't quite parse this message, with its "is copied into" and "was submitted in", but basically what is happening is:
A PUT or POST from the client contains ...<script>...</script>... in some field.
The server handles this request, and sends back the created object in JSON format, which includes the string in question
The client then displays that string, using the standard Ember/Handlebars {{content_type}}, which HTML-escapes the string and inserts it into the DOM, so the browser displays it on the screen as originally entered (and of course does NOT execute it).
So yes, the input was indeed echoed unmodified in the application's response. However, the application's response was not HTML (in which case there would indeed be a problem) but JSON, containing strings which, when rendered by Handlebars, will always be escaped properly for display in the browser.
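For example, Handlebars' default {{ }} output HTML-escapes values; a quick standalone sketch (using the handlebars npm package directly, outside of Ember):
var Handlebars = require("handlebars");
var template = Handlebars.compile("<div>{{content_type}}</div>");
console.log(template({ content_type: "<script>alert(1)</script>" }));
// -> <div>&lt;script&gt;alert(1)&lt;/script&gt;</div>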
So my question is: is this in fact a vulnerability? I have taken great care with my Ember app and can prove that no data from JSON objects is ever inserted "raw" into the DOM. Or is this a false positive arising merely from the fact that the unescaped string can be found in the response by an unintelligent string comparison, one that doesn't take into account that the JSON will be processed/escaped by the client-side framework?
To put it a different way, in a classic webapp spitting out HTML from the server, we know that user input such as the above must be escaped/sanitized properly. Unsanitized data "on the wire" in and of itself represents a vulnerability. However, in a one-page app based on JSON coming back from the server, the escaping/sanitization occurs in the client; the JSON on the "wire" may contain unsanitized data, and this is as expected. Am I missing something here?
There are subtle ways in which you can trick IE9 and older into treating JSON as HTML. So even if the server's response has a Content-Type header of application/json, IE will second guess it. This is called content type sniffing, and can be disabled by adding the X-Content-Type-Options: nosniff header.
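For instance, in a Node/Express app the header could be added like this (illustrative only; in your Rails stack it would go in the response-header configuration instead):
// Tell IE not to second-guess the declared Content-Type
app.use(function(req, res, next) {
  res.setHeader("X-Content-Type-Options", "nosniff");
  next();
});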
JSON is not an executable format so your understanding is correct.
I did a demo of this exact problem in my talk on securing single-page web apps at OWASP AppSec EU 2013, which someone put up on YouTube here: http://m.youtube.com/watch?v=Femsrx0m9bU
In a Firefox extension we use parseFragment (documentation) to parse a string of HTML (received from a 3rd-party server) into a sanitized DocumentFragment, as required by Mozilla. The only problem: the parser removes all the attributes we need, for example the class attribute.
Is it possible somehow to keep class attributes while parsing HTML with parseFragment?
P.S. I know that in Gecko 14.0 they replaced this function with another one that supports sanitizing parameters. But what to do for Gecko < 14.0?
No, the whitelist is hardcoded and cannot be adjusted. However, the class attribute is in the whitelist and should be kept; you probably meant the style attribute? If you need customized behavior you will have to use a different solution (like DOMParser, which can parse HTML documents as of Firefox 12).
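A quick sketch of the DOMParser route (assuming htmlData holds your HTML string):
var parser = new DOMParser();
// "text/html" parsing is supported starting with Firefox 12
var doc = parser.parseFromString(htmlData, "text/html");
// doc.body now contains the parsed content, attributes intact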
As to older Firefox versions, you can parse XHTML data with DOMParser there. If you really have HTML then I am only aware of one way to parse it without immediately inserting it into a document (which might cause various security issues): range.createContextualFragment(). You need an HTML document for that; if you don't have one, a hidden <iframe> loading about:blank will do as well. Here is how it works:
// Get the HTML document
var doc = document.getElementById("dummyFrame").contentDocument;
// Parse data
var fragment = doc.createRange().createContextualFragment(htmlData);
// Sanitize it
sanitizeData(fragment);
Here, sanitizing the data is your own responsibility. You probably want to base your sanitization on Mozilla's whitelist that I linked to above - remove all tags and attributes that are not on that list, and also make sure to check the links. The style attribute is a special case: it used to be insecure, but IMHO it no longer is, given that -moz-binding isn't supported on the web any more.
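A minimal sketch of such a sanitizer (the whitelist here is a tiny illustrative subset, not Mozilla's full list):
var allowedTags = { A: true, B: true, DIV: true, EM: true, I: true, P: true, SPAN: true, STRONG: true };
var allowedAttributes = { "class": true, href: true, title: true };
function sanitizeData(node) {
  // Walk children back to front so removals don't shift the indexes
  for (var i = node.childNodes.length - 1; i >= 0; i--) {
    var child = node.childNodes[i];
    if (child.nodeType != 1)
      continue; // only filter element nodes here
    if (!allowedTags[child.tagName.toUpperCase()]) {
      node.removeChild(child);
      continue;
    }
    // Drop attributes that aren't whitelisted
    for (var j = child.attributes.length - 1; j >= 0; j--) {
      if (!allowedAttributes[child.attributes[j].name.toLowerCase()])
        child.removeAttribute(child.attributes[j].name);
    }
    // Only keep http(s) links
    if (child.tagName.toUpperCase() == "A" && !/^https?:/i.test(child.getAttribute("href") || ""))
      child.removeAttribute("href");
    sanitizeData(child); // recurse into the surviving element
  }
}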