Making tagsoup markup cleansing optional - orbeon

Tagsoup is interfering with input and formatting it incorrectly. For instance, when we have the following markup:
Text outside anchor
It is formatted as follows
Text outside anchor
This is a simple example but we have other issues as well. So we made tagsoup cleanup/formatting optional by adding an extra attribute to textarea control.
Here is the diff: https://github.com/binnyg/orbeon-forms/commit/044c29e32ce36e5b391abfc782ee44f0354bddd3
Textarea would now look like this
<textarea skip-cleanmarkup="true" mediatype="text/html" />
Two questions
Is this the right approach?
If I provide a patch, can it make it into the Orbeon codebase?
Thanks
BinnyG

Erik, Alex, et al
I think there are two questions here:
The first concern is TagSoup and the clean-up that happens out of the box: empty tags are converted to singleton (self-closing) tags. When that markup is sent to the client, browsers like Firefox "fix" it, but because of the loss of precision they do the wrong thing.
Turning off this clean-up helps in this case, but for this issue alone it is not really the right answer, because it takes away a security feature and a well-formed-markup feature. So there may need to be some adjustment to the handling of at least certain empty tags (other than turning them into invalid singleton tags).
All this brings us to the second concern: do we always want those features in play? Our use case says no. We want the user to be able to produce whatever markup they want, invalid or not. We're not putting the form in an app that needs to protect the user from cross-site scripting; we're building a tool that lets users edit web pages, hence we have turned off the clean-up.
But turning off clean-up wholesale? It's important that we can do it if that's what our use case calls for, but the implementation we have is all or nothing. It would be nice to be able to define strategies for clean-up and make that function pluggable. For example:
* In the XML config of the system, define a "map" of config names to class names that implement a given strategy. In the XForms definition, the author would specify a name from the map.
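A minimal sketch of that name-to-strategy idea, written in Python purely for illustration. It is not Orbeon code: `CLEANUP_STRATEGIES`, `resolve_strategy`, the strategy names, and the default are all hypothetical, and the `strip_scripts` function is a toy, not a real XSS defence.

```python
import html
import re
from typing import Callable, Dict, Optional

# A cleanup strategy takes raw markup and returns the markup to store/serve.
CleanupStrategy = Callable[[str], str]


def no_cleanup(markup: str) -> str:
    """Pass the markup through untouched (the 'skip clean-up' use case)."""
    return markup


def escape_all(markup: str) -> str:
    """Escape everything, so no markup survives at all."""
    return html.escape(markup)


def strip_scripts(markup: str) -> str:
    """Toy sanitizer: drop <script> elements. Illustrative only."""
    return re.sub(r"(?is)<script\b.*?</script\s*>", "", markup)


# Hypothetical stand-in for the "map" that would live in the XML config.
CLEANUP_STRATEGIES: Dict[str, CleanupStrategy] = {
    "none": no_cleanup,
    "escape": escape_all,
    "strip-scripts": strip_scripts,
}

# Hypothetical default, used when the form author names no strategy.
DEFAULT_STRATEGY = "strip-scripts"


def resolve_strategy(name: Optional[str]) -> CleanupStrategy:
    """Return the strategy the form author named, or the default."""
    key = DEFAULT_STRATEGY if name is None else name
    if key not in CLEANUP_STRATEGIES:
        raise ValueError(f"unknown cleanup strategy: {key!r}")
    return CLEANUP_STRATEGIES[key]
```

A control that wants no clean-up would ask for `resolve_strategy("none")`, while a control that names nothing falls back to the default, which keeps the safe behaviour opt-out rather than opt-in.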

If TagSoup transforms:
Text outside anchor
Into:
Text outside anchor
Wouldn't that be a bug in TagSoup? If that were the case, I'd say it is better to fix the issue than to disable TagSoup. But it isn't a bug in TagSoup; here is what seems to be happening. Say the browser sends the following to the server:
<a shape="rect"></a>After<br clear="none">
This goes through TagSoup, the result goes through the XSLT clean-up code, and the following is sent to the browser:
<a shape="rect"/>After<br clear="none"/>
The issue is on the browser, which transforms this into:
<a shape="rect">After</a><br clear="none"/>
The problem is that we serialize this as XML with Dom4jUtils.domToString(cleanedDocument), while it would be more prudent to serialize it as HTML. Here we could use the Saxon serializer, which is also used from HTMLSerializer. Maybe you can try changing this code to use it instead of Dom4jUtils.domToString(). Let us know what you find when you get a chance to do that.
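The difference between the two serializations can be reproduced with any serializer that supports both output methods. The sketch below uses Python's standard `xml.etree.ElementTree` only as an illustration; it is not the Dom4j or Saxon code discussed here, and the wrapping `div` is an assumption added so the fragment has a single root element.

```python
import xml.etree.ElementTree as ET

# The cleaned fragment from the example, wrapped in a root element.
fragment = ET.fromstring(
    '<div><a shape="rect"></a>After<br clear="none"/></div>'
)

# XML output method: the empty <a> is collapsed to a self-closing tag,
# which an HTML parser in the browser treats as an unclosed <a>.
as_xml = ET.tostring(fragment, encoding="unicode", method="xml")

# HTML output method: the empty <a> keeps its explicit end tag, and the
# void element <br> is written without a trailing slash.
as_html = ET.tostring(fragment, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

With the HTML method the anchor stays empty on the browser side, so the text "After" is no longer pulled inside it.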

Binesh and I agree: if there is a bug, it would be a good idea to address the issue closer to the root. But I think the specific issue here is only part of the matter.
We're thinking it would be best to have some kind of name-to-strategy mapping so that RTEs can call in the server-side processing that is right for them or the default if it's not specified.

Related

Ranorex Spy - The recognition elements on the page (prioritizing attributes)

I have a question related to Ranorex Spy.
Is it possible to recognize (by default) elements of the page by an attribute other than id, e.g. data-id?
I know that I can modify this later manually for each element (but it is time consuming)
Currently:
\input[#id='..."]
Expected (automatically, by Ranorex Spy):
\input[#data-id='..."]
I personally did not bother using this (because we use many frameworks, and what's desired in one framework is not necessarily what's desired in another), but I think you can prioritize XPath rules using the RanoreXPath Weight Rules.
Following is a Ranorex article describing how to do this: http://www.ranorex.com/support/user-guide-20/ranorexpath-weight-rule-library.html
Good luck!

How to suppress false positives in Fortify

I have two questions regarding Fortify.
1 - Let's say I have a Windows Forms app, which asks for a username
and password, and the name of the textbox for password is
texboxPassword. So in the designer file, you have the following,
generated by the designer.
//
// texboxPassword
//
this.texboxPassword.Location = new System.Drawing.Point(16, 163);
this.texboxPassword.Name = "texboxPassword";
this.texboxPassword.Size = new System.Drawing.Size(200, 73);
this.texboxPassword.TabIndex = 3;
Fortify marks this as a password in comment issue. How can I suppress this by creating a custom rule? I don't want to suppress the whole issue because I still would like to catch certain patterns (such as password followed by = or : in comments) but the blanket search where any line that contains password is flagged is creating so many false positives. I looked into creating a structural rule but could not figure out how to remove the associated tag (where can I find the tag for password in comment anyways?)
2 - Let's say I have a custom UI control. This control html encodes everything and in my context, it is good enough to avoid XSS. Needless to say, it is being flagged by Fortify. How can I suppress XSS when I have a certain control type in my UI and all of its methods are safe for XSS (they sanitize) in my context? I have tried a DataflowCleanseRule (with a label just to test the concept) and wanted to mark get_Text() and set_Text() as sanitizer functions, but it did not make a difference and Fortify still flagged it for XSS.
<DataflowCleanseRule formatVersion="3.16" language="dotnet">
  <RuleID>0D495522-BA81-440E-B191-48A67D9092BE</RuleID>
  <TaintFlags>+VALIDATED_CROSS_SITE_SCRIPTING_REFLECTED,+VALIDATED_CROSS_SITE_SCRIPTING_PERSISTENT,+VALIDATED_CROSS_SITE_SCRIPTING_DOM,+VALIDATED_CROSS_SITE_SCRIPTING_POOR_VALIDATION</TaintFlags>
  <FunctionIdentifier>
    <NamespaceName>
      <Pattern>System.Web.UI.WebControls</Pattern>
    </NamespaceName>
    <ClassName>
      <Pattern>Label</Pattern>
    </ClassName>
    <FunctionName>
      <Pattern>_Text</Pattern>
    </FunctionName>
    <ApplyTo implements="true" overrides="true" extends="true"/>
  </FunctionIdentifier>
  <OutArguments>return</OutArguments>
</DataflowCleanseRule>
Thank you in advance for your help
This is parsed using regular expressions. Unless you think you are able to create a regular expression that can parse human language properly, I would leave it alone and just audit it as not an issue.
The Pattern tag uses a Java regular expression in the body, so it should be used as user2867433 suggested. However, you stated:
This control html encodes everything and in my context, it is good enough to avoid XSS
If you are going to use a custom rule, you have to assume that it will be applied in EVERY context. Say that in the future somebody writes a piece of code that uses get_Text and then places the result directly into a piece of JavaScript: HTML encoding will do NOTHING to stop the XSS problem there. I would again advise auditing this as not an issue, or as a false positive due to the validation used, and explaining why it's good enough in that context.
Within "Pattern" you can use Java-Regex. So it should work if you use [gs]et_Text
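A small sketch of why `[gs]et_Text` can succeed where `_Text` does not, under the assumption that the pattern has to match the whole function name (the thread does not say how Fortify applies it). It uses Python's `re` for illustration; this particular character class is written the same way in Java regular expressions.

```python
import re

# Candidate function names; get_Value is a made-up non-matching example.
names = ["get_Text", "set_Text", "get_Value"]

# Suggested pattern: 'g' or 's', followed by 'et_Text'.
suggested = re.compile(r"[gs]et_Text")
matched = [name for name in names if suggested.fullmatch(name)]

# The pattern from the question only covers the tail of the name, so a
# whole-name match against get_Text fails.
original_whole_match = re.fullmatch(r"_Text", "get_Text")

print(matched)
print(original_whole_match)
```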

Angular: Input with model binding acts on keypress, only when there's no controller

Angular newb here, thoughts appreciated...
Say I want an input field to control the window title as you type. A field with a model binding and no associated controller acts on keypress, as intended. However, there has to be a bit more logic to it -- default value before any user input, also used if the input is blanked.
Adding a controller bound to enclosing elements gives a place for that logic, but the change-on-keypress behavior is gone. I'm sure it's possible to recreate it by hand or with ui, but since it's inherently there without the controller, I'm wondering if I'm missing the simple clean way.
Simple version, acts on keypress, but with no smarts:
<title ng-bind-template="{{windowTitle}}">Default Title (not seen)</title>
<input ng-model="windowTitle" type="text">
Putting controller bindings on the head (for the title) and a containing div (for the input), and setting a default $scope.windowTitle inside the controller function does use that default value, but it breaks the auto-update.
I know in real life you'd want a real model, but what I'm really trying to understand is these two ways angular appears to work. I haven't found anything specifically describing two different implicit input binding behaviors, but I haven't been through all the docs yet.
How should I be thinking about this?
Edit: It's not the window title or default value per se that I'm interested in. I'm trying to understand this:
When there's no controller on either the field or the title, typing in the field changes the window title immediately, on keypress. The title is directly linked to the field value, with no other angular hookup.
With controller bindings around the title and the field, typing in the field has no effect on the title.
What I've since realized (I think) is that ng-controller bindings create a new instance of the controller each time. Here's the non-working code I didn't show before:
<title ng-controller="TitleCtrl" ng-bind-template="{{windowTitle || 'Foo'}}">Foo</title>
...
<label ng-controller="TitleCtrl">
<input ng-model="windowTitle" type="text">
{{windowTitle}}
</label>
The value set by the model binding to the field is shown correctly within that instance of that controller, and updates on keypress, as before. But since those two controller instances are separate, the binding to the title works but the data it points to isn't bound to the field.
Isn't that right? The reason it works with no controllers is that that makes the value global, so the title binding sees the value set by the field binding.
So what's the canonical way to reference data from some other area? Create a service?
I realize that this is basic angular stuff, just getting started here, so thanks!
Edit 2
On reflection, I've come to seriously disrespect this whole question, even though I wrote it.
It's based on a way-too-early, poor understanding of the Angular application model. I had worked through only part of the official tutorials, and jumped ahead to removing all the js from a not big but not totally trivial existing app, and exploring what Angular could do in that context.
I got some very quick bang for the buck, getting several pieces of functionality working with very little code, and simple, clear markup, felt good. But I really had short-circuited internalizing the Angular way of thinking, and my quick and dirty no-architecture approach broke down when different parts of the page needed to coordinate with each other, as in this question.
I've postponed that project while I go back to tutorials and other learning. If other folks think this question should be deleted, I'd add my vote. Or maybe it's useful on some level, ignorant though it is.
Well, there are multiple ways to achieve the behavior you want without using an explicit controller and model. You could use:
<title ng-bind-template="{{windowTitle && windowTitle || 'default'}}"></title>
Or, more simply:
<title>{{windowTitle && windowTitle || 'default' }}</title>
In both cases, we're using the conditional expression:
(condition) && (answer if true) || (answer if false)
You should however strive to remove logic from the templates.

Is there a best method to evaluate URL variables to determine the appropriate page?

I am using ColdFusion 9.0.1.
I have a new web site that uses Bikes.cfm and Makers.cfm as template pages. I need to be able to pass BikeID and MakerID to both of these pages, along with other variables. I don't want to use the actual page name in the URL, such as this:
MyDomain.com/Bikes.cfm?BikeID=1234&MakerID=1234
I want my URL to look more like this:
MyDomain.com/?BikeID=1234&MakerID=1234
I need to NOT specify the page name in the URL.
I want these two URLs to access different data:
MyDomain.com/?BikeID=1234&MakerID=1234 // goes to bike page
MyDomain.com/?MakerID=1234&BikeID=1234 // goes to maker page
So, if BikeID appears in the URL before MakerID, go to the Bikes.cfm page. If MakerID appears before BikeID, go to the Makers.cfm page.
Is there an easy and existing method to arrange the URL keys in such a way to have them point to the appropriate page?
Should I just parse the URL as a list and determine the first ID and go to the appropriate page? Is there a better way?
Any thoughts or hints or ideas would be appreciated.
UPDATE -- It certainly appears that using the order of parameters in a URL is a bad idea for the following reasons:
1) many programs append variables to the URL
2) some programs may reorder the variables
3) GoogleBot may not consider order relevant and will most likely not index the site correctly.
Thanks to everyone who provided advice in a positive manner that my approach was probably a bad idea and would not produce the results I wanted. Thanks to everyone who suggested alternate means to produce the results I wanted.
If any of you positive people would like to put your positive comment/advice as an answer, I'd be happy to accept it as the answer.
Despite my grave misgivings about the whole idea, here's how I would do it if I were forced to do so:
index.cfm:
<cfswitch expression="#ListFirst(cgi.query_string, '=')#">
    <cfcase value="BikeID">
        <cfinclude template="Bikes.cfm">
    </cfcase>
    <cfcase value="MakerID">
        <cfinclude template="Makers.cfm">
    </cfcase>
    <cfdefaultcase>
        <cfinclude template="Welcome.cfm">
    </cfdefaultcase>
</cfswitch>
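For readers who don't use ColdFusion, the same first-key dispatch can be sketched in Python. `choose_template` and `TEMPLATES` are hypothetical names; the template file names come from the question. The caveat from the update still applies: anything that appends or reorders parameters changes the result.

```python
from urllib.parse import parse_qsl

# Hypothetical lookup from the first query key to the template to include.
# Keys are lowercased so that BikeID and bikeid are treated alike.
TEMPLATES = {"bikeid": "Bikes.cfm", "makerid": "Makers.cfm"}
DEFAULT_TEMPLATE = "Welcome.cfm"


def choose_template(query_string: str) -> str:
    """Pick a template based on whichever key appears first in the query."""
    # parse_qsl returns (key, value) pairs in their original order.
    pairs = parse_qsl(query_string, keep_blank_values=True)
    if not pairs:
        return DEFAULT_TEMPLATE
    first_key = pairs[0][0].lower()
    return TEMPLATES.get(first_key, DEFAULT_TEMPLATE)


print(choose_template("BikeID=1234&MakerID=1234"))
print(choose_template("MakerID=1234&BikeID=1234"))
```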

MVC3 Razor and Javascript - lots of syntax errors in green

I am starting to notice problems when I try and code my javascript and use functions that are in my viewmodel. Things like this:
case 37:
@if (Model.GoLeft)
{
Here I get a syntax error and the words "expected constant" for Model. Is there some solution to this? Do I need to upgrade something so it works?
I checked around on Stack Overflow. Someone else suggested that I should separate my JS, but that doesn't help me: in this case, for example, I want the keypress to do something only on a certain type of page where the Model allows it. If the JS is in another file, I can't code it this way.
thanks
Your approach is just wrong. Don't generate JS code in views with ifs. You definitely should keep your JS separate (so that the browser can effectively cache and reuse it). If you need to change the behaviour of client-side code according to model values, generate only some kind of "flags" (JS has multiple ways to do that; I am not an expert in JS, but a global variable, for example, always works, and there are more elegant and recommended ways), then test for their presence in your client-side method and branch your code accordingly.
