Data sanitization and DB storage

If I have the following data, what is the best option in terms of database storage?
Here is text&lt;br&gt;&lt;br&gt;Here is some more text
I see that I have 3 options:
Store in DB as it is, then decode at runtime: &lt;p&gt;hello&lt;/p&gt;
Decode and then store in DB: <p>hello</p>
Strip tags completely: Hello
Are there any big "no-nos" with any of the above? I'm just looking for some advice on best practice. It's also worth noting that I will have absolutely no control over the data that I receive.

Depending on your requirements, I suggest either stripping the tags or storing the unencoded version.
If you don't need the tags, then you can strip them and store the plain text.
If you need to preserve the tags and the formatting, then it's easier to save the unencoded version. Dealing with real tags is much simpler.
Also, it's the view's responsibility to encode the output. In fact, how to encode strictly depends on where you are going to print the string.
In the console, for example, tags don't create any issues. Encoding only matters when you need to print the string into an HTML view. Fortunately, Rails takes care of output sanitization for you, so you don't need to store the sanitized version in the database.
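As a rough illustration of encoding at output time: Rails views escape interpolated strings automatically, but plain ERB from Ruby's standard library does not, so this minimal sketch calls ERB::Util.html_escape explicitly. The render_comment helper and the template markup are hypothetical names made up for the example.

```ruby
require "erb"

# Hypothetical template. In a Rails view, <%= body %> would be escaped for you;
# plain ERB needs the explicit html_escape call.
COMMENT_TEMPLATE = ERB.new('<div class="comment"><%= ERB::Util.html_escape(body) %></div>')

# Render a stored (unencoded) string into HTML, escaping it at output time.
def render_comment(stored)
  COMMENT_TEMPLATE.result_with_hash(body: stored)
end

stored = "<p>hello</p>"      # what sits in the database: real tags
puts stored                  # console output needs no escaping
puts render_comment(stored)  # HTML output: the tags are entity-escaped
```

The stored value never changes; only the HTML rendering path escapes it.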

Convert the data to canonical form, and store that. That is, you should store <p>Hello</p> or Here is text<br><br>Here is some more text (though I doubt that's the decoding you intended for your example).
Then, you can search without having to worry about how it was encoded (&Ouml;, &#214; or Ö, for example?), and just encode it to whatever format is appropriate for display on rendering.
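One way this could look in Ruby, as a sketch only: decode once on the way in, store the result, and encode again only when rendering HTML. The canonicalize and for_html helpers are hypothetical names. CGI.unescapeHTML decodes numeric references and a handful of named entities (such as &amp;, &lt;, &gt; and &quot;), so input using other named entities would need a fuller decoder.

```ruby
require "cgi"

# Decode once on the way in, so the database holds the canonical text.
# The input is normalized to UTF-8 first so numeric references above 127 decode.
def canonicalize(raw)
  CGI.unescapeHTML(raw.encode("UTF-8"))
end

# Encode on the way out, and only when the target is HTML.
def for_html(stored)
  CGI.escapeHTML(stored)
end

incoming = "Here is text&lt;br&gt;&lt;br&gt;Here is some more text"
stored = canonicalize(incoming)
puts stored            # the canonical form, with real <br> tags
puts for_html(stored)  # escaped again for display in an HTML page

# Decimal and hexadecimal references collapse to the same stored character.
puts canonicalize("&#214;") == canonicalize("&#xD6;")
```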

Related

is it good practice to encode user input to database?

I am wondering if it is considered good practice to encode user input before storing it in the database.
Or is it OK not to encode user input?
Currently I encode it when it enters the database and use Html.DisplayFor to display it.

No. You want to keep the input in its original form until you need it and know what the output type is. It might be HTML for now, but if you later want to change it to JSON, a text file, XML, etc., the encoding might make it look different than you want.
So, first you want to make sure you are securely validating your input. It is a good idea to know the requirements for each of your inputs and validate that they are within the correct length, range, character set, etc. It is in your interest to limit the characters that are allowed for each input type. (If you use regular expressions to validate input, make sure you do not use one that is susceptible to a Regular Expression Denial of Service.)
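A minimal sketch of that kind of allow-list validation, written in Ruby (the same idea applies in any language). The field names, the length limits and the age range are assumptions chosen for illustration. The patterns use a single character class with a bounded quantifier and no nested repetition, which avoids the backtracking behind most regular expression denial-of-service cases. In Ruby, \A and \z anchor the whole string, whereas ^ and $ only anchor a line.

```ruby
# Hypothetical rule: 3 to 20 letters, digits or underscores, nothing else.
USERNAME_PATTERN = /\A[A-Za-z0-9_]{3,20}\z/

# Hypothetical rule: one to three digits, checked against a range afterwards.
AGE_PATTERN = /\A\d{1,3}\z/

def valid_username?(input)
  input.is_a?(String) && input.match?(USERNAME_PATTERN)
end

def valid_age?(input)
  input.is_a?(String) && input.match?(AGE_PATTERN) && input.to_i.between?(0, 150)
end

puts valid_username?("alice_01")          # accepted
puts valid_username?("alice\n<script>")   # rejected: \z does not stop at the newline
puts valid_age?("999")                    # rejected: out of range
```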
When moving the data around in your code, ensure that you handle it in a way that cannot turn it into an injection attack.
Since you are talking about a database, the best practice is to use parameterized statements. Check out the prevention methods in the above link.
Then, when it comes to output using MVC, if you are not using Raw or MvcHtmlString functions/calls, the output is automatically encoded. With the automatic encoding, you want to make sure you are using the AntiXss encoder and not the default (whitelist approach vs. blacklist).
If you are using Raw or MvcHtmlString, you want to make sure you COMPLETELY TRUST the values (you hard-coded them) or you manually encode them using the AntiXss Encoder class.

No, it is not necessary to encode all user input. If you want to avoid script injection, you may either validate the fields for special characters like '<', '>', '/', etc., or rely on your Html helper method, which will do the needful.

Removing HTML url tags in iOS

I am writing an iOS app that downloads some data from a server that's not under my control. I am not using custom data detectors. The strings in the returned JSON still contain their HTML url tags, and I want to remove them because I want to display the strings in a UITextView. I get strings like these:
<strong>Instagram</strong> / <strong>Behance</strong>
<strong>Live Now</strong>
What I really want is this:
Instagram Behance
Live Now
What is the best way to go about this?
Should I strip the url tags from the text using regex?
Would I lose the link "descriptions" (in the above example, "Instagram" and "Behance") when I do that?
Would this be way easier using a UIWebView?
If this would be too hard/impossible, it'd be okay to only have the urls, without their descriptions.
Thank you!

Should I strip the url tags from the text using regex?
No. HTML is too complex to be properly parsed using a RegEx. You'll need an XML parser.
Would I lose the link "descriptions" (in the above example, "Instagram" and "Behance") when I do that?
You wouldn't have to if you use an XML parser. With a RegEx you might, especially if you can't control exactly what's returned.
Would this be way easier using a UIWebView?
Yep. That's what I would do, unless you have a good reason not to.

Any problems with using a period in URLs to delimit data?

I have some easy to read URLs for finding data that belongs to a collection of record IDs that are using a comma as a delimiter.
Example:
http://www.example.com/find:1%2C2%2C3%2C4%2C5
I want to know if I can change the delimiter from a comma to a period. Since periods are not special characters in a URL, they won't have to be encoded.
Example:
http://www.example.com/find:1.2.3.4.5
Are there any browsers (Firefox, Chrome, IE, etc) that will have a problem with that URL?
There are some related questions here on SO, but none that specifically say whether it's a good or bad practice.

To me, that looks like a resource with an odd query string format.
If I understand correctly this would be equal to something like:
http://www.example.com/find?id=1&id=2&id=3&id=4&id=5
Since your filter is acting like a multi-select (IDs instead of search fields), that would be my guess at a standard equivalent.
Browsers should not have any issues with it, as long as the application's routing mechanism handles it properly, and as long as you are not building that query-like thing with an HTML form (in which case you would need JS or some rewrites, ew!).
May I ask why not use a more standard URL and query string? Perhaps something that includes the element class (/reports/search?name=...), just to know what is being queried by find. Just curious; I know sometimes standards don't apply.
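To make the comparison concrete, here is a small sketch using Ruby's standard uri library. The ids_from_path and ids_from_query helpers are hypothetical; a real application would do this parsing in its router. The form-component encoder escapes the comma but leaves the period alone, which matches the two example URLs in the question.

```ruby
require "uri"

# The comma is percent-encoded by the form encoder; the period is not.
puts URI.encode_www_form_component("1,2,3,4,5")
puts URI.encode_www_form_component("1.2.3.4.5")

# Hypothetical route handler: pull the ids out of a /find:1.2.3.4.5 path.
def ids_from_path(url)
  path = URI.parse(url).path
  path.delete_prefix("/find:").split(".").map { |id| Integer(id, 10) }
end

# The conventional alternative: repeat the parameter in a query string.
def ids_from_query(url)
  query = URI.parse(url).query.to_s
  URI.decode_www_form(query)
     .select { |key, _value| key == "id" }
     .map { |_key, value| Integer(value, 10) }
end

p ids_from_path("http://www.example.com/find:1.2.3.4.5")
p ids_from_query("http://www.example.com/find?id=1&id=2&id=3&id=4&id=5")
```

Both helpers return the same list of integers, so the choice is mostly about which URL shape the routing layer and any HTML forms can produce comfortably.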

Rails 3 dealing with special characters

I want to give users the ability to fill in an input field with special characters (e.g. ¥ and others).
User input could be saved in an XML file and later fetched and rendered back into the form input.
What is the best practice for saving special symbols to XML (maybe using HTML entities or hexadecimal form)?
Thanks in advance.

I'd say if you save the file in utf-8 you will have no problems.
If some controller/view has problems with encoding you have to place this in the first line:
# encoding: utf-8

There's nothing special about them and you don't need to encode them. Let your XML library deal with that. XML has supported Unicode from the start, and what you call "special symbols" are just Unicode characters.
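A small sketch of what this means in practice, using only core Ruby so it stays self-contained (a real application would normally let an XML library build the document). The build_field_xml and save_and_reload helpers and the field element are hypothetical. Only the XML metacharacters are escaped, via String#encode with xml: :text; the yen sign is written as plain UTF-8. The string literals use \u escapes so the sample does not depend on the source file's encoding.

```ruby
require "tempfile"

# Escape only &, < and >; characters such as the yen sign pass through untouched.
def build_field_xml(value)
  %(<?xml version="1.0" encoding="UTF-8"?>\n<field name="price">#{value.encode(xml: :text)}</field>\n)
end

# Write and read the file with an explicit UTF-8 encoding, independent of locale.
def save_and_reload(xml)
  Tempfile.create(["form", ".xml"]) do |file|
    file.close
    File.write(file.path, xml, encoding: "UTF-8")
    File.read(file.path, encoding: "UTF-8")
  end
end

value = "\u00A5100 & <more>"   # "\u00A5" is the yen sign
xml = build_field_xml(value)
puts xml
puts save_and_reload(xml) == xml
```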

XSS in Rails' JSON

I'm rendering content using Backbone in Rails. Some of the JSON properties I'm getting from the models will be HTML attributes, some of them might be used inside the JavaScript, and others will be inserted between HTML elements. All of these require different escaping mechanisms. How do people deal with this?

In our project we are using doT templates which (like most others) allow for interpolation with encoding ({{! ... }}). You could also try to encode all data and strip any possible JavaScript server side when the data is saved, to be 100% sure you won't get anything malicious.
Additionally, if you are using jQuery methods, remember to use the text method to insert data rather than html, as text will automatically encode it.
And I really recommend doT! It's super fast and we've managed to make it play really nicely with RequireJS.
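For the server side, here is a hedged sketch of how the contexts from the question differ, using only Ruby's standard library. The html_text and js_literal helpers are hypothetical names, and the \uXXXX substitution is an illustrative approach to making JSON safe inside an inline script, not a complete XSS defense.

```ruby
require "cgi"
require "json"

# HTML element content and quoted attribute values: entity-escape.
def html_text(value)
  CGI.escapeHTML(value)
end

# Inline <script> context: JSON-encode, then replace characters that could
# close the script element or act as line terminators with \uXXXX escapes.
def js_literal(value)
  JSON.generate(value).gsub(/[<>&\u2028\u2029]/) do |char|
    format("\\u%04x", char.ord)
  end
end

payload = '</script><img src=x onerror="alert(1)">'
puts %(<p title="#{html_text(payload)}">#{html_text(payload)}</p>)
puts %(<script>var bio = #{js_literal(payload)};</script>)
```

The JSON form still parses back to the original string, so client-side code receives the same value it would have without the extra escaping.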
