I am getting the links of a website using Html Agility Pack in a C# console application: I specify the divs I want and extract the links from those divs. My question is whether what I am doing is crawling or parsing. If it is not crawling, then what is crawling?
I'd like to implement a WYSIWYG HTML editor in my web application. I looked at Microsoft's AntiXSS library on CodePlex for cross-site scripting protection, and the feedback seems really bad.
The alternative I have in mind is converting the input from the editor to RTF and then back to HTML, and only then shipping it to the database so it can be served later. I understand that this is an incredibly inefficient method, but the question is whether this way I can guarantee no scripts at all. In other words, can this give me complete XSS protection?
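My site allows site users to write blog posts:
using System.Web.Mvc;

class BlogPost
{
    // [AllowHtml] turns off request validation for this property only
    [AllowHtml]
    public string Content { get; set; }
}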
The site is created using an MVC 5 Internet application template and uses Bootstrap 3 for its CSS. So I decided to use http://jhollingworth.github.io/bootstrap-wysihtml5 to take care of all the JavaScript part of a rich text editor.
It works like a charm. But in order to make the POST happen, I had to add the [AllowHtml] attribute as in the code above. So now I'm scared of dangerous stuff that can get into the database and in turn be displayed to all users.
I tried giving values like <script>alert("What's up?")</script> etc. in the form and it seemed to be fine: the text was displayed exactly as typed (<script> became &lt;script&gt;). But this conversion seemed to be done by the JavaScript plugin I used.
So I used Fiddler to compose a POST request with the same script tag, and this time the page actually executed the JavaScript code.
Is there any way I can figure out vulnerable input like <script> and even Link...?
Unfortunately, you have to sanitize the HTML yourself. See these questions for how other people did it:
How to sanitize input from MCE in ASP.NET? - whitelist using Html Agility Pack
.NET HTML Sanitation for rich HTML Input - blacklist using Html Agility Pack
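As a rough illustration of the whitelist approach with Html Agility Pack, here is a minimal sketch. The class name, the allowed tags and attributes, and the rule that only absolute http/https links survive are all assumptions made for this example, not something taken from the linked answers. It has not been hardened against every XSS vector, so treat it as a starting point rather than a finished sanitizer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class WhitelistSanitizer
{
    // Hypothetical whitelist: tag name -> attributes allowed on that tag.
    private static readonly Dictionary<string, string[]> AllowedTags =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "p", new string[0] },
            { "b", new string[0] },
            { "i", new string[0] },
            { "ul", new string[0] },
            { "li", new string[0] },
            { "a", new[] { "href" } },
        };

    // Tags that are removed together with everything inside them.
    private static readonly HashSet<string> DropWithContent =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "script", "style", "iframe", "object", "embed"
        };

    public static string Sanitize(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html ?? string.Empty);

        // Take a snapshot first, because the loop mutates the tree.
        foreach (HtmlNode node in doc.DocumentNode.Descendants().ToList())
        {
            if (node.NodeType == HtmlNodeType.Comment)
            {
                node.Remove();
                continue;
            }
            if (node.NodeType != HtmlNodeType.Element)
            {
                continue; // plain text nodes are kept as they are
            }
            if (DropWithContent.Contains(node.Name))
            {
                node.Remove();
                continue;
            }

            string[] allowedAttributes;
            if (!AllowedTags.TryGetValue(node.Name, out allowedAttributes))
            {
                // Unknown tag: drop the tag itself but keep its children.
                node.ParentNode.RemoveChild(node, true);
                continue;
            }

            foreach (HtmlAttribute attribute in node.Attributes.ToList())
            {
                bool allowed = allowedAttributes.Contains(
                    attribute.Name, StringComparer.OrdinalIgnoreCase);
                if (!allowed || !IsSafeValue(attribute))
                {
                    attribute.Remove();
                }
            }
        }
        return doc.DocumentNode.OuterHtml;
    }

    // Assumption: an href is only safe if it is an absolute http(s) URL.
    private static bool IsSafeValue(HtmlAttribute attribute)
    {
        if (!attribute.Name.Equals("href", StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }
        Uri uri;
        string value = HtmlEntity.DeEntitize(attribute.Value);
        return Uri.TryCreate(value, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

Calling WhitelistSanitizer.Sanitize(model.Content) in the controller before saving would be one place to apply it. A whitelist is generally preferred over a blacklist here, because anything not explicitly allowed is removed by default.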
An alternative to accepting HTML is to accept Markdown or BBCode instead. Both of them are widely used (Markdown is used by Stack Overflow!) and largely remove the need to sanitize the input, provided the renderer is set up to escape or strip any raw HTML embedded in the text. There are rich editors available too.
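For fields that need no formatting at all, a simpler option is to HTML-encode the text when it is written to the page. That is the same kind of conversion the question observed, where the script tag showed up as text instead of running. A minimal sketch using WebUtility.HtmlEncode from the .NET base class library (the class name and sample input are only for illustration):

```csharp
using System;
using System.Net;

public static class EncodingDemo
{
    // Encodes characters such as < > & and quotes so the browser shows
    // them as text instead of interpreting them as markup.
    public static string ToDisplayText(string userInput)
    {
        return WebUtility.HtmlEncode(userInput ?? string.Empty);
    }

    public static void Main()
    {
        string userInput = "<script>alert(\"What's up?\")</script>";
        Console.WriteLine(ToDisplayText(userInput));
    }
}
```

This does not help for the rich text field itself, since encoding would also turn the editor's legitimate tags into visible text.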
Edit
I found that the Microsoft Web Protection Library can sanitize HTML input through AntiXss.GetSafeHtml and AntiXss.GetSafeHtmlFragment.
The documentation is really poor, though, and it seems you can't configure which tags are valid.
I faced the same problem sanitizing wysihtml5 content on the server side. I was rather charmed by how wysihtml5 performs client-side sanitization, so I implemented the same approach using Html Agility Pack: HtmlRuleSanitizer on GitHub
It is also available as a NuGet package.
The reason for not using Microsoft's AntiXss is that it cannot enforce more detailed rules, such as what to do with individual tags. As a result, tags are deleted completely even when it would make sense to preserve their textual content. In addition, I wanted a whitelisting approach for everything (CSS, tags and attributes).
I am a newbie to jQuery and have a few questions about it.
Can I build a complete website with jQuery?
That is, in an ASP.NET website we use server controls to design a page. Can we implement all of that functionality on an .aspx page using jQuery?
If yes, how do we handle server-side events?
For designing .aspx pages, which should we prefer: jQuery's standard controls or plugins?
No. jQuery is not a server-side framework. It's a client-side DOM manipulation library and API that enables client-side code to work cross-browser, and it includes a variety of utility and helper functions for AJAX, deferred callback resolution, and generic functional programming.
In short, it is not meant to replace your server-side code.
jQuery is only a JavaScript library, which means it can only handle events or actions on the client side. It doesn't matter what backend you are using for your website (PHP, ASP.NET, Python): JavaScript only runs once the page has been rendered and sent to the browser. Try reading up on the docs for jQuery here: http://docs.jquery.com/
If you have any questions specifically about jQuery programming, we would be more than happy to answer them.
I have some concerns about HTML5-based mobile apps.
In jQuery Mobile I have seen some multipage templates that work well in Chrome as a web page. For mobile apps, single-page templates work well, but having so many lines of code in one HTML file makes it very hard to understand.
Is there any tool that can combine multiple HTML files into a single file, which would help with fast processing?
Also, what best practices can I follow to handle these issues?
Hi, I'm trying to make sense of your question, and I think you should probably go with something like http://www.codiqa.com/
There you can use a GUI to build jQuery Mobile apps.
They have a 15-day free trial (formerly 30), so you can check it out before you decide.