Easiest way of porting html table data to readable document - asp.net-mvc

Ok,
For the past 6 months I've been struggling to build a system that allows user input in the form of big sexy textareas (with loads of support for tables, lists etc.). It pretty much enables the user to input data as if it were Word. However, when it comes to exporting all this data, I haven't been able to find a working solution...
My first step was to try to find reporting software that supports raw HTML from the data source and renders it as normal HTML. That worked perfectly, except that the keep-together function is awful: either data is split in half (tables, lists etc.), which I don't want, or the report always skips to the next page to avoid this, ending up with 15+ empty pages in the final document.
So I'm looking for some kind of tip/direction on what would be the best solution to export my data into a readable document (PDF or Word preferred).
What I got is the following data breakdown, where data is often raw html.
-Period
--Unit
---Group
----Question
-----Data
What would be the best choice? Trying to render html to pdf or rtf? I need tips :(
And also sometimes the data is 2-3 pages long with mixed tables lists and plain text.

I would suggest that you try to keep this in the browser, and add a print stylesheet to the HTML to make it render one way on the screen and another way on paper. Adding a print stylesheet to your HTML is as easy as this:
<link rel="stylesheet" media="print" href="print.css">
You should be able to parse the input with something like Html Agility Pack and transform it (e.g. with XSLT) to whatever output format you want.
Another option is to write HTML to the browser, but with Content-Type set to a Microsoft Word-specific variant (there are several to choose from, depending on the version of Word you're targeting). This should make the browser ask whether the user wants to open the page with Microsoft Word. With Word 2007 and newer you can also write Office Open XML Word directly, since it's XML-based.
The content-types you can use are:
application/msword
For binary Microsoft Word files, but should also work for HTML.
application/vnd.openxmlformats-officedocument.wordprocessingml.document
For the newer "Office Open XML" formats of Word 2007 and newer.

A solution you could use is to run an application on the server using System.Diagnostics.Process that will convert the site and save it as a PDF document.
You could use wkhtmltopdf which is an open source console program that can convert from HTML to PDF or image.
The installer for Windows can be obtained from wkhtmltox-0.10.0_rc2 Windows Installer (i386).
After installing wkhtmltopdf you can copy the files in the installation folder inside your solution. You can use a setup like this in the solution:
The converted pdf's will be saved to the pdf folder.
And here is code for doing the conversion:
var wkhtmltopdfLocation = Server.MapPath("~/wkhtmltopdf/") + "wkhtmltopdf.exe";
var htmlUrl = @"http://stackoverflow.com/q/7384558/750216";
var pdfSaveLocation = "\"" + Server.MapPath("~/wkhtmltopdf/pdf/") + "question.pdf\"";
var process = new Process();
process.StartInfo.UseShellExecute = false;
process.StartInfo.CreateNoWindow = true;
process.StartInfo.FileName = wkhtmltopdfLocation;
process.StartInfo.Arguments = htmlUrl + " " + pdfSaveLocation;
process.Start();
process.WaitForExit();
The htmlUrl is the location of the page you need to convert to pdf. It is set to this stackoverflow page. :)
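The snippet above never checks whether the conversion succeeded. A minimal sketch of a more defensive wrapper follows, assuming the same `wkhtmltopdf <input> <output>` invocation and assuming the tool signals failure with a non-zero exit code, as console programs conventionally do. The names WkHtmlToPdfRunner, BuildStartInfo and Run are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class WkHtmlToPdfRunner
{
    // Builds the start info; both arguments are quoted so paths with spaces survive.
    public static ProcessStartInfo BuildStartInfo(string exePath, string htmlUrl, string pdfPath)
    {
        return new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = "\"" + htmlUrl + "\" \"" + pdfPath + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Runs the converter and throws if it reports a non-zero exit code.
    public static void Run(string exePath, string htmlUrl, string pdfPath)
    {
        using (var process = Process.Start(BuildStartInfo(exePath, htmlUrl, pdfPath)))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "wkhtmltopdf exited with code " + process.ExitCode);
        }
    }
}
```

Keeping the argument building separate from the process launch makes the quoting easy to check without having the executable installed.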

It's a general question, but two things come to mind: the Visitor pattern and changing the MIME type.
Visitor Pattern
You can have two separate rendering techniques. This would be up to your implementation.
MIME Type
When the request is made, write the data out in the Response:
HttpContext.Current.Response.Clear();
// Charset must name the same encoding as ContentEncoding, otherwise the header misdescribes the bytes.
HttpContext.Current.Response.Charset = "windows-1250";
HttpContext.Current.Response.ContentEncoding = System.Text.Encoding.GetEncoding("windows-1250");
HttpContext.Current.Response.AddHeader("content-disposition", string.Format("attachment; filename={0}.doc", filename));
HttpContext.Current.Response.ContentType = "application/msword";
// "\n" is the newline escape; "/n" would be written out literally.
HttpContext.Current.Response.Write("-Period");
HttpContext.Current.Response.Write("\n");
HttpContext.Current.Response.Write("--Unit");
HttpContext.Current.Response.Write("\n");
HttpContext.Current.Response.Write("---Group");
HttpContext.Current.Response.Write("\n");
HttpContext.Current.Response.Write("----Question");
HttpContext.Current.Response.Write("\n");
HttpContext.Current.Response.Write("-----Data");
HttpContext.Current.Response.Write("\n");
HttpContext.Current.Response.End();

Here is another option: use print screens (although it doesn't take care of scrolling, I think you should be able to build this in). This example can be expanded to meet the needs of your business, although it is a hack of sorts. You pass it a URL and it generates an image.
Call like this
protected void Page_Load(object sender, EventArgs e)
{
int screenWidth = Convert.ToInt32(Request["ScreenWidth"]);
int screenHeight = Convert.ToInt32(Request["ScreenHeight"]);
string url = Request["Url"].ToString();
string bitmapName = Request["BitmapName"].ToString();
WebURLToImage webUrlToImage = new WebURLToImage()
{
Url = url,
BrowserHeight = screenHeight,
BrowserWidth = screenWidth,
ImageHeight = 0,
ImageWidth = 0
};
webUrlToImage.GenerateBitmapForUrl();
webUrlToImage.GeneratedImage.Save(System.IO.Path.Combine(Server.MapPath("~/Images"), bitmapName + ".bmp"));
}
Generate an image from a webpage.
using System;
using System.Drawing;
using System.Windows.Forms;
using System.Threading;
using System.IO;
public class WebURLToImage
{
public string Url { get; set; }
public Bitmap GeneratedImage { get; private set; }
public int ImageWidth { get; set; }
public int ImageHeight { get; set; }
public int BrowserWidth { get; set; }
public int BrowserHeight { get; set; }
public Bitmap GenerateBitmapForUrl()
{
ThreadStart threadStart = new ThreadStart(ImageGenerator);
Thread thread = new Thread(threadStart);
thread.SetApartmentState(ApartmentState.STA);
thread.Start();
thread.Join();
return GeneratedImage;
}
private void ImageGenerator()
{
WebBrowser webBrowser = new WebBrowser();
webBrowser.ScrollBarsEnabled = false;
webBrowser.Navigate(Url);
webBrowser.DocumentCompleted += new
WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);
while (webBrowser.ReadyState != WebBrowserReadyState.Complete)
Application.DoEvents();
webBrowser.Dispose();
}
void webBrowser_DocumentCompleted(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser webBrowser = (WebBrowser)sender;
webBrowser.ClientSize = new Size(BrowserWidth, this.BrowserHeight);
webBrowser.ScrollBarsEnabled = false;
GeneratedImage = new Bitmap(webBrowser.Bounds.Width, webBrowser.Bounds.Height);
webBrowser.BringToFront();
webBrowser.DrawToBitmap(GeneratedImage, webBrowser.Bounds);
if (ImageHeight != 0 && ImageWidth != 0)
GeneratedImage =
(Bitmap)GeneratedImage.GetThumbnailImage(ImageWidth, ImageHeight,
null, IntPtr.Zero);
}
}

Related

C# MVC: Encoding a png, jpg, or pdf return value to prevent XSS

Suppose I have a C# MVC app which has a controller method that returns one of 3 content types: image/png, image/jpeg, or application/pdf. I have read that it is possible to have images that contain XSS payloads. What would be the best way to encode/escape these return contents so they aren't vulnerable to XSS? The controller method looks like this:
string contentType = "image/png";
MemoryStream mem = new MemoryStream();
if (ImageFormat == null || ImageFormat == "")
{
image.Save(mem, System.Drawing.Imaging.ImageFormat.Png);
}
else
{
if (ImageFormat.ToUpper() == "PNG") image.Save(mem, System.Drawing.Imaging.ImageFormat.Png);
if (ImageFormat.ToUpper() == "JPEG")
{
image.Save(mem, System.Drawing.Imaging.ImageFormat.Jpeg);
contentType = "image/jpeg";
}
}
mem.Position = 0;
mem.Seek(0, SeekOrigin.Begin);
return this.Image(mem, contentType);
Where Image is defined in the following class:
using …
namespace x.Classes
{
public static class ControllerExtensions
{
public static ImageResult Image(this Controller controller, Stream imageStream, string contentType)
{
return new ImageResult(imageStream, contentType);
}
}
}
And the OutputStream is written to using:
using …
namespace x.Classes
{
public class ImageResult : ActionResult
{
public ImageResult(Stream imageStream, string contentType)
{
if (imageStream == null)
throw new ArgumentNullException("imageStream");
if (contentType == null)
throw new ArgumentNullException("contentType");
this.ImageStream = imageStream;
this.ContentType = contentType;
}
public Stream ImageStream { get; private set; }
public string ContentType { get; private set; }
public override void ExecuteResult(ControllerContext context)
{
if (context == null)
throw new ArgumentNullException("context");
HttpResponseBase response = context.HttpContext.Response;
response.ContentType = this.ContentType;
byte[] buffer = new byte[4096];
while (true)
{
int read = this.ImageStream.Read(buffer, 0, buffer.Length);
if (read == 0)
break;
response.OutputStream.Write(buffer, 0, read);
}
response.End();
}
}
}
Is there a way for me to escape/encode the buffer that is getting written to the OutputStream here:
response.OutputStream.Write(buffer, 0, read);
To protect against XSS attacks? For example if this were HTML that was being returned:
response.OutputStream.Write(HttpUtility.HtmlEncode(buffer), 0, read);
But we know we are returning a jpeg, pdf, or png which means Html encode won't work here. So what do we use to safely escape/encode an image/pdf?
By the time you have buffer ready, it's too late. The same as with HTML, you want to context-sensitively encode any user input in those files, not the whole thing.
Now, with images this doesn't make much sense in the context of XSS, an image is rendered by an image renderer, and not as html, so there won't be any javascript to be run. The general best practice for uploaded images is to process them on the server and save them as a new image, because this removes all unnecessary things, but it has its risks as well if your processor itself is the target of an attack.
SVG, for example, is a different beast: SVG can have code in it, as can PDF. But again, PDFs will be opened on the client with a PDF viewer, not in the context of the web application, even if the PDF viewer is the browser itself (the browser hopefully separates JavaScript in the PDF from the web page even if the origin is the same).
But JavaScript in a PDF can still be an issue for the client. JavaScript running in a PDF may do harmful things, the simplest of which is consuming client resources (i.e. a DoS of some sort), or it may try to break out of the PDF context by exploiting a viewer vulnerability. So the attack would be that one user uploads a malicious PDF for others to download. I think the best you can do against this is to scan uploaded files for malware (which you should do anyway).
If you are generating all of this from user input (images, PDFs), then the libraries you use should take care of properly encoding values so that a malicious user can't inject code in a PDF. When the PDF is already generated, you can't "fix" it anymore, user input is mixed with code.
Also make sure to set the following header in responses (along with the correct Content-Type of course):
X-Content-Type-Options: nosniff
You do not need to encode the images themselves, you need to encode/escape the links to the images.
For example:
Link Title
where image.url.png?logout comes from user input.
You would url encode image.url.png?logout as image.url.png%3Flogout so that it is rendered useless to an attacker.
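On the server side, the encoding described above can be sketched as follows. The class and method names are hypothetical; `Uri.EscapeDataString` is the standard .NET call that percent-encodes everything except unreserved URL characters.

```csharp
using System;

public static class LinkEncoding
{
    // Percent-encodes the user-supplied value so characters such as "?" lose
    // their special meaning inside a URL.
    public static string EncodeUserSuppliedName(string userInput)
    {
        return Uri.EscapeDataString(userInput);
    }
}

// LinkEncoding.EncodeUserSuppliedName("image.url.png?logout")
// returns "image.url.png%3Flogout"
```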

Reload video after new source file in MVC website

I am working on a web service (using ASP.NET MVC), where I need to display two videos. These two videos are regularly updated.
The models used in the MVC project simply store the path to the videos (would it be a better idea to have the video files in the model?).
I expected that, after changing the video files (new .mp4 files, keeping the old path) and reloading, the videos would change, but this is not the case.
How can I make my videos update? So far I have tried creating an HTML function with "load()", similarly as in this link, but it does not work. (Also I am not sure this would be a good solution for me, since I need to force the update from the controller.)
Thank you!
Edit: Adding some code, as requested in the comments.
The model (yes, the name is old, I will change it to "video"):
public class Picture
{
public int PictureID { get; set; }
public string FilePath { get; set; }
public string PicName { get; set; }
public int Xsize { get; set; }
public int Ysize { get; set; }
public Picture()
{
FilePath = "../../VideoDatabase/";
PicName = "Default";
Xsize = 320;
Ysize = 200;
}
public void LinkNameAndPath()
{
FilePath = "../../VideoDatabase/" + PicName + ".mp4";
}
}
In the controller, creating a model object:
Picture CreateDefault(string pictureName)
{
Picture picture = new Picture();
picture.FilePath += "Candidates/" + pictureName + ".mp4";
picture.PicName = pictureName;
return picture;
}
In the controller, calling the view:
public ActionResult Index()
{
if (candidates.Count != 2)
{
return RedirectToAction("UnexpectedError");
}
else
{
return View(candidates);
}
}
In the view, the video object:
<video id="candidate0video" width="432" height="240" controls autoplay loop>
<source src="@Model[0].FilePath" type="video/mp4">
Your browser does not support the video tag.
</video>
You might want to debug your view model first. That way, you will be able to tell whether or not the view model is getting the correct values. If the file path is not updated correctly, then obviously the video won't display right on the page.
You probably should also check these things as well
Did the pictureName parameter get passed in successfully?
What about the method LinkNameAndPath()? Did it actually get called or not?
Update
public ActionResult Index()
{
var path = HostingEnvironment.MapPath("[Your file path]");
HttpContext.Response.AddHeader("Content-Disposition", "attachment; filename=[your file name].mp4");
// ReadAllBytes reads the complete file and closes the handle afterwards.
var objBytes = System.IO.File.ReadAllBytes(path);
HttpContext.Response.BinaryWrite(objBytes);
// The body has already been written, but the action still has to return a result.
return new EmptyResult();
}
<video id="candidate0video" width="432" height="240" controls autoplay loop>
<source src="@Url.Action("Index","[your controller name]")" type="video/mp4">
<!-- or try the following -->
<!-- <source src="@Url.Content(Model[0].FilePath)" type="video/mp4"> -->
Your browser does not support the video tag.
</video>
A friend of mine offered useful insight into the matter. I translate (and adapt) here what he said, which is the base of my solution to the problem (basically solved by using new names for the files that were updated):
It is complicated. The first time you download an asset, the server adds an HTTP "Cache-Control" header that tells the client to cache the file locally for a given time.
The idea is that you want caches active to reduce the load on your server. (In a development environment it may make sense to deactivate the cache.)
The typical solution is to rename the files automatically, adding a hash of the file's contents to the name. This way, if you have a text file (CSS/JS), clients will download the new version when you update the content, because the name changes with it.
I recommend that you look for a library that manages this automatically (supposing the videos are uploaded through your app and not directly over FTP).
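The renaming idea can be sketched in a few lines if you do not want a library. This is only an illustration: the class name, the method name and the choice of SHA-256 truncated to 8 hex characters are all assumptions, not something from the answer above.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class VersionedFileName
{
    // Inserts a short content hash before the extension, e.g. "clip.mp4"
    // becomes "clip.<8 hex chars>.mp4". New content gives a new name, so a
    // cached copy of the old file is never reused for the new video.
    public static string For(string fileName, byte[] content)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(content);
            string hex = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
            string stem = Path.GetFileNameWithoutExtension(fileName);
            string extension = Path.GetExtension(fileName); // includes the leading dot
            return stem + "." + hex.Substring(0, 8) + extension;
        }
    }
}
```

The model's FilePath would then store the versioned name produced when the video is uploaded, so the view picks up the new URL on the next request.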

Add PDFObject from already created pdf element

I have a pdf element that I am returning as a string base64 element since it is an MVC Web Application and the files live on a server. I am currently using PDFObject and pdf.js to try and view this PDF in the browser. However, I seem unable to display the PDF, unless I pass a url, which won't work when I put this application in IIS on a server.
So is there a way to have my embedded pdf with the src="{my base 64 string}, and then wrap the PDFObject around that? If not, is there a way, via PDFObject, to use a base64 string instead of a url?
Also, this is in IE 11
UPDATE
Here is my controller
public ActionResult GetPDFString(string instrumentType, string booktype, string book, string startpage, string EndPage)
{
LRS_Settings settings = ctxLRS.LRS_Settings.FirstOrDefault();
string root = settings.ImagePathRoot;
string state = settings.State;
string county = settings.County;
g_filePath = @"\\10.20.170.200\Imaging\GA\075\Daily\" + instrumentType + "\\" + book + "\\";
//g_filePath = @"\\10.15.100.225\sup_court\Imaging\GA\075\Daily\" + instrumentType + "\\" + book + "\\";
byte[] file = imgConv.ConvertTifToPDF(g_filePath, booktype, book, startpage, EndPage);
var ms = new MemoryStream(file);
var fsResult = new FileStreamResult(ms, "application/pdfContent");
return fsResult;
//return imgConv.ConvertTifToPDF(g_filePath, booktype, book, startpage, EndPage);
}
Here is my jquery
var options = {
pdfOpenParams: {
navpanes: 1,
toolbar: 0,
statusbar: 0,
pagemode: 'none',
pagemode: "none",
page: 1,
zoom: "page-width",
enableHandToolOnLoad: true
},
forcePDFJS: true,
PDFJS_URL: "/PDF.js/web/viewer.html"
}
PDFObject.embed("@Url.Action("GetPDFString", "DocumentView", new { instrumentType = ViewBag.instrumentType, BookType = Model.BookType, Book = ViewBag.Book, StartPage = ViewBag.StartPage, EndPage = ViewBag.endPage, style = "height:100%; width100%;" })", "#PDFViewer", options);
The problem is now, instead of showing the PDF inside of #PDFViewer, it is trying to download the file. Could someone please assist me on the final piece to the puzzle. This is driving me crazy.
Have you tried to use just the standard html to do this instead?
Controller Action
public ActionResult GetAttachment(string instrumentType, string booktype, string book, string startpage, string EndPage)
{
var fileStream = new FileStream(Server.MapPath("~/Content/files/sample.pdf"),
FileMode.Open,
FileAccess.Read
);
var fsResult = new FileStreamResult(fileStream, "application/pdf");
return fsResult;
}
In your view
<div id="PDFViewer">
<embed src="@Url.Action("GetAttachment", "DocumentView", new { instrumentType = ViewBag.instrumentType, BookType = Model.BookType, Book = ViewBag.Book, StartPage = ViewBag.StartPage, EndPage = ViewBag.endPage })" width="100%" height="100%" type="application/pdf"></embed>
</div>
Would this suit your requirements rather than using PDFObject?
Be sure to set the content disposition header to inline or the browser will try to download the file rather than render it in the viewport.
See Content-Disposition:What are the differences between "inline" and "attachment"?
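A minimal sketch of building that header value with the framework's `System.Net.Mime.ContentDisposition` class; the helper class, the method name and the file name are hypothetical.

```csharp
using System.Net.Mime;

public static class PdfHeaders
{
    // Builds a Content-Disposition value that asks the browser to render the
    // file in place instead of downloading it.
    public static string InlineDisposition(string fileName)
    {
        var disposition = new ContentDisposition
        {
            FileName = fileName,
            Inline = true // false would yield "attachment" and trigger a download
        };
        return disposition.ToString();
    }
}

// In a controller action, before returning the FileStreamResult:
// Response.AppendHeader("Content-Disposition", PdfHeaders.InlineDisposition("document.pdf"));
```

Note that the content type still has to be "application/pdf" for the browser to pick its PDF viewer.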
As far as PDFObject versus plain HTML, for troubleshooting I always recommend trying static markup (no JS) to display the same PDF. If it works there, the problem may lie with PDFObject (in this case, PDFObject's handling of Base64 strings). If the PDF is not properly rendered via plain markup, the issue probably lies with your file/Base64.
You can grab a copy of static markup from the PDFObject static markup generator: http://pdfobject.com/generator
(I should add that I can't speak to PDF.js' handling of Base64 strings...)

What templating library can be used with Asp .NET MVC?

In my MVC 5 app I need to be able to dynamically construct a list of fully qualified external URL hyperlinks, along with some additional data, which will come from the Model passed in. I figure I will need to construct my anchor tags something like this:
{{linkDisplayName}}
with AngularJS this would be natural, but, I have no idea how this is done in MVC.
Is there a templating library that can be used for this?
1) Create a model to Hold the Links
public class LinkObject
{
public string Link { get; set; }
public string Description { get; set; }
}
2) In your action you can use ViewBag, ViewData or even pass the list inside your model. I will show you how to do it using ViewBag
public ActionResult MyDynamicView()
{
//Other stuff and code here
ViewBag.LinkList = new List<LinkObject>()
{
new LinkObject{ Link ="http://mylink1.com", Description = "Link 1"},
new LinkObject{ Link ="http://mylink2.com", Description = "Link 2"},
new LinkObject{ Link ="http://mylink3.com", Description = "Link 3"}
};
return View(/*pass the model if you have one*/);
}
3) In the View, just use a loop:
@foreach (var item in (List<LinkObject>)ViewBag.LinkList)
{
<a href="@item.Link">@item.Description</a>
}
Just create a manual one for that, no need to do it from a template. For example, in javascript
function groupAnchor(url,display){
var a = document.createElement("a");
a.href = url;
a.className = "list-group-item";
a.target = "_blank";
a.innerHTML = display;
return a;
}
And then use that function to modify your html structure
<div id="anchors"></div>
<script>
document.getElementById("anchors").appendChild(groupAnchor("http://google.com","Google"));
</script>
Your approach to modification will more than likely be more advanced than this, but it demonstrates the concept. If you need these values to come from server side then you could always iterate over a set using @foreach() and issue either the whole html or script calls there -- or, pass the set from the server in as json and then use that in a function which is set up to manage a list of anchors.
To expand on this, it is important to avoid sending html to the view from a razor iteration. The reason being that html constructed by razor will increase the size of the page load, and if this is done in a list it can be a significant increase.
In your action, construct the list of links and then serialize them so they can be passed to the view
public ActionResult ViewWithLinks()
{
var vm = new ViewModel();
vm.Links = LinkSource.ToList(); // Json(...).Data is typed as object and cannot be assigned to ICollection<Link>
//or for a very simple test for proof of concept
var Numbers = Json(Enumerable.Range(0,100).ToList()).Data;
ViewData["numbers"] = Numbers ;
return View(vm);
}
where all you need is an object to hold the links in your view model
public class ViewModel
{
public ICollection<Link> Links { get; set; }
}
public class Link
{
public string text { get; set; }
public string href { get; set; }
}
and then in your view you can consume this json object
var allLinks = @Html.Raw(Json.Encode(Model.Links));
var numbersList = @Html.Raw(Json.Encode(ViewData["numbers"]));//simple example
Now you can return to the above function in order to place it on the page by working with the array of link objects.
var $holder = $("<div>");
for(var i = 0; i < allLinks.length; i++){
$holder.append(groupAnchor(allLinks[i].href,allLinks[i].text));
}
$("#linkArea").append($holder);
The benefit is that all of this javascript can be cached for your page. It is loaded once and is capable of handling large amounts of links without having to worry about sending excessive html to the client.

ASP.NET MVC Dynamically generated image URLs

I have an ASP.NET MVC application where I am displaying images.
These images could be located on the file system or inside a database. This is fine as I can use Url.Action in my image, call the action on my controller and return the image from the relevant location.
However, I want to be able to support images stored in Amazon S3. In this case, I don't want my controller action to return the image, it should instead generate an image URL for Amazon S3.
Although I could just perform this logic inside my view e.g.
<%if (Model.Images[0].ImageLocation == ImageLocation.AmazonS3) {%>
// render amazon image
I need to ensure that the image exists first.
Essentially I need to pass a size value to my controller so that I can check that the image exists in that size (whether it be in the database, file system or amazon s3). Once I am sure that the image exists, then I return the URL to it.
Hope that makes sense,
Ben
Try the following approach.
A model class for an image tag.
public class ImageModel
{
public String Source { get; set; }
public String Title { get; set; }
}
Helper
public static String Image(this HtmlHelper helper, String source, String title)
{
var builder = new TagBuilder("img");
builder.MergeAttribute("src", source);
builder.MergeAttribute("title", title);
return builder.ToString();
}
View with Model.Images of type IEnumerable<ImageModel>
...
<%= Html.Image(Model.Images[0].Source, Model.Images[0].Title) %>
Action
public ActionResult ActionName(/*whatever*/)
{
// ...
var model = ...;
//...
var model0 = new ImageModel();
if (Image0.ImageLocation == ImageLocation.AmazonS3)
model0.Source = "an amazon url";
else
model0.Source = Url.Action("GetImageFromDatabaseOrFileSystem", "MyController", new { Id = Image0.Id });
model0.Title = "some title";
model.Images.Add(model0);
// ...
return View(model);
}
The action is pseudocode of sorts, but the idea should be clear.
After several iterations I have come up with a workable solution, although I'm still not convinced it's the best solution.
Originally I followed Anton's suggestion and just set the image url accordingly within my controller action. This was simple enough with the following code:
products.ForEach(p =>
{
p.Images[0].Url = _mediaService.GetImageUrl(p.Images[0], 200);
});
However, I soon found that this approach did not give me the flexibility I needed. Often I will need to display images of different sizes and I don't want to use properties of my model for this such as Product.FullSizeImageUrl, Product.ThumbnailImageUrl.
As far as "Product" is concerned it only knows about the images that were originally uploaded. It doesn't need to know about how we manipulate and display them, or whether we are caching them in Amazon S3.
In web forms I might use a user control to display product details and then use a repeater control to display images, setting the image URLs programmatically in code-behind.
I found that the use of RenderAction in ASP.NET MVC gave me similar flexibility:
Controller Action:
[ChildActionOnly]
public ActionResult CatalogImage(CatalogImage image, int targetSize)
{
image.Url = _mediaService.GetImageUrl(image, targetSize);
return PartialView(image);
}
Media Service:
public MediaCacheLocation CacheLocation { get; set; }
public string GetImageUrl(CatalogImage image, int targetSize)
{
string imageUrl;
// check image exists
// if not exist, load original image from store (fs or db)
// resize and cache to relevant cache location
switch (this.CacheLocation) {
case MediaCacheLocation.FileSystem:
imageUrl = GetFileSystemImageUrl(image, targetSize);
break;
case MediaCacheLocation.AmazonS3:
imageUrl = GetAmazonS3ImageUrl(image, targetSize);
break;
default:
imageUrl = GetDefaultImageUrl();
break;
}
return imageUrl;
}
Html helper:
public static void RenderCatalogImage(this HtmlHelper helper, CatalogImage src, int size) {
helper.RenderAction("CatalogImage", "Catalog", new { image = src, targetSize = size });
}
Usage:
<%Html.RenderCatalogImage(Model.Images[0], 200); %>
This now gives me the flexibility I require and will support both caching the resized images to disk or saving to Amazon S3.
Could do with some url utility methods to ensure that the generated image URL supports SSL / virtual folders - I am currently using VirtualPathUtility.
Thanks
Ben
You can create an HttpWebRequest to load the image. Check the status code of the response: if it's 200, the request was successful; otherwise something went wrong.
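A minimal sketch of that check, assuming the image URL is publicly readable (a private S3 object would need a signed request instead). The class name, method name and timeout value are hypothetical; a HEAD request is used so that only headers are transferred.

```csharp
using System;
using System.Net;

public static class RemoteImageCheck
{
    // Returns true only when the server answers the HEAD request with 200 OK.
    public static bool Exists(string imageUrl)
    {
        var request = (HttpWebRequest)WebRequest.Create(imageUrl);
        request.Method = "HEAD";
        request.Timeout = 5000; // milliseconds; an arbitrary choice for this sketch
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode == HttpStatusCode.OK;
            }
        }
        catch (WebException)
        {
            // GetResponse throws for 4xx/5xx responses and for connection failures.
            return false;
        }
    }
}
```

Since this adds a network round trip per image, caching the result alongside the resized image would avoid repeating it on every page view.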

Resources