Editing a big imported file on a second page - asp.net-mvc

This is mostly a theoretical question, since I can actually implement it either way, but it confuses me a bit. So, suppose I present a user with a page to select an Excel file, which is then uploaded to the server. Server code parses the file and presents the user with another page with many options. The user can select and deselect some of them, edit names, and then click OK, after which the server has to process only the selected options.
The question may be:
is it better to store parsed file in Session?
is it better to push parsed data to client's page and then receive it back?
Here's an example:
public class Data
{
public string Name { get; set; } // shown to user, can be changed
public bool Selected { get; set; } // this is in ViewModel but anyway
public string[] InternalData { get; set; } // not shown to user
}
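// One option is to receive all the data via POST (postitems then carries InternalData too)
public ActionResult ImportConfirmed(IList<Data> postitems)
{
// The other option is to keep the parsed data in Session and receive only the user's changes via POST
var items = Session["items"] as IList<Data>;
// Pseudocode: keep the items whose posted counterpart is selected, then apply the posted names
items = items.Where(postitems of same name selected);
items.ForEach(set name to postitems name);
}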
Obviously option #2 has fewer side effects, since it does not have global state. But in option #1 we don't push loads of useless-to-user data to the client. And this can be a lot.
Of course this problem is not new, and as always, the answer is: it depends.
I have to admit, I don't have an exact question in mind. I can't even tell why I don't like the Session solution, which takes only a couple of additional lines of code. The reason I ask is that I've read about the Weblocks concept and was very impressed. So I tried to invent something similar in ASP.NET MVC and failed. Thus, I wonder: is there any elegant way to deal with such situations? By elegant I mean something that doesn't show that it uses Session, is easy to use, handles expiration (cleans up the Session if the user does not press the final "Save" button), etc. Something like:
var data = parse(filestream);
var confirmationPostData = ShowView("Confirm", data);
items = items.Where(confirmationPostData of same name selected);
items.ForEach(set name to confirmationPostData name);
Here ShowView actually sends the GET response, waits for the user's POST, and returns. Kind of. I do not insist, I just show the way that impressed me (in Weblocks, if I actually understood it correctly).
Does everyone just use Session in such cases? Or is there a better way (except learning LISP which I already started to investigate if I can cope with)? Maybe, async actions in MVC v2 do it?
UPDATE: storing in DB/temp files, it works. I do sometimes store in DB. However this needs a way to expire the data since user may just abandon it (as simple as closing the browser). What I'm asking for: is there a proven and elegant way to solve it - not about how to do it. An abstraction built on top of serialization not tied to particular DB/file implementation, something like this.

I'm not sure what the purpose of uploading the Excel file is, but I like to make all actions that affect the long term state of the application, for the user, persisted. For example, what if the user uploads the file, changes a couple of options, then goes to lunch. If you store the info in session, it may be gone when they get back, ditto for storing it in the page with hidden variables. What about storing it in a DB?

I would store the file at the temp folder and only associate the name of the file with the user session so that later it can be processed:
// Create a temp file in the Temp folder and return its name:
var tempFile = Path.GetTempFileName();
// write to the temp file and put the filename into the session
// so that the next request can fetch the file and process it
There's a flaw with the GetTempFileName that I once fell into because I didn't read the documentation carefully. It says that the method will start throwing exceptions if you have more than 65535 files in the temp folder. So remember to always delete the temp file once you've finished processing it.
Another alternative to the temp folder would be to store the file in a database, but I am a little skeptical about storing files inside a relational database.
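As a rough sketch of the temp-file approach with cleanup of abandoned uploads: the class name `TempUploadStore`, the `import-uploads` folder, and the `.upload` extension are all assumptions made for illustration, not something from the answer above. It uses GUID-based file names in a dedicated folder, which sidesteps the GetTempFileName limit, and keeps only an opaque token for the Session.

```csharp
using System;
using System.IO;

// Hypothetical helper: keeps uploads as temp files and purges abandoned ones.
public static class TempUploadStore
{
    // Assumed dedicated folder so cleanup never touches unrelated temp files.
    private static readonly string Root =
        Path.Combine(Path.GetTempPath(), "import-uploads");

    // Saves the upload and returns an opaque token to keep in Session.
    public static string Save(Stream upload)
    {
        Directory.CreateDirectory(Root);
        string token = Guid.NewGuid().ToString("N");
        using (var file = File.Create(PathFor(token)))
        {
            upload.CopyTo(file);
        }
        return token;
    }

    // Opens a previously saved upload, or returns null if it is gone.
    public static Stream Open(string token)
    {
        string path = PathFor(token);
        return File.Exists(path) ? File.OpenRead(path) : null;
    }

    // Deletes the file once processing has finished.
    public static void Delete(string token)
    {
        File.Delete(PathFor(token));
    }

    // Removes uploads older than maxAge; returns how many were deleted.
    public static int PurgeOlderThan(TimeSpan maxAge)
    {
        if (!Directory.Exists(Root)) return 0;
        int removed = 0;
        DateTime cutoff = DateTime.UtcNow - maxAge;
        foreach (string path in Directory.GetFiles(Root, "*.upload"))
        {
            if (File.GetLastWriteTimeUtc(path) < cutoff)
            {
                File.Delete(path);
                removed++;
            }
        }
        return removed;
    }

    private static string PathFor(string token)
    {
        return Path.Combine(Root, token + ".upload");
    }
}
```

The token would go into Session after the upload, and `PurgeOlderThan` could be called from whatever periodic hook the application already has, which covers the case where the user simply closes the browser.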

Related

What is available for limiting the use of expand when using Breezejs, so that users can't get access to sensitive data

Basically this comes up as one of the related posts:
Isn't it dangerous to have query information in javascript using breezejs?
It was somewhat what my first question was about, but accepting the answers there, I would really appreciate it if someone had examples or tutorials on how to limit the scope of what's visible to the client.
I started out with the Knockout/Breeze template and changed it for what I am doing. I am sitting with an almost finished project with one concern: security.
I have authentication fixed and am working on authorization, trying to figure out how to make sure people can't get something that was not intended for them to see.
I got the first layer fixed on the root model, so that a member can only see stuff he created or that is public. But a user may hack together a query using expand to fetch Object.Member.Identities, meaning he gets all the identities for public objects.
Are there any tutorials out there that could help me limit what the user may query?
Should I wrap the returned objects in an ObjectDto, so that when creating it I can verify that it does not include sensitive information?
It's nice that it's up to me how I do it, but some tutorials with some pointers would be nice.
Code
controller
public IQueryable<Project> Projects()
{
//var q = Request.GetQueryNameValuePairs().FirstOrDefault(k=>k.Key.ToLower()=="$expand").Value;
// if (!ClaimsAuthorization.CheckAccess("Projects", q))
// throw new WebException("HET");// UnauthorizedAccessException("You requested something you do not have permission too");// HttpResponseException(HttpStatusCode.MethodNotAllowed);
return _repository.Projects;
}
_repository
public DbQuery<Project> Projects
{
get
{
var memberid = User.FindFirst("MemberId");
if (memberid == null)
return (DbQuery<Project>)(Context.Projects.Where(p=>p.IsPublic));
var id = int.Parse(memberid.Value);
return ((DbQuery<Project>)Context.Projects.Where(p => p.CreatedByMemberId == id || p.IsPublic));
}
}
Look at applying the Web API's [Queryable(AllowedQueryOptions=...)] attribute to the method or doing some equivalent restrictive operation. If you do this a lot, you can subclass QueryableAttribute to suit your needs. See the Web API documentation covering these scenarios.
It's pretty easy to close down the options available on one or all of your controller's query methods.
Remember also that you have access to the request query string from inside your action method. You can check quickly for "$expand" and "$select" and throw your own exception. It's not that much more difficult to block an expand for known navigation paths (you can create white and black lists). Finally, as a last line of defense, you can filter for types, properties, and values with a Web API action filter or by customizing the JSON formatter.
The larger question of using authorization in data hiding/filtering is something we'll be talking about soon. The short of it is: "Where you're really worried, use DTOs".
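A minimal sketch of the whitelist idea for "$expand", assuming the raw option value has already been read from the request query string. The class name `ExpandGuard` and the navigation paths in the whitelist are hypothetical and only illustrate the shape of the check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical guard: rejects $expand paths that are not explicitly allowed.
public static class ExpandGuard
{
    // Assumed whitelist; the navigation paths here are illustrative only.
    private static readonly HashSet<string> AllowedPaths =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "Member",
            "Tasks"
        };

    // expandValue is the raw value of the $expand query option, e.g. "Member,Tasks".
    public static bool IsAllowed(string expandValue)
    {
        if (string.IsNullOrWhiteSpace(expandValue)) return true; // nothing requested

        // Normalize "Member/Identities" to "Member.Identities" and require
        // every requested path to be on the whitelist.
        return expandValue
            .Split(',')
            .Select(path => path.Trim().Replace('/', '.'))
            .All(path => AllowedPaths.Contains(path));
    }
}
```

With this whitelist, `ExpandGuard.IsAllowed("Member")` passes while `ExpandGuard.IsAllowed("Member/Identities")` does not, so the action method could throw before the query ever reaches the repository.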

Hydrate related objects

I am looking for a simple way to hydrate a related object. A Note belongs to a Document and only owners of a Document can add Notes so when a user tries to edit a Note, I need to hydrate the related Document in order to find out if the user has access to it. In my Service layer I have the following:
public void editNote(Note note)
{
// Get the associated Document object (required for validation) and validate.
int docID = noteRepository.Find(note.NoteID).DocumentID;
note.Document = documentRepository.Find(docID);
IDictionary<string, string> errors = note.validate();
if (errors.Count > 0)
{
throw new ValidationException(errors);
}
// Update Repository and save.
noteRepository.InsertOrUpdate(note);
noteRepository.Save();
}
Trouble is, noteRepository.InsertOrUpdate(note) throws an exception with "An object with the same key already exists in the ObjectStateManager." when the repository sets EntityState.Modified. So a number of questions arise:
Am I approaching this correctly and if so, how do I get around the exception?
Currently, the controller edit action takes in a NoteCreateEditViewModel. This does have a DocumentID field, as it is required when creating a new Note: we need to know which Document to attach it to. But for edit, I cannot use it, as a malicious user could provide a DocumentID to which they do have access and thus edit a Note they don't own. So should there be separate viewmodels for create and edit, or can I just exclude the DocumentID somehow on edit? Or is there a better way to go about viewmodels such that an ID is not required?
Is there a better way to approach this? I have read that I should just have a Document repository as an aggregate and lose the Note repository but am not sure if/how this helps.
I asked a similar question related to this but it wasn't very clear so hoping this version will allow someone to understand and thus point me in the right direction.
EDIT
Based on the information provided by Ladislav Mrnka and the answer detailed in "An object with the same key already exists in the ObjectStateManager. The ObjectStateManager cannot track multiple objects with the same key", it seems that my repository method needs to look like the following:
public void InsertOrUpdate(Note note)
{
if (note.NoteID == default(int)) {
// New entity
context.Notes.Add(note);
} else {
// Existing entity
//context.Entry(note).State = EntityState.Modified;
context.Entry(oldNote).CurrentValues.SetValues(note);
}
}
But how do I get the oldNote from the context? I could call context.Entry(Find(note.NoteID)).CurrentValues.SetValues(note) but am I introducing potential problems here?
Am I approaching this correctly and if so, how do I get around the exception?
I guess this part of your code loads the whole Note from the database to find the DocumentID:
int docID = noteRepository.Find(note.NoteID).DocumentID;
In that case your InsertOrUpdate cannot take your note and attach it to the context in the Modified state, because you already have a note with the same key in the context. A common solution is to use this:
objectContext.NoteSet.ApplyCurrentValues(note);
objectContext.SaveChanges();
But for edit, I cannot use it as a malicious user could provide a DocumentID to which they do have access and thus edit a Note they don't own.
In that case you must add some security. You can put any data into hidden fields in your page, but data that mustn't be changed by the client needs additional protection: for example, a second hidden field containing either a signature computed on the server or a hash of the salted value computed on the server. When the data returns to the server in the next request, the server must recompute the signature or hash (with the same salt) and check that the passed value and the computed value are the same. Of course, the client mustn't know the secret used to compute the signature or the salt used in the hash.
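The signature variant could be sketched as follows, using HMAC-SHA256 as one possible choice of signature. The class name `HiddenFieldSigner` is hypothetical, and how the secret is stored and loaded on the server is left open.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: signs a hidden-field value so tampering can be detected.
public static class HiddenFieldSigner
{
    // Computes a Base64 HMAC-SHA256 signature of the value using a server-side secret.
    public static string Sign(string value, byte[] secret)
    {
        using (var hmac = new HMACSHA256(secret))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(value));
            return Convert.ToBase64String(hash);
        }
    }

    // Recomputes the signature and compares it with the one posted back.
    public static bool Verify(string value, string signature, byte[] secret)
    {
        if (value == null || signature == null) return false;
        byte[] expected = Encoding.UTF8.GetBytes(Sign(value, secret));
        byte[] actual = Encoding.UTF8.GetBytes(signature);

        // Compare every byte so the time taken does not reveal where a mismatch is.
        int diff = expected.Length ^ actual.Length;
        for (int i = 0; i < expected.Length && i < actual.Length; i++)
        {
            diff |= expected[i] ^ actual[i];
        }
        return diff == 0;
    }
}
```

For the Note scenario, the signed value would probably need to combine the NoteID and the DocumentID (for example "noteId:documentId"), so that a signature issued for one note cannot be replayed with another.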
I have read that I should just have a Document repository as an aggregate and lose the Note repository but am not sure if/how this helps.
This is a cleaner way to use repositories, but it will not help you with your particular error because you will still need the Note and the DocumentId.

How can I store user information in MVC between requests

I have an MVC2-site using Windows authentication.
When the user requests a page I pull some user information from the database. The class I retrieve is a Person class.
How can I get this from the database when the user enters the site, and pick up the same object without touching the db on all subsequent page requests?
I must admit, I am pretty lost when it comes to session handling in ASP.net MVC.
You can store that kind of information in HttpContextBase.Session.
One option is to retrieve the Person object from your database on the first hit and store it in System.Web.HttpContext.Current.Cache, this will allow extremely fast access and your Person data will be temporarily stored in RAM on the web server.
But be careful: if you store a large amount of user data in this way, you could eat up a lot of memory. Nevertheless, this will be perfectly fine if you only need to cache a few thousand objects or so. Clearly, it depends on how many users you expect to be using your app.
You could add it like this:
private void CachePersonData (Person data, string storageKey)
{
if (HttpContext.Current.Cache[storageKey] == null)
{
HttpContext.Current.Cache.Add(storageKey,
data,
null,
Cache.NoAbsoluteExpiration,
TimeSpan.FromDays(1),
CacheItemPriority.High,
null);
}
}
... and retrieve like this:
// Grab data from the cache
Person p = HttpContext.Current.Cache[storageKey] as Person; // the cache stores object, so a cast is needed
Don't forget that the object returned from the cache could be null, so you should check for this and load from the database as necessary (then cache).
First of all, if you are in a load-balanced environment, I wouldn't recommend any solution that doesn't store the data in a database, because it will eventually fail.
If you are not in a load balancing environment, you can use TempData to store your object and then retrieve it in the subsequent request.
HttpContext.Current.Session[key];

Data Access Layer - static list objects and caching

I am developing a site using .NET MVC.
I have a data access layer which basically consists of static list objects that are created from data within my database.
The method that rebuilds this data first clears all the list objects. Once they are empty, it then adds the data. Here is an example of one of the lists I'm using. It is filled by a method which generates all the UK postcodes. There are about 50 methods similar to this in my application that return all sorts of information, such as towns, regions, members, emails etc.
public static List<PostCode> AllPostCodes = new List<PostCode>();
When the rebuild method is called, it first clears the list.
ListPostCodes.AllPostCodes.Clear();
Next it rebuilds the data by calling the GetAllPostCodes() method:
/// <summary>
/// static method that returns all the UK postcodes
/// </summary>
public static void GetAllPostCodes()
{
using (fab_dataContextDataContext db = new fab_dataContextDataContext())
{
IQueryable AllPostcodeData = from data in db.PostCodeTables select data;
IDbCommand cmd = db.GetCommand(AllPostcodeData);
SqlDataAdapter adapter = new SqlDataAdapter();
adapter.SelectCommand = (SqlCommand)cmd;
DataSet dataSet = new DataSet();
cmd.Connection.Open();
adapter.FillSchema(dataSet, SchemaType.Source);
adapter.Fill(dataSet);
cmd.Connection.Close();
// create the objects
foreach (DataRow row in dataSet.Tables[0].Rows)
{
PostCode postcode = new PostCode();
postcode.ID = Convert.ToInt32(row["PostcodeID"]);
postcode.Outcode = row["OutCode"].ToString();
postcode.Latitude = Convert.ToDouble(row["Latitude"]);
postcode.Longitude = Convert.ToDouble(row["Longitude"]);
postcode.TownID = Convert.ToInt32(row["TownID"]);
AllPostCodes.Add(postcode);
postcode = null;
}
}
}
The rebuild occurs every hour. This ensures that every hour the site has a fresh set of cached data.
The issue I've got is that occasionally, if the server is hit by a request during a rebuild, an exception is thrown. The exception is "Index was outside the bounds of the array." It happens while a list is being cleared.
ListPostCodes.AllPostCodes.Clear(); // throws the exception, although it's not always this list
Once this exception is thrown, the application dies and all users are affected. I have to restart the server to fix it.
I have 2 questions...
If I utilise caching instead of static objects, would this help?
Is there any way I can say "while the rebuild is taking place, wait for it to complete before accepting requests"?
Any help is most appreciated ;)
truegilly
If I utilise caching instead of static objects, would this help?
Yes, everything you are doing is more easily done with the caching functionality that is built into ASP.NET.
Is there any way I can say "while the rebuild is taking place, wait for it to complete before accepting requests"?
The common pattern goes like this:
You request data from the Data layer
If the Data layer sees that there is data in the cache, then it serves the data from the cache
If no data is in the cache the data is requested from the db and put into cache. After that it is served to the client
There are rules (CacheDependency and Timeout) when the cache is to be cleared.
The easiest solution would be you stick to this pattern: This way the first request would hit the database and other requests get served from the cache. You trigger the refresh by implementing an SQLCacheDependency
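The pattern described above could be sketched without any ASP.NET dependency as follows. This is a hand-rolled illustration with a fixed time-to-live, not the ASP.NET Cache API; the class name `TimedCache` and its members are assumptions.

```csharp
using System;

// Hypothetical cache-aside holder with a fixed time-to-live; not the ASP.NET Cache API.
public sealed class TimedCache<T> where T : class
{
    private readonly object _gate = new object();
    private readonly Func<T> _load;
    private readonly TimeSpan _ttl;
    private T _value;
    private DateTime _expiresUtc;

    public TimedCache(Func<T> load, TimeSpan ttl)
    {
        _load = load;
        _ttl = ttl;
    }

    // Serves the cached value, reloading it first when it is missing or expired.
    public T Get()
    {
        lock (_gate)
        {
            if (_value == null || DateTime.UtcNow >= _expiresUtc)
            {
                _value = _load();
                _expiresUtc = DateTime.UtcNow + _ttl;
            }
            return _value;
        }
    }
}
```

Because the reload happens inside the lock, requests that arrive during a rebuild wait for it to finish rather than seeing a half-filled list. The cost is that all readers briefly serialize on the lock, which may or may not matter at the site's traffic level.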
You have to make sure that your list is not modified by one thread while other threads are trying to use it. This would be a problem even if you used the ASP.NET cache since collections are just not thread-safe. One way you can do this is by using a SynchronizedCollection instead of a List. Then make sure to use code like the following when you access the collection:
lock (synchronizedCollection.SyncRoot) {
synchronizedCollection.Clear();
etc...
}
You will also have to use locking when you read the collection. If you are enumerating over it, you should probably make a copy before doing so as you don't want to lock for a long time. For example:
List<whatever> tempCollection;
lock (synchronizedCollection.SyncRoot) {
tempCollection = new List<whatever>(synchronizedCollection);
}
//use temp collection to access cached data
The other option would be to create a ThreadSafeList class that uses locking internally to make the list object itself thread-safe.
I agree with Tom, you will have to do synchronization to make this work. One thing that would improve the performance is not clearing the list until you actually receive the new values from the database:
// Modify your function to return a new list instead of filling the existing one.
public static List<PostCode> GetAllPostCodes()
{
List<PostCode> temp = new List<PostCode>();
...
return temp;
}
And when you rebuild the data:
List<PostCode> temp = GetAllPostCodes();
AllPostCodes = temp;
This makes sure that your cached list is still valid while GetAllPostCodes() is executing. It also has the advantage that you can use a read-only list which makes the synchronization a bit easier.
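A self-contained sketch of this swap, assuming a trimmed-down `PostCode` class and a hypothetical `PostCodeCache` holder. The new list is built off to the side and published with a single reference assignment, so readers never observe a cleared or half-filled list.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Minimal stand-in for the PostCode class from the question.
public class PostCode
{
    public int ID { get; set; }
    public string Outcode { get; set; }
}

public static class PostCodeCache
{
    // Readers always see either the old complete list or the new complete list.
    private static volatile ReadOnlyCollection<PostCode> _all =
        new ReadOnlyCollection<PostCode>(new List<PostCode>());

    public static ReadOnlyCollection<PostCode> All
    {
        get { return _all; }
    }

    // Builds the replacement off to the side, then publishes it with one reference assignment.
    public static void Rebuild(IEnumerable<PostCode> freshData)
    {
        var temp = new List<PostCode>(freshData);
        _all = new ReadOnlyCollection<PostCode>(temp);
    }
}
```

A request that grabbed `PostCodeCache.All` before a rebuild keeps iterating over its old snapshot, which stays intact until nothing references it any more.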
In your case you need to refresh the data every one hour.
1) Use the cache with an absolute expiration of 1 hour, so it expires every hour. Check the cache before using it by doing a null check. If it's null, get the data from the DB and populate the cache.
2) The disadvantage of the above approach is that the data can be stale by up to 1 hour. So if you want the most up-to-date data at all times, use SqlCacheDependency (push): whenever the data behind the select command you are using changes, the cache is refreshed from the database with the updated data.

ASP.NET MVC - Sharing Session State Between Controllers

I am still mostly unfamiliar with Inversion of Control (although I am learning about it now) so if that is the solution to my question, just let me know and I'll get back to learning about it.
I have a pair of controllers which need to access a Session variable. Naturally, nothing too special has to happen because of how Session works in the first place, but this got me wondering what the cleanest way to share related objects between two separate controllers is. In my specific scenario I have an UploadController and a ProductController which work in conjunction with one another to upload image files. As files are uploaded by the UploadController, data about the upload is stored in the Session. After this happens I need to access that Session data in the ProductController. If I create a get/set property for the Session variable containing my upload information in both controllers, I'll be able to access that data, but at the same time I'll be violating DRY, not to mention creating an, at best, confusing design where an object is shared and modified by two completely disconnected objects.
What do you suggest?
Exact Context:
A file upload View posts a file to UploadController.ImageWithpreview(), which then reads in the posted file and copies it to a temporary directory. After saving the file, another class produces a thumbnail of the uploaded image. The path to both the original file and the generated thumbnail are then returned with a JsonResult to a javascript callback which updates some dynamic content in a form on the page which can be "Saved" or "Cancelled". Whether the uploaded image is saved or it is skipped, I need to either move or delete both it and the generated thumbnail from the temporary directory. To facilitate this, UploadController keeps track of all of the upload files and their thumbnails in a Session-maintained Queue object.
Back in the View: after the form is populated with a generated thumbnail of the image that was uploaded, the form posts back to the ProductsController where the selected file is identified (currently I store the filename in a Hidden field, which I realize is a horrible vulnerability), and then copied out of the temp directory to a permanent location. Ideally, I would like to simply access the Queue I have stored in the Session so that the form does not need to contain the image location as it does now. This is how I have envisioned my solution, but I'll eagerly listen to any comments or criticisms.
A couple of solutions come to mind. You could use a "SessionState" class that maps into the request and gets/sets the info as such (I'm doing this from memory so this is unlikely to compile and is meant to convey the point):
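internal class SessionState
{
// Public so that controllers can use it; Session stores object, so the getter casts
public string ImageName
{
get { return (string)HttpContext.Current.Session["ImageName"]; }
set { HttpContext.Current.Session["ImageName"] = value; }
}
}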
And then from the controller, do something like:
var sessionState = new SessionState();
sessionState.ImageName = "xyz";
/* Or */
var imageName = sessionState.ImageName;
Alternatively, you could create a controller extension method:
public static class SessionControllerExtensions
{
public static string GetImageName(this IController controller)
{
// Session stores object, so cast back to string
return (string)HttpContext.Current.Session["ImageName"];
}
// The setter returns nothing, so it is declared void
public static void SetImageName(this IController controller, string imageName)
{
HttpContext.Current.Session["ImageName"] = imageName;
}
}
Then from the controller:
this.SetImageName("xyz");
/* or */
var imageName = this.GetImageName();
This is certainly DRY. That said, I don't particularly like either of these solutions, as I prefer to store as little data as possible, if any, in session. But if your intent is to hold onto all of this information without having to load/discern it from some other source, this is the quickest (dirtiest) way I can think of to do it. I'm quite certain there's a much more elegant solution, but I don't have all of the information about what it is you're trying to do and what the problem domain is.
Keep in mind that when storing information in the session, you will have to dehydrate/rehydrate the objects via serialization and you may not be getting the performance you think you are from doing it this way.
Hope this helps.
EDIT: In response to additional information
Not sure where you're looking to deploy this, but processing images in "real time" is a surefire way to be hit with a DoS attack. My suggestion to you is as follows, assuming that this is public facing and anyone can upload an image:
1) Allow the user to upload an image. This image goes into the processing queue for background processing by the application or some service. Additionally, the name of the image goes into the user's personal processing queue, likely a table in the database. Information about background processing in a web app can be found under "Schedule a job in hosted web server".
2) Process these images and, while processing, display a "processing graphic". You can have an ajax request on the product page that checks for images being processed and tries to reload them every X seconds.
3) While an image is being "processed", the user can opt out of processing assuming they're the one that uploaded the image. This is available either on the product page(s) that display the image or on a separate "user queue" view that will allow them to remove the image from consideration.
So, you end up with some more domain objects and those objects are managed by the queue. I'm a strong advocate of convention over configuration so the final destination of the product image(s) should be predefined. Something like:
images/products/{id}.jpg or, if a collection, images/products/{id}/{sequence}.jpg.
You then don't need to know the destination in the form. It's the same for all images.
The queue then needs to know where the temp image was uploaded and what the product id was. The queue worker pops items from the queue, processes them, and stores them accordingly.
I know this sounds a little more "structured" than what you originally intended, but I think it's a little cleaner.
Is there complete equivalence between the UploadController and ProductController?
As files are uploaded by the UploadController, data about the upload is stored in the Session. After this happens I need to access that Session data in the ProductController.
As I read it, the UploadController needs read and write access to the upload data, while the ProductController needs only read access.
If that's true, then you can make it clear by using an immutable wrapper around the upload information and having the UploadController put that into the session.
The Session itself is by definition a public shared noticeboard: it decouples explicit relationships at the cost of allowing anyone to get and put. You could allow the ProductController to know about the UploadController and hence remove the need for passing the upload information via the session. My instinct is that the upload info is interesting to the public, so using Session is reasonable.
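One way such an immutable wrapper might look; the type names `UploadInfo` and `UploadSnapshot` and their members are hypothetical, chosen only to illustrate the read-only view.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical immutable record of what UploadController knows about one upload.
public sealed class UploadInfo
{
    public string OriginalPath { get; private set; }
    public string ThumbnailPath { get; private set; }

    public UploadInfo(string originalPath, string thumbnailPath)
    {
        OriginalPath = originalPath;
        ThumbnailPath = thumbnailPath;
    }
}

// Read-only view over the uploads; ProductController would only ever see this type.
public sealed class UploadSnapshot
{
    private readonly ReadOnlyCollection<UploadInfo> _uploads;

    public UploadSnapshot(IEnumerable<UploadInfo> uploads)
    {
        // Copy the input so later changes to the source queue are not visible here.
        _uploads = new ReadOnlyCollection<UploadInfo>(uploads.ToList());
    }

    public ReadOnlyCollection<UploadInfo> Uploads
    {
        get { return _uploads; }
    }
}
```

The UploadController would keep its own mutable Queue and store a fresh `UploadSnapshot` in the Session after each change, so the type alone tells the reader that the ProductController cannot modify the upload data.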
I don't see any DRY violation here, we are explicitly trying to separate responsibilities.

Resources