I'm facing a problem related to file system design, and I would really appreciate a professional opinion on file systems and file uploads in web applications. We need to decide between two (or more) options for how to design the file system structure, and to settle on criteria for choosing the best structure.
The system is an academic system that allows users to upload files as attachments within posts. Our focus is to design the hierarchy of the file system needed to store these files. Note that the system will have to accommodate a large number of files within this structure.
As agreed, two solutions were found. Both store the URL and file attributes in the DB, and store the files themselves in one of these hierarchies:
User -> xx_user -> posts -> xx_post -> files attached to xx_post
Class-> xx_class -> tasks -> xx_task -> files attached to xx_task
or
Tasks -> xx_task -> files attached to xx_task
Posts -> xx_post -> files attached to xx_post
Note: xx denotes an ID.
In case one: a small number of files in each folder, but a large number of folders.
In case two: a large number of files in each folder, but a flatter hierarchy with fewer folders.
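For concreteness, here is how attachment paths would be built under each option (a rough sketch in Ruby; the helper names are purely illustrative, not part of the design):

    # Option 1: deep hierarchy, few files per directory
    def post_file_path_v1(user_id, post_id, filename)
      File.join("User", user_id.to_s, "posts", post_id.to_s, filename)
    end

    # Option 2: flat hierarchy, many files per directory
    def post_file_path_v2(post_id, filename)
      File.join("Posts", post_id.to_s, filename)
    end

    post_file_path_v1(42, 7, "report.pdf")  # => "User/42/posts/7/report.pdf"
    post_file_path_v2(7, "report.pdf")      # => "Posts/7/report.pdf"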
So, which is the best approach for implementing this in a scalable web application? Note that this is only part of the file system, introduced as an example. If you would choose one, please explain why I should go for it in terms of lookup speed and performance.
Thanks in advance for the help.
In the executable I am reverse-engineering, there are several references to a path on a D:\ drive. However, I do not have a D:\ drive connected. Is it possible that it creates a temporary storage site in the executable?
For example, there is a string:
D:\BuildAgent\...\bin\...\fileIWantToSee.jpg
IDA even believes that the symbol information is on the D:\ drive, and attempts to look for it, to no avail. There are many file references within these strings, and many of them end with:
Line: **LINENUMBER**
How would I go about finding where this storage is located? Thank you!
EDIT: Could it be in a specific section?
Is it possible that it creates a temporary storage site in the executable?
This is possible. There exists at least one product (http://www.boxedapp.com/, kind of a competitor of ours :) that lets the application create such a container: calls to the file APIs are intercepted by code that the product adds to the application, and this added code handles specific paths differently (emulating file operations), letting all other calls go through to the Windows API.
I'm developing an MVC5 web app, hosted on Azure, that lets you manage your movies (it's just for myself at the moment). I'm trying to find a way to scan a local folder on the user's PC for a list of file names. I do realise the security/permissions issues I might run into. I do not need the files uploaded, only the full file names.
It would work by letting the user select the folder where they store their movies; the app would then take in all the file names, including those in any subdirectories.
I tried a multiple-file upload form but quickly ran into issues with the max request limit, which I tried tweaking, but that proved fruitless in the end. I can settle for the user selecting multiple files, but would rather do it the directory way.
I know this might prove impossible in the end but any help would be greatly appreciated.
Does Paperclip scan files for errors, malicious software, or viruses before they are uploaded to the database? If not, what are the viable solutions?
Also, is it better to first create a separate folder for each user before they upload files, and store the files in their respective folders? What are the merits and demerits of that? Is it possible to specify this with Paperclip?
Thanks
Re viruses etc, this might be useful - Rails / Heroku - How to anti-virus scan uploaded file?
Re storing each user's files in a separate folder: the conventional way would be to store every FILE in a separate folder, and then link the files to the user via the database (e.g. a user_id field on the file records). As far as merits and demerits go, besides it not being conventional, one thing to bear in mind is that if a user's files all live in a single folder and they upload two files with the same name, the second will overwrite the first (unless, of course, you put them in separate folders within the user's folder). This could be a good thing or a bad thing depending on your requirements.
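For illustration, a minimal Paperclip setup along those lines might look like this (a sketch assuming Paperclip 4.x; the model, attachment, and path layout are assumptions, not a prescribed design):

    class Upload < ActiveRecord::Base
      belongs_to :user  # the user_id field links each file record to its owner

      # :id_partition expands the record id to e.g. 000/000/042, so every file
      # effectively gets its own folder and same-named uploads cannot collide.
      has_attached_file :document,
        path: ":rails_root/storage/uploads/:id_partition/:filename",
        url:  "/uploads/:id/:filename"

      # Paperclip 4+ requires an explicit content-type validation
      validates_attachment_content_type :document,
        content_type: ["image/jpeg", "image/png", "application/pdf"]
    end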
BTW - a slightly pedantic note: files aren't uploaded to the database (at least not normally) - they are uploaded to a filesystem, and a corresponding record is created in the database. The files don't go into the database (as I say, usually: it is possible to store files as blobs in the DB, but it's not good practice and not usual).
I can use:
#+INCLUDE:
to include an org file in another org file, which allows me to assemble, say, a website from various org files. I'm exporting from the C-c C-e exporter in org-mode 7.5.
I could maintain a quite complex publication this way. This modular approach is quite common in, e.g. LaTeX and Texinfo publications.
However, links to images no longer work from the #+INCLUDEd org files. What seems to be happening is that the path to the images is taken as being relative to the org file that I am exporting from, rather than to the actual org file that references the image.
The only ways I can see to resolve this are to:
use a flat file structure; or
make the image paths relative to the including file (which I might not know in advance) rather than to the included file itself.
Neither of these is really sustainable.
How do I tell org to resolve image paths relative to the org file that actually contains them, rather than the parent org file?
From what I know of the exporter, #+INCLUDEd files are inserted into the document before export. The content is therefore part of the parent document before the exporter starts following paths to reach any linked files (images).
After a bit of testing, it seems you will likely need to use absolute file paths. Since you move between Windows and Linux, your best bet would be a consistent scheme on both systems, starting from your home directory.
That way you can write the Org link as [[~/path/to/image.jpg]], which will work on both systems (assuming you have set %HOME% on Windows).
Option 1 is potentially an alternative (although I agree it wouldn't be ideal at all), whereas the second option would have obvious pitfalls if you INCLUDE the file in more than one future document.
I would like to create a simple file repository in Ruby on Rails. Users have accounts, and after logging in they can upload files or download previously uploaded files.
The issue here is security: files should be safe and available to no one but their owners.
Where (in which folder) should I store the files to make them as safe as possible?
Does it make sense to rename the uploaded files, store the original names in a database, and restore them when needed? This might help avoid name conflicts, though I'm not sure whether it's a good idea.
Should the files be stored all in one folder, or should they be somewhat divided?
rename the files, for one reason because you have no way to know whether today's file "test" is supposed to replace last week's "test" (perhaps the user had them in different directories)
give each user their own directory; this prevents performance problems and makes it easy to migrate, archive, or delete a single user
put metadata in the database and files in the file system
look out for code injection via file names (see the sketch below)
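A minimal sketch tying these points together (Rails-flavoured Ruby; the model and column names are assumptions):

    require "securerandom"
    require "fileutils"

    # Store each upload under a random name inside the user's own directory;
    # the original (untrusted) name is kept only as metadata in the database.
    def store_upload(user, uploaded_io, original_name)
      user_dir = Rails.root.join("storage", user.id.to_s)
      FileUtils.mkdir_p(user_dir)

      stored_name = SecureRandom.hex(16)  # never reuse the client-supplied name on disk
      File.open(user_dir.join(stored_name), "wb") { |f| f.write(uploaded_io.read) }

      # Metadata in the database, bytes in the file system.
      user.uploads.create!(stored_name: stored_name, original_name: original_name)
    end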
This is an interesting question. Depending on the level of security you want to apply, I would recommend the following:
Choose a folder that is only accessible by your app server (if you choose to store in the FS).
I would always recommend renaming the files to a randomly generated hash (or an incrementally generated name like those used in URL shorteners; see the open-source implementation of rubyurl). However, I wouldn't store the files themselves in a database, because filesystems are built for handling files, so let them do the job. You should store the metadata in the database so that you can set the right file name when the user downloads the file.
You should partition the files among multiple folders, which gives you several advantages. First, filesystems are not built to handle millions of files in a single folder, and operations that fetch all files from one folder take significantly more time. If you obfuscate the original file name, you can create one directory per leading character of the generated name and get a fairly even distribution of files per directory.
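As a sketch of that idea (the two-level layout and names are assumptions, not the only way to do it):

    require "securerandom"

    # Pick the subdirectories from the leading characters of the generated
    # name: two levels of two hex characters give 256 * 256 buckets with a
    # roughly even spread of files.
    def partitioned_path(root)
      name = SecureRandom.hex(20)  # e.g. "9f2c4a..."
      File.join(root, name[0, 2], name[2, 2], name)
    end

    partitioned_path("/var/files")  # => e.g. "/var/files/9f/2c/9f2c4a..."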
One last thing to consider is possible file-name collisions. A user should not be able to guess another user's filename, so you might need some additional checks here.
Depending on the level of security you want to achieve you can apply more and more patterns.
Just don't save the files in the public folder; create a controller that sends the files instead.
How you organise things from that point on is your choice. You could make a subfolder per user. There is no need to rename files from a security point of view, but do try to clean up the filenames: spaces and non-ASCII characters make things harder.
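A small sketch of that kind of cleanup (the regex and replacement character are illustrative):

    # Keep only a conservative character set; everything else becomes "_".
    def sanitize_filename(name)
      base = File.basename(name)  # drop any path components first
      base.gsub(/[^0-9A-Za-z.\-_]/, "_")
    end

    sanitize_filename("my file (1).pdf")  # => "my_file__1_.pdf"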
For simple cases (where you don't want to distribute the file store):
Store the files in the tmp directory - DON'T store them in public. Then expose these files only via a route and controller where you do the authentication/authorisation checks.
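A minimal sketch of such a controller (the auth helper, model, and column names are assumptions, e.g. Devise-style authentication):

    class FilesController < ApplicationController
      before_action :authenticate_user!  # whatever auth your app already uses

      def show
        # Scoping the lookup to current_user enforces ownership.
        upload = current_user.uploads.find(params[:id])
        send_file upload.storage_path,
                  filename: upload.original_name,
                  disposition: "attachment"
      end
    end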
I don't see any reason to rename the files; you can separate them out into subdirectories based on the user ID. But if you want to allow uploading files with the same name, then you may need to generate a unique hash or similar for each file's name.
See above. You can partition them any way you see fit. But I would definitely recommend partitioning them and not lumping them in one directory.