I've coded a random access file to be created, and am wondering;
Is the random access file stored on the RAM, or the hard drive?
If it's the hard drive, why is it called a "Random Access" file?
Thanks
Random access has nothing to do with storage location (which for a file is the disk). It has to do with how you can access (read/write) that file content.
Random access means you can access any location in the file between the start and end, in any order, at any time. It's the opposite of sequential access, which means you can only access the file from start to end.
In other words, with random access you can read the last byte (or block of bytes) from the file, the first byte (or block), and then a byte or block from the middle somewhere. With sequential access, you have to read the first byte/block, then the second byte/block, then the third, and so on.
Random access is opposite to sequential access. Random file read / write is somewhat similar to accessing RAM. A file is a file, can be read into RAM, but stored in hard drive.
Related
We have a scenario in our project where there are files coming from the client with the same file name, sometimes with the same file size too. Currently when we upload a file, we are checking the new file name with the existing files in the database and if there is a reference we are marking it as duplicate and would not allow to upload at all. But now we have a requirement to check the content of the file when they have the same file name. So we need to find out a solution to differentiate such files based on contents. So, how do we efficiently do that - meaning how to do it avoiding even a minute chance of error?
Rails 3.1, Ruby 1.9.3
Below is one option I have read from a web reference.
require 'digest'
digest_value = Digest::MD5.base64digest(File.read( file_path ))
And the above line will read all the contents of the incoming file and based on which it will generate a unique hash, right? Then we can use it for unique file identification. But we have more than 500 users simultaneously working in 24/7 mode and most of them will be doing this operation. So, if the incoming file has a huge size (> 25MB) then the Digest will take more time to read the whole contents and there by suffer performance issues. So, what could be a better solution considering all these facts?
I have read the question and the comments and I have to say you have the problem stated not 100% correct. It seems that what you need is to identify identical content. Period. Despite whether name and size are equal or not. Correct me if I am wrong, but you likely don’t want to allow users to update 100 duplicates of the same file just because the user has 100 copies of it in local, having different names.
So far, so good. I would use the following approach. The file name is not involved anyhow. The file size might help in terms of fast-check the uniqueness (sizes differ hence files are definitely different.)
Then one might allow the upload with an instant “OK” response. Afterwards, the server in the background should run Digest::MD5, comparing the file against all already uploaded. If there is a duplicate, the new copy of the file should be removed, but the name should stay on the filesystem, being a symbolic link to the original.
That way you’ll not frustrate users, giving them an ability to have as many copies of the file as they want under different names, while preserving the HDD volume at the lowest possible level.
I'm building an app that makes extensive use of CoreData and a lot of my models have UIImage and NSData properties (for images and videos). Since it's not a great idea to store that data directly into CoreData, I built a file manager class that writes the files into different buckets in the documents directory depends on the context in which was created and media type.
My question now is how do I manage the documents directory? Is there a way to detect how much space the app has used up out of its total allocated space? Additionally, what is the best way to go about cleaning those directories; do I check every time a file is written or only on app launch, ect ect.
Is there a way to detect how much space the app has used up out of its total allocated space?
Apps don't have a limit on total allocated space, they're limited by the amount of space on the device. You can find out how much space you're using for these files by using NSFileManager to scan the directories. There are several methods that do this in different ways-- check out enumeratorAtPath:, for example. For each file, use a method like attributesOfItemAtPath:error: to get the file size.
Better would be to track the file sizes as you create and delete files. Keep a running total, stored in user defaults. When you create a new file, increase it by the amount of new data. When you remove a file, decrease the running total.
Additionally, what is the best way to go about cleaning those directories; do I check every time a file is written or only on app launch, ect ect.
If these files are local data that's inherently part of the associated Core Data object, the sensible approach is to delete a file when its Core Data object is deleted. The managed object needs the data file, so don't delete the file if you still use the object. That means there must be some way to link the two, but I'm assuming that's already true since you say that these files are used by managed objects somehow.
If the files are something like cached data that's easily re-created or re-downloaded, you should put them in the location returned by NSTemporaryDirectory(). Then iOS can delete them when it thinks the space is needed. You can also clear out old files whenever it seems appropriate, by scanning for older files or ones that haven't been used in a while (the details depend on exactly how you use the files).
what is the difference between file and random access file?
A random access file is a file where you can "jump" to anywhere within it without having to read sequentially until the position you are interested in.
For example, say you have a 1MB file, and you are interested in 5 bytes that start after 100k of data. A random access file will allow you to "jump" to the 100k-th position in one operation. A non-random access file will require you to read 100k bytes first, and only then read the data you're interested in.
Hope that helps.
Clarification: this description is language-agnostic and does not relate to any specific file wrapper in any specific language/framework.
Almost nothing these days. There used to be a time in certain operating systems where there were different types of files - some of which could be accessed randomly (at any point in the file) and others which could only be accessed sequentially. This made more sense when you were using a sequential medium such as tape. Any file system worth its salt these days only supports random access.
I have an MFC app which is wizard based. The App asks a user a variable number of questions which are then written to an INI file which is later encrypted when the user clicks Finish.
All the INI file parsers I have seen so far seen read or write to a physical file on Disk. I don't want to do this as the INI file contains confidential information. Instead I would like the INI file to be only based in memory and never written to disk in an un-encrypted form.
As the app allows users to go back and change answers, It occurred to me that I could use an in memory Database for this purpose but again I do not want anything written to Disk and don't want to ship a DB with my app if it can be avoided.
I have to use an INI file as it the file when un-encrypted will be processed by a 3rd party.
Any suggestions welcomed.
Thanks..
I have an IniFile C++ class which allows you to work with Ini files in memory:
http://www.lemonteam.com/downloads/inifile.h
It's a short, well documented single .h file. Sample usage:
IniFile if ( "myinifile.ini" );
if.SetString( "mykey", "myvalue" );
// Nothing gets actually written to disk until you call Flush(), Close() or the object is deleted
if.Flush();
if.Close();
You should be able to modify the Flush() method so that it applies some kind of encryption to the saved data.
Sounds like a good application for a memory-mapped file, since you can control when your in-memory view gets flushed back to the file on disk.
Why would you need to have it in an ini file format if it is never stored to disk?
Why not just keep it in memory as a data structure and use your normal ini file methods to write it to disk when you want to.
If you don't want to save into file, what is the point of using INI file then?
INI API is bascially a property bag or key value pair based on disk file.
If you don't want to use file, I suggest you use your own hash or dictionary data structure to store the key value parirs
We take text/csv like data over long periods (~days) from costly experiments and so file corruption is to be avoided at all costs.
Recently, a file was copied from the Explorer in XP whilst the experiment was in progress and the data was partially lost, presumably due to multiple access conflict.
What are some good techniques to avoid such loss? - We are using Delphi on Windows XP systems.
Some ideas we came up with are listed below - we'd welcome comments as well as your own input.
Use a database as a secondary data storage mechanism and take advantage of the atomic transaction mechanisms
How about splitting the large file into separate files, one for each day.
If these machines are on a network: send a HTTP post with the logging data to a webserver.
(sending UDP packets would be even simpler).
Make sure you only copy old data. If you have a timestamp on the filename with a 1 hour resolution, you can safely copy the data older than 1 hour.
If a write fails, cache the result for a later write - so if a file is opened externally the data is still stored internally, or could even be stored to a disk
I think what you're looking for is the Win32 CreateFile API, with these flags:
FILE_FLAG_WRITE_THROUGH : Write operations will not go through any intermediate cache, they will go directly to disk.
FILE_FLAG_NO_BUFFERING : The file or device is being opened with no system caching for data reads and writes. This flag does not affect hard disk caching or memory mapped files.
There are strict requirements for successfully working with files opened with CreateFile using the FILE_FLAG_NO_BUFFERING flag, for details see File Buffering.
Each experiment much use a 'work' file and a 'done' file. Work file is opened exclusively and done file copied to a place on the network. A application on the receiving machine would feed that files into a database. If explorer try to move or copy the work file, it will receive a 'Access denied' error.
'Work' file would become 'done' after a certain period (say, 6/12/24 hours or what ever period). So it create another work file (the name must contain the timestamp) and send the 'done' through the network ( or a human can do that, what is you are doing actually if I understand your text correctly).
Copying a file while in use is asking for it being corrupted.
Write data to a buffer file in an obscure directory and copy the data to the 'public' data file periodically (every 10 points for instance), thereby reducing writes and also providing a backup
Write data points discretely, i.e. open and close the filehandle for every data point write - this reduces the amount of time the file is being accessed provided the time between data points is low