Bundle and load objects to be randomly read in an iOS app

I want to bundle JSON objects, each keyed by a unique int, so that my app can randomly deserialize X of them at any given point. I see two ways to do this without touching Core Data (please tell me if there are more):
1) Bundle a single JSON file which stores all the objects, deserialize the file during startup into some collection, and randomly deserialize X objects
2) Bundle each object in a separate file named by its unique int, fetch random X files and deserialize the object
Which approach is better? Is there a limit on how many files an iOS app can store? If the choice depends on the number of objects the app has to store, please say at which point it becomes better to use one approach over the other.

I would tend to favour the multiple files approach, although it does depend a little on how large the files (and total amount of data) are. Either way the data is going to be included in your app bundle - so the app distribution size will be the same.
Using individual files will use less RAM while your app is running, at the expense of some processor time to read in the files. If you need to change the files you are using frequently (say, once per second or more often), then a single file held in memory may be better.
There is a limit on the number of files you can store in the file system, but unless you are talking about millions it probably won't be an issue.
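A minimal Swift sketch of the multiple-files approach, assuming the objects are bundled as 0.json, 1.json, ... inside an "Objects" folder reference, and decoded into a hypothetical Item type:

import Foundation

struct Item: Decodable {
    let id: Int
    let text: String
}

// Picks `count` random keys out of 0..<totalKeys and decodes only those files.
func randomItems(count: Int, totalKeys: Int) -> [Item] {
    let decoder = JSONDecoder()
    return (0..<totalKeys).shuffled().prefix(count).compactMap { key in
        guard let url = Bundle.main.url(forResource: "\(key)",
                                        withExtension: "json",
                                        subdirectory: "Objects"),
              let data = try? Data(contentsOf: url)
        else { return nil }
        return try? decoder.decode(Item.self, from: data)
    }
}

Only the X chosen files are ever read, which is what keeps the memory footprint low compared to decoding the whole collection up front.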

Related

Uniquely identify files with same name and size but with different contents

We have a scenario in our project where files come from the client with the same file name, sometimes with the same file size too. Currently, when we upload a file, we check the new file name against the existing files in the database, and if there is a match we mark it as a duplicate and do not allow the upload at all. But now we have a requirement to check the content of the files when they have the same file name, so we need a solution to differentiate such files based on their contents. How do we do that efficiently, while avoiding even a minute chance of error?
Rails 3.1, Ruby 1.9.3
Below is one option I have read from a web reference.
require 'digest'
# Note: File.read loads the entire file into memory before hashing.
digest_value = Digest::MD5.base64digest(File.read(file_path))
The above line will read the entire contents of the incoming file and generate a unique hash based on them, right? Then we can use that hash for unique file identification. But we have more than 500 users working simultaneously in 24/7 mode, and most of them will be doing this operation. So if the incoming file is huge (> 25 MB), the Digest will take more time to read the whole contents and we will thereby suffer performance issues. So, what would be a better solution considering all these facts?
I have read the question and the comments, and I have to say the problem as stated is not 100% correct. It seems that what you need is to identify identical content. Period. Regardless of whether the names and sizes are equal or not. Correct me if I am wrong, but you likely don’t want to allow users to upload 100 duplicates of the same file just because a user has 100 copies of it locally under different names.
So far, so good. I would use the following approach. The file name is not involved at all. The file size can serve as a fast uniqueness pre-check (if the sizes differ, the files are definitely different.)
Then one might allow the upload with an instant “OK” response. Afterwards, the server should run Digest::MD5 in the background, comparing the file against all those already uploaded. If there is a duplicate, the new copy of the file should be removed, but the name should stay on the filesystem as a symbolic link to the original.
That way you won’t frustrate users: they keep the ability to have as many copies of the file as they want under different names, while disk usage stays as low as possible.

Solution For Monitoring and Maintaining App's Size on Disk

I'm building an app that makes extensive use of Core Data, and a lot of my models have UIImage and NSData properties (for images and videos). Since it's not a great idea to store that data directly in Core Data, I built a file manager class that writes the files into different buckets in the Documents directory, depending on the context in which each was created and the media type.
My question now is: how do I manage the Documents directory? Is there a way to detect how much space the app has used up out of its total allocated space? Additionally, what is the best way to go about cleaning those directories; do I check every time a file is written, or only on app launch, etc.?
Is there a way to detect how much space the app has used up out of its total allocated space?
Apps don't have a limit on total allocated space; they're limited by the amount of space on the device. You can find out how much space you're using for these files by using NSFileManager to scan the directories. There are several methods that do this in different ways; check out enumeratorAtPath:, for example. For each file, use a method like attributesOfItemAtPath:error: to get the file size.
Better would be to track the file sizes as you create and delete files. Keep a running total, stored in user defaults. When you create a new file, increase it by the amount of new data. When you remove a file, decrease the running total.
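A hedged Swift sketch of both ideas; the function names and the UserDefaults key are illustrative:

import Foundation

// Sums the sizes of all regular files under the given directory.
func directorySize(at url: URL) -> Int {
    let keys: [URLResourceKey] = [.isRegularFileKey, .fileSizeKey]
    guard let enumerator = FileManager.default.enumerator(
        at: url, includingPropertiesForKeys: keys) else { return 0 }
    var total = 0
    for case let fileURL as URL in enumerator {
        let values = try? fileURL.resourceValues(forKeys: Set(keys))
        if values?.isRegularFile == true {
            total += values?.fileSize ?? 0
        }
    }
    return total
}

// Running total kept in user defaults; call with a positive delta when
// writing a file and a negative one when removing it.
func adjustRunningTotal(by delta: Int) {
    let key = "MediaBytesUsed" // illustrative key
    let current = UserDefaults.standard.integer(forKey: key)
    UserDefaults.standard.set(current + delta, forKey: key)
}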
Additionally, what is the best way to go about cleaning those directories; do I check every time a file is written, or only on app launch, etc.?
If these files are local data that's inherently part of the associated Core Data object, the sensible approach is to delete a file when its Core Data object is deleted. The managed object needs the data file, so don't delete the file if you still use the object. That means there must be some way to link the two, but I'm assuming that's already true since you say that these files are used by managed objects somehow.
If the files are something like cached data that's easily re-created or re-downloaded, you should put them in the location returned by NSTemporaryDirectory(). Then iOS can delete them when it thinks the space is needed. You can also clear out old files whenever it seems appropriate, by scanning for older files or ones that haven't been used in a while (the details depend on exactly how you use the files).
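A sketch of that kind of cleanup pass, assuming the files live in NSTemporaryDirectory() and that seven days is an acceptable (arbitrary) cutoff:

import Foundation

// Deletes files in the temporary directory not modified in the last 7 days.
func cleanTemporaryFiles() {
    let tmpURL = URL(fileURLWithPath: NSTemporaryDirectory(), isDirectory: true)
    let cutoff = Date().addingTimeInterval(-7 * 24 * 60 * 60)
    let keys: [URLResourceKey] = [.contentModificationDateKey]
    guard let files = try? FileManager.default.contentsOfDirectory(
        at: tmpURL, includingPropertiesForKeys: keys) else { return }
    for fileURL in files {
        let values = try? fileURL.resourceValues(forKeys: Set(keys))
        if let modified = values?.contentModificationDate, modified < cutoff {
            try? FileManager.default.removeItem(at: fileURL)
        }
    }
}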

Loading images from AppBundle vs. CoreData

I'm making a catalog where the cells in my collection view will be either an image with a label or a pdf. There will be many collections and they themselves will be static. I want the user to be able to save the cells he likes and view them in his own custom view.
1) I could store the image as data in Core Data.
2) I could just include the image in my App Bundle and load the image from there every time my app starts.
I've got it into my head that reading data from a Core Data store would give me more options when building my app, as well as offer some boost in performance, as opposed to reading it from the app bundle. Is that true? Keeping in mind, of course, that most of the data is static.
It seems inefficient to have both the serialized images in my app bundle and the pure data in the store as well.
I think I'd rather have it all in the store, but the images have to be loaded from the bundle at some point in code, right?
I'd love to know how other developers do it.
Core Data now has an "Allows External Storage" option for binary data, which basically means that if your file is bigger than 1 MB it will be stored automatically outside of your database, and you don't have to do anything differently. In my opinion, that's the way to get the best of both worlds: increased performance, automation, and fast queries (queries are slower than usual when you allow external storage, but still faster than managing the files yourself).
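The option is normally ticked in the Xcode model editor, but for reference, the same flag can be set when building a model in code; a minimal sketch (the attribute name is illustrative):

import CoreData

// "Allows External Storage" corresponds to this flag on the attribute.
let imageAttribute = NSAttributeDescription()
imageAttribute.name = "imageData"
imageAttribute.attributeType = .binaryDataAttributeType
imageAttribute.allowsExternalBinaryDataStorage = true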

iOS: using iCloud document storage for a small XML based database

Just wanted to know if this is a good idea:
I want to use iCloud to sync data between different devices in my iOS app. It's just a list of small objects without connections. But storing this list in the key/value store won't work, because its space is restricted to about 1 MB and the list might get bigger (not by much, but it could...). Core Data seems like overkill to me, and there is also the problem of possible duplicates.
So I wonder if it makes sense to subclass UIDocument to handle the XML file. Every object has an ID, so merging different versions of the file should be no problem.
The choice of XML depends on the format of the data store (monolithic or transactional) and the volume of updates. If the entire file (1 MB+) is constantly being rewritten by your app (and hence re-synced to iCloud), or if a small change causes the entire store to be synced to iCloud, then I would use Core Data. The advantage of Core Data is that only the transaction logs you require (or that have changed) are synced.
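If you do go the UIDocument route, the subclass itself is small; a sketch that simply round-trips raw XML data, assuming the actual parsing and ID-based merging happen elsewhere in the app:

import UIKit

class ListDocument: UIDocument {
    var xmlData = Data()

    // Called when the document is saved; hands UIKit the bytes to write.
    override func contents(forType typeName: String) throws -> Any {
        return xmlData
    }

    // Called when the document is opened or a new iCloud version arrives.
    override func load(fromContents contents: Any,
                       ofType typeName: String?) throws {
        xmlData = contents as? Data ?? Data()
    }
}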

How efficient is the iOS file system at dealing with a large number of files in a single folder

If I have a large number of files (n x 100K individual files), what would be the most efficient way to store them in the iOS file system (from the point of view of speed of access to a file by path)? Should I dump them all in a single folder or break them into a multilevel folder hierarchy?
Basically this breaks down into three questions:
1) Does file access time depend on the number of "sibling" files? (I think the answer is yes. If I am correct, file names are organized in a B-tree, so lookup should be O(log n).)
2) How expensive is traversing from one folder to another along the path? (Is it something like m * O(log n_m), where m is the number of components in the path and n_m is the number of "siblings" at each path component?)
3) What gets cached at the file system level that could make the above assumptions incorrect?
It would be great if someone with direct experience of this kind of problem could share some real-life results.
Your comments will be highly appreciated.
This seems like it might provide relevant, hard data:
File System vs Core Data: the image cache test
http://biasedbit.com/blog/filesystem-vs-coredata-image-cache
Conclusion:
File system cache is, as expected, faster. Core Data falls shortly behind when storing (marginally slower) but load times are way higher when performing single random accesses.
For such a simple case Core Data functionality really doesn't pay up, so stick to the file system version.
I think you should store everything in one folder and create a hash table whose keys are the file names and whose values are the source paths. With a hash table, lookup complexity is constant, O(1), which will speed up your process as well.
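What this answer seems to be driving at is an in-memory index built once, so later lookups never touch the directory; a sketch of that idea in Swift:

import Foundation

// Builds a one-time [file name: URL] index for a flat folder, giving O(1)
// average-case lookups afterwards instead of repeated path resolution.
func buildIndex(for directory: URL) -> [String: URL] {
    var index: [String: URL] = [:]
    let files = (try? FileManager.default.contentsOfDirectory(
        at: directory, includingPropertiesForKeys: nil)) ?? []
    for url in files {
        index[url.lastPathComponent] = url
    }
    return index
}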
The file system is not an optimal database. With that many thousands of files, you should consider using Core Data or another database instead to store the name and contents of each file.
