The Erlang DETS docs describe a ram_file option:
open_file(name, args)
{ram_file, boolean()} - Whether the table is to be kept in RAM. Keeping the table in RAM can sound like an anomaly, but can enhance the performance of applications that open a table, insert a set of objects, and then close the table. When the table is closed, its contents are written to the disk file. Defaults to false.
Does this mean it saves to disk after each insert or update?
What if I use open, then lookup, then close?
I haven't checked in the docs, but I assume this means that the VM opens the file and potentially mmaps it, so it is kept in memory and synchronised with the file on disk. A change can still sit in a cache, though, so it may not be written to disk immediately. If you want to be sure that all changes have been flushed to disk, call dets:sync/1 on the table to force the flush, which is explicitly stated in the docs:
This also applies to tables that have been opened with flag ram_file set to true. In this case, the contents of the RAM file are flushed to disk.
It will not open or close the file after each lookup; it keeps it open until dets:close/1 is called on the given table. Opening and closing the table for each lookup, on the other hand, can be expensive, so it would make using DETS somewhat pointless.
I'm building an app that makes extensive use of CoreData, and a lot of my models have UIImage and NSData properties (for images and videos). Since it's not a great idea to store that data directly in CoreData, I built a file manager class that writes the files into different buckets in the documents directory, depending on the context in which each file was created and its media type.
My question now is: how do I manage the documents directory? Is there a way to detect how much space the app has used up out of its total allocated space? Additionally, what is the best way to go about cleaning those directories; do I check every time a file is written or only on app launch, etc.?
Is there a way to detect how much space the app has used up out of its total allocated space?
Apps don't have a limit on total allocated space; they're limited by the amount of space on the device. You can find out how much space you're using for these files by using NSFileManager to scan the directories. There are several methods that do this in different ways; check out enumeratorAtPath:, for example. For each file, use a method like attributesOfItemAtPath:error: to get the file size.
Better would be to track the file sizes as you create and delete files. Keep a running total, stored in user defaults. When you create a new file, increase it by the amount of new data. When you remove a file, decrease the running total.
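To make the two approaches concrete, here is a rough sketch in Python rather than the NSFileManager calls named above. The names directory_size and UsageTracker are made up for illustration, and persisting the total (for example in user defaults) is left out.

```python
import os


def directory_size(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a linked file is not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


class UsageTracker:
    """Keeps a running byte total so a full rescan is rarely needed."""

    def __init__(self, initial_bytes=0):
        self.total_bytes = initial_bytes

    def write_file(self, path, data):
        # When overwriting, subtract the old size before adding the new one.
        if os.path.exists(path):
            self.total_bytes -= os.path.getsize(path)
        with open(path, "wb") as f:
            f.write(data)
        self.total_bytes += len(data)

    def remove_file(self, path):
        size = os.path.getsize(path)
        os.remove(path)
        self.total_bytes -= size
```

A full scan is still handy now and then as a sanity check, since the running total drifts if anything else touches the directory.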
Additionally, what is the best way to go about cleaning those directories; do I check every time a file is written or only on app launch, etc.?
If these files are local data that's inherently part of the associated Core Data object, the sensible approach is to delete a file when its Core Data object is deleted. The managed object needs the data file, so don't delete the file if you still use the object. That means there must be some way to link the two, but I'm assuming that's already true since you say that these files are used by managed objects somehow.
If the files are something like cached data that's easily re-created or re-downloaded, you should put them in the location returned by NSTemporaryDirectory(). Then iOS can delete them when it thinks the space is needed. You can also clear out old files whenever it seems appropriate, by scanning for older files or ones that haven't been used in a while (the details depend on exactly how you use the files).
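Scanning for older files could look roughly like this. It is a Python sketch of the idea, not iOS code; purge_old_files and the age threshold are assumptions, and it uses modification time because "last used" time is not reliably available everywhere.

```python
import os
import time


def purge_old_files(cache_dir, max_age_seconds, now=None):
    """Delete regular files in cache_dir not modified within max_age_seconds.

    Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    with os.scandir(cache_dir) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.path)
    return removed
```

Running something like this on launch is usually enough for cache-style data; doing it on every write costs a directory scan each time.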
I'm importing data into Core Data and find that the save operation is slow. Using the iOS simulator, I watch the sqlite-wal file grow and grow until it's over 7GB in size.
I'm importing approx 5000 records with about 10 fields. This isn't a lot of data.
Each object I'm inserting has a to-one relation to various other objects (6 relations total). All of those records combined equal less than 20 fields. There are no images or any binary data or anything that I can see that would justify why the resulting size of the WAL file is so huge.
I read the sqlite docs describing the wal file and I don't see how this can happen. The source data isn't more than 50 MB.
My app is multi-threaded.
I create a managed object context in the background thread that performs the import (creates and saves the core data objects).
Without writing the code out here, has anyone encountered this? Does anyone have a thought on what I should be checking? The code isn't super simple and all the parts would take time to input here, so let's start with general ideas.
I'll credit anyone who gets me going in the right direction.
Extra info:
I've disabled the undo manager for the context as I don't need that (I think it's nil by default on iOS but I explicitly set it to nil).
I only call save after the entire loop is complete and all managed objects are in RAM (RAM usage goes up to 100 MB, by the way).
The loop and creation of the Core Data objects takes only 5 seconds or so. The save takes almost 3 minutes as it writes the WAL file.
It seems my comment to try using the old rollback (DELETE) journal mode rather than WAL journal mode fixed the problem. NOTE that there seems to be a range of problems when using WAL journal mode, including the following:
this problem
problems with database migrations when using the migratePersistentStore API
problems with lightweight migrations
Perhaps we should start a Core Data WAL problems page and get a comprehensive list and ask Apple to fix the bugs.
Note that the default journal mode under OS X 10.9 and iOS 7 is now WAL. To change this back, add the following option:
@{ NSSQLitePragmaOptions : @{ @"journal_mode" : @"DELETE" } }
All changed pages of a transaction get appended to the -wal file.
If you are importing multiple records, you should, if possible, use a single transaction for the entire import.
SQLite cannot do a full WAL checkpoint while some other connection is reading the database (which might just be some statement that you forgot to close).
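Outside Core Data, the same two ideas (choosing the journal mode, and importing inside a single transaction) can be tried directly against SQLite. This is only a sketch using Python's sqlite3 module; the record table and the import_records helper are invented for the example and say nothing about the schema Core Data generates.

```python
import sqlite3

ALLOWED_MODES = {"DELETE", "TRUNCATE", "PERSIST", "MEMORY", "WAL", "OFF"}


def import_records(db_path, rows, journal_mode="DELETE"):
    """Insert all rows in one transaction; return the journal mode in effect."""
    if journal_mode.upper() not in ALLOWED_MODES:
        raise ValueError(f"unknown journal mode: {journal_mode}")
    conn = sqlite3.connect(db_path)
    try:
        # The pragma reports the mode that is actually active afterwards.
        mode = conn.execute(f"PRAGMA journal_mode={journal_mode}").fetchone()[0]
        conn.execute(
            "CREATE TABLE IF NOT EXISTS record (id INTEGER PRIMARY KEY, name TEXT)"
        )
        # 'with conn' commits once on success and rolls back on error,
        # so the whole import is a single transaction.
        with conn:
            conn.executemany("INSERT INTO record (id, name) VALUES (?, ?)", rows)
        return mode
    finally:
        # Closing promptly matters: an open reader can block a full checkpoint.
        conn.close()
```

Calling it with journal_mode="WAL" on the same data makes it easy to compare how the -wal file behaves against the rollback journal.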
We are having a strange situation while trying to dbexport/dbimport an Informix database.
While importing the DB, we got this error:
1213 - Character to numeric conversion error
I checked at which line the import stops.
I looked at the corresponding file (sed -n '1745813,1745815p' table.unl) and saw data that looks corrupt.
3.0]26.0]018102]0.0]20111001.0]0.0]77.38]20111012.0]978]04]0.0072]6.59]6.59]29.93]29.93]77.38]
3.0]26.0]018102]0.0]20111001.0]0.0]143.69]20111012.0]978]04]0.0144]6.59]6.59]48.79]48.79]143.69]
]0.000/]]-0.000000000000000000000000000000000000000000000000000044]8\00\00\07Ú\00\00Õ²\00\00\07P27\00\00\07Ú\00\00i]-0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000999995+']-49999992%(000000000000000000.0]-989074999997704800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0]-999992%(0000000000000000000000.0]]]Ú\00\00]*00000015056480000000000000000000000000000000000000000000000000000000000.0]-92%'9999)).'000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0]-;24944999992%(000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0]-81%-999994;2475200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0]]-97704751999992%(00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0]
The first two lines are OK. The rest seems to be corrupt data.
I do not know how the data got here, since it does not appear in a select statement.
I exported only the affected table and found that the same data is there.
I looked for a filter that matches all the rows and used it in another export. This time the corrupt data is not there.
Any idea about what might be the reason behind this?
Best Regards
Arthur
Arthur,
To answer the question of why the database is generating corrupted data: you will need to investigate.
The common causes are:
A crash of your OS or hardware
A power failure
A crash of the database, or its processes being killed by an admin
After any of the problems above, the file system can become corrupt, and the recovery (fsck) probably mangles the database data.
You are probably working with a journaling FS, which ext3, ext4 and NTFS are...
If you are not aware of any event like those described above, investigate the online.log of your Informix database, looking for any engine start without a regular shutdown before it. Your OS logs will also help you find any involuntary restart of the OS (power failure or crash).
Now about the solutions.
Restore a backup
Then you can export just the corrupted table and replace it in your dbexport.
You can do this with archecker (requires an Informix version greater than 10.FC4).
This article may help if you need it: Table Level Restore - Pretty Useful Stuff
Export your table just as you describe in the comments.
This will not recover the corrupt data; it will just "save" the "good" data and discard the "bad" data.
Created a new table as a copy of the first one.
Insert into table 2 select * from table1 where (my filter which matched all rows)
Recreated the table indexes
Renamed the tables
Depending on how bad the corruption is, you may not be able to export all the "good" data with a single select and will need to work around the "bad" data. Check this IBM article:
Unloading around table corruption
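As a quick sanity check on the unload file itself (this is not the SQL-filter approach described above, just a way to see which rows would trip the import), a small script can separate rows that parse from rows that don't. The sketch below assumes the "]" delimiter seen in the sample, a known field count, and that every field should be numeric or empty; adjust it if the table has character columns. split_unload_rows is a made-up name.

```python
def _is_number_or_empty(text):
    """True for an empty field or anything float() can parse."""
    if text == "":
        return True
    try:
        float(text)
        return True
    except ValueError:
        return False


def split_unload_rows(lines, expected_fields, delimiter="]"):
    """Return (good_rows, bad_rows); bad rows carry their 1-based line number."""
    good, bad = [], []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        fields = row.split(delimiter)
        # Rows end with a trailing delimiter, so the last piece is empty.
        if fields and fields[-1] == "":
            fields = fields[:-1]
        ok = len(fields) == expected_fields and all(
            _is_number_or_empty(f) for f in fields
        )
        if ok:
            good.append(row)
        else:
            bad.append((number, row))
    return good, bad
```

Feeding it the lines of table.unl with expected_fields=16 (the count in the two good sample rows) would list the line numbers that need a closer look.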
Ways to prevent this kind of problem or make recovery easier
First, of course, there is no way to prevent every crash...
What you can do is try to minimize the damage after a crash.
Do not use journaling file systems!
(on Linux, use an ext2 FS or RAW devices)
Enable KAIO (for RAW) or DIRECT_IO (for any FS) in the Informix configuration.
This prevents the database from using the OS cache, which makes writing data to disk safer. In some situations this can slow down or speed up your database; it depends a lot on your hardware/storage.
Configure your backup and test/check it regularly.
I recommend configuring a backup of the full database plus logical log backups.
Depending on your Informix version and license, you may be entitled to configure a cold RSS server (a "cluster" secondary node), which works in active-passive mode on a different server and dramatically reduces your chances of losing data after a crash on the main server.
After any crash, run oncheck to detect whether any corruption occurred:
How to use oncheck to detect corruption
My situation:
I have an app that needs to store 10,000 - 30,000 locations in some sort of storage method, which are then displayed on a MKMapView as individual pins. I also have a server that needs to be able to add to the database through pushing out changes.
By grouping pins I've eliminated all issues with the MKMapView; my biggest focus is now on speed, storage, and being able to add to and replace the storage contents. Currently I have a text file of 1,000 locations in JSON format, which is read as an array and sent to my custom map view (no issues there). My only issue is how I could update that text file (rather than downloading massive amounts of data) and store almost 30,000 locations.
Is this even feasible? It seems my current setup could scale pretty much perfectly, it's just this updating system that is causing me a headache.
Your current setup won't scale forever because you have to load the entire file into memory in one chunk. Eventually it will get too large to manage and will eat up too much memory. If the app cannot purge memory when the system runs low, the system will shut your app down, i.e. it won't be able to stay in the background and will have to relaunch each time the user switches back to it.
To update, you will have to load in the entire file, parse the JSON, figure out how to update the resulting data structure, then write it all to file. One error anywhere in the process could corrupt the entire file.
You really need to look at using Core Data or even SQL. Core Data has a learning curve but once you master it, it makes implementing designs like your app trivial. You also get automatic scaling and efficient memory management.
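To show what the SQL route buys you, here is a rough sketch with Python's sqlite3 module (not Core Data, and not iOS code). The location table, its columns, and the helper names are assumptions for the example. The point is that a server push only has to carry the changed rows, and the map only has to load the visible region.

```python
import sqlite3


def open_store(db_path):
    """Open the database and make sure the location table and index exist."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS location ("
        "id INTEGER PRIMARY KEY, name TEXT, lat REAL NOT NULL, lon REAL NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS location_lat_lon ON location (lat, lon)"
    )
    return conn


def apply_changes(conn, upserts, deleted_ids=()):
    """Apply one batch of server changes atomically (all or nothing)."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO location (id, name, lat, lon) "
            "VALUES (?, ?, ?, ?)",
            upserts,
        )
        conn.executemany(
            "DELETE FROM location WHERE id = ?", [(i,) for i in deleted_ids]
        )


def locations_in_region(conn, min_lat, max_lat, min_lon, max_lon):
    """Fetch only the rows inside a bounding box, ordered by id."""
    cur = conn.execute(
        "SELECT id, name, lat, lon FROM location "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? ORDER BY id",
        (min_lat, max_lat, min_lon, max_lon),
    )
    return cur.fetchall()
```

Because each batch runs in one transaction, an error halfway through leaves the previous data intact instead of a half-written file.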
We take text/csv like data over long periods (~days) from costly experiments and so file corruption is to be avoided at all costs.
Recently, a file was copied from the Explorer in XP whilst the experiment was in progress and the data was partially lost, presumably due to multiple access conflict.
What are some good techniques to avoid such loss? - We are using Delphi on Windows XP systems.
Some ideas we came up with are listed below - we'd welcome comments as well as your own input.
Use a database as a secondary data storage mechanism and take advantage of the atomic transaction mechanisms
How about splitting the large file into separate files, one for each day.
If these machines are on a network: send an HTTP POST with the logging data to a webserver.
(sending UDP packets would be even simpler).
Make sure you only copy old data. If you have a timestamp on the filename with a 1 hour resolution, you can safely copy the data older than 1 hour.
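The timestamp rule is easy to get wrong by one hour, so here is a small sketch of it in Python (the question uses Delphi, but the logic carries over). The data_YYYYMMDD_HH.csv naming scheme is hypothetical.

```python
from datetime import datetime, timedelta

NAME_FORMAT = "data_%Y%m%d_%H.csv"  # hypothetical naming scheme, 1 hour resolution


def file_name_for(moment):
    """Name of the file that receives data logged at the given moment."""
    return moment.strftime(NAME_FORMAT)


def is_safe_to_copy(file_name, now):
    """True once the hour covered by file_name has fully elapsed."""
    started = datetime.strptime(file_name, NAME_FORMAT)
    return now >= started + timedelta(hours=1)
```

A file named for 03:00 keeps receiving data until 04:00, so it only becomes safe to copy from 04:00 onwards, not one hour after its first write.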
If a write fails, cache the result for a later write - so if a file is opened externally the data is still stored internally, or could even be stored to a disk
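A minimal sketch of that "cache on failure" idea, in Python for brevity; BufferedPointWriter is an invented name. It also opens and closes the file for every write, so the file is held open only briefly.

```python
class BufferedPointWriter:
    """Appends data points to a file, keeping them in memory if a write fails."""

    def __init__(self, path):
        self.path = path
        self.pending = []  # lines not yet confirmed written

    def write_point(self, line):
        """Queue one data point and try to write everything that is queued."""
        self.pending.append(line)
        return self.flush()

    def flush(self):
        """Try to append all pending lines; return True if none remain."""
        if not self.pending:
            return True
        try:
            # Open and close per attempt so the file is locked only briefly.
            with open(self.path, "a", encoding="utf-8") as f:
                f.write("".join(p + "\n" for p in self.pending))
        except OSError:
            # File locked or unavailable: keep the data for the next attempt.
            return False
        self.pending.clear()
        return True
```

One caveat: if the write fails partway through, a retry can duplicate lines, so a sequence number or timestamp in each line helps when cleaning up afterwards.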
I think what you're looking for is the Win32 CreateFile API, with these flags:
FILE_FLAG_WRITE_THROUGH : Write operations will not go through any intermediate cache, they will go directly to disk.
FILE_FLAG_NO_BUFFERING : The file or device is being opened with no system caching for data reads and writes. This flag does not affect hard disk caching or memory mapped files.
There are strict requirements for successfully working with files opened with CreateFile using the FILE_FLAG_NO_BUFFERING flag, for details see File Buffering.
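For comparison, a portable approximation of "don't leave it in the cache" is to flush and sync after every write. This Python sketch is not the same thing as opening the file with the CreateFile flags above; it only asks the OS to push its buffers out after each data point. append_durably is a made-up name.

```python
import os


def append_durably(path, line):
    """Append one line and ask the OS to commit it to the device."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()             # push Python's own buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to the device
```

Syncing on every point is slow, but for data arriving over days from a costly experiment that trade-off is usually acceptable.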
Each experiment must use a 'work' file and a 'done' file. The work file is opened exclusively, and the done file is copied to a place on the network. An application on the receiving machine would feed those files into a database. If Explorer tries to move or copy the work file, it will receive an 'Access denied' error.
The 'work' file would become 'done' after a certain period (say 6, 12 or 24 hours, or whatever period suits). The program then creates another work file (the name must contain the timestamp) and sends the 'done' file over the network (or a human can do that, which is what you are actually doing, if I understand your text correctly).
Copying a file while it is in use is asking for it to be corrupted.
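The hand-over from 'work' to 'done' can be sketched as a rename, shown here in Python. It covers only the rotation step: the exclusive open is a Windows share-mode feature that is not shown, and the done_<stamp>.csv name and rotate_work_file helper are assumptions.

```python
import os


def rotate_work_file(work_path, done_dir, stamp):
    """Move a closed work file into done_dir under a timestamped name.

    The work file must already be closed, and done_dir should be on the
    same volume so the move is a rename rather than a copy.
    """
    os.makedirs(done_dir, exist_ok=True)
    done_path = os.path.join(done_dir, f"done_{stamp}.csv")
    os.replace(work_path, done_path)
    return done_path
```

Anything that copies files to the network then only ever looks in the done directory, so it never touches a file that is still being written.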
Write data to a buffer file in an obscure directory and copy the data to the 'public' data file periodically (every 10 points for instance), thereby reducing writes and also providing a backup
Write data points discretely, i.e. open and close the filehandle for every data point write - this reduces the amount of time the file is being accessed provided the time between data points is low