How to deal with data historization in a data lake vs. a data warehouse?

Keeping data historized is possible in a classic data warehouse, and is even a core feature of one. Data is added to the warehouse over time, and it is possible to move back and forth in time over that data.
If I want to use only a data lake and still offer data historization to business users, is that possible? And if so, what would a possible approach look like?

Yes, you can do it. If you only ever insert data, and never update or delete it, then you will have, by default, a full history of all your data.
The possible approaches depend entirely on the technology that supports your data lake, how you have structured the data in it, the tools your business users use to access the data, and so on. Without much more information from you, it's not possible to give a more specific answer than the generic "yes, it is possible to hold historic data in a data lake".
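To make the insert-only idea concrete, here is a minimal sketch in Swift. Everything in it is an assumption for illustration: the CustomerVersion record, the loadedAt timestamp, and the snapshot function are hypothetical names and not part of any data lake product. It shows that when every change is appended as a new row with a load timestamp, the state "as of" any date can be worked out afterwards.

```swift
import Foundation

// Hypothetical record shape: every change is appended as a new row and
// never updated in place, so older versions remain available.
struct CustomerVersion {
    let customerID: Int
    let city: String
    let loadedAt: Date   // when this version landed in the lake
}

// Returns, per customer, the version that was current at `asOf`.
func snapshot(of rows: [CustomerVersion], asOf: Date) -> [Int: CustomerVersion] {
    var latest: [Int: CustomerVersion] = [:]
    for row in rows where row.loadedAt <= asOf {
        // Keep the row already seen if it is at least as recent.
        if let seen = latest[row.customerID], seen.loadedAt >= row.loadedAt {
            continue
        }
        latest[row.customerID] = row
    }
    return latest
}

let day: TimeInterval = 86_400
let t0 = Date(timeIntervalSince1970: 0)
let rows = [
    CustomerVersion(customerID: 1, city: "Berlin", loadedAt: t0),
    CustomerVersion(customerID: 1, city: "Munich", loadedAt: t0 + 10 * day),
    CustomerVersion(customerID: 2, city: "Hamburg", loadedAt: t0 + 5 * day),
]

let early = snapshot(of: rows, asOf: t0 + 7 * day)
let later = snapshot(of: rows, asOf: t0 + 30 * day)
print(early[1]?.city ?? "none")  // prints "Berlin"
print(later[1]?.city ?? "none")  // prints "Munich"
```

In a real lake the same logic would usually run in whatever query engine sits on top of the stored files; the point is only that the history is recoverable as long as nothing is overwritten.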

A classic data warehouse brings data together in a model with time series at the centre.
Data lakes hold the raw data in its original format, which typically is not stored with time series in mind. You can store your data so that the time series and historical changes can be worked out, but a data lake lacks the pre-modelled, easily accessible time series aspect of a data warehouse.

Related

iOS: save text to documents directory or core data entity?

In my app, the user has a collection of leaves (a leaf is a Core Data entity in my Xcode project). I'd like to create a note-taking feature so that they can write and save notes for each leaf in their collection. I imagine this would require either saving the text for each item as an attribute of the Core Data entity or storing it in the documents directory.
Is it better for memory management and app efficiency to save text with Core Data or in the documents directory? I know that it's not good to save images in Core Data, and I was wondering if there are other best practices I should be aware of or can implement.
Since you're talking about written notes, there's no need to worry about efficiency at all. We're talking about a few hundred bytes per note. That's at least a factor of 1000 away from where performance might become an issue.
Just do what's easiest. As you already seem to suggest, that would be storing the note texts directly in Core Data objects.

Best Way to save data locally on the device in iOS app

I am working on my first iOS app, which will be deployed for both iPhones and iPads. The app contains data that needs to be bundled with the app and used when the device is offline.
The offline version has at least 35-40 records, each containing images (which would be bundled in the app; only their names would be saved), a varchar field of at least 1000 words, and a boolean field.
I have found three possible solutions:
1. Save all the fields using a database (SQLite or Core Data). However, I am concerned about the column that will hold 1000 words. Since the length of the varchar field may vary, I need to allocate a maximum limit of 2000 (or more, depending on the actual length of the keywords), which will lead to unnecessary allocation of memory resources.
2. Save the information locally as JSON and use it as and when required, and save the boolean fields (only the true ones) locally in NSUserDefaults.
3. Use the JSON approach described above and create a database for managing the boolean fields.
I would like to seek the opinion of the Stack Overflow community on what would be the ideal/optimized approach for this scenario. I am also open to any other approach.
Edit 1
Proposed provisional database structure
Listing Table
id -> int (autoincrement)
name -> varchar(25)
imagename -> varchar(10)
description -> varchar(2000)
favorite -> boolean
It sounds as though the text field (with the 1000-2000 words) is static text that is bundled with the app and cannot be changed by the user of the app. If that's the case, then you can store that data in the app bundle as plist or JSON files and load it on demand (assuming you don't need to search through it).
Then, if each of those records has only a single boolean value that is changeable by the user, those could be stored in NSUserDefaults very easily (since you've stated you're only dealing with 35-40 records). You'd use the id to link the boolean to the data file.
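As a rough sketch of that split, assuming a JSON file with hypothetical field names: the static records are decoded from bundled JSON, and the one user-changeable boolean is kept in UserDefaults under a key derived from the record id. The Listing type, the favorite_ key prefix, and the helper functions are illustrative assumptions, and an inline string stands in for the bundled file so the example runs on its own.

```swift
import Foundation

// Hypothetical shape of one bundled record; the field names are assumptions.
struct Listing: Codable {
    let id: Int
    let name: String
    let imageName: String
    let description: String
}

// In an app this data would come from a file in the bundle, for example via
// Bundle.main.url(forResource: "listings", withExtension: "json").
// An inline string stands in for it here so the sketch is self-contained.
let bundledJSON = """
[
  {"id": 1, "name": "First", "imageName": "first.png", "description": "Long static text..."},
  {"id": 2, "name": "Second", "imageName": "second.png", "description": "More static text..."}
]
"""

let listings = try JSONDecoder().decode([Listing].self, from: Data(bundledJSON.utf8))

// The user-changeable flag lives in UserDefaults, keyed by the record id.
func favoriteKey(for id: Int) -> String {
    return "favorite_\(id)"
}

func setFavorite(_ value: Bool, for id: Int, in defaults: UserDefaults = .standard) {
    defaults.set(value, forKey: favoriteKey(for: id))
}

func isFavorite(_ id: Int, in defaults: UserDefaults = .standard) -> Bool {
    // bool(forKey:) returns false when nothing has been stored for the key.
    return defaults.bool(forKey: favoriteKey(for: id))
}

setFavorite(true, for: 2)
for listing in listings {
    print(listing.name, isFavorite(listing.id))
}
```

Because the static text never leaves the bundle, only the small set of flags ends up in user-writable storage.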
You could use Core Data or Realm to store the data, but it may be overkill if you don't need a search feature and the user can't change the text. If you do go with a database option, be aware that you cannot store static data (the text) in a location that is backed up by iCloud, or Apple will reject your app, regardless of whether you use iCloud in the app or not. So if you were to create a Core Data persistent store, save it to the user's Documents folder, and then load in all the static data, you would be rejected. You would want to save that data store in the user's Caches folder so that iCloud doesn't back it up.
The issue you'll hit after that, though, is that you do want the user's choices (your boolean values) backed up. This means they need to be stored in a different place. Core Data does have a feature that lets you create Configurations, which separate the user-changeable data from the non-changeable data, but again, that's overkill for your case.
I'd recommend starting with Realm over Core Data for such a small dataset. It's much easier to get up and running.
If you need to search within the fields, Core Data is the best approach, because you can easily query your data using NSPredicate (like an SQL WHERE clause).
But if you need to load everything at each launch, you can just store everything in a file (plist, JSON, ...), because it is much easier to manage and to update (if you update the Core Data model, migrating it on app update may be complicated).
So my short answer is:
If you need to search in your data => Core Data
Else => file in local storage
Do not use UserDefaults for this; it is not designed for it.
It depends on the complexity of your database and the operations you need.
First of all, for a dataset this small, whichever storage system you use, you will see no noticeable performance difference.
Every storage system has its own complexity and limitations on operations.
For example, it is very simple to use NSUserDefaults to store and retrieve data.
But if you need to perform relational operations on bulk data, it is better to use SQLite or Core Data. Relational operations are much easier to perform with SQLite or Core Data than with the other options.
Another option, a property list, is also available if your data consists only of key-value pairs.
Core Data typically sits on top of SQLite: the persistent store it most commonly uses is an SQLite database.
The difference between Core Data and SQLite is that Core Data provides more flexibility but is comparatively hard to learn, whereas SQLite provides less flexibility than Core Data but is less complex to learn and use. Flexibility means, for example, that you can see a visual representation of the Core Data model and visually add entities, attributes, and so on.
So select a storage system according to your needs, the complexity of use, and the operations you will need to perform. Since your database is not big, not complex, and does not need relational operations or multiple tables, you can use user defaults or a property list as well.
Hope this helps :)
Pitching in with an option.
To make it easy for yourself later on, I would suggest using Core Data. This lets you easily manage the products for reading and writing. It also gives you good persistent storage.
As for the description issue: you could store the description for each product in its own file, with a unique name that references the product you store in Core Data. In your Core Data entity you define a descriptionFile attribute, which holds the file path.
This approach makes things easier for Core Data when you fetch the objects. You may want a browse view where you don't need to display the description, and therefore don't need to load the description text into memory. When a product is selected, load the description file for that product and display its text.
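A minimal sketch of that layout, with a plain struct standing in for the Core Data entity (an assumption made so the example runs without a data model). The Product type, the folder location, and the two helper functions are hypothetical; only the descriptionFile idea comes from the suggestion above.

```swift
import Foundation

// Stand-in for the Core Data entity: it keeps only lightweight fields plus
// the name of the file that holds the long description (an assumed layout).
struct Product {
    let name: String
    let descriptionFile: String
}

// Hypothetical folder for the description files; a temporary directory
// keeps the sketch runnable anywhere.
let folder = FileManager.default.temporaryDirectory
    .appendingPathComponent("descriptions", isDirectory: true)
try FileManager.default.createDirectory(at: folder, withIntermediateDirectories: true)

func saveDescription(_ text: String, for product: Product) throws {
    let url = folder.appendingPathComponent(product.descriptionFile)
    try text.write(to: url, atomically: true, encoding: .utf8)
}

// Called only when a product is selected, so browsing a list of products
// never pulls the long text into memory.
func loadDescription(for product: Product) throws -> String {
    let url = folder.appendingPathComponent(product.descriptionFile)
    return try String(contentsOf: url, encoding: .utf8)
}

let product = Product(name: "Sample", descriptionFile: "product-1.txt")
try saveDescription("A long description of around 1000 words...", for: product)

let text = try loadDescription(for: product)
print(text)
```

Storing a file name relative to a known folder, instead of an absolute path, is a deliberate choice here: absolute paths inside an app's container can change between installs.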
Happy Coding :)

Should plists be imported to CoreData?

I have several big plists in my app. I use them to get the input data my app needs. While the app is running, this data is used in various random visual representations. I also have a favorites feature, where I save some favorite pieces of data. For the favorites feature I use Core Data: I transfer some objects from my "runtime" data to Core Data and save them.
But should I transfer all the data from the plists to Core Data when I launch the app for the first time? Or is it OK to read the data from the plists at every launch?
For example, suppose we were talking about a reading app with some text files on disk. Should I transfer the whole file to Core Data at first launch? Or is it OK just to save the user's bookmarks to Core Data?
Core Data and plists are both used to store data, so getting your data from either one at every launch is no problem at all. But if you want to manage a complex relational database, you should use Core Data or SQLite. So choose the storage system according to your requirements. For example, if you want to store a user's default credentials, you can use NSUserDefaults; it will also work for more complex data, but you will probably run into trouble with some kinds of operations. So the main point, and the answer to your question, is that for this use case you will see no difference in performance whichever storage system you use.
Hope this helps :)
If you only have to read the data, or update all of the plist data, almost all the time, a plist may be OK; it will also be easier to access than Core Data.
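For reference, reading a plist at launch can be as short as the sketch below. The Entry type and its keys are assumptions, since the question does not show the plist contents, and the sample file is written to a temporary directory first only so that the example is self-contained.

```swift
import Foundation

// Hypothetical entry type; the real plist keys are not given in the question.
struct Entry: Codable {
    let title: String
    let body: String
}

// Write a sample plist first so the sketch is self-contained. In an app the
// file would ship in the bundle and only the reading half would be needed.
let url = FileManager.default.temporaryDirectory.appendingPathComponent("entries.plist")
let sample = [
    Entry(title: "First", body: "Some text"),
    Entry(title: "Second", body: "More text"),
]
let encoder = PropertyListEncoder()
encoder.outputFormat = .xml
try encoder.encode(sample).write(to: url)

// Reading at launch: one decode call turns the whole file into typed values.
let data = try Data(contentsOf: url)
let entries = try PropertyListDecoder().decode([Entry].self, from: data)
print(entries.map { $0.title })  // prints ["First", "Second"]
```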
Both a plist and Core Data can be used as persistent storage, but Core Data has some additional benefits, listed below:
Data stored in Core Data is less easily inspected than a plist, whose contents can be read directly, so it may be a somewhat better place for sensitive information.
If you have to insert, update, delete, or search data, Core Data is a better fit than a plist.
If you want relationships or mappings between data, Core Data models them directly; a plist does not.
So you can choose your storage option based on your requirements.

Storing Potentially Large Amounts of Data with Core Data

I'm working on an app that lets users store items in different collections. Each item contains the item name, price, UPC, and any images the user associates with it, either from the camera or from photo albums. My concern with using Core Data is that there could be problems if the user has a ton of items in their collections. I want this to be available completely offline. How should I go about this?
If you are using SQLite, Core Data can handle more data than you have physical space on an iOS device. SQLite has been used to store terabytes of data. Core Data handles the memory management extremely well. The combination makes the quantity of data a non-issue.

iOS Application: Portable Data Storage of many Key/Values

What is a good way to store many key/value pair entries in a mobile (iOS) application, such that they can be easily exported/imported?
I have considered a single large JSON file - would this be too slow/large with 200,000+ entries?
I have also considered CoreData - but could the data be moved easily via, for example, email?
Think of an address book. Contacts can be easily imported and exported; what data storage model would be comparable?
Thank you.
EDIT: Examples
Notes - be able to select and view short notes in a table. Each note is < 100 characters.
Saved bookmarks - each bookmark is stored in a table.
"I have considered a single large JSON file - would this be too slow/large with 200,000+ entries?"
I don't know. I can make a guess. The guess would be yes, it's both too large and too slow. However, you can always test it to find out.
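One way to run that test is sketched below. The key and value shapes are made up (short notes of under 100 characters, as in the question's example), and the numbers it prints depend entirely on the device and build settings, so treat it as a starting point for measuring on real hardware, not as a result.

```swift
import Foundation

// Build 200,000 short key/value pairs; the key and value shapes are made up
// for the test.
var entries: [String: String] = [:]
entries.reserveCapacity(200_000)
for i in 0..<200_000 {
    entries["key\(i)"] = "A short note, number \(i)"
}

// Time encoding and decoding the whole dictionary as one JSON document.
let encodeStart = Date()
let data = try JSONEncoder().encode(entries)
let encodeSeconds = Date().timeIntervalSince(encodeStart)

let decodeStart = Date()
let decoded = try JSONDecoder().decode([String: String].self, from: data)
let decodeSeconds = Date().timeIntervalSince(decodeStart)

print("size: \(data.count / 1024) KiB")
print("encode: \(encodeSeconds) s, decode: \(decodeSeconds) s")
print("round trip intact: \(decoded == entries)")
```

Note that a single JSON file has to be read and parsed in full before any one entry can be looked up, which is the main cost this measurement exposes.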
"I have also considered CoreData - but could the data be moved easily via, for example, email?"
That depends on how you want to share the data. You call email easy?
Core Data is a framework. You can use any type of backend you want (you can even write your own). The most common is probably SQLite.
If you use Core Data, you can keep the data files in a separate subdirectory and copy them just like any other file.
However, if you want to share data via an online service, you may want the ability to import/export JSON files.
If you are talking about synchronization, then that's a different beast entirely.
Basically, there is no single right answer. You have to assess your requirements, and then determine which solution meets your needs.
On the surface, it seems like using Core Data would be a good fit, but it depends on how you want to use the data in your application. Only you know that answer.

Resources