I've linked a datasource to MindsDB using Scout. I'm wondering: does it automatically pull new data and retrain the model incrementally, or do I have to trigger a full retrain myself?
MindsDB supports a SQL statement called RETRAIN that triggers retraining of an existing model. For example, if your model is called churn_predictor you would run:
RETRAIN churn_predictor
Auto-retraining functionality is in beta, and it will first be available only for streams (Kafka, Redis).
At the moment, in the open-source version you will need to retrain manually. We are looking at releasing model versioning, auto-retraining, and comparison, but these features are not yet available in the open-source version. For more info, you can visit https://docs.mindsdb.com/databases/.
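If you do end up retraining manually on a schedule, a minimal cron-style sketch is below. It assumes MindsDB's MySQL-compatible API is reachable locally; the host, port, credentials, and the churn_predictor model name are placeholders for your own setup.

    import mysql.connector  # pip install mysql-connector-python

    # Hypothetical scheduled job that triggers a manual retrain through
    # MindsDB's MySQL-compatible API. Host, port, user, password, and the
    # churn_predictor model name are placeholders for your own setup.
    conn = mysql.connector.connect(
        host="127.0.0.1",
        port=47335,  # assumption: the default port of MindsDB's MySQL API
        user="mindsdb",
        password="",
    )
    cursor = conn.cursor()
    cursor.execute("RETRAIN churn_predictor")
    cursor.close()
    conn.close()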
I'm looking for a way to run spaCy offline, similar to this example from Hugging Face.
Basically, I want to use local files only and not have anything looked up online when using a model.
So when I use a model it will not change even if an update is available, and I will keep using the model I have locally.
I have tried using nlp.to_disk("./en_example_pipeline") and then spacy.load("./en_example_pipeline"), but I'm not sure that this method will never update the model if a newer one is available. The docs are not clear.
spaCy does not automatically update or download models on its own.
Pretrained pipelines provided by spaCy are only downloaded if you use the spacy download command or related functions.
In the specific case of spacy-transformers, which uses HuggingFace Transformers internally, if you specify a model that isn't present already, it will be downloaded. Whether a model is checked for updates when it is loaded depends on the HuggingFace library and the model implementation.
Models are never automatically checked for updates by spaCy's own code, but models can contain arbitrary code, so you may need to be careful about that. spaCy's pretrained pipelines are not automatically checked for updates.
If you are using a pipeline you created yourself, like with nlp.to_disk, spaCy won't access the Internet (unless one of your own components is doing that).
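To make the local-files-only workflow concrete, here is a minimal sketch. The pipeline directory name is just an example, and the two environment variables are an extra belt-and-braces step that only matters if your pipeline uses spacy-transformers; plain spaCy pipelines loaded from a local path never go online by themselves.

    import os

    # Tell the Hugging Face libraries not to reach the network at all.
    # Only relevant if the pipeline uses spacy-transformers.
    os.environ["TRANSFORMERS_OFFLINE"] = "1"
    os.environ["HF_HUB_OFFLINE"] = "1"

    import spacy

    # One-time step, run while the pretrained pipeline is still installed:
    # save it to a local directory.
    nlp = spacy.load("en_core_web_sm")
    nlp.to_disk("./en_example_pipeline")

    # From then on, load only from that directory. This never checks for or
    # downloads updates -- you always get exactly the files on disk.
    nlp_offline = spacy.load("./en_example_pipeline")
    doc = nlp_offline("Loading from a local path keeps spaCy offline.")
    print([(token.text, token.pos_) for token in doc])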
I have a stream of user-item pairs. I build a model from a block of the last 6M records and update it every minute. I don't like that between these rebuilds some important data might go unused; for example, a new user has joined the system, but the model doesn't know about him yet. I've found the class PlusAnonymousConcurrentUserDataModel, which allows me to add a few entries to the model and get a more accurate recommendation. The documentation, however, proposes a more constrained usage scenario for it: I have to:
allocate a temporary user
add the extra data
get the recommendation
and then release the user and the extra data
Is it OK to use this class for collecting data iteratively until the model is actually rebuilt by the timer? What is the right way to do this? It seems that PlusAnonymousConcurrentUserDataModel is intended for somewhat different purposes.
This part of Mahout is very old and being deprecated. I think it is not even in the 0.14.0 build; you would have to build from source.
Mahout now uses a whole new technology for recommending. The new algorithm is called Correlated Cross-Occurrence (CCO). The old method you are using does not make use of real-time input, as you have outlined. CCO can recommend to anonymous users who have not been built into the model, as long as there is behavioral data for them in some form.
The architecture to implement CCO requires a datastore in a DB and a KNN engine (search engine) to make model queries. These are all packaged together in Apache PredictionIO + the Universal Recommender template.
Community support for the Universal Recommender itself can be found here: https://groups.google.com/forum/#!forum/actionml-user or on the mailing lists of the other projects.
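For illustration, a query against a deployed Universal Recommender engine looks roughly like the sketch below. It assumes the PredictionIO engine server is listening on its default port 8000 and that a simple user-based query is sufficient; the user id and field values are placeholders, so check the Universal Recommender docs for the exact query schema.

    import requests

    # Hypothetical query against a deployed Universal Recommender engine.
    # Assumes the PredictionIO engine server is listening on localhost:8000.
    query = {"user": "u-123", "num": 10}  # placeholder user id and result count

    resp = requests.post("http://localhost:8000/queries.json", json=query)
    resp.raise_for_status()
    print(resp.json())  # e.g. {"itemScores": [{"item": "...", "score": ...}, ...]}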
My submitted app has two Core Data models, and I made some changes by adding some attributes to the current model. So I added a new version of the model and enabled lightweight migration, but this error appears when migrating: reason = "Can't find model for source store".
I followed the second answer to the question "Core Data - lightweight migrations and multiple core data model files (xcdatamodel)", and it works great in the simulator, but it does not work on devices and fires the same error.
So maybe a step-by-step explanation is best...?
I encountered confusion when learning about data models, so I am presenting my own ideas about the issues that I stumbled over during this part of my Core Data education... (which, by the way, has only really just begun in the overall scheme of things).
I cannot stress enough the importance of reading a couple of good books and developing a solution based on the advice contained within... so with that in mind...
A book that I often recommend for those interested in Core Data is from The Pragmatic Bookshelf: "Core Data, 2nd Edition, Data Storage and Management for iOS, OS X, and iCloud" (Jan 2013) by Marcus S. Zarra, and in particular Chapter 3, titled "Versioning and Migration".
It is important to recognise that to migrate successfully, Core Data requires ALL PREVIOUS ORIGINAL INTACT UNALTERED VERSIONS of the data model.
Why?
An example...
user1 updates every time a new version of the app is released; however, in the latest update this correlates with the third-oldest data model version.
user2 has not updated the app for four months, three App Store releases ago, which happens to correlate with the seventh-oldest data model version.
user3 was using an Android phone, realised the error of his ways, and returned to his iPhone 4 with your app installed but not updated for one year, which correlates with the nineteenth-oldest data model version, back when the app used two different data model containers.
So how is Core Data to know how to migrate the previous app's SQLite database to the current version, so that the database will work with the code in your app?
(Now I don't understand this entirely, so please forgive my ignorance, but) my understanding is that Core Data uses the hash values of previous versions of your data model to identify which model version the store on disk was created with, and based on that, applies migration to bring the store up to date, and (here is the important part) ONE DATA MODEL VERSION AT A TIME!
This is critical to understand. When you understand this, you understand that Core Data requires ALL previous data model versions, unmodified, to migrate successfully. Each previous data model version is required to successfully complete each step in the migration process.
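As a rough illustration of that hash-based matching (not the author's code, just a sketch), this is how you can ask whether a store on disk was created with a particular compiled model version; the store path, container name, and version name are placeholders:

    import CoreData

    // Sketch: ask whether an on-disk store was created with a particular
    // compiled model version, using the same version-hash comparison Core
    // Data performs before migrating. Paths and names are placeholders.
    func storeMatchesVersion(storeURL: URL, versionName: String) -> Bool {
        do {
            // The store's metadata contains the entity version hashes that
            // were recorded when the store was last saved.
            let metadata = try NSPersistentStoreCoordinator.metadataForPersistentStore(
                ofType: NSSQLiteStoreType, at: storeURL, options: nil)

            // Load one compiled model version (a .mom inside the .momd
            // container) and ask whether its hashes match the store's.
            guard let modelURL = Bundle.main.url(forResource: versionName,
                                                 withExtension: "mom",
                                                 subdirectory: "MyModel.momd"),
                  let candidate = NSManagedObjectModel(contentsOf: modelURL) else {
                return false
            }
            return candidate.isConfiguration(withName: nil,
                                             compatibleWithStoreMetadata: metadata)
        } catch {
            return false
        }
    }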
Let's review my example.
When they download the app update and run the app for the first time following this latest update:
user1's version of the app has three data model versions to migrate to arrive at an SQLite database that aligns with the latest data model.
user2's version of the app has seven data model versions to migrate to arrive at an SQLite database that aligns with the latest data model.
user3's version of the app has nineteen data model versions to migrate, but in addition, the two previous data models must be merged between data model version 10 and data model version 11 (for example), to arrive at an SQLite database that aligns with the latest data model.
So if you remove or alter any of the previous data model containers or versions, how is Core Data to know how to successfully migrate?
With this in mind, I provide the following advice...
Keep ALL previous data models and versions in their respective .xcdatamodeld containers.
In the case that you have more than one data model that must migrate, keep the ORIGINAL versions of these data model .xcdatamodeld containers, and use the appropriate Core Data methods to merge the containers when necessary.
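For example, one way to combine several model containers at runtime is to let Core Data merge every model it finds in the bundle; this is a sketch rather than a drop-in solution, and whether it fits depends on how your containers relate:

    import CoreData

    // Sketch: build the app's current model by merging every model found in
    // the main bundle instead of loading a single .momd by name, one of the
    // Core Data methods available for apps that ship several model containers.
    guard let mergedModel = NSManagedObjectModel.mergedModel(from: [Bundle.main]) else {
        fatalError("Could not merge the managed object models in the main bundle")
    }
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: mergedModel)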
In direct response to the question, I suspect that at some stage you have modified the previous data model containers or versions to suit your testing on the simulator. When testing on a device, the "different" hash values for the data models on the device do not match anything that remains in your data model version containers, and so the Build & Run throws the error you noted in your question.
My advice is to rebuild your data model version containers (.xcdatamodeld files) as they were, to enable Core Data to properly migrate through all previous versions (and merge model containers as necessary) to arrive at the appropriate, and latest, SQLite database.
Hope this helps.
Let me know if I have missed the mark, and I will continue my investigation.
I'm (hopefully) close to releasing my first app which uses Core Data. Now I've read all of the articles and posts regarding lightweight migration and it makes sense. The only question I had is, do I have to do anything before I ship the app?
My understanding is that after I release, if I want to change anything in the model, I add a second model version and set up the store with the NSMigratePersistentStoresAutomaticallyOption and NSInferMappingModelAutomaticallyOption options in the App Delegate.
Do I have to do anything else before I release the first version of my app?
Thanks,
You will need to create a second version of the object model (i.e. the Core Data graph, which you do in Xcode rather than in code) if you want to make any modifications, and set up your persistent store object using the method described here. Provided that you aren't making very complex changes to the data model or moving to a new model, this will usually just work.
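For reference, a minimal Swift sketch of that setup (presumably the standard options-based approach the linked method describes) follows; it adds the SQLite store with the two lightweight-migration options turned on. The model name and store file name are placeholders.

    import CoreData

    // Sketch of opting in to lightweight migration when adding the persistent
    // store (typically done in the App Delegate's Core Data stack). The model
    // name "MyModel" and store file name are placeholders.
    func makeCoordinator() throws -> NSPersistentStoreCoordinator {
        guard let modelURL = Bundle.main.url(forResource: "MyModel", withExtension: "momd"),
              let model = NSManagedObjectModel(contentsOf: modelURL) else {
            fatalError("Could not load the managed object model")
        }
        let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)

        let storeURL = FileManager.default
            .urls(for: .documentDirectory, in: .userDomainMask)[0]
            .appendingPathComponent("MyApp.sqlite")

        // These two options enable lightweight migration: migrate the store
        // automatically, and infer the mapping model from the differences
        // between the model versions.
        let options: [AnyHashable: Any] = [
            NSMigratePersistentStoresAutomaticallyOption: true,
            NSInferMappingModelAutomaticallyOption: true
        ]
        try coordinator.addPersistentStore(ofType: NSSQLiteStoreType,
                                           configurationName: nil,
                                           at: storeURL,
                                           options: options)
        return coordinator
    }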
Is there an automatic way, using Xcode 5, to update the class definitions after modifying the Core Data model (of course I'm talking about the classes corresponding to the data model ;) )?
I checked here, but the solution was not satisfying, and since Xcode 5 is out, maybe there is something new.
Thanks
I normally implement all my custom code for managed object subclasses in categories. This is already part of my normal workflow and counts as an established best practice.
Now regenerating your class definitions is nothing more than choosing one menu command (Editor > Create NSManagedObject Subclass…). I think this is an acceptable degree of automation.
Beats any third party framework import, learning curve and maintenance.
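To illustrate the split this workflow relies on (generated class in one file, custom code kept separately so regeneration never clobbers it), here is a hypothetical sketch in Swift, where an Objective-C category corresponds to an extension; the entity and property names are made up:

    import CoreData

    // File 1: generated from the data model. Safe to regenerate after every
    // model change because it holds no hand-written code.
    @objc(Note)
    class Note: NSManagedObject {
        @NSManaged var title: String?
        @NSManaged var createdAt: Date?
    }

    // File 2: hand-written extension (the "category"). The generator never
    // touches it, so custom logic survives regeneration.
    extension Note {
        var displayTitle: String {
            title?.isEmpty == false ? title! : "Untitled note"
        }
    }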