I am trying to build a CQRS and event-sourced Rails 5.2.x application using RailsEventStore.
Now I would like to project my event stream into a relational model, ideally just using ActiveRecord and the PostgreSQL database I also used for my event store.
In the documentation of RailsEventStore I only found on-the-fly, non-persistent projections.
Is there any infrastructure available to continuously build and update a relational representation of an event stream? It needs to remember which events have already been applied to the relational model across restarts of the application.
In case you know how to do it, please let me know.
There is no out-of-the-box background process in RailsEventStore to support persistent projections the way the EventStore database does.
There are, however, pieces you can fit together to achieve something similar: event handlers and linking.
My colleague Rafał put together a few posts documenting this approach:
https://blog.arkency.com/using-streams-to-build-read-models/
https://blog.arkency.com/read-model-patterns-in-case-of-lack-of-order-guarantee/
If you'd like to implement such a projection as a separate background process rather than relying on event handlers (whether synchronous or not), then Distributed RailsEventStore with PgLinearizedRepository might be a good starting point.
My question might seem a bit naive, but as a beginner iOS developer, I'm starting to think that Core Data is replaceable by the Firebase Realtime Database (or Firestore in the future). I used both of them in two separate projects, and after activating the offline feature in Firebase, I got the same results (that is, the data was saved to the device without the need for an internet connection). I think I read something in the Firebase documentation about it not being able to filter and sort at the same time, which would probably mean that Core Data can be more convenient for complex queries. It would be great to have some senior developers' views on this subject.
Thanks in advance.
The question is a bit off-topic for SO (IMO) and is (kind of) asking for opinions but it may be worth a high-level answer. I use both platforms daily.
Core Data and Firebase are two unrelated platforms used to (manage and) store data; it's hard to directly compare them without understanding your use case.
CD is a framework used to model objects in your app. It's the 'front end' of data storage, where the 'back end' could be SQL, flat files, plists, etc. It's more of a single-user concept that stores data locally on the device (it has cloud functionality, but that's a different topic).
Firebase, on the other hand, is a live, event-driven, cloud-based, multi-user-capable NoSQL store. While it offers offline persistence, that's really for situations where you need to be interacting with data while the device is temporarily disconnected from the internet.
It is not correct that:
"Firebase documentation about it not being able to filter and sort at the same time"
But your Firebase structure does depend on what you want to get out of it: if it's structured correctly, it can be filtered and sorted at the same time in a variety of very powerful (and fast) ways.
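For example, a common pattern is to store a pre-computed composite key and order on it. Here is a rough Swift sketch of the idea; the node, field, and key names are purely illustrative, not from any real project:

    import FirebaseDatabase

    // Assumes a /users node where each child stores "department", "lastName" and a
    // pre-computed composite key "department_lastName" (e.g. "sales_Adams").
    let users = Database.database().reference(withPath: "users")

    // Filter to one department AND get the results back sorted by last name,
    // in a single query, by ordering on the composite key and bounding the range.
    users.queryOrdered(byChild: "department_lastName")
        .queryStarting(atValue: "sales_")
        .queryEnding(atValue: "sales_\u{f8ff}")
        .observeSingleEvent(of: .value) { snapshot in
            for case let child as DataSnapshot in snapshot.children {
                print(child.key, child.childSnapshot(forPath: "lastName").value as? String ?? "")
            }
        }

The \u{f8ff} is just a very high Unicode code point used to bound the range to keys that start with "sales_".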
Core Data is really an incredible technology: building relationships between objects is very straightforward, and it offers SQL-like queries for retrieving data.
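For comparison, a filtered and sorted Core Data fetch might look roughly like this sketch (the entity and attribute names are illustrative only):

    import CoreData

    // Sketch only: assumes an "Employee" entity with "department" and "lastName"
    // string attributes and a managed object context obtained elsewhere.
    func salesEmployees(in context: NSManagedObjectContext) throws -> [NSManagedObject] {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Employee")
        // SQL-like filtering and sorting via a predicate and sort descriptors
        request.predicate = NSPredicate(format: "department == %@", "Sales")
        request.sortDescriptors = [NSSortDescriptor(key: "lastName", ascending: true)]
        return try context.fetch(request)
    }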
If you are looking for a database that leverages local storage, go with Core Data or another database that's really strong locally, such as Realm, MySQL, and a number of others.
If you want cloud-based, multi-user, event-driven storage, Firebase is a very strong contender (Realm is another option as well).
I would suggest building a very simple To-Do-type app and using Firebase for storage in one, then building another using Core Data. It should only be a couple of hours of work, but it will really give you some great basic experience with both; you can make a more informed decision from there.
I'm facing difficulties finding the best CEP (complex event processing) product for our problem. We need a distributed CEP solution with shared memory. The main reason for distribution isn't speeding up the process, but having a fallback in case of hardware or software problems on nodes. Because of that, all nodes should keep their own copy of the event history.
Some less important requirements to the CEP product are:
- Open source is a big plus.
- It should run on a Linux system.
- Running in a Java environment would be nice.
Which CEP products are recommended?
A number of commercial, non-open-source products employ a distributed data grid to store the stateful event-processing data in a fault-tolerant manner. My personal experience is with TIBCO BusinessEvents, which internally uses TIBCO ActiveSpaces. Other products claim to do similar things, e.g., Oracle Event Processing uses Oracle Coherence.
As for open-source solutions, I'm not aware of any that offers functionality like this out of the box. With the right skills you might be able to use them in conjunction with a data grid (I've seen people try to use Drools Fusion together with Infinispan), but there are quite a number of complexities you need to think about that a pre-integrated product would take care of for you (transaction boundaries, data access, keeping track of changes, data modeling).
An alternative you might consider, if performance doesn't dictate a distributed/load-balanced setup, is to just run a hot standby, i.e., two engines performing the same CEP logic, with only one engine (the active one) actually triggering outgoing actions. The hot-standby engine would evaluate the same CEP logic so that the data is in its memory, ready to take over in case of failure, but it would not trigger outgoing actions as long as the other engine is running.
I installed InstantObjects for Delphi today and studied the sample application. Everything seems to work fine. Just one question so far: is it possible to map InstantObjects classes to existing database tables instead of creating a new database?
Unless it's changed recently, due to its architecture InstantObjects requires total control over the database, which makes using it against a legacy database somewhat difficult. Your best bet, if you want to carry on using IO, would be to write some kind of import routine from your legacy database, map the field values onto your IO objects, then save them across to the main IO persistence layer. You might get some more information by posting on the InstantObjects newsgroups.
Alternatively, there are other OPFs (e.g. tiOPF), which work better with legacy databases.
I'm developing an iOS application using Core Data. I want to have the persistent store located in a shared location, such as a network drive, so that multiple users can work on the data (at different times i.e. concurrency is not part of the question).
But I also want to offer the ability to work on the data "offline", i.e. by keeping a local persistent store on the iPad. So far, I read that I could do this to some degree by using the persistent store coordinator's migration function, but this seems to imply the old store is then invalidated. Furthermore, I don't necessarily want to move the complete store "offline", but just a part of it: going with the simple "company department" example that Apple offers, I want users to be able to check out one department, along with all the employees associated with that department (and all the attributes associated with each employee). Then, the users can work on the department data locally on their iPad and, some time later, synchronize those changes back to the server's persistent store.
So, what I need is to copy a Core Data object from one store to another, along with all objects referenced through relationships. And this copy process also needs to ensure that if an object already exists in the target persistent store, it is overwritten rather than added as a new object (I am already giving each object a UID for another reason, so I might be able to re-use the UID).
From all I've seen so far, it looks like there is no simple way to synchronize or copy Core Data persistent stores, is that a fair assessment?
So would I really need to write a piece of code that does the following (rough sketch after the list):
1. retrieve object "A" through a MOC
2. retrieve all objects, across all entities, that have a relationship to object "A"
3. instantiate a new MOC for the target persistent store
4. for each object retrieved, check whether the object already exists in the target store
5. if the object exists, overwrite it with the attributes from the object retrieved in steps 1 & 2
6. if the object doesn't exist, create it and set all attributes as per the object retrieved in steps 1 & 2
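In code, I imagine the core "upsert" part of steps 4-6 looking roughly like this sketch (assuming, as mentioned, that every entity has a "uid" string attribute; everything else is illustrative, and the relationship traversal from step 2 is left out):

    import CoreData

    // Copies one object from the source store into the target store, overwriting an
    // existing object with the same uid if there is one. Assumes every entity has a
    // "uid" string attribute; traversing relationships (step 2) is left out here.
    func upsertCopy(of source: NSManagedObject,
                    into targetContext: NSManagedObjectContext) throws -> NSManagedObject {
        let entityName = source.entity.name!
        let uid = source.value(forKey: "uid") as! String

        // Step 4: check whether the object already exists in the target store.
        let request = NSFetchRequest<NSManagedObject>(entityName: entityName)
        request.predicate = NSPredicate(format: "uid == %@", uid)
        request.fetchLimit = 1

        // Steps 5 & 6: overwrite the existing object, or insert a new one.
        let target = try targetContext.fetch(request).first
            ?? NSEntityDescription.insertNewObject(forEntityName: entityName, into: targetContext)
        for (name, _) in source.entity.attributesByName {
            target.setValue(source.value(forKey: name), forKey: name)
        }
        return target
    }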
While it's not the most complicated thing in the world to do, I would still have thought that this requirement for "online / offline editing" is common enough for some standard functionality to be available for synchronizing parts of persistent stores.
Your points of view are greatly appreciated,
thanks,
da_h-man
I was just half-kidding with the comment above. You really are describing a pretty hard problem: it's very difficult to nail this sort of synchronization, and there's seldom, in any development environment, going to be a turn-key solution that will "just work". I think your pseudo-code description above is a pretty accurate description of what you'll need to do. Although some of the work of traversing the relationships and checking for existing objects can be generalized, you're talking about some potentially complicated exception-handling situations. For example, when updating an object, if only 1 out of 5 related objects is somehow out of date, do you throw away the update or apply part of it? You say "concurrency" is not part of the question, but if multiple users can "check out" objects at the same time, then unless you plan to have a locking mechanism on those, you will start having conflicts when trying to make updates.
Something to check into is the new set of Core Data features for leveraging iCloud. I doubt that's going to help with your problem, but it's generally related.
Since you want to be out on the network with your data, another thing to consider is whether Core Data is the right fit for your problem in general. Core Data is very much a technology designed to support the UI and the MVC pattern in general, so if your data needs are not especially bound to the UI, you might consider another type of DB solution.
If you are in fact leveraging Core Data in significant ways beyond just modeling, in terms of driving your UI, and you want to stick with it, I think you are correct in your analysis: you're going to have to roll your own solution. I think it will be a non-trivial thing to build and test.
An option to consider is CouchDB and an iOS implementation called TouchDB. It would mean adopting more of a document-oriented (JSON) approach to your problem, which may in fact be suitable, based on what you've described.
From what I've seen so far, I reckon the best approach is RestKit. It offers a Core Data wrapper that uses JSON to move data between remote and local stores. I haven't fully tried it yet, but from what the documentation says, it sounds quite powerful and ideally suited to my needs.
You should definitely check out these things:
- Parse.com, a cloud-based data store
- PFIncrementalStore (https://github.com/sbonami/PFIncrementalStore), a subclass of NSIncrementalStore which allows your persistent store coordinator to store data both locally and remotely (on the Parse cloud) at the same time
All of this is well documented. Also, Parse.com is going to release a local datastore for its iOS SDK (http://blog.parse.com/2014/04/30/take-your-app-offline-with-parse-local-datastore/), which is going to help keep your data synced.
I want to post some child rows in memory, and then conditionally post them (or not) to an underlying SQL database, depending on whether or not a parent row is posted. I don't need a full ORM, but maybe just this:
1. The user clicks Add doctor. The Add doctor dialog box opens.
2. Before clicking OK on Add doctor, within the Add doctor dialog, the user adds one or more patients, which persist in memory only.
3. The user clicks OK in the Add doctor window. Now all the patients are stored, plus the new doctor.
4. If the user clicked Cancel on the doctor window, all the doctor and patient info is discarded.
Try, if you like, to mentally imagine how you might do the above using Delphi data-aware controls and TADOQuery or other ADO objects. If there is a non-ADO-specific way to do this, I'm interested in that too; I'm just throwing ADO out there because I happen to be using MS SQL Server and ADO in my current applications.
At a previous employer, where I worked for a short time, they had a class called TMasterDetail that was specifically written to add the above to ADO recordsets. It worked sometimes, and other times it failed in some really interesting and difficult-to-fix ways.
Is there anything built into the VCL, or any third-party component, that has a robust way of doing this technique? If not, does what I'm talking about above require an ORM? I thought ORMs were considered "bad" by lots of people, but the above is a pretty natural UI pattern that might occur in a million applications. If I were using a non-ADO, non-Delphi-db-dataset style of working, the above wouldn't be a problem in almost any persistence layer I might write, and yet when databases with primary keys that use identity values to link the master and detail rows get into the picture, things get complicated.
Update: Transactions are hardly ideal in this case. (Commit/Rollback is too coarse a mechanism for my purposes.)
You're asking two separate questions:
- How do I cache updates?
- How can I commit updates to related tables at the same time?
Cached updates can be accomplished in a number of different ways. Which one is best depends on your specific situation:
ADO Batch Updates
Since you've already stated that you're using ADO to access the data, this is a reasonable option. You simply need to set LockType to ltBatchOptimistic and CursorType to either ctKeySet or ctStatic before opening the dataset. Then call TCustomADODataSet.UpdateBatch when you're ready to commit.
Note: the underlying OLE DB provider must support batch updates to take advantage of this. The provider for SQL Server fully supports it.
I know of no other way to enforce the master/detail relationship when persisting the data than to call UpdateBatch sequentially on both datasets.
Parent.UpdateBatch;
Child.UpdateBatch;
Client Datasets
Data caching is one of the primary reasons for TClientDataset's existence, and synchronizing a master/detail relationship isn't difficult at all.
To accomplish this you define the master/detail relationship on two dataset components as usual (in your case TADOQuery or TADOTable). Then create a single provider and connect it to the master dataset. Connect a single TClientDataset to the provider and you're done. TClientDataset interprets the detail dataset as a nested dataset field, which can be accessed and bound to data-aware controls just like any other dataset.
Once this is in place you simply call TClientDataset.ApplyUpdates and the client dataset will take care of ordering the updates for the master/detail data correctly.
ORMs
There is a lot that can be said about ORMs. Too much to fit into an answer on StackOverflow so I'll try to be brief.
ORMs have gotten a bad rap lately. Some pundits have gone so far as to label them an anti-pattern. Personally, I think this is a bit unfair. Object-relational mapping is an incredibly difficult problem to solve correctly. ORMs attempt to help by abstracting away a lot of the complexity involved in transferring data between a relational table and an instance of an object. But as with everything else in software development, there are no silver bullets, and ORMs are no exception.
For a simple data entry application without a lot of business rules an ORM is probably overkill. But as an application becomes more and more complex an ORM starts to look more appealing.
In most cases you'll want to use a third-party ORM rather than rolling your own. Writing a custom ORM that perfectly fits your requirements sounds like a good idea, and it's easy to get started with simple mappings, but you'll soon start running into issues like parent/child relationships, inheritance, caching and cache invalidation (trust me, I know this from experience). Third-party ORMs have already encountered these issues and spent an enormous amount of resources solving them.
With many ORMs you trade code complexity for configuration complexity. Most of them are actively working to reduce the boilerplate configuration by turning to conventions and policies. For example, if you name all your primary keys Id, then rather than having to map each table's Id column to a corresponding Id property for each class, you simply tell the ORM about this convention and it assumes that all tables and classes it's aware of follow it. You only have to override the convention for the specific cases where it doesn't apply. I'm not familiar with all of the ORMs for Delphi, so I can't say which support this and which don't.
In any case, you'll want to design your application architecture so that you can push off the decision of which ORM framework (or, for that matter, any framework) to use for as long as possible.