The question is fairly simple, but it is mainly directed at ProcessMaker experts.
I need to extract batches of data from ProcessMaker to perform analysis later.
Currently we are on v3.3, which has a very well-documented database model and a not-so-well-documented REST API.
Having no clue about the best approach, my guess is that ProcessMaker developers are encouraged to use a direct database connection to fetch data in batches.
However, looking ahead to the v4 upgrade, I see that the database model is no longer part of the official documentation, and neither is the "Data Integration" chapter. Everything points toward using the REST API for anything data-related.
So, I am puzzled. Which way to go for v3.3 and v4? REST API or direct DB connection?
ProcessMaker 4 was designed and built as an API-first application. The idea is that everything that can and should be done through the application should be done via the API; in fact, this is the way all modern systems are designed. The days of accessing the database directly are gone, and for good reason. The API is a contract: it says that if you make a request in a certain way, you will get a certain response. On the other hand, we cannot guarantee that the database itself will always have the same tables. As a result, if you access the database directly and we later decide to change the database structure, you will be out of luck, and anything you built that accesses the database directly will potentially fail.
So the decision is clear. V4 is a modern architecture built with modern tooling; it performs and scales better than V3, and it is the future of ProcessMaker. We highly recommend upgrading to this version, staying on our mainline, and using the API for all activities related to the data models.
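As a rough illustration of what that looks like in practice (the host, endpoint path, query parameters, and token below are placeholders rather than the documented ProcessMaker routes, so check the REST API reference for the real ones), a small Ruby script pulling a batch of records might be:

```ruby
require "net/http"
require "json"

# Placeholder URL: substitute your own host and the documented endpoint/parameters.
uri = URI("https://workflow.example.com/api/1.0/workflow/cases?start=0&limit=100")

request = Net::HTTP::Get.new(uri)
request["Authorization"] = "Bearer <access-token>"  # OAuth 2.0 token issued by the API

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }

# The body is JSON whose shape is fixed by the API contract, so batch extraction
# code built on it will not break when internal tables change.
records = JSON.parse(response.body)
puts "Fetched #{records.length} records in this batch"
```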
My question might seem a bit naive, but as a beginner iOS developer, I'm starting to think that Core Data is replaceable by the Firebase Realtime Database (or Firestore in the future). I used both of them in two separate projects, and after activating the offline feature in Firebase, I got the same results (that is, the data was saved to the device without the need for an internet connection). I think I read something in the Firebase documentation about it not being able to filter and sort at the same time, which would probably mean that Core Data can be more convenient for complex queries. It would be great to have some senior developers' views on this subject.
Thanks in advance.
The question is a bit off-topic for SO (IMO) and is (kind of) asking for opinions but it may be worth a high-level answer. I use both platforms daily.
Core Data and Firebase are two unrelated platforms used to (manage and) store data; it's hard to directly compare them without understanding your use case.
CD is a framework used to model objects in your app. It's the 'front end' of data storage, where the 'back end' could be SQL, flat files, plists etc. It's more of a single user concept which stores data locally on the device (it has cloud functionality but that's a different topic).
Firebase on the other hand is a live, event driven, cloud based, multi user capable NoSQL storage. While it offers off-line persistence, that's really for situations where you need to be interacting with data when the device is temporarily disconnected from the internet.
It is not correct that:
Firebase documentation about it not being able to filter and sort at the same time
But, your Firebase structure is dependent on what you want to get out of it - if it's structured correctly, it can be filtered and sorted at the same time in a variety of very powerful (and faaast) ways.
Core Data is really an incredible technology: building relationships between objects is very straightforward, and it offers SQL-like queries for retrieving data.
If you are looking for a database that leverages local storage, go with Core Data or another database that's really strong locally, such as Realm, MySQL, and a number of others.
If you want cloud-based, multi-user, event-driven storage, Firebase is a very strong contender (Realm is another option as well).
I would suggest building a very simple To-Do type app using Firebase for storage, and then building another using Core Data. It should only be a couple of hours of work, but it will really give you some great basic experience with both; you can make a more informed decision from there.
First of all: I am a beginner in Swift, so please excuse me if my question seems obvious to many people, but I could not find a satisfying answer after doing some research on the subject of "caching".
I have a very simple GraphCool Backend setup (using GraphQL query language) and Apollo:
https://github.com/apollographql/apollo-ios
on the client side in my Swift project.
As my database is going to grow over time, I am already thinking about how to keep data volume down for my users when making GraphQL queries, i.e. I do not want to fetch all the data every time the user restarts the app, but instead cache some data in local storage. Since I am already using Apollo, I would like to know whether Apollo allows caching on local storage, or do I have to use a third-party library for that?
I am aware that there are 'cachePolicy' options in Apollo, but as I understand them, they are only for in-memory caching.
(I did not include any of my app's code in this question because I don't think it adds anything; my question is basically about caching in Swift.)
I have to integrate various legacy applications with some newly introduced parts; these are silos of information that have been built at different times with varying architectures. At times these applications may need to get data from other systems, if it exists, and display it to the user within their own screens, based on business needs.
I was looking to see if it's possible to implement a generic federation engine that abstracts the aggregation of data from various other OData endpoints and provides a single version of the truth.
A simplistic example could be as below.
I am not really looking to do ETL here, as that may introduce some data-related side effects in terms of staleness, etc.
Can someone share some ideas on how this can be achieved, or point me to any article on the net that demonstrates such a concept?
Regards
Kiran
Officially, the answer is to use either the reflection provider or a custom provider.
Support for multiple data sources (odata)
Allow me to expose entities from multiple sources
To decide between the two approaches, take a look at this article.
If you decide that you need to build a custom provider, the referenced article also contains links to a series of other articles that will help you through the learning process.
Your project seems non-trivial, so in addition I recommend looking at other resources like the WCF Data Services Toolkit to help you along.
By the way, from an architecture standpoint, I believe your idea is sound. Yes, you may have some domain logic behind OData endpoints, but I've always believed this logic should be thin as OData is primarily used as part of data access layers, much like SQL (as opposed to service layers which encapsulate more behavior in the traditional sense). Even if that thin logic requires your aggregator to get a little smart, it's likely that you'll always be able to get away with it using a custom provider.
That being said, if the aggregator itself encapsulates a lot of behavior (as opposed to simply aggregating and re-exposing raw data), you should consider using another protocol that is less data-oriented (but keep using the OData backends in that service). Since domain logic is normally heavily specific, there's very rarely a one-size-fits-all type of protocol, so you'd naturally have to design it yourself.
However, if the aggregated data is exposed mostly as-is or with essentially structural changes (little to no behavior besides assembling the raw data), I think using OData again for that central component is very appropriate.
Obviously, and as you can see in the comments to your question, not everybody would agree with all of this -- so as always, take it with a grain of salt.
I want to write a web app with Rails that uses RDF to represent linked data, but I really don't know the best approach to store RDF graphs in a database for persistent storage. I also want to use something like paper_trail to provide versioning of database objects.
I read about RDF.rb and activeRDF. But RDF.rb does not include a layer to store data in a database. What about activeRDF?
I'm new to RDF. What is the best approach to handle large RDF graphs with rails?
Edit:
I found 4store and AllegroGraph, which fit Ruby on Rails. I read that 4store is entirely free, while AllegroGraph is limited to 50 million triples in the free version. What are the advantages of each of them?
Thanks.
Your database survey is quite incomplete. There are also BigData, OWLIM, Stardog, Virtuoso, Sesame, Mulgara, and TDB and SDB, which are provided by Jena.
To clarify, Fuseki is just a server component on top of a backend that supports the Jena API, providing support for the SPARQL protocol. Generally, since you're using Ruby, this is how you will interact with a database: via HTTP using the SPARQL protocol. Probably every single database supports the SPARQL HTTP protocol for querying, and many will support something in the ballpark of either the SPARQL update protocol, the graph store protocol, or a similar custom HTTP protocol for handling updates.
So if you're set on using Rails, your best bet is to pick a database, work out a simple wrapper for the HTTP protocol (perhaps forking support into an existing Ruby library if one exists), and build your application on top of that support.
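To make that concrete, here is a minimal sketch of the kind of HTTP wrapper I mean, assuming a Fuseki-style query endpoint at http://localhost:3030/ds/query and JSON results; adjust the URL and the result handling to whichever store you pick:

```ruby
require "net/http"
require "json"
require "uri"

# Assumed endpoint; every store exposes its own SPARQL query URL.
ENDPOINT = URI("http://localhost:3030/ds/query")

def sparql_select(query)
  uri = ENDPOINT.dup
  uri.query = URI.encode_www_form(query: query)

  request = Net::HTTP::Get.new(uri)
  request["Accept"] = "application/sparql-results+json"

  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  JSON.parse(response.body)["results"]["bindings"]
end

# Each binding row maps variable names to { "type" => ..., "value" => ... } terms.
sparql_select("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10").each do |row|
  puts row.map { |name, term| "#{name}=#{term['value']}" }.join("  ")
end
```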
Versioning is something that's not readily supported in a lot of systems. I think there is still a lot of thought going into how to do it properly in an RDF database. So likely, if you want versioning in your application, you're going to have to do something custom.
My company has this Huge Database that gets fed with (many) events from multiple sources, for monitoring and reporting purposes. So far, every new dashboard or graphic from the data is a new Rails app with extra tables in the Huge Database and full access to the database contents.
Lately, there has been an idea floating around of having external (as in, not our company but sister companies) clients to our data, and it has been decided we should expose a read-only RESTful API to consult our data.
My point is - should we use an API for our own projects too? Is it overkill to access a RESTful API, even for "local" projects, instead of direct access to the database? I think it would pay off in terms of unifying our team's access to the data - but is it worth the extra round-trip? And can a RESTful API keep up with the demands of running 20 or so queries per second and exposing the results via JSON?
Thanks for any input!
I think there's a lot to be said for consistency. If you're providing an API for your clients, it seems to me that by using the same API yourself you'll understand it better with respect to supporting it for your clients, you'll be testing it regularly (beyond your regression tests), and you're sending a message that it's good enough for you to use, so it should be fine for your clients.
By hiding everything behind the API, you're at liberty to change the database representations and not have to change both API interface code (to present the data via the API) and the database access code in your in-house applications. You'd only change the former.
Finally, such performance questions can really only be addressed by trying it and measuring. Perhaps it's worth knocking together a prototype API system and studying it under load?
I would definitely go down the API route. This presents an easy-to-maintain interface to ALL the applications that will talk to your application, including validation, etc. Sure, you can ensure database integrity with column restrictions and stored procedures, but why maintain that as well?
Don't forget: you can also cache the API calls in the file system, in memory, or using memcached (or any other service). Where datasets have not changed (check with updated_at or ETags), you can simply return cached versions for tremendous speed improvements. The addition of ETags in a recent application I developed saw HTML load time go from 1.6 seconds to 60 ms.
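For instance, in a Rails controller a conditional GET check is only a few lines (a sketch; the Event model and JSON rendering are stand-ins for whatever your dashboards actually read):

```ruby
class EventsController < ApplicationController
  def index
    @events = Event.order(created_at: :desc).limit(100)

    # Responds with 304 Not Modified when the client's ETag / If-Modified-Since
    # headers still match, so unchanged datasets cost almost nothing to serve.
    if stale?(etag: @events, last_modified: @events.maximum(:updated_at))
      render json: @events
    end
  end
end
```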
Off topic: An idea I have been toying with is dynamically loading API versions depending on the request. Something like this would give you the ability to dramatically alter the API while maintaining backwards compatibility. Since the different versions are in separate files it would be simple to maintain them separately.
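Something along these lines in the routes file would do it (a sketch only; the Accept-header convention and the v1/v2 controller modules are my own assumptions, not an established pattern):

```ruby
# config/routes.rb
Rails.application.routes.draw do
  wants_v2 = lambda { |request| request.headers["Accept"].to_s.include?("version=2") }

  # Requests asking for version 2 are routed to app/controllers/v2/*.
  scope module: :v2, constraints: wants_v2 do
    resources :events, only: [:index, :show]
  end

  # Everyone else falls through to the original v1 controllers.
  scope module: :v1 do
    resources :events, only: [:index, :show]
  end
end
```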
Also, if you use the API internally, you should be able to reduce the amount of code you have to maintain, since you will just be maintaining the API rather than both the API and your own internal methods for accessing the data.
I've been thinking about the same thing for a project I'm about to start, whether I should build my Rails app from the ground up as a client of the API or not. I agree with the advantages already mentioned here, which I'll recap and add to:
Better API design: you become a user of your own API, so it will be a lot more polished when you decide to open it;
Database independence: with reduced coupling, you could later switch from an RDBMS to a Document Store without changing as much;
Comparable performance: Performance can be addressed with HTTP caching (although I'd like to see some numbers comparing both).
On top of that, you also get:
Better testability: your whole business logic is black-box testable with basic HTTP request/response (see the sketch after this list). Headless browsers / Selenium become responsible only for application-specific behavior;
Front-end independence: you not only become free to change the database representation, you become free to change your whole front-end, from vanilla Rails-with-HTML-and-page-reloads, to sprinkled-Ajax, to full-blown pure JavaScript (e.g. with GWT), all sharing the same back-end.
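As a sketch of what I mean by black-box testing over HTTP (the /v1/events endpoint and the server running on localhost:3000 are assumptions for illustration):

```ruby
require "net/http"
require "json"
require "minitest/autorun"

class EventsApiTest < Minitest::Test
  # Exercises the API exactly as a client would, with no knowledge of the
  # Rails internals behind it.
  def test_index_returns_a_json_collection
    response = Net::HTTP.get_response(URI("http://localhost:3000/v1/events"))

    assert_equal "200", response.code
    assert_kind_of Array, JSON.parse(response.body)
  end
end
```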
Criticism
One problem I originally saw with this approach was that it would make me lose all the amenities and flexibility that ActiveRecord provides, with associations, named_scopes and all. But using the API through ActiveResource brings a lot of the good stuff back, and it seems like you can also have named_scopes. Not sure about associations.
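For reference, wiring a model to the API through ActiveResource is roughly this small (the site URL and the Event resource are placeholders):

```ruby
require "active_resource"

class Event < ActiveResource::Base
  self.site = "https://api.example.com"
end

# Reads feel close to ActiveRecord for simple cases...
events = Event.find(:all, params: { limit: 20 })
event  = Event.find(42)

# ...though associations and named_scopes don't carry over automatically.
```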
More Criticism, please
We've all been singing the glories of this approach, but even though an answer has already been picked, I'd like to hear from other people what possible problems this approach might bring, and why we shouldn't use it.