offline web application design recommendation - ruby-on-rails

I want to know which is the best architecture to adopt for this case :
I have many shops that connect to a web application developed using Ruby on Rails.
internet is not reachable all the time
The solution was to develop an offline system which requires installing a local copy of the distant database.
All this wad already developed.
Now what I want to do :
Work always on the local copy of the database.
Any change on the local database should be synchronized with distant database.
All the local copies should have the same data in other local copies.
To resolve this problem I thought about using a JMS like software eventually Rabbit MQ.
This consists on pushing any sql request into a JMS queue that will be executed on the distant instance of the application which will insert into the distant DB and push the insert or SQL statement into another queue that will be read by all the local instances. This seems complicated and should slow down the application.
Is there a design or recommendation that I must apply to resolve this kind of problem ?

You can do that but essentially you are developing your own replication engine. Those things can be a bit tricky to get right (what happens if m1 and m3 are executed on replica r1, but m2 isn't?) I wouldn't want to develop something like that unless you are sure you have the resources to make it work.
I would look into existing off-the shelf replication solution. If you are already using a SQL DB it probably has some support for it. Look here for more details if you are using MySQL
Alternatively, if you are willing to explore other backends, I heard that CouchDB has great support for replication. I also heard of people using git libraries to do that sort of thing.
Update: After your comment, I realize you already use MySql replication and are looking for solution for re-syncing the databases after being offline.
Even in that case RabbitMQ doesn't help you at all since it requires constant connection to work, so you are back to square one. Easiest solution would be to just write all the changes (SQL commands) into a text file at a remote location, then when you get connection back copy that file (scp, ftp, emaill or whatever) to master server, run all the commands there and then just resync all the replicas.
Depending on your specific project you may also need to make sure there are no conflicts when running commands from different remote location but there is no general technical solution to this. Again, depending on the project, you may want to cancel one of the transactions, notify the users that it happened and so on.

I would recommend taking a look at CouchDB. It's a non-SQL database that does exactly what you are describing automatically. It's used especially in phone applications that often don't have internet or data connectivity. The idea is that you have a local copy of a CouchDB database and one or more remote CouchDB databases. The CouchDB server then takes care of teh replication of the distributed systems and you always work off your local database. This approach is nice because you don't have to build your own distributed replication engine. For more details I would take a look at the 'Distributed Updates and Replication' section of their documentation.

Related

micro service architecture database backup and restore

I'm working on a big project, which is based on micro service architecture , so consider I have 10 service which some of them have their own database,
these databases are in different technologies (mysql, mongodb , elastic, ... )
so what is the best practice for backup and restore collection of services?
the real problem is these databases are related to each other, for example in my logic backend server I keep oauhId of each user which comes from oauth server,
now consider restore these two databases separately and now my users db in logic server contains some users which there aren't any related records to them on oauth server,
just for your information, I'm using docker , docker-compose, docker swarm for my service orchestration.
As an idea: check how your services depend on each other. If your dependencies are acyclic, you might be able to backup all your data outside-in or inside-out, without running into consistency issues.
Doing so would guarantee you to have no elements in services depending on an inner one after your restore.
If your services show cyclic dependencies, you might be better serviced to have each service redundantly (e.g. master slave replication). Then you can take down the slave instances, taking a backup from the whole lot of slaves while they are offline. That would allow you to create an atomic backup accross all services. However your quality of the backup is then based on the quality of your master slave replication at each service.
Lastly you could keep record of change per service, plus a full backup. Thus you can write your rollback and the start applying the record of change until you reach a consistent state accross the service instances. I think that requires you to have logical dependencies (request identifier) that allows you to correlate the record of change elements (i.e. apply them across the services without the risk to apply them in a way that defies the logical dependencies that occured when clients actually interacted with your services).
I hope these ideas can help you solve your problem :)

MongoDB: Different applications connecting to different replicas

We use Mongodb as the central Database for our application; a consumer facing mobile app. At present its a 7-member replica-set with replica-set-1 being the master at the moment. The backend which connects to the mongo replica is build in Ruby on Rails and we use mongoid as the ODM.
There are mainly 3 pieces connecting to the MongoDB replica-set.
The consumer application
The Admin and customer care management application
The Data retrieval application ( for analytics and such purposes )
All these 3 apps connect to the same replica set as of now.
What I would like to know is whether is it possible to connect different applications to specific replicas.
For example, the mobile app connects to the primary for writes and the replicas 2-4
to read; the customer care management application connects to the primary
( for writes ) and replicas 5-7 for reads.
I dont think explicitly mentioning specific replicas in the mongoid.yml configuration is working. Even though I have already mentioned only replica-set-7 in the mongoid hosts file for the data retrieval application, I do see certain queries in the log file of replica-set-2 and 3.
So obviously, MongoDB decides the criteria to distribute the queries among its replicas despite the configuration specified at the client mongoid end.
I would really love to know if such a thing is possible at all using MongoDb and mongoid as it would help us solve a lot of our load issues. Right now heavy queries from the customer care and data retrieval apps affects the consumer facing mobile app as well; as the reads are not segragated. So basically would like to separate out the reads.
Also, if at all this is possible, I would again have my eyes raised on any possible pitfalls for this; specially that all 3 applications can write to the DB. For example, replica-3 suddenly becomes the primary after an election and its not explicitly mentioned in the configuration of the data retrieval application. What might happen there would become a concern.
I am not at all sure whether this is possible; but just wanted to know if theres a way to figure out this. Any help would be really appreciable.
When you connect to any member of a replica set, the client is told the full state of the replica set and can connect to any of them. The initial set of hosts are just the seeds for that process - as long as your application can reach one of those hosts, it doesn't matter which hosts are in that configuration.
Mongo does have the concept of tagged replica set members. When creating a connection or executing a query you can specify the tags to use to select the replica set member to read from.

How to build a server database for my application

I'm new to server side programming with a background in iOS. So I want to know where to start.
Here I tried to list some specific questions:
Can I just create a local database and practice on that?
Do the local databases and databases on remote server work the same?
If no, how can I choose which server I can use? (I went through the webpages of AWS cloud service and found they are really overwhelming.)
Arslan's answer is great, but I would like to add to it a bit. You mentioned a Chatroom, so in that case you should look into socket programming. The reason why I bring this up is, while no one has outright said it, you shouldn't create a chat server by read / writing to a database. It's much better to just keep it in memory and log to the database on an as need basis.
AWS is a fantastic solution and they have a lot of different solutions for different situations. You should look at using EC2, which is their server program. They have a free tier of it so that you can use and / or you can test locally. I suggest testing locally then pushing up to a free tier every now and then to make sure everything is running properly.
Also I would look into using CloudKit for data base storage. If you don't need instantaneous communication, it's far easier to use Apple's built in system rather than setup a server and manage it.
links: CloudKit, AWS EC2 Free Tier
As it happens I'm actually working on a ChatRoom Server program, here's the link to github. It is written in C++ so I recommend using it as a reference unless you want to write your own socket in C++.
Can I just create a local database and practice on that?
Sure. You can install a server locally on your machine ( there are plenty of available ) and through 'localhost:3000' or 'localhost' you can access the root of your server depending upon what you are using at server end. You can then configure your server to respond to a particular message.
Do the local databases and databases on remote server work the same?
Of course, the work they way is almost same. The difference you have stated yourself: remote.
If no, how can I choose which server I can use? (I went through the webpages of AWS cloud service and found they are really overwhelming.)
I would suggest you to start from the local server. But first you have to choose language: PHP, Ruby, Python - it depends upon you and your personal preferences. You can also use something like Parse.com. Parse.com is free up to 30 requests/second, and you can use Objective-C to send and retrieve data from the server with a few very easy steps. And of course, parse.com handles singing up and logging in a user for you , all you have to do is to write a code of few lines in your iOS app.
Download Apple's free Server.app from the Appstore, it wraps one of the best database management systems: PostgreSQL. Start it with this Terminal command:
sudo serveradmin start postgres
More info on these pages:
http://support.apple.com/kb/HT5583
http://www.postgresql.org

Should an iphone app communicate directly with a cassandra backend?

Obviously there are multiple steps and phases of implementing such a thing.
I was thinking I would eventually have a webserver that takes http json requests from the ios app, and then queries the cassandra backend and sends results back. I could load balance and all that fancy stuff still, and also provide a logical layer on server side, and keep the client app lightweight.
I'm not sure i understand how cassandra clients fit though. It seems like the cassandra objective c client could eliminate the need for the above approach.
I saw another question and answer but it wasnt clear, perhaps because it varys on the need.
An iPhone app should not directly connect to a Cassandra backend or any other DB store.
First of all, talking to a database often requires adapting a very specific binary protocol (for Cassandra in particular, binary CQL or Thrift). Writing an adapter that would let your Objective-C app communicate in this binary protocol is a major piece of work, and could easily cost more than the rest of your app in effort. If you hide the DB behind a web-server, however, you will be able to select from a variety of existing adapters available in different server-side languages, meaning that you don't need to redo all that low-level work. You'll only be responsible for a relatively small piece of server-side code that would translate your REST queries and forward them to one of the Cassandra adapters (which expose easy-to-use interfaces).
Secondly, if you wanted to connect to a remote database from the phone, your database server would have to open its ports to the internet at large, which is a very bad security practice, even if you use SSL and user credentials. Again, if you hide behind a web server, you will be putting in a layer of technology that has evolved for decades to remain secure on the public internet.
Finally, having your phone talk to Cassandra directly is a poor architectural pattern. When you write apps that communicate on the internet, you want them to know as little as possible about each other, only how to talk to each other (preferably in a standard protocol). That way you can replace or upgrade individual components while keeping everything else the same. This may not sound like a lot, but is actually the main reason why phones, or web browsers, don't directly talk to databases. (If this setup were a good idea in principle, the first two problems could be easily solved given enough engineering effort.)
The approach you first suggested with JSON and the web server is the only correct way to go.
Use something like RESTful API, there are many reasons for that.
if your servers ip addresses change you have to update all client, if you add more nodes you will need to update all clients, if you decide to upgrade your cassandra and some functions change your clients will break and you need to update all clients.

A way to synchronize data between an external device and a database?

In the application I am designing, I have to communicate with a device and store a history of data readings in a database. The device is essentially a sensor that spits out numbers via the serial port. The user end of the application is a RubyOnRails interface that allows the user to view this data and configure the device.
I am wondering what kind of connection between the database and the device you could recommend for this kind of a setup.
Up to this point, I had a custom application running on a host computer (a computer with the device connected directly through a serial port) that would serve as a bridge to a MySQL database. The application would connect directly to the MySQL database and execute queries. It works fairly well, but I am not sure if this is the best solution.
The only other alternative I see is to have an intermediate application that my custom application could connect to, instead of directly going to the database. This could be a part of the main application, or something separate. Would this be a better solution?
Would you recommend another approach?
Thank you,
I have a similar structure, although I fetch my data from a Web Service. The way I organize is:
Create classes in lib/imports, eg DailyDataImport, DailyDataSummarize (you can organize the hierarchy and names as per your wish or willingness).
Create a rake task under a new namespace, say import and add it to your cron job depending frequency. Take a look at Cron in Ruby. Its helpful.
This allows me to have a better control over what goes in my database.
Some questions to consider:
What schedule does the Device follow
to populate the data?
Do you need the data as-is or you
want a little control over it or you
need to process it, like summarizing
and aggregating etc.
MS SQL Server 2008 has great data synchronisation support.
SQL Server 2008 Express is free and can act as a replication subscriber (but not publisher) for clients.
Microsoft Sync Framework

Resources