A way to synchronize data between an external device and a database? - ruby-on-rails

In the application I am designing, I have to communicate with a device and store a history of data readings in a database. The device is essentially a sensor that spits out numbers via the serial port. The user end of the application is a RubyOnRails interface that allows the user to view this data and configure the device.
I am wondering what kind of connection between the database and the device you could recommend for this kind of a setup.
Up to this point, I had a custom application running on a host computer (a computer with the device connected directly through a serial port) that would serve as a bridge to a MySQL database. The application would connect directly to the MySQL database and execute queries. It works fairly well, but I am not sure if this is the best solution.
The only other alternative I see is to have an intermediate application that my custom application could connect to, instead of directly going to the database. This could be a part of the main application, or something separate. Would this be a better solution?
Would you recommend another approach?
Thank you,

I have a similar structure, although I fetch my data from a Web Service. The way I organize is:
Create classes in lib/imports, e.g. DailyDataImport, DailyDataSummarize (you can organize the hierarchy and names however you prefer).
Create a rake task under a new namespace, say import, and add it to your cron job at whatever frequency you need. Take a look at Cron in Ruby; it's helpful.
This allows me to have better control over what goes into my database.
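For illustration, a minimal sketch of what that might look like (the import:daily task name and the run! method on DailyDataImport are assumptions, not from the original answer):

```ruby
# lib/tasks/import.rake
namespace :import do
  desc "Fetch the latest readings and store them in the database"
  task daily: :environment do
    # DailyDataImport lives under lib/imports; run! is an assumed entry point
    # that fetches the raw readings and writes them through ActiveRecord.
    DailyDataImport.new.run!
  end
end
```

It could then be scheduled from cron, e.g. `0 2 * * * cd /path/to/app && bundle exec rake import:daily RAILS_ENV=production`, or via the whenever gem.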
Some questions to consider:
What schedule does the device follow to populate the data?
Do you need the data as-is, do you want some control over it, or do you need to process it (e.g. summarizing and aggregating)?

MS SQL Server 2008 has great data synchronisation support.
SQL Server 2008 Express is free and can act as a replication subscriber (but not publisher) for clients.
Microsoft Sync Framework

Related

gpfdist vs gpload greenplum

I am setting up Greenplum for the first time and following the documentation. I want to set up a connection from SQL to a Greenplum database, and I am currently figuring out the best way to achieve this. I came across gpfdist and gpload.
How are the two different? Both use external tables, both work on the slave nodes, and both are used for parallel loading. So is there any advantage of using one over the other?
Answering your question about "I want to setup connection from sql to greenplum database"...
It's ambiguous which SQL database you are referring to.
Also, there are no direct connectivity drivers available to connect a non-Greenplum database to a Greenplum database.
However, if you want to migrate data from Oracle to Greenplum, you can use Informatica's Fast Clone tool.
To answer the second part of your question regarding gpfdist and gpload: GPFDIST is a file distribution process that runs on the host system and serves files in parallel to many segments. When you initialise an external table to read from or write to a file, you need to specify which protocol will serve the file; in your case it will be GPFDIST. There are other protocols too, such as FTP, GPHDFS, and HTTP.
GPLOAD is a wrapper utility that makes your work easier by automatically creating the gpfdist processes and external tables.
Also be aware that GPLOAD can only create readable external tables.
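To make the gpfdist side concrete, here is a hedged sketch of defining and loading through a readable external table from Ruby with the pg gem; the host names, table, and columns are made up, and it assumes a gpfdist process was started separately (e.g. `gpfdist -d /data -p 8081` on the ETL host):

```ruby
require "pg"

# Hypothetical connection details; adjust to your Greenplum master host.
conn = PG.connect(host: "gp-master", dbname: "analytics", user: "gpadmin")

# A readable external table backed by the gpfdist process on the ETL host.
conn.exec(<<~SQL)
  CREATE EXTERNAL TABLE ext_sales (id int, amount numeric, sold_at date)
  LOCATION ('gpfdist://etl-host:8081/sales.csv')
  FORMAT 'CSV' (HEADER);
SQL

# The segments pull the file from gpfdist in parallel during this load.
conn.exec("INSERT INTO sales SELECT * FROM ext_sales")
conn.close
```

gpload automates essentially these same steps from a YAML config file.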
gpfdist and gpload do essentially the same thing. With gpfdist you do it manually, while with gpload you can automate the activities by making entries in a config (YAML) file.
GPLOAD is a wrapper around GPFDIST, so when you load data via gpload it uses gpfdist internally.
If you want to load or migrate data from any other RDBMS to Greenplum using an ETL or migration tool, it will use a normal COPY command by default. If you enable gpload (the latest versions of most ETL and migration tools support it when loading data into Greenplum), the data will be loaded in parallel, using gpfdist internally.

Software setup for hardware enabled application

I have a Raspberry PI that is tightly coupled with a device that I want to control.
The desired setup I want to have would look something like this:
The physical device with interactive hardware controls on the device (speaker, mic, buttons)
A Raspberry PI coupled to the device
On the PI:
A daemon app that reacts to changes from the hardware
A web interface that shows the current state of the device and allows the user to configure it
The system should somehow be able to update itself when new software becomes available (apt-get or some other mechanism).
For the Webinterface I am going to use a rails app, which is not a problem as such. What is not clear to me is the event-driven software that is talking to the hardware through gpio. Firstly, I would prefer to do this using ruby, so that I don't have a big technology gap when developing the solution.
How can I ensure that both apps start up and run in the background when the Raspberry Pi starts?
How do I notify the webapp of an event (e.g. a button was pressed)?
I wonder if it makes sense for the two pieces of software to share a database to communicate.
How do I best set up an auto-update mechanism for both pieces of software without requiring the user to take any action?
Apps
This will be dependent on the operating system
If you install a lightweight version of Linux, you should be able to register your applications to run at startup. I've never done anything like this, but I know that on Windows you can create startup programs -- likewise, you should be able to do something similar in Linux.
BTW you wouldn't "run" the Rails app - you'd fire up the server to capture any requests. You'd basically run your app locally in "production" mode, allowing you to send requests either through localhost or by setting up a pseudo domain in the HOSTS file of your box.
--
Web App
The web app itself is RESTful, meaning (I believe) it will only act when requests are sent to it. Because this works over the HTTP protocol, it essentially means you'll need some sort of (web) service to send requests to the web app:
Representational state transfer (REST) is a way to create, read, update or delete information on a server using simple HTTP calls
Although I've never done this myself, I would use the Ruby app on your Pi to send HTTP requests to your Rails app. This will certainly add a level of complexity, but it will give you a clean interface between the two types of data transfer.
The difference is that Rails (or any other web app) will only act on request, whereas "native" applications run as long as the operating system is running, meaning you can "listen" for updates from the hardware.
What I would do is split the functionality:
Hardware input > send to service
Service > sends to Rails
Rails > sends response to service
Service > processes response
This may seem inefficient, but I think it's the best way to capture local input from your hardware. You'll have to run the Rails app on localhost, behind something like nginx or some other efficient server.
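As a rough sketch of the "hardware input > service > Rails" leg, assuming the Rails app exposes an /events endpoint (a made-up route) and the button state can be read from sysfs, the Pi-side daemon could look something like this:

```ruby
#!/usr/bin/env ruby
# Hypothetical daemon for the Pi: polls the hardware and forwards events to
# the locally hosted Rails app over HTTP. GPIO pin and route are assumptions.
require "net/http"
require "json"
require "uri"
require "time"

RAILS_EVENTS_URL = URI("http://localhost:3000/events")

# Placeholder for real GPIO handling (a GPIO gem or interrupt-driven code
# would be better than polling sysfs like this).
def button_pressed?
  File.read("/sys/class/gpio/gpio17/value").strip == "1"
rescue Errno::ENOENT
  false
end

loop do
  if button_pressed?
    payload = { event: "button_pressed", occurred_at: Time.now.utc.iso8601 }
    Net::HTTP.post(RAILS_EVENTS_URL, payload.to_json,
                   "Content-Type" => "application/json")
  end
  sleep 0.1
end
```

The Rails controller behind /events would then persist the event and respond, closing the "Rails > sends response to service" step.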
--
Database
It would only make sense if they share the data. You should remember that a database is different from a datatable: a database stores many tables and is generally meant for a single purpose, whilst a datatable stores a single type of data.
From what you've written, I would recommend using two databases running on the same DB server. This will give you the ability to create as many tables as you want in each database, giving you scope to add as many different pieces of data as you wish. Sharing data can be done using an API or a web service.
--
Updating
The Rails app will not need to be "updated" - you'll just need to deploy a fresh version. The beauty of Internet-centric software :)
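If it helps, deploying that fresh version is usually scripted with something like Capistrano; the settings below are purely illustrative (app name, repo URL and paths are assumptions):

```ruby
# config/deploy.rb -- illustrative Capistrano 3 settings, not a full recipe.
lock "~> 3.0"

set :application, "device_ui"                       # hypothetical app name
set :repo_url, "git@example.com:me/device_ui.git"   # hypothetical repository
set :deploy_to, "/var/www/device_ui"                # path on the Raspberry Pi
set :keep_releases, 3
```

Running `cap production deploy` from a development machine would then place the new release on the Pi and restart the app server.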
In terms of your Raspberry Pi "on-board" software update, I don't have much experience with this, so I can only offer limited recommendations.

Should an iphone app communicate directly with a cassandra backend?

Obviously there are multiple steps and phases of implementing such a thing.
I was thinking I would eventually have a webserver that takes http json requests from the ios app, and then queries the cassandra backend and sends results back. I could load balance and all that fancy stuff still, and also provide a logical layer on server side, and keep the client app lightweight.
I'm not sure I understand how Cassandra clients fit in, though. It seems like the Cassandra Objective-C client could eliminate the need for the above approach.
I saw another question and answer, but it wasn't clear, perhaps because the answer varies with the need.
An iPhone app should not directly connect to a Cassandra backend or any other DB store.
First of all, talking to a database often requires adapting a very specific binary protocol (for Cassandra in particular, binary CQL or Thrift). Writing an adapter that would let your Objective-C app communicate in this binary protocol is a major piece of work, and could easily cost more than the rest of your app in effort. If you hide the DB behind a web-server, however, you will be able to select from a variety of existing adapters available in different server-side languages, meaning that you don't need to redo all that low-level work. You'll only be responsible for a relatively small piece of server-side code that would translate your REST queries and forward them to one of the Cassandra adapters (which expose easy-to-use interfaces).
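To illustrate how thin that server-side piece can be, here is a hedged sketch using Sinatra and the DataStax Ruby driver; the keyspace, table, and route are invented for the example:

```ruby
require "sinatra"
require "cassandra"   # DataStax Ruby driver gem
require "json"

# The iOS app only ever talks HTTP/JSON to this service; only this box needs
# network access to the Cassandra cluster.
cluster = Cassandra.cluster(hosts: ["10.0.0.5"])   # assumed internal address
session = cluster.connect("app_keyspace")          # hypothetical keyspace

get "/users/:id" do
  rows = session.execute("SELECT * FROM users WHERE id = ?",
                         arguments: [params[:id]])
  content_type :json
  rows.first.to_json
end
```

The phone sends `GET /users/123` over HTTPS and never learns anything about the cluster topology or the CQL protocol.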
Secondly, if you wanted to connect to a remote database from the phone, your database server would have to open its ports to the internet at large, which is a very bad security practice, even if you use SSL and user credentials. Again, if you hide behind a web server, you will be putting in a layer of technology that has evolved for decades to remain secure on the public internet.
Finally, having your phone talk to Cassandra directly is a poor architectural pattern. When you write apps that communicate on the internet, you want them to know as little as possible about each other, only how to talk to each other (preferably in a standard protocol). That way you can replace or upgrade individual components while keeping everything else the same. This may not sound like a lot, but is actually the main reason why phones, or web browsers, don't directly talk to databases. (If this setup were a good idea in principle, the first two problems could be easily solved given enough engineering effort.)
The approach you first suggested with JSON and the web server is the only correct way to go.
Use something like a RESTful API; there are many reasons for that.
If your servers' IP addresses change, you have to update all clients; if you add more nodes, you will need to update all clients; and if you decide to upgrade your Cassandra and some functions change, your clients will break and you will need to update all clients.

Checking cisco network device rules programmatically

I want to be able to show (by device) open/blocked status for a given protocol between two devices/ports on a network. In other words, I need to output a list of network devices (firewalls & switches) between Server A and Server B and indicate whether the request should (according to each device's rules) be allowed through or blocked.
I'm starting with the Cisco networking devices, which are centrally managed by Cisco's Security Manager (CSM) application (version 4.2). I'm new to network management automation programming and want to make sure I'm not overlooking an obvious best way to handle this.
So far it's looking like I'll need to periodically export and ETL device rules out of CSM (they have a Perl script that I can call to do this, I believe) into a separate database, then write some custom SQL code to determine which devices on a route between two hosts/ports will allow or block traffic of the given protocol?
Am I on the right track, or is there a better way to go about this?
If I understood your question correctly, I think you can run a TCL script on the Cisco equipment to collect the necessary information and transfer it to a central server; from there, import it into a database and then correlate that information.
Hope that helps you in your work.

offline web application design recommendation

I want to know which is the best architecture to adopt for this case :
I have many shops that connect to a web application developed using Ruby on Rails.
The internet is not reachable all the time.
The solution was to develop an offline system which requires installing a local copy of the distant database.
All this was already developed.
Now what I want to do:
Always work on the local copy of the database.
Any change on the local database should be synchronized with distant database.
All the local copies should contain the same data as the other local copies.
To solve this problem I thought about using JMS-like software, possibly RabbitMQ.
This consists of pushing every SQL request into a JMS queue, to be executed on the distant instance of the application, which inserts into the distant DB and pushes the insert or SQL statement into another queue that is read by all the local instances. This seems complicated and would probably slow down the application.
Is there a design or recommendation that I should apply to solve this kind of problem?
You can do that but essentially you are developing your own replication engine. Those things can be a bit tricky to get right (what happens if m1 and m3 are executed on replica r1, but m2 isn't?) I wouldn't want to develop something like that unless you are sure you have the resources to make it work.
I would look into existing off-the-shelf replication solutions. If you are already using a SQL DB, it probably has some support for it. Look here for more details if you are using MySQL.
Alternatively, if you are willing to explore other backends, I heard that CouchDB has great support for replication. I also heard of people using git libraries to do that sort of thing.
Update: After your comment, I realize you already use MySQL replication and are looking for a solution for re-syncing the databases after being offline.
Even in that case RabbitMQ doesn't help you at all, since it requires a constant connection to work, so you are back to square one. The easiest solution would be to just write all the changes (SQL commands) into a text file at the remote location, then when you get the connection back, copy that file (scp, ftp, email or whatever) to the master server, run all the commands there and then just resync all the replicas.
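A crude Ruby sketch of that journal-and-replay idea (the file path and table are assumptions, and error handling is omitted):

```ruby
# offline_journal.rb -- append each local write to a plain-text journal while
# the link is down; replay it on the master once connectivity returns.
JOURNAL_PATH = "/var/lib/shop/pending_statements.sql"  # assumed location

def journal(sql)
  File.open(JOURNAL_PATH, "a") { |f| f.puts("#{sql.strip.chomp(';')};") }
end

# Example: record a local write while offline.
journal("INSERT INTO sales (item_id, qty) VALUES (42, 3)")

# When the connection is back, copy the file to the master (scp, ftp, ...)
# and replay it there, for example:
#   mysql -h master-host -u app -p shop_db < pending_statements.sql
# After a successful replay, truncate the local journal and let normal
# MySQL replication bring the replicas back in sync.
```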
Depending on your specific project you may also need to make sure there are no conflicts when running commands from different remote locations, but there is no general technical solution to this. Again, depending on the project, you may want to cancel one of the transactions, notify the users that it happened, and so on.
I would recommend taking a look at CouchDB. It's a non-SQL database that does exactly what you are describing automatically. It's used especially in phone applications that often don't have internet or data connectivity. The idea is that you have a local copy of a CouchDB database and one or more remote CouchDB databases. The CouchDB server then takes care of the replication of the distributed systems and you always work off your local database. This approach is nice because you don't have to build your own distributed replication engine. For more details I would take a look at the 'Distributed Updates and Replication' section of their documentation.
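For a sense of how little is involved, CouchDB exposes replication through its HTTP _replicate endpoint; a hedged Ruby example (database names and hosts are made up):

```ruby
require "net/http"
require "json"
require "uri"

# Ask the local CouchDB instance to continuously replicate the shop database
# to the central server; CouchDB retries on its own when the link drops.
uri = URI("http://localhost:5984/_replicate")
body = {
  source:     "shop",                                   # local database
  target:     "http://central.example.com:5984/shop",   # remote database
  continuous: true
}.to_json

Net::HTTP.post(uri, body, "Content-Type" => "application/json")
```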
