Postgres on Kaa cluster nodes don't sync with each other - iot

I have set up a Kaa cluster with two nodes.
The Postgres instance on the second node does not sync with the first one when I add a schema or SDK. Do I need to set up replication between the Postgres instances manually?
Or does Kaa handle this by itself? If so, why is my second node not in sync with the first?
admin-dao.properties
jdbc_url=jdbc:postgresql://192.168.1.21:5432,192.168.1.22:5432/kaa
sql-dao.properties
jdbc_host_port=192.168.1.21:5432,192.168.1.22:5432
Thanks
Rizwan

Yes, replication has to be set up for the databases in the cluster to sync. Kaa does not handle the sync itself, as per its documentation in the Architecture Overview:
http://kaaproject.github.io/kaa/docs/v0.10.0/Architecture-overview/
SQL database
SQL database instance is used to store tenants, applications, endpoint
groups and other metadata that does not grow as the number of
endpoints increases.
High availability of a Kaa cluster is achieved by deploying the SQL
database in HA mode. Kaa officially supports MariaDB and PostgreSQL as
the embedded SQL databases at the moment.
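To make the distinction concrete: as far as I know, the comma-separated host list in jdbc_url only gives the PostgreSQL JDBC driver a set of servers it may connect to (for failover); it does not copy data between them. Below is a minimal Python sketch that just parses the URL from the question to show what it actually contains. The function name parse_jdbc_hosts and the default port fallback are my own assumptions for illustration, not part of Kaa or the driver.

```python
def parse_jdbc_hosts(jdbc_url):
    """Split a multi-host PostgreSQL JDBC URL into (host, port) pairs and a database name.

    Hypothetical helper for illustration only; it is not part of Kaa or the JDBC driver.
    """
    prefix = "jdbc:postgresql://"
    if not jdbc_url.startswith(prefix):
        raise ValueError("not a PostgreSQL JDBC URL: %r" % jdbc_url)
    rest = jdbc_url[len(prefix):]
    # Drop any ?key=value parameters, then separate the host list from the database name.
    rest = rest.split("?", 1)[0]
    host_part, _, database = rest.partition("/")
    hosts = []
    for entry in host_part.split(","):
        host, _, port = entry.partition(":")
        # Assumption: fall back to the standard PostgreSQL port when none is given.
        hosts.append((host, int(port) if port else 5432))
    return hosts, database


hosts, database = parse_jdbc_hosts(
    "jdbc:postgresql://192.168.1.21:5432,192.168.1.22:5432/kaa"
)
for host, port in hosts:
    print("candidate server %s:%d for database %s" % (host, port, database))
```

Both entries are independent servers from the client's point of view, so keeping them in sync is left to whatever replication you configure on the PostgreSQL side.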

Related

Difference between database connector/reader nodes in KNIME

While creating some basic workflow using KNIME and PSQL I have encountered problems with selecting proper node for fetching data from db.
In node repo we can find at least:
1) PostgreSQL Connector
2) Database Reader
3) Database Connector
Actually, we can do the same using 2) alone or connecting either 1) or 2) to node 3) input.
I assumed there are some hidden advantages, like improved performance with complex queries or better overall stability, but on the other hand we are using exactly the same database driver anyway.
There is a big difference between the Connector Nodes and the Reader Node.
The Database Reader reads data into KNIME; the data is then on the machine running the workflow. This can be a bad idea for big tables.
The Connector nodes do not. The data remains where it is (usually on a remote machine in your cluster). You can then connect Database nodes to the Connector nodes. All data manipulation will then happen within the database; no data is loaded to your machine (unless you use the output port preview).
As for the difference between the other two:
The PostgreSQL Connector is just a special case of the Database Connector that has a pre-set configuration. However, you can make the same configuration with the Database Connector, which allows you to choose more detailed options for non-standard databases.
One advantage of using 1 or 2 is that you only need to enter connection details once for a database in a workflow, and can then use multiple reader or writer nodes. I'm not sure if there is a performance benefit.
1 offers simpler connection details with the bundled Postgres JDBC drivers than 2.
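The reader-versus-connector distinction described above can be sketched outside KNIME. This is only an analogy in Python using an in-memory SQLite database as a stand-in for the remote PostgreSQL server; the table, the data and the function names are made up for illustration and none of this is KNIME's API.

```python
import sqlite3

# Stand-in for a remote database; in KNIME this would be the PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0)],
)


def read_everything(conn):
    """'Reader' style: pull every row to the client, then aggregate locally."""
    rows = conn.execute("SELECT sensor, value FROM measurements").fetchall()
    totals = {}
    for sensor, value in rows:
        totals[sensor] = totals.get(sensor, 0.0) + value
    return len(rows), totals


def aggregate_in_database(conn):
    """'Connector' style: the database aggregates; only the result is transferred."""
    rows = conn.execute(
        "SELECT sensor, SUM(value) FROM measurements GROUP BY sensor ORDER BY sensor"
    ).fetchall()
    return len(rows), dict(rows)


# Each call returns (number of rows transferred, totals per sensor).
print(read_everything(conn))
print(aggregate_in_database(conn))
```

Both calls give the same totals, but the first moves every row to the client while the second moves one row per sensor. With three rows that is irrelevant; with a big table it is the difference the answer above is pointing at.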

Informix replication

We are developing our own Informix replication handler. The Informix version is 12.10. We are using Enterprise Replication with the Primary-Target one-to-many option, i.e. all database changes originate at the primary database and are replicated to the target databases. We configured the replication setup and replication is working fine.
Now if we write to the master server, the changes are replicated to the slaves. The problem is that we are also able to write to the slaves. Is there any way to make the slaves read-only, i.e. so that we can only write to the master server? Is this possible?
Please note that we are not considering an Update-Anywhere replication system, since we are using time series data and Informix places many restrictions on conflict resolution rules for time series data. So please don't suggest Update-Anywhere replication.
You need to change the mode of the participants (slaves) to readonly.
Use this command:
cdr modify server --mode readonly <server_group>
Refer to the Informix Enterprise Replication documentation for details.

Kubernetes Deployments across the Datacenters

Is it possible to fail over the traffic from a MySQL k8s deployment running in one datacenter to a deployment running in another datacenter, along with its storage?
If yes, do we need to spread the same k8s cluster across multiple datacenters, or do we have to run separate k8s clusters in each datacenter?
How will k8s ship or manage the storage volume across the datacenters? Do we need a special type of cloud storage for that purpose?
Note: I just quoted MySQL as an example of an application that needs to store some data; it can be anything stateful that needs to carry over its data volumes. This is not HA in the MySQL-HA sense; it is just about automatically starting to serve the application as-is from somewhere else, along with its data. It applies to any application that stores data to a volume.
How can we achieve HA for our stateful application across the datacenters using k8s?
Thanks
You don't need to use Kubernetes to achieve HA.
I would recommend using MySQL Replication (i.e. a Master/Slave configuration) to achieve HA. There is more info in the docs on how to set replication up.
In one data center you would have the Master, and in your other data center you would have the slave. You can even have multiple slaves in multiple data centers.
If problems arise on the master, you can automatically fail over to a slave using the mysqlfailover utility. This way your data is kept in sync across 2 data centers.
I'm not sure if this exactly fits your use cases, but it is one option for enabling HA on your MySQL database.

What is the recommended replication strategy for OpsCenter keyspace?

I'm using OpsCenter to monitor and configure my Cassandra cluster (it's actually a DSE cluster), and I have a keyspace that spans multiple datacenters. The OpsCenter keyspace, which is created and maintained by OpsCenter, uses SimpleStrategy as its default replication strategy, which prevents me from turning on its repair service (as mentioned in OpsCenter's documentation).
The Isolating OpsCenter Performance Data blog post says that using a dedicated datacenter requires us to manually monitor and scale the OpsCenter nodes. So I was wondering: what is the recommended replication strategy and factor for the OpsCenter keyspace, so that storing OpsCenter data has limited performance impact on my production nodes while requiring minimal tuning when I scale my production datacenters?
Suppose my production nodes use NetworkTopologyStrategy with two datacenters, 'Cassandra' and 'Solr' (in a DSE setting), where the 'Cassandra' datacenter supports OLTP and the 'Solr' datacenter is dedicated to search. Is it a valid solution to set the replication of the OpsCenter keyspace to { 'class' : 'NetworkTopologyStrategy', 'Cassandra': 1}?
Thanks,
Ziju
NetworkTopologyStrategy is the recommended one for cases like this, so your proposed solution is valid. RF of 1 is debatable, but since OpsCenter keyspace doesn’t usually store anything very important, it should work okay, although I’d bump that to 2.
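For reference, here is a small Python sketch that builds the corresponding ALTER KEYSPACE statement for the setting discussed above, with RF 2 in the 'Cassandra' datacenter. The helper name alter_keyspace_cql is hypothetical, and I'm assuming the keyspace is literally named "OpsCenter" (mixed case, hence the double quotes); check the actual name in your cluster before running anything.

```python
def alter_keyspace_cql(keyspace, dc_replication):
    """Build an ALTER KEYSPACE statement that uses NetworkTopologyStrategy.

    Hypothetical helper; dc_replication maps datacenter name -> replication factor.
    """
    if not dc_replication:
        raise ValueError("at least one datacenter is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in sorted(dc_replication.items()):
        if int(rf) < 1:
            raise ValueError("replication factor must be >= 1 for %r" % dc)
        options.append("'%s': %d" % (dc, int(rf)))
    # Double quotes preserve the mixed-case keyspace name in CQL.
    return 'ALTER KEYSPACE "%s" WITH replication = {%s};' % (
        keyspace,
        ", ".join(options),
    )


# Assumed keyspace name "OpsCenter"; datacenter name taken from the question.
print(alter_keyspace_cql("OpsCenter", {"Cassandra": 2}))
```

The printed statement can be pasted into cqlsh. After changing replication settings you would normally run a repair on that keyspace so existing data reaches the new replicas.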

How can I run multiple Neo4j databases on a single server?

How can I run multiple Neo4j databases simultaneously on a single server? I would like to have separate data directories and ports if this is possible.
Has anyone done this successfully? If so, please explain how.
I have tried something like:
bin\neo4j start
To set up Neo4j with multiple instances on a single server, you essentially configure a cluster, with each node having its own set of configuration properties. You then run the cluster in single-instance (non-HA) mode (otherwise you'll just end up with a replication cluster, which doesn't meet your requirement).
Full instructions are in the Neo4j docs online and in your local doc\manual folder.
Note: The folks at Neo Technology call this out for dev/test purposes. I can't offer guidance on running this in production, other than the fact you'd have multiple instances competing for the same resources (cpu, disk, memory, network).
It's possible to set up Rexster to serve up multiple Neo4j database directories. This is great if you're using the Gremlin query language. Other access forms may not be available (that is beyond my knowledge). Check out this question/answer: possible to connect to multiple neo4j databases via bulbs/Rexster?
