I'm going to use InfluxDB to store a lot of IoT data from sensors.
As the last clustered version, InfluxDB v0.11, is not ready for production use, and the Relay HA option is also quite young, is there another way to scale out InfluxDB?
For example:
How mature is the clustering in InfluxDB v0.11? Should I customize v0.11 or try some other cost-saving approach?
How about putting Kafka in front of InfluxDB to buffer data when InfluxDB goes down?
How about sharding? Is there any detailed documentation about sharding in InfluxDB (https://influxdata.com/high-availability/)?
Anyway, I just want to find a free, clustered, working InfluxDB.
Other than InfluxDB Relay, there isn't a free way to scale out InfluxDB.
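As for the Kafka idea in the question: putting Kafka in front of InfluxDB does work for riding out downtime, although it buys you buffering rather than scale-out. Below is a minimal sketch of a consumer that drains a topic into InfluxDB, assuming the kafka-python and influxdb (v1.x) client libraries; the topic name, database name, and message layout are hypothetical:

```python
import json
import time

from kafka import KafkaConsumer      # pip install kafka-python
from influxdb import InfluxDBClient  # pip install influxdb (v1.x client)

# Hypothetical names: a "sensor-data" topic and an "iot" database.
consumer = KafkaConsumer(
    "sensor-data",
    bootstrap_servers=["localhost:9092"],
    enable_auto_commit=False,  # commit only after a successful write
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
influx = InfluxDBClient(host="localhost", port=8086, database="iot")

for msg in consumer:
    point = {
        "measurement": "sensor",
        "tags": {"sensor_id": msg.value["id"]},
        "fields": {"value": float(msg.value["value"])},
    }
    while True:
        try:
            influx.write_points([point])
            consumer.commit()  # point is safe in InfluxDB, advance the offset
            break
        except Exception:
            time.sleep(5)  # InfluxDB down: block and retry; Kafka retains
                           # the backlog in the meantime
```

Because offsets are only committed after a successful write, points queued during an outage are replayed once InfluxDB comes back.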
In ThingsBoard, I want to save time-series data in InfluxDB or TDengine.
Please help me.
As of 3.3.4.1, ThingsBoard supports only PostgreSQL, Cassandra, and TimescaleDB. But you can use a REST rule node to duplicate your data to InfluxDB via its Cloud API.
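For reference, here is a minimal Python sketch of the kind of call the REST rule node would make, assuming the InfluxDB Cloud v2 write endpoint; the host, org, bucket, and token are placeholders:

```python
import requests

# Placeholders: fill in your own region host, org, bucket, and API token.
url = "https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/write"
params = {"org": "my-org", "bucket": "telemetry", "precision": "ms"}
headers = {
    "Authorization": "Token MY_API_TOKEN",
    "Content-Type": "text/plain; charset=utf-8",
}
# Line protocol: measurement,tags fields timestamp
body = "device_ts,deviceId=dev-001 temperature=21.5 1700000000000"

resp = requests.post(url, params=params, headers=headers, data=body)
resp.raise_for_status()  # InfluxDB returns 204 No Content on success
```

In the rule node you would configure the same URL, Authorization header, and a line-protocol body template.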
I need your help regarding Telegraf monitoring of an InfluxDB instance and a behavior I cannot explain.
The configuration is the following:
Two independent instances of InfluxDB v1.7.10 are running on separate servers, say server A and server B
Two telegraf services v1.13.4 are running with the same configuration:
One output: a "monitoring" database created in the InfluxDB instance
Several inputs (system, disk, ping, ...)
Grafana is used on both servers to explore the values stored by Telegraf
On server A, which is running fine, the monitoring shard size and cardinality are quite regular. On server B, on the other hand, the monitoring shard size and cardinality are much larger (by a factor of about 10).
I cannot explain this difference and I have already checked:
tag and field cardinality of the inputs used on both servers (see the sketch below)
Telegraf configuration on both servers
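For example, I compared the two servers with cardinality queries along these lines (a sketch using the influxdb Python client against the "monitoring" database from above; InfluxQL's SHOW ... CARDINALITY statements are available in 1.7):

```python
from influxdb import InfluxDBClient  # pip install influxdb (v1.x client)

client = InfluxDBClient(host="localhost", port=8086, database="monitoring")

# Overall series cardinality of the monitoring database
print(list(client.query("SHOW SERIES EXACT CARDINALITY ON monitoring")))

# Per-measurement breakdown, to spot which Telegraf input explodes
for m in client.query("SHOW MEASUREMENTS").get_points():
    q = 'SHOW SERIES EXACT CARDINALITY FROM "{}"'.format(m["name"])
    print(m["name"], list(client.query(q)))
```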
Any idea where to look to explain this behavior?
Thanks for your help!
We are using Telegraf, InfluxDB, and Grafana for monitoring our environment. We have two datacenters, dc1 and dc2, each running one pod of InfluxDB. We want an approach to replicate the data between the two InfluxDB instances across the two datacenters, so that if dc1 goes down we still have the data of both datacenters (dc1 and dc2) in dc2. We are using open-source InfluxDB, so can anyone please suggest some approaches to achieve this?
We tried to follow the replication-during-ingest approach, where we configure the InfluxDB URLs of both datacenters in telegraf.conf as per the https://www.influxdata.com/blog/multiple-data-center-replication-influxdb/ documentation. But what if one of the InfluxDB instances is down? After it recovers, the two instances will hold different data, so we do not want to follow this approach.
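For context, replication during ingest is just a client-side fan-out, and the sketch below shows where the divergence described above comes from: a write that fails against one datacenter still succeeds against the other, and nothing reconciles them afterwards. (A Python sketch with the influxdb v1.x client; the hostnames are placeholders.)

```python
from influxdb import InfluxDBClient  # pip install influxdb (v1.x client)

# Placeholder endpoints for the two datacenters.
dc1 = InfluxDBClient(host="influxdb.dc1.example.com", port=8086, database="metrics")
dc2 = InfluxDBClient(host="influxdb.dc2.example.com", port=8086, database="metrics")

def fan_out(points):
    """Write the same points to both datacenters, roughly what the
    duplicated [[outputs.influxdb]] sections in telegraf.conf do."""
    for name, client in (("dc1", dc1), ("dc2", dc2)):
        try:
            client.write_points(points)
        except Exception as exc:
            # The gap: if dc1 is down, these points land only in dc2,
            # and nothing back-fills dc1 when it recovers.
            print("write to {} failed: {}".format(name, exc))
```

One way to avoid losing the outage window is to put a durable queue (e.g., Kafka, as in the first question above) between the collector and each InfluxDB instance.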
Sorry, I just started to learn Docker, so my question may seem stupid to some of you.
I would like to know if there is a way to collect performance metrics from the cAdvisor container (not from cgroups) at runtime. I mean, extract the performance values behind the curves cAdvisor draws, like memory usage or network traffic.
I need to record these values and save them in a database so that I can run statistical analyses on them (like comparing the memory consumption of two Docker containers at t=50s).
Thanks in advance.
As other answers mention, cAdvisor doesn't provide its own performance-data API; instead, it exposes metrics that are typically stored in a separate database if one wants to derive performance data beyond "real time". For example, cAdvisor exports Prometheus metrics natively:
http://prometheus.io/docs/instrumenting/exporters/
The Prometheus metric types:
http://prometheus.io/docs/concepts/metric_types/
Prometheus supports a fairly rich functional expression language that can be used for querying and visualization:
http://prometheus.io/docs/querying/basics/
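To make that concrete: cAdvisor's /metrics endpoint serves plain Prometheus text, which you can also read directly from a script. A minimal Python sketch, assuming cAdvisor is listening on localhost:8080 and using the prometheus_client parser:

```python
import requests
from prometheus_client.parser import text_string_to_metric_families

# cAdvisor's native Prometheus endpoint (default port 8080 assumed).
text = requests.get("http://localhost:8080/metrics").text

for family in text_string_to_metric_families(text):
    if family.name == "container_memory_usage_bytes":
        for sample in family.samples:
            # The "name" label identifies the container.
            print(sample.labels.get("name"), sample.value)
```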
cAdvisor does provide a REST endpoint to get any stats in real time. By default, it keeps the latest two minutes of data; you can configure it to keep more or less. It also supports a storage backend for continuously dumping stats to an InfluxDB database.
REST API:
e.g. /api/v1.3/containers
Docs: https://github.com/google/cadvisor/blob/master/docs/api.md
Doc on setting up InfluxDB:
https://github.com/google/cadvisor/blob/master/docs/influxdb.md
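For the use case in the question (recording memory usage or network traffic over time), polling that endpoint looks roughly like the Python sketch below; cAdvisor on localhost:8080 is assumed, and the field names are as of the v1.3 API, so check them against your cAdvisor version:

```python
import requests

# /api/v1.3/containers returns the root container's info, including a
# "stats" list that covers (by default) the last two minutes.
info = requests.get("http://localhost:8080/api/v1.3/containers").json()

for stat in info["stats"]:
    print(
        stat["timestamp"],
        stat["memory"]["usage"],      # memory usage in bytes
        stat["network"]["rx_bytes"],  # cumulative bytes received
        stat["network"]["tx_bytes"],  # cumulative bytes sent
    )
```

Per-container stats are served under paths like /api/v1.3/containers/docker/<container-id>.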
I think you could use https://github.com/tutumcloud/container-metrics for this. Basically, it uses InfluxDB (http://influxdb.com/) as a time-series data store.
There is some more information available here: http://blog.tutum.co/2014/08/25/panamax-docker-application-template-with-cadvisor-elasticsearch-grafana-and-influxdb/
A couple of people seemed to be looking into the ELK stack (Elasticsearch, Logstash, Kibana) for visualising some of this data here: https://github.com/google/cadvisor/issues/634
We're trying to scale up HBase writes on a cluster using Thrift. (Our HBase application is in Python, and hence needs Thrift.)
Despite increasing the number of nodes in the cluster, we are seeing the same write speeds.
First off, is the recommended strategy to run Thrift on:
1. The client?
2. The HBase master?
3. HBase region servers?
If #1 or #2, will the client or the HBase master take care of splitting the requests across the various region servers? It doesn't appear to in our case.
If #3, then I have to modify the client to write to specific region servers and randomize the writes. I can do this, but it seems to defeat the purpose of using HBase.
Any other tips on read/write scaling (especially with Thrift) are greatly appreciated.
In HBase, to gain performance as you add nodes, you need a decent rowkey distribution. As long as you have hot spots (one very busy region server) in your cluster, you will not gain anything from increasing your cluster size. Check out the article on row key design to start with.
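To illustrate the rowkey point: with monotonically increasing keys (timestamps, sequence numbers), all writes hit one region at a time. A common fix is salting, sketched below in Python with the happybase Thrift client; the host, table, and column names are hypothetical:

```python
import hashlib

import happybase  # pip install happybase (Thrift-based HBase client)

NUM_BUCKETS = 16  # roughly match the number of (pre-split) regions

def salted_key(key):
    """Prefix the key with a stable hash bucket so sequential keys
    scatter across regions instead of hammering one hot region."""
    # hashlib is stable across processes, unlike the built-in hash().
    bucket = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % NUM_BUCKETS
    return "{:02d}|{}".format(bucket, key).encode("utf-8")

connection = happybase.Connection("thrift-host.example.com")  # placeholder
table = connection.table("sensor_events")                     # hypothetical

# A timestamp key that would otherwise always hit the "latest" region:
table.put(salted_key("2016-01-01T00:00:00"), {b"cf:value": b"42"})
```

The trade-off is that range scans now have to fan out over all NUM_BUCKETS prefixes.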
If you don't need to read right away (i.e., you are comfortable with async writes), you can check out the asynchbase client from StumbleUpon for a performance gain.
I found the answer in these two questions; it looks like we'll go with #3 (write to the specific region servers and randomize the writes):
Is it better to send data to hbase via one stream or via several servers concurrently?
HBase Thrift: how to connect to remote HBase master/cluster?