How to publish JMX stats onto a single remote server

Let's say I have two applications/Tomcats, T1 and T2, both of which are JMX-enabled. Each of them would normally have its own URL <server_X>:<port_X> to which JMX clients connect. I want to know whether it is possible to have a single RMI server S1, running on port P1, which can hold the statistics of both T1 and T2.
If so, how can I figure out the context (as all the stats are now redirected to the same URL)? The closest I could find on the internet is point 7 on this page. The intent is to have a centralized location for JMX services. I am trying to figure out whether there is something like a context name (as in servlets) to facilitate this.

One solution that is rather new (compared to when the question was asked) is something like this:
JMX -> Codahale Metrics -> metrics-statsd -> StatsD -> Graphite/Reporting/Monitoring
Basically, you use StatsD to aggregate the stats and the Metrics library to convert JMX to something reasonable.
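For a rough idea of the middle step, here is a minimal, self-contained sketch using Dropwizard (Codahale) Metrics to turn a JMX attribute into a gauge. The ObjectName, attribute, and metric name are just examples, and a console reporter stands in for the StatsD reporter you would actually wire up via metrics-statsd:

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.JmxAttributeGauge;
import com.codahale.metrics.MetricRegistry;

import javax.management.ObjectName;
import java.util.concurrent.TimeUnit;

public class JmxToMetricsBridge {
    public static void main(String[] args) throws Exception {
        MetricRegistry registry = new MetricRegistry();

        // Expose an MBean attribute from the local platform MBeanServer as a Metrics gauge.
        // ObjectName and attribute are examples only; use whatever your Tomcats publish.
        registry.register("tomcat.threads.count",
                new JmxAttributeGauge(new ObjectName("java.lang:type=Threading"), "ThreadCount"));

        // For the real pipeline you would start a StatsD reporter here instead;
        // a console reporter keeps this sketch self-contained.
        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(10, TimeUnit.SECONDS);

        Thread.sleep(60_000); // let a few reports run
        reporter.stop();
    }
}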

I think to do this you would need to rewrite the object names so that they don't clash.
The server S1 would run on P1 as you say and, for incoming requests, forward them to the respective Tomcats T1 and T2.
If you have, e.g., tomcat:key1=value1 as the object name, you could expose it on your proxy server S1 as tomcat:server=T1,key1=value1 for the first real server and tomcat:server=T2,key1=value1 for the second.
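A minimal sketch of that rewrite with the standard javax.management API; the domain, key, and server labels are just the example values from above:

import java.util.Hashtable;
import javax.management.ObjectName;

public class ObjectNameRewriter {
    // Adds a "server" key to an ObjectName so names from T1 and T2 cannot clash.
    static ObjectName qualify(ObjectName original, String serverLabel) throws Exception {
        Hashtable<String, String> keys = new Hashtable<>(original.getKeyPropertyList());
        keys.put("server", serverLabel);
        return new ObjectName(original.getDomain(), keys);
    }

    public static void main(String[] args) throws Exception {
        ObjectName fromT1 = new ObjectName("tomcat:key1=value1");
        // Prints something equivalent to tomcat:key1=value1,server=T1
        // (key order within an ObjectName is not significant).
        System.out.println(qualify(fromT1, "T1"));
    }
}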

If your end goal is ease of monitoring, or the ability to combine stats, then look into Evident ClearStone.

Related

How to define alerts with exceptions in InfluxDB/Kapacitor

I'm trying to figure out the best, or at least a reasonable, approach to defining alerts in InfluxDB. For example, I might use the CPU batch tickscript that comes with Telegraf. This could be set up as a global monitor/alert for all hosts being monitored by Telegraf.
What is the approach when you want to deviate from the above setup for a host, i.e. instead of X% for a specific server we want to alert on Y%?
I'm happy that a distinct tickscript could be created for the custom values, but how do I go about excluding the host from the original 'global' one?
This is a simple scenario, but it needs to meet the needs of 10,000 hosts, of which there will be 100s of exceptions, and it will also encompass 10s/100s of global alert definitions.
I'm struggling to see how you could use the platform as the primary source of monitoring/alerting.
As said in the comments, you can use the sideload node to achieve that.
Say you want to ensure that your InfluxDB servers are not overloaded. You may want to allow 100 measurements by default. Only on one server, which happens to get a massive number of data points, you want to limit it to 10 (a value which the _internal database easily exceeds, but good for our example).
Given the following excerpt from a TICKscript
var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |eval(lambda: "numMeasurements")
        .as('value')

var customized = data
    |sideload()
        .source('file:///etc/kapacitor/customizations/demo/')
        .order('hosts/host-{{.hostname}}.yaml')
        .field('maxNumMeasurements',100)
    |log()

var trigger = customized
    |alert()
        .crit(lambda: "value" > "maxNumMeasurements")
and the server with the exception being named influxdb, with the file /etc/kapacitor/customizations/demo/hosts/host-influxdb.yaml looking as follows:
maxNumMeasurements: 10
A critical alert will then be triggered if value (and hence numMeasurements) exceeds 10 and the hostname tag equals influxdb, or if value exceeds 100 on any other host.
There is an example in the documentation handling scheduled downtimes using sideload.
Furthermore, I have created an example, available on GitHub, using docker-compose.
Note that there is a caveat with the example: the alert flaps because of a second, dynamically generated database. But it should be sufficient to show how to approach the problem.
What is the cost of using sideload nodes in terms of performance and computation if you have over 10 thousand servers?
Managing alerts manually and directly in Chronograf/Kapacitor is not feasible for a big number of custom alerts.
At AMMP Technologies we need to manage alerts per database, customer, and customer object; the number can go into the 1000s. We've opted for a custom solution where we keep a standard set of template tickscripts (not to be confused with Kapacitor templates) and provide an interface to the user that exposes only the relevant variables. A service (written in Python) then combines the values for those variables with a tickscript and, using the Kapacitor API, deploys (updates, or deletes) the task on the Kapacitor server (a minimal sketch of that API call follows below). This is then automated so that data for new customers/objects is combined with the templates and automatically deployed to Kapacitor.
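The deployment step boils down to one HTTP call against the Kapacitor v1 task API. Our service is written in Python, but a sketch of the same call (here in Java, with the host, port, task id, and script being made-up examples) looks roughly like this:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KapacitorTaskDeployer {
    public static void main(String[] args) throws Exception {
        // A rendered tickscript; in a real service this comes from a template
        // with per-customer values substituted in.
        String script = "stream|from().measurement('cpu')"
                + "|alert().crit(lambda: \"usage_idle\" < 10.0)";

        // Minimal JSON body for the Kapacitor v1 task API; all values are examples.
        String body = "{"
                + "\"id\": \"cpu_alert_customer_42\","
                + "\"type\": \"stream\","
                + "\"dbrps\": [{\"db\": \"telegraf\", \"rp\": \"autogen\"}],"
                + "\"script\": \"" + script.replace("\"", "\\\"") + "\","
                + "\"status\": \"enabled\"}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9092/kapacitor/v1/tasks"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}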
You obviously need to design your tasks to be specific enough that they don't overlap, and generic enough that it's not too much work to create tasks for every little thing.

Can Telegraf combine/add the values of per-node metrics, say for a cluster?

Let's say I have some software running on a VM that emits two metrics, which are fed through Telegraf to be written into InfluxDB. Let's say the metrics are the number of successfully handled HTTP requests (S) and the number of failed HTTP requests (F) on that VM. However, I might configure three such VMs, each emitting those two metrics.
Now, I would like to have computed metrics which are the sum of S from each VM and the sum of F from each VM, stored as new metrics at various instants of time. Is this something that can be achieved using Telegraf? Or is there a better, more efficient, more elegant way?
Kindly note that my knowledge of Telegraf and InfluxDB is theoretical, as I've only recently started reading about them, so I have not actually tried any of the above yet.
This isn't something Telegraf would be responsible for.
With InfluxDB 1.x, you'd use a TICKscript or continuous queries to calculate the sum and inject the new sampled value.
Roughly, this would look like:
CREATE CONTINUOUS QUERY "sum_sample_daily" ON "database"
BEGIN
  SELECT sum(*) INTO "daily_measurement" FROM "measurement" GROUP BY time(1d)
END
CQ docs

Wireshark script to sum Length by Source IP

I am capturing the router interface from my Fritzbox modem and then using Wireshark to view it.
I'd like a script to filter a number of source IPs and then sum all the Lengths (data quantity) associated with them, effectively giving me the total data usage of each IP address I monitor.
Conceptually it sounds simple, but after a look at Lua, I think I'm in over my head.
Thanks.
Maybe tshark can help you achieve your goal directly, without the need for a script at all? Have you tried something like:
tshark -r file.pcap -z endpoints,ip,"ip.src eq 1.1.1.1" -q
... where 1.1.1.1 represents the IP address of the endpoint you're interested in gathering statistics for. You can specify as many endpoints as you need by or'ing them together, or even use a subnet such as "ip.src eq 1.1.1.0/24", for example.
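For instance, to gather statistics for two monitored addresses in one pass (both addresses are placeholders):
tshark -r file.pcap -z endpoints,ip,"ip.src eq 1.1.1.1 or ip.src eq 2.2.2.2" -q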

Is Erlang message distribution broker-aware?

As stated in Erlang's official documentation on Distributed Erlang, Erlang nodes can communicate (issue messages) with other nodes in the same Erlang cluster. Therefore it is possible to issue messages such as:
Nodes A, B, C and D
A --> B
A --> C
B --> A
C --> B
...
Per my question, by "broker-aware" I mean: can we have a node issue a message to any other node that is available, based on a load-balancing rule?
A --> [ B or C or D ]
B --> [ A or C or D ]
...
Well, I know it is "possible" to design this, which requires some state management, etc. But are there built-in features for that? If not, is anyone aware of an open-source project that is driven purely by Erlang messages (thus excluding RabbitMQ and the like, as I want a pure Erlang message broker)?
I don't think there is a library for that, because the problem is very general. Your computation may be CPU-bound, memory-bound, network-bound, or constrained by some other resource. Some tasks should be spawned "close to the data", for example when there is lots of data on disk that would otherwise have to be transmitted over the network.
The easiest way is to have some central "job manager" and workers asking for jobs. Another option is to have some kind of metric and update it, as in this post on the mailing list.
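To illustrate the pull-based "workers asking for jobs" pattern in a language-agnostic way (sketched in Java here, since nothing about the idea is Erlang-specific): a busy worker simply doesn't ask for the next job, so load balances itself.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PullBasedJobManager {
    public static void main(String[] args) throws InterruptedException {
        // Central queue standing in for the "job manager" process.
        BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();

        // Three workers standing in for nodes B, C and D.
        for (int i = 0; i < 3; i++) {
            int worker = i;
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        Runnable job = jobs.take(); // blocks until a job is available
                        System.out.println("worker " + worker + " runs a job");
                        job.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true);
            t.start();
        }

        // Node A submitting work.
        for (int i = 0; i < 10; i++) {
            jobs.put(() -> { /* the actual work */ });
        }
        Thread.sleep(1000); // let the workers drain the queue
    }
}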

Network Traffic in AMQP QPID

I have a question regarding network traffic. Suppose I have a publisher on machine A. The Qpid broker is running on machine B. We have two subscribers, machine C and machine D (they both subscribe to the same topics). Now imagine a topology where
A-->B-->X-->C
        |
        D
(Publisher A is connected to B, and subscribers C and D are connected to the broker through an intermediate node X.)
A message published by A that matches the topics for C and D will be received by both. What I want to know is whether the edge B->X will carry the message twice (once for B->X->C and a second time for B->X->D), or whether the AMQP/Qpid framework is intelligent enough to send the message once from B to X and then send copies to each individual subscriber (hence less network traffic on B->X).
What I thought was that, since X knows nothing, and if we have private subscription queues for each subscriber (or even a shared queue with messages browsed/copied instead of consumed), the message will travel twice through B->X.
This question is not specific to Qpid. I would like to know the solutions for other broker-based (RabbitMQ) and brokerless messaging frameworks (ZeroMQ, LBM/UMS). I read in an article that ZeroMQ tries to provide a smarter solution (http://www.250bpm.com/pubsub#toc4), but it seems complicated, since how would intermediate hops know when to send multiple copies or not? (I am not a networking expert, so I might be missing something obvious; any help would be really appreciated.)
I'm assuming X is another Qpid broker, connected to B through the 'federation' feature. That being the case, the message will not be transported twice from B to X.
There are different ways you can configure this, depending on other requirements for the scenario.
The first is to statically link X to B: you create a queue on B for X to subscribe to, bind that queue to the exchange in question such that messages for both C and D will match, then use qpid-route to create a bridge from that queue to the exchange on X. C and D now connect and bind to that exchange on X and will receive the messages published by A as expected. In this case the messages will always flow from B to X regardless of whether C or D are active. If you add another consumer, E, you may need to statically add a binding to the bridged queue on B.
The second option is to use dynamic routing. This will automatically handle the propagation of binding information from X to B such that messages will only flow from B to X if they are needed by an active binding on X.
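For reference, the two setups map onto qpid-route roughly as follows. This is from memory of the qpid-route tool, so treat the exact arguments as assumptions and check qpid-route --help; the host names and the queue name are placeholders:
qpid-route queue add X-host:5672 B-host:5672 amq.topic bridge-queue
qpid-route dynamic add X-host:5672 B-host:5672 amq.topic
The first line creates the static bridge from the queue on B to the exchange on X; the second creates a dynamic route so that bindings on X propagate back to B automatically.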
RabbitMQ will also only propagate a message across an intermediate link such as this once (and it will only get sent at all if some downstream consumer will actually end up seeing the message).
