I know that Storm provides a way to ack every tuple you process, but what if I want to evaluate how Storm performed and get performance information about the topology or bolts (e.g. how many tuples failed to be acked, the latency of each bolt, etc.)?
Where can I find these data?
Storm has a web interface available exactly for that, called Storm UI, which can be started with the
./storm ui
command, and is accessible by default on port 8080.
Lots of useful information and statistics are available there. However, you must know how to interpret them, since there's not much documentation on Storm yet, except for the mailing list and the wiki pages.
I am looking for a monitoring tool for the following use cases:
Collect basic metrics about virtual machine (cpu usage, memory usage, i/o, available space)
Extract metrics from SQL Server (probably running some queries)
Extract information from an external service about processing, i.e. how many processing jobs are currently running and for how long. I am thinking about writing Python scripts, but don't know how to combine them with a monitoring tool
Have the ability to plot charts and manage alerts. It would be nice to be able to send not only emails but also messages to Slack/MS Teams.
I was thinking about Prometheus, because it has wmi_exporter, node_exporter, sql exporter, and Alertmanager with the ability to send notifications to multiple destinations, but I don't know what to do with this external service and the Python scripts.
Any suggestions?
Prometheus can definitely do what you say you need done. Some of it may not be trivial, but you can definitely fill in the blanks yourself.
E.g. you can get machine metrics basically out of the box by firing up a node_exporter and having it scraped by Prometheus, but I don't think it has e.g. information on all running processes. The latter might require you to write an agent/exporter: a simple web server that exposes metrics on /metrics; there exists a Python client library to help with that. Or have said processes (assuming they're your code) push metrics to a Pushgateway instead, if they're short lived batch jobs.
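As a rough illustration of the "simple web server that exposes metrics on /metrics" idea, here is a minimal sketch using only the Python standard library. The metric names, the port, and the `collect_metrics` function are made up for the example; in practice you would query your external service there, and you would most likely use the Python client library mentioned above rather than hand-rendering the text format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_metrics():
    """Hypothetical collector: replace with a real query to the external service."""
    return {"running_jobs": 3, "oldest_job_age_seconds": 42.5}


def render_metrics(metrics):
    """Render a dict of gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; anything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect_metrics()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Block forever, answering scrapes on the given (arbitrary) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the exporter; Prometheus would then be configured to scrape that host and port like any other target.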
Oh, and for charts/dashboards you probably want Grafana, as Prometheus' abilities in that area are rather limited and Grafana integrates rather well with Prometheus.
I have come across a lot of tools to monitor Linux servers which could generate alerts as well when the CPU usage goes alarmingly high, or the disk space goes very low etc.
However, in terms of Ejabberd I couldn’t find an existing module which could do something similar. I am particularly looking to receive alerts pertaining to mnesia getting overloaded, space availability etc. and other basic parameters worth monitoring.
Have a look at Exometer. It can report via SNMP. It doesn't come with the monitoring you're talking about out of the box but you should be able to configure it to report on whatever you need.
SNMP support comes standard with Erlang. You should have a look at Erlang/OTP os_mon. Depending on your needs, it may do what you want out of the box.
We're looking to implement ActiveMQ to handle messaging between two of our servers, over a geographically diverse environment (Australia to the UK and back, via the internet).
I've been looking for some vague indicators of performance round the net but so far have had no luck.
My question: compared to a DIY TCP/SSL implementation of basic messaging, how would ActiveMQ perform? Similar systems of our own can send and receive messages across Australia in 100-150ms, over an SSL layer with an already established connection.
Also, does ActiveMQ persist its TLS/SSL connections, thus saving the substantial amount of time that would otherwise be spent on connection creation/teardown?
What I am hoping is that it will at least perform better than HTTPS, at a per-request level.
I am aware that performance can vary remarkably, depending on hardware, networks, code and so on. I'm just after something to start with.
I know the above is a little fuzzy - if you need any clarification please let me know and I will only be too happy to oblige.
Thank you.
What Tim means is that this is not an apples to apples comparison. If you are solely concerned with the performance of a single point to point connection to transfer data, a direct link will give you a good result (although DIY is still a dubious design decision). If you are building a system that requires the transfer of data and you have more complex functional requirements, then a broker-based messaging platform like ActiveMQ will come into play.
You should consider broker-based messaging if you want:
a post-office style system where a producer sends a message, and knows that it will be consumed at some point, even if there is no consumer there at that time
to not care where the consumer of a message is, or how many of them there are
a guarantee that a message will be consumed, even if the consumer that first handles it dies mid-way through the process (transactions, redelivery)
many consumers, with a guarantee that a message will only be consumed once - queues
many consumers that will each react to a single message - topics
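To make the queue/topic distinction concrete, here is a toy in-memory sketch. It is not how ActiveMQ is implemented, and the class and method names are invented for illustration; it only shows the delivery semantics: a queue hands each message to exactly one consumer (holding it if nobody is listening yet), while a topic hands each message to every current subscriber.

```python
from collections import deque


class Queue:
    """Point-to-point: each message is delivered to exactly one consumer."""

    def __init__(self):
        self._consumers = []
        self._pending = deque()  # messages sent before any consumer existed
        self._next = 0

    def subscribe(self, consumer):
        self._consumers.append(consumer)
        # Post-office behaviour: flush anything that was waiting.
        while self._pending:
            self._deliver(self._pending.popleft())

    def send(self, message):
        if self._consumers:
            self._deliver(message)
        else:
            self._pending.append(message)

    def _deliver(self, message):
        # Round-robin between consumers so each message is handled once.
        consumer = self._consumers[self._next % len(self._consumers)]
        self._next += 1
        consumer(message)


class Topic:
    """Publish/subscribe: every current subscriber sees every message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, subscriber):
        self._subscribers.append(subscriber)

    def publish(self, message):
        for subscriber in self._subscribers:
            subscriber(message)
```

A real broker adds persistence, acknowledgements, transactions and redelivery on top of this, which is where most of the complexity lives.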
These patterns are pretty standard, and apply to all off-the-shelf messaging products. As a general rule, DIY in this domain is a bad idea, as messaging is complex (see http://www.ohloh.net/p/activemq/estimated_cost for an estimate of how long it would take you to do the same), and there are many existing implementations of various flavours (some without a broker) that are all well used, commercially supported and don't require you to maintain them. I would think very hard before going down to the TCP level for any sort of data transfer as there is so much prior art.
How do you isolate a performance issue to a specific component of the application infrastructure? Specifically, are there distinct markers in the result logs that distinguish between bottlenecks at web, application and/or database server levels?
I was asked this question in an interview and went blank on it. Seems this information is not available anywhere.
In addition to SiteScope and other agentless monitoring of system components, you need to make sure your scenario and scripts are working as expected. This includes proper error checking and use of transactions (and a host of other things). If the transactions are granular enough, this will give you insight into at least the requests that have performance issues. Once you have these indicators, work with the infrastructure team to review logs and other information. Being an iterative process, tests can be made to focus on a smaller and smaller section of the infrastructure.
In addition, LoadRunner scripts don't have to be made strictly 'coming in through the front door'. If you have a multi-tiered system, scripts can be made to hit the web/app/database servers directly.
For what to look for, focus on any measurements that have 'knees' or 'hockey stick' type of behaviour. You can hook into any of the server resource type measurements directly in the controller and integrate other teams' stats in the analysis phase. Compare with benchmarks at lower virtual user levels to determine what is acceptable and unacceptable.
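If you export a measurement against load, one simple way to spot a 'knee' programmatically is to find the point that lies furthest below the straight line joining the first and last samples. The sketch below is just one heuristic, not a LoadRunner feature, and the sample numbers are invented for illustration.

```python
def find_knee(loads, values):
    """Return the load at which the curve bends most sharply upwards.

    Both axes are normalised to 0..1, then the point furthest below the
    diagonal from the first to the last sample is picked. Assumes the first
    and last samples differ on both axes.
    """
    x0, x1 = loads[0], loads[-1]
    y0, y1 = values[0], values[-1]
    best_index, best_gap = 0, -1.0
    for i, (x, y) in enumerate(zip(loads, values)):
        nx = (x - x0) / (x1 - x0)
        ny = (y - y0) / (y1 - y0)
        gap = nx - ny  # how far the point sits below the diagonal
        if gap > best_gap:
            best_index, best_gap = i, gap
    return loads[best_index]


# Hypothetical data: virtual users vs. average response time in seconds.
users = [10, 20, 30, 40, 50, 60]
response_time = [0.20, 0.21, 0.22, 0.25, 0.60, 1.40]
knee_at = find_knee(users, response_time)
```

Running the same function over CPU, memory or queue-length counters from each tier can hint at which server's curve bends first.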
Good luck!
If the interview is focused on LoadRunner and SiteScope is being considered, I'd conclude that it's more focused on HP/Mercury solutions. In that case I'd suggest you look into HP Diagnostics and its LoadRunner integration capabilities.
This type of information is usually not available by just looking at the standard results from a performance test.
Parts of the information you are looking for MAY be found by using SiteScope to monitor all the relevant servers in the test. SiteScope offers many counters to look at such as CPU, Memory, Disk I/O and Network I/O - as seen on each server.
This information may give clues as to where the bottleneck is, and the more counters you add to SiteScope, the bigger the chance of pinpointing the bottleneck.
It is a very common misconception that AppServer and DBServer bottlenecks could be identified by just looking at the raw response times or hits, pages etc (web protocol), unless of course the URI accessed defines the exact component(s) in the system...
I need to use one logical PGM-based multicast address in an application, while enabling that application to run "seamlessly" across several different geo-locations (i.e. think US/Europe/Australia).
The application is quite demanding in terms of throughput (several million biz. messages a day) and latency, with a lot of small but very frequently sent messages. Classical Atom pub will not work here due to some external limits on latency.
I have come up with several options to connect those datacenters but can’t find the best one.
Options which I have considered are:
1) Forward multicast messages via VPNs (can a VPN handle such a big load?).
2) Translate all multicast messages to “wrapper messages” and forward them via AMQP.
3) Write specialized in-house gate which tunnels multicast messages via TCP to other two locations.
4) Any other solution
I would prefer option 1 as it does not need additional code to be written by devs, but I’m afraid it will not be a reliable connection.
Are there any rules to apply for such connectivity?
What is the best network configuration, given the geographical layout, for the above constraints?
Just wanted to say hello :)
As for the topic, we don't have much experience with multicasting over WAN. However, my feeling is that PGM + WAN + high volume of data would lead to retransmission storms. A VPN won't make this problem disappear, as all the Australian receivers would, when confronted with missing packets, send NACKs to Europe etc.
The PGM specification does allow for a tree structure of nodes for message delivery, so in theory you could place a single node on the receiving side that would in turn re-multicast the data locally. However, I am not sure whether this kind of functionality is available with the MS implementation of PGM. Alternatively, you can place a Cisco router with PGM support on the receiving side that would handle this for you.
In any case, my preference would be to convert the data to a TCP stream, pass it over the WAN and then convert it back to PGM on the other side. Some code has to be written, but no nasty surprises are to be expected.
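The main piece of code such a gateway needs is message framing, because TCP is a byte stream and does not preserve the boundaries of the multicast messages. Below is a minimal sketch of one common approach, a 4-byte length prefix per message. The names are invented for illustration, and the socket handling on both the PGM side and the TCP side is left out.

```python
import struct

# 4-byte unsigned big-endian length prefix in front of every message.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Wrap one multicast message so it can be written to the TCP stream."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Rebuild whole messages from arbitrary chunks read off the TCP stream."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes):
        """Add received bytes and return the list of messages now complete."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the rest of this message has not arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages
```

The sending gateway would call `frame()` on each message received from the local multicast group, and the receiving gateway would pass whatever `recv()` returns into `feed()` and re-multicast each returned message locally.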
Martin S.
At CohesiveFT we ran into a very similar problem when we designed our "VPN-Cubed" product for connecting multiple clouds up to servers behind our own firewall, in one VPN. We wanted to be able to run apps that talked to each other using multicast, but Amazon EC2, for example, does not support multicast, for reasons that should be fairly obvious if you consider the potential for network storms across a whole data center. We also wanted to route traffic across a wide area federation of nodes using the internet.
Without going into too much detail, the solution involved combining tunneling with standard routing protocols like BGP, and open technologies for VPNs. We used RabbitMQ AMQP to deliver messages in a pubsub style without needing physical multicast. This means you can fake multicast over wide area subnets, even across domains and firewalls, provided you are in the VPN-Cubed safe harbour. It works because it is a 'network overlay', as described in the technical note here: http://blog.elasticserver.com/2008/12/vpn-cubed-technical-overview.html
I don't intend to actually offer you a specific solution, but I do hope this answer gives you confidence to try some of these approaches.
Cheers, alexis