Monitoring applications on Zabbix

Hi everyone, I am trying to monitor an application for transactions in Zabbix. I have already installed the agent on the Windows server where the application is running. However, I only get stats such as CPU, disk space, network, etc. I also want to monitor transactions and interfaces. How do I go about it?

I would consider using user parameters if you want a pure Zabbix solution: have your application output its transaction status somewhere the agent can read, and capture that with a UserParameter (a rough sketch follows below). This approach can be cumbersome.
If you are not looking for a pure Zabbix solution, I would look at a tool that supports transactional monitoring, something like http://wdt.io .
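As an illustrative sketch only (the log path, line format, and item key below are assumptions about your application, not anything Zabbix ships with), the agent-side command behind such a UserParameter could be a small Python script that counts recent successful transactions:

    # check_transactions.py - counts completed transactions in the last 5 minutes.
    # The log path and line format are placeholders for whatever your app writes.
    # zabbix_agentd.conf would then carry something like:
    #   UserParameter=app.transactions.completed,python C:\scripts\check_transactions.py
    import re
    import time

    LOG_PATH = r'C:\MyApp\logs\transactions.log'
    WINDOW_SECONDS = 300
    # Assumed line format: "2024-01-01 12:00:00 TRANSACTION OK id=123"
    LINE_RE = re.compile(r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) TRANSACTION OK')

    now = time.time()
    count = 0
    with open(LOG_PATH, encoding='utf-8') as log:
        for line in log:
            match = LINE_RE.match(line)
            if not match:
                continue
            ts = time.mktime(time.strptime(match.group(1), '%Y-%m-%d %H:%M:%S'))
            if now - ts <= WINDOW_SECONDS:
                count += 1

    print(count)  # Zabbix uses whatever the command prints as the item value

You would then create an item with that key and put triggers on it, just like any built-in item.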

Related

What is recommended solution for monitoring heterogeneous infrastructure?

I am looking for monitoring tool for the following use cases:
Collect basic metrics about virtual machine (cpu usage, memory usage, i/o, available space)
Extract metrics from SQL Server (probably running some queries)
Extract information from an external service about processing, i.e., how many processes are currently running and for how long. I am thinking about writing Python scripts, but I don't know how to combine them with a monitoring tool
Have the ability to plot charts and manage alerts, and it would be nice to be able to send not only emails but also messages to Slack/MS Teams.
I was thinking about Prometheus, because it has wmi_exporter, node_exporter, sql exporter, and Alertmanager with the possibility to send notifications to multiple destinations, but I don't know what to do with this external service and the python scripts.
Any suggestions?
Prometheus can definitely do what you say you need done. Some of it may not be trivial, but you can definitely fill in the blanks yourself.
E.g. you can get machine metrics basically out of the box by firing up a node_exporter and having it scraped by Prometheus, but I don't think it has e.g. information on all running processes. The latter might require you to write an agent/exporter: a simple web server that exposes metrics on /metrics; there exists a Python client library to help with that. Or have said processes (assuming they're your code) push metrics to a Pushgateway instead, if they're short lived batch jobs.
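For the external service, a rough sketch of such an exporter using the prometheus_client Python library (the service URL, JSON fields, port, and metric names are illustrative assumptions about your setup):

    # Minimal custom exporter: polls an external service and exposes /metrics.
    import time
    import requests
    from prometheus_client import Gauge, start_http_server

    RUNNING_JOBS = Gauge('external_processing_jobs_running',
                         'Jobs currently running on the external service')
    LONGEST_JOB_SECONDS = Gauge('external_processing_longest_job_seconds',
                                'Age of the oldest currently running job')

    def collect_once():
        # Placeholder endpoint returning e.g. [{"id": 1, "age_seconds": 42}, ...]
        resp = requests.get('http://processing-service.local/api/jobs', timeout=5)
        jobs = resp.json()
        RUNNING_JOBS.set(len(jobs))
        LONGEST_JOB_SECONDS.set(max((j['age_seconds'] for j in jobs), default=0))

    if __name__ == '__main__':
        start_http_server(9200)   # Prometheus scrapes this port at /metrics
        while True:
            collect_once()
            time.sleep(15)

Point a scrape job at that port and you can chart and alert on those gauges like any other metric.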
Oh, and for charts/dashboards you probably want Grafana, as Prometheus' abilities in that area are rather limited and Grafana integrates rather well with Prometheus.

Zabbix integration with prometheus

We are currently monitoring our network devices with Zabbix, but now we want to use Zabbix along with Prometheus to get Prometheus's real-time monitoring and powerful alerting.
How can I integrate my existing Zabbix solution with Prometheus?
There seems to be a Zabbix-to-Prometheus exporter that may achieve what you want, but note that I wouldn't recommend it.
Apart from some temporary migration scenarios, I see little use in polling one monitoring system from the other. You're probably better off deploying the appropriate Prometheus exporters (e.g. the SNMP exporter, if you're talking about network devices) and using Prometheus for the whole monitoring setup.
Of course you can still keep your Zabbix setup running side by side, if you need to.

openstack overkill for HA website stack?

Some background:
I'm building a pretty involved website (as far as the stack is concerned). Components, among some other smaller stuff, include:
Elasticsearch
Redis
ZeroMQ
Couchbase
RethinkDB
traffic through Nginx -> Node
The intention is to have a highly available website running while staying pretty lean (and low cost) at the same time.
Current topology I'm considering:
2 webservers in active/active config with DNS load balancing (Nginx, static asset serving, etc.), plus load balancing to the second tier:
2 appservers in active/active. Most of the components, like Elasticsearch, can do sharding/replication themselves, so this should not be too hard to set up (fingers crossed)
session handling in replicated Redis
Naturally I want monitoring and alerting when something is wrong, and ideally the system should be able to handle failures automatically. Stuff like promoting Redis from slave to master, or even initializing a new EC2 instance, if I were on EC2, that is.
However, I want to be independent of any particular hosting provider, which I believe (please correct me if I'm wrong) is where OpenStack comes in.
Is it correct that:
- OpenStack allows me to control the entire lifecycle of my website stack (covering multiple boxes / virtual machines)?
- Does it allow me (with configuration work, of course) to spin up instances, monitor, alert when something goes wrong, take appropriate actions in those scenarios, etc.?
Or is OpenStack just entirely the wrong tool for the job? Is there anything else that would fit better as a sort of "management layer" on top of my entire website?
Thanks
OpenStack isn't VMware ESX. It's not a very good straight-up, simple virtual machine hosting environment. If what you want is a way to easily manage virtual machines, I might suggest Ganeti. It even has HA failover of virtual machines. In a two-physical-host environment, this is probably the way to go.
What OpenStack gives you that Ganeti won't is RESTful APIs. It has AWS-compatible APIs, and it has native OpenStack APIs that are even better. If you want to automate elasticity or self-healing, this is huge. Being able to hook in from Python using the existing client APIs and just write scripts that spin up instances as needed is what DevOps is all about (a rough sketch below).
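For example, a minimal sketch using the openstacksdk Python library (the cloud name, image, flavor, and network names are placeholders; credentials come from a clouds.yaml file):

    # Sketch: spin up an extra app server on demand.
    import openstack

    conn = openstack.connect(cloud='mycloud')   # reads clouds.yaml credentials

    image = conn.compute.find_image('ubuntu-22.04')
    flavor = conn.compute.find_flavor('m1.small')
    network = conn.network.find_network('private')

    server = conn.compute.create_server(
        name='appserver-3',
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{'uuid': network.id}],
    )
    server = conn.compute.wait_for_server(server)
    print(server.status, server.addresses)

Glue a script like that to your monitoring alerts and you have the beginnings of the automated failure handling you describe.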
So I guess it comes down to what your level of commitment is and what you need. For 2 physical machines, OpenStack probably isn't the best solution. But down the line, when you've got more apps and more VMs than you can manage manually, OpenStack will be there to help you write code that makes your datacenter dance to your melodic tunes.

Emulate incoming network messages for Indy

Is it possible to emulate incoming messages using Indy (if it's of any importance: I'm using Indy 10 and Delphi 2009)? I want to be able to create these messages locally and I want Indy to believe that they come from specific clients in the network. All the internal Indy handling (choice of the thread in which the message is received and stuff like that) should be exactly the same as if the message would have arrived over the network.
Any ideas on that? Thanks in advance for any tips.
What you want to do has nothing to do with Indy; you would need to do this at a much lower level. The easiest way to make Indy believe that messages come from a specific client is to inject properly prepared packets into the network stack. Read up on TCP packet injection on Google or Wikipedia. Ettercap is one such tool that allows you to inject packets into established connections. However, this is definitely going into gray areas, as some of these tools are illegal in some countries.
Anyway, all of this is IMHO much too complicated. I don't know what exactly you want to do, but a specially prepared client or server is a much better tool to emulate certain behaviour while developing server or client applications. You can run them locally, or if you need to have different IP addresses or subnets you can do a lot with virtual machines.
Indy doesn't have any built-in mechanisms for this but thinking off the top of my head I would recommend building a small test application (or a suite) that runs locally on your development machine and connects to your Indy server application to replay messages.
It should be irrelevant to your Indy server application whether a TCP connection is made locally or from a remote host, as the mechanism by which a server thread is created and a command processed is identical in both scenarios.
My last gig involved Indy, and all our testing was done with a similar Resender-type application that would load local message files and send them to the Indy server app (roughly sketched below).
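The resender idea isn't Indy-specific; purely to illustrate the replay loop (in Python rather than Delphi, with the host, port, and file layout as placeholders), it boils down to something like:

    # Rough sketch of a "resender" test client: replays saved message files
    # against a locally running server.
    import socket
    from pathlib import Path

    HOST, PORT = '127.0.0.1', 6000        # placeholder server address
    MESSAGE_DIR = Path('test_messages')   # placeholder directory of captured messages

    for msg_file in sorted(MESSAGE_DIR.glob('*.bin')):
        payload = msg_file.read_bytes()
        with socket.create_connection((HOST, PORT), timeout=5) as sock:
            sock.sendall(payload)
            reply = sock.recv(4096)
        print(f'{msg_file.name}: sent {len(payload)} bytes, got {len(reply)} bytes back')

The Delphi/Indy equivalent would use a TIdTCPClient doing the same connect/send/read cycle.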
HTH and good luck!
One thing you can do would be to create virtual machines to run your test clients; that way they will not be seen as the "local machine", and it's fairly simple to create a complex network with VMs, provided you have enough memory and disk space. The other advantage of testing with VMs is that you can eliminate the development environment completely when it's time to focus on deployment. It's amazing how much time that alone saves.
VirtualPC is a free download from Microsoft and works fairly well. VMware is another option, but costs a little more to get started. For development purposes, I prefer the desktop versions, but the server versions also work well. You will still need a license to install the virtual OS. An MSDN membership is probably the cheapest way to go, and allows you to build test environments for other flavors of the OS.
Indy has an abstract stack mechanism for cross-platform support (IdStack.pas). I think you could hook into the Windows stack (IdStackWindows.pas): it is a class, so you could even consider deriving from it and overriding some functions to implement the hack.

What are the requirements for an application health monitoring system?

What, at a minimum, should an application health-monitoring system do for you (the developer) and/or your boss (the IT Manager) and/or the operations (on-call) staff?
What else should it do above the minimum requirements?
Is monitoring the 'infrastructure' applications (MS Exchange, Apache, etc.) sufficient, or do individual user applications, web sites, and databases also need to be monitored?
If the latter, what do you need to know about them?
ADDENDUM: Thanks for the input. I was really looking for application-level monitoring, not infrastructure monitoring, but it is good to know about both.
Whether the application is running.
Unusual cpu/memory/network usage.
Report any unhandled exceptions.
Status of various modules (if applicable).
Status of external components (databases, webservices, fileservers, etc.)
Number of pending background tasks (if applicable).
Maybe track usage of the application and report statistics on the most/least used functionality so you know where optimizations are most beneficial.
The answer is 'it depends'. Why do you need to monitor? How large is your operations staff? Do you need reporting? What is the application environment? Who cares if the application fails? Who cares if an exception happens? Are any of the errors recoverable? I could ask questions like these for a long time.
Great question.
We were looking for an application-level monitoring solution for our needs some time ago, without any luck. Popular monitoring solutions are mostly aimed at monitoring infrastructure and, in my opinion, they are too complicated for the requirements of most small and mid-sized companies.
We required (mainly) the following features:
alerts - we wanted to know about incidents as fast as possible
painless management - a hosted service would be best
visualizations - it's good to know what is going on and learn something from the data
Because we didn't find a suitable solution, we started writing our own. We finally ended up with a running service called AlertGrid. (You can check it out for free, of course.)
The idea behind it is to provide an easy way to handle custom monitoring scenarios. The integration API is very simple (one function with two required parameters). At the moment we and others are using it to:
monitor scheduled tasks (cron jobs)
monitor entire application logic execution
alert on errors in applications
we are also working on examples of basic infrastructure monitoring using AlertGrid
This is such an open ended question, but I would start with physical measurements.
1. Are all the machines I think are hosting this site pingable?
2. Are all the machines which should be serving content actually serving some content? (Ideally this would be hit from an external network.)
3. Is each expected service on each machine running?
3a. Have those services run recently?
4. Does each machine have hard drive space left? (Don't forget the db)
5. Have these machines been backed up? When was the last time?
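Just to make the first few of these concrete, a rough Python sketch (the hostnames, content URL, probe port, and free-space threshold are all placeholders for your own environment):

    # Sketch of a first pass at the "physical" checks above.
    import shutil
    import socket
    import urllib.request

    HOSTS = ['web1.example.com', 'web2.example.com', 'db1.example.com']
    CONTENT_URL = 'https://www.example.com/'
    MIN_FREE_BYTES = 5 * 1024**3   # 5 GiB

    def host_reachable(host, port=22, timeout=3):
        # Check 1: is the machine at least reachable on a known port?
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def serving_content(url, timeout=5):
        # Check 2: is content actually being served (ideally probed externally)?
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200 and len(resp.read()) > 0
        except OSError:
            return False

    def local_disk_ok(path='/'):
        # Check 4: is there hard drive space left? (run on each machine)
        return shutil.disk_usage(path).free > MIN_FREE_BYTES

    if __name__ == '__main__':
        for h in HOSTS:
            print(h, 'reachable' if host_reachable(h) else 'UNREACHABLE')
        print('content ok' if serving_content(CONTENT_URL) else 'CONTENT CHECK FAILED')
        print('disk ok' if local_disk_ok() else 'LOW DISK SPACE')

Checks 3, 3a and 5 usually need per-service probes (service manager status, backup timestamps) rather than anything generic.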
Once one lays out the physical monitoring of the systems, one can address checks specific to a system:
1. Can an automated script log in? How long did it take?
2. How many users are live? Have there been a million fake accounts added?
...
These sorts of questions get more nebulous and can be very system-specific. They also can usually be derived reactively when responding to physical measurements. The hard drive fills up; maybe the web server logs got filled up because a bunch of agents created too many fake users. That kind of thing.
While plan A shouldn't necessarily be reactive, it is the way many a site sets up its monitoring system.
Minimum: make sure it is running :)
However, some other stuff would be very useful. For example, the CPU load, RAM usage and (in multi-user systems) which user is running what. Also, for applications that access the network, a list of network connections for each app. And (if you have access to the client computers) it would be cool to be able to see the window title of the app - maybe check every 2-3 minutes whether it has changed and save it. Also, a list of files open by the application could be very useful, but it is not a must (a rough sketch of these per-process details follows below).
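A rough sketch of collecting that kind of per-process detail with the third-party psutil library (the application name is a placeholder, and the window title would need a platform-specific API on top of this):

    import psutil

    TARGET = 'myapp.exe'   # placeholder application name

    for proc in psutil.process_iter(['pid', 'name', 'username', 'cpu_percent', 'memory_info']):
        info = proc.info
        if info['name'] != TARGET:
            continue
        mem = info['memory_info']
        print(info['pid'], info['username'], info['cpu_percent'],
              mem.rss if mem else 'n/a')
        try:
            print('  open files:', [f.path for f in proc.open_files()])
            print('  connections:', [(c.laddr, c.raddr, c.status) for c in proc.connections()])
        except psutil.AccessDenied:
            print('  (elevated privileges needed for open files / connections)')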
I think this is fairly simple - monitor so that you can be warned early enough before something goes wrong. That means monitor dependencies and the application itself.
It's really hard to provide specifics if you're not going to give details on the application you're monitoring, so I'd say use that as a general rule.
At a minimum you want to know that the system is healthy. What counts as healthy is subjective: is it that the computers are up, the needed resources exist, the data is flowing through the system, the data is producing correct results, etc.?
In my project we monitor most of this and then some. It really comes down to the highest level at which you can verify that everything is working. In our case we need to go all the way down to the data output. If you only need to know whether the machines are up, that saves you from trying to show an inexperienced end user what is wrong.
There are also off-the-shelf tools that will do a lot of the hard work for you if you are not digging too deeply into data results. I particularly liked Nagios when I was looking around, but we needed more than it could easily show, so I wrote our own monitoring system. Basically we also watch for peculiarities in the system, memory/CPU spikes, etc.
Thanks everyone for the input. I was really looking for application-level monitoring, not infrastructure monitoring, but it is good to know about both.
The difference is:
infrastructure monitoring would be servers plus MS Exchange Server, Apache, IIS, and so forth
application monitoring would be user machines and the specific programs that they use to do their jobs, and/or servers plus the data-moving/backend applications that they run to keep the data flowing
sometimes it's hard to draw the line - an oversimplified definition might be "if your team wrote it, it's an application; if you bought it, it's infrastructure"
I think in practice it is best to monitor both.
What you need to do is break down the business process of the application and then have the software emit events at major business components. In addition, you'll need to create end-to-end synthetic transactions (e.g. emulating end users clicking through a website). All that data would be fed into a monitoring tool. In the past, I've exposed JMX for applications, which flowed into Tivoli Monitoring's JMX adapter, and I've written scripts that implement a "fake user" and pipe the results into Tivoli Monitoring's script adapter. Tivoli Monitoring takes the raw data and creates application health and performance charts from it (a sketch of such a synthetic transaction follows below).
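Purely as an illustration of the "fake user" idea (the URLs, credentials, and success check are placeholders; how you feed the result into your tool, Tivoli script adapter or otherwise, is up to you):

    # Sketch of an end-to-end synthetic transaction: a scripted user logging in
    # and loading a page, reporting success and duration.
    import time
    import requests

    BASE_URL = 'https://app.example.com'   # placeholder application URL

    def run_synthetic_transaction():
        session = requests.Session()
        start = time.monotonic()
        login = session.post(f'{BASE_URL}/login',
                             data={'user': 'monitor', 'password': 'secret'},
                             timeout=10)
        dashboard = session.get(f'{BASE_URL}/dashboard', timeout=10)
        elapsed = time.monotonic() - start
        healthy = login.ok and dashboard.ok and 'Welcome' in dashboard.text
        return healthy, elapsed

    if __name__ == '__main__':
        ok, seconds = run_synthetic_transaction()
        # Pipe this into whatever adapter your monitoring tool provides.
        print(f'synthetic_login_ok={int(ok)} synthetic_login_seconds={seconds:.2f}')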
