I need to make Zabbix work together with other monitoring systems.
My company uses Zabbix for monitoring. Our partner plans to use a different system.
We need to exchange monitoring data.
I'm interested in integrating with the following systems: BMC Patrol, MS SCOM, NetCool, Portal.
What is the best way to integrate them?
Maybe via SNMP?
Replicate the hosts and metrics in your Zabbix (use the Zabbix trapper item type and also set the Allowed hosts value), then use a suitable zabbix-sender implementation to push the data into Zabbix.
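As a rough illustration of what a zabbix-sender implementation puts on the wire, here is a minimal sketch in Python. The host name, item key and server address are made-up placeholders, and the port default of 10051 assumes the standard trapper port; in practice the stock zabbix_sender binary or an existing library is probably the safer choice.

```python
import json
import socket
import struct


def build_sender_packet(host, key, value):
    """Build one Zabbix sender request as raw bytes."""
    payload = json.dumps({
        "request": "sender data",
        "data": [{"host": host, "key": key, "value": str(value)}],
    }).encode("utf-8")
    # Header: the literal "ZBXD", a flag byte 0x01, then 8 bytes that
    # carry the payload length in little-endian order.
    return b"ZBXD\x01" + struct.pack("<Q", len(payload)) + payload


def send_packet(packet, server, port=10051):
    """Send the packet to a Zabbix server/proxy and return the first chunk of its reply."""
    with socket.create_connection((server, port), timeout=5) as sock:
        sock.sendall(packet)
        return sock.recv(1024)
```

The host and key passed to build_sender_packet must match a host and trapper item that already exist in Zabbix, and the sending machine must be listed in that item's Allowed hosts.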
IMO it's a terrible idea because of latency, syncing, ... Do you really need the data (item values), or do you only need to visualize data from different data sources in one graph?
Regarding BMC Patrol you can use History Loader/Propagator KM to export the monitoring data:
https://docs.bmc.com/docs/display/public/unixlinux912/PATROL+KM+for+History+Loader
or you can use the 'dump_hist' command to dump the history data from the agents:
https://docs.bmc.com/docs/display/pia9600/dump_hist+uility
Regarding Netcool events, there are several ways to get the information. For example, depending on the version, you could get the events from the HTTP interface, as described below:
https://www.ibm.com/support/knowledgecenter/en/SSNFET_9.2.0/com.ibm.netcool_OMNIbus.doc_7.4.0/omnibus/wip/api/reference/omn_api_http_httpinterface.html
Or perhaps you could create a flat file gateway to read the events and write them to a file:
https://www.ibm.com/support/knowledgecenter/en/SSSHTQ/omnibus/gateways/flatfilegw/wip/concept/flatfilegw_intro.html
I am looking for a monitoring tool for the following use cases:
Collect basic metrics about a virtual machine (CPU usage, memory usage, I/O, available space)
Extract metrics from SQL Server (probably by running some queries)
Extract information from an external service about processing, i.e. how many processes are currently running and for how long. I am thinking about writing Python scripts, but I don't know how to combine them with a monitoring tool
Have the ability to plot charts and manage alerts. It would be nice to be able to send not only emails but also messages to Slack/MS Teams.
I was thinking about Prometheus, because it has wmi_exporter, node_exporter, sql exporter, and Alertmanager with the possibility to send notifications to multiple destinations, but I don't know what to do with this external service and the Python scripts.
Any suggestions?
Prometheus can definitely do what you say you need done. Some of it may not be trivial, but you can definitely fill in the blanks yourself.
E.g. you can get machine metrics basically out of the box by firing up a node_exporter and having it scraped by Prometheus, but I don't think it has e.g. information on all running processes. The latter might require you to write an agent/exporter: a simple web server that exposes metrics on /metrics; there is a Python client library to help with that. Or, if said processes are your own code and are short-lived batch jobs, have them push metrics to a Pushgateway instead.
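To make the "simple web server that exposes metrics on /metrics" idea concrete, here is a minimal sketch using only the Python standard library. The metric names, the collect_samples placeholder and port 9200 are all made up for illustration; the official Python client library would normally take care of the formatting and the HTTP part for you.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render {metric_name: value} as Prometheus text-format gauges."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect_samples():
    # Hypothetical placeholder: query the external service here.
    return {"external_jobs_running": 3, "external_oldest_job_seconds": 120.5}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect_samples()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Block forever, answering scrapes on the given (arbitrary) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and adding the machine and port as a scrape target in the Prometheus configuration would be enough for a first experiment.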
Oh, and for charts/dashboards you probably want Grafana, as Prometheus' abilities in that area are rather limited and Grafana integrates rather well with Prometheus.
I work with Docker and Kubernetes.
I would like to collect application-specific metrics from each Docker container.
There are various applications, each running in one or more containers.
I would like to collect the metrics in JSON format in order to perform further processing on each type of metric.
I am trying to understand what the best practice is, if any, and what tools I can use to achieve my goal.
Currently I am looking into several options, none looks too good:
Connecting with kubectl, getting a list of pods, and running a command (exec) in each pod to make the application print/send JSON with metrics. I don't like this option, as it means I need to be aware of which pods exist and access each one, while the whole point of having Kubernetes is to avoid dealing with this issue.
I am looking for Kubernetes API HTTP GET request that will allow me to pull a specific file.
The closest I found is GET /api/v1/namespaces/{namespace}/pods/{name}/log and it seems it is not quite what I need.
And again, it forces me to mention each pod by name.
I am considering using ExecAction in a Probe to send JSON with metrics periodically. It is a hack (as this is not the purpose of a Probe), but it removes the need to handle each specific pod.
I can't use Prometheus for reasons that are out of my control, but I wonder how Prometheus collects metrics. Maybe I can use a similar approach?
Any possible solution will be appreciated.
From an architectural point of view you have 2 options here:
1) pull model: your application exposes metrics data through some mechanism (for instance over HTTP on a different port) and an external tool scrapes your pods at a timed interval (getting pod addresses from the k8s API); this is the model used by Prometheus, for instance.
2) push model: your application actively pushes metrics to an external server, typically a time series database such as InfluxDB, whenever it is most relevant for it to do so.
In my opinion, option 2 is the easiest to implement, because:
you don't need to deal with the k8s APIs in order to discover pod addresses;
you don't need to create a local storage layer to store your metrics (because you push them one by one as they occur);
But there is a drawback: you need to be careful how you implement this. It could make your API slower, and you might need to make the calls to your metrics server asynchronous.
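One way to keep the push model from slowing down request handling is to hand metrics to a background thread through a bounded queue. The sketch below assumes a send callable that you supply (for example a function that writes to your time series database); the class name and the drop-when-full policy are illustrative choices, not a recommendation from the answer above.

```python
import queue
import threading


class MetricPusher:
    """Forward metrics to `send` on a background thread."""

    def __init__(self, send, max_pending=1000):
        self._send = send  # callable(name, value), supplied by the caller
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def push(self, name, value):
        """Queue a metric without blocking; return False if it was dropped."""
        try:
            self._queue.put_nowait((name, value))
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel queued by close()
                break
            try:
                self._send(*item)
            except Exception:
                pass  # a failing metrics server must not break the application

    def close(self):
        """Flush everything queued so far, then stop the worker."""
        self._queue.put(None)
        self._worker.join()
```

Dropping metrics when the queue is full trades completeness for latency; whether that is acceptable depends on how the metrics are used.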
This is obviously a very generic answer, but I hope it could point you in the right direction.
It's a pity you cannot use Prometheus, but it's a good lead for what can be done in this scope. What Prometheus does is as follows:
1: it assumes that the metrics you want to scrape (collect) are exposed on some HTTP endpoint that Prometheus can access.
2: it connects to the Kubernetes API to discover the endpoints to scrape metrics from. There is a config for it, but generally it means it has to be able to connect to the API, list services/deployments/pods, and analyze their annotations (which carry the info about metrics endpoints) to compose a list of places to scrape data from.
3: periodically (every 15s, 60s, etc.) it connects to the endpoints and collects the exposed metrics.
That's it. Rest is storage/postprocessing. The kube related part might be a significant amount of work to do though, so it would be way better to go with something that already exists.
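The discovery and scrape steps above can be sketched roughly as follows. The input is assumed to be the parsed JSON of a pod list from the Kubernetes API; the prometheus.io/* annotation names are a widely used convention rather than something Kubernetes enforces, and the default port 8080 is an arbitrary assumption.

```python
import urllib.request


def scrape_targets(pod_list):
    """Pick scrape URLs from a pod list, based on pod annotations."""
    targets = []
    for pod in pod_list.get("items", []):
        annotations = pod.get("metadata", {}).get("annotations") or {}
        ip = pod.get("status", {}).get("podIP")
        # Skip pods that did not opt in or have no IP assigned yet.
        if annotations.get("prometheus.io/scrape") != "true" or not ip:
            continue
        port = annotations.get("prometheus.io/port", "8080")
        path = annotations.get("prometheus.io/path", "/metrics")
        targets.append(f"http://{ip}:{port}{path}")
    return targets


def scrape(url, timeout=5):
    """Fetch whatever the endpoint exposes and return it as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Running scrape over scrape_targets on a timer gives the basic loop; authentication to the API, retries and storage are left out here.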
Sidenote: while this is generally a pull-based model, there are cases where pull is not possible (e.g. short-lived scripts such as PHP). That is where the Prometheus Pushgateway comes into play: it lets you push metrics to a place that Prometheus will pull from.
I am trying to write a small script that will help me automate some of my IT tasks related to VLAN management.
I do not want to log in to my switch via the command line. I want to send commands to it and get a response (over the network).
Are there any alternatives? I have started to search the web, but so far I have not found anything.
I know SNMP is an option for getting information, but I want to check other alternatives.
thanks.
You can try NETCONF (Network Configuration Protocol), an RPC-based management protocol that is supported by Cisco and many other vendors.
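To give a feel for what "RPC-based" means here, the sketch below builds a NETCONF get-config request as XML using only the Python standard library. It only constructs the message; actually talking to a switch requires a NETCONF session (normally over SSH), for which a dedicated client library is the usual route. The function name and the message-id value are illustrative.

```python
import xml.etree.ElementTree as ET

# Base NETCONF XML namespace.
NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
ET.register_namespace("", NETCONF_NS)


def build_get_config(message_id, source="running"):
    """Return a <get-config> RPC for the given datastore as an XML string."""
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": str(message_id)})
    get_config = ET.SubElement(rpc, f"{{{NETCONF_NS}}}get-config")
    src = ET.SubElement(get_config, f"{{{NETCONF_NS}}}source")
    ET.SubElement(src, f"{{{NETCONF_NS}}}{source}")
    return ET.tostring(rpc, encoding="unicode")


print(build_get_config(101))
```

The device answers with an rpc-reply carrying the same message-id, which is what makes it convenient to drive from a script.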
SNMP is the only widely and commonly used option here.
You can use WMI to manage Windows-based infrastructure.
There is also the legacy syslog protocol (RFC 3164), which is UDP-based.
For traffic monitoring and billing purposes there are NetFlow, sFlow, jFlow, IPFIX and RADIUS protocols.
There are some other protocols but mostly proprietary.
So I'd suggest using SNMP, which is nowadays the de facto standard in the network monitoring domain.
You might look at Expect as a scripting language solution. It is commonly used to do exactly what you need:
log into device (with result cases)
execute commands
save config
logout
As you build out a script library, tasks become simplified as you could do things like run scripts with parameters and have Expect do all the detail work.
See the Wikipedia article for an overview.
I have also used SNMP for this kind of thing but the functionality is different because you are using an SNMP read-write privilege to upload new parts or complete configs, saving the running config to flash and/or saving the config off-device.
Try NETCONF with YANG, which is currently the best option for network device configuration. More about SNMP alternatives:
https://bestmonitoringtools.com/top-snmp-alternatives-because-snmp-is-dying/
I have a cluster of servers monitored with ganglia and I just added a new application on one of them. The new application uses an SNMP handler to report about its activity.
I have never used SNMP before and would like to gather most of the SNMP metrics that I see in the MIB file into the RRD files used by Ganglia.
Would this be possible?
I wanted to write a new Ganglia module that would take into account the new metrics from the new application. So I tried to read the code of the application's SNMP handler, but I cannot work out where it takes its information from.
What would be a good way to figure this out?
Thank you!
I need to monitor several Linux servers placed in a different location from my farm.
I have VPN connection to this remote location.
Internally I use Zenoss 4 to monitor the systems, and I would like to use Zenoss to monitor the remote systems too. For contractual reasons, I cannot use the VPN connection for Zenoss data (e.g. SNMP or SSH).
What I created is a bunch of scripts that fetch the desired data from the remote systems to an internal server. The returned data is one CSV file per location, containing data from all appliances placed in that location.
For example:
$ cat LOCATION_1/current/current.csv
APPLIANCE1,out_of_memory,no,=,no,3,-
APPLIANCE1,postgre_idle,no,=,no,3,-
APPLIANCE2,out_of_memory,no,=,no,3,-
APPLIANCE2,postgre_idle,no,=,no,3,-
The format of the CSV is this:
HOSTNAME,CHECK_NAME,RESULT_VALUE,COMPARE,DESIRED_VALUE,INFO
How can I integrate this data into Zenoss, as if the machines were located in the internal farm?
If necessary, I could change the format of the fetched data.
Thank you very much
One possibility is for your internal server that communicates with remote systems (let's call it INTERNAL1) to re-issue the events as SNMP traps (or write them to the rsyslog file) and then process them in Zenoss.
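As a rough sketch of the re-issuing step, the code below turns rows of the CSV from the question into one-line messages and shows how they could be handed to syslog from Python. The message layout (a bracketed host tag followed by the check result) and the syslog address are assumptions for illustration; only the first five CSV columns are used, and any trailing columns are ignored.

```python
import csv
import io
import logging
import logging.handlers


def rows_to_messages(csv_text):
    """Turn CSV rows into messages like '[HOST] check: result (expected = value)'."""
    messages = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        host, check, result, compare, desired = row[:5]
        messages.append(f"[{host}] {check}: {result} (expected {compare} {desired})")
    return messages


def make_syslog_logger(address=("localhost", 514)):
    """Return a logger that forwards records to a syslog daemon over UDP."""
    logger = logging.getLogger("remote-checks")
    logger.addHandler(logging.handlers.SysLogHandler(address=address))
    logger.setLevel(logging.INFO)
    return logger
```

Each message from rows_to_messages could then be passed to the logger's warning() or error() method on INTERNAL1, from where the local syslog daemon forwards it to Zenoss.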
For example, the message can start with the name of the server: "[APPLIANCE1] Out of Memory". In the "Event Class transform" section of your Zenoss web interface (http://my_zenoss_install.local:8080/zport/dmd/Events/editEventClassTransform), you can transform attributes of incoming messages (using Python). I frequently use this to lower the severity of an event. E.g.,
if evt.component == 'abrt' and evt.message.find('Saved core dump of pid') != -1:
    evt.severity = 2 # was originally 3, I think
For your needs, you can set the evt.device to APPLIANCE1 if the message comes from INTERNAL1, and contains [APPLIANCE1] tag as the message prefix, or anything else you want to use to uniquely identify messages/traps from remote systems.
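The re-tagging logic could look roughly like the sketch below. It is written as a plain function so it can be tried outside Zenoss; inside an event transform the evt object is provided for you and only the body would be needed. The relay name INTERNAL1 and the bracketed-prefix convention come from this answer, the function name is made up, and stripping the tag from the message is an optional extra.

```python
import re
from types import SimpleNamespace

# Matches a leading "[HOSTNAME]" tag plus any whitespace after it.
TAG = re.compile(r"^\[(?P<host>[^\]]+)\]\s*")


def retag(evt, relay="INTERNAL1"):
    """Reassign events relayed by `relay` to the host named in the message tag."""
    if evt.device != relay:
        return evt  # not a relayed event, leave it alone
    match = TAG.match(evt.message)
    if match:
        evt.device = match.group("host")
        evt.message = evt.message[match.end():]  # drop the tag from the text
    return evt


# Stand-in for a Zenoss event, for trying the logic locally.
example = SimpleNamespace(device="INTERNAL1", message="[APPLIANCE1] Out of Memory")
retag(example)
print(example.device, "|", example.message)
```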
I don't claim this to be the best way of achieving your goal. My knowledge of Zenoss is strictly limited to what I currently need to use it for.
P.S. here is a rather old document from Zenoss about using event transforms. Unfortunately documentation in Zenoss is sparse and scattered (as you may have already learned), so searching old posts and/or asking questions on the Zenoss forum may be necessary.
You can simply deploy a collector in the remote location and add that host to the collector pool; then you can monitor the remote Linux servers as well.