This page describes the components of the network monitoring infrastructure deployed in the DORII project, which was built by installing and configuring a number of publicly available monitoring tools and by developing a unique monitoring interface. It provides some guidelines concerning:
According to the instructions on the SmokePing web site, there are many prerequisites, most of which are already available in RPM format. The following packages should therefore be installed using YUM:
The packages providing dig and ssh should already be installed by default.
Concerning the Perl modules, the related packages are:
SpeedyCGI is contained in the perl-CGI-SpeedyCGI package. Finally, Perl and the Apache web server should already be on the machine by default.
SmokePing can be used in Master-Slave mode (gathering all the data on a central server) or as a standalone instance (useful for testing, or in case Master-Slave mode is not working).
Master-Slave mode is described here. The configuration is very easy: the configuration file does not need any tweaking, since it is downloaded from the Master host. So:
<smokeping_dir>/bin/smokeping --master-url=http://monitor2.cnit.it/cgi-bin/smokeping.cgi --cache-dir=/var/lib/smokeping/ --shared-secret=<smokeping_dir>/etc/smokeping/slavesecrets.conf --slave-name=RemoteUserProbeName --debug-daemon
Some explanations about the options:
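As a minimal sketch, the invocation above can be assembled in Python so that each option is documented next to its value. The helper name `build_slave_command` and its parameters are assumptions made for illustration. The flags, the master URL and the file paths are the ones shown in the command above, and the comments are a short reading of each option based on the SmokePing usage text, not an authoritative description.

```python
def build_slave_command(smokeping_dir, slave_name,
                        master_url="http://monitor2.cnit.it/cgi-bin/smokeping.cgi",
                        cache_dir="/var/lib/smokeping/",
                        debug=True):
    """Return the argument list for starting SmokePing in slave mode.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    base = smokeping_dir.rstrip("/")
    args = [
        base + "/bin/smokeping",
        # URL of smokeping.cgi on the Master host; passing it selects slave mode.
        "--master-url=" + master_url,
        # Directory where the slave stores temporary data.
        "--cache-dir=" + cache_dir,
        # File containing the secret shared with the Master.
        "--shared-secret=" + base + "/etc/smokeping/slavesecrets.conf",
        # Name under which this slave identifies itself to the Master.
        "--slave-name=" + slave_name,
    ]
    if debug:
        # Start the daemon with debugging enabled.
        args.append("--debug-daemon")
    return args


if __name__ == "__main__":
    # /opt/smokeping is only an example install path; adjust it as needed.
    print(" ".join(build_slave_command("/opt/smokeping", "RemoteUserProbeName")))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids shell quoting problems when the slave name or paths contain unusual characters.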
Once the SmokePing archive has been downloaded and unpacked (for example in /opt/smokeping-2.4.2, then renamed to /opt/smokeping), as suggested by the original guide, some parameter tweaking is needed:
Now you can try your brand new SmokePing installation by going back to the official guide.
TODO
Pathload is a tool to estimate the available bandwidth of a network path. The basic idea behind Pathload is that the one-way delays of a periodic packet stream show an increasing trend when the stream rate is larger than the available bandwidth. The measurement algorithm is iterative and requires the cooperation of both the sender and the receiver. Pathload is non-intrusive, meaning that it does not cause significant increases in network utilization, delays, or losses. The tool has been verified experimentally by comparing its results with SNMP utilization data from the path routers.
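The idea of detecting an increasing delay trend and iterating on the stream rate can be illustrated with a small sketch. This is a simplified illustration and not Pathload's actual implementation: the trend test (fraction of consecutive delay pairs that increase), the 0.66 threshold, the bisection search, and the `simulated_probe` toy path model are all assumptions chosen to keep the example short.

```python
def pairwise_increase_fraction(delays):
    """Fraction of consecutive one-way delay pairs in which the delay grows."""
    if len(delays) < 2:
        raise ValueError("need at least two delay samples")
    increases = sum(1 for a, b in zip(delays, delays[1:]) if b > a)
    return increases / (len(delays) - 1)


def shows_increasing_trend(delays, threshold=0.66):
    """Illustrative trend test; the threshold is an assumed value."""
    return pairwise_increase_fraction(delays) > threshold


def estimate_available_bandwidth(probe, low, high, resolution=1.0):
    """Narrow [low, high] around the available bandwidth by bisection.

    `probe(rate)` must return the one-way delays measured for a periodic
    stream sent at `rate` (same unit as `low` and `high`).
    """
    while high - low > resolution:
        rate = (low + high) / 2.0
        if shows_increasing_trend(probe(rate)):
            high = rate   # stream rate exceeds the available bandwidth
        else:
            low = rate    # stream rate is still sustainable
    return low, high


def simulated_probe(rate, available=40.0, packets=20):
    """Toy path model: delays grow linearly once rate exceeds `available`."""
    growth = max(0.0, rate - available) * 0.01
    return [1.0 + growth * i for i in range(packets)]


if __name__ == "__main__":
    print(estimate_available_bandwidth(simulated_probe, 0.0, 100.0))
```

In a real measurement, `probe` would be replaced by the sender transmitting a stream at the requested rate and the receiver reporting the observed one-way delays, which is why both ends have to cooperate.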
DORII application sites (EUCENTRE, OGS, ELETTRA, UC, ECO-CdP) act as pathload_receiver and export results to the CNIT Monitoring Platform using the Lynx web client (check that you have it installed), whereas DORII sites hosting CE and SE (PSNC, GRNET, ELETTRA, CSIC-IFCA) are required to run pathload_sender.
If a firewall is present, it is necessary to open some destination and source ports. Please take into account that Pathload is a client-server application that exchanges data through:
The sender listens on port 55002, and data are sent to the UDP 55001 destination port. For the sake of clarity, the following figure shows how Pathload works with two receivers and one sender. Data are reported to the data collector at CNIT (monitor2.cnit.it). The types of traffic flows and the TCP/UDP ports are highlighted.
Various applications exist to collect and consolidate network usage information. At a basic level, these applications (also called managers) use SNMP (Simple Network Management Protocol) to read statistics from each monitored device (router or host) on which an SNMP agent is configured and running. A standard Management Information Base (MIB) provides counters of the number of datagrams and bytes sent and received on each interface of a router, and also gives the number of packets discarded because of congestion. An SNMP application can periodically poll each router and device and convert the returned information into a view of usage across the whole network. SNMP can also help identify network interface failures or outage conditions. SNMP has been enabled on the interfaces of CEs, SEs and IEs.
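To show how polled counters turn into a usage figure, the sketch below computes link utilization from two successive readings of an interface byte counter. It assumes a 32-bit counter such as the standard `ifInOctets` or `ifOutOctets` objects; this page does not state which objects are actually polled, and the function names and sample numbers are illustrative only.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps back to 0 at this value


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two counter readings, allowing for a single wrap."""
    return (current - previous) % modulus


def utilization_percent(prev_octets, curr_octets, interval_s, link_bps):
    """Average link utilization (percent) between two polls.

    prev_octets, curr_octets: byte counter values from consecutive polls
    interval_s: seconds elapsed between the two polls
    link_bps: nominal interface speed in bits per second
    """
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_s * link_bps)


if __name__ == "__main__":
    # Made-up sample: 37.5 MB transferred in 5 minutes on a 100 Mbit/s link.
    print(utilization_percent(0, 37_500_000, 300, 100_000_000))
```

The modulo in `counter_delta` only copes with one wrap between polls, so the polling interval has to be short enough for the link speed in question.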
The procedure and the commands to enable SNMP are strictly dependent on the operating system (OS) running on the device under consideration. For Linux-like OSs, the following steps are necessary:
Possible issues that could prevent SNMP from working properly are the following:
You can download the SNMP configuration file for Linux systems from here
A Nagios server was installed at CNIT (Savona) and at GRNET to monitor every resource (Gateways, CEs, IEs, SEs) belonging to the DORII infrastructure. PING is used to check whether every grid resource is up and running. Nagios is designed to allow plugins to return optional performance data in addition to normal status data, and to allow those performance data to be passed to external applications for processing. Normally, plugins return a single line of text that indicates the status of some type of measurable data. NagiosGrapher is a graphing system that collects the output of Nagios plugins to generate graphs. When the check_ping plugin is used, it generates information that is saved in an RRD database. NagiosGrapher takes this information and generates PING graphs for every DORII resource. Since Nagios polls the nodes using the SNMP protocol, an SNMP agent (e.g., the net-snmp package) must be installed.
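The split between status text and performance data can be sketched as follows. Nagios plugins conventionally append performance data after a `|` character as space-separated `label=value[unit];warn;crit;min;max` items. The parser is a minimal illustration rather than the code NagiosGrapher uses, the `SAMPLE` line is a made-up example of check_ping-style output, and quoted labels containing spaces are not handled.

```python
import re

# Leading number of a performance data item, followed by an optional unit.
_VALUE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")

# Hypothetical plugin output, written in the style of check_ping.
SAMPLE = ("PING OK - Packet loss = 0%, RTA = 0.80 ms"
          "|rta=0.800000ms;100.000000;500.000000;0.000000 pl=0%;20;60;0")


def parse_plugin_output(line):
    """Return (status_text, {label: (value, unit)}) for one plugin output line."""
    text, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        label, sep, fields = item.partition("=")
        if not sep:
            continue  # not a label=value item
        # The first ';'-separated field holds the value and unit; the
        # remaining fields are the warn/crit/min/max thresholds.
        match = _VALUE.match(fields.split(";")[0])
        if match:
            metrics[label.strip("'")] = (float(match.group(1)), match.group(2))
    return text.strip(), metrics


if __name__ == "__main__":
    status, metrics = parse_plugin_output(SAMPLE)
    print(status)
    print(metrics)
```

A grapher would typically store each `(value, unit)` pair as a data point in a time series, one series per label and monitored host.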