Cluster-Handbook/Munin

From Wikibooks, open books for an open world

Munin

Installation of the software Munin

Munin is a monitoring tool for Linux that measures the server load. The setup described here requires a 64-bit computer.
First, install the Munin software packages with the command “sudo apt-get install munin munin-node”. This downloads and installs the complete Munin software on the Linux operating system. Once this is done, enter sudo nano /etc/munin/munin.conf to open the configuration file. The file should contain lines like these:

#htmldir /var/www/statistics
#logdir /var/log/munin
#rundir /var/run/munin

Remove the # character at the start of these lines so that the program reads the settings instead of treating them as comments. The plugins shipped with Munin are located in /usr/share/munin/plugins. Then restart the node with sudo /etc/init.d/munin-node restart so that it picks up the new settings.
The command sudo apt-get install apache2 installs the web server, and sudo nano /etc/apache2/mods-available/status.conf opens the configuration file of the Apache status module. There, ExtendedStatus must be set to On for Munin to work as desired. The status module itself must be enabled with sudo a2enmod status.
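The uncommenting step for munin.conf can also be sketched in code. The following is a minimal illustration in Python, not part of Munin itself: the helper name uncomment_directives is hypothetical, and it only touches the three directives shown in this section while leaving real comments alone.

```python
import re

# Directive names taken from the munin.conf excerpt in this section.
DIRECTIVES = ("htmldir", "logdir", "rundir")


def uncomment_directives(text, names=DIRECTIVES):
    """Return text with the leading '#' removed from the named directives."""
    pattern = re.compile(r"^#\s*(" + "|".join(map(re.escape, names)) + r")\b")
    lines = []
    for line in text.splitlines():
        # Only lines that start with '#' followed by a known directive change.
        lines.append(pattern.sub(r"\1", line))
    return "\n".join(lines)


sample = "#htmldir /var/www/statistics\n#logdir /var/log/munin\n# a real comment"
print(uncomment_directives(sample))
```

Working on a copy of the file content keeps the sketch safe to try; writing the result back to /etc/munin/munin.conf would still require root rights.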
Next, the plugins are enabled. To do this, enter the following commands on the command line:

sudo ln -s /usr/share/munin/plugins/apache_accesses /etc/munin/plugins
sudo ln -s /usr/share/munin/plugins/apache_processes /etc/munin/plugins
sudo ln -s /usr/share/munin/plugins/apache_volume /etc/munin/plugins
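Enabling a plugin means placing a symbolic link to it in the directory of active plugins. The sketch below shows that idea in Python under stated assumptions: the helper enable_plugins is hypothetical, the three plugin names are assumed to be the standard Apache plugins, and the demonstration runs in temporary directories rather than in /usr/share/munin/plugins and /etc/munin/plugins.

```python
import os
import tempfile

# Plugin names are assumptions based on the standard Apache plugins.
APACHE_PLUGINS = ("apache_accesses", "apache_processes", "apache_volume")


def enable_plugins(available_dir, enabled_dir, names=APACHE_PLUGINS):
    """Create one symlink per plugin, like `ln -s` does; return the links made."""
    created = []
    for name in names:
        source = os.path.join(available_dir, name)
        link = os.path.join(enabled_dir, name)
        if not os.path.exists(source):
            continue  # plugin is not shipped on this system
        if not os.path.lexists(link):
            os.symlink(source, link)
            created.append(link)
    return created


# Demonstration in throwaway directories, so nothing on the real system
# is touched.
with tempfile.TemporaryDirectory() as tmp:
    available = os.path.join(tmp, "available")
    enabled = os.path.join(tmp, "enabled")
    os.makedirs(available)
    os.makedirs(enabled)
    for name in APACHE_PLUGINS:
        open(os.path.join(available, name), "w").close()
    print(enable_plugins(available, enabled))
```

Skipping links that already exist makes the helper safe to run more than once.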

To apply the changed settings, restart Apache with:

sudo /etc/init.d/apache2 restart

The following command installs the Perl library libwww-perl:

sudo apt-get install libwww-perl

The Apache plugins need this library to query the server status page, which supplies the data for the graphs.

Working with Munin

Munin must be connected to a web server so that its graphical interface can be displayed. For this purpose, open the configuration file again with this command:
sudo nano /etc/munin/munin.conf

The default host name localhost.localdomain is changed to the name Master. On the master, the address is set to 127.0.0.1. The worker gets the address 10.0.2.2 (each working group was assigned a different address suffix, here 2.2).
## First our “normal” host.
[server02/Master]
address 127.0.0.1
(cf. http://help.ubuntu-se.org/9.10/serverguide/sv/munin.html)
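To make the layout of such host entries concrete, here is a small Python sketch that renders one entry for the master and one for a worker. The helper host_section and the section name "Worker" are hypothetical and chosen only for illustration; the name Master and both addresses come from the text above.

```python
def host_section(name, address, use_node_name=True):
    """Render one host entry in the layout munin.conf uses."""
    lines = ["[" + name + "]", "    address " + address]
    if use_node_name:
        lines.append("    use_node_name yes")
    return "\n".join(lines)


# "Master" and the two addresses come from the text; "Worker" is a
# hypothetical section name chosen for this illustration.
hosts = [("Master", "127.0.0.1"), ("Worker", "10.0.2.2")]
print("\n\n".join(host_section(name, addr) for name, addr in hosts))
```

Each bracketed name becomes one host in the Munin overview, and the address line tells the master where to reach the node.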
The same host name must be used consistently, also on a Windows computer. If the web browser cannot open Munin, the name must be corrected in the /etc/hosts file. Then enter the master's IP address followed by /munin in the web browser and try to open the Munin page. If the installation was successful, Munin can be accessed and measures the server load. However, it takes some time until measurements appear, because Munin records the various workloads of the servers at regular intervals and graphs them per day, month and year. It displays the minimum and maximum values (see the example picture of the Munin interface below). Available updates are also reported by the program. It also shows when a server cannot be reached, for example during a power failure or a computer crash.

http://blog.m3d1c5.org/2011/10/prosody-xmpp-server-mit-munin-uberwachen/
Example for the display of the server utilization levels with Munin
(Source: http://zockertown.de/s9y/index.php?/archives/1426-Munin-ist-schon-toll.html)


The advantage of the program is that, even with a large number of servers, you can quickly detect which server is down and react to the failure. The failed server must then be repaired or replaced as needed.

Example of a computer cluster in Munin, cf. http://munin.ping.uio.no/

Overview • ping.uio.no
o aquarius.ping.uio.no [ disk exim network processes system ]
o bache.ping.uio.no [ disk network nfs postfix processes system time ]
o bambi.ping.uio.no [ disk network nfs processes system time ]
o bimbo.ping.uio.no [ disk exim network nfs other processes system ]
o bottolf.ping.uio.no [ disk exim network nfs processes system time ]
o cirrus.ping.uio.no [ disk exim network processes sensors system ]
o cumulus.ping.uio.no [ disk exim network processes sensors system ]
o freddy.ping.uio.no [ disk network nfs postfix processes sensors system time ]
o galactica.ping.uio.no [ disk exim network nfs postfix printing processes system ]
o gud.ping.uio.no [ disk network nfs postfix printing processes sensors system ]
o kjell.ping.uio.no [ disk network nfs postfix processes sensors system time ]
o knuth.ping.uio.no [ Apache disk mysql network nfs postfix processes sensors system time ]
o m.ping.uio.no [ disk exim network nfs printing processes sensors system ]
o matz.ping.uio.no [ disk network nfs processes system ]
o meg.ping.uio.no [ disk network nfs other processes system ]
o pike.ping.uio.no [ Apache disk exim munin network printing processes sensors system time virtual machines ]
o ponnypetra.ping.uio.no [ disk network other processes system ]
o rosa.ping.uio.no [ disk network nfs processes system time ]
o rossum.ping.uio.no [ Apache disk exim network nfs other processes system time ]
o tetra.ping.uio.no [ disk network processes system ]
o urias.ping.uio.no [ disk network nfs other processes system time ]
o utslett.ping.uio.no [ disk munin network processes system ]

The listing shows the individual servers and the categories monitored on each of them. In Ubuntu, every service package provides an init script with start and stop functions that control the service.
For Munin, enter: sudo /etc/init.d/munin-node start|stop|restart|force-reload|try-restart

“restart” stops the running service and starts it again. “try-restart” restarts the service only if it is already running.
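When the init script is called from a program, it helps to check the action name first. The following Python sketch shows one way to do that; the helper names init_command and run_action are hypothetical, and only the script path and the five actions are taken from this section.

```python
import subprocess

# Actions listed for the munin-node init script in this section.
ACTIONS = ("start", "stop", "restart", "force-reload", "try-restart")


def init_command(action, script="/etc/init.d/munin-node"):
    """Build the argument list for one init-script action."""
    if action not in ACTIONS:
        raise ValueError("unsupported action: " + action)
    return ["sudo", script, action]


def run_action(action):
    """Run the action; returns the exit status of the init script."""
    return subprocess.call(init_command(action))


# Only the command line is printed here; nothing is executed.
print(" ".join(init_command("restart")))
```

Building the argument list separately from running it keeps the validation easy to check without touching the running service.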

Warnings

If the capacity limits configured on the Munin server are exceeded, the affected values are usually displayed in red. Alerts can then be sent by e-mail, for example before the maximum disk space is exceeded. For this purpose, open the file munin.conf (see wiki.ubuntuusers.de/Munin) and add the following lines:

# Drop somejuser@fnord.comm and anotheruser@blibb.com an email everytime
# something changes (OK -> WARNING, CRITICAL -> OK, etc)
contacts me
contact.me.command mail -s "Munin notification ${var:host}" user@example.com
contact.me.always_send warning critical

The e-mail address must be adapted to your own system. In addition, threshold values should be defined that mark the point at which the server threatens to overflow, so that a warning reaches the user in time. Beforehand, a mail server such as Postfix should be installed and configured so that the e-mails can be delivered to all users. The thresholds are set for each host as follows (see the example from the Munin configuration file):
[localhost.localdomain/Master]
address 127.0.0.1
use_node_name yes
<plugin>.<fieldname>.(critical|warning) <value>

The plugin name can be taken from the URL of the graph. The field name can be copied from the Munin graph page, where it is shown under “Internal name”. Either critical or warning can be chosen as the level. The value is determined as described above; when it is reached or exceeded, a warning e-mail is sent to all users.
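The pattern <plugin>.<fieldname>.<level> <value> can be assembled mechanically. The Python sketch below is an approximation under stated assumptions: the helpers field_name and threshold_lines are hypothetical, the device path /dev/evms/hda2 is inferred from the field name used in this section, and the rule that non-alphanumeric characters become underscores only approximates how Munin derives field names. The authoritative name is the one shown under “Internal name”.

```python
import re


def field_name(device):
    """Approximate Munin's field naming: non-alphanumerics become '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", device)


def threshold_lines(plugin, device, warning, critical):
    """Render the warning and critical directives for one field."""
    prefix = plugin + "." + field_name(device)
    return [
        prefix + ".warning " + str(warning),
        prefix + ".critical " + str(critical),
    ]


for line in threshold_lines("df", "/dev/evms/hda2", 70, 95):
    print(line)
```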
Example of a Server Warning entry in Munin

[localhost.localdomain]
address 127.0.0.1
use_node_name yes
df._dev_evms_hda2.warning 70
df._dev_evms_hda2.critical 95
df._dev_mapper_hda5.warning 70
df._dev_mapper_hda5.critical 70

Here 70 was chosen as the warning threshold and 95 as the critical threshold. The values should be selected carefully and not set too low, because otherwise the user receives warning e-mails without need and may be alarmed. Warnings should in any case be sent when the truly critical values are reached, so that you can, if necessary, restore the system from a backup.
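How two thresholds divide the measured values into three states can be shown with a few lines of Python. This is a simplified sketch, not Munin's own logic: the function classify is hypothetical and treats each threshold purely as an upper limit.

```python
def classify(value, warning, critical):
    """Return 'critical', 'warning' or 'ok' for one measured value.

    Simplification: each threshold is treated as an upper limit only.
    """
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"


# Disk usage in percent, checked against the limits 70 and 95.
for usage in (42, 80, 97):
    print(usage, classify(usage, warning=70, critical=95))
```

The critical limit is checked first, so a value above both limits is reported once, at the more severe level.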

CPU main processor

Munin can also measure the load on the main processor (CPU), the central processing unit that executes programs. This also works for central host computers to which several terminals are connected. With Munin, current and earlier server performance data can be compared with one another. The node gathers the performance data; the master collects it from the nodes, stores it and generates the graphs for the web interface. The data is stored with RRDtool.

CPU main server metrics (http://www.server-wissen.de/wp-content/uploads/2012/02/cpu-day.png)
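The exchange between master and node is a plain-text protocol: munin-node listens on TCP port 4949, and the command fetch followed by a plugin name returns lines of the form <field>.value <number>, terminated by a line containing a single dot. The Python sketch below illustrates this under stated assumptions: the function names parse_fetch and fetch are hypothetical, the sample values are made up, and error handling is omitted.

```python
import socket


def parse_fetch(reply):
    """Turn the text reply to 'fetch <plugin>' into a {field: value} dict."""
    values = {}
    for line in reply.splitlines():
        line = line.strip()
        if line == ".":  # a single dot ends the reply
            break
        if not line or line.startswith("#"):
            continue
        key, _, raw = line.partition(" ")
        if key.endswith(".value"):
            field = key[: -len(".value")]
            # "U" marks an unknown value
            values[field] = None if raw == "U" else float(raw)
    return values


def fetch(host, plugin, port=4949, timeout=5.0):
    """Ask a munin-node for the current values of one plugin."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        stream = conn.makefile("rw", encoding="ascii", newline="\n")
        stream.readline()  # greeting line sent by the node
        stream.write("fetch " + plugin + "\n")
        stream.flush()
        lines = []
        for line in stream:
            lines.append(line)
            if line.strip() == ".":
                break
        return parse_fetch("".join(lines))


# Parsing a made-up reply; no network connection is opened here.
print(parse_fetch("user.value 1200\nsystem.value 300\nidle.value U\n.\n"))
```

On a machine with a running node, fetch("127.0.0.1", "cpu") would return the current CPU counters, provided the node accepts connections from that address.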

Munin errors and cleanup

Various types of errors can occur. For example, the IP address may change from one day to the next, so that the Munin page can no longer be reached in the browser. In this case, the address in the configuration file needs to be adjusted. Changing the name of the localdomain server is not as easy.
White bars in the graphic:
The cause may be a misconfigured graph file or a mistake made while unpacking the package. Permission mistakes also happen easily during installation, with the result that, for example, no warning e-mails can be sent when the server overflows.