Walkthrough of the workshop "Monitoring and Improving Your Server's Performance"
Plan
- Overview of a few tools
- A few usage examples from Montréal (Nagios, Munin, logcheck) and Dakar (mon)
- Discussion of participants' experiences
- Recommendations for monitoring
- How can you improve your server's performance?
Walkthrough
1- Overview of a few tools
- doing it by hand (not productive)
- following the event logs
- auth.log
- syslog
- mail.log
- dmesg
- a few commands
- top – Process Activity Command
- vmstat – System Activity, Hardware and System Information
- w – Find Out Who Is Logged on And What They Are Doing
- ps – Displays The Processes
- uptime – Tell How Long The System Has Been Running
- free – Memory Usage
- iptraf – Real-time Network Statistics
- iostat – Average CPU Load, Disk Activity
- netstat – Network Statistics
- ss – Network Statistics
- tcpdump – Detailed Network Traffic Analysis
- mpstat – Multiprocessor Usage
- sar – Collect and Report System Activity
- pmap – Process Memory Usage
- ipfm – Bandwidth Analysis
- active monitoring (sends alerts)
- Nagios
- Logcheck / logwatch
- passive monitoring
- Cacti
- Munin
- MRTG
- Zabbix
Two reference links: http://www.debianhelp.co.uk/monitortools.htm and http://www.cyberciti.biz/tips/top-linux-monitoring-tools.html
2- Examples from Montréal
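To illustrate the "by hand" approach, the figures reported by uptime and free can also be read directly from the Linux pseudo-files /proc/loadavg and /proc/meminfo. The following is a minimal Python sketch under that assumption; it is not a tool presented in the workshop, and the function names are hypothetical.

```python
from pathlib import Path


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo(text):
    """Map each /proc/meminfo field to its numeric value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def read_snapshot():
    """Read both pseudo-files. Linux only: they do not exist on other systems."""
    load = parse_loadavg(Path("/proc/loadavg").read_text())
    mem = parse_meminfo(Path("/proc/meminfo").read_text())
    return load, mem
```

On a Linux server, `read_snapshot()` returns the load averages and the memory fields in one call. Doing this on every machine by hand does not scale, which is why the tools listed above exist.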
- Nagios
***** Nagios *****
Notification Type: PROBLEM
Service: Espace disque
Host: vz-aufhorsite
State: WARNING for 0d 0h 3m 1s
Address: 204.136.13.20
Info: WARNING /: 90% 2386/2818MB used (warning=90% critical=95%)
Date/Time: Thu Aug 4 13:40:40 EDT 2011

***** Nagios *****
Notification Type: PROBLEM
Service: Espace disque
Host: vz-aufhorsite
State: CRITICAL for 0d 0h 0m 1s
Address: 204.136.13.20
Info: CRITICAL /: 95% 2533/2818MB used (warning=90% critical=95%)
Date/Time: Fri Aug 5 17:00:40 EDT 2011

***** Nagios *****
Notification Type: RECOVERY
Service: Espace disque
Host: vz-aufhorsite
State: OK for 0d 0h 0m 1s
Address: 204.136.13.20
Info: OK
Date/Time: Sun Aug 7 06:25:40 EDT 2011

***** Nagios *****
Notification Type: PROBLEM
Host: nfs-mtl
State: DOWN for 0d 0h 0m 0s
Address: 10.36.1.200
Info: CRITICAL - Host Unreachable (10.36.1.200)
Date/Time: Mon Jul 18 17:18:00 EDT 2011
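The disk-space notifications above rely on two thresholds (warning=90%, critical=95%). The logic of such a check can be sketched in Python as follows. This is an illustrative sketch, not the plugin used in Montréal: the function names are hypothetical, and the ">=" comparison is an assumption drawn from the notifications, where 90% raised a WARNING and 95% a CRITICAL. The exit codes follow the Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```python
import shutil

# Nagios plugin convention: the process exit code carries the state.
OK, WARNING, CRITICAL = 0, 1, 2


def classify(used_percent, warning=90.0, critical=95.0):
    """Return (label, exit_code) for a disk usage percentage."""
    if used_percent >= critical:
        return "CRITICAL", CRITICAL
    if used_percent >= warning:
        return "WARNING", WARNING
    return "OK", OK


def check_disk(path="/"):
    """Measure usage of the filesystem holding `path` and classify it."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    label, code = classify(used_percent)
    mb = 1024 * 1024
    message = (f"{label} {path}: {used_percent:.0f}% "
               f"{usage.used // mb}/{usage.total // mb}MB used")
    return message, code
```

A wrapper script would print the message and exit with the returned code, for example `message, code = check_disk("/")` followed by `print(message)` and `sys.exit(code)`. The percentage computed here may differ from the one a real plugin reports, since filesystems can reserve blocks that are counted differently.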
- logcheck
Security Events for su
=-=-=-=-=-=-=-=-=-=-=-
Jul 28 15:03:23 10.36.0.17 su[3079]: pam_unix(su:auth): authentication failure; logname=xxxxx uid=1008 euid=0 tty=pts/1 ruser=xxxxx rhost= user=ftp
Jul 28 15:03:25 10.36.0.17 su[3079]: FAILED su for ftp by xxxxx

Security Events for sudo
=-=-=-=-=-=-=-=-=-=-=-=-
Jul 28 15:01:48 10.36.0.17 sudo: pam_unix(sudo:auth): authentication failure; logname=xxxxx uid=0 euid=0 tty=/dev/pts/1 ruser= rhost= user=xxxxx
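The report above groups suspicious lines by program. A much simplified version of that idea can be sketched in Python: scan syslog-style lines and count the pam_unix authentication failures for su and sudo. This is not how logcheck itself is implemented (it works from its own rule files); the pattern and the function name below are hypothetical and only cover the line format shown in this example.

```python
import re
from collections import Counter

# Hypothetical pattern: matches the pam_unix failure lines shown in the report.
FAILURE = re.compile(
    r"\b(?P<prog>su|sudo)(?:\[\d+\])?: "
    r"pam_unix\(\w+:auth\): authentication failure"
)


def count_auth_failures(lines):
    """Count authentication failures per program (su, sudo) in log lines."""
    counts = Counter()
    for line in lines:
        match = FAILURE.search(line)
        if match:
            counts[match.group("prog")] += 1
    return counts
```

On a Debian-style system the input could be the lines of /var/log/auth.log, for example `count_auth_failures(open("/var/log/auth.log"))`, which usually requires elevated privileges to read.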
- Munin
- UPS monitoring: http://superca-munin.ca.auf/onduleurs.html
- system resource monitoring, for example memory: