It's all about respect

"Don't waste my time, buddy. You're just a dogsbody sysadmin; I write the software and you merely service it. So just shut up with your petty concerns and do as I say, OK?"
Comment from serverfault.com 

Today I'm going to write about what is, in my opinion, one of the main reasons why cooperation between dev and ops fails most of the time: a lack of respect for and understanding of the other person's job.

Dev and ops do different kinds of work, but both have the same goal: to bring an application to production and make it as good as possible (depending on time, money, technology, ...).

So in my experience some people mix up their jobs:
  • Some sysadmins believe every developer has to know everything about infrastructure
  • Some sysadmins even think they know how to program and could do the developers' job much better
  • Some developers think they know how to set up infrastructure and how to configure all systems
  • Some developers think ops people should know all about the app without being told anything
Don't get me wrong: there are sysadmins who know how to program and developers who know how infrastructure works, but most of the time this is not the case. So when you work together, accept that everyone has their own profession and that most people know what they are doing. Let's get another point straight: it would be nice if every programmer knew the infrastructure and every sysadmin knew enough about programming, but this will not be the case. So when you argue about applications and/or infrastructure, explain to the other person why you think it should be this way and not the other; don't just say "No!" (we were there [insert link here] already).

Respect each other and be polite. In my opinion dev and ops have different tasks during a project, which they should accomplish to make cooperation between both possible:
  • ops: Set up a development environment which matches the production environment in the main points: OS, network setup, ...
  • dev: Explain what the app should be doing and what type of systems it is using
  • ops: Make clear which other systems exist that might influence the new environment
  • dev + ops: talk about non-functional requirements as early as possible and not two weeks before launch, otherwise it might get really messy
  • ...

Selenium and Nagios

Hi,

I've implemented a Nagios check for Selenium test cases. With this check it is possible to put your recorded test cases from your Selenium IDE into Nagios to use them for monitoring.

In my opinion this has the following advantages:
  • You can transfer your test cases to monitoring without making any changes.
  • You can run the test cases with multiple browsers. This means you use the JavaScript and rendering engine of a real browser, not some other HTML/HTTP library.
  • You are more flexible when monitoring more complex scenarios.
To get this working, you need the following components:
There are two possible ways to set this up:
  • Install the Selenium RC server on your Nagios host.
  • Install the Selenium RC server on a different host.
I will only explain how to set up the Selenium RC server on your Nagios host. If you want to install it on a different server, you currently have to use nrpe/nsca to get it going. This is the only difference; the rest stays the same. A Selenium RC server on another host can be contacted directly, but the check does not implement this yet.

First record your test case with the Selenium IDE. If you haven't done this before, take a look at the Selenium IDE documentation. After recording it, export it to a Java file (File -> Export Test Case As ... -> Java (JUnit) - Selenium RC). Compile this Java file with javac or your favorite IDE. For compilation you need the JUnit library and the file selenium-server.jar:
javac -cp ./junit.jar:./selenium-server.jar [your test case file]
On your Nagios host, put check_selenium into your libexec path and place the files CallSeleniumTest.class, junit.jar and selenium-server.jar somewhere on the filesystem. Adjust the classpath in the check_selenium file accordingly. As the Selenium IDE automatically puts the Java classes into the package com.example.tests, create a directory com/example/tests on your filesystem and add the directory that contains com to the classpath. Put the compiled Java file into com/example/tests.

Add the definition to your Nagios configuration and test it. If your Nagios server is a headless Linux system, you can also run Selenium RC headless with Xvfb. Take a look at this post for how to set it up.

I hope this gives you new ways to monitor your web applications, or makes your established way a bit easier.

The plugin also returns performance data for the test cases, but be aware that the returned time also includes the startup time of the browser.
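To make the performance data a bit more concrete: Nagios plugins append it to their output line after a pipe character, in the form label=value[unit]. The following is only a sketch of how such a line could be built. The class name PerfDataSketch, the label "time" and the wording of the message are assumptions for illustration, not the actual output of check_selenium.

```java
import java.util.Locale;

// Hypothetical helper, not part of check_selenium.
class PerfDataSketch {

    /**
     * Builds one line of plugin output: human-readable text, a pipe,
     * then performance data in the form label=value[unit].
     */
    static String formatOutput(String state, String testName, long elapsedMillis) {
        double seconds = elapsedMillis / 1000.0;
        // Locale.ROOT keeps the decimal point a dot regardless of system locale.
        return String.format(Locale.ROOT,
                "SELENIUM %s - %s finished in %.3f s | time=%.3fs",
                state, testName, seconds, seconds);
    }

    public static void main(String[] args) {
        // The elapsed time would include browser startup, as noted above.
        System.out.println(formatOutput("OK", "GoogleTestCase", 4210));
    }
}
```

Because the measured time includes the browser startup, it is better suited for spotting trends than for judging the absolute response time of your application.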

This plugin is written in Java, and the Selenium test case is loaded via reflection. I'm sure you could also write this in Perl, Ruby or some other language. I did it in Java because I ran into a bit of a mess with the required Perl modules. Perhaps I will do a Perl version later.
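For those wondering what "integrated with reflection" can look like, here is a minimal sketch. It is not the source of CallSeleniumTest. It assumes the test class has a no-argument constructor and JUnit 3 style methods (setUp, test..., tearDown), and it uses two stand-in classes instead of a real exported Selenium test so that it runs without a browser. All class and method names in it are hypothetical; only the exit codes (0 = OK, 2 = CRITICAL, 3 = UNKNOWN) are the usual Nagios plugin convention.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch, not the actual CallSeleniumTest.
class ReflectiveCheckSketch {
    // Standard Nagios plugin exit codes.
    static final int OK = 0, CRITICAL = 2, UNKNOWN = 3;

    /** Stand-in for an exported test case; a real one would drive a browser. */
    public static class PassingTestCase {
        public void setUp() {}
        public void testHomepage() {}
        public void tearDown() {}
    }

    /** Stand-in for a test case whose assertion fails. */
    public static class FailingTestCase {
        public void setUp() {}
        public void testHomepage() { throw new AssertionError("expected text not found"); }
        public void tearDown() {}
    }

    /** Loads the class by name, runs setUp, all test* methods and tearDown. */
    static int runCheck(String className) {
        try {
            Object test = Class.forName(className).getDeclaredConstructor().newInstance();
            test.getClass().getMethod("setUp").invoke(test);
            try {
                for (Method m : test.getClass().getMethods()) {
                    if (m.getName().startsWith("test") && m.getParameterCount() == 0) {
                        m.invoke(test);
                    }
                }
            } finally {
                // Always clean up, even when a test method failed.
                test.getClass().getMethod("tearDown").invoke(test);
            }
            System.out.println("SELENIUM OK - " + className + " passed");
            return OK;
        } catch (InvocationTargetException e) {
            // The invoked code itself threw: treat it as a failed check.
            System.out.println("SELENIUM CRITICAL - " + e.getCause());
            return CRITICAL;
        } catch (ReflectiveOperationException e) {
            // Class or method missing, or not accessible: the check could not run.
            System.out.println("SELENIUM UNKNOWN - cannot run " + className + ": " + e);
            return UNKNOWN;
        }
    }

    public static void main(String[] args) {
        String className = args.length > 0 ? args[0] : PassingTestCase.class.getName();
        System.exit(runCheck(className));
    }
}
```

The point of this design is that the check itself never has to be recompiled: you drop a new compiled test case into the classpath and pass its class name as an argument.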


Update 2011/07:
Some people asked for a more detailed explanation of how to integrate check_selenium into Nagios. Here are two possible ways to do it. The first uses NRPE, so the check runs on a different host. With the second, the check runs on your Nagios host.
define command {
  command_name  nrpe_check_selenium
  command_line  $USER1$/check_nrpe -t 60 -H $HOSTADDRESS$ -p 5666 -c check_selenium -a "$ARG1$" "$ARG2$" "$ARG3$"
}

define command {
  command_name  local_check_selenium
  command_line  $USER1$/check_selenium "$ARG1$" "$ARG2$" "$ARG3$"
}
After that, you can define your service and assign it to a host:
define service {
  service_description   nrpe_selenium_Google
  use   service-check-05min
  host_name  your.winserver.here
  check_command  nrpe_check_selenium!com.example.tests.GoogleTestCase!http://www.google.com!*iexplore
}

define service {
  service_description  selenium_Google
  use  service-check-05min
  host_name  your.server.here
  check_command    local_check_selenium!com.example.tests.GoogleTestCase!http://www.google.com!*firefox
}

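When executing the check via NRPE, add a line like this to the nrpe.cfg file on your remote host. Because the command takes arguments, NRPE has to be built with command argument support and dont_blame_nrpe=1 has to be set in nrpe.cfg: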

command[check_selenium]=/usr/local/nagios/libexec/check_selenium.sh --class "$ARG1$" --baseUrl "$ARG2$" --browsertype "$ARG3$"
Hope this will help some people to get faster results.