Linux Cluster - Debugging Resource Failures

If you have a resource that fails to start, and there's nothing obvious in the logs (look for "lrmd", "LRM operation", etc.), you can try starting it manually to diagnose the problem further. Likewise for failed stop and monitor ops.

First, unmanage the resource so that Pacemaker won't try to do anything with it:

# crm resource unmanage <resource>
Configure environment:
# export OCF_ROOT=/usr/lib/ocf
# export OCF_RESKEY_<param>=<value>
# ... (likewise for all other resource parameters; run
       "crm configure show <resource>" to verify which
       params you need to set here)
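When a resource has several parameters, typing one export per parameter is error-prone. The following is a minimal sketch of a helper that turns name=value pairs into OCF_RESKEY_ variables. The function name ocf_export_params is hypothetical (it is not part of Pacemaker or crmsh), and the ip and cidr_netmask values are only example parameters; use the ones reported by "crm configure show <resource>".

```shell
#!/bin/sh
# Export each name=value argument as OCF_RESKEY_<name>=<value>.
# ocf_export_params is a hypothetical helper, not a Pacemaker or crmsh command.
ocf_export_params() {
    for pair in "$@"; do
        name=${pair%%=*}     # text before the first "="
        value=${pair#*=}     # text after the first "="
        export "OCF_RESKEY_${name}=${value}"
    done
}

export OCF_ROOT=/usr/lib/ocf
# Example parameters only; replace with your resource's real parameters.
ocf_export_params ip=192.168.1.10 cidr_netmask=24

# Show what the resource agent will see.
env | grep '^OCF_'
```

Run this in the same shell session in which you then call the resource agent, so the exported variables are inherited by it.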
Run the op:
# /usr/lib/ocf/resource.d/heartbeat/<ra> start ; echo $? 
Look for helpful error messages, and check the return code.
If that doesn't help, try using sh -x or bash -x to see exactly what the RA is doing. Do a stop first just in case, then try the start again:
# /usr/lib/ocf/resource.d/heartbeat/<ra> stop
# sh -x /usr/lib/ocf/resource.d/heartbeat/<ra> start ; echo $?
Once you've figured out what the problem is and solved it, give the resource back to Pacemaker:
# crm resource manage <resource>
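
The return code printed by "echo $?" in the steps above is one of the standard OCF exit codes. As a rough aid, this sketch maps the number to its symbolic name. The function name ocf_rc_name is hypothetical; codes 8 and 9 are shown with their older MASTER names.

```shell
#!/bin/sh
# Map an OCF resource agent exit code to its symbolic name.
# ocf_rc_name is a hypothetical helper name used for illustration.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        8) echo OCF_RUNNING_MASTER ;;
        9) echo OCF_FAILED_MASTER ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Usage sketch: run the op, then decode its exit status, e.g.
#   /usr/lib/ocf/resource.d/heartbeat/<ra> start; ocf_rc_name $?
ocf_rc_name 5   # prints OCF_ERR_INSTALLED
```

For example, OCF_ERR_INSTALLED usually points at a missing binary or package on the node, while OCF_ERR_CONFIGURED points at a wrong or missing resource parameter.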

Ref: http://clusterlabs.org/wiki/Debugging_Resource_Failures


Cluster Resource Manager Quick Reference

This post is dedicated to CRM (Cluster Resource Manager), for sysadmins who have to manage clusters on Linux systems.

Basic usage

# sudo crm status
Get the status of the cluster.


# sudo crm configure edit <resource_name>
Edit the configuration of the single resource <resource_name>.

Resource agent

# crm ra classes
List the resource agent classes.

# crm ra list ocf
List the available OCF resource agents.


APT Quick Reference

Basic "apt-get" usage

sudo apt-get install package
Downloads package and all of its dependencies, and installs or upgrades them.
sudo apt-get -u -V upgrade
Upgrades the installed packages, first listing the packages to be upgraded with their versions (add -s to only simulate).
sudo apt-get remove [--purge] package
Removes package and any packages that depend on it; with --purge, its configuration files are removed as well.
sudo apt-get purge -y $(dpkg --list |grep '^rc' |awk '{print $2}')
Purges packages that were removed but not purged (state rc: configuration files remain).
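
To see which packages the purge command above would act on before running it, the rc filter can be tried on its own. This is a minimal sketch: list_rc_packages is a hypothetical helper name, and the sample rows use made-up package names and versions that only imitate the column layout of "dpkg --list".

```shell
#!/bin/sh
# Print the names of packages whose dpkg status is "rc"
# (removed, configuration files still present).
# Reads "dpkg --list" style output on stdin.
# list_rc_packages is a hypothetical helper name.
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Sample input imitating "dpkg --list" rows (made-up names and versions).
sample='ii  bash       5.0-4    amd64  GNU Bourne Again SHell
rc  oldpkg     1.2-1    amd64  example removed package
rc  otherpkg   0.9-3    all    another removed package'

printf '%s\n' "$sample" | list_rc_packages

# On a real system:
#   dpkg --list | list_rc_packages
```

Matching the first column exactly with awk is slightly stricter than grep '^rc', which matches any line that merely starts with those two characters.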

Basic "apt-cache" usage

apt-cache search pattern
Searches packages and descriptions for pattern.
apt-cache show package
Shows the full description of package.
apt-cache showpkg package
Shows a lot more detail about package and its relationships to other packages.

Basic "dpkg" usage

dpkg-deb -e package_file.deb
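Extracts the control information files of the .deb file into a DEBIAN subdirectory of the current directory. To extract the package contents instead, use dpkg-deb -x package_file.deb <directory>.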
dpkg --list |grep "^rc" | cut -d " " -f 3 | xargs sudo dpkg --purge
Purge all rc packages (removed, but configuration files remain).
dpkg -S /etc/bash.bashrc
Find which package contains/supplies a certain file.
dpkg -l | awk '{ print $2 }' | tail -n +6 | tr '\n' ' '
List all packages known to dpkg on one line (the first five lines of dpkg -l output are headers).
dpkg -I package.deb
Get info about a package file.

Basic "aptitude" usage

aptitude download package_name
Download the .deb file of package_name.


Ref: http://www.cyberciti.biz/ref/apt-dpkg-ref.html (APT and Dpkg Quick Reference Sheet)


How to disable gateway and dns entries from dhcp in Ubuntu


If you have a server with two network cards and both are configured with the DHCP protocol, it is possible that only one default gateway will be set up, on one of these cards.

On Ubuntu, you can resolve this problem by installing the ifmetric package:

# sudo apt-get install ifmetric

Then set the metric for your network card in your /etc/network/interfaces file (or similarly in interfaces.d):

# The primary network interface
allow-hotplug eth0
auto eth0
iface eth0 inet dhcp
   metric 200
# The secondary network interface
allow-hotplug eth1
auto eth1
iface eth1 inet dhcp
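
With the configuration above, eth0's default route carries metric 200, while eth1's route has no explicit metric, which the kernel treats as 0. The route with the lowest metric is preferred. The sketch below ranks default routes the same way; rank_default_routes is a hypothetical helper name, and the sample addresses are made up to imitate "ip route show default" output.

```shell
#!/bin/sh
# Print "metric device gateway" for each default route, lowest metric first.
# Reads "ip route show default" style output on stdin.
# A route without an explicit "metric" field is treated as metric 0.
# rank_default_routes is a hypothetical helper name.
rank_default_routes() {
    awk '$1 == "default" {
        metric = 0; dev = "-"; gw = "-"
        for (i = 2; i < NF; i++) {
            if ($i == "via")    gw = $(i + 1)
            if ($i == "dev")    dev = $(i + 1)
            if ($i == "metric") metric = $(i + 1)
        }
        print metric, dev, gw
    }' | sort -n
}

# Sample input with made-up addresses, matching the interfaces file above.
sample='default via 10.0.0.1 dev eth0 metric 200
default via 192.168.1.1 dev eth1'

printf '%s\n' "$sample" | rank_default_routes

# On a real system:
#   ip route show default | rank_default_routes
```

The first line printed is the route the kernel would prefer; in the sample that is the one through eth1.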

Ref: http://serverfault.com/questions/29394/debian-interfaces-file-ignore-gateway-and-dns-entries-from-dhcp