Yum Download SRPMs

Found a nice post today on how to use yum to download source RPMs, rather than having to do a manual search on the relevant mirror.
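
For the record, the gist (assuming the yum-utils package, which provides the yumdownloader helper - the package name below is just a placeholder) is something like:

# install the yum-utils helpers
yum install yum-utils
# grab the SRPM for a given package; depending on your mirror you may also
# need to enable the relevant source repo with --enablerepo
yumdownloader --source postgresql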

pvmove Disk Migrations

Lots of people make use of linux's lvm (Logical Volume Manager) for services such as disk volume resizing and snapshotting. But few people seem to know about the little pvmove utility, which offers a very powerful facility for migrating data between disk volumes on the fly.

Let's say, for example, that you have a disk volume you need to rebuild for some reason. Perhaps you want to change the raid type you're using on it; perhaps you want to rebuild it using larger disks. Whatever the reason, you need to migrate all your data to another temporary disk volume so you can rebuild your initial one.

The standard way of doing this is probably to just create a new filesystem on your new disk volume, and then copy or rsync all the data across. But how do you verify that you have all the data at the end of the copy, and that nothing has changed on your original disk after the copy started? If you did a second rsync and nothing new was copied across, and the disk usage totals exactly match, and you remember to unmount the original disk immediately, you might have an exact copy. But if your original disk data is changing at all, getting a good copy of a large disk volume can actually be pretty tricky.
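
For what it's worth, that userspace approach looks something like this (device names and mount points here are placeholders):

# create and mount a filesystem on the new volume
mkfs.ext3 /dev/sdd1
mkdir /mnt/new
mount /dev/sdd1 /mnt/new
# initial copy while the old volume (mounted on /data) is still live
rsync -aH /data/ /mnt/new/
# ... then later: stop writers, do a final pass, and swap the mounts over
rsync -aH --delete /data/ /mnt/new/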

The elegant lvm/pvmove solution to this problem: instead of doing a userspace migration between disk volumes, you add your new disk into the existing volume group, and then tell lvm to move all the physical extents off your old physical volume. The migration is handled magically by lvm, without even needing to unmount the logical volume!

# Volume group 'extra' exists on physical volume /dev/sdc1
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G

# Add new physical volume /dev/sdd1 into volume group
$ vgextend extra /dev/sdd1
  Volume group "extra" successfully extended
$ vgs
  VG    #PV #LV #SN Attr   VSize   VFree
  extra   2   1   0 wz--n- 200.00G 100.00G
# (the LV itself is unchanged - the volume group now just spans both PVs)

# Use pvmove to move physical extents off of old /dev/sdc1 (verbose mode)
$ pvmove -v /dev/sdc1
# Lots of output in verbose mode ...

# Done - drop the old physical volume out of the volume group and remove it
$ vgreduce extra /dev/sdc1
$ pvremove /dev/sdc1
$ lvs
  LV   VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  data extra  -wi-ao 100.00G
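
Also worth noting: pvmove is restartable, so an interrupted move isn't fatal:

# resume any interrupted pvmove operations
pvmove
# or abort any moves in progress
pvmove --abort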

The joys of linux.

Quick Linux Box Hardware Overview

Note to self: here's how to get a quick overview of the hardware on a linux box:

perl -F'\s*:\s*' -ane 'chomp $F[1];
  print qq/$F[1] / if $F[0] =~ m/^(model name|cpu MHz)/;
  print qq/\n/ if $F[0] eq qq/\n/' /proc/cpuinfo
grep MemTotal /proc/meminfo
grep SwapTotal /proc/meminfo
fdisk -l /dev/[sh]d? 2>/dev/null | grep Disk

Particularly useful if you're auditing a bunch of machines (via an ssh loop or clusterssh or something) and want a quick 5000-foot view of what's there.
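
e.g. something as simple as this (hosts.txt here is just a placeholder list of hostnames, and key-based ssh is assumed):

for h in $(cat hosts.txt); do
  echo "== $h =="
  ssh root@$h 'grep MemTotal /proc/meminfo; grep SwapTotal /proc/meminfo'
done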

Basic KVM on CentOS 5

I've been using kvm for my virtualisation needs lately, instead of xen, and finding it great. Disadvantages are that it requires hardware virtualisation support, and so only works on newer Intel/AMD CPUs. Advantages are that it's baked into recent linux kernels, and so more or less Just Works out of the box, no magic kernels required.

There are some pretty useful resources covering this stuff out on the web, but there's not too much specific to CentOS, so here's the recipe I've been using for CentOS 5:

# Confirm your CPU has virtualisation support
egrep 'vmx|svm' /proc/cpuinfo

# Install the kvm and qemu packages you need
# From the CentOS Extras repository (older):
yum install --enablerepo=extras kvm kmod-kvm qemu
# OR from my repository (for most recent kernels only):
ARCH=$(uname -i)
OF_MREPO=http://www.openfusion.com.au/mrepo/centos5-$ARCH/RPMS.of/
rpm -Uvh $OF_MREPO/openfusion-release-0.3-1.of.noarch.rpm
yum install kvm kmod-kvm qemu

# Install the appropriate kernel module - either:
modprobe kvm-intel
# OR:
modprobe kvm-amd
lsmod | grep kvm

# Check the kvm device exists
ls -l /dev/kvm

# I like to run my virtual machines as a 'kvm' user, not as root
chgrp kvm /dev/kvm
chmod 660 /dev/kvm
ls -l /dev/kvm
useradd -r -g kvm kvm

# Create a disk image to use
cd /data/images
IMAGE=centos5x.img
# Note that the specified size is a maximum - the image only uses what it needs
qemu-img create -f qcow2 $IMAGE 10G
chown kvm $IMAGE

# Boot an install ISO on your image and do the install
MEM=1024
ISO=/path/to/CentOS-5.2-x86_64-bin-DVD.iso
# ISO=/path/to/WinXP.iso
qemu-kvm -hda $IMAGE -m ${MEM:-512} -cdrom $ISO -boot d
# I usually just do a minimal install with std defaults and dhcp, and configure later

# After your install has completed restart without the -boot parameter
# This should have outgoing networking working, but pings don't work (!)
qemu-kvm -hda $IMAGE -m ${MEM:-512} &

That should be sufficient to get you up and running with basic outgoing networking (for instance as a test desktop instance). In qemu terms this is using 'user mode' networking, which is easy but slow, so if you want better performance, or if you want to allow incoming connections (e.g. as a server), you need some extra magic, which I'll cover in the next post, Simple KVM Bridging.

Simple KVM Bridging

Following on from my post yesterday on "Basic KVM on CentOS 5", here's how to set up simple bridging to allow incoming network connections to your VM (and to get other standard network functionality like pings working). This is a simplified/tweaked version of Hadyn Solomon's bridging instructions.

Note that this is all done on your HOST machine, not your guest.

For CentOS:

# Install bridge-utils
yum install bridge-utils

# Add a bridge interface config file
vi /etc/sysconfig/network-scripts/ifcfg-br0
# DHCP version
ONBOOT=yes
TYPE=Bridge
DEVICE=br0
BOOTPROTO=dhcp
# OR, static version
ONBOOT=yes
TYPE=Bridge
DEVICE=br0
BOOTPROTO=static
IPADDR=xx.xx.xx.xx
NETMASK=255.255.255.0

# Make your primary interface part of this bridge e.g.
vi /etc/sysconfig/network-scripts/ifcfg-eth0
# Add:
BRIDGE=br0
# Optional: comment out BOOTPROTO/IPADDR lines, since they're
# no longer being used (the br0 takes precedence)

# Add a script to connect your guest instance to the bridge on guest boot
vi /etc/qemu-ifup
#!/bin/bash
BRIDGE=$(/sbin/ip route list | awk '/^default / { print $NF }')
/sbin/ifconfig $1 0.0.0.0 up
/usr/sbin/brctl addif $BRIDGE $1
# END OF SCRIPT
# Silence a qemu warning by creating a noop qemu-ifdown script
vi /etc/qemu-ifdown
#!/bin/bash
# END OF SCRIPT
chmod +x /etc/qemu-if*

# Test - bridged networking uses a 'tap' networking device
NAME=c5-1
qemu-kvm -hda $NAME.img -name $NAME -m ${MEM:-512} -net nic -net tap &

Done. This should give you VMs that are full network members, able to be pinged and accessed just like a regular host. Bear in mind that this means you'll want to set up firewalls etc. if you're not in a controlled environment.
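
For instance, a very minimal iptables setup on the guest might look something like this (illustrative only - adapt to your environment):

# allow loopback, established traffic and ssh; drop everything else
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -P INPUT DROP
# persist across reboots (CentOS)
service iptables save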

Notes:

  • If you want to run more than one VM on your LAN, you need to set the guest MAC address explicitly, since otherwise qemu uses a static default that will conflict with any other similar VM on the LAN. e.g. do something like:
# HOST_ID, identifying your host machine (2-digit hex)
HOST_ID=91
# INSTANCE, identifying the guest on this host (2-digit hex)
INSTANCE=01
# Startup, but with explicit macaddr
NAME=c5-1
qemu-kvm -hda $NAME.img -name $NAME -m ${MEM:-512} \
  -net nic,macaddr=00:16:3e:${HOST_ID}:${INSTANCE}:00 -net tap &
  • This doesn't use the paravirtual ('virtio') drivers that Hadyn mentions, as these aren't available until kernel 2.6.25, so they're not available to CentOS linux guests without a kernel upgrade.
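
For guests that do have a new enough kernel, the virtio devices are just extra qemu flags - something like this (untested sketch):

NAME=c5-1
qemu-kvm -drive file=$NAME.img,if=virtio -name $NAME -m ${MEM:-512} \
  -net nic,model=virtio -net tap &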

Simple dual upstream gateways in CentOS

Update 2019-05-05: see this revised post for a simpler implementation method and a gotcha to watch out for. HT to Jim MacLeod for suggested improvements in his comments below.

Had to set up some simple policy-based routing on CentOS again recently, and had forgotten the exact steps. So here's the simplest recipe for CentOS that seems to work. This assumes you have two upstream gateways (gw1 and gw2), and that your default route is gw1, so all you're trying to do is have packets that come in on gw2 go back out gw2.

1) Define an extra routing table e.g.

$ cat /etc/iproute2/rt_tables
#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local tables
#
102     gw2
#

2) Add a default route via gw2 (here 172.16.2.254) to table gw2 on the appropriate interface (here eth1) e.g.

$ cat /etc/sysconfig/network-scripts/route-eth1
default table gw2 via 172.16.2.254

3) Add an ifup-local script to add a rule to use table gw2 for eth1 packets e.g.

$ cat /etc/sysconfig/network-scripts/ifup-local
#!/bin/bash
#
# Script to add/delete routing rules for gw2 devices
#

GW2_DEVICE=eth1
GW2_LOCAL_ADDR=172.16.2.1

if [ $(basename $0) = ifdown-local ]; then
  OP=del
else
  OP=add
fi

if [ "$1" = "$GW2_DEVICE" ]; then
  ip rule $OP from $GW2_LOCAL_ADDR table gw2
fi

4) Use the ifup-local script also as ifdown-local, to remove that rule

$ cd /etc/sysconfig/network-scripts
$ ln -s ifup-local ifdown-local

5) Restart networking, and you're done!

# service network restart
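
To verify, you can inspect the rule and the new table directly:

$ ip rule list
$ ip route show table gw2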

Finding/Deleting Broken Symlinks

Find goodness (with a recent-ish find for the '-delete'):

find -L . -type l
find -L . -type l -delete
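
Or for older finds without -delete, something like:

find -L . -type l -print0 | xargs -0 rm --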

Using Network Driver Images on RedHat/CentOS Installs

I was building a shiny new CentOS 5.0 server today with a very nice 3ware 9650SE raid card.

Problem #1: the RedHat anaconda installer kernel doesn't support these cards yet, so no hard drives were detected.

If you are dealing with a clueful Linux vendor like 3ware, though, you can just go to their comprehensive driver download page, grab the driver you need for your kernel, drop the files onto a floppy disk, and boot with a 'dd' (for 'driverdisk') kernel parameter, i.e. type 'linux dd' at your boot prompt.

Problem #2: no floppy disks! So the choices were: actually exit the office and go and buy a floppy disk, or (since this was a kickstart anyway) figure out how to build and use a network driver image. Hmmm ...

Turns out the dd kernel parameter supports networked images out of the box. You just specify dd=http://..., dd=ftp://..., or dd=nfs://..., giving it the path to your driver image. So the only missing piece was putting the 3ware drivers onto a suitable disk image. I ended up doing the following:

# Decide what name you'll give to your image e.g.
DRIVER=3ware-c5-x86_64
mkdir /tmp/$DRIVER
cd /tmp/$DRIVER
# download your driver from wherever and save as $DRIVER.zip (or whatever)
# e.g. wget -O $DRIVER.zip http://www.3ware.com/KB/article.aspx?id=15080
#   though this doesn't work with 3ware, as you need to agree to their
#   licence agreement
# unpack your archive (assume zip here)
mkdir files
unzip -d files $DRIVER.zip
# download a suitable base image from somewhere
wget -O $DRIVER.img \
  http://ftp.usf.edu/pub/freedos/files/distributions/1.0/fdboot.img
# mount your dos image
mkdir mnt
sudo mount $DRIVER.img mnt -o loop,rw
sudo cp files/* mnt
ls mnt
sudo umount mnt

Then you can just copy your $DRIVER.img somewhere web- or ftp- or nfs-accessible, and give it the appropriate url with your dd kernel parameter e.g.

dd=http://web/pub/3ware/3ware-c5-x86_64.img

Alternatives: here's an interesting post about how to do this with USB keys as well, but I didn't end up going that way.

Fuzzy Displays and Dual NVIDIA 8xxx Cards

We've been chasing a problem recently with trying to use dual nvidia 8000-series cards with four displays. 7000-series cards work just fine (we're mostly using 7900GSs), but with 8000-series cards (mostly 8600GTs) we're seeing an intermittent problem with one of the displays (and only one) going badly 'fuzzy'. It's not a hardware problem, because it persists across different displays, cables, and cards.

Turns out it's an nvidia driver issue, and present on the latest 100.14.11 linux drivers. Lonni from nvidia got back to us saying:

This is a known bug ... it is specific to G8x GPUs ... The issue is still being investigated, and there is not currently a resolution timeframe.

So this is a heads-up for anyone trying to run dual 8000-series cards on linux and seeing this. And props to nvidia for getting back to us really quickly and acknowledging the problem. Hopefully there's a fix soonish so we can put these lovely cards to use.

Linux on Gigabyte GA-M59SLI-S5/S4 Motherboards

We've been having a bit of trouble with these motherboards under linux recently. The two S4/S5 variants are basically identical except that the S5 has two Gbit ethernet ports where the S4 has only one, and the S5 has a couple of extra SATA connections - we've been using both variants. We chose these boards primarily because we wanted AM2 boards with multiple PCIe 16x slots to use with multiple displays.

We're running on the latest BIOS, and have tested various kernels from 2.6.9 up to about 2.6.19 so far - all evidence the same problems. Note that these are much more likely to be BIOS bugs, we think, than kernel problems.

The problems we're seeing are:

  • kernel panics on boot due to apic problems - we can work around this by specifying a 'noapic' kernel parameter at boot time

  • problems with IRQ 7 - we get the following message in the messages log soon after boot:

    kernel: irq 7: nobody cared (try booting with the "irqpoll" option)
    kernel:  [<c044aacb>] __report_bad_irq+0x2b/0x69
    kernel:  [<c044acb8>] note_interrupt+0x1af/0x1e7
    kernel:  [<c05700ba>] usb_hcd_irq+0x23/0x50
    kernel:  [<c044a2ff>] handle_IRQ_event+0x23/0x49
    kernel:  [<c044a3d8>] __do_IRQ+0xb3/0xe8
    kernel:  [<c04063f4>] do_IRQ+0x93/0xae
    kernel:  [<c040492e>] common_interrupt+0x1a/0x20
    kernel:  [<c0402b98>] default_idle+0x0/0x59
    kernel:  [<c0402bc9>] default_idle+0x31/0x59
    kernel:  [<c0402c90>] cpu_idle+0x9f/0xb9
    kernel:  =======================
    kernel: handlers:
    kernel: [<c0570097>] (usb_hcd_irq+0x0/0x50)
    kernel: Disabling IRQ #7
    

    after which IRQ 7 is disabled and whatever device is using IRQ 7 seems to fail intermittently or just behave strangely (and "irqpoll" would just cause hangs early in the boot process).

This second problem has been pretty annoying, and hard to diagnose, because it would affect different devices on different machines depending on what bios settings were on and what slots devices were in. I spent a lot of time chasing weird nvidia video card hangs that we were blaming on the binary nvidia kernel module, but which turned out to be this interrupt problem.

Similarly, if it was the sound device that happened to get that interrupt, you'd just get choppy or garbled sound out of your sound device, when other machines would be working flawlessly.

So after much pain, we've finally managed to come up with a workaround: it turns out that IRQ 7 is the traditional LPT port interrupt - if you ensure the parallel port is turned on in the bios (we were religiously turning it off as unused!), it will grab IRQ 7 for itself and all your IRQ problems just go away.
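
If you want to check what's sitting on IRQ 7 before and after, /proc/interrupts is the place to look:

# check which handlers are registered on IRQ 7
grep ' 7:' /proc/interrupts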

Hope that saves someone else some pain ...