XenServer 5.6 thin provisioning with ext3

XenServer 5.6 gives admins a choice of three kinds of volume management: LVM, LVHD or ext3. With LVHD, the default in XenServer 5.6, you gain quick snapshots and thin provisioning of snapshots and suspended virtual machines, but running virtual machines have 100% of their disk allocation counted against disk usage. To get thin provisioning of running VMs you need to build or rebuild your SRs as ext3 volumes, and you lose rapid snapshots in the process. I’m also not sure that this meets everyone’s definition of “thin provisioning”, since it is just lazy allocation of blocks on ext3. If you fill up the disk inside the VM and then delete a large amount of data, I don’t believe the space the VM’s disk takes up on the SR will shrink again. Still, with most server images in the Enterprise being nearly un-utilized, this should be effective, particularly if you are good about log rotation and don’t let your partitions fill up.
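To make the "lazy allocation" point concrete, here is a minimal sketch that has nothing XenServer-specific in it: it creates a sparse 100 MiB file and compares its apparent size with the space actually allocated. The scratch file from mktemp is just an illustration; a VHD on an ext3 SR is assumed to behave the same way, only growing as blocks get written.

```shell
#!/bin/sh
# Create a 100 MiB sparse file: count=0 writes nothing, seek just sets the size.
demo_file=$(mktemp)   # hypothetical scratch file, not a real VHD
dd if=/dev/zero of="$demo_file" bs=1024 count=0 seek=102400 2>/dev/null

# Apparent size (what the file claims to be) versus blocks really allocated.
apparent_kb=$(( $(wc -c < "$demo_file") / 1024 ))
allocated_kb=$(du -k "$demo_file" | cut -f1)

echo "apparent size:  ${apparent_kb} KB"
echo "allocated size: ${allocated_kb} KB"
rm -f "$demo_file"
```

The allocated figure only goes up as data is written into the file. Deleting files inside a guest does not punch the holes back in, which is the caveat mentioned above.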

To convert the default local storage volume on a XenServer 5.6 host you need to use the xe utilities on the console to destroy and recreate the SR. This is destructive to VMs on the host, so these instructions assume a newly built XenServer 5.6. Adapting them to add a new drive to a host and create a new ext3 SR on it using ‘xe sr-create’ with these arguments is also straightforward. If you’ve already got VMs on the SR you’ll need to migrate them off and migrate them back one way or another. Don’t try this for the first time on a VM host that you care about, particularly if you aren’t skilled with the command line.

First there’s a default template in XenServer 5.6 which needs to be removed from the storage:

# xe vbd-list
uuid ( RO)             : f5c9f545-2019-7299-be87-fc7ef00be1e2
          vm-uuid ( RO): e2ad0921-dea8-5a1a-77e8-d3257fdcf48d
    vm-name-label ( RO): XenServer Transfer VM 5.6.0-31124p
         vdi-uuid ( RO): c3a8d327-2036-4ce2-9946-f0522f7572f4
            empty ( RO): false
           device ( RO):
# xe template-uninstall template-uuid=e2ad0921-dea8-5a1a-77e8-d3257fdcf48d
The following items are about to be destroyed
VM : e2ad0921-dea8-5a1a-77e8-d3257fdcf48d (XenServer Transfer VM 5.6.0-31124p)
VDI: c3a8d327-2036-4ce2-9946-f0522f7572f4 (XenServer Transfer VM system disk)
Type 'yes' to continue
yes
All objects destroyed

If you really needed that template, you don’t have it anymore, and you’ll have to figure out how to get it back. I’m not sure what its purpose is. It is installed by default on all new XenServer 5.6 images, so you should be able to export it from a fresh install and re-import it to fix this, but I haven’t tested that and I’m not going to offer instructions on how to do it.

Next, find the uuid of the Local Storage SR:

# xe sr-list name-label="Local storage"
uuid ( RO)                : dacfea90-263e-0811-ab88-22f01b89b1b4
          name-label ( RW): Local storage
    name-description ( RW):
                host ( RO): vmhost.example.com
                type ( RO): lvm
        content-type ( RO): user

Then find the PBD that is attached to that:

# xe pbd-list sr-uuid=dacfea90-263e-0811-ab88-22f01b89b1b4
uuid ( RO)                  : daabdf71-641c-900b-3451-bd5c70675fab
             host-uuid ( RO): 23d8a9a0-a317-47a5-a1e6-858ab120b57b
               sr-uuid ( RO): dacfea90-263e-0811-ab88-22f01b89b1b4
         device-config (MRO): device: /dev/disk/by-id/scsi-36001c230bd1017000e4f2ee6554b21c8-part3
    currently-attached ( RO): true
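
If you would rather not copy and paste uuids by hand, the two lookups above can be wrapped in small helpers. This is only a sketch: the function names are made up for illustration, and it relies on the --minimal flag of the xe list commands, which prints just the matching uuid(s), comma-separated if there are several.

```shell
#!/bin/sh
# Hypothetical helper names; the xe sub-commands are the ones used above.

# Print the uuid of the SR with the given name-label.
sr_uuid_for_label() {
    xe sr-list name-label="$1" --minimal
}

# Print the uuid of the PBD attached to the given SR.
pbd_uuid_for_sr() {
    xe pbd-list sr-uuid="$1" --minimal
}

# Usage on a host (not executed here):
#   SR_UUID=$(sr_uuid_for_label "Local storage")
#   PBD_UUID=$(pbd_uuid_for_sr "$SR_UUID")
```

Check that each variable holds exactly one uuid before feeding it to anything destructive, since a name-label is not guaranteed to be unique.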

Then unplug the PBD:

# xe pbd-unplug uuid=daabdf71-641c-900b-3451-bd5c70675fab

Now destroy the SR:

# xe sr-destroy uuid=dacfea90-263e-0811-ab88-22f01b89b1b4

Now you can create the SR. I’ve been using servers that have /dev/sda, so the storage partition is /dev/sda3. If you’re doing this on an older IDE system (ick) you might have to use /dev/hda3 here, or on an HP probably /dev/cciss/c0d0p3. If you have FibreChannel or iSCSI-attached disk on a SAN you’re on your own to figure out what your block device is.

# xe sr-create content-type=user type=ext device-config:device=/dev/sda3 shared=false name-label="Local storage"
76ec3072-ae85-cd38-e363-34cf6b63d520

This command will take some time to return as it creates the SR.
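
Since the unplug, destroy and create steps are destructive, it can help to see the whole sequence printed before running any of it. The following is a rough sketch under my own assumptions: the run and rebuild_sr_as_ext helpers and the DRY_RUN variable are invented for illustration, and only the xe commands themselves come from the steps above.

```shell
#!/bin/sh
# Print each command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"          # note: quoting is not preserved in this printout
    else
        "$@"
    fi
}

# Args: PBD uuid, SR uuid, block device for the new ext SR.
rebuild_sr_as_ext() {
    pbd_uuid=$1
    sr_uuid=$2
    device=$3
    run xe pbd-unplug uuid="$pbd_uuid" || return 1
    run xe sr-destroy uuid="$sr_uuid" || return 1
    run xe sr-create content-type=user type=ext \
        device-config:device="$device" shared=false \
        name-label="Local storage"
}

# Usage (dry run, prints three lines and changes nothing):
#   rebuild_sr_as_ext daabdf71-641c-900b-3451-bd5c70675fab \
#       dacfea90-263e-0811-ab88-22f01b89b1b4 /dev/sda3
```

Only set DRY_RUN=0 once the printed commands match what you would have typed by hand.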

You now probably want to tune the reserved space down on the ext3 partition to make more of it available. By default the filesystem reserves 5% of its blocks for root-owned processes, which also helps keep block allocation efficient and fragmentation low, but you probably want to manage that yourself (set a monitoring alarm at 95% and migrate VMs off if the storage goes above it).

The block device to tune is not /dev/sda3, but you can find it from df -k:

# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1              4128448   3214896    703840  83% /
none                    384512         0    384512   0% /dev/shm
/opt/xensource/packages/iso/XenCenter.iso
                         44410     44410         0 100% /var/xen/xc-install
/dev/mapper/XSLocalEXT--76ec3072--ae85--cd38--e363--34cf6b63d520-76ec3072--ae85--cd38--e363--34cf6b63d520
                     279556112    191652 265163836   1% /var/run/sr-mount/76ec3072-ae85-cd38-e363-34cf6b63d520

Use tune2fs against that really ugly block device name to set the reserve to 0%:

# tune2fs -m 0 /dev/mapper/XSLocalEXT--76ec3072--ae85--cd38--e363--34cf6b63d520-76ec3072--ae85--cd38--e363--34cf6b63d520
tune2fs 1.39 (29-May-2006)
Setting reserved blocks percentage to 0% (0 blocks)
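
The ugly device name can also be built from the SR uuid rather than read off df. This sketch assumes, going only by the df output above, that the volume group is named XSLocalEXT-<uuid> and the logical volume is named <uuid>; device-mapper doubles every hyphen inside those names and joins them with a single one. The function name is made up.

```shell
#!/bin/sh
# Build the /dev/mapper path for a local ext SR from its uuid.
# Assumed naming (inferred from the df output): VG "XSLocalEXT-<uuid>", LV "<uuid>".
sr_dm_device() {
    # device-mapper escapes each "-" inside a VG or LV name as "--"
    escaped=$(printf '%s' "$1" | sed 's/-/--/g')
    printf '/dev/mapper/XSLocalEXT--%s-%s\n' "$escaped" "$escaped"
}

sr_dm_device 76ec3072-ae85-cd38-e363-34cf6b63d520
```

If the naming turns out to differ on your version, reading the first column of df for /var/run/sr-mount/<sr-uuid> remains the safer route.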

You should now be able to see the new “Local storage” device in XenCenter and can set it as the default storage location for new VMs. You will also see VHDs associated with your VMs showing up in the /var/run/sr-mount/[...etc...] directory.
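
With the reserve set to 0% there is no safety margin left, so the 95% alarm mentioned earlier matters. Here is a minimal sketch of such a check. The function names and the default threshold are my own choices, and it parses a single data line of df -P output, which keeps each filesystem on one line even with long device names.

```shell
#!/bin/sh
# Extract the Use% value (without the % sign) from one df -P data line.
usage_percent() {
    printf '%s\n' "$1" | awk '{ sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage is at or above the threshold (default 95).
over_threshold() {
    [ "$(usage_percent "$1")" -ge "${2:-95}" ]
}

# Usage on a host (mount point pattern taken from the df output above):
#   line=$(df -P /var/run/sr-mount/<sr-uuid> | tail -n 1)
#   over_threshold "$line" 95 && echo "SR above 95%, migrate VMs off"
```

Run it from cron or hook it into whatever monitoring you already have.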

9 Comments

  1. Russ says:

    I got this to work nicely in 5.6 on a single host via iSCSI, thank you very much… But I do have a question regarding getting this same type of “Thin Provisioning” in a pool with multiple hosts.

    Is that even possible?

    If so enlighten me :)

    Thank you in advance

  2. All I’ve done so far is do this against hosts with direct attached SCSI drives.

    So far I don’t have any shared storage to play with. Right now I’ve only got corp IT (wikis, etc) that I’m reusing some old crappy hardware on. VM migration is somewhat painful with vm-export/vm-import to NFS for backups and when I have to move things around. So everything in my environment right now is a little bit ghetto — but this trick is helping me keep the disk space down.

    I’m thinking of putting together some cheap Linux iSCSI + GNBD-replicated shared storage heads for pre-production use.

    And at some point I’ll need proper shared storage for load and production.

    For both of those, I’m just not there yet, though, and it’ll be a few months before I get there.

  3. Timothy Geist says:

    What about the performance? From what I’ve seen using ext3 is a substantial overhead. What if you plan to fill up a server with like 50 VMs. Won’t that be a problem?

    Also I assume you also lose the LVM snapshot ability which is pretty handy.

  4. You probably shouldn’t do 50 VMs on a single host with direct attached storage.

    At that point you probably want some kind of SAN/NAS appliance that does disk deduplication natively for you.

    Right now I’m taking already purchased 1U servers and chopping them up into more like 10-16 VMs with XenServer Free. One thing that I do is eliminate all the useless cronjobs that chew up I/O in the middle of the night (slocate, etc). I’ve also turned off fsync() in syslog.conf which produces large wins in getting I/O down very low.

    Since there’s no licensing overhead, or shared storage or anything I’m just getting 10:1 or 16:1 compression and making efficient financial use of capital (each server cost $5k or so, 3+ years ago). If you’re buying expensive virt heads for $20k and loading them up with Enterprise licensing, then you need to hit compression like 50:1 to make sense, and then doing ‘thin provisioning’ this way doesn’t make any sense.

    You don’t lose vm snapshots if you do this, but they’re not as fast or thin. I can still take vm snapshots and then vm export a file to take a backup. Tends to crush the I/O in the middle of the night, though.

    And I haven’t measured what the relative I/O performance is between ext3 and LVM or LVHD, but like I said, I’ve tuned the VMs that I’ve configured this way so that they mostly don’t do any I/O…

    And I don’t recommend this for production or load testing. I’m using this for cheap virts for corporate IT stuff, and integration and dev sandboxes. It is a very “poor man’s” approach.

  5. Timothy Geist says:

    Thanks for your info Lamont,

    Can you tell me what 1U servers you’re using? Also I’m a bit baffled by your ratio. You said you’re not using shared storage but you mentioned compression? How are you getting the compression? Or were you talking about thin?

    We’re thinking of using NexentaStor for shared storage, although that’s still a bit further away (and the deduplication is currently buggy). I’m not sure how reliable it would be but it shows promise.

    Currently we have about 35 VMs (a mix of CentOS and Server 2008) per host (8 cores and 64 GB of RAM, 3ware RAID 10 with 8 drives) and it works pretty well. But it is using the default install and not ext3, which is why I was wondering how much of an impact there would be if I did change to ext3.

    Thanks for your tips on the I/O. Will check it out.

  6. We’re using Dell 1950s right now with 32GB of RAM, 2 SAS drives, 4 cores (about 3 years old, repurposing them). As long as you don’t need any CPU, then on a memory-limited basis it still financially works out to upgrade the chassis and run them.

    By compression I mean physical-to-virtual compression — 10 images on one virtual server being 10:1 (VM) compression.

    Part of the issue that I’m dealing with is having literally hundreds of dell 1950s chassis that we already own and having zero budget for capital.

    Interesting to know that you’re getting away with using internal storage with 35:1 VM compression (although you’ve got more drives). I’ve got some old 2950s with 6 drives that I’m looking at using and a few DL380s with 8 drives that are in the same class.

    I haven’t tested out the ext3 I/O, I’d be interested in any numbers you come up with if you test it.

    I’ve also been thinking about blogging something about tuning to reduce I/O on servers.

  7. Here’s the info on reducing server I/O load:

    http://www.scriptkiddie.org/blog/2010/09/19/reducing-server-io-on-virtualized-hosts/

    If you’ve got ubuntu, you’ll probably need to adapt the instructions.

  8. Johnny Puffs says:

    Hi Lamont,

    Very helpful article on setting up thin provisioning with ext3. Just what I needed.

    Regarding the template “XenServer Transfer VM 5.6.0-31124p” that needs to be uninstalled.. It’s actually pretty important. This is what handles all the functions for the Virtual Appliance Tools (Import Virtual Appliance, Export Virtual Appliance and Disk Image Import). These will all fail without this template.

    Fortunately it is VERY easy to place it back on your server without the need for exporting from another XenServer and importing it…

    In the XenServer host directory /opt/xensource/packages/files/transfer-vm/ are a number of related files. It takes a couple easy steps to reinstall the Template VM:

    1. If you cd into the directory and list the files you will see this output:

    [root@ixen ~]# cd /opt/xensource/packages/files/transfer-vm/
    [root@ixen transfer-vm]# ls -al
    total 3692
    drwxr-xr-x 2 root root 4096 May 12 12:30 .
    drwxr-xr-x 5 root root 4096 May 12 12:30 ..
    -rw-r--r-- 1 root root 134 May 20 2010 65-install-transfer-vm
    -rwxr-xr-x 1 root root 478 May 20 2010 do-copy
    -rwxr-xr-x 1 root root 613 May 20 2010 do-transfer
    -rwxr-xr-x 1 root root 1784 May 20 2010 install-transfer-vm.sh
    -rw-r--r-- 1 root root 3747840 May 20 2010 transfer-vm.xva
    -rwxr-xr-x 1 root root 491 May 20 2010 uninstall-transfer-vm.sh
    [root@ixen transfer-vm]#

    2. Then simply run the install-transfer-vm.sh script and you are done:

    [root@ixen transfer-vm]# ./install-transfer-vm.sh

    Hope that helps some people.

    Thanks for your article.

  9. Braden says:

    I was going to use these commands to upgrade XenServer 5.6 to 5.6 SP2 (so we can start to use IntelliCache)… I was able to run the commands but was a little worried about the fact that the partition is still set up with LVM.

    Do I need to delete the partition and recreate it in Linux before I re-create the storage repository, or is it OK if the underlying partition has LVM set up?

    THANKS FOR THE HELP!
