I've been poking at Rusty's hypervisor, lhype, which he's developing as example code for the upcoming paravirt ops framework. The lhype patch applies cleanly to recent -mm kernels, and I'm queuing up some patches for Rusty here (or browse the git repo). The latest patch there adds support for Thread Local Storage, allowing things like Fedora binaries to run without segfaulting.
[ NOTE: an updated version of this procedure is now published here ]
If anyone wants a very quick and simple recipe to get up and running with lhype, try this:
1. Obtain an appropriate kernel and lhype patches [1]:
$ git clone git://git.infradead.org/~jmorris/linux-2.6-lhype
2. Check out the 'patches' branch:
$ cd linux-2.6-lhype
$ git checkout patches
3. Configure the kernel:
$ make menuconfig
Keep it simple initially, to get up and running. Also keep in mind that lhype currently only works on i386, with no high memory and no SMP. Enable the paravirt ops framework and select lhype as a module.
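As a rough pre-flight check on those constraints, a small shell sketch along these lines can scan the generated .config before you spend time on a build. The function name check_lhype_config is made up for illustration, and the symbol names (CONFIG_SMP, CONFIG_HIGHMEM, CONFIG_PARAVIRT) are assumed to match the tree you checked out. The lhype module's own config symbol is deliberately not checked, since its name isn't given here.

```shell
#!/bin/sh
# check_lhype_config: hypothetical helper, not part of the lhype patches.
# Usage: check_lhype_config /path/to/.config
# Returns 0 if no problems were spotted, 1 otherwise.
check_lhype_config() {
    cfg=$1
    rc=0
    # lhype has no SMP support, so the kernel must be uniprocessor.
    if grep -q '^CONFIG_SMP=y' "$cfg"; then
        echo "warning: CONFIG_SMP is enabled"
        rc=1
    fi
    # No high memory support either.
    if grep -q '^CONFIG_HIGHMEM=y' "$cfg"; then
        echo "warning: CONFIG_HIGHMEM is enabled"
        rc=1
    fi
    # The paravirt ops framework must be switched on (symbol name assumed).
    if ! grep -q '^CONFIG_PARAVIRT=y' "$cfg"; then
        echo "warning: CONFIG_PARAVIRT is not enabled"
        rc=1
    fi
    return "$rc"
}
```

Run it as `check_lhype_config .config` from the top of the kernel tree; a non-zero exit status means at least one warning was printed.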
4. Build and install the kernel:
$ make -j12 && sudo make modules_install && sudo make -j12 install
Or whatever usually works for you. Note that the kernel you're building will be for both the guest and the host domains.
5. Boot into host kernel (dom0)
This should just work. Now the fun really begins.
6. Grab a nice Linux OS image [2]:
$ wget http://fabrice.bellard.free.fr/qemu/linux-test-0.5.1.tar.gz
$ tar -xzf linux-test-0.5.1.tar.gz linux-test/linux.img
7. Create a file for shared network i/o:
$ dd if=/dev/zero of=netfile bs=1024 count=4
8. Launch a guest!
$ sudo modprobe lhype
$ sudo linux-2.6-lhype/drivers/lhype/lhype_add 32M 1 linux-2.6-lhype/vmlinux \
linux-test/linux.img netfile root=/dev/lhba
If all went well, you should see the kernel boot messages fly past and then something like:
Linux version 2.6.19-rc5-mm2-gf808425d (firstname.lastname@example.org)
(gcc version 4.1.1 20060928 (Red Hat 4.1.1-28)) #1 PREEMPT Tue Nov 28 00:53:39 EST 2006
QEMU Linux test distribution (based on Redhat 9)
Type 'exit' to halt the system
sh: no job control in this shell
It's already useful for kernel hacking. Local networking works. It'd also likely be useful for teaching purposes, being relatively simple yet quite concrete.
'lhype_add' is an app included with the kernel which launches and monitors guest domains. It's actually a simple ELF loader, which maps the guest kernel image into the host's memory, then opens /proc/lhype and writes some config info about the guest. This kicks the hypervisor into action to initialize and launch the guest, while the open procfile fd is used for control, console i/o, and DMA-like i/o via shared memory (using ideas from Rusty's earlier XenShare work). The hypervisor is simply a loadable kernel module. Cool stuff.
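Since the loader expects an ELF kernel image and lhype is i386-only, it can be handy to sanity-check the file you are about to hand it. The following is only a sketch from the shell side, not anything taken from lhype_add itself: the function name is_i386_elf is hypothetical, and it looks at nothing beyond the standard ELF header fields (the magic bytes, the 32-bit class byte, and a little-endian e_machine of EM_386).

```shell
#!/bin/sh
# is_i386_elf: hypothetical helper, not part of lhype.
# Succeeds if FILE starts with a 32-bit ELF header whose machine is i386.
is_i386_elf() {
    # Dump the first 20 bytes of the file as hex, one token per byte.
    hdr=$(od -An -tx1 -N20 "$1" 2>/dev/null) || return 1
    # Split the hex dump into positional parameters $1..$20.
    set -- $hdr
    # Bytes 0-3: ELF magic, 0x7f 'E' 'L' 'F'.
    [ "$1 $2 $3 $4" = "7f 45 4c 46" ] || return 1
    # Byte 4: EI_CLASS, 01 means ELFCLASS32.
    [ "$5" = "01" ] || return 1
    # Bytes 18-19: e_machine, 03 00 is EM_386 in little-endian order.
    [ "${19} ${20}" = "03 00" ]
}
```

For example, `is_i386_elf linux-2.6-lhype/vmlinux && echo ok` should print `ok` for a kernel built as described above, assuming the build produced a normal i386 vmlinux.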
It's a little different to Xen, in that the host domain (dom0) is simply a normal kernel running in ring 0 with userspace in ring 3. The hypervisor is a small ELF object loaded into the top of memory (when the lhype module is loaded), which contains some simple domain switching code, interrupt handlers, a few low-level objects which need to be virtualized, and finally an array of structs to maintain information for each guest domain (drivers/lhype/hypervisor.S).
The hypervisor runs in ring 0, with the guest domains running as host domain tasks in ring 1, trapping into the hypervisor for virtualized operations via paravirt ops hooks (arch/i386/kernel/lhype.c) and subsequent hypercalls (drivers/lhype/hypercalls.c). Thus, the hypervisor and host kernel run in the same ring, rather than, say, the hypervisor in ring 0 with the host kernel in ring 1, as is the case with Xen. The advantage for lhype is simplicity: the hypervisor can be kept extremely small and simple, because it only needs to handle tasks related solely to virtualization. It's just 463 lines of assembler, with comments. Of course, from an isolation point of view the host kernel is effectively part of the hypervisor, because they share the same hardware privilege level. It has also been noted that in practice, a typical dom0 has so much privileged access to the hypervisor that it's not necessarily meaningful to run them in separate rings. Probably a good beer @ OLS discussion topic.
Overall, it seems like a very clean and elegant design. I'll see if I can write up some more detailed notes on what I've learned about it soon.
Note that Rusty will be giving a presumably canonical talk on lhype at LCA 2007.
The OSDL Virtualization list is probably the best place to keep up with development at this stage, as well as Rusty's accurately titled bleeding edge page.
[1] You can also roll your own via the paravirt patch queue, where the core development takes place.
[2] A QEMU image is suggested for simplicity. An FC6 image, for example, will now work, although it won't get past single-user mode due to the lack of initrd support.
On a hypervisor-related note, IBM researcher Reiner Sailer has posted an ACM/sHype HOWTO for Fedora Core 6, which explains how to enable the current Xen security extensions.
While I was doing some ELF reading, I found this great document, A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux.
Looks like FOSS.IN/2006 was another big success.
o Google news query
Date: February 28th, 2007 01:47 am (UTC)
I assume things have changed a little since then, because I'm unable to reproduce the recipe you described:
(I'm using lguest's big patch for 2.6.20)
./Documentation/lguest/lguest 64m vmlinux --block=linux-0.2.img netfile root=/dev/lgba
(I used linux-0.2.img because linux-test-0.5.1.tar.gz has been replaced by this image)
Running this command gives me the following error:
[ 0.920000] No filesystem could mount root, tried: cramfs
[ 0.920000] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)
[ 0.920000] lguest: CRASH: VFS: Unable to mount root fs on unknown-block(254,0)
So I don't know where I've gone wrong. If you have any ideas, please help. Thanks in advance :)
The info in my post is really out of date now; I suggest reading the current documentation at the web site: http://lguest.ozlabs.org/
Not sure what problem you're seeing; it could be that the image is a partitioned disk instead of a raw fs image. Try the mailing list per the above site.
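One quick way to test the partitioned-disk theory, as a sketch only: look for the 0x55 0xAA boot signature at byte offset 510 of the image. The function name looks_partitioned is made up for illustration. Treat the result as a hint rather than proof, since some filesystems keep a boot sector with the same signature at the start of a raw image.

```shell
#!/bin/sh
# looks_partitioned: hypothetical helper for a quick image check.
# Succeeds if bytes 510-511 of FILE hold the 55 aa boot-sector signature,
# which suggests an MBR (and so a partition table) at the start of the image.
looks_partitioned() {
    # Skip 510 bytes, dump the next 2 as hex, and strip the whitespace.
    sig=$(od -An -tx1 -j510 -N2 "$1" 2>/dev/null | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

If `looks_partitioned linux-0.2.img` succeeds, the root filesystem probably starts at a partition offset inside the file rather than at byte 0, which would fit the mount failure above.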