I've moved my blog to jmcneil.net. This is no longer being updated!

Tuesday, January 20, 2009

Learning How it Really Works

Over the past couple of weeks, I've been trying to dive much deeper into Python than I have in the past. Overall, I have about 4 years of experience writing Python code and a few C extensions. It feels as though I should know more about the platform that has been paying my mortgage.

My goal today was to instantiate a code object manually and piece together a string of bytecode that would simply import another module. I'm by no means an expert at this!

As a guide, I am using the PyCodeObject structure as defined in code.h

/* Bytecode object */
typedef struct {
    PyObject_HEAD
    int co_argcount;        /* #arguments, except *args */
    int co_nlocals;         /* #local variables */
    int co_stacksize;       /* #entries needed for evaluation stack */
    int co_flags;           /* CO_..., see below */
    PyObject *co_code;      /* instruction opcodes */
    PyObject *co_consts;    /* list (constants used) */
    PyObject *co_names;     /* list of strings (names used) */
    PyObject *co_varnames;  /* tuple of strings (local variable names) */
    PyObject *co_freevars;  /* tuple of strings (free variable names) */
    PyObject *co_cellvars;  /* tuple of strings (cell variable names) */
    /* The rest doesn't count for hash/cmp */
    PyObject *co_filename;  /* string (where it was loaded from) */
    PyObject *co_name;      /* string (name, for reference) */
    int co_firstlineno;     /* first source line number */
    PyObject *co_lnotab;    /* string (encoding addr<->lineno mapping) */
    void *co_zombieframe;   /* for optimization only (see frameobject.c) */
} PyCodeObject;

It's possible to generate a code object from within Python via the types.CodeType class. The documentation states that the object requires 12 arguments, all of which correspond to the above structure members.

class code(object)
| code(argcount, nlocals, stacksize, flags,
| codestring, constants, names,
| varnames, filename, name, firstlineno,
| lnotab[, freevars[, cellvars]])
|
| Create a code object. Not for the faint of heart.
|
In my little example, I simply want to "import this", which should trigger a print of The Zen of Python. The first step was to determine what my bytecode string needs to look like. To do this, I wrote a simple Python module and generated a .pyc file. I was then able to open the file and extract what I needed to run my import.

[jeff@marvin ~]$ cat testmodule.py
import this
[jeff@marvin ~]$

Next, import the module.

[jeff@marvin ~]$ python -c 'import testmodule'
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
[jeff@marvin ~]$

Good. Now we have a .pyc file. Next, we'll open the file, pull the magic out,
and determine what our actual bytecode is.

>>> import dis
>>> import marshal
>>> f = open('testmodule.pyc', 'rb')
>>> f.read(8)
'\xd1\xf2\r\n\x935vI'
>>>

We read the first 8 bytes of the file to move the file pointer. This gets us past the header (a 4-byte magic number followed by a 4-byte modification timestamp) and to the start of the actual marshaled code object. To access that object:

>>> marshal.load(f)
<code object <module> at 0xb7f43578, file "testmodule.py", line 1>
>>> c = _
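
For reference, those eight header bytes can be picked apart by hand. This is only a sketch, and it assumes the Python 2.x .pyc layout (a two-byte magic number, the bytes "\r\n", then a four-byte little-endian modification time). Newer interpreters use a longer header, so don't expect the same offsets there. The variable names are mine.

```python
import struct
import time

# The eight bytes returned by f.read(8) in the session above.
header = b'\xd1\xf2\r\n\x935vI'

# Assumed layout (Python 2.x): uint16 magic, two bytes "\r\n", uint32 mtime,
# all little-endian and unpadded.
magic, crlf, mtime = struct.unpack('<H2sI', header)

print("magic number: %d" % magic)
print("separator:    %r" % crlf)
print("source mtime: %s" % time.strftime('%Y-%m-%d', time.gmtime(mtime)))
```

The magic number identifies the interpreter version that wrote the file, which is why a .pyc from one release is rejected by another.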

The bytecode string itself is stored in c.co_code. Using the dis module, we can take a look at the bytecode layout.

>>> dis.dis(c)
1 0 LOAD_CONST 0 (-1)
3 LOAD_CONST 1 (None)
6 IMPORT_NAME 0 (this)
9 STORE_NAME 0 (this)
12 LOAD_CONST 1 (None)
15 RETURN_VALUE
>>>

Ok, so the instruction we're *really* worried about is at offset 6. It is an IMPORT_NAME opcode, with arguments 0, and (this). What do those arguments mean? The C code that actually executes IMPORT_NAME is located in Python/ceval.c:2088.

...
case IMPORT_NAME:
    w = GETITEM(names, oparg);
    x = PyDict_GetItemString(f->f_builtins, "__import__");
    if (x == NULL) {
        PyErr_SetString(PyExc_ImportError,
                        "__import__ not found");
        break;
...

The first statement executed sheds a bit of light on the details. The '0' argument to the IMPORT_NAME opcode is an index into a names tuple. In our scenario, the corresponding value needs to be the name of the module that we're loading.

We're going to ignore STORE_NAME. The other opcode we care about is LOAD_CONST. Its corresponding argument serves the same purpose, this time as an index into the constants tuple:

...
case LOAD_CONST:
    x = GETITEM(consts, oparg);
...

Now we can build up a bytecode string. Note that the indexes specified below will correspond to other elements of the CodeType class. We've quite simply not set that up yet. Our raw bytecode string looks like the following:

d\x01\x00d\x00\x00k\x00\x00d\x00\x00S

This translates to:

100 1 0 100 0 0 107 0 0 100 0 0 83

Now, using 'dis.dis' on the above code string, we wind up with the following byte code:

0 LOAD_CONST 1 (1)
3 LOAD_CONST 0 (0)
6 IMPORT_NAME 0 (0)
9 LOAD_CONST 0 (0)
12 RETURN_VALUE
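
The listing above can also be reproduced by hand, which makes the layout of the string a little less magical. The sketch below assumes the Python 2.x encoding: one byte of opcode, followed by a two-byte little-endian argument for any opcode numbered 90 or higher. The opcode table is hard-coded with just the three values used in this post (they are the 2.6 numbers and differ in other versions), and decode is a name I made up.

```python
# Opcode numbers as they appear in this post (Python 2.6); other versions differ.
OPNAMES = {83: "RETURN_VALUE", 100: "LOAD_CONST", 107: "IMPORT_NAME"}
HAVE_ARGUMENT = 90  # in Python 2.x, opcodes >= 90 carry a 2-byte argument


def decode(code):
    """Return (offset, opname, arg) tuples for a Python 2.x bytecode string."""
    code = bytearray(code)  # iterating a bytearray yields ints on Python 2 and 3
    result = []
    i = 0
    while i < len(code):
        op = code[i]
        name = OPNAMES.get(op, "<%d>" % op)
        if op >= HAVE_ARGUMENT:
            arg = code[i + 1] | (code[i + 2] << 8)  # little-endian 16-bit value
            result.append((i, name, arg))
            i += 3
        else:
            result.append((i, name, None))
            i += 1
    return result


raw = b'd\x01\x00d\x00\x00k\x00\x00d\x00\x00S'
print(list(bytearray(raw)))
for offset, name, arg in decode(raw):
    print("%2d %-12s %s" % (offset, name, "" if arg is None else arg))
```

Note that nothing here knows what the arguments refer to. That only becomes meaningful once the constants and names tuples exist.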

So, now we can call types.CodeType and build up a code object using our own home-brewed byte string.

import types

func_code = 'd\x01\x00d\x00\x00k\x00\x00d\x00\x00S'
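# Arguments: argcount, nlocals, stacksize, flags, codestring, constants,
# names, varnames, filename, name, firstlineno, lnotab.
# stacksize is 2 because both LOAD_CONSTs push a value before IMPORT_NAME runs.
c = types.CodeType(0, 1, 2, 0, func_code,
    (None, -1), ('this',), ('this',), 'test_filename', 'test_name', 1, '')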

eval(c)

The constants and names tuples are indexed by LOAD_CONST and IMPORT_NAME as explained above, which makes it possible to translate 'IMPORT_NAME 0 (0)' into 'IMPORT_NAME 0 (this)'. The final argument, lnotab, encodes address/lineno mappings. My assumption is that it is used for mapping between bytecode offsets and source line numbers, starting with 'firstlineno'.
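
To make the lnotab guess a bit more concrete, here is a rough sketch of how such a table could be walked. It assumes the Python 2.x format as I understand it: a string of byte pairs, each holding a bytecode offset increment and a line number increment, read in the same spirit as the dis module's findlinestarts helper. The sample table is made up for illustration, and later Python 3 releases store this information differently.

```python
def line_starts(lnotab, firstlineno):
    """Return (bytecode offset, line number) pairs from a 2.x-style lnotab."""
    table = bytearray(lnotab)
    starts = []
    last_line = None
    addr, line = 0, firstlineno
    # Even positions hold offset increments, odd positions line increments.
    for addr_incr, line_incr in zip(table[0::2], table[1::2]):
        if addr_incr:
            if line != last_line:
                starts.append((addr, line))
                last_line = line
            addr += addr_incr
        line += line_incr
    if line != last_line:
        starts.append((addr, line))
    return starts


# The empty table used in this post: everything belongs to firstlineno.
print(line_starts(b'', 1))
# A hypothetical table: +12 bytes/+1 line, then +12 bytes/+2 lines.
print(line_starts(b'\x0c\x01\x0c\x02', 1))
```

With an empty lnotab every instruction is attributed to line 1, which is all a one-line import needs.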

When the code object is disassembled, the values are referenced correctly:
>>> dis.dis(c)
1 0 LOAD_CONST 1 (-1)
3 LOAD_CONST 0 (None)
6 IMPORT_NAME 0 (this)
9 LOAD_CONST 0 (None)
12 RETURN_VALUE
>>>

Running the above code does exactly what it should:
[jeff@marvin ~]$ python test.py
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
[jeff@marvin ~]$
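
One caveat: the types.CodeType argument list and the opcode numbers are specific to the interpreter version, so the hand-built object above is tied to the Python I ran it on. A more portable way to get a comparable code object, sketched here, is to let compile() fill in the fields and then inspect them.

```python
import dis
import types

# compile() builds the code object for us, whatever the interpreter version.
code = compile("import this", "test_filename", "exec")

print(isinstance(code, types.CodeType))
print(code.co_names)   # the names tuple that IMPORT_NAME indexes into
dis.dis(code)          # disassembly; exact opcodes vary between versions
```

Passing the object to eval() or exec() would run the import just as the hand-assembled version does.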

Thursday, January 15, 2009

Learning Xen

I've been diving into Xen over the past week or so. In an effort to learn how it really works, I decided to set up a new VM without using the Red Hat documented virt-install utility. A lot of this I learned from http://wiki.xensource.com. Hopefully someone else finds this useful as well.

I'm doing this on Red Hat Enterprise v5.2. My machine currently has 2GB of RAM and about 20GB free space under /var, which is where I'll stick my disk image. I've a dual-core CPU, but it doesn't appear to have virtualization extensions available.

1. Setting up the host system.

This is a simple process. All of the Xen packages are available via Yum and are part of the base entitlement.

xenhost# yum install xen kernel-xen
xenhost# yum install virt-manager libvirt libvirt-python \
python-virtinst

The kernel-xen package updates the /etc/grub.conf file, but doesn't set the Xen kernel to boot by default. On my system, that meant setting the default kernel to '0' as opposed to '1', but that will probably differ. Simply reboot.

2. Creating Disk Images

Xen supports a few different block device types. It's possible to directly attach physical devices, use plain files, or use NBD devices. It's even possible to set up a copy-on-write configuration, which is probably very useful when testing installations that require rolling back. In this example, I'm going to use the "blktap" driver and a disk image.

The disk image itself is nothing but a dump of /dev/zero.

[root@xenhost images]# dd if=/dev/zero bs=1024 \
count=1500000 of=example.dsk
1500000+0 records in
1500000+0 records out
1536000000 bytes (1.5 GB) copied, 32.68 seconds, 47.0 MB/s

There. That gives us roughly 1.5GB of space, minus FS overhead. That ought to be more than enough to hold a minimal Red Hat Linux installation. The next step is to create a filesystem.

[root@xenhost images]# mkfs -t ext3 -j example.dsk
mke2fs 1.39 (29-May-2006)
example.dsk is not a block special device.
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
187776 inodes, 375000 blocks
18750 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=385875968
12 block groups
32768 blocks per group, 32768 fragments per group
15648 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 25 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@xenhost images]# tune2fs -c0 -i0 ./example.dsk
tune2fs 1.39 (29-May-2006)
Setting maximal mount count to -1
Setting interval between checks to 0 seconds
[root@xenhost images]#
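
As a quick sanity check on the numbers dd and mke2fs printed above, the arithmetic can be redone in a few lines. Nothing here is Xen-specific. The values are simply copied from the commands and their output.

```python
block_size = 1024   # bs= passed to dd
count = 1500000     # count= passed to dd
size_bytes = block_size * count

fs_block = 4096     # "Block size=4096" reported by mke2fs

print("image size: %d bytes" % size_bytes)
print("decimal GB: %.2f" % (size_bytes / 1e9))
print("binary GiB: %.2f" % (size_bytes / float(2 ** 30)))
print("4K blocks:  %d" % (size_bytes // fs_block))
```

The block count matches the 375000 blocks that mke2fs reports, so the whole image is in use by the filesystem.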

We'll also need swap space.

[root@xenhost images]# dd if=/dev/zero bs=1024 \
count=256000 of=example-swap.dsk
256000+0 records in
256000+0 records out
262144000 bytes (262 MB) copied, 2.82531 seconds, 92.8 MB/s
[root@xenhost images]# mkswap ./example-swap.dsk
Setting up swapspace version 1, size = 262139 kB
[root@xenhost images]#

Now we need to mount the image up and install a base version of Linux. The common method of doing this is to use a loopback device (losetup and friends) and mount the image as a file system. I'm not going to do it that way. Remember, as we rebooted under the Xen kernel, we're running in the context of Domain0. It's possible to use the Xen tools to make this file system available as we would to any guest domain. The only trick? We specify '0' as our domain ID. This also helped to get me familiar with the Xen utilities.

[root@xenhost images]# modprobe xenblk
[root@xenhost images]# xm block-attach 0 \
tap:aio:/var/lib/xen/images/example.dsk xvda1 w
[root@xenhost images]# ls -al /dev/xvda1
brw-r----- 1 root disk 202, 1 Jan 15 16:46 /dev/xvda1

Lots going on there! So, in English: attach the block device located at /var/lib/xen/images/example.dsk to domain0 as /dev/xvda1. It should be writable. Now, we can mount that file system just like we would any other. No need for a loopback device.

[root@xenhost images]# mount /dev/xvda1 /mnt
[root@xenhost images]# ls /mnt
lost+found
[root@xenhost images]# mount
...
/dev/xvda1 on /mnt type ext3 (rw)

It would have been possible to specify "file://var/lib.." as opposed to "tap:aio", but from what I understand, blktap is the preferred mechanism as the consistency of the guest OS isn't at the mercy of the host buffer cache contents (power outage, anyone?).

3. Install the Guest OS.

There are a few ways to do this, but the net result has to be the same: OS files need to make it to this FS. You can do this via Yum & the --installroot option, cp -r, RPM & chroot. In my opinion, the easiest method is to use Yum.

[root@xenhost images]# mkdir -p /mnt/var/lib/yum
[root@xenhost images]# yum --installroot=/mnt groupinstall Base
...read repos...
Transaction Summary
=============================================================================
Install 333 Package(s)
Update 0 Package(s)
Remove 0 Package(s)

Total download size: 188 M
Is this ok [y/N]: y
...download and run transaction...
Complete!
[root@xenhost images]# yum install --installroot=/mnt -y kernel-xen
...
Complete!

Almost there. Now we need to chroot into the guest OS and configure a few things. All of this could easily be automated with a script, and probably should be if more than a couple of domU virtuals are set up.

[root@xenhost var]# chroot /mnt
bash-3.2# authconfig --useshadow --update
bash-3.2# passwd root
Changing password for user root.
New UNIX password:
BAD PASSWORD: it is based on a dictionary word
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
bash-3.2# echo "127.0.0.1 localhost" > /etc/hosts
bash-3.2# cd /root && cp /etc/skel/.* .
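
Since I mentioned automating this, here is a minimal sketch of what such a script might look like. It only covers the non-interactive pieces from the session above (authconfig and the /etc/hosts line), leaves the root password alone, and defaults to a dry run. The function names and the dry_run flag are my own invention, and running it for real would need root and a populated guest tree.

```python
import os
import subprocess
import tempfile


def guest_commands(root):
    """Commands from the chroot session above, expressed as argv lists."""
    return [
        ["chroot", root, "authconfig", "--useshadow", "--update"],
    ]


def write_hosts(root):
    """Write the same one-line /etc/hosts used above into the guest tree."""
    etc = os.path.join(root, "etc")
    if not os.path.isdir(etc):
        os.makedirs(etc)
    path = os.path.join(etc, "hosts")
    with open(path, "w") as handle:
        handle.write("127.0.0.1 localhost\n")
    return path


def configure_guest(root, dry_run=True):
    """Apply the guest tweaks; with dry_run set, only print the commands."""
    write_hosts(root)
    for argv in guest_commands(root):
        if dry_run:
            print("would run: %s" % " ".join(argv))
        else:
            subprocess.check_call(argv)


if __name__ == "__main__":
    # Demonstrated against a scratch directory rather than a real /mnt.
    configure_guest(tempfile.mkdtemp(), dry_run=True)
```

Keeping the commands as data makes it easy to print them first and only execute once the list looks right.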

Also, it's necessary to update /etc/modprobe.conf on the guest to include the Xen drivers.

alias eth0 xennet
alias scsi_hostadapter xenblk

Lastly, we need an /etc/fstab that matches our Xen configuration. I've used the following in this example.

/dev/xvda1 / ext3 defaults 1 1
/dev/xvda2 none swap sw 0 0
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0

Now we have a working, though limited, install. I didn't bother to set up networking or anything just yet; that's fairly textbook once the instance is running. We need to unmount the new OS directory and block-detach the xvda1 device.

[root@xenhost var]# umount /mnt
[root@xenhost var]# xm block-detach 0 xvda1

4. Building up a Xen Configuration File.

The next step is to create a working domain configuration file under /etc/xen. There are a couple of pieces of data we'll need to generate first. Both the MAC as well as the UUID need to be unique across systems. To do this, I put together a small Python script.

#!/usr/bin/python

import virtinst.util

print "New UUID: %s" %\
virtinst.util.uuidToString(virtinst.util.randomUUID())
print "New MAC: %s" % virtinst.util.randomMAC()

Running the above script outputs the following:

New UUID: 65ffda11-5fef-0876-23a4-76839888b36b
New MAC: 00:16:3e:22:15:51

So, now it's possible to put a Xen configuration together. Note the MAC and the UUID from the above script are used.

name = "example"
uuid = "65ffda11-5fef-0876-23a4-76839888b36b"
memory = 128
vcpus = 16 # Why not? ;-)
kernel = "/boot/vmlinuz-2.6.18-92.1.22.el5xen"
ramdisk = "/boot/initrd-2.6.18-92.1.22.el5xen-no-scsi.img"
disk = [ "tap:aio://var/lib/xen/images/example.dsk,xvda1,w",
"tap:aio://var/lib/xen/images/example-swap.dsk,xvda2,w"]
root= "/dev/xvda1 ro"
vif = ["mac=00:16:3e:22:15:51,bridge=xenbr0,script=vif-bridge" ]

The configuration is pretty straightforward. The vif line creates an eth0 device within the guest that's part of the xenbr0 bridge. This makes the new VM accessible via the same network that the host resides on. Configure that interface as you would a normal, physical device.
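
If virtinst isn't available, the UUID and MAC used in the configuration could be approximated with the standard library alone. This is a stand-in of my own rather than what virtinst does internally: it assumes the 00:16:3e prefix seen in the generated MAC earlier, and keeping the fourth octet at or below 0x7f is a convention I'm recalling, not something verified here. Random values are also only very likely to be unique, not guaranteed.

```python
import random
import uuid


def random_xen_mac():
    """Return a random MAC under the 00:16:3e prefix seen earlier."""
    octets = [0x00, 0x16, 0x3e,
              random.randint(0x00, 0x7f),   # assumption: upper bit kept clear
              random.randint(0x00, 0xff),
              random.randint(0x00, 0xff)]
    return ":".join("%02x" % octet for octet in octets)


print("New UUID: %s" % uuid.uuid4())
print("New MAC: %s" % random_xen_mac())
```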

5. Start up the Virtual

This is the easy part. Simply run 'xm create example'. If everything was done correctly, the new virtual ought to start up. To watch the machine boot and log in, simply type 'xm console example'.

[root@xenhost xen]# xm console example

Red Hat Enterprise Linux Server release 5.2 (Tikanga)
Kernel 2.6.18-92.1.22.el5xen on an i686

example login: root
Password:
Last login: Thu Jan 15 20:08:25 on xvc0
[root@example ~]# uname -a
Linux example 2.6.18-92.1.22.el5xen #1 SMP Fri Dec 5 10:29:16 EST 2008 i686 i686 i386 GNU/Linux
[root@example ~]# cat /proc/cpuinfo | grep processor | wc -l
16
[root@example ~]#

6. Configurations are Python!

The Xen configuration files not only look like Python, they *are* Python. This makes the entire configuration process extremely flexible. For (an extremely useless) example:

[root@xenhost xen]# cat example
name = "example"
uuid = "65ffda11-5fef-0876-23a4-76839888b36b"
memory = 128
vcpus = 16
kernel = "/boot/vmlinuz-2.6.18-92.1.22.el5xen"
ramdisk = "/boot/initrd-2.6.18-92.1.22.el5xen-no-scsi.img"
disk = [ "tap:aio://var/lib/xen/images/example.dsk,xvda1,w",
"tap:aio://var/lib/xen/images/example-swap.dsk,xvda2,w"]
root= "/dev/xvda1 ro"
vif = ["mac=00:16:3e:22:15:51,bridge=xenbr0,script=vif-bridge" ]
#vfb = [ "type=vnc,vncdisplay=2" ]

for i in disk:
    print "device %s" % i
[root@xenhost xen]# xm create example
Using config file "./example".
device tap:aio://var/lib/xen/images/example.dsk,xvda1,w
device tap:aio://var/lib/xen/images/example-swap.dsk,xvda2,w
Started domain example
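
A slightly less useless direction, sketched below, is to generate these files from a template and then evaluate the result as Python to check it. The render_config helper and the template are hypothetical. The kernel and ramdisk lines are left out for brevity, and the disk paths simply mirror the ones in the configuration above.

```python
TEMPLATE = '''name = "%(name)s"
uuid = "%(uuid)s"
memory = %(memory)d
disk = [ "tap:aio://var/lib/xen/images/%(name)s.dsk,xvda1,w",
         "tap:aio://var/lib/xen/images/%(name)s-swap.dsk,xvda2,w" ]
root = "/dev/xvda1 ro"
vif = [ "mac=%(mac)s,bridge=xenbr0,script=vif-bridge" ]
'''


def render_config(name, guest_uuid, mac, memory=128):
    """Fill in the per-guest values of the template."""
    return TEMPLATE % {"name": name, "uuid": guest_uuid,
                       "mac": mac, "memory": memory}


text = render_config("example",
                     "65ffda11-5fef-0876-23a4-76839888b36b",
                     "00:16:3e:22:15:51")

# Because the file is plain Python, it can be evaluated to inspect the settings.
settings = {}
exec(text, settings)
for device in settings["disk"]:
    print("device %s" % device)
```

Evaluating the rendered text before handing it to xm is a cheap way to catch a syntax slip in the template.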

I know I left off a lot of important configuration, but I know more about Xen now than I did yesterday. I think I'll take a dive into libvirt over the next day or so. About the only thing cooler than Xen virtualization is Xen virtualization in Python. I've done a lot of automated installation work in the past, and this really exposes a lot of functionality I wish I had back then.

Monday, January 12, 2009

Python 3.0 Porting Efforts?

I've spent the last hour or so wondering whether there is a page or resource that lists which common Python packages have been ported to 3.0.

Two reasons.  

First, I think it would be very useful to know what's been updated. Secondly, I've been trying to find a reason to write some real Python 3.0 code. If such a resource existed, it would be easy to offer a bit of my time towards helping with porting efforts. I've written a few "test.py" scripts to learn new features, but nothing of any substance.

Does anyone know?

Update: As it's been pointed out (and I probably should have thought of this), one can simply use the package index to pull a list of 3.0-compatible packages: http://pypi.python.org/pypi?:action=browse&c=533. I guess a good starting point would be to look at useful packages that aren't on there and go from there. I was really looking for more of a community effort page, as it's already quite easy to figure out what a specific package supports.

Friday, January 2, 2009

Making a Freelance Living?

I've wanted to post about this topic for a while now, though I haven't been able to come up with an approach that doesn't make it look like I'm begging for work! I'm employed and I usually enjoy my job!


I've had dozens upon dozens of conversations with fellow developers regarding freelance software work. Many of them think that it's a fairly trivial process to strike out on their own and simply "freelance." You know, if they could only shake the management overhead, they would make gazillions.


I don't get it.


First of all, how exactly does an individual with a software engineering background find work as a freelance developer?  This has always escaped me.  It seems as though there's a bit of marketing knowledge needed.


I have friends that find work. I have coworkers that keep busy on the weekends. These opportunities seem to pass me by. Is it a situation where a developer needs to set up an entire corporate facade and pawn himself off as "Synergistic Corporate Solutions" or "Logimental Systems Design"? Complete with domain, eight-hundred number, and phony sales department? Is it a word-of-mouth thing?


Next, what about ongoing maintenance agreements, bug fixes, and support? All of this sounds like a lot for one dude. How do you folks out there running one-man shops set this stuff up? My first thought is that earning enough money to fully support someone would require a level of service too high to provide with such limited resources.


Is there even a market for it? I've a second degree black belt in Python-Fujitsu, a small collection of vendor certifications, 12 years of professional Linux experience, and I understand enough of the business voodoo to get invited to the fancy marketing meetings.  Aren't there 45,000 guys just like me out there trying to make a buck on the side the same way? It seems as though it would be a bit of a saturated market.


I'm coming to the end of a two-week vacation. I could really get used to working in a home office. It's been nice to see my kids during the day and enjoy a bit more time with the family. I'm really starting to like not having to sit in Atlanta traffic twice a day (and my blood pressure is thanking me!). My current employer doesn't allow telecommuting, so it's not really an option right now.


I guess I'm looking for a good book or similar resource that touches on the subject. Perhaps the experiences of others that have tried to make a living doing what it is we do.