Virtualization
Virtualization is a broad topic that becomes more relevant to open source computing every day. While cloud providers like Amazon Web Services offer the ability to spin up servers running proprietary operating systems, more often than not users choose to run an open source operating system such as Linux or FreeBSD. With desktop virtualization software like VirtualBox it’s simple to run a free OS on your personal workstation, too. By making the deployment of open source operating systems almost trivial, these virtualization techniques bring more users into the open source family every day.
Emulators
First, let’s talk a little bit about emulators, which are a slightly different beast from virtualization software.
- Bochs is an older project that emulates x86 CPUs. It’s interesting to read about the internals and discover how the hardware is emulated in software.
- Hercules Mainframe Emulator
- QEMU is a particularly interesting project that dances on the border between virtualization and emulation. While it’s able to strictly emulate a wide variety of systems in several modes, QEMU is also capable of using KVM to provide virtualization and improve performance.
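To make "hardware emulated with software" a bit more concrete, here is a minimal sketch of the fetch, decode, execute loop that sits at the heart of an emulator. The three-register machine and its instruction names (LOAD, ADD, DEC, JNZ, HALT) are made up for illustration; this is not how Bochs or QEMU is actually structured, and a real x86 emulator also has to model memory, flags, interrupts, and devices.

```python
def run(program, max_steps=1000):
    """Emulate a hypothetical 3-register, 8-bit CPU and return its registers."""
    regs = {"A": 0, "B": 0, "C": 0}
    pc = 0  # program counter: index of the next instruction
    for _ in range(max_steps):
        op, *args = program[pc]  # fetch and decode
        pc += 1
        if op == "LOAD":         # LOAD reg, immediate
            regs[args[0]] = args[1] & 0xFF
        elif op == "ADD":        # ADD dst, src  (dst = dst + src, wraps at 8 bits)
            regs[args[0]] = (regs[args[0]] + regs[args[1]]) & 0xFF
        elif op == "DEC":        # DEC reg
            regs[args[0]] = (regs[args[0]] - 1) & 0xFF
        elif op == "JNZ":        # JNZ reg, target: jump if reg is not zero
            if regs[args[0]] != 0:
                pc = args[1]
        elif op == "HALT":
            return regs
        else:
            raise ValueError(f"unknown instruction: {op}")
    raise RuntimeError("step limit reached without HALT")


# Multiply 5 by 3 using repeated addition.
MULTIPLY = [
    ("LOAD", "A", 0),   # accumulator
    ("LOAD", "B", 5),   # value to add
    ("LOAD", "C", 3),   # loop counter
    ("ADD", "A", "B"),  # index 3: top of the loop
    ("DEC", "C"),
    ("JNZ", "C", 3),
    ("HALT",),
]

if __name__ == "__main__":
    print(run(MULTIPLY))
```

Every guest instruction costs many host instructions in a loop like this, which is why pure emulation is slow and why handing the work to the real CPU (as QEMU does with KVM) improves performance.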
Virtualization
Virtualization lets us run multiple operating systems (several instances of the same OS, or entirely different ones) on the same physical hardware. Let’s look at Xen, a leader in open source virtualization.
- The Xen Project software overview gives us some really great basics about the project and how it works.
- A dom0 kernel is required for management of guests.
- Hardware support for virtualization in the x86 world didn’t really exist until around 2005 (Intel VT-x, followed by AMD-V); it is now commonplace.
(As a total aside, this piece from the Xen project is an interesting consideration of security in the open source world now that cloud computing is so prevalent.)
EC2, DigitalOcean, etc.
As we discussed extensively in the lecture on remote system management, services like EC2, DigitalOcean, Linode, and scores of others give anyone the ability to spin up servers ‘in the cloud’. Under the hood most of these rely on Xen or something similar. Because it’s common to run Linux or FreeBSD on these cloud servers, it’s easy to argue that more folks use and benefit from open source now than ever before. Even if your average developer is running Mac OS or Windows on their own machine, odds are that some open source software ends up running their code at some point. Likewise, the technologies in the next section make it easy for developers to work from a common baseline, a system in a known state, which simplifies reproducing bugs, producing repeatable builds, testing across environments, etc.
Containers
Related to virtualization, containers, as popularized by Docker, use other kernel facilities (on Linux, namespaces and cgroups) to “container-ize” a view of an operating system. Apps running in containers on the same host share one kernel, but each has its own isolated view of the system.
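One way to see the "same kernel, isolated view" idea for yourself is to print the namespace identifiers that Linux exposes under /proc/self/ns, along with the kernel release. This is only a sketch: it assumes a Linux host with /proc mounted, and the helper name `namespace_ids` is made up for this example. On other systems it simply returns an empty dict.

```python
import os
import platform


def namespace_ids(pid="self"):
    """Return {namespace_name: identifier} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return ids  # not Linux, /proc not mounted, or no such process
    for name in names:
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable by this user
    return ids


if __name__ == "__main__":
    print("kernel release:", platform.release())
    for name, ident in namespace_ids().items():
        print(f"{name:20} {ident}")
```

If you run this once on the host and once inside a container on that same host, you would expect the kernel release to match while most of the namespace identifiers differ. Docker on Mac OS or Windows runs containers inside a Linux virtual machine, so there the kernel release is that VM’s rather than the desktop’s.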
Virtual Machines
While not strictly in the same space as virtualization or emulation, I think it’s nice to look at so-called ‘virtual machines’ that are used to execute a lot of our code. Java probably did the most to popularize the idea, and its goals were portability and safety.
Java Virtual Machine and Java Bytecode
- Java Bytecode Example
- ASM (http://asm.ow2.org/) is a bytecode manipulation framework that allows instrumentation and other tricks. Consult the guide for lots more information.
- cglib is another tool for working with bytecode.
From cglib: The missing manual:
Hibernate uses cglib for example for its generation of dynamic proxies. Instead of returning the full object that you stored in a database, Hibernate will return you an instrumented version of your stored class that lazily loads some values from the database only when they are requested.
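The lazy-loading behavior described in that quote can be sketched in a few lines of Python. To be clear, this is not how cglib works: cglib generates a subclass at runtime by emitting Java bytecode, whereas this sketch leans on Python’s `__getattr__` hook to get a similar effect. The names `LazyProxy` and `load_user` are hypothetical, and the loader function stands in for a database query.

```python
from types import SimpleNamespace


class LazyProxy:
    """Stand-in for a stored object; loads the real one on first attribute access."""

    def __init__(self, loader):
        self._loader = loader
        self._target = None
        self._loaded = False

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for attributes that live
        # on the real object rather than on the proxy itself.
        if not self._loaded:
            self._target = self._loader()
            self._loaded = True
        return getattr(self._target, name)


queries = []  # records each simulated database round trip


def load_user():
    queries.append("SELECT ...")  # pretend this is an expensive query
    return SimpleNamespace(name="ada", email="ada@example.org")


if __name__ == "__main__":
    user = LazyProxy(load_user)
    print(len(queries))  # nothing loaded yet
    print(user.name)     # first access triggers the load
    print(len(queries))
```

The caller holds something that behaves like the stored object, but the expensive load is deferred until a value is actually requested, and it happens only once.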
Python Bytecode
Since we consider Python to be an “interpreted” language and Java to be a “compiled” language, it might be a little surprising to hear that Python has its own VM. Though Python’s bytecode is much less stable, less publicized, and much more dynamic than Java’s, we can disassemble Python code and look at the bytecode that’s generated. Consider this simple function:
def add_one(x):
    y = x + 1
    return y
And the resulting bytecode:
>>> import dis
>>> dis.dis(add_one)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (1)
              6 BINARY_ADD
              7 STORE_FAST               1 (y)

  3          10 LOAD_FAST                1 (y)
             13 RETURN_VALUE
>>>
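The listing above comes from an older interpreter, which is a nice illustration of how unstable Python bytecode is: newer CPython releases use different offsets and some different opcode names (for example, 3.11 and later print BINARY_OP where the listing has BINARY_ADD). If you would rather inspect the instructions from code than read the printed table, `dis.get_instructions` yields one object per instruction. The helper `opnames` below is just a convenience written for this example.

```python
import dis


def add_one(x):
    y = x + 1
    return y


def opnames(func):
    """Return the opcode names for a function, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # Exact output depends on the Python version you run this with.
    for ins in dis.get_instructions(add_one):
        print(ins.offset, ins.opname, ins.argrepr)
    print("locals:", add_one.__code__.co_varnames)
```

Running this under a couple of different Python versions and comparing the output is a quick way to see which parts of the bytecode are implementation details.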
- Python VM In Action
- Python Interpreter – 500 lines or less
- pytest and assertions gives a practical example of insane bytecode manipulation
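To get a feel for what the VM does with a disassembly like the one for add_one, here is a toy stack machine that executes the five instructions from that listing by hand. It is a sketch for intuition only: the real CPython interpreter works on compiled code objects with frames, and it refers to locals and constants by index rather than by name. The function name `execute` and the (opname, operand) tuple format are made up for this example.

```python
def execute(code, local_vars):
    """Run a list of (opname, operand) pairs on a tiny stack machine."""
    stack = []
    local_vars = dict(local_vars)  # copy so the caller's dict is untouched
    for opname, operand in code:
        if opname == "LOAD_FAST":      # push a local variable
            stack.append(local_vars[operand])
        elif opname == "LOAD_CONST":   # push a constant
            stack.append(operand)
        elif opname == "BINARY_ADD":   # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "STORE_FAST":   # pop into a local variable
            local_vars[operand] = stack.pop()
        elif opname == "RETURN_VALUE": # pop the result and stop
            return stack.pop()
        else:
            raise ValueError(f"unsupported instruction: {opname}")
    raise RuntimeError("ran off the end without RETURN_VALUE")


# The add_one disassembly, transcribed with names in place of indexes.
ADD_ONE = [
    ("LOAD_FAST", "x"),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("STORE_FAST", "y"),
    ("LOAD_FAST", "y"),
    ("RETURN_VALUE", None),
]

if __name__ == "__main__":
    print(execute(ADD_ONE, {"x": 41}))
```

The “500 lines or less” article linked above builds a far more complete interpreter along these lines, including frames, function calls, and control flow.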