Polycephaly finds new home in Xtreams-Grid
When I first proposed Polycephaly to Arden Thomas, the product manager, and wrangled Martin Kobetic in to help me, I was unsure whether we were really going to help our customers solve real-world problems with it. Polycephaly, meaning “Many Heads”, is a framework like Squeak’s Hydra. It lets you run multiple Smalltalk images simultaneously, with one image controlling the “drones”. Ironically, the drones run headless, so despite being called Many Heads it really should have been called Many Headlesses.
To our delight, customers started emailing in with reports of its usefulness in their development and production environments. We also got a lot of feedback. Polycephaly communicated between the master and each drone over pipes attached to stdin/stdout, so any customer code that happened to write to stdout or read from stdin would break it.
Customers found it very useful to spawn 3-5 extra images and give them work to better utilize their multi-core machines. But they also wanted to do the same over the network. And so, Polycephaly-NetworkVirtualMachine was published by Holger Kleinsorgen.
Another criticism was how VirtualMachines worked: specifically, it was not useful as a load balancer. So Runar Jordahl blogged about his solution, which round-robin’d “jobs” to different VirtualMachine instances.
Meanwhile, I was somewhat unimpressed that a drone could not stream its response back to the master, and that the master could not supply job objects as a stream either. This meant that as soon as a drone had completed its job, it would typically block while communicating with the master through BOSS.
So, the technical debt on Polycephaly finally reached a turning point when a review was written on the VWNC mailing list, and I resolved to fix the problems. As luck would have it, I had an engineering trip planned to Ottawa, where two of my colleagues live, and then to Victoria (Vancouver Island), where another colleague lives. These colleagues could help me, and face-to-face time is invaluable.
I present to you Xtreams-Grid, a VisualWorks and ObjectStudio solution for working with multiple images on your local machine and across a network. Here is the feature list, compared with Polycephaly:
- Remote drones and remote masters. Drones can be activated inetd style or be pre-configured. Drones can connect to a pre-determined master, or listen for one or more masters to connect.
- Sockets instead of Pipes, no more Stdin/Stdout breaking the system.
- Streaming results and arguments to and fro. The stream is now multiplexed too, with high throughput using the Xtreams-Multiplexer protocol.
- VirtualMachines now acts as a load balancer with a “job” style API and can easily be used with #promise.
- When starting a VirtualMachine locally, you can specify a different image to use and/or a different virtual machine executable.
- Xtreams-Grid has actual documentation, where Polycephaly was still a work-in-progress and had none.
- If you have the Xtreams parcels available (they will be updated in VisualWorks 7.9 at some point in the nearish future), an image no longer has to be saved; the Xtreams drone script will auto-load Xtreams-Grid and its prerequisites on start up.
It also has a much more sensible name. Everyone knows what grid computing is and typically looks up ‘grid’ when trying to find it. Polycephaly, while a unique and highly recognizable name, was always a bit of a tongue-in-cheek job. You can consider it a past-tense code name at this point.
The API of Xtreams-Grid is very similar to Polycephaly, though not 100% the same. There’s no #timeout:do: method any more, since you can use [] onTimeout: aDuration do: aBlock. VirtualMachines is a complete rewrite and the #do:environment: API is gone completely.
There are new tricks up its sleeve though. Consider, for a moment, this peculiar little example:
evens := machine do: [ :odds | odds collecting: [ :e | e + 1 ] ] with: (1 to: 9 by: 2) reading.
In this case, we are streaming in odd numbers to the drone and the drone will increment each element in the stream by one, then stream the results back out to the master. The master can leisurely read results, only taxing the drone as required.
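To get a feel for the laziness described here without starting a drone, the same transformation can be tried locally. This is only a sketch, assuming the core Xtreams parcel is loaded; it does not touch Xtreams-Grid at all, and the temporaries first and others are illustrative names, not part of any API. The input is the odd numbers 1 through 9.

```smalltalk
"Local sketch: no drone, no network, only core Xtreams streams."
| odds evens first others |
odds := (1 to: 9 by: 2) reading.            "read stream over 1 3 5 7 9"
evens := odds collecting: [ :e | e + 1 ].   "wraps odds; nothing has been read yet"
first := evens get.                         "pulls a single element through the block: 2"
others := evens rest.                       "drains what is left: 4 6 8 10"
```

The block handed to the drone in the grid example does the same wrapping; the difference is that there the source and the result travel over the connection between master and drone rather than staying in one image.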
The drone can also consume multiple streams as arguments from the master. It does not need to consume each argument stream in full, because under the hood is the Xtreams-Multiplexer protocol, which works similarly to the SSH2 multiplexing protocol.
There is a lot to explore here. There are tests which demonstrate some of these capabilities, as well as updated documentation at http://code.google.com/p/xtreams. Or, more specifically: http://code.google.com/p/xtreams/wiki/Grid
The latest version of XtreamsDevelopment as of this writing is 526.