Swarm - Scala distributed processing
Swarm is a Google Code project for distributing processing over a grid, built on Scala. It uses Scala's portable delimited continuations to let code migrate efficiently across nodes as it accesses the data. This is not a new idea: it was first touted in the AI community as mobile agents.
This seems at odds with the current map-reduce style of distribution, as popularised by Hadoop and the excellently elegant GridGain. With these, the focus is on problem decomposition: map your problem down into independent computational blocks, then transport them to the data or access the data at low cost. The aim is to speed up processing by distributing the computation over the grid.
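
To make the decomposition idea concrete, here is a minimal sketch in plain Scala. It does not use the Hadoop or GridGain APIs; the `partitions` layout and the names `mapPartition` and `merge` are assumptions made up for illustration, with each partition standing in for the data held on one grid node.

```scala
// Hypothetical sketch: word counting split into independent blocks.
object MapReduceSketch {
  // Each inner Seq stands in for the data held on one grid node (assumed layout).
  val partitions: Seq[Seq[String]] = Seq(
    Seq("swarm", "scala", "grid"),
    Seq("scala", "grid"),
    Seq("scala")
  )

  // Map step: depends only on its own partition, so it could run next to that data.
  def mapPartition(words: Seq[String]): Map[String, Int] =
    words.groupBy(identity).map { case (w, ws) => w -> ws.size }

  // Reduce step: merge two small per-partition results into one.
  def merge(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (w, n)) => acc.updated(w, acc.getOrElse(w, 0) + n) }

  // Only the compact per-partition counts are combined, never the raw words.
  def wordCounts: Map[String, Int] =
    partitions.map(mapPartition).foldLeft(Map.empty[String, Int])(merge)
}
```

The point of the shape is that `mapPartition` never looks outside its own block, which is what lets a grid framework schedule each block independently.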
Swarm, while an interesting use of Scala technology, seems only really applicable to problems where the data cannot migrate or the relative cost of transporting it is too high. In such problems we simply want to execute some kind of operation on a local system or data set. It also seems to be targeted at computation that is effectively traversing a set of heterogeneous data nodes. If the nodes were identical, or there were only one node, we could just use GridGain today to achieve the same thing.
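
The kind of traversal described above can be simulated in a few lines of plain Scala. This is not Swarm's API, and it does not use the continuations plugin: the "rest of the computation" is written by hand as a function, where Swarm would capture it automatically. `Node`, `Step`, `MoveTo`, `run`, and the sample grid are all hypothetical names invented for this sketch.

```scala
object MigrationSketch {
  // A hypothetical node: a name plus locally held data that never leaves it.
  final case class Node(name: String, data: Map[String, Int])

  // One step of a computation: either finished, or it must continue on another node.
  sealed trait Step[A]
  final case class Done[A](value: A) extends Step[A]
  final case class MoveTo[A](node: String, resume: Node => Step[A]) extends Step[A]

  // Stand-in for the runtime: "ships" the remaining computation to the named node
  // and records each hop, so the data stays put and the code travels.
  def run[A](step: Step[A], grid: Map[String, Node], hops: List[String] = Nil): (A, List[String]) =
    step match {
      case Done(v)           => (v, hops.reverse)
      case MoveTo(n, resume) => run(resume(grid(n)), grid, n :: hops)
    }

  // Two heterogeneous nodes holding different kinds of data (assumed example).
  val grid: Map[String, Node] = Map(
    "orders"    -> Node("orders", Map("order-7" -> 42)), // order id -> customer id
    "customers" -> Node("customers", Map("42" -> 3))     // customer id -> region id
  )

  // Look up an order's customer on one node, then that customer's region on another.
  val regionOfOrder: Step[Int] =
    MoveTo[Int]("orders", orders => {
      val customerId = orders.data("order-7")
      MoveTo[Int]("customers", customers => Done(customers.data(customerId.toString)))
    })
}
```

Calling `MigrationSketch.run(MigrationSketch.regionOfOrder, MigrationSketch.grid)` walks the computation through both nodes in turn; only the small intermediate value (`customerId`) moves between them.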

Labels: distributed, scala

