If you've ever wanted to build your own supercomputer but have been held back by the demands of parallel programming in C, Pseudo Remote Threads is for you. This prize-winning Java programming model greatly simplifies parallel programming on clusters, bringing supercomputing out of the laboratory and into the hands of everyday Java programmers.
Over the past three years, parallel clustering has begun to change the face of supercomputing. Where once the monolithic multimillion-dollar machine ruled, the parallel cluster is fast becoming the supercomputer of choice. Predictably, enthusiasm in open-source circles has spawned hundreds -- if not thousands -- of parallel clustering projects. The first and best-known open-source clustering system is Beowulf. Launched in 1994 by Thomas Sterling and Donald Becker under the auspices of NASA, Beowulf started out as a 16-node demonstration cluster. Today, there are hundreds of implementations of Beowulf, ranging from the Oak Ridge National Laboratory's Stone SouperComputer to Aspen Systems Inc.'s custom-built commercial clusters (see Resources).
Unfortunately for Java programmers, most clustering systems are built around C-based software messaging APIs such as Message Passing Interface (MPI) or Parallel Virtual Machine (PVM). Parallel programming in C is no easy task, so I have devised a workaround. In this article, I will show you how to use a combination of Java threads and Java Remote Method Invocation (RMI) to create your own Java-based supercomputer.
Note that this article assumes you have a working knowledge of Java threads and RMI.
What's in a supercomputer?
For the purposes of this article, a supercomputer is defined as a cluster of eight or more nodes working together as a single high-performance machine. The Java-based supercomputer consists of a job dispatcher and any number of run servers, also known as hosts. The job dispatcher spawns multiple threads, each containing the code for a different subtask. Each thread migrates its code to a different run server. Each run server then executes the migrated code and returns the results to the job dispatcher. Finally, the job dispatcher combines the results obtained from the various threads.
This parallel clustering system is called Pseudo Remote Threads because the threads are scheduled on the job dispatcher but the code within the threads is executed on a remote machine.
What are the system's components?
The term components refers to the logical modules that make up the parallel clustering system known as Pseudo Remote Threads. The system consists of the following components:
- Job dispatcher is the machine that runs the show. It spawns various threads, each containing a subtask of the main task to be executed by the cluster. The code within each thread is sent to a remote machine to be executed. Threads are scheduled on the job dispatcher, so ideally this machine should not be used to execute any subtasks.
- SubTask is a user-defined class that defines a data- or functionally-independent portion of the main task. You can define different classes for different subportions of the main task. The class name SubTask is an example. You may give any name to a SubTask class, though it should represent the assigned subtask. When you define the SubTask class, you must implement the JobCodeInt interface and the method jobCode(), described next.
- JobCodeInt is a Java interface. You must implement this interface and the method jobCode() in the class that defines your subtask. The method jobCode() describes the code that is to be executed remotely. If you want to use a local resource remotely, you have to initialize the resource outside of jobCode(). For example, if you wanted to send a group of images for remote processing, you would have to initialize the Image objects outside of jobCode(). You may make calls to standard Java library classes within this method because these libraries are present on the remote machine.
- RunServer is a Java object that allows remote procedure calls on its methods. It has a method that takes as its parameter an object that has implemented the JobCodeInt interface. RunServer then executes the code within this object on the machine on which it is running (the run server machine) and returns the result of the computation as an instance of class Object. Object is the top-most class in the Java class hierarchy.
- PseudoRemThr is a Java class that wraps a thread and accepts an instance of a given SubTask. It then selects a remote host and sends the SubTask instance there for execution. You have the option to specify a host if you want to make use of particular resources (such as a database or printer) available on that host.
- HostSelector is a module. If you do not specify a remote host, then the PseudoRemThr class calls the HostSelector module to select a particular host. If no host is completely free, the HostSelector may return the least-loaded remote machine. If a remote machine is a multiprocessor system, the HostSelector may return the name of this host more than once. Currently, the HostSelector cannot select the host based on the complexity of a given task.
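The contracts between these components can be sketched roughly as follows. This is a minimal illustration, not the actual Pseudo Remote Threads source: only the names JobCodeInt and jobCode() come from the article. Making JobCodeInt extend Serializable (so that RMI can copy a subtask to the run server), the interface name RunServerInt, the method name execute(), and the classes SimpleRunServer and ContractsDemo are all assumptions made for the example.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

// Assumed shape of the subtask contract. Extending Serializable is an
// assumption: RMI passes non-remote arguments by serialized copy.
interface JobCodeInt extends Serializable {
    Object jobCode();
}

// Hypothetical remote interface for a run server; the name execute()
// does not appear in the article.
interface RunServerInt extends Remote {
    Object execute(JobCodeInt job) throws RemoteException;
}

// Simplest possible run-server logic: run the received code locally
// and hand back whatever it returns.
class SimpleRunServer implements RunServerInt {
    @Override
    public Object execute(JobCodeInt job) {
        return job.jobCode();
    }
}

class ContractsDemo {
    // Calls the run-server logic directly, without any network hop,
    // to show how the two interfaces fit together.
    static Object runLocally(JobCodeInt job) {
        return new SimpleRunServer().execute(job);
    }

    public static void main(String[] args) {
        JobCodeInt job = () -> Integer.valueOf(6 * 7);
        System.out.println(runLocally(job));
    }
}
```

Returning Object keeps the run server independent of any particular subtask, at the cost of a cast on the dispatcher side.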
How Pseudo Remote Threads works
To use Pseudo Remote Threads, you need to implement both the job dispatcher and the run servers. This section describes how to set up each part.
Implementing the job dispatcher
First, break up the main task into data- or functionally-independent subtasks. For each subtask, define a class that implements the interface JobCodeInt and, hence, the method jobCode(). Within jobCode(), define the code that the subtask is to execute.
Note that within jobCode() you must not make calls to user-defined resources that are local to the job dispatcher. Initialize all such resources outside of this method, for example in the constructor of the SubTask class.
Next, create one instance of the class PseudoRemThr for each subtask, passing a SubTask instance to each. If you want to explicitly specify a remote host, you may do so by calling a different constructor of the PseudoRemThr class.
Finally, wait for the threads to complete, then get the result from each instance of PseudoRemThr by calling the method getResult(). This method returns a Boolean object with the value false if the computation is not complete; otherwise, it returns an instance of class Object that contains the result. You must cast this object to the result type you are expecting. Combine all subtask results into the final result.
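To make the wait-then-collect pattern concrete, here is a small sketch that mimics it with a purely local stand-in. LocalPseudoThread is a hypothetical class invented for this example; it runs the work on a local thread instead of sending it to a run server, and it is not the real PseudoRemThr. Only the method names waitForResult() and getResult(), and the convention of returning a Boolean false until the work is done, are taken from the article.

```java
import java.util.function.Supplier;

// Hypothetical local stand-in for PseudoRemThr: it runs the work on a
// local thread rather than shipping it to a run server over RMI.
class LocalPseudoThread {
    // Boolean.FALSE until the work completes, mirroring getResult().
    private volatile Object result = Boolean.FALSE;
    private final Thread thread;

    LocalPseudoThread(Supplier<Object> work) {
        thread = new Thread(() -> result = work.get());
        thread.start(); // assumption: the thread starts on construction
    }

    void waitForResult() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt
        }
    }

    Object getResult() {
        return result;
    }
}

class DispatchDemo {
    static double sumOfRoots(long start, long end) {
        double sum = 0;
        for (long i = start; i <= end; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    // Splits 1..n into two independent subtasks and combines the results.
    static double parallelSum(long n) {
        long cut = n * 3 / 10;
        LocalPseudoThread t1 = new LocalPseudoThread(() -> sumOfRoots(1, cut));
        LocalPseudoThread t2 = new LocalPseudoThread(() -> sumOfRoots(cut + 1, n));
        t1.waitForResult();
        t2.waitForResult();
        // getResult() returns Object, so cast to the expected type.
        double part1 = (Double) t1.getResult();
        double part2 = (Double) t2.getResult();
        return part1 + part2;
    }

    public static void main(String[] args) {
        System.out.println("Sum: " + parallelSum(1_000_000));
    }
}
```

One consequence of signalling "not finished" with a Boolean false is that a subtask whose genuine result is Boolean false cannot be told apart from an unfinished one, so it is safest to call waitForResult() before getResult().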
Implementing a run server
Implementing the run server is a simple task:
- Start the RMI registry.
- Start the RunServer.
The run server contacts the job dispatcher when it starts up and informs the job dispatcher that it is ready to accept tasks for execution.
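The two startup steps can also be performed from Java code. The sketch below is an assumption-laden illustration rather than the actual RunServer: it creates the registry in-process with LocateRegistry.createRegistry() instead of using the standalone rmiregistry tool, and the names RunServerInt, execute(), RunServerImpl, SquareJob, and the binding name "RunServer" are hypothetical. Port 3333 is borrowed from the host URLs used in the example listing. The sketch does not implement the step in which the run server announces itself to the job dispatcher.

```java
import java.io.Serializable;
import java.rmi.NoSuchObjectException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Assumed contracts; only JobCodeInt and jobCode() are named in the article.
interface JobCodeInt extends Serializable {
    Object jobCode();
}

interface RunServerInt extends Remote {
    Object execute(JobCodeInt job) throws RemoteException;
}

class RunServerImpl implements RunServerInt {
    @Override
    public Object execute(JobCodeInt job) {
        return job.jobCode(); // runs on the run server's JVM
    }
}

// Hypothetical subtask used only to exercise the server.
class SquareJob implements JobCodeInt {
    private final int n;
    SquareJob(int n) { this.n = n; }
    @Override
    public Object jobCode() { return Integer.valueOf(n * n); }
}

class RunServerDemo {
    // Starts a registry and a run server, sends one job through RMI,
    // then shuts both down so the JVM can exit.
    static Object startCallAndStop(int port, JobCodeInt job) {
        // Keep the demo on the loopback interface.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        RunServerImpl server = new RunServerImpl();
        Registry registry = null;
        try {
            registry = LocateRegistry.createRegistry(port);          // step 1
            RunServerInt stub =
                (RunServerInt) UnicastRemoteObject.exportObject(server, 0);
            registry.rebind("RunServer", stub);                      // step 2
            // What a dispatcher would do: look the server up and call it.
            RunServerInt remote = (RunServerInt)
                LocateRegistry.getRegistry("127.0.0.1", port).lookup("RunServer");
            return remote.execute(job);
        } catch (RemoteException | NotBoundException e) {
            throw new IllegalStateException("RMI demo failed", e);
        } finally {
            shutdown(server);
            shutdown(registry);
        }
    }

    private static void shutdown(Remote obj) {
        try {
            if (obj != null) UnicastRemoteObject.unexportObject(obj, true);
        } catch (NoSuchObjectException ignored) {
            // Object was never exported; nothing to clean up.
        }
    }

    public static void main(String[] args) {
        System.out.println(startCallAndStop(3333, new SquareJob(12)));
    }
}
```

A real run server would stay exported and keep serving requests; the demo unexports everything only so that the program terminates.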
An example computation
Time to put the model to the test. The following example computation was run in parallel using two machines. One was a 333 MHz Pentium II running Windows 98 and the other was a 500 MHz Pentium III running Windows 2000, Professional Edition.
To calculate the sum of the square roots of the numbers from 1 to 10^9 (10 raised to 9), I created the class Sqrt, which calculates the sum of roots between the values dblStart and dblEnd.
Sqrt implements the JobCodeInt interface and, hence, the jobCode() method. Within jobCode(), I defined the code to perform the calculation.
The constructor is used to pass data to the class and initialize any resources that are local to the job dispatcher. The range of numbers for which the sum of square roots is to be calculated must be sent to the constructor. Class Sqrt is defined in Listing 1.
//The class Sqrt calculates the sum of roots
//between the values dblStart and dblEnd.
//It implements the JobCodeInt interface, and the calculation
//is done within the jobCode() method.
//Use the constructor to pass data to the class and to
//initialize resources that are local to the Job
//Dispatcher machine. In this example, the range of numbers
//for which the sum of square roots is to be calculated is
//sent to the class.
public class Sqrt implements JobCodeInt
{
    double dblStart, dblEnd, dblPartialSum;

    public Sqrt(double Start, double End)
    {
        dblStart = Start;
        dblEnd = End;
    }

    public Object jobCode()
    {
        dblPartialSum = 0;
        for(double i=dblStart;i<=dblEnd;i++)
            //can make calls to standard Java functions and objects.
            dblPartialSum += Math.sqrt(i);
        //return the result as an object of a standard Java class.
        return Double.valueOf(dblPartialSum);
    }
}
The JobDispatcher class creates two instances of the Sqrt class. It then divides the main task by giving a subtask to one Sqrt object (Sqrt1) and the rest of it to the other Sqrt object (Sqrt2). Next, JobDispatcher creates two objects of the PseudoRemThr class and passes one Sqrt object as a parameter to each of them. It then waits for the threads to execute.
After the threads finish executing, the partial result is obtained from each PseudoRemThr instance. These partial results are combined for the final result, as shown in Listing 2.
//this class can have any name of your choosing
//the name JobDispatcher has been chosen merely for convenience
public class JobDispatcher
{
    public static void main(String args[])
    {
        double fin = 1000000000; //represents 10 raised to 9
        double finByTen = fin/10; //represents 10 raised to 8
        long nlStartTime = System.currentTimeMillis();
        //range is from 1 to 3*10^8
        Sqrt sqrt1 = new Sqrt(1,finByTen*3);
        //range is from ((3*10^8)+1) to 10^9
        Sqrt sqrt2 = new Sqrt((finByTen*3)+1,fin);
        //The following creates two instances of PseudoRemThr class.
        //The parameters to this constructor are as follows.
        //First parameter: An instance of a class representing a subtask
        //Second parameter: Remote Host on which this subtask
        //will be executed
        //Third parameter: A descriptive name for this
        //PseudoRemThr instance.
        PseudoRemThr psr1 = new
            PseudoRemThr(sqrt1,"//192.168.1.1:3333/","Win98");
        PseudoRemThr psr2 = new
            PseudoRemThr(sqrt2,"//192.168.1.2:3333/","Win2K");
        psr1.waitForResult(); //wait for execution to finish
        psr2.waitForResult();
        //get the result from each thread
        Double res1 = (Double)psr1.getResult();
        Double res2 = (Double)psr2.getResult();
        double finalRes = res1.doubleValue() + res2.doubleValue();
        long nlEndTime = System.currentTimeMillis();
        System.out.println("Total time taken: " + (nlEndTime-nlStartTime));
        System.out.println("Sum: " + finalRes);
    }
}
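The listing above hard-codes a 30/70 split between two hosts. As a rough sketch of how the same idea might extend to any number of run servers, the helper below divides the range 1..n into contiguous pieces in proportion to a weight per host. The class RangeSplitter and its weights parameter are hypothetical and are not part of Pseudo Remote Threads; how to choose good weights (for example, from relative machine speed) is left open.

```java
class RangeSplitter {
    // Returns inclusive {start, end} pairs that together cover 1..n,
    // with each piece sized in proportion to the matching weight.
    static long[][] split(long n, int[] weights) {
        long totalWeight = 0;
        for (int w : weights) {
            totalWeight += w;
        }
        long[][] ranges = new long[weights.length][2];
        long start = 1;
        long cumulative = 0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            // Integer division; the last piece always ends exactly at n.
            long end = n * cumulative / totalWeight;
            ranges[i][0] = start;
            ranges[i][1] = end;
            start = end + 1;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Weights 3 and 7 reproduce the split used in the listing.
        for (long[] r : split(1_000_000_000L, new int[] {3, 7})) {
            System.out.println(r[0] + " to " + r[1]);
        }
    }
}
```

Each resulting pair could then be passed to its own Sqrt instance, one per run server.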
Notes on performance
Total execution time for this computation was in the range 120,000 to 128,000 milliseconds. When the same task was run locally without breaking up the tasks, the time varied from 183,241 to 237,641 milliseconds.
Initially, the main task consisted of calculating the sum of square roots from 1 to 10 raised to 7. To test performance, I increased the calculation to 10 raised to 8, and finally to 10 raised to 9.
As the task size increased, so did the difference in execution time between remote parallel execution and local execution. In other words, the advantage of remote parallel execution grew with the size of the task. Remote parallel execution was not favorable for smaller tasks because the overhead for communication between the machines was significant. As I increased the task size, that communication overhead became small relative to the time spent on the computation itself. Thus, I conclude that this system is best suited to tasks requiring heavy computation.
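The timings reported above can be turned into a speedup figure, defined here as local time divided by parallel time. The short sketch below works through that arithmetic; the class name Speedup is hypothetical. Pairing the fastest local run with the slowest parallel run gives a conservative bound, and the opposite pairing gives an optimistic one, which puts the speedup on two machines at roughly 1.4 to 2.0.

```java
class Speedup {
    // Speedup = time for the local run / time for the parallel run.
    static double ratio(double localMillis, double parallelMillis) {
        return localMillis / parallelMillis;
    }

    public static void main(String[] args) {
        // Fastest local run against slowest parallel run (conservative).
        double low = ratio(183_241, 128_000);
        // Slowest local run against fastest parallel run (optimistic).
        double high = ratio(237_641, 120_000);
        System.out.printf("Speedup between %.2f and %.2f%n", low, high);
    }
}
```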
What are the advantages of using Pseudo Remote Threads?
Because Pseudo Remote Threads is a Java-based system, it can be used to implement clusters made up of more than one operating system, or heterogeneous clusters. Using Pseudo Remote Threads, you avoid the difficulties of converting C/C++ legacy code, and instead take advantage of the Java standard library and its various extensions. In addition, Pseudo Remote Threads frees you from memory management. The flip side of this is, of course, that the performance of the system is directly tied to the performance of the JRE.
Where to go from here
With so many commercial applications being built using the Java platform, and given the difficulty of converting legacy C/C++ code to take advantage of parallelism, now may be the time to bring Java-based supercomputing into the commercial arena. A good start would be to begin building Java-based applications with parallelism and load balancing in mind.
The Internet is a good example of a heterogeneous cluster, so it follows that Pseudo Remote Threads could be employed over the Internet, converting the Web into a single, Java-based supercomputer (see Resources for more on this concept). For practical purposes, however, it should be noted that you will achieve the best results within a homogeneous cluster dedicated to performing a single task.
Finally, for everyday purposes, Pseudo Remote Threads makes it fairly simple to convert a LAN -- such as those in universities and homes -- into a mini-supercomputer. This is the usage pioneered by the Beowulf system. With Pseudo Remote Threads, it is now available to Java programmers, too.
