PCJ is a library for the Java language that supports parallel and distributed computation. The current version runs on multicore systems connected by a typical interconnect such as Ethernet or InfiniBand, and gives users a uniform view across nodes.
Download PCJ library (jar file of 24.02.2019 ver. 5.0.8) Latest (improved IDE support)!
Download PCJ manual (pdf) for PCJ 5 New!
The PCJ library is available free of charge under the BSD license. It requires Java 8 and no additional tools or compilers. The PCJ library for Java 7 is available in the download section.
The source code is available at GitHub: https://github.com/hpdcj/pcj
Version 5 introduces the asyncPut() and asyncGet() methods; the put() and get() methods are now synchronous. Shared variables are also handled in a new way, so code developed for PCJ 4 has to be modified. For details, please refer to the JavaDoc.
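The difference between the synchronous and asynchronous calls can be illustrated with standard Java futures. The sketch below does not use PCJ: the class `RemoteStoreSketch` and its method signatures are hypothetical stand-ins that only model the blocking and non-blocking behaviour described above. The actual PCJ signatures are given in the JavaDoc.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for another task's storage; this is NOT the PCJ API.
final class RemoteStoreSketch {
    private final Map<String, Object> values = new ConcurrentHashMap<>();

    // Asynchronous write: returns at once; the future completes when the value is stored.
    CompletableFuture<Void> asyncPut(String name, Object value) {
        return CompletableFuture.runAsync(() -> values.put(name, value));
    }

    // Asynchronous read: returns at once; the future later holds the value.
    CompletableFuture<Object> asyncGet(String name) {
        return CompletableFuture.supplyAsync(() -> values.get(name));
    }

    // Synchronous variants: block the caller until the operation has finished.
    void put(String name, Object value) {
        asyncPut(name, value).join();
    }

    Object get(String name) {
        return asyncGet(name).join();
    }

    public static void main(String[] args) {
        RemoteStoreSketch store = new RemoteStoreSketch();
        store.put("x", 42);                                   // blocks until stored
        CompletableFuture<Object> pending = store.asyncGet("x");
        // Other work could overlap with the pending read here.
        System.out.println(pending.join());                   // prints 42
        System.out.println(store.get("x"));                   // prints 42
    }
}
```

The asynchronous form is useful when computation can overlap with communication; the synchronous form is simpler when the value is needed immediately.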
Use of the library should be acknowledged by a reference to the PCJ web site and/or to the following papers:
- M. Nowicki, Ł. Górski, P. Bała PCJ – Java Library for Highly Scalable HPC and Big Data Processing 2018 International Conference on High Performance Computing & Simulation (HPCS), pp. 12-20 IEEE, 2018
- M. Nowicki, M. Ryczkowska, Ł. Górski, M. Szynkiewicz, P. Bała PCJ - a Java library for heterogenous parallel computing In: X. Zhuang (Ed.) Recent Advances in Information Science (Recent Advances in Computer Engineering Series vol 36) WSEAS Press 2016 pp. 66-72
- M. Nowicki, Ł. Górski, P. Grabarczyk, P. Bała PCJ - Java library for high performance computing in PGAS model In: W. W. Smari and V. Zeljkovic (Eds.) 2012 International Conference on High Performance Computing and Simulation (HPCS) IEEE 2014 pp. 202-209
- M. Nowicki, P. Bała PCJ-new approach for parallel computations in java In: P. Manninen, P. Oster (Eds.) Applied Parallel and Scientific Computing, LNCS 7782, Springer, Heidelberg (2013) pp. 115-125
- M. Nowicki, P. Bała Parallel computations in Java with PCJ library In: W. W. Smari and V. Zeljkovic (Eds.) 2012 International Conference on High Performance Computing and Simulation (HPCS) IEEE 2012 pp. 381-387
The full list of papers can be found here: http://pcj.icm.edu.pl/pcj-papers
Contact: bala@icm.edu.pl faramir@icm.edu.pl
The PCJ library is built on the following principles:
- Tasks (PCJ threads)
  - Each task executes its own set of instructions. Variables and instructions are private to the task. PCJ offers methods to synchronize tasks.
- Local variables
  - Variables are accessed locally within each task and are stored in its local memory.
- Shared variables
  - A dedicated class called Storage represents shared memory. Each task can access variables of other tasks that are stored in shared memory. A shareable variable must carry the special annotation @Shared.
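The three principles above can be modelled with plain Java threads. The following is a minimal sketch that does not use PCJ: `TaskModelSketch` is a hypothetical name, an `AtomicIntegerArray` stands in for one shared variable per task, and a `CyclicBarrier` stands in for task synchronization. It does not show the PCJ Storage class or the @Shared annotation.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical model of tasks, local variables and shared variables; NOT the PCJ API.
final class TaskModelSketch {

    // Runs taskCount tasks and returns the sum that task 0 reads from all shared slots.
    static int run(int taskCount) {
        AtomicIntegerArray shared = new AtomicIntegerArray(taskCount); // one shared slot per task
        CyclicBarrier barrier = new CyclicBarrier(taskCount);
        int[] result = new int[1];

        Thread[] tasks = new Thread[taskCount];
        for (int id = 0; id < taskCount; id++) {
            final int myId = id;
            tasks[id] = new Thread(() -> {
                int local = (myId + 1) * 10;  // local variable: private to this task
                shared.set(myId, local);      // publish it in this task's shared slot
                await(barrier);               // synchronize: all slots are now written
                if (myId == 0) {              // task 0 reads the other tasks' shared variables
                    int sum = 0;
                    for (int other = 0; other < taskCount; other++) {
                        sum += shared.get(other);
                    }
                    result[0] = sum;
                }
            });
            tasks[id].start();
        }
        for (Thread t : tasks) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return result[0];
    }

    private static void await(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4)); // 10 + 20 + 30 + 40, prints 100
    }
}
```

The barrier is what makes the read in task 0 safe: without it, task 0 could read a slot before its owner has written it.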
There is a distinction between nodes and tasks (PCJ threads). One JVM instance is understood as a node; in principle, it can run on a single multicore machine. One node can hold many tasks (PCJ threads), which are separate threads that run the calculations. This design matches current computer architectures, which contain hundreds or thousands of nodes, each built of several or more cores. It requires different communication mechanisms for inter- and intranode communication.
PCJ has one node called the Manager. It is responsible for assigning unique identifiers to the tasks, sending messages to other tasks to start calculations, creating groups, and synchronizing all tasks in the calculations. In contrast to the previous version of the PCJ library, the Manager node has its own tasks and can execute parallel programs.
Execution in multinode multicore environment
An application using the PCJ library is run as a typical Java application on the Java Virtual Machine (JVM). In a multinode environment, one (or more) JVM has to be started on each node. The PCJ library takes care of this process and allows the user to start execution on multiple nodes, running multiple threads on each node. The number of nodes and threads can be easily configured; however, the most reasonable choice is to limit the number of threads on each node to the number of available cores. Typically, a single JVM is run on each physical node, although PCJ also allows multiple JVMs per node.
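The rule of one thread per available core can be sketched with the standard library alone. The example below is an illustration, not PCJ code: `NodeListSketch` and `describe` are hypothetical names, the host names are placeholders, and the "one entry per thread" layout of the resulting list is an assumption made for this sketch rather than a documented PCJ format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that plans how many threads to run on each host; NOT the PCJ API.
final class NodeListSketch {

    // Returns one entry per thread: each host name repeated threadsPerNode times.
    static List<String> describe(List<String> hosts, int threadsPerNode) {
        if (threadsPerNode < 1) {
            throw new IllegalArgumentException("threadsPerNode must be at least 1");
        }
        List<String> entries = new ArrayList<>();
        for (String host : hosts) {
            for (int i = 0; i < threadsPerNode; i++) {
                entries.add(host);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Number of cores visible to this JVM; assumes all hosts have the same core count.
        int cores = Runtime.getRuntime().availableProcessors();
        List<String> entries = describe(Arrays.asList("node0", "node1"), cores);
        entries.forEach(System.out::println);
    }
}
```

Running more threads than cores is possible, but the threads then compete for the same cores, which is why the core count is the usual upper limit.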
Since a PCJ application does not run within a single JVM, communication between threads has to be realized in different ways. If the communicating threads run within the same JVM, Java concurrency mechanisms can be used to synchronize and exchange information. If data has to be exchanged between different JVMs, network communication, for example using sockets, has to be used.
The PCJ library handles both situations and hides the details from the user. It distinguishes between inter- and intranode communication and picks the proper data exchange mechanism. Moreover, nodes are organized in a graph, which makes it possible to optimize global communication.
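The choice between the two mechanisms, and the idea of a node graph, can be sketched as follows. This is an illustration only and not PCJ's internal code: `RoutingSketch`, `choose` and `children` are hypothetical names, and the binary tree is an assumed topology, since the text above only states that nodes form a graph.

```java
import java.util.Arrays;

// Hypothetical routing model; NOT PCJ's internal implementation.
final class RoutingSketch {

    enum Transport { IN_MEMORY, NETWORK }

    // nodeOfTask[t] is the index of the node (JVM) that hosts task t.
    static Transport choose(int[] nodeOfTask, int from, int to) {
        return nodeOfTask[from] == nodeOfTask[to] ? Transport.IN_MEMORY : Transport.NETWORK;
    }

    // Children of a node in an assumed binary tree over nodes 0..nodeCount-1.
    // A global message sent down such a tree reaches all nodes in O(log nodeCount) steps.
    static int[] children(int node, int nodeCount) {
        int left = 2 * node + 1;
        int right = 2 * node + 2;
        if (left >= nodeCount) {
            return new int[0];
        }
        if (right >= nodeCount) {
            return new int[] {left};
        }
        return new int[] {left, right};
    }

    public static void main(String[] args) {
        int[] nodeOfTask = {0, 0, 1, 1};                        // tasks 0,1 on node 0; tasks 2,3 on node 1
        System.out.println(choose(nodeOfTask, 0, 1));           // prints IN_MEMORY
        System.out.println(choose(nodeOfTask, 1, 2));           // prints NETWORK
        System.out.println(Arrays.toString(children(0, 4)));    // prints [1, 2]
        System.out.println(Arrays.toString(children(1, 4)));    // prints [3]
    }
}
```

Because the decision depends only on where the two tasks are placed, user code can stay the same whether the tasks share a JVM or not.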