CloudKeeper is a domain-specific language and runtime system for implementing and running dataflows on the Java Virtual Machine. Designed to facilitate “programming in the large”, CloudKeeper is entirely general-purpose and abstracts away concerns such as data transfer, serialization, scheduling, checkpointing, and package/dependency management.
The functional units in CloudKeeper dataflows are called modules, and they have in- and out-ports. Orchestrate domain logic by simply instantiating modules and creating connections between ports.
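The idea of modules wired together through ports can be illustrated with a minimal plain-Java sketch. It does not use the CloudKeeper API: `PortSketch`, `OutPort`, `BinaryModule`, and all method names are hypothetical and exist only for illustration. The sketch wires two modules so that they compute `(a + b) * c`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Illustrative only: none of these types belong to CloudKeeper.
final class PortSketch {
    /** An out-port that forwards each value to every connected in-port. */
    static final class OutPort<T> {
        private final List<Consumer<T>> targets = new ArrayList<>();

        void connectTo(Consumer<T> inPort) {
            targets.add(inPort);
        }

        void send(T value) {
            targets.forEach(target -> target.accept(value));
        }
    }

    /** A module with two in-ports and one out-port; it fires once both inputs have arrived. */
    static final class BinaryModule<A, B, R> {
        private final BiFunction<A, B, R> logic;
        private A left;
        private B right;
        private boolean hasLeft;
        private boolean hasRight;
        final OutPort<R> out = new OutPort<>();

        BinaryModule(BiFunction<A, B, R> logic) {
            this.logic = logic;
        }

        void inLeft(A value) {
            left = value;
            hasLeft = true;
            fireIfReady();
        }

        void inRight(B value) {
            right = value;
            hasRight = true;
            fireIfReady();
        }

        private void fireIfReady() {
            if (hasLeft && hasRight) {
                out.send(logic.apply(left, right));
            }
        }
    }

    /** Instantiates two modules, connects their ports, and feeds the inputs. */
    static int run(int a, int b, int c) {
        BinaryModule<Integer, Integer, Integer> sum = new BinaryModule<>(Integer::sum);
        BinaryModule<Integer, Integer, Integer> product = new BinaryModule<>((x, y) -> x * y);
        int[] result = new int[1];

        // Connections: sum.out -> product.inLeft, product.out -> result sink.
        sum.out.connectTo(product::inLeft);
        product.out.connectTo(value -> result[0] = value);

        sum.inLeft(a);
        sum.inRight(b);
        product.inRight(c);
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(run(2, 3, 4)); // prints 20
    }
}
```

In this toy model everything runs synchronously in the calling thread. A dataflow runtime would decide where and when each module executes.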
Debug dataflows in a single JVM on a laptop, then deploy them in the cloud without changing a single line of code. CloudKeeper abstracts away low-level details such as serialization, data movement, checkpointing, scheduling, and dependency/package management.
Embed CloudKeeper dataflows into other software-engineering projects. Write dataflows textually in the CloudKeeper internal domain-specific language, which inherits Java’s type system as well as its excellent IDE support.
Use CloudKeeper as an alternative to lower-level concurrency concepts such as threads, Java executor services, or actor systems. CloudKeeper is modular and versatile: intermediate results can be kept as in-memory Java objects, in the file system, or in a cloud-storage service. Similarly, individual tasks can be processed by anything from an existing thread pool to a distributed resource manager such as Grid Engine.
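For contrast, here is a sketch of what the lower-level approach looks like when a small computation, `(a + b) * c`, is wired by hand on a Java executor service. It uses only standard JDK classes. The class name `ExecutorContrast` and the method `compute` are made up for this illustration and have nothing to do with CloudKeeper.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain JDK concurrency, shown only as the lower-level baseline.
final class ExecutorContrast {
    /** Computes (a + b) * c, with both inputs produced as tasks on the given pool. */
    static int compute(ExecutorService pool, int a, int b, int c) {
        CompletableFuture<Integer> sum = CompletableFuture.supplyAsync(() -> a + b, pool);
        CompletableFuture<Integer> factor = CompletableFuture.supplyAsync(() -> c, pool);
        // The dependency between the steps is encoded by hand via thenCombine.
        return sum.thenCombine(factor, (s, f) -> s * f).join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(compute(pool, 2, 3, 4)); // prints 20
        } finally {
            pool.shutdown();
        }
    }
}
```

Here the code itself picks the executor and keeps all intermediate results in memory, so moving the work elsewhere means rewriting it.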