CloudKeeper is a domain-specific language and runtime system for implementing and running dataflows on the Java Virtual Machine. Designed to facilitate “programming in the large”, CloudKeeper is entirely general-purpose and abstracts away concerns such as data transfer, serialization, scheduling, checkpointing, and package/dependency management.
The functional units in CloudKeeper dataflows are called modules, and they have in- and out-ports. Orchestrate domain logic by simply instantiating modules and creating connections between ports.
Debug dataflows in a single JVM on a laptop, then deploy them in the cloud without changing a single line of code.
@CompositeModulePlugin("Analyzes DNA")
public abstract class GenomeAnalysisModule
        extends CompositeModule<GenomeAnalysisModule> {
    // Ports of the composite module itself
    public abstract InPort<FASTQ> dnaFragments();
    public abstract OutPort<PDF> report();

    // Child modules, each wired to the ports that feed it
    AlignModule alignModule = child(AlignModule.class)
        .dnaFragments().from(dnaFragments())
        .reference().from(value(Constants.REFERENCE_GENOME));
    StatsModule statsModule = child(StatsModule.class)
        .dnaFragments().from(dnaFragments());
    ReportModule reportModule = child(ReportModule.class)
        .mutations().from(alignModule.mutations())
        .stats().from(statsModule.stats());

    // Connect the final child's out-port to this module's out-port
    { report().from(reportModule.pdf()); }
}
Embed CloudKeeper dataflows into other software-engineering projects. Write dataflows textually in the CloudKeeper internal domain-specific language that inherits Java’s type system as well as its excellent IDE support.
Use CloudKeeper as an alternative to lower-level concurrency concepts such as threads, Java executor services, or actor systems. CloudKeeper is modular and versatile: keep intermediate results as in-memory Java objects, in the file system, or in a cloud-storage service. Similarly, individual tasks can run on anything from an existing thread pool to a distributed resource manager such as Grid Engine.
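For contrast, here is a minimal sketch of what the same align/stats/report shape looks like when wired by hand with the JDK's `CompletableFuture`. Everything in it is a hypothetical stand-in (the class `HandWiredDataflow`, the toy `align`, `stats`, and `report` steps, the sample fragments); none of it is CloudKeeper API. It only illustrates two points from the text above: the dependency wiring is explicit, and the execution strategy (same thread versus a thread pool) is swapped by passing a different `Executor`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hand-wired dataflow using only java.util.concurrent.
// All names are illustrative assumptions; this is NOT CloudKeeper API.
final class HandWiredDataflow {
    // Toy stand-in for an alignment step: counts fragments containing a 'T'.
    static int align(List<String> fragments) {
        return (int) fragments.stream().filter(f -> f.contains("T")).count();
    }

    // Toy stand-in for a statistics step: total number of bases.
    static int stats(List<String> fragments) {
        return fragments.stream().mapToInt(String::length).sum();
    }

    // Toy stand-in for the report step: combines both upstream results.
    static String report(int mutations, int totalBases) {
        return "mutations=" + mutations + ", bases=" + totalBases;
    }

    // Wires the three steps: align and stats are independent of each other,
    // and report runs once both have completed.
    static String run(List<String> fragments, Executor executor) {
        CompletableFuture<Integer> mutations =
            CompletableFuture.supplyAsync(() -> align(fragments), executor);
        CompletableFuture<Integer> bases =
            CompletableFuture.supplyAsync(() -> stats(fragments), executor);
        return mutations
            .thenCombineAsync(bases, HandWiredDataflow::report, executor)
            .join();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("ACGT", "GGCC", "TTAA");

        // Strategy 1: run every step on the calling thread (easy to debug).
        System.out.println(run(fragments, Runnable::run));

        // Strategy 2: same wiring, executed on an existing thread pool.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(run(fragments, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Both calls print `mutations=2, bases=12`. What the sketch leaves out is exactly what the text attributes to CloudKeeper: serialization, moving data between machines, checkpointing, and dependency management would all have to be added by hand here.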
<dependency>
    <groupId>xyz.cloudkeeper.core</groupId>
    <artifactId>cloudkeeper-api</artifactId>
</dependency>
<dependency>
    <groupId>xyz.cloudkeeper.core</groupId>
    <artifactId>cloudkeeper-model</artifactId>
</dependency>