API Overview

The main entrypoint to ADAM is the ADAMContext, which allows genomic data to be loaded into Spark as GenomicRDDs. GenomicRDDs can be transformed using ADAM's built-in pre-processing algorithms, Spark's RDD primitives, the region join primitive, and ADAM's pipe APIs. GenomicRDDs can also be interacted with as Spark SQL tables.
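As a rough sketch of this flow, the snippet below loads alignment data through the ADAMContext, applies a built-in pre-processing algorithm, and then transforms the underlying RDD with a Spark primitive. The file paths and the mapping-quality threshold are placeholders, and method names reflect the ADAM Scala API described in this documentation:

```scala
import org.apache.spark.{ SparkConf, SparkContext }
// Importing the ADAMContext companion object adds the ADAM load
// methods (loadAlignments, loadVariants, etc.) to SparkContext.
import org.bdgenomics.adam.rdd.ADAMContext._

val sc = new SparkContext(new SparkConf().setAppName("adam-example"))

// Load a BAM file as a GenomicRDD of alignment records.
val reads = sc.loadAlignments("sample.bam")

// Run a built-in pre-processing algorithm, filter with a plain
// Spark RDD primitive via transform, and save back out as Parquet.
reads.markDuplicates()
  .transform(rdd => rdd.filter(_.getMapq > 30))
  .saveAsParquet("sample.marked.adam")
```

The transform method takes a function from RDD to RDD, which is how arbitrary Spark RDD primitives are interleaved with ADAM's own algorithms.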

In addition to the Scala/Java API, ADAM can be used from Python and R.

Adding dependencies on ADAM libraries

ADAM libraries are available from Maven Central under the groupId org.bdgenomics.adam, such as the adam-core library:

<dependency>
  <groupId>org.bdgenomics.adam</groupId>
  <artifactId>adam-core${binary.version}</artifactId>
  <version>${adam.version}</version>
</dependency>

Scala applications should depend on adam-core, while Java applications should depend on adam-apis in addition to adam-core:

<dependency>
  <groupId>org.bdgenomics.adam</groupId>
  <artifactId>adam-apis${binary.version}</artifactId>
  <version>${adam.version}</version>
</dependency>

For each release, we support four values for ${binary.version}:

  • _2.10: Spark 1.6.x on Scala 2.10
  • _2.11: Spark 1.6.x on Scala 2.11
  • -spark2_2.10: Spark 2.x on Scala 2.10
  • -spark2_2.11: Spark 2.x on Scala 2.11
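For example, with the ${binary.version} property resolved for Spark 2.x on Scala 2.11, the adam-core dependency above would read:

```xml
<dependency>
  <groupId>org.bdgenomics.adam</groupId>
  <artifactId>adam-core-spark2_2.11</artifactId>
  <version>${adam.version}</version>
</dependency>
```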

Additionally, we publish nightly SNAPSHOT releases of ADAM to the Sonatype snapshot repository, for developers who are interested in building on top of the latest changes in ADAM.
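To resolve SNAPSHOT artifacts, the Sonatype snapshot repository must be declared in your POM. A minimal repository entry looks like the following (the id value is arbitrary):

```xml
<repository>
  <id>sonatype-snapshots</id>
  <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  <snapshots>
    <enabled>true</enabled>
  </snapshots>
</repository>
```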

The ADAM Python API

ADAM’s Python API wraps the ADAMContext and GenomicRDD APIs so they can be used from PySpark. The Python API is feature-complete relative to ADAM’s Java API, with the exception of the region join API, which is not supported.

The ADAM R API

ADAM’s R API wraps the ADAMContext and GenomicRDD APIs so they can be used from SparkR. The R API is feature-complete relative to ADAM’s Java API, with the exception of the region join API, which is not supported.