Integration with Spark
Leverage the Spark execution engine, the Scala and Python SDKs, and SQL with the Parquet columnar storage format.
Advanced query tool for bioinformaticians
Work with genomic and phenotypic tabular data using a declarative relational query language in a parallel execution engine.
Genomic ordered data architecture
Efficient data structures and commands for genomic analysis use cases, such as range queries and table joins.
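
To illustrate why genomic ordering helps with range queries, here is a minimal Python sketch. It is not GOR's implementation; the `OrderedTable` class, its `range_query` method, and the sample rows are hypothetical names made up for this example. It assumes rows are kept sorted by (chromosome, position), so a range lookup can use binary search instead of a full scan. Chromosome names are compared as plain strings here, which is a simplification.

```python
from bisect import bisect_left, bisect_right


class OrderedTable:
    """Hypothetical in-memory table kept sorted by (chromosome, position)."""

    def __init__(self, rows):
        # Sort once up front; every later query relies on this ordering.
        self.rows = sorted(rows, key=lambda r: (r[0], r[1]))
        # Precompute the sort keys so each query is a pure binary search.
        self.keys = [(r[0], r[1]) for r in self.rows]

    def range_query(self, chrom, start, end):
        """Return rows on `chrom` with start <= position <= end."""
        lo = bisect_left(self.keys, (chrom, start))
        hi = bisect_right(self.keys, (chrom, end))
        return self.rows[lo:hi]


# Made-up sample rows: (chromosome, position, label).
table = OrderedTable([
    ("chr2", 300, "E"),
    ("chr1", 100, "A"),
    ("chr1", 900, "C"),
    ("chr2", 50, "D"),
    ("chr1", 250, "B"),
])

hits = table.range_query("chr1", 200, 900)
```

With the sample rows above, `hits` holds the two `chr1` rows at positions 250 and 900. Each query costs two binary searches plus the size of the result, regardless of how many rows lie outside the range.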
GORpipe query syntax
Combines the best of SQL and Unix shell pipe syntax, supporting seekable nested queries, materialized views, and a rich set of commands and functions.
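
The idea of chaining relational operations in a shell-pipe style can be sketched in Python with generators. This is only a conceptual analogy and does not show GORpipe syntax; the helpers `where`, `select`, `top`, and `pipe`, as well as the sample variant rows, are hypothetical names invented for this example.

```python
from itertools import islice


def where(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return (r for r in rows if predicate(r))


def select(rows, *columns):
    """Project each row (a dict) onto the named columns."""
    return ({c: r[c] for c in columns} for r in rows)


def top(rows, n):
    """Pass through at most the first n rows."""
    return islice(rows, n)


def pipe(rows, *stages):
    """Feed rows through each stage in order, like `a | b | c` in a shell."""
    for stage in stages:
        rows = stage(rows)
    return rows


# Made-up sample rows standing in for a variant table.
VARIANTS = [
    {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "G"},
    {"chrom": "chr2", "pos": 50, "ref": "C", "alt": "T"},
    {"chrom": "chr1", "pos": 250, "ref": "G", "alt": "T"},
    {"chrom": "chr1", "pos": 900, "ref": "T", "alt": "C"},
]

result = list(pipe(
    VARIANTS,
    lambda rows: where(rows, lambda r: r["chrom"] == "chr1"),
    lambda rows: select(rows, "pos", "alt"),
    lambda rows: top(rows, 2),
))
```

Because every stage is a lazy generator, rows stream through the chain one at a time and `top` stops pulling from upstream once it has its two rows, which mirrors how piped commands process a stream.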
Support for external commands
Define new commands using a JVM language or shell scripts.
Compatible with standard formats
BAM, CRAM, VCF, Tabix, TSV, CSV.
Set up parameterized functions using YAML and FreeMarker scripts.