Integration with Spark
Leverage the Spark execution engine, the Scala and Python SDKs, and SQL with the Parquet columnar storage format.
GORpipe allows analysis of large sets of genomic and phenotypic tabular data using a declarative query language in a parallel execution engine.
Advanced query tool for bioinformaticians
Work with genomic and phenotypic tabular data using a declarative relational query language in a parallel execution engine.
Genomic ordered data architecture
Efficient data structures and commands for genomic analysis use cases, such as range queries and table joins.
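To illustrate why genomic ordering helps with range queries, here is a minimal Python sketch. It is not GORpipe code: the `GenomicIndex` class, the row layout, and the sample data are hypothetical. It assumes rows are kept sorted by (chromosome, position), so a range lookup becomes two binary searches instead of a full scan.

```python
from bisect import bisect_left, bisect_right


class GenomicIndex:
    """Hypothetical in-memory index over rows of (chromosome, position, value)."""

    def __init__(self, rows):
        # Keep rows in genomic order. Chromosome names are compared as plain
        # strings here, which is a simplification for this sketch.
        self.rows = sorted(rows, key=lambda r: (r[0], r[1]))
        # Precompute the sort keys once so each query is O(log n).
        self.keys = [(r[0], r[1]) for r in self.rows]

    def range_query(self, chrom, start, end):
        """Return rows on `chrom` with start <= position <= end."""
        lo = bisect_left(self.keys, (chrom, start))
        hi = bisect_right(self.keys, (chrom, end))
        return self.rows[lo:hi]


# Made-up sample data for illustration only.
ROWS = [
    ("chr1", 100, "A"),
    ("chr1", 250, "B"),
    ("chr1", 900, "C"),
    ("chr2", 50, "D"),
    ("chr2", 400, "E"),
]

index = GenomicIndex(ROWS)
hits = index.range_query("chr1", 200, 1000)
```

The same ordering is what makes merge-style joins between two sorted tables possible without loading either table fully into memory.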
GORpipe query syntax
Combines the best of SQL and Unix shell pipe syntax, supporting seekable nested queries, materialized views, and a rich set of commands and functions.
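The idea of chaining relational steps in a pipe can be sketched in Python with generators. This is an analogy only, not GORpipe syntax or its implementation: the helpers `where`, `select`, `top`, and `pipe`, and the sample rows, are hypothetical names chosen for this example.

```python
from functools import reduce
from itertools import islice


def where(predicate):
    """Build a step that keeps only rows matching `predicate`."""
    def step(rows):
        return (row for row in rows if predicate(row))
    return step


def select(*columns):
    """Build a step that projects each row onto the given columns."""
    def step(rows):
        return ({c: row[c] for c in columns} for row in rows)
    return step


def top(n):
    """Build a step that passes through at most the first `n` rows."""
    def step(rows):
        return islice(rows, n)
    return step


def pipe(source, *steps):
    """Feed `source` through each step in order, like `a | b | c` in a shell."""
    return reduce(lambda rows, step: step(rows), steps, source)


# Made-up sample rows for illustration only.
VARIANTS = [
    {"chrom": "chr1", "pos": 100, "qual": 10},
    {"chrom": "chr1", "pos": 250, "qual": 45},
    {"chrom": "chr2", "pos": 50, "qual": 60},
    {"chrom": "chr2", "pos": 400, "qual": 99},
]

result = list(pipe(
    VARIANTS,
    where(lambda r: r["qual"] >= 30),
    select("chrom", "pos"),
    top(2),
))
```

Each step is lazy, so rows stream through the chain one at a time, which is the property that makes pipe-style queries suitable for large tables.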
Support for external commands
Define new commands using any JVM language or shell scripts.
Compatible with standard formats
BAM, CRAM, VCF, Tabix, TSV, CSV.
Stored procedures
Set up parameterized functions using YAML and FreeMarker scripts.
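As a rough sketch of what a parameterized query template involves, the Python example below uses the standard library's `string.Template` as a stand-in for FreeMarker, since both support `${name}` placeholders. Everything here is an assumption for illustration: the `PROCEDURE` dictionary stands in for a YAML definition, the `render` helper is hypothetical, and the query text is illustrative rather than verified GORpipe syntax.

```python
from string import Template

# Hypothetical stored-procedure definition. In practice this would live in a
# YAML file; a plain dict is used here to keep the sketch dependency-free.
PROCEDURE = {
    "name": "high_quality_rows",
    "defaults": {"threshold": "30", "limit": "10"},
    # Illustrative query text only, not verified GORpipe syntax.
    "template": "gor ${source} | where ${column} >= ${threshold} | top ${limit}",
}


def render(procedure, **params):
    """Fill the template with defaults overridden by caller-supplied params.

    Raises KeyError if a placeholder has neither a default nor a parameter.
    """
    values = {**procedure["defaults"], **params}
    return Template(procedure["template"]).substitute(values)


query = render(PROCEDURE, source="variants.gor", column="qual")
```

Keeping defaults next to the template lets callers supply only the parameters that differ, and a missing parameter fails loudly instead of producing a malformed query.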