GLnexus
Scalable datastore for population genome sequencing, with on-demand joint genotyping
#include <data.h>
Public Member Functions
virtual Status dataset_header (const std::string &dataset, std::shared_ptr< const bcf_hdr_t > &hdr) const =0
Retrieve the BCF header for a data set.
virtual Status dataset_range (const std::string &dataset, const bcf_hdr_t *hdr, const range &pos, std::vector< std::shared_ptr< bcf1_t > > &records)=0
virtual Status dataset_range_and_header (const std::string &dataset, const range &pos, std::shared_ptr< const bcf_hdr_t > &hdr, std::vector< std::shared_ptr< bcf1_t > > &records)
virtual Status sampleset_range (const MetadataCache &metadata, const std::string &sampleset, const range &pos, std::shared_ptr< const std::set< std::string > > &samples, std::shared_ptr< const std::set< std::string > > &datasets, std::vector< std::unique_ptr< RangeBCFIterator > > &iterators)
Abstract interface to stored BCF data sets. The implementation is responsible for any suitable caching.
dataset_range() (pure virtual)
Retrieve all BCF records in the data set overlapping a range.
Each record x will already have been "unpacked" with bcf_unpack(x,BCF_UN_ALL). The records may be shared, so they must not be mutated. (They aren't declared const because some vcf.h accessor functions don't take const bcf1_t*.)
The provided header must match the data set; otherwise the behavior is undefined.
Implemented in GLnexus::BCFKeyValueData.
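As an illustration of what "overlapping a range" can mean, here is a minimal sketch of an overlap test. The `Range` struct, its half-open `[beg, end)` convention, and the helper names are assumptions made for this example; they are not taken from the GLnexus `range` type or from any implementation of this interface.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a genomic range: a contig id plus a
// half-open interval [beg, end). Not the GLnexus `range` type.
struct Range {
    int rid;
    int beg;
    int end;
};

// True when both ranges lie on the same contig and share at least one position.
inline bool overlaps(const Range& a, const Range& b) {
    return a.rid == b.rid && a.beg < b.end && b.beg < a.end;
}

// Keep only the records that overlap the query, preserving input order.
inline std::vector<Range> select_overlapping(const std::vector<Range>& records,
                                             const Range& query) {
    std::vector<Range> out;
    for (const auto& r : records) {
        if (overlaps(r, query)) {
            out.push_back(r);
        }
    }
    return out;
}
```

Under the half-open assumption, two ranges that merely touch (one ends where the other begins) do not overlap.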
dataset_range_and_header() (virtual)
Wrapper for dataset_range that first fetches the appropriate header. Useful if the caller doesn't already have the header in hand.
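The wrapper pattern described here can be sketched as a default virtual method that calls the two pure virtual methods in turn. Everything below is a simplified stand-in: `Status`, `Header`, `Record`, `Range`, `DataSketch`, and `OneDataset` are hypothetical types invented for this example (the real interface uses `bcf_hdr_t`, `bcf1_t`, and the GLnexus `Status` and `range` types), and the actual default implementation may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the GLnexus/htslib types.
enum class Status { OK, NotFound };
struct Header { std::string dataset; };
struct Record { int beg; int end; };          // half-open [beg, end), assumed
struct Range { int rid; int beg; int end; };  // half-open [beg, end), assumed

class DataSketch {
public:
    virtual ~DataSketch() = default;

    virtual Status dataset_header(const std::string& dataset,
                                  std::shared_ptr<const Header>& hdr) const = 0;

    virtual Status dataset_range(const std::string& dataset, const Header* hdr,
                                 const Range& pos,
                                 std::vector<std::shared_ptr<Record>>& records) = 0;

    // Default wrapper: fetch the header, then delegate to dataset_range.
    virtual Status dataset_range_and_header(const std::string& dataset, const Range& pos,
                                            std::shared_ptr<const Header>& hdr,
                                            std::vector<std::shared_ptr<Record>>& records) {
        Status s = dataset_header(dataset, hdr);
        if (s != Status::OK) {
            return s;  // propagate the failure without touching `records`
        }
        return dataset_range(dataset, hdr.get(), pos, records);
    }
};

// Toy in-memory implementation holding a single data set named "ds1".
class OneDataset : public DataSketch {
public:
    Status dataset_header(const std::string& dataset,
                          std::shared_ptr<const Header>& hdr) const override {
        if (dataset != "ds1") {
            return Status::NotFound;
        }
        hdr = std::make_shared<const Header>(Header{"ds1"});
        return Status::OK;
    }

    Status dataset_range(const std::string& dataset, const Header* hdr, const Range& pos,
                         std::vector<std::shared_ptr<Record>>& records) override {
        if (dataset != "ds1" || hdr == nullptr) {
            return Status::NotFound;
        }
        for (const auto& r : stored_) {
            if (r.beg < pos.end && pos.beg < r.end) {
                records.push_back(std::make_shared<Record>(r));
            }
        }
        return Status::OK;
    }

private:
    std::vector<Record> stored_{{0, 10}, {20, 30}};
};
```

In this sketch a caller without a header makes one call and receives both the header and the records; a caller that already holds the header would call `dataset_range` directly and skip the lookup.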
sampleset_range() (virtual)
Get iterators for BCF records overlapping the given range in all data sets containing at least one sample in the designated sample set. To facilitate parallelization, the implementation may yield multiple iterators, each of which will produce a range-based disjoint subset of the relevant records. Each iterator will yield results for each relevant data set (possibly yielding zero records in some steps); that is, all iterators will reach their end after the same number of steps. Together, the iterators will produce each relevant record exactly once.
Reimplemented in GLnexus::BCFKeyValueData.
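The iterator contract above can be illustrated with a small consumer that drains every iterator and merges the results per data set. This is only a sketch: `StepResult`, `RangeIteratorSketch`, `VectorIterator`, `make_toy_iterators`, and `drain` are hypothetical names invented for this example, records are reduced to integer positions, and the real `RangeBCFIterator` interface is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One step of an iterator: a data set name and that data set's records
// (modelled as integer positions) within the iterator's sub-range.
struct StepResult {
    std::string dataset;
    std::vector<int> records;
};

// Hypothetical iterator interface; next() returns false once exhausted.
class RangeIteratorSketch {
public:
    virtual ~RangeIteratorSketch() = default;
    virtual bool next(StepResult& out) = 0;
};

// Toy iterator backed by a precomputed list of steps.
class VectorIterator : public RangeIteratorSketch {
public:
    explicit VectorIterator(std::vector<StepResult> steps) : steps_(std::move(steps)) {}
    bool next(StepResult& out) override {
        if (pos_ >= steps_.size()) {
            return false;
        }
        out = steps_[pos_++];
        return true;
    }

private:
    std::vector<StepResult> steps_;
    std::size_t pos_ = 0;
};

// Two iterators over data sets "A" and "B", split at position 100.
// Each visits every data set, so both take the same number of steps.
inline std::vector<std::unique_ptr<RangeIteratorSketch>> make_toy_iterators() {
    std::vector<std::unique_ptr<RangeIteratorSketch>> its;
    its.push_back(std::unique_ptr<RangeIteratorSketch>(
        new VectorIterator({{"A", {10}}, {"B", {}}})));       // positions < 100
    its.push_back(std::unique_ptr<RangeIteratorSketch>(
        new VectorIterator({{"A", {150}}, {"B", {120}}})));   // positions >= 100
    return its;
}

// Drain each iterator in turn, merging records per data set and
// recording how many steps each iterator took.
inline std::map<std::string, std::vector<int>> drain(
    std::vector<std::unique_ptr<RangeIteratorSketch>>& iterators,
    std::vector<std::size_t>& steps) {
    std::map<std::string, std::vector<int>> merged;
    for (auto& it : iterators) {
        std::size_t n = 0;
        StepResult step;
        while (it->next(step)) {
            auto& dest = merged[step.dataset];
            dest.insert(dest.end(), step.records.begin(), step.records.end());
            ++n;
        }
        steps.push_back(n);
    }
    return merged;
}
```

The sketch drains the iterators sequentially for simplicity. Because their sub-ranges are disjoint, each one could instead be handed to its own worker, with the per-data-set results merged afterwards.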