primeqa.ir.dense.colbert_top.colbert.data.examples.Examples
- class primeqa.ir.dense.colbert_top.colbert.data.examples.Examples(path=None, data=None, nway=None, provenance=None)
Bases: object
Methods
cast
provenance
save
toDict
tolist
- tolist(rank=None, nranks=None)
NOTE: For distributed sampling, this isn’t equivalent to perfectly uniform sampling. In particular, each subset is perfectly represented in every batch. This is not a concern in practice: we never repeat passes over the data, so no particular triple is ever repeated, and because the underlying file is pre-shuffled, the split across nodes is random.
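As a rough illustration of the kind of split the note describes, the sketch below assumes that each rank takes every `nranks`-th item, starting at its own rank index. The function name `shard_examples` and the strided scheme are assumptions made for illustration only; they are not confirmed by the documentation above.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def shard_examples(
    data: Sequence[T],
    rank: Optional[int] = None,
    nranks: Optional[int] = None,
) -> List[T]:
    """Hypothetical helper: return the slice of `data` owned by one rank.

    Assumption: a strided split, where rank r gets items r, r + nranks, ...
    With no rank information, the full data is returned as a list.
    """
    if rank is None or nranks is None:
        return list(data)

    if not 0 <= rank < nranks:
        raise ValueError(f"rank {rank} is out of range for nranks {nranks}")

    # Each rank reads a disjoint, interleaved subset of the data.
    return [data[idx] for idx in range(rank, len(data), nranks)]


# Toy stand-in for a pre-shuffled list of training triples.
triples = list(range(10))
shards = [shard_examples(triples, rank=r, nranks=3) for r in range(3)]
# shards == [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Under this assumed scheme the shards are disjoint and together cover the data exactly once, which matches the note's point that no triple is repeated; any randomness in the split comes only from the file having been shuffled beforehand.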