primeqa.ir.dense.colbert_top.colbert.indexing.codecs.residual.ResidualCodec
- class primeqa.ir.dense.colbert_top.colbert.indexing.codecs.residual.ResidualCodec(config, centroids, avg_residual=None, bucket_cutoffs=None, bucket_weights=None)
Bases: object
Methods
binarize
compress
compress_into_codes: EVENTUALLY: Fusing the kernels or otherwise avoiding materializing the entire matrix before max(dim=0)
decompress: We batch below even if the target device is CUDA to avoid large temporary buffers causing OOM.
load
lookup_centroids: Handles multi-dimensional codes too.
save
try_load_torch_extensions
- compress_into_codes(embs, out_device)
EVENTUALLY: Fusing the kernels, or otherwise avoiding materializing the entire matrix before max(dim=0), seems like it would help a lot here.
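The note above is about how each embedding is assigned a centroid code. The following is a minimal sketch of that idea, not the library's implementation. It assumes the code for an embedding is the index of the centroid with the largest dot product, and it uses plain Python lists instead of tensors so that it runs without dependencies. The name `nearest_centroid_codes` and the `batch_size` parameter are hypothetical.

```python
def nearest_centroid_codes(embs, centroids, batch_size=2):
    """Hypothetical helper: index of the best-scoring centroid per embedding.

    embs:      list of embedding vectors (lists of floats)
    centroids: list of centroid vectors (lists of floats)
    """
    codes = []
    for start in range(0, len(embs), batch_size):
        batch = embs[start:start + batch_size]
        # scores[c][e] is the dot product of centroid c with embedding e.
        # This is the per-batch matrix that the note would like to avoid
        # materializing in full.
        scores = [
            [sum(x * y for x, y in zip(centroid, emb)) for emb in batch]
            for centroid in centroids
        ]
        # Arg-max over the centroid axis, the analogue of max(dim=0).
        for e_idx in range(len(batch)):
            best = max(range(len(centroids)), key=lambda c: scores[c][e_idx])
            codes.append(best)
    return codes


centroids = [[1.0, 0.0], [0.0, 1.0]]
embs = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]
print(nearest_centroid_codes(embs, centroids))  # [0, 1, 0]
```

Processing the embeddings in batches keeps the score matrix at `len(centroids) x batch_size` rather than `len(centroids) x len(embs)`.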
- decompress(compressed_embs: primeqa.ir.dense.colbert_top.colbert.indexing.codecs.residual_embeddings.ResidualEmbeddings)
We batch below even if the target device is CUDA to avoid large temporary buffers causing OOM.
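The batching described above can be illustrated with a simplified, dependency-free sketch. It assumes that each compressed embedding consists of a centroid code plus one bucket index per dimension, and that the reconstruction is the centroid plus the bucket weight for each dimension. Bit unpacking and any normalization are omitted, and the function name, argument layout, and `batch_size` are hypothetical rather than the library's API.

```python
def decompress_in_batches(codes, residual_buckets, centroids, bucket_weights,
                          batch_size=2):
    """Hypothetical helper: rebuild embeddings one batch at a time.

    codes:            centroid index per embedding
    residual_buckets: per embedding, one bucket index per dimension
    centroids:        list of centroid vectors
    bucket_weights:   residual value associated with each bucket index
    """
    output = []
    for start in range(0, len(codes), batch_size):
        # Only this slice is expanded at once, which bounds the size of the
        # intermediate buffers regardless of the total number of embeddings.
        batch_codes = codes[start:start + batch_size]
        batch_buckets = residual_buckets[start:start + batch_size]
        batch_out = [
            [c + bucket_weights[b] for c, b in zip(centroids[code], buckets)]
            for code, buckets in zip(batch_codes, batch_buckets)
        ]
        output.extend(batch_out)
    return output


centroids = [[1.0, 0.0], [0.0, 1.0]]
bucket_weights = [-0.5, 0.5]
codes = [0, 1, 1]
residual_buckets = [[0, 1], [1, 1], [0, 0]]
print(decompress_in_batches(codes, residual_buckets, centroids, bucket_weights))
# [[0.5, 0.5], [0.5, 1.5], [-0.5, 0.5]]
```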
- lookup_centroids(codes, out_device)
Handles multi-dimensional codes too.
EVENTUALLY: The .split() below should happen on a flat view.
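As an illustration of what handling multi-dimensional codes can mean, the sketch below replaces every integer code with its centroid vector while preserving the nesting of the input, so the result has the shape of `codes` with one extra trailing dimension. It uses nested Python lists in place of tensors, and the name `lookup_centroids_nested` is hypothetical.

```python
def lookup_centroids_nested(codes, centroids):
    """Hypothetical helper: map each code to its centroid, at any nesting depth."""
    if isinstance(codes, int):
        # A single code: return a copy of the centroid vector.
        return list(centroids[codes])
    # A list of codes (or of lists of codes): recurse into each element.
    return [lookup_centroids_nested(item, centroids) for item in codes]


centroids = [[1.0, 0.0], [0.0, 1.0]]
print(lookup_centroids_nested([0, 1], centroids))
# [[1.0, 0.0], [0.0, 1.0]]
print(lookup_centroids_nested([[0, 1], [1, 1]], centroids))
# [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
```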