Reference
- class tca.pipeline.SearchIndex(embeddings, metadata, config)
- Parameters:
embeddings (np.ndarray)
metadata (list[MetadataRecord])
config (TurboQuantConfig)
- class tca.config.TurboQuantConfig(bit_width: int = 3, candidate_k: int = 128, rerank_k: int = 20, oversample: int = 2, seed: int = 0, quantizer_kind: str = 'prod', lloyd_max_iter: int = 100, lloyd_tol: float = 1e-06, monte_carlo_samples: int = 20000, store_original_embeddings: bool = True, auto_score_gap_threshold: float = 0.06, auto_score_spread_threshold: float = 0.015, max_candidate_k: int = 2048, max_oversample: int = 8)
- Parameters:
bit_width (int)
candidate_k (int)
rerank_k (int)
oversample (int)
seed (int)
quantizer_kind (str)
lloyd_max_iter (int)
lloyd_tol (float)
monte_carlo_samples (int)
store_original_embeddings (bool)
auto_score_gap_threshold (float)
auto_score_spread_threshold (float)
max_candidate_k (int)
max_oversample (int)
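The signature above reads like a plain dataclass, so overriding a few defaults while keeping the rest might look like the sketch below. `TurboQuantConfigSketch` is a hypothetical stand-in that only mirrors the documented field names and defaults so the example runs without the `tca` package installed. Whether the real class is a dataclass, and whether it validates its fields, is an assumption.

```python
from dataclasses import dataclass, replace


@dataclass
class TurboQuantConfigSketch:
    """Hypothetical stand-in mirroring the documented TurboQuantConfig fields."""

    bit_width: int = 3
    candidate_k: int = 128
    rerank_k: int = 20
    oversample: int = 2
    seed: int = 0
    quantizer_kind: str = "prod"
    lloyd_max_iter: int = 100
    lloyd_tol: float = 1e-06
    monte_carlo_samples: int = 20000
    store_original_embeddings: bool = True
    auto_score_gap_threshold: float = 0.06
    auto_score_spread_threshold: float = 0.015
    max_candidate_k: int = 2048
    max_oversample: int = 8


# All documented defaults.
default_config = TurboQuantConfigSketch()

# Copy the defaults and override only two fields.
wide_config = replace(default_config, bit_width=4, candidate_k=256)
```

With the real package, the equivalent call would presumably be `TurboQuantConfig(bit_width=4, candidate_k=256)`, using the keyword names from the signature.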
- tca.quantization.fit_scalar_codebook(bit_width, dimension, n_samples=20000, seed=0, max_iter=100, tol=1e-06)
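The parameter names (`n_samples`, `max_iter`, `tol`, and the `lloyd_*` fields elsewhere in this reference) suggest a Lloyd-Max fit of a scalar codebook on Monte Carlo samples. The sketch below illustrates that general technique and is not the library's implementation. The function name `fit_scalar_codebook_sketch`, the sampling distribution (one coordinate of a uniformly random unit vector in `dimension` dimensions), the quantile initialization, and the returned array of `2 ** bit_width` centroids are all assumptions.

```python
import numpy as np


def fit_scalar_codebook_sketch(bit_width, dimension, n_samples=20000,
                               seed=0, max_iter=100, tol=1e-06):
    """Lloyd-Max fit of a 1-D codebook (hypothetical illustration)."""
    rng = np.random.default_rng(seed)

    # Assumption: the scalar to quantize is one coordinate of a uniformly
    # random unit vector, drawn by normalizing a Gaussian vector.
    gauss = rng.standard_normal((n_samples, dimension))
    samples = gauss[:, 0] / np.linalg.norm(gauss, axis=1)

    n_levels = 2 ** bit_width
    # Start from evenly spaced sample quantiles so no cell begins empty.
    centroids = np.quantile(samples, (np.arange(n_levels) + 0.5) / n_levels)

    for _ in range(max_iter):
        # Nearest-centroid cells in 1-D are split at midpoints.
        boundaries = (centroids[:-1] + centroids[1:]) / 2
        cells = np.searchsorted(boundaries, samples)

        # Move each centroid to the mean of the samples in its cell.
        updated = centroids.copy()
        for level in range(n_levels):
            members = samples[cells == level]
            if members.size:
                updated[level] = members.mean()

        shift = np.max(np.abs(updated - centroids))
        centroids = updated
        if shift < tol:
            break

    return centroids
```

Each iteration alternates the two Lloyd-Max conditions: cell boundaries sit halfway between neighboring centroids, and each centroid is the mean of its cell. The loop stops once no centroid moves by more than `tol`, or after `max_iter` passes.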
- class tca.quantization.TurboQuantMSE(dimension, bit_width, *, seed=0, monte_carlo_samples=20000, lloyd_max_iter=100, lloyd_tol=1e-06)
- Parameters:
- approximate_inner_products(query, encoded)
- Parameters:
query (ArrayLike)
encoded (EncodedMSE | ArrayLike)
- Return type:
- class tca.quantization.EncodedProd(indices: IntArray, signs: SignArray, residual_norms: FloatArray)
- Parameters:
indices (IntArray)
signs (SignArray)
residual_norms (FloatArray)
- class tca.quantization.TurboQuantProd(dimension, bit_width, *, seed=0, monte_carlo_samples=20000, lloyd_max_iter=100, lloyd_tol=1e-06)
- Parameters:
- decode(encoded)
- Parameters:
encoded (EncodedProd)
- Return type:
- tca.quantization.exact_topk(query, bank, top_k, ids=None)
- Parameters:
query (ArrayLike)
bank (ArrayLike)
top_k (int)
- Return type:
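For orientation, a brute-force top-k search with the same argument names as `exact_topk` can be sketched in NumPy as follows. The return type is not stated above, so the helper name `exact_topk_sketch`, the inner-product scoring, the `(labels, scores)` return pair, and the fallback to row indices when `ids` is `None` are assumptions, not the library's documented behavior. The sketch also assumes `top_k` is at least 1.

```python
import numpy as np


def exact_topk_sketch(query, bank, top_k, ids=None):
    """Exact inner-product top-k over a bank of vectors (hypothetical)."""
    query = np.asarray(query, dtype=np.float64)
    bank = np.asarray(bank, dtype=np.float64)

    # One inner product per bank row.
    scores = bank @ query
    top_k = min(top_k, scores.shape[0])

    # Select the top_k highest scores without sorting the whole bank,
    # then order just those by descending score.
    top = np.argpartition(-scores, top_k - 1)[:top_k]
    top = top[np.argsort(-scores[top], kind="stable")]

    # Assumption: without ids, report row indices into the bank.
    labels = top if ids is None else np.asarray(ids)[top]
    return labels, scores[top]


bank = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]])
labels, scores = exact_topk_sketch([1.0, 0.0], bank, top_k=2,
                                   ids=["a", "b", "c", "d"])
```

An exact search like this is a natural baseline for checking how many true neighbors a quantized candidate stage recovers.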