Redis TTL for MinHashLSH #305
Open
Does the library support defining a TTL expiry for the LSH index? If not, is there a reason TTL support wasn't implemented?
I looked into the source code and was able to work around this by iterating over all keys and expiring them. However, I'd be happy to contribute if this is something the maintainers would be open to.
Below is a method from a class I wrote that holds the MinHashLSH instance as self.lsh and the Redis client as self._redis_client.
from typing import cast

from datasketch.storage import RedisListStorage, RedisSetStorage

# REDIS_INDEX_TTL is a constant defined elsewhere in my code (TTL in seconds).

def refresh_ttl(self) -> None:
    """Refresh TTL on all Redis keys used by the datasketch LSH index.

    Redis key structure per band (RedisSetStorage):
      - table._name  : a Redis hash mapping band-hash → Redis set key
      - each set key : a Redis set containing the indexed document keys

    Redis key structure for the keys table (RedisListStorage):
      - lsh.keys._name : a Redis hash mapping doc-key → Redis list key
      - each list key  : a Redis list containing the per-band hashes
    """
    # for reference -> self.lsh = MinHashLSH()
    if self._redis_client is None:
        return

    # --- Round trip 1: fetch all Redis keys that need expiration ---
    pipe = self._redis_client.pipeline()
    tables = cast(list[RedisSetStorage], self.lsh.hashtables)
    for table in tables:
        pipe.hvals(cast(str, table._name))
    lsh_keys = cast(RedisListStorage, self.lsh.keys)
    pipe.hvals(cast(str, lsh_keys._name))
    results: list[list[bytes]] = pipe.execute()
    all_redis_keys = results[:-1]    # one list of set keys per band table
    lsh_key_redis_keys = results[-1]  # list keys of the keys table

    # --- Round trip 2: EXPIRE every key ---
    pipe = self._redis_client.pipeline()
    for table, redis_keys in zip(tables, all_redis_keys):
        # Expire the band's hash itself, then every set it points to.
        pipe.expire(cast(bytes, table._name), REDIS_INDEX_TTL)
        for rk in redis_keys:
            pipe.expire(rk, REDIS_INDEX_TTL)
    # Expire the keys-table hash, then every list it points to.
    pipe.expire(lsh_keys._name, REDIS_INDEX_TTL)
    for rk in lsh_key_redis_keys:
        pipe.expire(rk, REDIS_INDEX_TTL)
    pipe.execute()
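To make the idea easier to discuss independently of my class, here is a rough sketch of the same two-round-trip approach as a standalone helper. The function name `expire_hash_and_members` and its parameters are hypothetical and not part of datasketch. It only assumes a redis-py style client exposing `pipeline()`, `hvals()`, `expire()` and `execute()`, and that the values of each hash are the names of the per-entry Redis keys, as described in the docstring above.

```python
from typing import Any, Iterable, Union

RedisName = Union[str, bytes]


def expire_hash_and_members(
    client: Any,
    hash_names: Iterable[RedisName],
    ttl_seconds: int,
) -> int:
    """Set a TTL on each hash and on every key named in the hash's values.

    Hypothetical helper (not a datasketch API). `client` is assumed to
    behave like a redis-py client. Returns the number of EXPIRE commands
    that were queued.
    """
    names = list(hash_names)
    if not names:
        return 0

    # Round trip 1: read the values of every hash in one pipeline.
    pipe = client.pipeline()
    for name in names:
        pipe.hvals(name)
    member_lists = pipe.execute()

    # Round trip 2: queue EXPIRE for each hash and each key it references.
    pipe = client.pipeline()
    queued = 0
    for name, members in zip(names, member_lists):
        pipe.expire(name, ttl_seconds)
        queued += 1
        for member in members:
            pipe.expire(member, ttl_seconds)
            queued += 1
    pipe.execute()
    return queued
```

With datasketch, the hash names would be the `_name` attributes of `lsh.hashtables` and `lsh.keys` used in the method above. Those are private attributes, so this relies on internals that may change. Since EXPIRE only affects keys that exist when it runs, the helper would need to be called again after new inserts.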