High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datasets
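
For readers new to the technique, here is a minimal pure-Python sketch of how MinHash estimates Jaccard similarity between two token sets. It is an illustration only and does not show this project's API: the names `minhash_signature`, `estimate_jaccard`, and `NUM_PERM`, and the choice of seeded BLAKE2b as the hash family, are assumptions made for the example.

```python
import hashlib

NUM_PERM = 128  # hypothetical signature length; more hashes give a tighter estimate


def _seeded_hash(token: str, seed: int) -> int:
    """64-bit hash of `token`, salted with `seed` to simulate one hash function."""
    data = seed.to_bytes(8, "little") + token.encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def minhash_signature(tokens, num_perm: int = NUM_PERM) -> list:
    """Return, for each seeded hash function, the minimum hash over the token set."""
    unique = set(tokens)
    if not unique:
        raise ValueError("cannot build a signature from an empty token set")
    return [min(_seeded_hash(t, seed) for t in unique) for seed in range(num_perm)]


def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of signature positions that agree, an estimate of Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = "the quick brown fox jumps over the lazy dog".split()
    doc_b = "the quick brown fox sleeps under the lazy dog".split()
    sim = estimate_jaccard(minhash_signature(doc_a), minhash_signature(doc_b))
    print(f"estimated Jaccard similarity: {sim:.2f}")
```

Near-duplicate detection then reduces to comparing fixed-length signatures instead of full documents, which is what makes the approach practical for deduplicating large datasets.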