Skip to content

fingerprint

Track Fingerprint Generator.

This module provides track fingerprinting functionality for content-based cache invalidation. Instead of using TTL expiration, tracks are fingerprinted based on critical properties that indicate meaningful changes. The fingerprint is a SHA-256 hash of canonical track data.

Core Concept: - Track fingerprint = SHA-256(persistent_id + location + file_size + duration + dates) - Cache invalidates only when fingerprint changes (content-based invalidation) - Eliminates wasteful TTL expiration for stable music library data

Usage

generator = FingerprintGenerator() fingerprint = generator.generate_track_fingerprint(track_data) if fingerprint != cached_fingerprint: invalidate_cache(track_id)

Design Philosophy (Linus Torvalds approved): - Deterministic: Same track data always produces same fingerprint - Minimal: Only essential properties that matter for cache validity - Fast: SHA-256 is fast enough for 1000+ tracks - Reliable: Detects all meaningful changes, ignores cosmetic ones

FingerprintGenerationError

FingerprintGenerationError(message, track_data=None)

Bases: Exception

Raised when fingerprint generation fails.

Initialize fingerprint generation error.

Parameters:

Name Type Description Default
message str

Error description

required
track_data dict[str, Any] | None

Track data that caused the error (optional, for debugging)

None
Source code in src/services/cache/fingerprint.py
def __init__(self, message: str, track_data: dict[str, Any] | None = None) -> None:
    """Initialize fingerprint generation error.

    Args:
        message: Error description
        track_data: Track data that caused the error (optional, for debugging)
    """
    super().__init__(message)
    self.track_data = track_data

FingerprintGenerator

FingerprintGenerator(logger=None)

Generates deterministic fingerprints for music tracks.

Track fingerprints are SHA-256 hashes of critical track properties that indicate meaningful changes. This enables content-based cache invalidation instead of wasteful time-based TTL expiration.

Fingerprint Properties (carefully chosen): - persistent_id: Primary key from Music.app (never changes for same track) - location: File path (changes when file moved/deleted) - file_size: File size in bytes (changes when file replaced) - duration: Track duration (changes when file replaced) - date_modified: File modification timestamp (changes on metadata edits) - date_added: When track was added (changes on re-import)

Deliberately Excluded Properties
  • play_count: Changes frequently but doesn't affect processing
  • rating: User preference, not content change
  • last_played: Frequent changes irrelevant to processing
  • genre: This is what we're computing, can't use for fingerprint!

Initialize fingerprint generator.

Parameters:

Name Type Description Default
logger Logger | None

Logger for debugging fingerprint generation (optional)

None
Source code in src/services/cache/fingerprint.py
def __init__(self, logger: logging.Logger | None = None) -> None:
    """Initialize fingerprint generator.

    Args:
        logger: Logger for debugging fingerprint generation (optional)
    """
    self.logger = logger or logging.getLogger(__name__)

generate_track_fingerprint

generate_track_fingerprint(track_data)

Generate SHA-256 fingerprint for a music track.

Creates a deterministic hash based on critical track properties that indicate meaningful content changes. The fingerprint enables content-based cache invalidation instead of arbitrary TTL expiration.

Parameters:

Name Type Description Default
track_data dict[str, Any]

Track metadata dictionary containing track properties

required

Returns:

Type Description
str

SHA-256 hash string (64 hex characters) representing track fingerprint

Raises:

Type Description
FingerprintGenerationError

If required properties missing or invalid

Example

track_data = { "persistent_id": "ABC123DEF456", "location": "/Users/user/Music/song.mp3", "file_size": 5242880, "duration": 240.5, "date_modified": "2025-09-11 10:30:00", "date_added": "2025-09-10 15:00:00" } fingerprint = generator.generate_track_fingerprint(track_data)

Returns: "a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456"

Source code in src/services/cache/fingerprint.py
def generate_track_fingerprint(self, track_data: dict[str, Any]) -> str:
    """Generate SHA-256 fingerprint for a music track.

    Creates a deterministic hash based on critical track properties that
    indicate meaningful content changes. The fingerprint enables content-based
    cache invalidation instead of arbitrary TTL expiration.

    Args:
        track_data: Track metadata dictionary containing track properties

    Returns:
        SHA-256 hash string (64 hex characters) representing track fingerprint

    Raises:
        FingerprintGenerationError: If required properties missing or invalid

    Example:
        track_data = {
            "persistent_id": "ABC123DEF456",
            "location": "/Users/user/Music/song.mp3",
            "file_size": 5242880,
            "duration": 240.5,
            "date_modified": "2025-09-11 10:30:00",
            "date_added": "2025-09-10 15:00:00"
        }
        fingerprint = generator.generate_track_fingerprint(track_data)
        # Returns: "a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456"
    """
    try:
        return self._create_fingerprint_hash(track_data)
    except FingerprintGenerationError:
        raise
    except (KeyError, TypeError, ValueError) as e:
        error_msg = f"Failed to generate fingerprint: {e}"
        self.logger.exception(error_msg)
        raise FingerprintGenerationError(error_msg, track_data) from e

validate_fingerprint staticmethod

validate_fingerprint(fingerprint)

Validate that a fingerprint has correct format.

Parameters:

Name Type Description Default
fingerprint Any

Fingerprint value to validate (should be string)

required

Returns:

Type Description
bool

True if fingerprint is valid SHA-256 hex string, False otherwise

Source code in src/services/cache/fingerprint.py
@staticmethod
def validate_fingerprint(fingerprint: Any) -> bool:
    """Validate that a fingerprint has correct format.

    Args:
        fingerprint: Fingerprint value to validate (should be string)

    Returns:
        True if fingerprint is valid SHA-256 hex string, False otherwise
    """
    if not isinstance(fingerprint, str):
        return False

    # SHA-256 produces 64 hex characters
    if len(fingerprint) != 64:
        return False

    # Check if all characters are valid hex
    try:
        int(fingerprint, 16)
        return True
    except ValueError:
        return False