Ensure consistent quality tag matching across all pattern types
Fix inconsistency where some quality tags were only matched in bracketed format but not in unbracketed format, causing streams with unbracketed quality tags to be excluded from matching. Previous issue: - [4K], [Unknown], [Unk], [Slow], [Dead] were matched (bracketed) - But "4K", "Unknown", "Unk", "Slow", "Dead" were NOT matched (unbracketed) This caused streams like: - "BBC One 4K" (no brackets) - NOT normalized, NOT matched - "BBC One Unknown" - NOT normalized, NOT matched - "ESPN (Slow)" - NOT normalized, NOT matched Changes to HARDCODED_IGNORE_PATTERNS: - Added 4K, Unknown, Unk, Slow, Dead to middle pattern: " TAG " - Added 4K, Unknown, Unk, Slow, Dead to end pattern: " TAG" - Added 4K, Unknown, Unk, Slow, Dead to colon pattern: "TAG:" - Added 4K, Unknown, Unk, Slow, Dead to parentheses pattern: (TAG) - Added inline documentation for each pattern type - Note: All patterns already use re.IGNORECASE for case-insensitive matching Now ALL quality tags are handled consistently across: - Bracketed: [TAG] ✓ - Unbracketed at end: " TAG" ✓ - Unbracketed in middle: " TAG " ✓ - Parenthesized: (TAG) ✓ - With colon: "TAG:" ✓ This ensures no quality tags are missed regardless of format.
This commit is contained in:
@@ -14,18 +14,38 @@ from glob import glob
|
||||
LOGGER = logging.getLogger("plugins.fuzzy_matcher")
|
||||
|
||||
# Hardcoded regex patterns to ignore during fuzzy matching
|
||||
# Note: All patterns are applied with re.IGNORECASE flag in normalize_name()
|
||||
HARDCODED_IGNORE_PATTERNS = [
|
||||
# Bracketed quality tags: [4K], [UHD], [FHD], [HD], [SD], [Unknown], [Unk], [Slow], [Dead]
|
||||
r'\[(4K|UHD|FHD|HD|SD|Unknown|Unk|Slow|Dead)\]',
|
||||
r'\[(?:4k|uhd|fhd|hd|sd|unknown|unk|slow|dead)\]',
|
||||
|
||||
# Single letter tags in parentheses: (A), (B), (C), etc.
|
||||
r'\([A-Z]\)',
|
||||
|
||||
# Regional: " East" or " east"
|
||||
r'\s[Ee][Aa][Ss][Tt]',
|
||||
r'\s(?:UHD|FHD|SD|HD|FD)\s',
|
||||
r'\s(?:UHD|FHD|SD|HD|FD)$',
|
||||
r'\b(?:UHD|FHD|SD|HD|FD):?\s',
|
||||
r'\s\(CX\)',
|
||||
r'\s\((UHD|FHD|SD|HD|FD|Backup)\)',
|
||||
r'\bUSA?:\s',
|
||||
r'\bUS\s',
|
||||
|
||||
# Unbracketed quality tags in middle: " 4K ", " UHD ", " FHD ", " HD ", " SD ", etc.
|
||||
r'\s(?:4K|UHD|FHD|HD|SD|Unknown|Unk|Slow|Dead|FD)\s',
|
||||
|
||||
# Unbracketed quality tags at end: " 4K", " UHD", " FHD", " HD", " SD", etc.
|
||||
r'\s(?:4K|UHD|FHD|HD|SD|Unknown|Unk|Slow|Dead|FD)$',
|
||||
|
||||
# Word boundary quality tags with optional colon: "4K:", "UHD:", "FHD:", "HD:", etc.
|
||||
r'\b(?:4K|UHD|FHD|HD|SD|Unknown|Unk|Slow|Dead|FD):?\s',
|
||||
|
||||
# Special tags
|
||||
r'\s\(CX\)', # Cinemax tag
|
||||
|
||||
# Parenthesized quality tags: (4K), (UHD), (FHD), (HD), (SD), (Unknown), (Unk), (Slow), (Dead), (Backup)
|
||||
r'\s\((4K|UHD|FHD|HD|SD|Unknown|Unk|Slow|Dead|FD|Backup)\)',
|
||||
|
||||
# Geographic prefixes
|
||||
r'\bUSA?:\s', # "US:" or "USA:"
|
||||
r'\bUS\s', # "US " at word boundary
|
||||
|
||||
# Backup tags
|
||||
r'\([bB]ackup\)',
|
||||
]
|
||||
|
||||
|
||||
Reference in New Issue
Block a user