Cleaned search index for accent insensitive search

Keno · July 1, 2022, 7:56pm

What is the feature or ability you would like to have?
Lexicon has an excellent search feature! But as someone who plays Spanish and German tracks (with accents and umlauts), one thing from Serato is missing for me: a cleaned search index. What that means is that Serato stores normalized strings in a query database. To illustrate: consider a track called “La Médina”. In Lexicon, this track will not be found if you enter “La Medina” or “La Mèdina”. This is especially bad for things like the “ñ”, which can be stored either as a single character or as an "n" (U+006E) followed by the code point for the combining tilde (U+0303). Many of the big providers do this, e.g. Spotify (https://chuckdafonk.medium.com/quick-hits-accents-umlauts-in-music-search-maximizing-instagram-story-highlights-e2b901fed022). In a normalized search, the search index stores all strings unaccented, and queries are transformed the same way before searching, i.e. all of the queries above would produce a hit on “La Médina”. This is very helpful in my opinion. It is also supported at the tool level by now, e.g. in SQL.
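To make the idea concrete, here is a minimal Python sketch of the kind of normalization described above. It is only an illustration, not how Serato or Spotify actually do it: the function name `fold_accents` is made up for this example, and it assumes that decomposing to NFD and dropping combining marks is an acceptable definition of "unaccented".

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return a case-folded, accent-free form of text (hypothetical helper)."""
    # NFD splits precomposed characters such as U+00F1 into a base letter
    # plus combining marks (here U+006E followed by U+0303).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks so only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() makes the comparison case-insensitive as well.
    return stripped.casefold()


title = "La Médina"
for query in ("La Medina", "La Mèdina", "la médina"):
    # Both the stored title and the query go through the same transformation.
    print(query, fold_accents(query) == fold_accents(title))
```

With this approach both spellings of “ñ” (single character, or "n" plus combining tilde) end up as a plain "n", so they match each other and the unaccented query.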

How will this feature help you and others?
It facilitates search. Not really high priority, but it always gives me pause when I search for a track whose correct accentuation I’m not sure of and it doesn’t turn up.

Is this feature available in an existing product? If so, what product?
Many tools and platforms use this, e.g. Spotify and Serato.

Does a workaround currently exist?
Not really. The only options are partial searches or trying to figure out the correct diacritics.

Christiaan · July 1, 2022, 9:07pm

I completely agree, this really should be the default way to search. But the Lexicon database is based on SQLite and I don’t think it has a built-in way to do this. There might be an extension, but I haven’t really dug into that yet.
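For reference, one possible route that needs no SQLite extension is to keep a folded copy of each title in its own column and search against that. The sketch below uses Python's `sqlite3` module purely for illustration. The table `tracks`, the column `title_folded`, and the helpers `fold_accents`, `add_track` and `search` are all hypothetical and are not taken from Lexicon's actual schema or code.

```python
import sqlite3
import unicodedata


def fold_accents(text):
    """Hypothetical helper: strip combining marks and fold case."""
    if text is None:
        return None
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL statements.
conn.create_function("fold_accents", 1, fold_accents)
conn.execute(
    "CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, title_folded TEXT)"
)


def add_track(title):
    # Store the original title plus a folded copy used only for searching.
    conn.execute(
        "INSERT INTO tracks (title, title_folded) VALUES (?, fold_accents(?))",
        (title, title),
    )


def search(query):
    # Fold the query the same way, then do a substring match on the folded column.
    # Note: % and _ typed by the user still act as LIKE wildcards in this sketch.
    pattern = "%" + fold_accents(query) + "%"
    rows = conn.execute(
        "SELECT title FROM tracks WHERE title_folded LIKE ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in rows]


add_track("La Médina")
add_track("El Niño")
print(search("La Medina"))
print(search("nino"))
```

The results still show the original, accented titles. Only the lookup goes through the folded column, so existing data stays untouched apart from the extra column.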

I’ll have to experiment a bit, but the way I want it to work is that accents are ignored by default really. I can’t even type an accent on my keyboard if I wanted to…