Fix search for hyphenated names

This issue was caused by a combination of two things: we have two
different, incompatible tokenizers for names, and our name normalizer
ignores all non-letter, non-digit characters.

Basically, the name tokenizer used to build the index uses ' ' as the
separator, and the one used to tokenize queries splits on all non-letter,
non-digit characters.

Take the name "Double-barrelled" as an example.  The full-text search index
for this looks like "doublebarrelled", because it's treated as one token
(there are no spaces in it), and the normalizer removes all
non-letter/non-digit characters.

On the other hand, the query term "double-barrelled" will be split into
"double" "barrelled", and internally it becomes AND-ed prefix matches
"double* AND barrelled*".  Because "barrelled*" doesn't match
"doublebarrelled", the query doesn't hit.
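
The mismatch can be reproduced with a small sketch.  This is not the actual
provider code: normalize(), index_tokens(), query_tokens() and matches() are
hypothetical names, and the AND-ed prefix match is modelled with plain string
prefixes rather than a real full-text index.

```python
import re


def normalize(token):
    # Hypothetical normalizer: drop every non-letter, non-digit character.
    return "".join(ch for ch in token if ch.isalnum()).lower()


def index_tokens(name):
    # Index side: ' ' is the only separator, then each token is normalized.
    return [t for t in (normalize(p) for p in name.split(" ")) if t]


def query_tokens(query):
    # Query side: split on every non-letter, non-digit character.
    return [t.lower() for t in re.split(r"[\W_]+", query) if t]


def matches(name, query):
    # AND-ed prefix match: each query token must prefix some index token.
    index = index_tokens(name)
    return all(any(tok.startswith(q) for tok in index)
               for q in query_tokens(query))


print(index_tokens("Double-barrelled"))   # one token for the whole name
print(query_tokens("double-barrelled"))   # two tokens
print(matches("Double-barrelled", "double-barrelled"))
```

In this sketch the last call returns False, because "barrelled" is not a
prefix of the single index token "doublebarrelled".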

So (for now) let's split names on '-' when building the index.  With this
CL the index will be "double barrelled" and the query "double-barrelled"
(and also "double barrelled") *will* hit this.
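
A minimal sketch of the idea, assuming the same simplified model of the two
tokenizers described above.  index_tokens_fixed() and the other helper names
are hypothetical and do not correspond to functions in this CL.

```python
import re


def normalize(token):
    # Hypothetical normalizer: drop every non-letter, non-digit character.
    return "".join(ch for ch in token if ch.isalnum()).lower()


def index_tokens_fixed(name):
    # Index side after the change: split on ' ' and on '-'.
    parts = re.split(r"[ -]+", name)
    return [t for t in (normalize(p) for p in parts) if t]


def query_tokens(query):
    # Query side (unchanged): split on every non-letter, non-digit character.
    return [t.lower() for t in re.split(r"[\W_]+", query) if t]


def matches(name, query):
    # AND-ed prefix match: each query token must prefix some index token.
    index = index_tokens_fixed(name)
    return all(any(tok.startswith(q) for tok in index)
               for q in query_tokens(query))


print(index_tokens_fixed("Double-barrelled"))
print(matches("Double-barrelled", "double-barrelled"))
print(matches("Double-barrelled", "double barrelled"))
```

Here the index holds the two tokens "double" and "barrelled", so both query
forms match in the sketch.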

Long-term we probably need a better fix.

Bug 5592553

Change-Id: I34bfa8647eec8d203f8ff7fc8a85f42505054c7c
4 files changed