Dynamic fuzzy search with LINQ and entity framework and jQuery autocomplete part 2
By eidias on (tags: fuzzy search, categories: code)Search has always been my most desired feature in applications and if I were to guess, I’m not the only one with this craving. The comfort and flexibility it gives you is enormous under one condition – it’s smart enough.
It’s time for part 2
In the first part I gave an introduction on why I wanted to implement a custom search. Now, I’m going to show you how I did that.
As a starting point I needed to figure out how to do fuzzy search and what type of data it can produce.
I did that when I first stumbled on the subject and it turned out to be a lot of fun. That resulted in a small lib that you can look up here.
The algorithm I decided to utilize is the Longest Common Subsequence (LCS) length which is used to rank entries. These entries are later sorted by that rank and the top N are displayed. There is one caveat here – the LCS length is O(n*m) where n is the length of the term typed in by the user and m is the length of the word that’s being ranked. I optimized the memory footprint compared to the original algorithm but the number of operations didn’t change – so this will be a bottle neck at some point because this algorithm needs to be applied to every indexed string… which makes it… ugh… O(n3) – but for now, I’ll let it slide.
The second step was to create a search index which seemed to be an easy task – for each entity you want to have searchable run a db query and return the results – but that would mean I’d have to do very similar thing multiple times and I wanted to keep the index creation as flexible and compact as possible (obviously). So the idea – attributes. Here’s how I wanted to have it:
1: public class Foo
2: {
3: [Searchable]
4: public string Bar {get;set;}
5: [Searchable]
6: public string Buzz {get;set;}
7: public Frob Frob {get;set;}
8: public byte[] Bytes {get;set;}
9: // ...
10: }
11:
12: public class Frob
13: {
14: [Searchable]
15: public string Qux {get;set;}
16: public string Bleach {get;set;}
17: // ...
18: }
That way I could choose which property should be searchable and which not (note that I didn’t put the attribute on a class).
The searchable classes are part of the domain model and part of the class that inherits from DbContext (for non entity framework users – that’s the class that holds reference to all types that are stored in the database). I call that class AppContext.
At this point I can say what I want to query.
This is how I get the types that need to be queried:
1: var assembly = Assembly.GetAssembly(typeof(IAppContext));
2: var types = assembly.GetTypes()
3: .Where(t => t.GetProperties().Any(p => p.GetCustomAttributes(typeof(SearchableAttribute), true).Any())
4: && !t.IsAbstract
5: && typeof(Entity).IsAssignableFrom(t))
6: .ToList();
As you can see, there is a limitation here – each searchable entity needs to inherit from Entity which is a class that holds an ID. Similarly I can get a list of properties for each type (stored under the variable typeFrom)
1: var propertiesToQuery = typeFrom.GetProperties().Where(p => p.GetCustomAttributes(typeof(SearchableAttribute), true).Any() || p.Name == "ID").ToArray();
So now I have all the information to run the query and we’ll go through that in the next post.
Cheers