Dynamic fuzzy search with LINQ and entity framework and jQuery autocomplete part 3
By eidias on (tags: fuzzy search, categories: code)Search has always been my most desired feature in applications and if I were to guess, I’m not the only one with this craving. The comfort and flexibility it gives you is enormous under one condition – it’s smart enough.
Part three is up
We finished up the second part with ways to pull all the required information to run a db query.
In the not-desired-concatenate-sql-statement world it would look more or less like this:
1: var sql = new StringBuilder();
2: foreach(var type in types)
3: {
4: sql.Append("SELECT ");
5: sql.Append(string.Join(",", type.GetPropertiesToQuery()));
6: sql.Append(" FROM ");
7: sql.Append(type);
8: }
then execute that, get the result (DataSet, matrix, whatever) and start ranking, but…
First of all, concatenating sql is bad.
Second of all, entity framework doesn’t provide an api to fetch the names of tables or columns for a given type. By default there’s a convention to pluralize the type name to get the table name and use the property name as column name, but you can override the conventions (I believe that feature is available either from EF 5.0 or will be available in EF 6.0) or you can override particular things with attributes.
Third of all, even if there' would be an api or I’d write something on my own, the DbContext.Database.SqlQuery functions need a return type and because of the attribute approach I don’t know the return type.
The ideal (well, almost ideal) way to query the data would be something like this:
1: AppContext.Set<Foo>().Select(f => new { f.SearchableProperty1, f.SearchableProperty2, ... }).ToList();
this executes a query that fetches only the desired columns for a particular type and let’s me handle the result. But there are two problems here.
- If I have a list of types, then using AppContext.Set<>() is not
possibleeasy. - How can I dynamically create an anonymous type based on a list of properties
To solve problem one I did this (note that this is NOT clean code, but sometimes, you just have to live with it)
1: GetType().GetMethod("QueryValueType", BindingFlags.Instance | BindingFlags.NonPublic)
2: .MakeGenericMethod(typeFrom, typeTo)
3: .Invoke(this, new object[] { propertiesToQuery.Select(p => p.Name).ToArray() })
This allows me to create a method with the following signature:
1: private IEnumerable<SearchIndexItem> QueryValueType<T, TTo>(string[] propertiesToQuery)
and why is that important? Well, let me digress a little.
The DbContext class has 2 methods that allow you to access entities stored in the db. These are usually not the way to query the db (again for non ef users – normally to query entities, you create a property for each type you want to have queryable), but in this case they’re a perfect match. Here are their signatures:
1: public DbSet Set(Type entityType)
2: public DbSet<TEntity> Set<TEntity>() where TEntity : class
The first looks like a better fit for this case, but handling a DbSet compared to DbSet<T> is much more cumbersome and verbose, so I’d rather go with the latter and the hack presented earlier will let me do just that.
Having these in place I can run a query on the db in my QueryValueType<T> method and project the result onto my SearchIndexItem. To code looks like this:
1: var set = AppContext.Set<T>().ToList();
2: var result = set.SelectMany(x => propertiesToQuery.Select(p => new { x.ID, Value = x.GetType().InvokeMember(p, BindingFlags.GetField | BindingFlags.GetProperty, null, x, null) })
3: .Where(p => p.Value != null)
4: .Select(p => new SearchIndexItem
5: {
6: Key = p.Value.ToString(),
7: Url = Url.Action("Edit", type.Name, new { p.ID }),
8: Type = type
9: }));
That pretty much does the job. Wrap the code fragments in a few classes to keep the separation of concerns, throw in jQuery autocomplete on the client side (it’s a web project) and we’re done, right?
Wrong.
There are two more things that still need to be taken care of. The first is that the code above will do a “SELECT * FROM …” and that “*” is a problem. The second is that even if we do the query right, and pull only the columns we need, the whole operation is quite time and resource consuming.
We’ll take a look at these in the next part.
Cheers