Running a Contains operator in Azure Table Service linq query

2.8k Views Asked by At

I'd like to know why/how this query is running on a Azure Storage table, given that 'contains' are not allowed in Azure Table Service? Is this not doing what I think it's doing? It's running and fetching values. Also, is this fetching the whole table first then filtering? In the debugger it looks like it doesn't run fully until I run the ToList()?

Here is my code, the bottom line I use a contains.

List<string> partitionIds = new List<string> {"1", "2", "3"};

var table = // get table here...

var result = table.ExecuteQuery(new TableQuery<ConnectionEntity>()); 
var queryResult = result.Where(w => partitionIds.Contains(w.PartitionKey)).ToList();
2

There are 2 best solutions below

2
On BEST ANSWER

As stated on the site you provided, Azure Table service does not support the validation of a contain statement. As the data is saved in a no-SQL environment, a contains statement could require enormous amounts of power, depending on the size of your data set. At the moment, your query sends a request to the server that asks for the whole data set (result.GetAll() ). On your system, it evaluates the contains part on the data set that was returned by the server (result.where(contains).tolist() ). This way, the server doesn't evaluate your contains statement, so the server is satisfied. However, your server still needs to do a lot of work to evaluate this statement!

For your second question: The standard way of getting data in Entity Framework, is getting it just in time. This means, it only evaluates the query the moment the data is needed, which is at the moment the data is transformed to a list, when you try to loop over it, or when you try to print it. The only way to get it earlier, is by explicitly loading it by calling result.Load() instead of .toList(). After that, you can still call .toList(), but the result set was loaded at the moment you explicitly stated .Load().

0
On

I know it's an old post, but we have actual a similar problem and I didn't found something newer.

To load all data and filter them was not an option for us. Also to load the record one by one is not an acceptable solution.

So the easiest way to do it within a query is to create an multiple or conditions. To change it to something like new TableQuery<ConnectionEntity>().Where(w => w.PartitionKey == "1" || w.PartitionKey == "2" || ...).

This work's well but it has of course some limitations. With our tests we got 400 BadRequest with more then about 110 conditions.

But for if you know, the count is not that much, you can do this.

I wrote an extension method to do this on an IQueryable dynamicly like .Contains() (tested with Microsoft.Azure.Cosmos.Table library) It wasn't easy :)

Here is the code

    /// <summary>
    /// Convert Contains to a concatenated Or condition for Azure Table query support
    /// </summary>
    /// <typeparam name="T">Entity type</typeparam>
    /// <typeparam name="TParam">property type to check</typeparam>
    /// <param name="query">current query to extend</param>
    /// <param name="values">Values to proof</param>
    /// <param name="property">Which property should be proofed</param>
    /// <returns></returns>
    public static IQueryable<T> WhereContains<T, TParam>(this IQueryable<T> query, IEnumerable<TParam> values, Expression<Func<T, TParam>> property)
    {
        var enumerable = values.ToList();
        if (!enumerable.Any())
            return query;

        Expression<Func<T, bool>> predicate = null;
        var parameter = Expression.Parameter(typeof(T), "entity");
        var propertyName = ((property.Body as MemberExpression)?.Member as PropertyInfo)?.Name 
                           ?? throw new Exception("Property can't be evaluated");
        foreach (var value in enumerable)
        {
            var scope = new ExpressionScopedVariables { Value = value };
            var filterStringExp = Expression.Constant(scope);
            var getVariable = typeof(ExpressionScopedVariables).GetMember("Value")[0];
            var access = Expression.MakeMemberAccess(filterStringExp, getVariable);

            Expression<Func<T, bool>> currentExpression = Expression.Lambda<Func<T, bool>>(
                Expression.Equal(
                    Expression.Property(parameter, propertyName),
                    access), parameter);
            predicate = predicate == null ? currentExpression : Expression.Lambda<Func<T, bool>>(Expression.OrElse(predicate.Body, currentExpression.Body), predicate.Parameters);
        }

        return query.Where(predicate ?? throw new InvalidOperationException());
    }

    class ExpressionScopedVariables
    {
        // ReSharper disable once UnusedAutoPropertyAccessor.Local
        public object Value { get; set; }
    }

And the example how to use it

var query = from v in _baseRepository.AsQueryable()
            where v.PartitionKey == partitionKey
            select v;

query = query.WhereContains(entityIds, v => v.RowKey);
var entities = (await query.QueryAsync()).ToList();

_baseRepository is our own CloudTable repository implementation and AsQueryable() and QueryAsync() are extension methods to create and to execute the query