Linq GroupBy and Filter

92 Views Asked by At

I have Parent child one to many relationship between Alerts & Matches Table. I am stuck in applying filter on the child table property Following is the query, and scenarios is an incoming list of integers.

var groupedQuery = 
    from a in _clientContext.Alerts
    join m in _clientContext.AlertsMatches
        on a.AlertID equals m.alertID
    group new { a, m } by new
    {
        a.AlertID,
        a.AlertDate,
        a.AlertScore,
    } into gr
    select new
    {
        AlertId = gr.Key.AlertID,
        AlertDate = gr.Key.AlertDate,
        AlertScore = gr.Key.AlertScore,
        ScenarioIds = gr.Select(x => x.m.Scenario.ScenarioId).ToList(),
    };

var query = 
    from item in groupedQuery
    select new AlertsDTO
    {
        AlertId = item.AlertId,
        AlertDate = item.AlertDate,
        ScenarioIds = ScenarioIds
    };

if (scenarios != null && scenarios.Length > 0)
            {
                // some condition to filter only alerts where some scenarioIds match
                query = query.Where(w => w.ScenarioIds.Any(id => id == 2));

                // another condition to filter only alerts where all scenarioIds match
               query = query.Where(w => w.ScenarioIds.All(id => id == 2));
            }

On the filter clause I am getting the following error

System.InvalidOperationException: The LINQ expression 'DbSet() .LeftJoin( inner: DbSet(), outerKeySelector: a => (object)a.AlertID, innerKeySelector: a0 => (object)a0.alertID, resultSelector: (a, a0) => new TransparentIdentifier<Alerts, AlertsMatches>( Outer = a, Inner = a0 )) .GroupBy(ti => new { AlertID = ti.Outer.AlertID, AlertDate = ti.Outer.AlertDate, AlertScore = ti.Outer.AlertScore }) .Select(g => new { AlertId = g.Key.AlertID, AlertDate = g.Key.AlertDate, AlertScore = g.Key.AlertScore, ScenarioIds = g .AsQueryable() .LeftJoin( inner: DbSet(), outerKeySelector: e => EF.Property<int?>(e.Inner, "ScenarioID"), innerKeySelector: s => EF.Property<int?>(s, "ScenarioID"), resultSelector: (o, i) => new TransparentIdentifier<TransparentIdentifier<Alerts, AlertsMatches>, Scenario>( Outer = o, Inner = i )) .Select(e => e.Inner.ScenarioID) .Distinct() .ToList() }) .Where(e1 => e1.ScenarioIds .Any(id => id == 2))' could not be translated

And below are the respective classes

public class AlertsDTO { 
    public int AlertId {get;set;}
    public DateTime AlertDate {get;set;}
    public string Scenario {get;set;}
}

public class Scenario { 
    public int ScenarioId {get;set;}
    public string ScenarioName {get;set;}
}

public class Alerts{
    public int AlertID {get;set;}
    public DateTime AlertDate {get;set;}
    public int AlertScore {get;set;}
}

public class AlertsMatches
{
    public int alertID { get; set;}
    public Scenario Scenario {get;set;}
    
}
2

There are 2 best solutions below

0
jdweng On

See below. I did not test so it may need some minor changes

var groupedQuery = 
    (from a in _clientContext.Alerts
    join m in _clientContext.AlertsMatches on a.AlertID equals m.alertID
    select new {a = a, m = m }
    ).GroupBy(x => new { id = x.a.AlertID, date = x.a.AlertDate, score = x.a.Alert.Score})
    .Select(gr => new {
        AlertId = gr.Key.id,
        AlertDate = gr.Key.date,
        AlertScore = gr.Key.score,
        ScenarioIds = gr.Select(y => y.m.Scenario.ScenarioId).ToList(),
    });
0
Harald Coppoolse On

You start your question with:

On the filter clause I am getting an error

Are you sure that you are getting the error while still creating the query without executing it? So in the statement query = query.Where(w => w.ScenarioIds.Any(id => id == 2));, or the other where?

What does this have to do with my question?

Well, an IQueryable seems like an IEnumerable. However, an IEnumerable represents a sequence of similar objects. You can ask for the first object of the sequence, and as long as you've got an object, you can ask for the next object of the sequence. The IEnumerable holds everything to be able to do this.

An IQueryable however - although it also implements IEnumerable - doesn't represent a enumerable sequence, it represents the potential to fetch an IEnumerable sequence.

To be able to do this, an IQueryable holds an Expression and a Provider. The Expression holds in some generic format what data must be fetched, the Provider knows where the data must be fetched from (usually a Database Management System), and what language is used to communicate with this DBMS (usually something like SQL).

If you look closely to LINQ statements, you'll notice that there are two types of LINQ statements. Those that return an IQueryable<TResult> and the others.

Examples of the first group are Join, Where, Select. Examples of the last group are ToList, ToArray, Count, FirstOrDefault, Any.

IQueryables of the first group don't execute the query, they only change the Expression. As long as you concatenate LINQ statements from this group, the query is not executed, the database is not contacted. The value of the query is an IQueryable<TResult>.

Only after you start enumerating by using a function of the other group, like ToList, the query is executed: the Expression is sent to the Provider, who will translate the query into SQL and fetch the data from the database. The fetched data is returned as an enumerable sequence.

You get the error The LINQ expression ... could not be translated. I would have expected that you got this error after starting enumerating, not after concatenating several LINQ statements.

** Ok, but how do I solve my problem **

I'm not sure what your requirements are. You give us a LINQ statement, and says that it doesn't do what you want, but you didn't write what you want.

You stated that there is a one-to-many relation between Alerts and Matches: every Alert has zero or more Matches, every Match belongs to exactly one Alert, namely the Alert that the foreign key AlertID refers to.

Requirement: Give me (several properties of) the Alerts, each Alert with (several properties of) the Matches of this Alert.

In your code:

var groupedQuery = 
from a in _clientContext.Alerts
join m in _clientContext.AlertsMatches
    on a.AlertID equals m.alertID
group new { a, m } by new
{
    a.AlertID,
    a.AlertDate,
    a.AlertScore,
} into gr
select new
{
    AlertId = gr.Key.AlertID,
    AlertDate = gr.Key.AlertDate,
    AlertScore = gr.Key.AlertScore,
    ScenarioIds = gr.Select(x => x.m.Scenario.ScenarioId).ToList(),
};

The query isn't executed yet, but if you add a debug statement, something like the following, I think you'll see the error.

var debugResult = groupdQuery.Take(10).ToList();

My LINQ query syntax is a bit rusty. I think that the error has something to do with your GroupBy. Parameter keySelector seems a bit odd to me.

Anyway, I always use method syntax. To fetch Alerts with their Matches, I use the overload of Queryable.GroupJoin that has a parameter resultSelector

// GroupJoin Alerts with AlertMatches
var alertsWithTheirScenarioIds = _clientContext.Alerts.GroupJoin(
_clientContext.AlertsMatches,
alert => alert.AlertId,         // from every Alert take the AlertId
match => match.AlertId,         // from every AlertMatch take the foreign key
                                // to the Alert that it belongs to

// parameter resultSelector: from every Alert,
// with all its zero or more AlertMatches, make one new:
(alert, matchesOfThisAlert) => new
{
    Id = alert.AlertId,
    Date = alert.AlertDate,
    Score = alert.AlertScore,

    // fetch the Ids of all matches that belong to this alert:
    ScenarioIds = matchesOfThisAlert.Select(match => match.ScenarioId).ToList(),
});

var debugResult = alertsWithTheirScenarioIds.Take(10).ToList();

Does this solve your problem?

Let's continue:

var query = from item in groupedQuery
select new AlertsDTO
{
    AlertId = item.AlertId,
    AlertDate = item.AlertDate,
    ScenarioIds = ScenarioIds
};

It seems to me that you create the same sequence as in groupedQuery, only without the Score. If you don't want to use the Score, why select it in the groupedQuery?

Anyway, if you haven't found the error yet, you can also add a debug statement here:

var debugResult2 = query.Take(10).ToList();

If this works correctly, let's move on to your next statement:

if (scenarios != null && scenarios.Length > 0)
{
    ...
}

Apparently scenarios is an array of something.

if (scenarios != null && scenarios.Length > 0)
{
    // keep only those Alerts that have at least one ScenarioId equal to 2
    query = alertsWithTheirScenarioIds
        .Where(alert => alert.ScenarioIds.Any(scenarioId => scenarioId == 2));

    // You know the drill by now. What happens if you add the debug statement:
    var debugResult3 = query.Take(10).ToList();

    // in your code you don't use the previous query, you immediately change it:
    // keep only those alerts that only have ScenarioIds equal to 2
    query = alertsWithTheirScenarioIds
        .Where(alert => alert.ScenarioIds.All(scenarioId => scenarioId == 2));

    var debugResult4 = query.Take(10).ToList();
}

If changing the code into method Syntax didn't help, I think that adding the debug statement helps you finding where exactly the error occurs.