Programming language/library that uses dataflow analysis to fetch only required data from the database


To illustrate what I am talking about, I will give some code in C# with EF Core, but the choice of C#/EF Core is merely for illustrating the concept, which would apply to any situation where you are persisting objects in a database.

public class MyEntity
{
    public int Id { get; set; }
    public bool InterestingField { get; set; }
    public string BoringField { get; set; }
    public string ReallyBigBoringField { get; set; }
}

The following function will fetch all columns from the database, despite only using one of them:

public bool IsMyFieldSet(int id)
{
    using (var context = new MyDbContext())
    {
        var entity = context.MyEntities.FirstOrDefault(e => e.Id == id);
        return entity != null && entity.InterestingField;
    }
}

Now, in most programming languages and libraries there is some way to manually specify that I only want to load the InterestingField column from the database. But looking at the code for IsMyFieldSet, it is clear that it can only ever access one field. This seems like the sort of thing a compiler could determine ahead of time using static analysis, and then optimize for me.
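For reference, this is roughly what the manual version looks like. It is only a sketch: ManualProjection is a hypothetical helper, and it takes an IQueryable<MyEntity> so that it runs against an in-memory list. With EF Core I would pass context.MyEntities instead, and I would expect the projection to be translated into a query that selects only that one column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyEntity
{
    public int Id { get; set; }
    public bool InterestingField { get; set; }
    public string BoringField { get; set; }
    public string ReallyBigBoringField { get; set; }
}

// Hypothetical helper, not part of EF Core.
public static class ManualProjection
{
    // Accepts any IQueryable<MyEntity>, e.g. context.MyEntities (assumption)
    // or an in-memory list wrapped with AsQueryable().
    public static bool IsMyFieldSet(IQueryable<MyEntity> entities, int id)
    {
        // Project to the single needed column before materializing.
        // The bool? cast lets "no matching row" come back as null.
        bool? value = entities
            .Where(e => e.Id == id)
            .Select(e => (bool?)e.InterestingField)
            .FirstOrDefault();

        // Same result as the original: false when the row does not exist.
        return value ?? false;
    }
}
```

This works, but I have to write the projection by hand and keep it in sync with whatever the function actually reads, which is exactly the bookkeeping I would like the compiler to do.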

I have tried searching for examples of this in popular languages, as well as research languages that would be more likely to try out something fancy like this, but I couldn't find anything. Are there examples of languages/libraries that will do this sort of optimization for you?

Footnote

Conceptually, you could think of this as performing dead code elimination across the language-database interface. If you read a variable but never use its value, the compiler can safely eliminate that dead code for you. If you read a field from a database and never use it, why can't the compiler do the same?
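To make the idea concrete, here is a minimal sketch of the kind of analysis I have in mind, done at runtime over a C# expression tree rather than in the compiler. UsedMemberCollector is a hypothetical name of my own, not something EF Core provides. It records which members of the lambda parameter are read, and flags the case where the whole entity escapes (for example by being passed to a method), since then every field would have to be assumed live.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class MyEntity
{
    public int Id { get; set; }
    public bool InterestingField { get; set; }
    public string BoringField { get; set; }
    public string ReallyBigBoringField { get; set; }
}

// Hypothetical analysis pass: which members of the lambda parameter are read?
public sealed class UsedMemberCollector : ExpressionVisitor
{
    private readonly ParameterExpression _parameter;

    // Names of members read directly off the entity parameter.
    public HashSet<string> Members { get; } = new HashSet<string>();

    // True if the entity is used as a whole, so all fields must be kept.
    public bool EntityEscapes { get; private set; }

    private UsedMemberCollector(ParameterExpression parameter)
    {
        _parameter = parameter;
    }

    public static UsedMemberCollector Analyze<T, TResult>(
        Expression<Func<T, TResult>> lambda)
    {
        var collector = new UsedMemberCollector(lambda.Parameters[0]);
        collector.Visit(lambda.Body);
        return collector;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        if (node.Expression == _parameter)
        {
            // A direct read such as e.InterestingField: record it and stop,
            // so the parameter underneath is not counted as an escape.
            Members.Add(node.Member.Name);
            return node;
        }
        return base.VisitMember(node);
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        // Reached only when the entity appears outside a member access.
        if (node == _parameter)
        {
            EntityEscapes = true;
        }
        return base.VisitParameter(node);
    }
}
```

Calling UsedMemberCollector.Analyze((MyEntity e) => e.InterestingField && e.Id > 0) should report InterestingField and Id as the only live members. This only covers a single lambda, though; what I am asking about is the same reasoning applied to ordinary code like IsMyFieldSet, where the entity is loaded first and read afterwards.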

I also think this could be extended beyond the scope of a single function by using a more advanced type system. You could pass around partial objects, whose types specify which fields are populated, and type inference may make it reasonably ergonomic.
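As a rough approximation of what I mean by partial objects, here is a sketch in plain C# using one interface per field. All of the names (IHasId, IHasInterestingField, IdAndInteresting, PartialObjects) are hypothetical and made up for illustration. The generic constraints state which fields a function reads, and the compiler rejects any access to a field that is not listed.

```csharp
using System;

// Hypothetical "capability" interfaces: one per column that may be populated.
public interface IHasId { int Id { get; } }
public interface IHasInterestingField { bool InterestingField { get; } }
public interface IHasBoringField { string BoringField { get; } }

// A partial object that carries only two of the columns.
public sealed class IdAndInteresting : IHasId, IHasInterestingField
{
    public IdAndInteresting(int id, bool interestingField)
    {
        Id = id;
        InterestingField = interestingField;
    }

    public int Id { get; }
    public bool InterestingField { get; }
}

public static class PartialObjects
{
    // The constraint says this function reads InterestingField and nothing else;
    // writing entity.BoringField here would not compile.
    public static bool IsSet<T>(T entity) where T : IHasInterestingField
    {
        return entity.InterestingField;
    }

    // A function that needs two fields lists both constraints.
    public static string Describe<T>(T entity)
        where T : IHasId, IHasInterestingField
    {
        return $"{entity.Id}: {entity.InterestingField}";
    }
}
```

The obvious drawback is that the constraints and the partial classes are written by hand here. What I am imagining is a language where they are inferred from how the object is used, and where the inferred set of fields is then used to build the query.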
