July05

LINQ Group By with NULL database values

LINQ is fantastic for the ability to write queries that express intent much more clearly than the same SQL, or structured code.  One problem that I have run into though is handling NULL database values that are part of a group by statement. 

Grouping by ProductSKU

Grouping in LINQ allows you to return sets of data from a collection for a given key value.  The group by clause is what the key ends up being in the result set.  Lets take a grouping of the Products by the SKU.

from p in Products
    group p by p.ProductSKU

Enumerable IGrouping collection

This results from the group by are enumerable groups ( IGrouping<String, Product> ) with the String being the Key for the groups (the ProductSKU field from the table).  The typical way you walk through this result is a nested for loop.

var groups = from p in Products
    group p by p.ProductSKU;
    
foreach( var groupentry in groups )
{
    Console.WriteLine( "Group: {0}", groupentry.Key );
    
    foreach( var groupitem in groupentry )
    {
        Console.WriteLine("Product: {0}", groupitem.ProductSKU);
    }
}
    

I end up with a list that looks something like this:

Group: VDB4DBA
Product: VDB4DBA
Group: VDB4DMW
Product: VDB4DMW
Group: VDB4PARTNER
Product: VDB4PARTNER

This works, but not what I really wanted.  In this case the first four characters of the SKU are the same per product family (VDB4 for all VistaDB 4 SKUs).  I would like to be able to group by only those first four characters instead of the complete ProductSKU.  You can do this with the following code.

from p in Products
    group p by p.ProductSKU.Substring(0, 4)

What if there are NULL entries?

But what happens if there is a NULL entry in the ProductSKU?  You get a ConstraintException: The property cannot be set to a null value.

More...