Thursday, March 8, 2012

Getting Distinct Records using LINQ

LINQ provides us with a handy way of obtaining Distinct records from a collection of objects. The collection of objects has to be a generic collection and could be anything from a records returned by an LINQ to SQL or  objects returned by a WCF service or your very own collection of objects. The Distinct Operator faithfully queries a sequence and gets the distinct items matching a specific criteria. (That's a no-brainer!) 
The method is overloaded. By default, if you invoke the distinct method on an enumeration or collection, you do not need to pass any function parameters. If you are invoking the distinct method on an enumeration returned by say querying a SQL table, the distinct will by default act on the primary key. If you want to find out the specific records with distinct values for a particular column, you need to use an IEqualityComparer or Type T. 
Enough talk....let's try some code!
For the example, I'm going to use the NorthWind products table. We can see from the diagram below that there is a relationship between the Products and Categories in the NorthWind database.

What I will do is use the repository that I have created here and get the list of Products from the Products table. First I will perform a Distinct() on the records returned and see what happens.


            //objProductsRep is my concrete repository of the abstract
            //data repository which I have created for the NorthWind database
            ProductRepository objProductsRep = new ProductRepository();

            //gv_Products is the gridview which I am using in the aspx file
            //to display the data.
            gv_Products.DataSource = objProductsRep.GetAllProducts()
                .Distinct()
                .ToList<Product>();
            gv_Products.DataBind();


This generates a GridView as shown below:

So this brings us to the important and all consuming question of how do we get Distinct values for say, the Categories; i.e. how many distinct Categories of Products are there? (Believe me, sometimes it is important to know such things; stock markets collapse and people go bankrupt for the lack of such knowledge)
But let us not digress and get back to the topic at hand. Below is the code of class which uses the IEqualityComparer to filter only the Distinct CategoryIDs.



    /// <summary>
    /// This class extends the IEqualityComparer to compare two products and find out
    /// if their categories are unique. This is like comparing apples and oranges(no pun
    /// intended!) and finding out if they both belong to the category fruits. The  
    /// IEqualityComparer is of type Product
    /// </summary>

    public class ProductsComparer : IEqualityComparer<Product>
    {
        /// <summary>
        /// This is the method which does the comparison
        /// </summary>
        /// <param name="P1">The first product to compare</param>
        /// <param name="P2">The second product to compare</param>
        /// <returns>Return true if there is a match for the categories
        /// else false</returns>
        public bool Equals(Product P1, Product P2)
        {
            if (P1.CategoryID == P2.CategoryID)
                return true;
            return false;
        }
        /// <summary>
        /// This method helps the comparison by producing the hashkey
        /// of the category ID column. It is understood if the CategoryID
        /// is same then the products belong to the same category
        /// </summary>
        /// <param name="P">The product</param>
        /// <returns>the integer hashcode</returns>
        public int GetHashCode(Product P)
        {
            if (Object.ReferenceEquals(P, null)) return 0;

            int hashProductID = P.CategoryID == null ? 0 : P.CategoryID.GetHashCode();

            return hashProductID;
        }
    }


Now we invoke the same Distinct method(well, actually the overloaded version of it) on the collection of Products returned by the Product Repository, but this time pass an object of ProductsComparer class as a function parameter.

            //objProductsRep is my concrete repository of the abstract
            //data repository which I have created for the NorthWind database
            ProductRepository objProductsRep = new ProductRepository();

            //gv_Products is the gridview which I am using in the aspx file
            //to display the data. I have ordered the results by Category ID
            //in an attempt to make it a little more pleasing to the eye
             gv_Products.DataSource = objProductsRep.GetAllProducts()
                .Distinct(new ProductsComparer())
                .OrderBy(P => P.CategoryID)
                .ToList<Product>();
             gv_Products.DataBind();



And the result is the following GridView:

References


1 comment:

Anonymous said...

Good & helpful post.