HowTo – Linq in C#

Source Code

All source code can be found on GitHub here.

My cheat sheet for Linq can be found here.

This is part of my HowTo in .NET series. An overview can be seen here.

Linq Intro

LINQ provides a consistent and common API for querying and manipulating data sets from a range of data sources such as databases, XML and in-memory data sets.

It provides functionality for querying and manipulating set data similar to SQL.

LINQ queries are strongly typed; the compiler will check for syntax and type safety.

Getting Started

To access LINQ you must import the System.Linq namespace from within System.Core.dll.

Linq can be performed upon any collection implementing IEnumerable<T>, which includes arrays.

A simple example could be:

var oldPeople = from p in people
where p.Age > 30
select p;

Which selects all people who are older than 30 into a collection implementing IEnumerable<Person>.

The ‘natural language’ like syntax compiles down into extension methods which we could have called directly.

var oldPeople = people.Where( x => x.Age > 30 );

Deferred Execution

Queries are not actually evaluated until you iterate over the sequence; the execution is deferred. This means that if you modify the data and iterate over the same query again, the query is executed again; a query which is enumerated twice runs twice. This allows you to rely on your results always reflecting the latest version of the data. It does need to be treated with care to prevent unnecessary execution.

Immediate execution can be forced by any number of extension methods defined by the Enumerable type such as ToArray(), ToDictionary<TSource,TKey>(), and ToList().
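As a rough sketch of the difference between deferred and immediate execution (the class name DeferredExecutionDemo and the sample ages are made up for illustration and are not part of the original source code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns { count seen by the deferred query, count seen by the snapshot }.
    public static int[] Run()
    {
        var ages = new List<int> { 25, 35, 45 };

        // Deferred: only the query is defined here, nothing is evaluated yet.
        IEnumerable<int> deferred = ages.Where(a => a > 30);

        // Immediate: ToList() runs the query now and stores the results.
        List<int> snapshot = ages.Where(a => a > 30).ToList();

        // Modify the source after both queries have been defined.
        ages.Add(55);

        // The deferred query runs here and sees the new element; the snapshot does not.
        return new[] { deferred.Count(), snapshot.Count };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine($"deferred: {counts[0]}, snapshot: {counts[1]}"); // deferred: 3, snapshot: 2
    }
}
```

Calling ToList() or ToArray() once and reusing the result is usually the simplest way to avoid re-running an expensive query.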

LINQ Queries and Non-Generic Collections

Apart from System.Array, non-generic collections (such as ArrayList) cannot be used with Linq directly; however, we can convert them into a generic collection (one which implements IEnumerable<T>) with the OfType<T>() function.

var goodCollection = BadCollection.OfType<TypeName>();
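A small sketch of this, assuming a non-generic ArrayList holding mixed values (NonGenericDemo and NamesFrom are hypothetical names used only for this example):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class NonGenericDemo
{
    public static List<string> NamesFrom(ArrayList mixed)
    {
        // OfType<string>() keeps only the strings and yields an IEnumerable<string>,
        // after which the usual Linq extension methods are available.
        return mixed.OfType<string>().Where(s => s.Length > 3).ToList();
    }

    public static void Main()
    {
        var mixed = new ArrayList { "Anna", 42, "Bob", "Charlie", 3.5 };
        Console.WriteLine(string.Join(", ", NamesFrom(mixed))); // Anna, Charlie
    }
}
```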

Lambda and Anonymous Delegates

Linq makes extensive use of anonymous delegates and Lambdas. If you are not up to speed on this you might want to read my post on this here.

Anonymous Types

Linq makes extensive use of anonymous types. If you are not up to speed on this you might want to read this <http://msdn.microsoft.com/en-gb/library/vstudio/bb397696.aspx>, though in short, anonymous types allow creation of types on the fly similar to JSON.

var v = new { Name = "John", Age = 30 };

Filters

Filtering provides a way of removing entities which are not required; a predicate or where clause.

The previous example used where, in both the extension method and natural language forms, to filter people based upon their age.

var youngPeople = people.Where (x => x.Age < 30);
var count = (from p in people
             where p.Age < 30
             select p).Count ();

The Where extension method can also provide the element's ordinal position (index).

var youngPeople = people.Where (( x, index ) => index <= 4 && x.Age < 30);

Any property, member variable or function can be called or drilled into as part of our query. The following example shows how to count people who have children; Children returns the collection of children that a parent has.

var count = people.Where (x => x.Children.Count () > 0).Count ();
var count = (from p in people
             where p.Children.Count () > 0
             select p).Count ();

Element

The element functions allow selection of an element from the collection; the execution is not deferred but executed immediately.

First allows the selection of the first element in the collection.

var samplePeople = people.OrderBy (x => x.Name).First ();

First can also take a predicate delegate and be used to find the first element which passes the predicate.

var samplePeople = people.OrderByDescending (x => x.Name).First (x => x.Gender == Gender.Male);

The problem with First() is that if there are no elements in the collection, or no elements which pass the predicate, an InvalidOperationException is thrown at run time. FirstOrDefault() returns default(T) instead to prevent this error from happening. default(T) provides the default value for the type, e.g. 0 for int, or null for a reference type.

var samplePeople = people.Where (x => x.Name == "No Match").FirstOrDefault ();
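The behaviour of First against FirstOrDefault can be sketched as follows; FirstDemo and its sample arrays are illustrative only and not taken from the original source:

```csharp
using System;
using System.Linq;

public static class FirstDemo
{
    // Returns true when First() throws because nothing matches the predicate.
    public static bool FirstThrowsOnNoMatch(string[] names)
    {
        try
        {
            names.First(n => n == "No Match");
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        string[] names = { "John", "Jane" };
        int[] numbers = { 1, 2, 3 };

        Console.WriteLine(FirstThrowsOnNoMatch(names));                        // True
        Console.WriteLine(names.FirstOrDefault(n => n == "No Match") == null); // True
        Console.WriteLine(numbers.FirstOrDefault(n => n > 10));                // 0
    }
}
```

Note that for value types the default (0 here) can be indistinguishable from a real match, so a null check only works for reference or nullable types.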

Any element can be referenced and retrieved by its ordinal position with the ElementAt function.

var samplePeople = people.OrderBy (x => x.Name).ElementAt (0);

Grouping

Grouping is similar to SQL grouping with one major exception. In SQL any data projected into the result set has to either be a field which was provided in the group list or an aggregate.

Linq simply partitions the collection into distinct groups with all data from all elements available regardless of their involvement in the group list. Combined with anonymous types we can have a lot of control over the grouping and the result set.

The following example groups people by gender and then creates an anonymous type containing the gender and the count of people in each group.

Linq automatically provides the Key field on the grouped result set; here we use it to populate the ‘Key’ field which we create on the anonymous type and will contain the gender.

// Group by gender and count...
var samplePeople = people.GroupBy (x => x.Gender).OrderBy (x => x.Key).
	Select (grp => new { Key = grp.Key,  Value = grp.Count ()});
// Group by gender and count...
var samplePeople = from p in people
	group p by p.Gender into gens
	select new { Key = gens.Key, Value = gens.Count()};

As previously mentioned all data is available from the grouped data. This next example provides the value element in our anonymous type as an ordered list of people by name with all the original data intact.

// Group by gender, ordering the people within each group by Name
var samplePeople = people.GroupBy (x => x.Gender).OrderBy (x => x.Key).
	Select (grp => new { Key = grp.Key,  Value = grp.OrderBy (x => x.Name)});
var samplePeople = from p in people
	group p by p.Gender into gens
	select new { Key = gens.Key, Value = gens.OrderBy( x => x.Name)};

We can also group on multiple keys via anonymous types.

var samplePeople = people.GroupBy (key => new { key.Gender, key.Age},
(key, group ) => new { key.Gender, key.Age, Count = group.Count()});
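Grouping on an anonymous type works because anonymous types compare by value, so two keys with the same Gender and Age land in the same group. A minimal sketch, using tuples in place of the Person class (GroupingDemo and the sample data are assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    public static List<string> CountByGenderAndAge((string Gender, int Age)[] people)
    {
        // The anonymous type acts as a composite key for the grouping.
        return people
            .GroupBy(p => new { p.Gender, p.Age })
            .Select(g => $"{g.Key.Gender}/{g.Key.Age}: {g.Count()}")
            .ToList();
    }

    public static void Main()
    {
        var people = new[]
        {
            (Gender: "F", Age: 30),
            (Gender: "F", Age: 30),
            (Gender: "M", Age: 30),
            (Gender: "F", Age: 25),
        };

        // Groups are produced in the order their keys first appear.
        foreach (var line in CountByGenderAndAge(people))
            Console.WriteLine(line);
        // F/30: 2
        // M/30: 1
        // F/25: 1
    }
}
```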

Projection

Projection provides a way of creating a new collection from the current set by projecting the existing data into it, along with other Linq functions such as Count() and Sum() or pretty much anything you want.

Select provides a way of selecting a combination of fields into the result set. Here we create a list of strings by projecting the Name field into the result set.

var allFemaleNames = people.Where (x => x.Gender == Gender.Female).Select (x => x.Name);
var allFemaleNames = from p in people
                     where p.Gender == Gender.Female
                     orderby p.Name descending
                     select p.Name;

Projection is often used with anonymous types to create new types on the fly as required. Here we create a new type with Age and Name of all females.

var parentsNameAndAge = people.Where (x => x.Gender == Gender.Female).Select (x => new { Name = x.Name, Age = x.Age });
var parentsNameAndAge = from p in people
	where p.Gender == Gender.Female
	select new { Name =  p.Name, Age = p.Age };

SelectMany provides a way of flattening a collection of collections. Children is the collection of children for a parent. Here we gather all children into one collection. A where clause is applied to people to select only females, and then to the flattened children collection to select only boys; i.e. all boys who have mums.

var boysWithFemaleParents = people.Where (x => x.Gender == Gender.Female).
	SelectMany (x => x.Children).
		Where (x => x.Gender == Gender.Male);
var boysWithFemaleParents = from parent in people
	where parent.Gender == Gender.Female
	from children in parent.Children
	where children.Gender == Gender.Male
	select children;
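To make the flattening concrete, here is a small stand-alone sketch; SelectManyDemo and the tuple-based family data are hypothetical stand-ins for the Person class used above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectManyDemo
{
    public static List<string> AllChildren((string Parent, string[] Children)[] families)
    {
        // Select would give one string[] per parent; SelectMany merges them into one sequence.
        return families.SelectMany(f => f.Children).ToList();
    }

    public static void Main()
    {
        var families = new[]
        {
            (Parent: "Ann", Children: new[] { "Tom", "Sue" }),
            (Parent: "Bea", Children: new string[0]),
            (Parent: "Cat", Children: new[] { "Max" }),
        };

        Console.WriteLine(string.Join(", ", AllChildren(families))); // Tom, Sue, Max
    }
}
```

Parents without children simply contribute nothing to the flattened result.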

Ordering

Ordering functions provide…… ** wait for it ** …. ordering functionality.

Elements in a collection can be ordered ascending with the OrderBy function.

var orderdPeople = people.OrderBy (x => x.Name);
var orderdPeople = from p in people
	orderby p.Name
	select new { p.Name, p.Age, p.Children, p.Gender };

The OrderBy() function can take a class implementing IComparer<T> to provide any required comparisons. The StringComparer class contains a few predefined comparisons for strings with and without case sensitivity.

var orderdPeople = people.OrderBy (x => x.Name, StringComparer.CurrentCultureIgnoreCase);

Ordering can also be made descending by a field.

var orderdPeople = people.OrderByDescending (x => x.Name);
var orderdPeople = from p in people
	orderby p.Name descending
	select new { p.Name, p.Age, p.Gender, p.Children };

Ordering can also be made upon multiple fields by using either ThenBy or ThenByDescending after OrderBy or OrderByDescending.

var orderdPeople = people.OrderBy (x => x.Age).ThenByDescending (x => x.Name);

Natural language Linq supports multiple ordering fields, similar to SQL, by separating the field names with commas.

var orderdPeople = from p in people
	orderby p.Age, p.Name descending
	select new { p.Name, p.Age, p.Gender, p.Children };
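The following sketch checks that the two forms produce the same ordering; OrderingDemo and the tuple data are assumptions made for the example rather than code from the original repository:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Extension method form: age ascending, then name descending within the same age.
    public static List<string> WithExtensionMethods((string Name, int Age)[] people) =>
        people.OrderBy(p => p.Age).ThenByDescending(p => p.Name).Select(p => p.Name).ToList();

    // Natural language form of the same ordering.
    public static List<string> WithQuerySyntax((string Name, int Age)[] people) =>
        (from p in people
         orderby p.Age, p.Name descending
         select p.Name).ToList();

    public static void Main()
    {
        var people = new[] { (Name: "Zoe", Age: 30), (Name: "Adam", Age: 30), (Name: "Bob", Age: 25) };

        Console.WriteLine(string.Join(", ", WithExtensionMethods(people))); // Bob, Zoe, Adam
        Console.WriteLine(WithExtensionMethods(people).SequenceEqual(WithQuerySyntax(people))); // True
    }
}
```

Calling OrderBy a second time instead of ThenBy would discard the first ordering rather than refine it.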

Elements can be reversed with the Reverse() method.

var orderdPeople = people.OrderBy (x => x.Name).Reverse ();

Some of the Linq extension methods are not supported through natural Linq, though the extension methods can be called on the result of a natural language query.

var orderdPeople = (from p in people
	orderby p.Name
    select new { p.Name, p.Age, p.Gender, p.Children }).Reverse ();

Quantifiers

Quantifiers provide functionality to assess if any or all elements in the collection pass a predicate.

Any() returns true if any element passes a predicate. It can be called before First(predicate) to guard against the exception, though this is quite wasteful as the collection is searched twice; FirstOrDefault(predicate) avoids the second pass.

var isAnyPeople = people.Any (x => x.Gender == Gender.Unknown);

Any() also has a parameterless overload, which can be called to assess whether a collection or the current Linq expression will evaluate to an empty collection or not.

var isAnyPeople = (from p in people
                   where p.Gender == Gender.Unknown
                   select p).Any ();

Here is a good example of allowing a combination of natural Linq and Linq extension methods. Children property returns a collection of children for that person.

var isAnyPeople = from p in people
                  where p.Children.Any (child => child.Gender == Gender.Unknown)
                  group p by p.Gender into genders
                  select new { Gender = genders.Key, People = genders };

Whenever an IEnumerable is provided a new Linq expression can be started. Here we are selecting any parents who have boys for children.

var isAnyPeople =
	people.Where (z => z.Children.Any (child => child.Gender == Gender.Male)).
                GroupBy (x => x.Gender).
				Select (y => new { Gender = y.Key, People = y });

The All() function can be used to make sure that all elements in the collection pass the predicate; there is no parameterless overload.

var allMale = people.All (x => x.Gender == Gender.Male);
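One detail worth keeping in mind is how the quantifiers treat an empty collection: All() is vacuously true, while Any() is false. A quick sketch (QuantifierDemo is a made-up name for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuantifierDemo
{
    public static bool AllOver100(IEnumerable<int> numbers) => numbers.All(n => n > 100);

    public static bool AnyOver100(IEnumerable<int> numbers) => numbers.Any(n => n > 100);

    public static void Main()
    {
        var empty = new List<int>();

        Console.WriteLine(AllOver100(empty)); // True  (no element fails the predicate)
        Console.WriteLine(AnyOver100(empty)); // False (no element passes the predicate)
        Console.WriteLine(empty.Any());       // False (the collection is empty)
    }
}
```

So a predicate such as "all children are girls" is also satisfied by a person who has no children at all.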

Below is an example of combining All with grouping. We group the data by gender and keep only the groups in which every person has only girls for children; any group containing a person with a child who is not a girl is omitted from the result set.

var isAll = from p in people
	group p by p.Gender into genders
	where genders.All( x => x.Children.All (child => child.Gender == Gender.Female))
	select new { Gender = genders.Key, Value = genders};

Conversions

Conversions provide a way of converting or translating data. We have already talked about ToArray() and ToList(), which can be used to force immediate execution and push the result set into a collection.

var samplePeople = people.OrderBy (x => x.Name).ToArray ();
var samplePeople = people.ToList ();

ToDictionary creates a dictionary of all elements keyed upon the value returned by the Func delegate provided. The example below creates a Dictionary<string,Person> where the key is the person's name.

var samplePeople = people.ToDictionary (x => x.Name);
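ToDictionary requires the keys to be unique; a duplicate key results in an ArgumentException. When duplicates are possible, ToLookup is one alternative. The sketch below uses plain strings keyed by their first letter; DictionaryDemo and the sample names are illustrative assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryDemo
{
    // Returns true when keying the names by their first letter hits a duplicate key.
    public static bool ThrowsOnDuplicateKey(IEnumerable<string> names)
    {
        try
        {
            names.ToDictionary(n => n[0]);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        string[] names = { "Anna", "Adam", "Bob" };

        Console.WriteLine(ThrowsOnDuplicateKey(names)); // True ('A' appears twice)

        // ToLookup allows several values per key.
        ILookup<char, string> byInitial = names.ToLookup(n => n[0]);
        Console.WriteLine(byInitial['A'].Count()); // 2
    }
}
```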

OfType, as mentioned previously, is a way to convert a non-generic collection into an IEnumerable<T> so that Linq can be executed on it. It actually creates a collection of only the specified type, omitting anything which is not of the provided type; remember that a non-generic collection is a collection of objects.

Below we have a list of objects holding ints and decimals; calling OfType<decimal>() creates an IEnumerable<decimal> of the two decimal elements within the list. All other elements are omitted.

// Filter out those which are not of type decimal
var sampleIntNumbers = new List<object> (){ 1,2,3,4,5,6m,7,8m,9,10};
var sampleDecimalNumbers = sampleIntNumbers.OfType<decimal> ().ToList ();

This might not be the required functionality; we could have used either Cast or ConvertAll to create a collection with all elements contained within.

ConvertAll, which is a method on List<T> rather than a Linq extension method, converts every element with the delegate provided.

var sampleObjNumbers = new List<int> (){ 1,2,3,4,5,6,7,8,9,10};
var sampleIntNumbers = sampleObjNumbers.ConvertAll<double> (x => Convert.ToDouble (x));

If all elements implement an interface or have a common ancestor we could have used the Cast function. Unlike OfType, Cast throws an InvalidCastException if any element cannot be cast.

// Cast can only cast to implemented interfaces and classes within the hierarchy
// (or unbox a value to its exact type), so every element here must be an int
var sampleObjNumbers = new List<object> (){ 1,2,3,4,5,6,7,8,9,10};
var sampleIntNumbers = sampleObjNumbers.Cast<int> ().ToList ();
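The difference between OfType and Cast shows up as soon as the collection holds an element of another type. A minimal sketch (CastDemo and its data are assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    // OfType silently skips anything which is not an int.
    public static List<int> OnlyInts(IEnumerable<object> items) => items.OfType<int>().ToList();

    // Cast is deferred, so the exception only surfaces once ToList() enumerates the sequence.
    public static bool CastThrows(IEnumerable<object> items)
    {
        try
        {
            items.Cast<int>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var mixed = new List<object> { 1, 2, 6m, 3 };

        Console.WriteLine(string.Join(", ", OnlyInts(mixed))); // 1, 2, 3
        Console.WriteLine(CastThrows(mixed));                  // True (6m is a boxed decimal)
    }
}
```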

ConvertAll can be used for a lot more purposes; below we convert the people list into a list of strings which contains people's names in upper case.

// Convert all strings to Upper Case
var upperCaseNames = people.Select (x => x.Name).OrderBy (x => x).ToList ().ConvertAll<string> (x => x.ToUpper ());

Aggregate

Aggregates are similar to SQL aggregates. They are functions which return a single result value for a set of data; the simplest example being Count().

var count = people.Count ();
var count = (from p in people
			 select p).Count ();

Count can also take a predicate delegate; the count value will be the number of records passing the provided predicate.

var count = people.Count (x => x.Gender == Gender.Male);
var count = (from p in people
             where p.Gender == Gender.Male
             select p).Count ();

Nested counts can help with projections. Here we translate the data into a new collection of Name and ChildrenCount; Name being the person's name and ChildrenCount being the number of girls the person has.

var samplePeople = people.Select
	(x => new { Name = x.Name, ChildrenCount = x.Children.Count (y => y.Gender == Gender.Female)});
var samplePeople = from p in people
                   select new { Name = p.Name, ChildrenCount =
                       (from c in p.Children
                        where c.Gender == Gender.Female
                        select c).Count () };

Sum provides a way of adding up numeric values. Here we add up the counts of children into a single variable.

var childrenCount = people.Sum (x => x.Children.Count ());

The example below shows building up the Sum with Where clauses to get the count of boys who have fathers.

var maleSonsCount = people.Where (x => x.Gender == Gender.Male).Sum (x => x.Children.Where (y => y.Gender == Gender.Male).Count ());

Min provides the Math.Min or minimum value from a set. Here we find the minimum number of children a person has.

var minChildrenCount = people.Min (x => x.Children.Count ());
var children = from p in people
	group p by p.Gender into peopleGenders
	let minKids = peopleGenders.Min( x => x.Children.Count())
	select new { Gender = peopleGenders.Key, Value = minKids };

Max provides the Math.Max or maximum value from a set. Here we find the maximum number of children a person has.

var maxChildrenCount = people.Max (x => x.Children.Count ());

Average is the mean average; i.e. add up all values and divide by the number of elements.

var avgChildrenCount = people.Average (x => x.Children.Count ());

The example below groups the data into genders and then provides an anonymous type of Gender, Average, Min and Max of the number of children for each gender type.

var samplePeople = people.GroupBy (x => x.Gender).
	Select (y => new { Key = y.Key,
		Average = Math.Round (y.Average (z => z.Children.Count ()), 2),
		Min = y.Min (z => z.Children.Count ()),
		Max = y.Max (z => z.Children.Count ())}
);

Set

Set functions provide distinct, union, intersection and join functionality.

Distinct removes all duplicated data within a collection to provide a result set with distinct elements.

var numbers = new List<int> (){ 1,1,2,2,3,4,5 };
var distinctNumbers = numbers.Distinct ();
var distinctNumbers = ( from p in people
                        orderby p.Name
                        select p.Name  ).Distinct();

Distinction works on equality. Custom equality checks can be provided by providing an IEqualityComparer.

A slightly contrived example: here we provide an equality check for people based only on their gender. The example then projects the gender into a list which is ordered by Gender; i.e. we are getting a distinct, ordered list of the genders within our initial collection.

public class PersonSexComparer : IEqualityComparer<Person>
{
	public bool Equals (Person x, Person y)
	{
		return x.Gender == y.Gender;
	}

	public int GetHashCode (Person obj)
	{
		return obj.Gender.GetHashCode ();
	}

}

var distinctSex = people.Distinct (new PersonSexComparer ()).
	OrderBy (x => x.Gender).Select (y => y.Gender);

Union provides a way to combine sets of data, removing duplicates. Here we create a super set of people and children. The SelectMany flattens all children into a collection before unioning them with the collection of parents.

var allPeople = people.Union (people.SelectMany (x => x.Children)).
	OrderByDescending (x => x.Gender).ThenBy (x => x.Name);
var allPeople = people.Union ( from p in people
                  from c in p.Children
                 select c);

Intersection provides a result set of elements which exist in both collections.

var groupOne = new List<int> (){ 1,2,3,4,5};
var groupTwo = new List<int> (){4,5,6,7};
var groupInter = groupOne.Intersect (groupTwo).OrderBy (x => x);

There is no real equivalent in natural Linq, though the Contains method can allow similar functionality.

var groupOne = new List<int> (){ 1,2,3,4,5};
var groupTwo = new List<int> (){4,5,6,7};
var groupInter = from n in groupOne
		where groupTwo.Contains(n)
		select n;

Except provides a ‘Not In’ functionality. All elements in the first set which are not in the second will be found in the result set.

var groupOne = new List<int> (){ 1,2,3,4,5};
var groupTwo = new List<int> (){4,5,6,7};
var groupExcept = groupOne.Except (groupTwo).OrderBy (x => x);
var groupOne = new List<int> (){ 1,2,3,4,5};
var groupTwo = new List<int> (){4,5,6,7};
var groupExcept = from n in groupOne
	where !groupTwo.Contains(n)
		select n;

Concat provides a way of combining the elements of two collections into one; unlike Union, duplicates are kept.

var groupOne = new List<int> (){ 1,2,3,4,5};
var groupTwo = new List<int> (){4,5,6,7};
var groupConcat = groupOne.Concat (groupTwo).OrderBy( x=> x);
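To see how Concat differs from Union on the same two groups, here is a short sketch (the SetDemo class is hypothetical; the numbers match the examples above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetDemo
{
    // Concat appends the second sequence as-is, keeping duplicates.
    public static List<int> Concatenated(IEnumerable<int> first, IEnumerable<int> second) =>
        first.Concat(second).ToList();

    // Union removes duplicates across both sequences.
    public static List<int> Unioned(IEnumerable<int> first, IEnumerable<int> second) =>
        first.Union(second).ToList();

    public static void Main()
    {
        var groupOne = new List<int> { 1, 2, 3, 4, 5 };
        var groupTwo = new List<int> { 4, 5, 6, 7 };

        Console.WriteLine(string.Join(",", Concatenated(groupOne, groupTwo))); // 1,2,3,4,5,4,5,6,7
        Console.WriteLine(string.Join(",", Unioned(groupOne, groupTwo)));      // 1,2,3,4,5,6,7
    }
}
```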

SequenceEqual provides a check of whether all elements in two collections are equal and in the same order.

var groupOne = new List<int> (){ 1,2,3,4,5};
var groupOneAgain = new List<int> (){ 1,2,3,4,5};
Assert.IsTrue (groupOne.SequenceEqual(groupOneAgain));

Join is similar to SQL join; we can match two elements in two collections and work upon all data contained.

Here we have a collection which describes each element in the Gender enumeration. We can join between the people and gender description collections and project all fields into the result set.

var foo = people.Join( genderDescriptions,
                      aPerson => aPerson.Gender,
                      aDesc => aDesc.Gender,
                      ( aPerson, aDes ) => new { Name = aPerson.Name, Gender = aPerson.Gender, Desc = aDes.Descripion} );
var foo = from aPerson in people
	join aDes in genderDescriptions on aPerson.Gender equals aDes.Gender
select new { Name = aPerson.Name, Gender = aPerson.Gender, Desc = aDes.Descripion};
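Join behaves like an SQL inner join: elements without a matching key on the other side are dropped from the result. A self-contained sketch using tuples (JoinDemo, the gender codes and the description text are assumptions for illustration, not the types from the original source):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinDemo
{
    public static List<string> Describe(
        (string Name, string Gender)[] people,
        (string Gender, string Text)[] descriptions)
    {
        // Match on the Gender key of both collections and project a string per match.
        return people.Join(descriptions,
                           p => p.Gender,
                           d => d.Gender,
                           (p, d) => p.Name + " is " + d.Text).ToList();
    }

    public static void Main()
    {
        var people = new[] { (Name: "John", Gender: "M"), (Name: "Jane", Gender: "F"), (Name: "Sam", Gender: "U") };
        var descriptions = new[] { (Gender: "M", Text: "Male"), (Gender: "F", Text: "Female") };

        // Sam has no matching description, so does not appear in the output.
        foreach (var line in Describe(people, descriptions))
            Console.WriteLine(line);
        // John is Male
        // Jane is Female
    }
}
```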

Partitioning

Partitioning provides a way of taking or skipping elements in a collection.

Take(x) takes the first x elements in a collection.

var firstThreePeople = people.Take (3);

Here we combine take with where to take the first two males.

var firstTwoPeople = people.Where (x => x.Gender == Gender.Male).Take (2);

Here we add into the mix an anonymous type to create a new type with Name, Age, Gender and Children of the first two males.

var firstTwoPeople =
	(from p in people
	  where p.Gender == Gender.Male
	  select new { p.Name, p.Age, p.Gender, p.Children }).Take (2);

Skip is the opposite of take. Skip(x) takes all but the first x elements. Here we take all but the first 4 people.

var allButFirstFour = people.Skip (4);
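Skip and Take are commonly combined to page through a collection. A possible sketch, where PagingDemo.Page is a hypothetical helper and page numbers are assumed to start at 1:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Skips the earlier pages, then takes one page worth of elements.
    public static List<T> Page<T>(IEnumerable<T> source, int pageNumber, int pageSize) =>
        source.Skip((pageNumber - 1) * pageSize).Take(pageSize).ToList();

    public static void Main()
    {
        var numbers = Enumerable.Range(1, 10);

        Console.WriteLine(string.Join(",", Page(numbers, 2, 3))); // 4,5,6
        Console.WriteLine(string.Join(",", Page(numbers, 4, 3))); // 10
    }
}
```

Take does not fail when fewer elements remain than were asked for; it simply returns what is left.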

TakeWhile provides a function to take all elements in a collection while a predicate is met. Any elements which pass the predicate and follow an element which fails the predicate will be omitted from the result set.

var leadingMales = people.TakeWhile (x => x.Gender == Gender.Male);

SkipWhile provides the opposite of TakeWhile. All initial elements passing the predicate will be omitted; from the first element which fails the predicate onwards, every element is included, even those which pass the predicate.

var afterLeadingMales = people.SkipWhile (x => x.Gender == Gender.Male);
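The way TakeWhile and SkipWhile stop at the first failing element, unlike Where, can be seen with a small list of numbers (WhileDemo and the sample values are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhileDemo
{
    public static string Show(IEnumerable<int> numbers) => string.Join(",", numbers);

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 10, 2, 1 };

        // Stops at 10, the first element failing the predicate.
        Console.WriteLine(Show(numbers.TakeWhile(n => n < 5))); // 1,2,3

        // Skips up to 10, then returns everything after, including later small values.
        Console.WriteLine(Show(numbers.SkipWhile(n => n < 5))); // 10,2,1

        // Where tests every element regardless of position.
        Console.WriteLine(Show(numbers.Where(n => n < 5)));     // 1,2,3,2,1
    }
}
```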

Generation

Generation functions include Range and Repeat; these provide a way of creating a collection of type T.

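Range provides a way of creating a collection of sequential integers from a start value and a count; the second parameter is the number of elements, not the end value. Here we create the integers 1 to 10.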

var sampleInts = Enumerable.Range(1, 10);

Repeat takes a value of type T and a count, and creates a collection containing that value count times. Here we create a collection of ten 1s.

var sampleInts = Enumerable.Repeat(1, 10);
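Because Range(1, 10) happens to end at 10, it is easy to mistake the second argument for an end value. A start value other than 1 makes the behaviour clearer (GenerationDemo is a hypothetical name for this sketch):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GenerationDemo
{
    // Three integers starting at 5, not the integers from 5 to 3.
    public static List<int> ThreeFromFive() => Enumerable.Range(5, 3).ToList();

    // The same string repeated three times.
    public static List<string> ThreeCopies(string value) => Enumerable.Repeat(value, 3).ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ThreeFromFive()));     // 5,6,7
        Console.WriteLine(string.Join(",", ThreeCopies("ab")));   // ab,ab,ab
    }
}
```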