LINQ is not just LINQ to SQL
Deconstructing a LINQ Statement
(download source code: http://aaronhoffman.googlecode.com/files/AaronHoffmanLinqDemov01.zip)
LINQ is a valuable tool, but because it is usually demonstrated alongside "LINQ to SQL", many developers assume its only use is querying a Microsoft SQL Server database. LINQ is not just LINQ to SQL. I like to think of LINQ as a shorthand (or syntax shortcut) for writing code: a way to write fewer lines of code (or simplify code) while still performing the same operations (kind of like what foreach is to the for loop).
I would like to demonstrate LINQ in a way that does not use LINQ to SQL. I will break down a LINQ statement into lines of code that developers might be more familiar with in an attempt to show what is going on under the covers. I will start with code that hopefully everyone has seen/written before, then build up to a LINQ statement that performs the same operation. This way you will see what the compiler turns your LINQ Statements into.
In each example we will search through a collection of Customer objects to find all the Customers that meet our criteria (that "pass our test"). The following expression will be used in each example to filter the collection (I will refer to this as The Test):
Customer.CompanyName.ToUpper().StartsWith(OUR_STRING.ToUpper())
Example 1
In this example we will use a foreach loop to iterate through a collection of Customers. If a Customer passes our test, we will add it to a second collection of Customers... That’s it.
[see code]
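The downloadable source is not reproduced here, so the following is only a minimal sketch of what Example 1 might look like. The Customer class, the Example1.Filter method name, and the sample company names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Customer type; only CompanyName matters for these examples.
public class Customer
{
    public string CompanyName { get; set; }
}

public static class Example1
{
    // Loops over the customers and keeps the ones that pass The Test.
    public static List<Customer> Filter(List<Customer> customers, string ourString)
    {
        List<Customer> matches = new List<Customer>();
        foreach (Customer c in customers)
        {
            // The Test, written in-line.
            if (c.CompanyName.ToUpper().StartsWith(ourString.ToUpper()))
            {
                matches.Add(c);
            }
        }
        return matches;
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "alpine Ltd" },
            new Customer { CompanyName = "Beta Inc" }
        };

        foreach (Customer c in Filter(customers, "AL"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```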
Example 2
In this example we will use the foreach loop again, but we will move The Test out of the loop and into its own class. This can be a little confusing – it seems like an extra/unnecessary step. This is just to help show you what the compiler does when you write an anonymous method (which we haven’t done yet, but we’re going to...). If this example confuses you, just read on, then come back to it later.
[see code]
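As a rough sketch of this step (not the original downloadable code), The Test can be moved into a small class of its own. The class name CompanyNameFilter is an assumption; the CompanyNameStartsWith method and its CompanyName field follow the method shown later in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Customer type; only CompanyName matters here.
public class Customer
{
    public string CompanyName { get; set; }
}

// Hypothetical class that holds The Test and the string it compares against.
public class CompanyNameFilter
{
    public string CompanyName;

    public bool CompanyNameStartsWith(Customer c)
    {
        return c.CompanyName.ToUpper().StartsWith(CompanyName.ToUpper());
    }
}

public static class Example2
{
    public static List<Customer> Filter(List<Customer> customers, string ourString)
    {
        CompanyNameFilter filter = new CompanyNameFilter();
        filter.CompanyName = ourString;

        List<Customer> matches = new List<Customer>();
        foreach (Customer c in customers)
        {
            // The Test now lives in its own class; the loop just calls it.
            if (filter.CompanyNameStartsWith(c))
            {
                matches.Add(c);
            }
        }
        return matches;
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "Beta Inc" }
        };

        foreach (Customer c in Filter(customers, "be"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```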
Example 3
In this example the foreach loop from the previous examples has been replaced by the FindAll() method. The FindAll() method does exactly what we have been doing manually in the past two examples. It will loop through the collection of Customers and test each Customer to see if it passes our test. If it does, it will add that Customer to a second collection and eventually return that collection. All we have to do is pass in The Test.
Now, if you have not worked with delegates before, this is where it might get a little fuzzy. You probably have worked with delegates without knowing it, though: if you have ever handled a Button’s Click event, you have worked with delegates.
You can think of a delegate as a method that can be passed around like a variable (kind of like a type-safe method pointer). Delegates also have types. The type of a delegate defines what the method signature needs to look like, but not the logic within the method. A Button’s Click event uses a delegate of type EventHandler, which is defined like this:
public delegate void EventHandler(object sender, EventArgs e);
And so, your Button_Click method needs to look like this:
public void Button1_Click(object sender, EventArgs e) { }
Your Button1_Click method needs to return void and take two parameters: an object and an EventArgs. The name of the method and the logic within it do not matter.
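To make the "method in a variable" idea concrete without needing a real Button, here is a small console sketch. The DelegateDemo class and its ClickCount field are made up for illustration; EventHandler and EventArgs.Empty are the standard .NET types.

```csharp
using System;

public static class DelegateDemo
{
    // Counts how many times the handler ran, so the effect is observable.
    public static int ClickCount = 0;

    // Matches EventHandler: returns void and takes (object, EventArgs).
    public static void Button1_Click(object sender, EventArgs e)
    {
        ClickCount++;
    }

    public static void Main()
    {
        // Store the method in a variable of the delegate type...
        EventHandler handler = Button1_Click;

        // ...and call it through that variable.
        handler(null, EventArgs.Empty);

        Console.WriteLine(ClickCount);
    }
}
```

A method with a different signature (say, one that returns bool) would not compile when assigned to an EventHandler variable, which is what "delegates have types" means in practice.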
Now back to the FindAll() method. The FindAll() method has a single parameter that is a delegate of type Predicate<T>. Here is the definition of the Predicate<T> delegate:
public delegate bool Predicate<T>(T obj);
So, in order to match this delegate type, we need to write a method that returns a bool and takes one parameter of the same type as the elements of the generic List (that is what "<T>" means; in our case that is Customer). We can then pass that method to the FindAll() method so it can use it to test each Customer. Well, we have already written that method, haven’t we? It is the same method that we called within the foreach loop in Example 2. It is The Test:
public bool CompanyNameStartsWith(Customer c)
{
    return c.CompanyName.ToUpper().StartsWith(CompanyName.ToUpper());
}
Take a look at the code for Example 3. The foreach loop has been replaced by the FindAll() method.
[see code]
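A minimal sketch of what this step might look like follows. It is not the original downloadable code: the Customer and CompanyNameFilter classes and the Example3.Filter name are assumptions, while List<T>.FindAll() and Predicate<T> are the real framework members discussed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Customer type; only CompanyName matters here.
public class Customer
{
    public string CompanyName { get; set; }
}

// Hypothetical class that holds The Test.
public class CompanyNameFilter
{
    public string CompanyName;

    public bool CompanyNameStartsWith(Customer c)
    {
        return c.CompanyName.ToUpper().StartsWith(CompanyName.ToUpper());
    }
}

public static class Example3
{
    public static List<Customer> Filter(List<Customer> customers, string ourString)
    {
        CompanyNameFilter filter = new CompanyNameFilter();
        filter.CompanyName = ourString;

        // Wrap The Test in a Predicate<Customer> delegate...
        Predicate<Customer> test = new Predicate<Customer>(filter.CompanyNameStartsWith);

        // ...and let FindAll() do the looping and collecting.
        return customers.FindAll(test);
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "Beta Inc" }
        };

        foreach (Customer c in Filter(customers, "al"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```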
Example 4
If this isn’t making much sense to you, just stick with it. You don’t need to know all of the inner workings of LINQ in order to use LINQ.
In this example, instead of using the “CompanyNameStartsWith” method we used previously, we write that logic in-line again. This time, however, we use an anonymous method. Within the FindAll() method call, we write a new method in-line using the delegate keyword. The benefit of this is that we write fewer lines of code. And the reason we don’t have to write the class that would hold this one method is that the compiler does it for us! The MSDN Magazine article on anonymous methods and delegates, listed in the resources at the end of this post, has more information on how this is done.
[see code]
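Here is a hedged sketch of the anonymous-method version (the Customer class and Example4.Filter name are assumptions, and the original code may differ). Notice that the hand-written filter class from the earlier examples is gone.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Customer type; only CompanyName matters here.
public class Customer
{
    public string CompanyName { get; set; }
}

public static class Example4
{
    public static List<Customer> Filter(List<Customer> customers, string ourString)
    {
        // The Test is written in-line as an anonymous method.
        // It uses ourString from the enclosing method, so the compiler
        // generates a hidden class to hold that value for us.
        return customers.FindAll(delegate(Customer c)
        {
            return c.CompanyName.ToUpper().StartsWith(ourString.ToUpper());
        });
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "Beta Inc" }
        };

        foreach (Customer c in Filter(customers, "be"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```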
Example 5
Another feature that was released along with LINQ is lambda expressions. As with LINQ, I like to think of lambdas as a syntax shortcut or shorthand way of writing code: a way to create delegates in fewer lines of code. The “delegate” keyword and braces have been replaced by the new lambda operator (=>), and we can now rewrite the example above in one clean line of code. A little too clean, you might say. How does it know what ‘c’ is? The compiler infers it from the Predicate<Customer> delegate that FindAll() expects. The resources at the end of this post, such as the Lambdas entry, have more information.
[see code]
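A sketch of the lambda version, under the same assumptions as before (hypothetical Customer class and Example5.Filter name):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Customer type; only CompanyName matters here.
public class Customer
{
    public string CompanyName { get; set; }
}

public static class Example5
{
    public static List<Customer> Filter(List<Customer> customers, string ourString)
    {
        // Same delegate as the anonymous method, written as a lambda.
        // The type of c is inferred as Customer from Predicate<Customer>.
        return customers.FindAll(c => c.CompanyName.ToUpper().StartsWith(ourString.ToUpper()));
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "Beta Inc" }
        };

        foreach (Customer c in Filter(customers, "al"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```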
Example 6
The first (and only) example with LINQ! This example does basically what the past examples have done, with a few differences. You probably noticed the ‘var’ keyword. It declares an implicitly typed local variable: you don’t need to write out the type that will be returned by the LINQ statement (because sometimes it won’t exist yet...); the compiler will determine the type for you. In this example, we could replace var with IEnumerable<Customer>, but it is a LINQ convention to just use var. (I will not be explaining LINQ syntax here; there are plenty of resources that do that already. I just want to demonstrate what the compiler does to your LINQ statements.) Take a look at the code.
[see code]
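The query might look something like the sketch below. As before, the Customer class and the Example6.Filter name are assumptions, and this is not the original downloadable code. Note that a LINQ query returns a lazily evaluated IEnumerable<Customer> rather than a List<Customer>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type; only CompanyName matters here.
public class Customer
{
    public string CompanyName { get; set; }
}

public static class Example6
{
    public static IEnumerable<Customer> Filter(List<Customer> customers, string ourString)
    {
        // var is inferred as IEnumerable<Customer> here.
        var query = from c in customers
                    where c.CompanyName.ToUpper().StartsWith(ourString.ToUpper())
                    select c;

        return query;
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "Beta Inc" }
        };

        // The query actually runs when it is enumerated by this loop.
        foreach (Customer c in Filter(customers, "be"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```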
Example 7
The LINQ query above can also be written using the Where() extension method, which is defined for IEnumerable<T> and takes a Func<T, bool>. The Where() method is a lot like the FindAll() method, but way better. (I won’t get into why it is better; that is another post.) You might then ask: why would you ever write this as a LINQ statement? Why not just use the Where() method from the start? This is one of the beauties of LINQ. The Where() method isn’t the only method a LINQ statement is turned into (although it is probably a popular one). When you are writing a LINQ statement, you don’t have to worry about which methods need to get called. You just write in one standard syntax, and the compiler does the work for you! Not only does it determine the type that will be returned by the statement, it also determines which methods need to get called. (This process, of course, is a little more complicated than that, but that is the short and sweet answer.) Take a look at the code.
[see code]
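Written by hand, the method-call form of the same query could look like this sketch (the Customer class and Example7.Filter name are assumptions; Enumerable.Where() is the real extension method from System.Linq):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type; only CompanyName matters here.
public class Customer
{
    public string CompanyName { get; set; }
}

public static class Example7
{
    public static IEnumerable<Customer> Filter(List<Customer> customers, string ourString)
    {
        // Where() takes a Func<Customer, bool>; the lambda is The Test.
        return customers.Where(c => c.CompanyName.ToUpper().StartsWith(ourString.ToUpper()));
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "Beta Inc" }
        };

        foreach (Customer c in Filter(customers, "al"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```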
Example 8
There is nothing new in this example. I just wanted to come full circle and show you that the compiler will turn “the meat” of the LINQ statement into a bunch of little methods for you. Methods and Classes that you never had to write!
[see code]
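To come full circle in code, here is a rough sketch of Where() being handed a hand-written class and method instead of a lambda. The CompanyNameFilter class is a made-up stand-in for what the compiler generates; the real generated class has a compiler-chosen name and is not something you would reference directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type; only CompanyName matters here.
public class Customer
{
    public string CompanyName { get; set; }
}

// Hand-written stand-in for the class the compiler would generate.
public class CompanyNameFilter
{
    public string CompanyName;

    public bool CompanyNameStartsWith(Customer c)
    {
        return c.CompanyName.ToUpper().StartsWith(CompanyName.ToUpper());
    }
}

public static class Example8
{
    public static IEnumerable<Customer> Filter(List<Customer> customers, string ourString)
    {
        CompanyNameFilter filter = new CompanyNameFilter();
        filter.CompanyName = ourString;

        // The same method from Example 2, now wrapped in a Func<Customer, bool>.
        Func<Customer, bool> test = new Func<Customer, bool>(filter.CompanyNameStartsWith);
        return customers.Where(test);
    }

    public static void Main()
    {
        // Made-up sample data.
        List<Customer> customers = new List<Customer>
        {
            new Customer { CompanyName = "Alpha Co" },
            new Customer { CompanyName = "Beta Inc" }
        };

        foreach (Customer c in Filter(customers, "be"))
        {
            Console.WriteLine(c.CompanyName);
        }
    }
}
```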
Summary
And that’s it... I hope you were able to learn something from this demonstration. If nothing else, just start using LINQ! Like I said previously, LINQ is not just LINQ to SQL. And you don’t have to know all the inner workings of the compiler to start using it. So Get Started!
-Aaron Hoffman
Here is a list of the resources I used to gather information for this post.
Scott Gu's LINQ to SQL Guide
http://weblogs.asp.net/scottgu/archive/2007/09/07/linq-to-sql-part-9-using-a-custom-linq-expression-with-the-lt-asp-linqdatasource-gt-control.aspx
LINQ: .NET Language-Integrated Query
http://msdn.microsoft.com/en-us/library/bb308959.aspx
LINQ to SQL: .NET Language-Integrated Query for Relational Data
http://msdn.microsoft.com/en-us/library/bb425822.aspx
More information about anonymous methods & delegates and how they relate to the compiler
http://msdn.microsoft.com/en-us/magazine/cc163682.aspx
Delegates Overview
http://msdn.microsoft.com/en-us/library/ms173171.aspx
Lambdas
http://blogs.msdn.com/charlie/archive/2008/06/28/lambdas.aspx
'var' C# reference
http://msdn.microsoft.com/en-us/library/bb383973.aspx