Microsoft LINQ Books

Programming Microsoft LINQ & Introducing Microsoft LINQ

Marco Russo

  • Use of Distinct and OrderBy in LINQ

    A few days ago I found a bug in a program written using LINQ to SQL, caused by habits formed over years of writing SQL. The requirement was something like: get the distinct values of (bla bla bla) sorted alphabetically. An example of the required query with Northwind would be the following one:

    SELECT DISTINCT
            e.LastName
    FROM    Orders o
    LEFT JOIN [Employees] e
            ON e.[EmployeeID] = o.[EmployeeID]
    ORDER BY e.LastName 

    Fundamentally, we are using both a DISTINCT and an ORDER BY clause in the same SQL statement.

    Now, if you create a NorthwindDataContext importing the Order and Employee tables, you can try to write a similar statement in LINQ to SQL. Unfortunately, Distinct is not part of the C# query syntax, so the most intuitive path is to call the Distinct method at the end of your statement, like in the following query:

    var queryA =
        (from o in db.Orders
         orderby o.Employee.LastName
         select o.Employee.LastName)
         .Distinct();

     

    However, the Distinct operator removes the sort condition defined by the orderby keyword. In fact, the SQL statement sent to the database is the following one:

    SELECT DISTINCT
            [t1].[LastName]
    FROM    [dbo].[Orders] AS [t0]
    LEFT OUTER JOIN [dbo].[Employees] AS [t1]
            ON [t1].[EmployeeID] = [t0].[EmployeeID]

     

    This behavior might appear strange. The problem is that the Distinct operator does not guarantee that it will maintain the original order of values. Applied to LINQ to SQL, this means that a sort constraint can be ignored in the case of a query like queryA.

    The solution is pretty simple: put the OrderBy operator after the Distinct one, like in the following queryB definition:

    var queryB = 
        (from o in db.Orders
         select o.Employee.LastName)
        .Distinct().OrderBy( n => n );
    

     

    This will result in the following SQL statement sent to Northwind:

    SELECT  [t2].[LastName]
    FROM    ( SELECT DISTINCT
                        [t1].[LastName]
              FROM      [dbo].[Orders] AS [t0]
              LEFT OUTER JOIN [dbo].[Employees] AS [t1]
                        ON [t1].[EmployeeID] = [t0].[EmployeeID]
            ) AS [t2]
    ORDER BY [t2].[LastName]

     

    If you remove some syntax redundancy, this is exactly the same query I wrote at the beginning of my post.

    The lesson is: in a SQL query, the position of a clause is not relevant as long as the clauses belong to the same SELECT/FROM statement. In LINQ, this is not true: operators are applied in the order they are written, and the conversion to SQL can drop an operator when a later operator in the same LINQ query makes its effect irrelevant.

    Final consideration: initially I considered that the compiler could emit some warning in case a query reduction is done like in the queryA case. Unfortunately, the query reduction operation is done by the LINQ to SQL provider at execution time and not during compilation. A warning could still be possible, but it's something that I would move to tools like FxCop.
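
    The operator order used in queryB can be tried without a database. The following is only a minimal sketch on LINQ to Objects: the LastNames array is a made-up stand-in for the values reached through Orders, and DistinctOrderSketch and DistinctThenSorted are hypothetical names, not part of LINQ to SQL or of Northwind.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class DistinctOrderSketch
    {
        // Hypothetical in-memory stand-in for the LastName values reached through Orders.
        public static readonly string[] LastNames =
            { "Peacock", "Davolio", "Peacock", "Fuller", "Davolio", "Leverling" };

        // Same operator order as queryB: remove duplicates first, sort last,
        // so the final ordering does not depend on what Distinct does internally.
        public static List<string> DistinctThenSorted(IEnumerable<string> names)
        {
            return names.Distinct().OrderBy(n => n).ToList();
        }
    }
    ```

    Calling DistinctOrderSketch.DistinctThenSorted(DistinctOrderSketch.LastNames) returns each last name once, in alphabetical order. Because the sort is the last operator, the result stays sorted whichever LINQ provider executes the query.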

  • LINQ query optimizations

    Look at this excellent blog post written by K. Scott Allen. I completely agree with him: don't try to optimize a LINQ query until you measure its performance and understand it is really a bottleneck that needs to be improved.

    An interesting consideration I had never made before is that you can call the OrderBy extension method after the Select and not before. Yes, using the query syntax of C# you are used to putting the select after the orderby, but sometimes it could be better to invert this order (the reasons are already well explained in Scott Allen's post).

    And, of course, consider that performance has to be evaluated in two dimensions: time and space. And, sooner rather than later, a third dimension (parallelism) will gain the same importance.
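
    To make the two orderings concrete, here is a small sketch on LINQ to Objects. The Employee class and the method names are hypothetical and exist only for this illustration. Both methods return the same list; the only difference is whether the sort runs over whole Employee objects or over the already projected strings. Whether that matters in practice is something to measure, as said above.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical entity used only for this illustration.
    public sealed class Employee
    {
        public Employee(string firstName, string lastName)
        {
            FirstName = firstName;
            LastName = lastName;
        }

        public string FirstName { get; }
        public string LastName { get; }
    }

    public static class SelectThenOrderSketch
    {
        // Query syntax: the sort is applied to Employee objects, then the projection runs.
        public static List<string> OrderThenSelect(IEnumerable<Employee> employees)
        {
            return (from e in employees
                    orderby e.LastName
                    select e.LastName).ToList();
        }

        // Method syntax: the projection runs first, then only the strings are sorted.
        public static List<string> SelectThenOrder(IEnumerable<Employee> employees)
        {
            return employees.Select(e => e.LastName).OrderBy(n => n).ToList();
        }
    }
    ```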

  • Implement progress reporting and cancellation of LINQ queries

    Samuel Jack wrote two interesting posts discussing possible extension methods for LINQ. One is to implement progress reporting of a LINQ query. The other is to implement a way to cancel a running LINQ query.

    Both implementations are very simple, and they illustrate well how LINQ can be extended and manipulated by using extension methods.
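
    To give an idea of the technique, this is a minimal sketch of what such extension methods could look like on LINQ to Objects. It is not the code from Samuel Jack's posts: ReportProgress and StopWhen are hypothetical names, and the design (a callback per item, a polled cancellation flag) is just one assumption about how it can be done.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class QueryMonitoringSketch
    {
        // Passes every item through unchanged and invokes the callback
        // with the number of items produced so far.
        public static IEnumerable<T> ReportProgress<T>(
            this IEnumerable<T> source, Action<int> onItem)
        {
            int count = 0;
            foreach (var item in source)
            {
                count++;
                onItem(count);
                yield return item;
            }
        }

        // Stops the enumeration as soon as the supplied check returns true.
        // The check is polled once per item, before the item is yielded.
        public static IEnumerable<T> StopWhen<T>(
            this IEnumerable<T> source, Func<bool> isCancelled)
        {
            foreach (var item in source)
            {
                if (isCancelled()) yield break;
                yield return item;
            }
        }
    }
    ```

    Both methods are lazy iterators, so they can be placed at any point of a query, for example source.ReportProgress(n => Console.WriteLine(n)).StopWhen(() => userPressedCancel), where userPressedCancel is whatever flag your application maintains.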

  • LINQ to SQL and varchar(1) fields

    If you use the Object Relational Designer of LINQ to SQL to create an entity from an existing table that has some VARCHAR(1) fields, you will run into this issue.

    The data member created in C# is char instead of string. If the field is always filled with one char, this works. But if you try to read a row from the table containing an empty string (not a NULL field, but a field of zero characters) you will get this exception:

    String must be exactly one character long.

    This behavior has already been described in this post and in the LINQ forum. But one more warning could be important: this is a latent error that will show itself only at run time if you don't fix it. Thus, be careful whenever you have some VARCHAR(1) fields in your tables.
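
    The same failure can be reproduced outside LINQ to SQL, because Convert.ToChar rejects any string whose length is not exactly one. The sketch below shows that, together with a hypothetical helper (ToCharOrNull) for code that has to turn such a value into a char by hand. A workaround often suggested for the designer itself, which I state here as an assumption rather than a verified recipe, is to change the type of the generated member from char to string.

    ```csharp
    using System;

    public static class CharColumnSketch
    {
        // Returns true when converting an empty string to char fails,
        // which is the situation of a VARCHAR(1) column holding ''.
        public static bool EmptyStringFailsConversion()
        {
            try
            {
                Convert.ToChar(string.Empty);
                return false;
            }
            catch (FormatException)
            {
                return true;
            }
        }

        // Hypothetical helper: treats NULL and '' alike and returns null for both,
        // otherwise returns the first (and, for VARCHAR(1), only) character.
        public static char? ToCharOrNull(string value)
        {
            if (string.IsNullOrEmpty(value)) return null;
            return value[0];
        }
    }
    ```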

  • The Amazon Reviews law

    While you have few reviews, a single bad review lowers the overall rating. Today we got a bad review, probably because there was a misunderstanding about the scope of the book. I feel the need to give some information to help other possible readers make a good choice.

    First of all, I suggest everyone take a look at the book contents. They already describe pretty well what you should expect to find in this large book. In the same place, you will find the links to download two sample chapters of the book. They are not the toughest ones, but they show you the general approach, which is to explain LINQ and address its use with other libraries (like ASP.NET, WCF, WPF and so on). This does not mean that we cover how to make data binding work or how to write an application in WPF. We assume that a particular library for communication or presentation is already among your skills. We only concentrate our attention on data query and manipulation.

    Another point is the language. We used C# as the language of choice, and used VB.NET only in chapters where the features and/or syntaxes are significantly different (XML integration is one of the most important areas for this). Converting existing C# samples to VB is very simple, and we always highlighted where major differences are expected. There are parts where only the VB syntax is available (see XML) and other parts where a C# feature has no corresponding VB syntax. All these differences are well explained in two appendixes, one for C# and the other for VB. We had to make this decision because space was limited and we had a lot of content to put into the book.

    I hope this will help you. Please contact me if you have any doubt and/or would like to give other feedback.

  • Extending LINQ to XML

    Eric White shows some interesting use of LINQ to XML to query an Open XML document.

    Something that is not immediately obvious when you learn LINQ is that you can define your own extension methods to make your queries smarter and more readable. This post is a good exercise in thinking in a more flexible way: even for me, it's the first time I have seen an example of "extension" applied to LINQ to XML.
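
    As a taste of the idea, here is a minimal sketch of a custom extension method over LINQ to XML. It is not taken from Eric White's post and it does not deal with Open XML: WithAttribute is a hypothetical name, and the element and attribute names in the usage note are invented for the example.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    public static class XmlQuerySketch
    {
        // Keeps only the elements that carry the given attribute with the given value.
        // The explicit XAttribute-to-string conversion returns null when the
        // attribute is missing, so such elements are simply filtered out.
        public static IEnumerable<XElement> WithAttribute(
            this IEnumerable<XElement> source, XName name, string value)
        {
            return source.Where(e => (string)e.Attribute(name) == value);
        }
    }
    ```

    With this in place a query reads almost like a sentence, for example doc.Elements("p").WithAttribute("style", "Heading1").Select(e => e.Value), where doc is an XElement you loaded yourself.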

  • Sample chapters from Programming LINQ

    Two sample chapters of my Programming Microsoft LINQ book are finally available. The chapter titles below link to the download pages.

    Chapter 6 - Tools for LINQ to SQL

    In this chapter, we took a look at the tools that are available to generate LINQ to SQL entities and DataContext classes. The .NET Framework SDK includes the command-line tool named SQLMetal. Visual Studio 2008 has a graphical editor known as the Object Relational Designer. Both allow the creation of a DBML file, the generation of source code in C# and Visual Basic, and the creation of an external XML mapping file. The Object Relational Designer also allows you to edit an existing DBML file, dynamically importing existing tables, views, stored procedures, and user-defined functions from an existing SQL Server database.

    Chapter 16 - LINQ and ASP.NET

    This chapter showed you how to leverage the new features and controls available in ASP.NET 3.5 to develop data-enabled Web applications, using LINQ to SQL and LINQ in general. Consider that what you have seen is really useful for rapidly defining Web site prototypes and simple Web solutions. On the other hand, in enterprise-level solutions you will probably need at least one intermediate layer between the ASP.NET presentation layer and the data persistence one, represented by LINQ to SQL. In real enterprise solutions, you usually also need a business layer that abstracts all business logic, security policies, and validation rules from any kind of specific persistence layer. And you will probably have a Model-View-Controller or Model-View-Presenter pattern governing the UI. In this more complex scenario, chances are that the LinqDataSource control will be tied to entities collections more often than to LINQ to SQL results.

    The following is the complete list of the chapters included in the book.

    Programming Microsoft LINQ

    • Part I LINQ FOUNDATIONS
      • 1 LINQ Introduction
      • 2 LINQ Syntax Fundamentals
      • 3 LINQ to Objects
    • Part II LINQ to Relational Data
      • 4 LINQ to SQL: Querying Data
      • 5 LINQ to SQL: Managing Data
      • 6 Tools for LINQ to SQL
      • 7 LINQ to DataSet
      • 8 LINQ to Entities
    • Part III LINQ and XML
      • 9 LINQ to XML: Managing the XML Infoset
      • 10 LINQ to XML: Querying Nodes
    • Part IV Advanced LINQ
      • 11 Inside Expression Trees
      • 12 Extending LINQ
      • 13 Parallel LINQ
      • 14 Other LINQ Implementations
    • Part V Applied LINQ
      • 15 LINQ in a Multitier Solution
      • 16 LINQ and ASP.NET
      • 17 LINQ and WPF/Silverlight
      • 18 LINQ and the Windows Communication Foundation
    • Appendixes
      • A ADO.NET Entity Framework
      • B C# 3.0: New Language Features
      • C Visual Basic 2008: New Language Features

  • To join or not to join: that is the question (in LINQ)

    A comment received from a reader of Programming LINQ prompted me to highlight a concept that is not so intuitive in LINQ, especially if you come from years of SQL coding.

    The idea is very simple. Two entities in LINQ might be related in the model. Whenever this happens, it is usually better to leverage the existing relationship and not to write the join explicitly. If you are using LINQ to SQL, the generated SQL code might perform better than, or at least be equivalent to, the one generated by writing an explicit join in your LINQ query. The fewer constraints in your query, the better.

    Let's look at an example on the Northwind database. Imagine you want to see a list of all categories with a flag set for the one which a particular product belongs to. This is a SQL query we could write:

    SELECT
        c.CategoryID,
        c.CategoryName,
        CASE WHEN p.ProductID IS NULL 
            THEN 0
            ELSE 1
        END AS Selected
    FROM Categories c
    LEFT JOIN Products p
        ON p.CategoryID = c.CategoryID
        AND p.ProductID = 10
    ORDER BY CategoryName

    OK, we can write the same query in many other ways, but there are several more complex situations where a LEFT JOIN is used to test the presence of an element in a related table. A corresponding LINQ query might be the following one:

    from c in dc.Categories
    orderby c.CategoryName
    join p in dc.Products.Where(p => p.ProductID == 10)
        on c.CategoryID equals p.CategoryID 
        into pj
    from x in pj.DefaultIfEmpty()
    select new {
        c.CategoryID,
        c.CategoryName,
        Selected = x != null
    };

    The LINQ query above will generate a SQL query containing a LEFT JOIN. However, a relationship exists between Categories and Products, and you can leverage this relationship at the point where you really need to traverse it (in the projection). The following one is a better way to get the same result:

    from c in dc.Categories
    orderby c.CategoryName
    select new {
        c.CategoryID, 
        c.CategoryName,
        Selected = c.Products.Any( p => p.ProductID == 10 ) ? true : false
    };

    This new version has two advantages. First, it is shorter and expresses its intent more explicitly. Second, it generates a SQL query with an EXISTS predicate, similar to the following one.

    SELECT CategoryID, CategoryName,
        (CASE
            WHEN EXISTS(
                SELECT NULL AS [EMPTY]
                FROM Products AS p
                WHERE (p.ProductID = 10) AND (p.CategoryID = c.CategoryID)
                ) THEN 1
            ELSE 0
        END) AS Selected
    FROM Categories AS c
    ORDER BY CategoryName

    The execution plan used by SQL Server might be similar, if not identical. However, using the implicit relationship between Categories and Products in the LINQ query is usually better, because it gives the LINQ provider more freedom to generate more efficient SQL code.
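
    The two formulations can also be compared in memory. The following is only a sketch on LINQ to Objects, so it says nothing about the SQL that LINQ to SQL generates: the Category and Product classes are hand-written stand-ins for the generated entities, and the method names are hypothetical.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hand-written stand-ins for the entities generated from Northwind.
    public sealed class Product
    {
        public int ProductID;
        public int CategoryID;
    }

    public sealed class Category
    {
        public int CategoryID;
        public string CategoryName = "";
        public List<Product> Products = new List<Product>();
    }

    public static class JoinVersusNavigationSketch
    {
        // Explicit left join: group join plus DefaultIfEmpty, as in the first LINQ query.
        public static List<(string Name, bool Selected)> WithExplicitJoin(
            List<Category> categories, List<Product> products, int productId)
        {
            return (from c in categories
                    orderby c.CategoryName
                    join p in products.Where(p => p.ProductID == productId)
                        on c.CategoryID equals p.CategoryID
                        into pj
                    from x in pj.DefaultIfEmpty()
                    select (Name: c.CategoryName, Selected: x != null)).ToList();
        }

        // Navigation property: traverse the relationship only in the projection.
        public static List<(string Name, bool Selected)> WithNavigation(
            List<Category> categories, int productId)
        {
            return (from c in categories
                    orderby c.CategoryName
                    select (Name: c.CategoryName,
                            Selected: c.Products.Any(p => p.ProductID == productId))).ToList();
        }
    }
    ```

    Given the same data, both methods return one row per category with the same flag. The second version needs no separate products collection at all, which is exactly the kind of constraint you avoid by relying on the relationship.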

  • TechEd interview

    Paolo and I were interviewed at TechEd by Ken Rosen. We talked about our experience as book authors.

    If you are interested in writing a book, or if you simply want to see our faces and hear our Italian accent, you can watch the video, available in both low resolution and high resolution. Enjoy!

  • TechEd 2008 book signing

    I'm already in Orlando for TechEd 2008 Developers. Tomorrow Paolo Pialorsi and I will be at the TechEd bookshop for a book signing of our just released Programming Microsoft LINQ, scheduled for 4:00PM-4:30PM. I wrote a post a few days ago with the list of chapters included in the book. LINQ to SQL and LINQ to Entities are two significant technologies for data access, even if you don't have to use them in every possible scenario.

    If you are attending TechEd, meet us tomorrow at the bookshop to talk about LINQ!

  • Programming Microsoft LINQ finally shipping

    Finally, the Programming Microsoft LINQ book is available. We updated the website that supports our books (http://programminglinq.com), where you can download the sample code of the book.

    What’s in this book? Well, we tried to cover everything that was in RTM, but we also introduced technologies that are still in beta or in early CTP stages, like LINQ to Entities and Parallel LINQ. To give you an idea of the content, at the end of this post there is a list of the chapters included in the book.

    Now, the next news is that we will be at TechEd Developers next week in Orlando. Feel free to contact us if you want to give us feedback about the book or if you simply want to talk about LINQ. Moreover, we will be at the bookshop for book signing on June 3rd from 4:00pm to 4:30pm. We hope to see you there!

    Programming Microsoft LINQ

    • Part I LINQ FOUNDATIONS
      • 1 LINQ Introduction
      • 2 LINQ Syntax Fundamentals
      • 3 LINQ to Objects
    • Part II LINQ to Relational Data
      • 4 LINQ to SQL: Querying Data
      • 5 LINQ to SQL: Managing Data
      • 6 Tools for LINQ to SQL
      • 7 LINQ to DataSet
      • 8 LINQ to Entities
    • Part III LINQ and XML
      • 9 LINQ to XML: Managing the XML Infoset
      • 10 LINQ to XML: Querying Nodes
    • Part IV Advanced LINQ
      • 11 Inside Expression Trees
      • 12 Extending LINQ
      • 13 Parallel LINQ
      • 14 Other LINQ Implementations
    • Part V Applied LINQ
      • 15 LINQ in a Multitier Solution
      • 16 LINQ and ASP.NET
      • 17 LINQ and WPF/Silverlight
      • 18 LINQ and the Windows Communication Foundation
    • Appendixes
      • A ADO.NET Entity Framework
      • B C# 3.0: New Language Features
      • C Visual Basic 2008: New Language Features

  • LINQ Framework Design Guidelines

    I just want to link this post from Mircea Trofin about the LINQ Framework Design Guidelines, which are very interesting if you want to extend LINQ in some way. In our upcoming Programming Microsoft LINQ book we wrote a whole chapter titled "Extending LINQ", and this post is a very good complement to it.

  • LINQ to Regex

    Within two weeks our new Programming Microsoft LINQ should finally be available! Today I read a post about a LINQ to Regex implementation that we would have covered in our book if only we had a time machine.

    When I thought about LINQ implementations in the past, I always underestimated one thing: even if you use LINQ for a mere 1% of its capabilities, sometimes the Intellisense-enabling nature of LINQ is a good enough reason to switch your library interface to a LINQ model. In reality, most of it is the magic of Intellisense + Extension Methods. Even if I don't use Regex very often, LINQ to Regex illustrated an illuminating concept to me.

  • LINQ adopted by Microsoft Robotics SDK

    The new CTP of Microsoft Robotics Developer Studio implements a LINQ query syntax to define Data Contract Filters on DSS (Decentralized Software Services). I don't have real experience with the Robotics SDK, but I looked at the documentation for the LINQ implementation and this is a good example of the use of LINQ in an environment unrelated to relational databases.

    Looking at these examples, I thought that an interesting LINQ application is the definition of filters on subscription services, just as the Robotics SDK does. This means acquiring data that changes in real time (OK, soft real time...). The flexibility offered by IQueryable is great.

  • LINQPad is a very good tool

    I recently used LINQPad and I have to say that it is a very, very good tool. You can use it to test your LINQ queries in a very interactive way. If you want to test LINQ to SQL, it automatically generates the necessary DataContext class and gives you a test environment in which to execute your LINQ queries.

    A sorely missed feature is autocompletion, but a future release might include it. Even without it, using LINQPad is already a benefit for your productivity.

    If you teach classes or speak at conferences, this is absolutely a must-have tool.
