Microsoft LINQ Books

Programming Microsoft LINQ & Introducing Microsoft LINQ

Marco Russo

  • LINQ to SQL vs LINQ to Entities - decisions from ADO.NET team

    To make a long story short: the ADO.NET team is now responsible for the ADO.NET Entity Framework (including LINQ to Entities) and for LINQ to SQL (the latter was originally owned by a separate team, tied to the C# compiler).

    There is an evident overlap between LINQ to SQL and LINQ to Entities, and since the first day Microsoft has said that in the long run these two solutions would be merged into a single one.
    Now the roadmap that is emerging is this: Entity Framework will be improved by adding the features needed to cover the scenarios where LINQ to SQL is preferred today over LINQ to Entities and Entity Framework.

    There are a lot of comments - I suggest you start here to get a good recap and pointers to many others.

    My personal opinion is that LINQ to SQL is very good in some scenarios and should not be dropped until a good alternative (in EF?) is available. For example, I use LINQ to SQL to implement nightly processes that are part of ETL solutions. In these cases, I use LINQ to SQL to read data (especially configuration data, but sometimes also source data) and use the SqlBulkCopy API to write data into destination tables. Having everything necessary in a single executable file, without external dependencies, is a big advantage for deployment (a single file to copy). Today LINQ to Entities would be slower, would involve more files and would require .NET 3.5 SP1 on production servers (the last point would not be a real issue in my case). There are of course other scenarios where something makes LINQ to SQL a better choice than the current version of Entity Framework.

    My hope is that the convergence of two partially overlapping frameworks turns out well, but at the same time it shouldn't penalize the current users of the "losing" side. The convergence will take several releases of .NET, and I hope that in the meantime the LINQ to SQL engine will evolve enough to keep its current position as a "light LINQ-oriented DAL replacement for SQL Server".

  • Book signing @ PDC 08

    If you are at PDC 2008, I'll be at the book signing for Programming Microsoft LINQ at the bookstore on Tuesday 28, during the coffee break between 3:00 and 3:30 PM.
    Paolo and I will be happy to meet you and hear your direct feedback about our LINQ book.

  • Use IEnumerable as a source for SqlBulkCopy

    Today I needed to pass an IEnumerable<T> to the SqlBulkCopy class as a source, instead of an IDataReader. Before writing something that someone else might have already written, I did some searching and found this interesting post that solves exactly this issue. The post also links to the source code. Also take a look at the performance optimization for the getter described in a comment on the post.
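
    The linked post wraps the sequence in an IDataReader. As a simpler but more memory-hungry alternative (a different technique from the one in the post), the sequence can be materialized into a DataTable, which SqlBulkCopy.WriteToServer also accepts. The sketch below is only an illustration: the EnumerableToDataTable class and its ToDataTable method are hypothetical names, and it assumes that the public readable properties of T map one-to-one to the destination columns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

public static class EnumerableToDataTable
{
    // Hypothetical helper: copies the public readable instance properties
    // of each item into a row of a new DataTable.
    public static DataTable ToDataTable<T>(IEnumerable<T> source)
    {
        // Skip write-only properties and indexers, which cannot be read as columns.
        PropertyInfo[] props = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();

        var table = new DataTable(typeof(T).Name);
        foreach (PropertyInfo p in props)
        {
            // DataColumn does not accept Nullable<T>, so use the underlying type.
            Type columnType = Nullable.GetUnderlyingType(p.PropertyType) ?? p.PropertyType;
            table.Columns.Add(p.Name, columnType);
        }

        foreach (T item in source)
        {
            var values = new object[props.Length];
            for (int i = 0; i < props.Length; i++)
            {
                // Null property values become DBNull so they are written as NULL.
                values[i] = props[i].GetValue(item, null) ?? DBNull.Value;
            }
            table.Rows.Add(values);
        }
        return table;
    }
}
```

    The resulting table can then be passed to SqlBulkCopy.WriteToServer. Unlike a streaming IDataReader wrapper, this holds every row in memory, so it is reasonable only for sources of moderate size.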

  • Non-boolean LINQ predicates

    Bart De Smet just wrote a long post about LINQ predicates that can be defined without returning a boolean value.

    This is something I partially evaluated while writing the Programming Microsoft LINQ book, but in his post Bart goes very deep into the topic and shows a lot of interesting details and ideas.

  • Active queries LINQ

    Paul Stovell gave a presentation on "Reactive Programming and Bindable LINQ" at TechEd Australia 2008 (unfortunately I was at the antipodes, in Italy, but the topic is really interesting). I didn't know there were projects somewhat similar to Bindable LINQ on CodePlex: Obtics and Continuous LINQ. I really like the idea of defining "live" queries with LINQ.

  • LINQ to SQL and the procedure cache of SQL Server

    I just received an e-mail from Adam Machanic pointing me to this bug (I would call it a performance issue) in the way the LINQ to SQL engine builds its SQL statements.

    The issue: every string passed as a constant in the query is auto-parameterized using the length of the passed string, even when you use a string variable in the LINQ query. If you write something like

    string s = "Wine";
    var query =
            from x in db.Products
            where x.ProductName == s
            select x;

    you will see that a parameter of type NVARCHAR(4) is passed to the generated SQL query. The next execution of the query might have a different value in the s variable, and so a different parameter type might be used: if the length of the string in s changes, the same query is sent to SQL Server, but with a different type in the sp_executesql parameters. For example, an NVARCHAR(5) would be used if s contains "Bread".

    The consequence of this behavior is that you could get non-optimal performance from SQL Server and, more importantly, the procedure cache could fill up with several copies of the same query, differing from each other only in the length of the parameter type.
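
    To make the effect concrete, here is a small simulation. It does not call LINQ to SQL or SQL Server: it only imitates the length-based parameter declaration described above and counts how many distinct statement/declaration pairs a set of values would produce. The class and method names are hypothetical, the statement text is a placeholder, and the fixed size of 40 is an assumed column size.

```csharp
using System;
using System.Collections.Generic;

public static class ParameterCacheDemo
{
    // Imitates the behavior described above: the declared parameter type
    // follows the length of the value being passed.
    public static string LengthBasedDeclaration(string value)
    {
        return "@p0 NVARCHAR(" + value.Length + ")";
    }

    // Alternative for comparison: always declare the same size,
    // regardless of the value (columnSize is an assumption of this sketch).
    public static string FixedDeclaration(string value, int columnSize)
    {
        return "@p0 NVARCHAR(" + columnSize + ")";
    }

    // Counts the distinct (statement text, parameter declaration) pairs,
    // used here as a stand-in for separate procedure cache entries.
    public static int CountDistinctCacheKeys(IEnumerable<string> values,
                                             Func<string, string> declare)
    {
        const string sql = "SELECT [t0].* FROM [dbo].[Products] AS [t0] WHERE [t0].[ProductName] = @p0";
        var keys = new HashSet<string>();
        foreach (string v in values)
        {
            keys.Add(sql + "|" + declare(v));
        }
        return keys.Count;
    }
}
```

    With the values "Wine", "Bread", "Chai" and "Tofu", the length-based declaration yields two distinct keys (lengths 4 and 5), while the fixed declaration yields one.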

    I agree with Adam: this is something to be fixed. But my suspicion is that we will get a "by design" answer once again...

  • IQueryable under the covers

    In the Programming Microsoft LINQ book we dedicated two whole chapters (76 pages) to writing an IQueryable LINQ provider: one is about expression trees and the other covers the several ways to extend LINQ, including writing an IQueryable provider. I know that the subject is complex and probably not necessary for every programmer. However, a good understanding of what happens under the covers of an IQueryable provider is useful for everyone using any flavor of LINQ: when you debug your code, it might help you find issues faster.

    I wrote this introduction just to explain why you should read this post by Bart De Smet, which is undoubtedly shorter than the corresponding chapter of our book and gives you a very good step-by-step introduction to the inner workings of an IQueryable LINQ provider. Then, if you really like this kind of thing, you have another good reason to read the book :-)

  • Important LINQ Changes in .NET 3.5 SP1

    Dinesh Kulkarni wrote an important post about the changes to LINQ introduced by .NET 3.5 SP1, which was released yesterday.

    One of the interesting changes is in the Cast<T> operator; its behavior is better described in this post by Ed Maurer. I think the side effects of this change should be limited, because using an explicit type for the range variable in a query expression (i.e. from int n in numbers select... instead of from n in numbers select...) is not very common. In fact, I don't remember any examples of it in our Programming LINQ book. Watch out for this change if you have used (or will use) this syntax.
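
    As a reminder of why the two are connected: an explicitly typed range variable is compiled into a call to Cast<T> on the source. The sketch below shows the two equivalent forms on an untyped collection. The class and method names are hypothetical. The last method assumes that the change in question is the one where Cast<T> performs only a plain cast, so that a boxed int can no longer be read as a long; that is how Cast<T> behaves on current runtimes.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class ExplicitRangeVariableDemo
{
    // Query syntax with an explicitly typed range variable.
    public static List<int> WithQuerySyntax(IEnumerable numbers)
    {
        return (from int n in numbers select n * 2).ToList();
    }

    // The equivalent method syntax: the explicit type becomes Cast<int>().
    public static List<int> WithCastCall(IEnumerable numbers)
    {
        return numbers.Cast<int>().Select(n => n * 2).ToList();
    }

    // Returns true when reading the items through Cast<long>() throws
    // InvalidCastException (boxed ints cannot be unboxed as long).
    public static bool CastToLongThrows(IEnumerable numbers)
    {
        try
        {
            numbers.Cast<long>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```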

  • Dangerous use of ArrayList in Lambda Expressions

    I have just validated this bug posted on Connect. It seems to be a compiler issue; I'd like to read Microsoft's answer about it.

    However, the general issue is that using an ArrayList in a lambda expression with a collection initializer could be dangerous. There are not many reasons to use an ArrayList in a lambda expression, unless you are refactoring or working with legacy code that cannot be upgraded from ArrayList to generic collections.

  • The adoption of LINQ

    Eric White has written an interesting post titled "Are developers using LINQ?" - there are interesting considerations about the adoption of functional programming too, but the most interesting part for me is the list of comments on the post. A lot of people described the adoption of LINQ in their team or company, and there is a wide variety of comments (good and bad).

    An interesting comment is about the future adoption of F# when it ships, because of its complete support for functional programming (C# 3 is not a complete functional programming language like F# is). I suggest you take a look at this post and its comments, because they give you an idea of what is going on out there.

  • Multiple Results with LINQ to SQL

    I just read a post about getting multiple results with LINQ to SQL without using stored procedures. This technique is interesting when you have multiple queries, each returning a few rows, and you want to save time by skipping some roundtrips between your program and SQL Server. Looking at the post, I immediately thought that it would be interesting to compare this solution with an asynchronous one that executes each query in a different thread. I don't have time to run a benchmark, but comparing the two techniques would be worthwhile.

  • Use of Distinct and OrderBy in LINQ

    A few days ago I found a bug in a program written using LINQ to SQL, caused by habits formed over years of writing SQL. The requirement was something like: get the distinct values of (bla bla bla) sorted alphabetically. An example of the required query with Northwind would be the following one:

    SELECT DISTINCT
            e.LastName
    FROM    Orders o
    LEFT JOIN [Employees] e
            ON e.[EmployeeID] = o.[EmployeeID]
    ORDER BY e.LastName 

    Fundamentally, we are using both a DISTINCT and an ORDER BY clause in SQL.

    Now, if you create a NorthwindDataContext importing the Orders and Employees tables, you can try to write a similar statement in LINQ to SQL. Unfortunately, Distinct is not part of the C# query syntax, and the most intuitive path is to call Distinct at the end of your statement, as in the following query:

    var queryA =
        (from o in db.Orders
         orderby o.Employee.LastName
         select o.Employee.LastName)
         .Distinct();

     

    However, the Distinct operator removes the sort condition defined by the orderby keyword. In fact, the SQL statement sent to the database is the following one:

    SELECT DISTINCT
            [t1].[LastName]
    FROM    [dbo].[Orders] AS [t0]
    LEFT OUTER JOIN [dbo].[Employees] AS [t1]
            ON [t1].[EmployeeID] = [t0].[EmployeeID]

     

    This behavior might appear strange. The problem is that the Distinct operator does not guarantee that it will maintain the original order of values. Applied to LINQ to SQL, this means that a sort constraint can be ignored in a query like queryA.

    The solution is pretty simple: put the OrderBy operator after the Distinct one, as in the following queryB definition:

    var queryB = 
        (from o in db.Orders
         select o.Employee.LastName)
        .Distinct().OrderBy( n => n );
    

     

    This will result in the following SQL statement sent to Northwind:

    SELECT  [t2].[LastName]
    FROM    ( SELECT DISTINCT
                        [t1].[LastName]
              FROM      [dbo].[Orders] AS [t0]
              LEFT OUTER JOIN [dbo].[Employees] AS [t1]
                        ON [t1].[EmployeeID] = [t0].[EmployeeID]
            ) AS [t2]
    ORDER BY [t2].[LastName]

     

    If you remove some syntax redundancy, this is exactly the same query I wrote at the beginning of my post.

    The lesson is: in a SQL query, the position of an operator is not relevant as long as the operators belong to the same SELECT/FROM statement. In LINQ this is not true, and the conversion to SQL can remove LINQ operators when their effect can be discarded by other operators in the same LINQ query.
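
    One way to make the right order hard to get wrong is to wrap it in a small helper. The following is only a sketch: DistinctOrderedBy is a hypothetical extension method, not part of LINQ, and it does nothing more than apply Distinct first and OrderBy last, as queryB does.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class DistinctOrderingExtensions
{
    // Hypothetical helper: removes duplicates first, then sorts,
    // so the sort is always the last operator applied.
    public static IOrderedQueryable<T> DistinctOrderedBy<T, TKey>(
        this IQueryable<T> source,
        Expression<Func<T, TKey>> keySelector)
    {
        return source.Distinct().OrderBy(keySelector);
    }
}
```

    With the query above it could be used as db.Orders.Select(o => o.Employee.LastName).DistinctOrderedBy(n => n). It also works on in-memory data through AsQueryable().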

    Final consideration: initially I thought that the compiler could emit a warning when a query reduction like the one in queryA occurs. Unfortunately, the query reduction is done by the LINQ to SQL provider at execution time and not during compilation. A warning could still be possible, but it's something that I would move to tools like FxCop.

  • LINQ query optimizations

    Look at this excellent blog post written by K. Scott Allen. I completely agree with him: don't try to optimize a LINQ query until you measure its performance and understand it is really a bottleneck that needs to be improved.

    An interesting consideration I had never made before is that you can call the OrderBy extension method after the Select and not before. Yes, with the C# query syntax you are used to putting the select after the orderby, but sometimes it could be better to invert this order (the reasons are already well explained in Scott Allen's post).
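
    For reference, this is what the inverted order looks like in both syntaxes. It is only a sketch on in-memory data: the Product class is a hypothetical entity made up for the example, and in query syntax the into keyword is what allows an orderby after a select.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class SortAfterProjection
{
    // Method syntax: project first, then sort the projected values.
    public static List<string> MethodSyntax(IEnumerable<Product> products)
    {
        return products.Select(p => p.Name).OrderBy(name => name).ToList();
    }

    // Query syntax: "into" continues the query after the select,
    // so the orderby applies to the projected names.
    public static List<string> QuerySyntax(IEnumerable<Product> products)
    {
        var query = from p in products
                    select p.Name into name
                    orderby name
                    select name;
        return query.ToList();
    }
}
```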

    And, of course, consider that performance has to be evaluated in two dimensions: time and space. And, sooner rather than later, a third dimension (parallelism) will gain the same importance.

  • Implement progress reporting and cancellation of LINQ queries

    Samuel Jack wrote two interesting posts discussing possible extension methods for LINQ. One is to implement progress reporting of a LINQ query. The other is to implement a way to cancel a running LINQ query.

    Both implementations are very simple, and they illustrate well how LINQ can be extended and manipulated in a simple way by using extension methods.
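
    To give an idea of how little code such operators need, here is a minimal sketch for LINQ to Objects. It is not Samuel Jack's implementation: the names ReportProgress and StopWhen are hypothetical, and argument validation is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;

public static class QueryMonitoringExtensions
{
    // Hypothetical operator: calls onProgress with the running item count
    // just before each item is handed to the consumer.
    public static IEnumerable<T> ReportProgress<T>(
        this IEnumerable<T> source, Action<int> onProgress)
    {
        int count = 0;
        foreach (T item in source)
        {
            count++;
            onProgress(count);
            yield return item;
        }
    }

    // Hypothetical operator: checks isCancelled before each item and
    // ends the sequence as soon as it returns true.
    public static IEnumerable<T> StopWhen<T>(
        this IEnumerable<T> source, Func<bool> isCancelled)
    {
        foreach (T item in source)
        {
            if (isCancelled())
            {
                yield break;
            }
            yield return item;
        }
    }
}
```

    Both operators are lazy: nothing happens until the query is enumerated, and the callbacks run on the thread that enumerates it. For example, items.StopWhen(() => cancelRequested).ReportProgress(n => Console.WriteLine(n)) prints a running count until the cancelRequested flag (a variable assumed to be set elsewhere) becomes true.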

  • LINQ to SQL and varchar(1) fields

    If you use the Object Relational Designer of LINQ to SQL to create an entity from an existing table that has some VARCHAR(1) fields, you will run into this issue.

    The data member created in C# is char instead of string. If the field is always filled with one char, this works. But if you try to read a row from the table containing an empty string (not a NULL field, but a field of zero characters) you will get this exception:

    String must be exactly one character long.

    This behavior has already been described in this post and in the LINQ forum. But one more warning could be important: this is a latent error that will show itself only at runtime if you don't fix it. Thus, be careful whenever you have some VARCHAR(1) fields in your tables.
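
    The same failure can be reproduced without a database, because a strict string-to-char conversion such as Convert.ToChar rejects any string that is not exactly one character long. The sketch below shows the strict conversion next to a tolerant one. It is only an illustration: the class and method names are hypothetical, and it makes no claim about which conversion LINQ to SQL uses internally.

```csharp
using System;

public static class SingleCharColumn
{
    // Strict conversion: throws FormatException unless the string
    // contains exactly one character (an empty string fails).
    public static char StrictToChar(string value)
    {
        return Convert.ToChar(value);
    }

    // Tolerant conversion: null for a NULL or empty value,
    // otherwise the first character.
    public static char? ToNullableChar(string value)
    {
        return string.IsNullOrEmpty(value) ? (char?)null : value[0];
    }
}
```

    On the designer side, a likely workaround (an assumption, not something verified here) is to change the type of the generated member from char to string, so that empty values can be read without any conversion.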
