The Object Model
LINQ-to-SQL is all about its object model. Through the object model you define the model of data you'll work with. The object model contains classes that may contain both data and behaviorpersistence, logic, and validation. The Visual Studio 2008 wizard is the easiest starting point for building such an object model. However, it is ultimately a collection of classes that can be written manually. Suppose you intend to connect to the standard Northwind database and work with Customers and Orders tables. You end up with a Customer class as in Listing One.
The LINQ-to-SQL object model is essentially a table-centric object model with a varying level of abstraction. In the default source code for the various classes, the abstraction over the physical structure of the tables is null. As you can see in Listing One, Customer is a faithful representation of a record in the Customers table. However, the class is decorated with a few attributes and is marked as partial.
The class is a plain class that just exposes a few properties and implements a few interfaces. More precisely, it is a plain class that operates as a table record class. Properties and the class itself are mapped to columns of a particular table through attributes. Attributes contain metadata that bind a particular property to a particular column and the whole class to a table record. The attribute contains information about name and type of the column, plus additional information such as primary key, nullability, and constraints:
[Column(Storage="_CustomerID", DbType="NChar(5) NOT NULL", CanBeNull=false, IsPrimaryKey=true)]
The Column attribute dictates that the property has to be serialized back to the database table when an instance of the class is made persistent. Likewise, the property has to be initialized with persistent data when an instance of the class is built from a data query.
[Table(Name="dbo.Customers")] public partial class Customer : INotifyPropertyChanging, INotifyPropertyChanged { [Column(Storage="_CustomerID", DbType="NChar(5) NOT NULL", CanBeNull=false, IsPrimaryKey=true)] public string CustomerID { get { return this._CustomerID; } set { if ((this._CustomerID != value)) { this.OnCustomerIDChanging(value); this.SendPropertyChanging(); this._CustomerID = value; this.SendPropertyChanged("CustomerID"); this.OnCustomerIDChanged(); } } } : [Association(Name="Customer_Order", Storage="_Orders", OtherKey="CustomerID")] public EntitySet<Order> Orders { get { return this._Orders; } set { this._Orders.Assign(value); } } }
Any class in the LINQ-to-SQL object model doesn't show off any Save or Load methods for I/O operations. To access the results of a query, you just write a LINQ query:
NorthwindDataContext context = new NorthwindDataContext(); var data = from customer in context.Customers where customer.Country == "USA" select customer;
The data context object is the entry point in the LINQ-to-SQL engine. The class is automatically generated by the Visual Studio 2008 wizard as a specialization of a system-provided DataContext class. The data context is the means through which developers can run queries and updates against the database.
Apparently, the preceding query targets an in-memory object model and has no relationship to the T-SQL language and SQL Server. Some T-SQL code, however, is generated under-the-hood and executed against the database connection associated with the data context. The aforementioned LINQ code evaluates to the following T-SQL:
SELECT * FROM customers WHERE country="USA"
Likewise, a more sophisticated T-SQL code is run when some program goes through the following LINQ code:
var data = from customer in context.Customers where customer.Orders.Count > 10 select customer;
There's always a lot of T-SQL code behind any LINQ query that targets a LINQ-to-SQL collection. Here's the T-SQL code for the preceding query:
SELECT [t0].[CustomerID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax] FROM [dbo].[Customers] AS [t0] WHERE (( SELECT COUNT(*) FROM [dbo].[Orders] AS [t1] WHERE [t1].[CustomerID] = [t0].[CustomerID] )) > @p0
The LINQ-to-SQL engine buried in the folds of the DataContext object, translates the LINQ query into a T-SQL statement. However, the statement is not immediately run. It executes only when the result set is accessed for reading, enumerated, or simply copied into a memory collection. The effective T-SQL code can be tracked either using the SQL Profiler tool or the Visual Studio 2008 debugger; see Figure 1.
Writing queries using the LINQ syntax is much easier and faster for every developer. Moreover, the approach behind LINQ-to-SQL takes engineers towards an object-oriented vision of the data that makes it inherently simpler to manage complex scenarios.
The primary reason that stopped engineers from widely adopting object-oriented methodologies to design the data model of their applications has been the lack of effective (and especially quick and reliable) mechanisms to make the object model persistent down to a relational database. Object-oriented database-management systems, for a variety of reasons, have yet to capture the hearts of architects and project managers. As a result, an object/relational mismatch has been before our eyes for years. LINQ-to-SQL attempts to provide a simplified solution that, compared to a full O/RM tool, is lightweight, out-of-the-box, fully integrated in Visual Studio 2008, and limited to SQL Server.
Through the DataContext class, the LINQ-to-SQL object model also supports a number of interesting features typical of a full-fledged O/RM toolincluding lazy loading, identity and transaction management, and optimistic concurrency. And, of course, the LINQ-to-SQL object model fully supports updates. Updates are not conducted through a direct database operation, but also pass through the DataContext class. You enter changes to the object model and when you're done, you tell the data context to make changes persistent to the data storage, as in Listing Two.
NorthwindDataContext db = new NorthwindDataContext(); Product p1 = db.Products.Single(p => p.ProductName == "X"); Product p2 = db.Products.Single(p => p.ProductName == "Y"); Order order1 = new Order(); order1.OrderDate = DateTime.Now; order1.RequiredDate = order1.OrderDate.AddDays(7); OrderDetail item1 = new OrderDetail(); item1.ProductID = p1.ProductID; item1.Quantity = 4; order1.OrderDetails.Add(item1); OrderDetail item2 = new OrderDetail(); item2.ProductID = p2.ProductID; item2.Quantity = 1; order1.OrderDetails.Add(item2); Customer cust = db.Customers.Single(c => c.CustomerID == "ALFKI"); cust.Orders.Add(order1); db.SubmitChanges();
LINQ-to-SQL and SQL Server Development
LINQ-to-SQL has the power to autonomously generate good T-SQL statements. Does this mean that LINQ-to-SQL can ultimately replace SQL Server experts in any organization? What about the quality of the T-SQL code being generated? Is there any room left for manual improvement and fine-tuning?
LINQ-to-SQL and O/RM tools in general are not designed to be in contraposition to SQL Server experts. They are just tools in the hands of database experts. However, an O/RM is a high-level tool that other professionals can use as effectively, even when they lack specific SQL Server advanced skills.
LINQ-to-SQL is designed to produce T-SQL code that is the best possible in the widest possible number of situations. And to get that code written down you don't have to exercise specific SQL Server skills. This doesn't mean that LINQ-to-SQL will always employ the perfect T-SQL code for any situation. But it will in a surprisingly large number of situations. The role of a SQL Server expert is just detecting whether there's room for improvement and introducing it.