Contents

Introduction

One of the new features of SQL Server 2008 is support for hierarchical data structures. These data structures can be used to store information pertaining to organization charts, online forums, network topologies, and more. This article seeks to present useful information regarding to the way you can work with hierarchical data in SQL Server.

To compile and run the examples in this article, you will need Visual Studio 2008 and SQL Server 2008.

Fun with SQL

As you may have guessed, we shall be working with data structures that help us basically build trees using SQL Server (as opposed to just ‘lines of data’ that we’re used to). Since we’re working with trees, here are some things we need to know:

  • Where the root of the tree is.

  • Who the node’s parents and children are.

  • The level (depth) of a node in the tree.

  • Ways of traversing our tree.

  • Ways of doing indexing over the tree.

A new UDT

In order to get all of this to work, SQL Server 2008 introduces a new UDT1, hierarchyid (Msdn page). This data type contains the path from the root all the way to the node, and is thus used in lieu of, say, a foreign key, when linking tree structures together. The value that SQL Server writes to hierarchyid is itself unreadable (it’s made using a clever algorithm) but we can use ToString() if we want to get a human-readable representation.

Before we jump into C# (the scary part), let’s play a bit with good old SQL in order to get a feel for what this new feature looks like and what to do with it. Let’s imagine, for example, that we are using SQL Server to keep a hierarchical structure of help topics that can be compiled into a help application2. We begin by defining a somewhat unusual table schema3:

create table HelpItem
(
  Id      hierarchyid primary key clustered,
  Lvl     as Id.GetLevel(),
  ItemId  int unique,
  Name    nvarchar(32) not null,
  Content ntext
)

Okay, that’s probably a lot different to what you’re used to, especially if, like me, your first row of (almost) every table is defined an int identity. Having a hierarchyid as the primary key allows us to do different types of indexing over the table, which we will discuss later. The second column, Lvl, is a computed column that calls on a GetLevel() method of the hierarchyid type (remember it’s a CLR type)4.

Insertion

To appreciate both the Id and Lvl columns, let’s add some data to our table. This has to be done programmatically:

insert into HelpItem
  (Id, ItemId, Name, Content)
values
  (hierarchyid::GetRoot(), 1, 'My Product', 'How to use my product')

If you’ve never met the :: syntax, it’s basically used for calling class-level (static) methods inside SQL Server – instance methods still use the dot notation. In the above insertion, we just specified the hierarchyid of our first entry to be the root of the tree.

To view the data, you must fiddle the query somewhat. Doing a typical select * is useless because the hierarchyid column will have the value of 0x in every row. Don’t ask me why (I have no idea). To get a readable value, we simply use the ToString() method I mentioned earlier, and execute the following query instead:

SELECT [Id].ToString()
      ,[Lvl]
      ,[ItemId]
      ,[Name]
      ,[Content]
  FROM [Hierarchical].[dbo].[HelpItem]
 
// here is the output:
 
Id Lvl ItemId Name       Content
-- --- ------ ---------- ---------------------
/  0   1      My Product How to use my product

As you can see, the textual version of the root Id element is a simple slash (/), and the root node has a level of 0 (zero).

Descendants and Ancestors

We’ve managed to insert the root through some hackery, and we also ended up providing our root element with an ItemId instead of auto-generating one (which can be a problem). Now, to add more elements to the hierarchy, we need to get acquainted with a new method called GetDescendant.

Essentially, GetDescendant generates the hierarchyid of a new element that is a) a descendant of the element you called it on; and b) is located between the two elements provided to GetDescendant as parameters. Of course, either of the parameters can be null, which means that there’s no constraint on where the element can occur.

That said, let’s add a child element called ‘Introduction’ as a child of our root element:

insert into HelpItem
  (Id, ItemId, Name, Content)
values
  ((select Id from HelpItem where ItemId = 1).GetDescendant(null,null),
   2,
   'Introduction',
   'Some general info about our product')

Just to reiterate, what’s happening above is that, for our new element, we chose a descendant root from the parent element. We passed two null values, this making the placement of the new item unconstrained. Turning the hierarchyid value of the new item yields the value /1/. This piece of information is fairly useless per se though, because we are mainly interested in the actual (binary) hierarchyid value.

In addition to being able to get at the descendant of a particular element (which is essential for random insertion), it’s also possible to get an ancestor N levels up by calling GetAncestor(N) on the element you’re interested in. The result is, predictably, a hierarchyid of the parent N steps up the tree, or null if there aren’t that many elements up the tree.

Another method available is IsDescendantOf, which checks whether our hierarchyid is a descendant (direct or indirect) of the hierarchyid provided as parameter.

All right, if you’re like me, you are probably bored with all this theory. Let’s try to make a hierarchical database and bind it to a simple Silverlight app.

Fun with C#… or not

It really amuses me when someone uses the phrase modern ORM framework – if frameworks were modern, we would have support for hierarchyid in both Linq2Sql and Entity Framework. That said, it’s not the case: ORM frameworks do not support our brand new type out of the box.

After playing around with trying to get a Linq2Sql T4 tempalate to play nice with SqlHierarchyId (it worked, but Linq2Sql failed to agree with the idea of an hierarchyid identity column), I decided to let it go at that and work without an ORM5, especially seeing how I only have one very simple table and I’m interested in working with Silverlight which, as you probably know, cannot directly use an O-R mapper anyway.

Building a POCO

Seeing how we aren’t going to get an auto-generated entity class (big deal), we have to make one ourselves. This is even more appropriate if we take into account that a Silverlight app cannot use an ORM layer directly, and we end up interfacing through a web service. Here’s our service’s HelpItem definition:

[DataContract]
public class HelpItem
{
  internal SqlHierarchyId Id;
  [DataMember]
  public int ItemId;
  [DataMember]
  public string Name;
  [DataMember]
  public string Content;
  private List<HelpItem> children;
  [DataMember]
  public List<HelpItem> Children
  {
    get
    {
      return children ?? (children = new List<HelpItem>());
    }
  }
}

This structure is tricky, and requires some explanation. Some of its elements cannot be transmitted over WCF and so are kept private – for example, you cannot use the SqlHierarchyId type in Silverlight anyway, so it doesn’t get its own [DataMember] attribute. On the other hand, in order to transmit the hierarchy we have to condense it by aggregating the items in their respective parents – thus the need for a Children collection (which is lazily created, to boot).

Procuring the data

To get the data from the server, we use good old-fashioned SqlConnection and SqlCommand classes – the way things were done when .NET was just appearing. At the moment, we define just one function, which returns all the elements as an object graph:

[OperationContract]
public HelpItem GetHelpItems()
{
  SqlConnection dbConn = new SqlConnection(
    "Data Source=(local);Initial Catalog=Hierarchical;Integrated Security=True");
  List<HelpItem> results = new List<HelpItem>();
  try
  {
    dbConn.Open();
    SqlCommand cmd = new SqlCommand(
      "select Id.ToString(), ItemId, Name, Content from HelpItem order by Id.GetLevel()",
      dbConn);
    using (var reader = cmd.ExecuteReader())
    {
      while (reader.Read())
      {
        HelpItem i = new HelpItem
        {
          Id = SqlHierarchyId.Parse(reader.GetSqlString(0)),
          ItemId = reader.GetInt32(1),
          Name = reader.GetString(2),
          Content = reader.GetString(3)
        };
        results.Add(i);
        // make sure its parent knows
        foreach (HelpItem parent in results.Where(r => r.Id.Equals(i.Id.GetAncestor(1))))
          parent.Children.Add(i);
      }
    }
  }
  finally
  {
    if (dbConn.State == System.Data.ConnectionState.Open)
      dbConn.Close();
  }
  return results.Count() > 0 ? results[0] : null;
}

The only really ‘smart’ thing we’re doing above is using LINQ and the GetAncestor() method of SqlHierarchyId in order to add each child to its respective parent6. You’ll also note that, even though we return but a single element, this element aggregates everything else, thus yielding the whole object graph.

Downloading and using

Unfortunately, the POCOs we get from the service aren’t very useful on their own – they have weak support for things like editing, so we copy them into corresponding client-side HelpItem structures:

private static void ProcessChildren(HelpItem orig, HelpService.HelpItem curr)
{
  foreach (HelpService.HelpItem hi in curr.Children)
  {
    HelpItem i = new HelpItem { Name = hi.Name, Content = hi.Content };
    orig.Children.Add(i);
    ProcessChildren(i, hi);
  }
}
void hsc_GetHelpItemsCompleted(object sender, GetHelpItemsCompletedEventArgs e)
{
  // convert and assign
  if (e.Result != null)
  {
    HelpItem root = new HelpItem
                      {
                        Name = e.Result.Name,
                        Content = e.Result.Content
                      };
    ProcessChildren(root, e.Result);
    items.Clear();
    items.Add(root);
  }
}

This copying procedure lets us use better collections and more features in the entities – something that will come in handy later. Meanwhile, we can use our acquired elements in the model – and here’s what we get:

On the next iteration, we shall continue by looking at how edited tree items can be posted back to the server.

Notes

  1. ^ A UDT is a user-defined data type. Unlike the primitive server types such as bit and int, a UDT is a CLR type, thus having typical CLR methods such as ToString().
  2. ^ The idea comes from Andrej Tozon’s blog entries which describe how to work with the tree control in Silverlight.
  3. ^ As you might have guessed, UI-driven schema creation or editing isn’t supported by tools such as Management Studio. This is due to the absense of editors (and inability to do, say a stringhierarchyid conversion) as well as lack of support for computed columns.
  4. ^ When we come to interfacing with an ORM, we shall completely get rid of this generated column. It is only shown here as an illustration – in real life, it’s probably easier to simply create generated columns as elements of a partial class that ORM frameworks make.
  5. ^ In case you’re wondering, I looked at EF, Linq2Sql and NHContrib.
  6. ^ One interesting thing to note is the == operator doesn’t work on SqlHierarchyId – at least not how one would expect. Equals() did the trick, though.