How to Detect Corrupted TFS Work Items

One of the problems that you might encounter (but hopefully you will never) when working with the TFS warehouse is the presence of corrupted work items.

Why do I mention the TFS warehouse when I am talking about corrupted TFS work items?

Corrupted work items (and I am talking here about severely corrupted work items, not work items with some invalid field value that do not allow you to save them in Visual Studio or TSWA) do not typically show up in any query you can build to display all work items. In fact, you can even go ahead and use the TFS API to query every single work item in your TFS work item tracking system, and it will still not show those bad work items.

However, corrupted TFS work items affect the TFS warehouse processing. They cannot be processed and added to the TFS data warehouse. When the Work Item Tracking Warehouse Sync job encounters one of those bad boys, it will stop processing at that point. Newer work items created after the bad one are not processed nor reflected in the Tfs_Warehouse relational data warehouse store. That directly affects the Analysis processing which is driven by Tfs_Warehouse in order to update the Tfs_Analysis analysis database. You will start noticing that your reports do not reflect changes made to newer work items.

If you are someone that does not have TFS Reporting enabled, then you are not using the TFS warehouse. In that case, if you do have corrupted TFS work items, you will most likely not notice that issue at all, nor will it affect the normal TFS operations. It is however advised that resolve it rather than keep it dormant until it is time to resolve it (say for instance if you need to utilize reporting).

Work Item Corruption Detection Procedure

The only way to detect work item corruption is by querying the OLTP part of the TFS installation. This is the part that is housed inside the TFS database for the project collection for which you want to detect work item corruption.

To detect corrupted work items, you can follow the procedure outlined below:

  1. Open a new query window in SQL Server Management Studio and select the TFS project collection’s database as the active database for the query. For example, for a default collection, this would typically be the Tfs_DefaultCollection database.
  2. Get the count of records present in the WorkItemsLatest table: SELECT COUNT(*) FROM WorkItemsLatest
  3. Get the count of unique records in the WorkItemsAre table: SELECT COUNT(*) FROM ( SELECT DISTINCT ID FROM WorkItemsAre) AS Grp. You need the unique records since this table has a record for each work item revision. So each work item can have one or more records in this table.
  4. If the counts in steps 2 and 3 above are different then you have work item corruption. When that happens, you will find that the count resulting from querying against the WorkItemsAre table has one or more work items that are not present in the WorkItemsLatest table. You can get the ID’s of those records and delete them from the WorkItemsAre table. This will remove the clogging and allow warehouse processing to continue normally.

Note that this procedure involves having access to the TFS project collection’s database in order to query the records. It also involves deleting records directly from the database. This is not supported. If you do that, it will most likely affect your TFS product supportability by Microsoft.

I state the procedure for informational reasons, especially the detection part. I would suggest that you contact Microsoft’s Product Support Services (PSS) if you run the queries and find that the counts are different. They will be able to assist you further in removing the bad records. You are risking doing more damage to your TFS installation if you elect to delete the records by yourself, especially if you are not a seasoned database developer or DBA.

A Query to Do All the Work

The following SQL statements summarize all the steps involved in identifying the corrupted TFS work item records along with their owners or the folks that created those work items.

 

USE Tfs_DefaultCollection;
 
DECLARE @WorkItemsLatestCount INT;
SELECT @WorkItemsLatestCount = COUNT(*)
FROM WorkItemsLatest;
 
DECLARE @WorkItemsAreCount INT;
SELECT @WorkItemsAreCount = COUNT(*)
FROM 
( SELECT DISTINCT ID
  FROM WorkItemsAre) AS UniqueWorkItems;
 
IF (@WorkItemsLatestCount = @WorkItemsAreCount)
  BEGIN
    PRINT 'There are no corrupted work items.'
  END
ELSE
  BEGIN
 
    DECLARE @CorruptedWorkItemsCount INT;
    SELECT @CorruptedWorkItemsCount = 
      @WorkItemsAreCount - @WorkItemsLatestCount;
    PRINT 'There is/are ' + 
      CONVERT(NVARCHAR, @CorruptedWorkItemsCount) + 
      ' corrupted work item(s):'
 
    -- List the corrupted work items
    SELECT 
        PersonNames.DisplayPart + ' [' + PersonNames.String + ']' 
          AS [Work Item Created By],
        CorruptedWorkItems.*
    FROM WorkItemsAre AS CorruptedWorkItems
    INNER JOIN Constants AS PersonNames
    ON (PersonNames.ConstID = CorruptedWorkItems.PersonId)
    WHERE ID = 
    (
        SELECT UniqueWorkItems.ID
        FROM 
        ( SELECT DISTINCT ID
          FROM WorkItemsAre) AS UniqueWorkItems
        LEFT OUTER JOIN WorkItemsLatest AS LatestWorkItems
        ON (LatestWorkItems.ID = UniqueWorkItems.ID)
        WHERE LatestWorkItems.ID IS NULL
    );
 
  END;
 

Notice that the first line uses the Tfs_DefaultCollection database. If you have a named project collection, use your database’s name instead.

The SQL statements above will either print “There are no corrupted work items” if everything looks good, or “There is/are n corrupted work item(s):” followed by the list of corrupted work items.

Note that the first column in the rows containing the corrupted work item records is the friendly name of the creator of the work item along with their domain name account. For example, if the work item was created by John Doe whose domain account is REDMOMD\JDoe, then the first column for his record will contain “John Doe [REDMOND\JDoe]”.

Deleting corrupted work items can be done – AT YOUR OWN RISK – by issuing the following SQL command for each record:

DELETE FROM WorkItemsAre WHERE ID=999999

Of course, the work item ID field will be different than 999999.

If you chose the record deleting route for any reason, please be smart and make sure you have a backup made before proceeding.

How Do the Work Items Get Corrupted

That is a good question, and I am still not sure of the answer. All I can say is that I ran into that scenario on a TFS installation that involved some manual work done to it by some SQL DBA’s in order to try and optimize indexes (not a good idea, let the TFS jobs do that).

It also possible for records to get orphaned as part of the upgrade process from TFS 2008 to TFS 2010. I am also not sure how that happens, but it might be during the migration of records from the old database, TfsWorkItemTracking, to the new project collection database, Tfs_YourProjectCollectionName.

Final Thoughts

In a future post, I will provide more details on other methods to detect corruption and orphaned records in TFS. But for now, this should be helpful in detecting one of the scenarios involving work item corruption.

Please feel free to send me any feedback or corrections.

Finally, again, I have to state this: Do not attempt to delete the corrupted records by yourself unless you really really really know what you are doing. If at all possible, contact PSS and they can help resolve this.

Published 08-29-2011 5:00 AM by Mohammad Jalloul
Powered by Community Server (Non-Commercial Edition), by Telligent Systems