Wednesday, March 19, 2008

Working with DataTables

I recently had to write a method that would take a single row from a DataTable and expand that entry based on a start and end date and then determine which rows intersected with a range of dates.

The process to accomplish this seemed fairly straight forward:

  • Clone the original table.
  • Populate the new table with the correct number of entries.
  • Add the new rows back that fall within the range of dates.

 

Original Item:

ItemId StartDate EndDate Time
12345 3/5/2008 3/8/2008 8:00 AM

 

Expanded Items:

ItemId StartDate EndDate Time
12345 3/5/2008 3/8/2008 8:00 AM
12345 3/6/2008 3/8/3008 8:00 AM
12345 3/7/2008 3/8/2008 8:00 AM
12345 3/8/2008 3/8/2008 8:00 AM

 

Expected Results for Range of 3/6/2008 to 3/8/2008

ItemId StartDate EndDate Time
12345 3/6/2008 3/8/3008 8:00 AM
12345 3/7/2008 3/8/2008 8:00 AM
12345 3/8/2008 3/8/2008 8:00 AM

 

When I originally started out with creating this method my thought was to create only the rows that would be needed. So I began writing a bunch of branching logic that would compare the start date and the range start, compare the range end and end date, etc.; trying to determine how many days needed to be created and what day to start with. It quickly became very complicated and difficult for me to determine what I was doing. So I went back to the drawing board and figured out that with the help of the DataTable and DataRow classes in .NET that I could just expand the entire original item and then just select the rows where that intersected with my date range.

 

Code

Here is version of the method that I ended up with, I basically just removed the business logic that was specific to our application.

 

        public DataTable ExpandRows(DataTable original, DateTime rangeStart, DateTime rangeEnd)

        {

            DataTable expanded = original.Clone();

 

            //Assumes only one row for demo purposes

            DateTime itemStart = DateTime.Parse(original.Rows[0]["StartDate"].ToString());

            DateTime itemEnd = DateTime.Parse(original.Rows[0]["EndDate"].ToString());

 

            int numDays = itemEnd.Subtract(itemStart).Days;

            for (int i = 0; i <= numDays; i++)

            {

                DataRow newRow = expanded.NewRow();

                newRow.ItemArray = original.Rows[0].ItemArray;

                newRow["StartDate"] = itemStart.AddDays(i);

                expanded.Rows.Add(newRow);

            }

 

            //select the rows that fall within the specified range of dates.

            DataRow[] addRows =

                expanded.Select(

                    string.Format("StartDate >= '{0}' AND StartDate <= '{1}'", rangeStart.ToShortDateString(),

                                  rangeEnd.ToShortDateString()));

 

            //Remove the original row,since we do not know if it falls within range.

            original.Rows.Remove(original.Rows[0]);

 

            //add expanded rows that fall within the range

            int numRowsToAdd = addRows.Length;

            for (int j = 0; j < numRowsToAdd; j++)

            {

                original.ImportRow(addRows[j]);

            }

 

            return original;

        }

Limiting Results

Because I was using this method on a search page and I wanted to limit the results, because if you have an item with a large start/end date span and as well as a large range start/end date span there cold be hundreds or thousands of rows in the table. To limit the results I used a technique that I found in the ASP.NET Forums that shows how to add a unique id to the table and only return the a specified number of rows. So here is the final code with the limit the number of rows. Below is the final code:

        public DataTable ExpandRows(DataTable original, DateTime rangeStart, DateTime rangeEnd, int maxRowCount)

        {

            DataTable expanded = original.Clone();

 

            //Assumes only one row for demo purposes

            DateTime itemStart = DateTime.Parse(original.Rows[0]["StartDate"].ToString());

            DateTime itemEnd = DateTime.Parse(original.Rows[0]["EndDate"].ToString());

 

            int numDays = itemEnd.Subtract(itemStart).Days;

            for (int i = 0; i <= numDays; i++)

            {

                DataRow newRow = expanded.NewRow();

                newRow.ItemArray = original.Rows[0].ItemArray;

                newRow["StartDate"] = itemStart.AddDays(i);

                expanded.Rows.Add(newRow);

            }

 

            //select the rows that fall within the specified range of dates.

            DataRow[] addRows =

                expanded.Select(

                    string.Format("StartDate >= '{0}' AND StartDate <= '{1}'", rangeStart.ToShortDateString(),

                                  rangeEnd.ToShortDateString()));

 

            //Remove the original row,since we do not know if it falls within range.

            original.Rows.Remove(original.Rows[0]);

 

            //add expanded rows that fall within the range

            int numRowsToAdd = addRows.Length;

            for (int j = 0; j < numRowsToAdd; j++)

            {

                original.ImportRow(addRows[j]);

            }

 

 

            if (maxRowCount > 0)

            {

                if (original.Rows.Count > maxRowCount)

                {

                    DataTable autoIds = new DataTable();

                    DataColumn column = new DataColumn();

                    column.DataType = typeof(Int32);

                    column.AutoIncrement = true;

                    column.AutoIncrementSeed = 1;

                    column.AutoIncrementStep = 1;

                    column.ColumnName = "AutoRowId";

                    autoIds.Columns.Add(column);

 

                    autoIds.Merge(original, true, MissingSchemaAction.AddWithKey);

 

                    DataTable truncated = original.Clone();

                    DataRow[] limitedRows = autoIds.Select(string.Format("AutoRowId<={0}", maxRowCount));

 

                    for (int z = 0; z < maxRowCount; z++)

                    {

                        truncated.ImportRow(limitedRows[z]);

                    }

 

                    return truncated;

                }

            }

 

            return original;           

        }

 

 

Lessons Learned

Here are some of the things that I learned along the way.

  • Sometimes you need to step back and reassess the path you have initially chosen, it may not always be the best one. And in this case I came up with something that ended up being much simpler and more efficient.
  • The importance of Unit Tests. I have been experimenting with using Test Driven Development (TDD) recently. So I was decided to switch my approach I was able to feel confident that the code was still working as expected, because I had immediate feedback as I made changes.
  • You cannot directly insert a datarow from one table into another. If you try to do this you will get the error:  "This row already belongs to another table". Instead, use ImportRow to copy the row into another DataTable.
  • I was using a DataView to do some sorting on the DataTable like the following,

                DataView dataView = new DataView(original, string.Empty, "StartDate ASC", DataViewRowState.OriginalRows);

However, because I was removing the original rows and adding new ones, the table was missing the original rows and the newly added ones; Because as the name states, it was looking at the original rows in the table... I needed to use the setting DataViewRowState.CurrentRows to display the rows as they currently existed in the table.

1 comment:

Anonymous said...

You're about the first person _ever_ to write something halfway decent about the topic of querying DataTables.
Kudos, you just made my week!