Begin automatic row versioning on an existing table (Temporal Tables part 2)

In part 1 I showed in a video how to create a new temporal table for row versioning in SQL Server. In this post, I will show how to convert an existing table to a system-versioned (temporal) table.

CREATE TABLE dbo.PriceList(
	ProductID INT NOT NULL CONSTRAINT PK_PriceList PRIMARY KEY CLUSTERED,
	ListPrice MONEY NOT NULL
);

INSERT INTO dbo.PriceList (ProductID, ListPrice)
VALUES
	(1,10),
	(2,11),
	(3,12);

We start with this small table, the same one we used in part 1, except that this time it already contains data. The same principles apply to existing tables: we need to add the two columns for start and end dates.

So we could try this.

ALTER TABLE dbo.PriceList ADD 
	RowStart DATETIME2(7) GENERATED ALWAYS AS ROW START,
	RowEnd DATETIME2(7) GENERATED ALWAYS AS ROW END,
	PERIOD FOR SYSTEM_TIME(RowStart, RowEnd);

But that gives us this result

Msg 4901, Level 16, State 1, Line 12
ALTER TABLE only allows columns to be added that can contain nulls, or have a DEFAULT definition specified, or the column being added is an identity or timestamp column, or alternatively if none of the previous conditions are satisfied the table must be empty to allow addition of this column. Column 'RowStart' cannot be added to non-empty table 'PriceList' because it does not satisfy these conditions.

We can solve this in different ways. We could add the columns as normal columns, set values for them and then decorate them with “Generated always”. Instead, I’m going to add two default constraints to the columns. RowStart can be set to whatever you find suitable. Current date and time perhaps? I’m going with 1900-01-01, to indicate that the actual start date is unknown.

ALTER TABLE dbo.PriceList ADD 
	RowStart DATETIME2(7) GENERATED ALWAYS AS ROW START CONSTRAINT DF_PriceList_RowStart DEFAULT '1900-01-01',
	RowEnd DATETIME2(7) GENERATED ALWAYS AS ROW END CONSTRAINT DF_PriceList_RowEnd DEFAULT '9999-12-31 23:59:59.9999999',
	PERIOD FOR SYSTEM_TIME(RowStart, RowEnd);

More important than the value for RowStart is the value for RowEnd. It has to be the maximum value available for the data type. Since I’m using datetime2 with the highest resolution, I need to specify the time with seven decimals (three for milliseconds, three for microseconds and one for 100-nanosecond units).
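For comparison, the first alternative mentioned above (ordinary columns first, period declared afterwards) could look roughly like the sketch below. It would be run instead of the ALTER TABLE statement with default constraints, not in addition to it. GO is the batch separator used by SSMS and sqlcmd; it is assumed here because the UPDATE cannot be compiled until the new columns exist.

```sql
-- Alternative sketch: add ordinary, nullable columns first.
ALTER TABLE dbo.PriceList ADD
	RowStart DATETIME2(7) NULL,
	RowEnd DATETIME2(7) NULL;
GO

-- Give the existing rows their period values.
UPDATE dbo.PriceList
SET RowStart = '1900-01-01',
	RowEnd = '9999-12-31 23:59:59.9999999';

-- Period columns have to be NOT NULL.
ALTER TABLE dbo.PriceList ALTER COLUMN RowStart DATETIME2(7) NOT NULL;
ALTER TABLE dbo.PriceList ALTER COLUMN RowEnd DATETIME2(7) NOT NULL;

-- Declare the two existing columns as the system time period.
ALTER TABLE dbo.PriceList ADD PERIOD FOR SYSTEM_TIME (RowStart, RowEnd);
```

This takes more statements than the default-constraint approach, but it leaves no constraints to drop afterwards.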

Now I have prepared the table for system versioning. All that’s left to do is to alter the table to make it system versioned.

ALTER TABLE dbo.PriceList SET (SYSTEM_VERSIONING=ON(HISTORY_TABLE=dbo.PriceList_History));
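If you want to confirm that the switch took effect, one option is to look the table up in the sys.tables catalog view, which exposes the temporal type and the id of the linked history table. A small sketch:

```sql
-- Show the temporal type of dbo.PriceList and the name of its history table.
SELECT
	t.name AS TableName,
	t.temporal_type_desc,
	h.name AS HistoryTableName
FROM sys.tables AS t
LEFT JOIN sys.tables AS h
	ON h.object_id = t.history_table_id
WHERE t.object_id = OBJECT_ID(N'dbo.PriceList');
```

For a system-versioned table, temporal_type_desc should read SYSTEM_VERSIONED_TEMPORAL_TABLE.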

And now, if we want to, we can drop the default constraints. They are kind of redundant, since the columns are decorated with Generated Always.

ALTER TABLE dbo.PriceList DROP CONSTRAINT DF_PriceList_RowStart;
ALTER TABLE dbo.PriceList DROP CONSTRAINT DF_PriceList_RowEnd;

And finally, let’s change values in the main table.

UPDATE dbo.PriceList SET ListPrice = 100 WHERE ProductID = 1;

If we now check the main table and the history table, we can see that the automatic row versioning is indeed in effect.

SELECT * FROM dbo.PriceList WHERE ProductID = 1;
SELECT * FROM dbo.PriceList_History WHERE ProductID = 1;
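As a small preview of the FOR SYSTEM_TIME syntax, the two queries above can also be combined into one. The sketch below should return both the old and the current version of the row. Keep in mind that the period columns are recorded in UTC.

```sql
-- Current and historical versions of product 1 in one result set.
SELECT ProductID, ListPrice, RowStart, RowEnd
FROM dbo.PriceList FOR SYSTEM_TIME ALL
WHERE ProductID = 1
ORDER BY RowStart;
```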

Done!

I hope you found this post useful. In part 3, we will look at how we can convert existing temporal pattern tables and make them automagically system versioned, temporal tables. That is: Tables where we have data in both a main table and a history table.

Temporal Tables for Row Versioning with Transact-SQL (part 1)

I’m working on a video series about Temporal Tables. In this first video I’m showing how to create a table which has System Versioning switched on. You might have heard the term Temporal Tables. What it means is that System Versioning is switched on for the table, and that all changes to the table are recorded in its history table.

Part 1 of my Temporal Tables video series.

In upcoming videos and blog posts, we’re going to look at some more complex scenarios, such as

  • Add system versioning to an existing table
  • Switch from home-cooked versioning to automatic system versioning
  • Use the FOR SYSTEM_TIME features in Transact-SQL
  • Performance considerations for temporal tables

Do you have any specific use cases for temporal tables that you want me to bring up in this series? Drop a comment here, or on the YouTube video, and we’ll see what we can do.

Video: T-SQL to Generate Time Slots. FAST, with a Tally Table

In my previous post, I demonstrated how we can use a Tally Table, or a numbers table, to generate all possible combinations of four letter codes from the letters A-Z.

Today, I have another example of the usage of Tally Tables. I have used this approach many times, to generate a calendar dimension for a Data Warehouse. Or more often, when I need to do reporting with fixed periods (hourly, daily, weekly etc) but there’s no underlying data for some of the periods.

In the video, I show how the technique for generating variable-length time slots works.

As in my previous post, I have finalised the code and made a function that you can use. It’s an inline table valued function, so you can join or cross apply to it without having to worry too much about performance issues with it. Having said that, cardinality estimation may or may not be a performance issue for you if you use this function to drive e.g. a report, because SQL Server might make some funky assumptions about the number of rows returned from the query. But let’s first look at the code for the actual function.

CREATE FUNCTION dbo.GenerateTimeSlots(@Start DATETIME, @End DATETIME, @IntervalLengthSeconds INT)
RETURNS TABLE AS
RETURN(
-- ten rows; cross joined six times below, this gives up to 1,000,000 rows
WITH ten AS(
	SELECT n FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(n)
), millions AS (
	-- number the rows and keep one row per whole interval between @Start and @End
	SELECT 
	TOP(DATEDIFF(SECOND,@Start,@End)/@IntervalLengthSeconds)
	ROW_NUMBER() OVER(ORDER BY (SELECT 'no specific order')) AS n 
	FROM ten t1
	CROSS JOIN ten t2
	CROSS JOIN ten t3 
	CROSS JOIN ten t4 
	CROSS JOIN ten t5 
	CROSS JOIN ten t6
	ORDER BY n
)SELECT 
	-- slot number 1 starts at @Start, slot number 2 one interval later, and so on
	DATEADD(SECOND,@IntervalLengthSeconds * (n-1),@Start) AS TimeSlotStart, n AS TimeSlotNumber
FROM millions);
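To illustrate the reporting scenario mentioned earlier (fixed periods where some periods have no underlying data), here is a rough sketch of how the function could drive an hourly report. The table dbo.Sales and its columns SaleTime and Amount are hypothetical and only there for illustration.

```sql
-- One row per hour of 2020-03-11, including hours without any sales.
-- dbo.Sales (SaleTime DATETIME, Amount MONEY) is a hypothetical table.
SELECT
	ts.TimeSlotStart,
	COUNT(s.SaleTime) AS SalesCount,
	COALESCE(SUM(s.Amount), 0) AS TotalAmount
FROM dbo.GenerateTimeSlots('2020-03-11', '2020-03-12', 3600) AS ts
LEFT JOIN dbo.Sales AS s
	ON s.SaleTime >= ts.TimeSlotStart
	AND s.SaleTime < DATEADD(SECOND, 3600, ts.TimeSlotStart)
GROUP BY ts.TimeSlotStart
ORDER BY ts.TimeSlotStart;
```

The LEFT JOIN is what keeps the empty hours in the result; with an inner join they would disappear.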

Depending on how you call this function, you may or may not suffer from bad cardinality estimation. Let’s start with an example where SQL Server estimates cardinality properly:

SELECT * FROM dbo.GenerateTimeSlots('2020-03-11',CURRENT_TIMESTAMP, 3600);

In the above call, SQL Server estimates 13255 rows to be returned from the query. Which turns out to be exactly right (depending on when you run it of course, since I’m using CURRENT_TIMESTAMP for the end-date).

Now, try calling the function with these lines of code.

DECLARE @IntervalLengthSeconds INT=3600;
SELECT * FROM dbo.GenerateTimeSlots('2020-03-11',CURRENT_TIMESTAMP,@IntervalLengthSeconds)

This time, SQL Server estimates 100 rows to be returned from the query. Which of course isn’t correct. It will still return 13255 rows. This is because SQL Server sees a variable for the @IntervalLengthSeconds parameter and doesn’t know the value of the parameter.

If you WILL use a variable for any of the parameters, and you know how many time slots you _most often_ will return from the query, you can make an adjustment to add your own estimation using the query hint OPTIMIZE FOR. Like this:

DECLARE @IntervalLengthSeconds INT=3600;
SELECT * FROM dbo.GenerateTimeSlots('2020-03-11',CURRENT_TIMESTAMP,@IntervalLengthSeconds)
OPTION(OPTIMIZE FOR(@IntervalLengthSeconds=3600));

In the above example, this of course looks a bit stupid, because I set the variable value to a constant and then optimize for the variable to have that value. It would be much easier to just call the function using that constant. But you might pull up that value from a table in a previous statement, and in that case, you can use this technique to force a certain value for the cardinality estimation.
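A sketch of that scenario could look like this. The settings table dbo.ReportSettings and its columns are hypothetical; the point is only that the value is unknown when the statement is compiled, and the hint supplies a typical value to estimate with.

```sql
-- dbo.ReportSettings is a hypothetical table holding the interval length per report.
DECLARE @IntervalLengthSeconds INT;

SELECT @IntervalLengthSeconds = IntervalLengthSeconds
FROM dbo.ReportSettings
WHERE ReportName = 'HourlySales';

-- The optimizer estimates as if the interval were 3600 seconds,
-- whatever value the variable actually holds at run time.
SELECT *
FROM dbo.GenerateTimeSlots('2020-03-11', CURRENT_TIMESTAMP, @IntervalLengthSeconds)
OPTION (OPTIMIZE FOR (@IntervalLengthSeconds = 3600));
```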

Happy playing with Tally Tables!

Video and Stored Procedure: Generate N-length codes with T-SQL and a Tally Table

This is my first attempt at short Transact-SQL videos. I was answering a question on a SQL Server related Facebook group. The question was about generating four-letter codes containing all possible combinations of letters A-Z. There are 26 letters in the English alphabet, meaning the T-SQL code should generate 26^4 rows, or 456,976 rows.

I immediately thought this is a great fit for a Tally Table solution (Read more in this article by Jeff Moden on SQL Server Central: The “Numbers” or “Tally” Table: What it is and how it replaces a loop – SQLServerCentral). And also that it would be a nice fit for a quick demo video.

So first of all, here’s the video.

But I worked a bit further on the solution, and created a stored procedure which takes parameters, so that it can generate codes of any length and optionally persist the data in a table instead of returning the rows to the caller.

What I’ve done is generate a string with SQL-code, based on the parameters, and then execute the SQL-code with the system stored procedure sys.sp_executesql.

CREATE OR ALTER PROC dbo.GenerateLetterStrings
(
    @LetterCount TINYINT,
    @ResultTable NVARCHAR(128) = NULL
)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @sql NVARCHAR(MAX)
        = N'
WITH alphabet AS (
	SELECT CHAR(ASCII(''A'')+ ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) -1) AS n 
	FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(n)
)SELECT 
	<<selectlist>> AS word
	<<intoclause>>
FROM alphabet AS t0
<<crossjoin>>
'   ;
    DECLARE @SelectList NVARCHAR(MAX) = N't0.n',
            @CrossJoin NVARCHAR(MAX) = N'';
    WITH eight
    AS (SELECT n
        FROM
        (
            VALUES
                (1),
                (1),
                (1),
                (1),
                (1),
                (1),
                (1),
                (1)
        ) t (n) ),
         twofiftysix
    AS (SELECT TOP (@LetterCount - 1)
               ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
        FROM eight
            CROSS JOIN eight e2
        ORDER BY 1)
    SELECT @CrossJoin = @CrossJoin + CONCAT('CROSS JOIN alphabet as t', n, '
		'),
           @SelectList = @SelectList + CONCAT('+t', n, '.n')
    FROM twofiftysix;
    SET @sql = REPLACE(@sql, N'<<selectlist>>', @SelectList);
    SET @sql = REPLACE(@sql, N'<<crossjoin>>', @CrossJoin);
    SET @sql = REPLACE(@sql, N'<<intoclause>>', COALESCE(N'INTO ' + @ResultTable, N''));
    EXEC sys.sp_executesql @sql;
END;
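A couple of example calls, as a sketch. The result table name dbo.FourLetterCodes is made up for the example. Since the procedure builds a SELECT ... INTO statement, the table must not exist beforehand.

```sql
-- Return all 26^4 = 456,976 four-letter codes to the caller.
EXEC dbo.GenerateLetterStrings @LetterCount = 4;

-- Persist the codes in a new table instead (dbo.FourLetterCodes is a made-up name).
EXEC dbo.GenerateLetterStrings @LetterCount = 4, @ResultTable = N'dbo.FourLetterCodes';

SELECT COUNT(*) AS CodeCount FROM dbo.FourLetterCodes;
```

Note that @ResultTable is concatenated straight into the dynamic SQL, so it should only ever be given trusted values.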

Duplicate key in sysclsobjs using DBCC CLONEDATABASE

This blog post is about an error message I got the other day when using DBCC CLONEDATABASE in a T-sql-script. But first some background to DBCC CLONEDATABASE.

I was pretty excited about the DBCC CLONEDATABASE command, which was introduced in SQL Server 2014 SP2 and SQL Server 2016 SP1. It creates a schema-only (that means all the database objects, but no data) copy of a database, keeping all statistics data, so that you can troubleshoot query plans for certain queries without having to copy all the data. Before DBCC CLONEDATABASE (and to be honest probably also afterwards, since DBCC CLONEDATABASE doesn’t cover every need) one had to make a full copy of a database to get the statistics data along. That’s usually copied to a test box. If the test box is identical to your production box, you’re almost fine. But on your test box, you don’t have the cached execution plans from the production box. Therefore, you might end up with very different query plans on your test box. With DBCC CLONEDATABASE, you get a read-only copy of a database on your production box, and you can use that to tweak your queries and see what new estimated execution plans they get.
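In its simplest form the command takes the name of the source database and the name of the clone to create. A minimal sketch, with made-up database names:

```sql
-- SalesDb and SalesDb_Clone are made-up names for this example.
-- Creates a schema-and-statistics-only copy on the same instance.
DBCC CLONEDATABASE (SalesDb, SalesDb_Clone);
```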


OPTION(USE HINT) – New SQL Server 2016 SP1 feature

Many SQL Server developers and admins found, after upgrading to SQL Server 2014, that some queries started taking much longer than before. The reason is the new cardinality estimation formula which was introduced in SQL Server 2014. Cardinality estimation is done all the time by the SQL Server optimizer. To produce a query plan, the optimizer makes some assumptions about how many rows exist for each condition in the table. In most cases, the new cardinality estimation formula in SQL Server 2014 and onwards gives slightly better estimates and the optimizer therefore produces slightly better plans. In some cases however, mostly when there are predicates on more than one column in a WHERE clause or JOIN clause, the 2014 cardinality estimation is a lot worse than in previous versions of SQL Server.
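As a sketch of what the feature in the title looks like in use: the hint below asks the optimizer to use the older cardinality estimation for one query only. The table dbo.Orders and its columns are hypothetical, chosen to show the multi-column predicate case described above.

```sql
-- dbo.Orders and its columns are hypothetical.
-- Only this statement is compiled with the pre-2014 cardinality estimation.
SELECT o.OrderID, o.OrderDate
FROM dbo.Orders AS o
WHERE o.CustomerID = 42
	AND o.OrderStatus = 'Shipped'
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
```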


Archiving with the OUTPUT clause

Most database developers have been faced with the task of archiving old data. It could look something like this:

CREATE TABLE dbo.Cars(
  CarID int identity(1,1) PRIMARY KEY,
  BrandName varchar(100),
  ModelName varchar(100),
  releaseYear smallint
);

CREATE TABLE dbo.Cars_Archive(
  CarID int,
  BrandName varchar(100),
  ModelName varchar(100),
  releaseYear smallint,
  ArchivedDateTime datetime DEFAULT CURRENT_TIMESTAMP,
  CONSTRAINT PK_Cars_Archive PRIMARY KEY(CarID, ArchivedDateTime)
)

And updating a row would often require a stored procedure and some explicit transactions.
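As a rough sketch of the technique named in the title (not necessarily the exact code the full post arrives at), an UPDATE can copy the previous version of a row into the archive table in the same statement by using OUTPUT ... INTO. The new model name and the CarID value are made up for the example.

```sql
-- Update one car and archive its previous values in a single statement.
-- ArchivedDateTime is filled in by the column default on dbo.Cars_Archive.
UPDATE dbo.Cars
SET ModelName = 'V70'
OUTPUT deleted.CarID, deleted.BrandName, deleted.ModelName, deleted.releaseYear
INTO dbo.Cars_Archive (CarID, BrandName, ModelName, releaseYear)
WHERE CarID = 1;
```

Because this is a single statement, the update and the archive insert succeed or fail together without an explicit transaction.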


Generate time slots

Here’s an Inline Table Valued Function (TVF) for generating time slots from a start date to an end date, given a certain length for each slot, specified in minutes.

This would be useful for many applications, like scheduling systems, sales statistics broken down into certain slices of time etc. The function does have some limitations, e.g. there can’t be more than 100,000 minutes between start and end time. This is easily fixed by just adding another CROSS JOIN to CTE2, or by changing the DATEADD functions to use hour instead of minute if that fits your purpose.
