31

We use Guid's for primary key, which you know is clustered by default.

When inserting a new row into a table it is inserted at a random page in the table (because Guid's are random). This has a measurable performance impact because the DB will split data pages all the time (fragmentation). But the main reason I what a sequential Guid is because I want new rows to be inserted as the last row in the table... which will help when debugging.

I could make a clustered index on CreateDate, but our DB is auto generated and in development, we need to do something extra to facilitate this. Also CreateDate is not a good candidate for a clustered index.

Back in the day I used Jimmy Nielsons COMB's, but I was wondering if there is something in the .NET framework for this. In SQL 2005 Microsoft introduced newsequentialid() as an alternative to newid(), so I was hoping that they made a .NET equivalent, because we generate the ID in the code.

PS: Please don't start discussing if this is right or wrong, because GUID's should be unique etc.

  • following @edg's comments, I really wonder why keeping records "ordered" makes sense or is of any interest to you. Aren't you trying here to solve an artificial problem or constraint? – Philippe Grondier Oct 18 '08 at 23:05
  • 4
    I'm pretty sure that the second paragraph (Third if you count "Hi") established exactly why he wants them in order. I wish people would stick to answering the actual question. – Mel Jan 6 '09 at 18:42
  • 1
    +1 for the last sentence. – nawfal May 8 '13 at 3:38

11 Answers 11

24

It should be possible to create a sequential GUID in c# or vb.net using an API call to UuidCreateSequential. The API declaration (C#) below has been taken from Pinvoke.net where you can also find a full example of how to call the function.

[DllImport("rpcrt4.dll", SetLastError=true)]
static extern int UuidCreateSequential(out Guid guid);

The MSDN article related to the UuidCreateSequential function can be found here which includes the prerequisites for use.

  • 2
    I marked this as the correct answer. This is not what I was after though. I wanted a COMB, and I orginally thougt that SequentialId's was the same as comb. That is... the Question my description is not aligned. But it seams that this is the correct answer to the question. – Thomas Jespersen Aug 14 '10 at 11:48
  • 1
    Beware that this sequential guid is not sequential in SQL Server as it uses another ordering algorithm, so using COMBs is a better option – Marc Climent Nov 10 '17 at 12:26
18

Update 2018: Also check my other answer

This is how NHibernate generate sequantial IDs:

NHibernate.Id.GuidCombGenerator

/// <summary>
/// Generate a new <see cref="Guid"/> using the comb algorithm.
/// </summary>
private Guid GenerateComb()
{
    byte[] guidArray = Guid.NewGuid().ToByteArray();

    DateTime baseDate = new DateTime(1900, 1, 1);
    DateTime now = DateTime.Now;

    // Get the days and milliseconds which will be used to build the byte string 
    TimeSpan days = new TimeSpan(now.Ticks - baseDate.Ticks);
    TimeSpan msecs = now.TimeOfDay;

    // Convert to a byte array 
    // Note that SQL Server is accurate to 1/300th of a millisecond so we divide by 3.333333 
    byte[] daysArray = BitConverter.GetBytes(days.Days);
    byte[] msecsArray = BitConverter.GetBytes((long) (msecs.TotalMilliseconds / 3.333333));

    // Reverse the bytes to match SQL Servers ordering 
    Array.Reverse(daysArray);
    Array.Reverse(msecsArray);

    // Copy the bytes into the guid 
    Array.Copy(daysArray, daysArray.Length - 2, guidArray, guidArray.Length - 6, 2);
    Array.Copy(msecsArray, msecsArray.Length - 4, guidArray, guidArray.Length - 4, 4);

    return new Guid(guidArray);
}
  • Like that it's based on Nilson's implementation and it's used by a well used library – bbqchickenrobot Jun 23 '13 at 21:55
  • @Gian Marco Gherardi:Why does the algorithm only changes the latest section of guid? – Mohsen Apr 29 '17 at 13:06
9

Perhaps a simple way to determine the order in which rows have been added would be to add an IDENTITY column to the table, avoiding the need to keep your GUIDS in order and hence avoiding the performance hit of maintaining a clustered index on the GUID column.

I can't help but wonder how keeping these rows in order helps you when debugging. Could you expand that a bit?

  • Wouldn't that add an extra column to my table? A column which is of no other use, and not a good choice for a clustered index! Regarding debugging: When I select * form my table I get the rows in random order. If I get the latest row in the bottom, I don't have do do any sorting every time I do so. – Thomas Jespersen Oct 17 '08 at 9:32
  • Why the fixation on having a Clustered Index? Please explain how this helps you with debugging. SELECT * FROM TABLE ORDER BY my_ident_column is not so hard, is it? – Ed Guiness Oct 17 '08 at 9:46
  • 2
    The standard way of handling this is to have an IDENTITY column. Yes, it'll add an extra column to your table, but that shouldn't be a big deal. It takes up 4 to 8 bytes a row depending on your setup. It's actually the perfect choice for a clustered index! – nathaniel Oct 17 '08 at 14:57
4

It's important to note that the UUIDs generated by UuidCreateSequential will not be sequential when ordered by SQL Server.

  • SQL Server follows the RFC when it comes to sorting UUIDs
  • the RFC got it wrong
  • UuidCreateSequential did it right
  • but UuidCreateSequential creates something different from what SQL Server expects

Background

The Type 1 UUIDs created by UuidCreateSequential don't sort in SQL Server.

SQL Server's NewSequentialID uses UuidCreateSequential, with some byte shuffling applied. From the Books Online:

NEWSEQUENTIALID (Transact-SQL)

NEWSEQUENTIALID is a wrapper over the Windows UuidCreateSequential function, with some byte shuffling applied

which then references an MSDN blog post:

How to Generate Sequential GUIDs for SQL Server in .NET (archive)

public static Guid NewSequentialId()
{
   Guid guid;
   UuidCreateSequential(out guid);
   var s = guid.ToByteArray();
   var t = new byte[16];

   t[3] = s[0];
   t[2] = s[1];
   t[1] = s[2];
   t[0] = s[3];

   t[5] = s[4];
   t[4] = s[5];
   t[7] = s[6];
   t[6] = s[7];
   t[8] = s[8];
   t[9] = s[9];
   t[10] = s[10];
   t[11] = s[11];
   t[12] = s[12];
   t[13] = s[13];
   t[14] = s[14];
   t[15] = s[15];

   return new Guid(t);
}

It all starts with the number of ticks since 1582-10-15 00:00:00 (October 15, 1592, the date of Gregorian reform to the Christian calendar). Ticks is the number of 100 ns intervals.

For example:

  • 12/6/2017 4:09:39 PM UTC
  • = 137,318,693,794,503,714 ticks
  • = 0x01E7DA9FDCA45C22 ticks

The RFC says that we should split this value into three chunks:

  • UInt32 low (4 bytes)
  • Uint16 mid (2 bytes)
  • UInt32 hi (2 bytes)

So we split it up:

0x01E7DA9FDCA45C22

|   Hi   |   Mid  |    Low     |
|--------|--------|------------|
| 0x01E7 | 0xDA9F | 0xDCA45C22 |

And then the RFC says that these three integers should be written out in the order of:

  • Low: 0xDCA45C22
  • Mid: 0xDA9F
  • High: 0x01E7

If you follow the RFC, these values must be written in big-endian (aka "network byte order"):

DC A4 5C 22 DA 9F x1 E7 xx xx xx xx xx xx xx xx

This was a bad design, because you cannot take the first 8 bytes of the UUID and treat them either as a big-endian UInt64, nor as a little-endian UInt64. It's a totally dumb encoding.

UuidCreateSequential gets it right

Microsoft followed all the same rules so far:

  • Low: 0xDCA45C22
  • Mid: 0xDA9F
  • High: 0x1E7

But they write it out in Intel little-endian order:

22 5C A4 DC 9F DA E7 x1 xx xx xx xx xx xx xx xx

If you look at that, you've just written out a little-endian Int64:

225CA4DC9FDAE701

Meaning:

  • if you wanted to extract the timestamp
  • or sort by the timestamp

it's trivial; just treat the first 8 bytes as a UInt64.

With the RFC, you have no choice but to perform all kinds of bit fiddling. Even on big-endian machines, you can't treat the 64-bit timestamp as a 64-bit timestamp.

How to reverse it

Given a little-endian guid from UuidCreateSequential:

DCA45C22-DA9F-11E7-DDDD-FFFFFFFFFFFF

with the raw bytes of:

22 5C A4 DC 9F DA E7 11 DD DD FF FF FF FF FF FF

This decodes into:

Low      Mid  Version High
-------- ---- ------- ---- -----------------
DCA45C22-DA9F-1       1E7 -DDDD-FFFFFFFFFFFF
  • Low: 0xDCA45C22
  • Mid: 0xDA9F
  • High: 0x1E7
  • Version: 1 (type 1)

We can write this back out in RFC big-endian order:

DC A4 5C 22 DA 9F 11 E7 DD DD FF FF FF FF FF FF

Short version

               |   Swap      | Swap  | Swap  | Copy as-is
Start index    |  0  1  2  3 |  4  5 |  6  7 | 
End index      |  3  2  1  0 |  5  4 |  7  6 | 
---------------|-------------|-------|-------|------------------------ 
Little-endian: | 22 5C A4 DC | 9F DA | E7 11 | DD DD FF FF FF FF FF FF
Big-endian:    | DC A4 5C 22 | DA 9F | 11 E7 | DD DD FF FF FF FF FF FF
  • Why would anyone want to sort by Guid? – Mariusz Jamro Jan 20 '18 at 13:45
  • 1
    @MariuszJamro You want newer rows to be physically stored after older rows. When older data occupies separate pages in the database, the limited RAM can more effectively be used to cache currently used data. See also: UUID "Version 6": The version RFC 4122 forgot – Ian Boyd Jan 20 '18 at 16:19
  • 1
    Ahh so it's all about how SQL Server works internally. Thanks for clarification – Mariusz Jamro Jan 20 '18 at 16:33
2

Unfortunatley, no there isn't a .NET equivalent to newsequentialid(). You could continue using a Comb. I actually have a C# implementation of a Comb somewhere...I'll see if I can dig it up.

2

Here is the C# code to generate a COMB GUID.

byte[] guidArray = System.Guid.NewGuid().ToByteArray();

DateTime baseDate = new DateTime(1900, 1, 1);
DateTime now = DateTime.Now;

// Get the days and milliseconds which will be used to build the byte string 
TimeSpan days = new TimeSpan(now.Ticks - baseDate.Ticks);
TimeSpan msecs = new TimeSpan(now.Ticks - (new DateTime(now.Year, now.Month, now.Day).Ticks));

// Convert to a byte array 
// Note that SQL Server is accurate to 1/300th of a millisecond so we divide by 3.333333 
byte[] daysArray = BitConverter.GetBytes(days.Days);
byte[] msecsArray = BitConverter.GetBytes((long)(msecs.TotalMilliseconds / 3.333333));

// Reverse the bytes to match SQL Servers ordering 
Array.Reverse(daysArray);
Array.Reverse(msecsArray);

// Copy the bytes into the guid 
Array.Copy(daysArray, daysArray.Length - 2, guidArray, guidArray.Length - 6, 2);
Array.Copy(msecsArray, msecsArray.Length - 4, guidArray, guidArray.Length - 4, 4);

return new System.Guid(guidArray);
  • 2
    I marked this as the "accepted answer" even though I haven't tested it. But it looks right. If anyone can tell me that this is wrong please, write a comment and I will undo this. PS: It seams to me that many don't se the benefit of GUID's as primary keys. Here is a two: 1. When createing objects like Order and OrderLine in the same Unit Of Work, one can set the OrderLine.OrderId before ever talking to the database. Thus mark OrderId as immutable. 2. If you have a project where you need to merge or migrate databases guid are great because you don't have to do integer manipulation. – Thomas Jespersen Aug 22 '09 at 16:43
  • 1
    I use this in my current project ComicsInventory.com and it works like a charm. Also I believe another benefit of using GUIDs is that you can use them in your client side code and not have to worry about someone figuring out the next id. – Donny V. Aug 23 '09 at 21:33
  • 1
    This does not create a globally unique ID because it does not involve the network card. John's answer below is correct. – joshcomley Aug 27 '09 at 15:27
1

For the people who are specially using the Entity Framework, You can keep a stored procedure in the server to generate a New sequential ID and return the ID. Then you can use that sequential ID to populate your another Table. I think This should work.

  • 1
    Stored procedures can be used without EF also i believe. Anyway, that will incur a round-trip cost to a database. Isn't it? – Khadim Ali Oct 17 '15 at 16:28
1

You can use the tiny NewId library for this.

Install it via NuGet:

Install-Package NewId

And use it like this:

Guid myNewSequentialGuid =  NewId.NextGuid();

See Project Page on GitHub

0

The key problem is knowing what the last value was in a .NET application. SQL Server keeps track of this for you. You will need to hold the last value yourself and use the Guid constructor with a byte array containing the next value. Of course, on a distributed application this probably isn't going to help and you may have to use the randomised Guids. (Not that I see anything wrong with this.)

http://msdn.microsoft.com/en-us/library/90ck37x3.aspx

  • I didn't realize what the newsequentialId() is really doing. I don't need them to be 100% sequential. I just need the guid created at one point in time to have a lower sorting than one created later. Jimmy's COMB is using current time to do this and therefore I don't have to keep track of the ids. – Thomas Jespersen Oct 17 '08 at 9:49
0

I've been lead to believe that random Guids can be beneficial to performance in some use cases. Apparently inserting to random pages can avoid contention that would otherwise occur in the end page when multiple people are trying to insert at the same time.

John's PInvoke suggestions is probably the closest to SQL's version but the UUidCreateSequential docs state that you shouldn't use it to identify an object that it's strictly local to the machine generating the Guid.

I'd be measuring the actual use case performance hit with realistic data in realistic quantities before I investigated sequential Guid generation any further.

  • I think "avoiding contention" in the last page pales in comparison to the complete and utter disaster this makes of any attempts to index a table using the uuid as the clustered index. – Mel Jan 6 '09 at 18:54
0

About the selected answer. The docs says... The generated Guid will not give you uniqueId between computers if they don't have ehternet access.

If you must know the guid when inserting, couldn't you let Sql-server return a block of sequential guids that you assign to your data before you insert them?

declare @ids table(id uniqueidentifier default NEWSEQUENTIALID(), dummy char(1))

declare @c int
set @c = 0;
while (@c < 100)
begin
    insert into @ids (dummy) values ('a');
    set @c += 1;
end

select id from @ids

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged or ask your own question.