loading...

A more controllable random string generator function for SQL Server

peledzohar profile image Zohar Peled Originally published at zoharpeled.wordpress.com ・3 min read

(First published on What the # do I know)

In a previous post, entitled How to pre-populate a random strings pool, I’ve written about an inline table-valued user defined function for generating random strings in SQL Server.

Now, following a stackoverflow question entitled SQL SERVER generate data using Regex pattern, I want to present an improvement to this function.

This new and improved function gives it’s user a better control over the random strings it generates – it has minimum and maximum length parameters to enable generating variable-length random strings, and it has another parameter called @CharType, which enables the user to control whether the function will generate lower-case chars, upper-case chars, digits, or any combination of the above.

It also returns one more column – which is basically a row-number, starting with 1. This enables the creation of multiple randomly generated columns using simple joins.

As before, it still uses a view to generate a guid, basing all randomness on that guid. So here’s the code:

CREATE VIEW dbo.GuidGenerator
AS
    SELECT Newid() As NewGuid;

And the function:

CREATE FUNCTION dbo.RandomStringGenerator
(
    -- the minimum length
    @MinLength int, 
    -- the maximum length
    @MaxLength int, 
    -- the maximum number of rows to return. Note: up to 1,000,000 rows
    @Count int, 
    -- 1, 2 and 4 stands for lower-case, upper-case and digits. 
    -- a bitwise combination of these values can be used to generate all possible combinations:
    -- 3: lower and upper, 5: lower and digis, 6: upper and digits, 7: lower, upper nad digits
    @CharType tinyint 
)
RETURNS TABLE
AS
RETURN
-- An inline tally table with 1,000,000 rows
WITH E1(N) AS (SELECT N FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) V(N)),   -- 10
     E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
     E3(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
     Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY @@SPID) FROM E3 a, E2 b) --1,000,000

SELECT TOP(@Count) 
        n As Number, 
        (
        SELECT  TOP (Length) 
                -- choose what char combination to use for the random part
                CASE @CharType 
                    WHEN 1 THEN Lower
                    WHEN 2 THEN Upper
                    WHEN 3 THEN IIF(Rnd % 2 = 0, Lower, Upper)
                    WHEN 4 THEN Digit
                    WHEN 5 THEN IIF(Rnd % 2 = 0, Lower, Digit)
                    WHEN 6 THEN IIF(Rnd % 2 = 0, Upper, Digit)
                    WHEN 7 THEN
                        CASE Rnd % 3
                            WHEN 0 THEN Lower
                            WHEN 1 THEN Upper
                            ELSE Digit
                        END
                END
        FROM Tally As t0  
        -- create a random number from a guid using the GuidGenerator view
        CROSS APPLY (SELECT Abs(Checksum(NewGuid)) As Rnd FROM GuidGenerator) As rand
        CROSS APPLY
        (
            -- generate a random lower-case char, upper-case char and digit
            SELECT  CHAR(97 + Rnd % 26) As Lower, -- Random lower case letter
                    CHAR(65 + Rnd % 26) As Upper,-- Random upper case letter
                    CHAR(48 + Rnd % 10) As Digit -- Random digit
        ) As Chars
        WHERE  t0.n <> -t1.n -- Needed for the subquery to get re-evaluated for each row
        FOR XML PATH('') 
        ) As RandomString
FROM Tally As t1
CROSS APPLY
(
    -- Select a random length between @MinLength and @MaxLength (inclusive)
    SELECT TOP 1 n As Length
    FROM Tally As t2
    CROSS JOIN GuidGenerator 
    WHERE t2.n >= @MinLength
    AND t2.n <= @MaxLength
    AND t2.n <> t1.n
    ORDER BY NewGuid
) As Lengths;

One usage of this function is to create tables with random values, as demonstrated here:

DECLARE @Count int = 10; 

SELECT  CAST(IntVal.RandomString As Int) As IntColumn, 
        UpVal.RandomString as UpperCaseValue, 
        LowVal.RandomString as LowerCaseValue, 
        MixVal.RandomString as MixedValue
FROM dbo.RandomStringGenerator(3, 7, @Count, 4) As IntVal
JOIN dbo.RandomStringGenerator(10, 10, @Count, 1) As LowVal
    ON IntVal.Number = LowVal.Number
JOIN dbo.RandomStringGenerator(5, 10, @Count, 2) As UpVal
    ON IntVal.Number = UpVal.Number
JOIN dbo.RandomStringGenerator(10, 20, @Count, 7) As MixVal
    ON IntVal.Number = MixVal.Number

Which will result in a 4 columns table, each with the specified type of random value:

IntColumn   UpperCaseValue  LowerCaseValue  MixedValue          
674         CCNVSDI         esjyyesesv      O2FAC7bfwg2Be5a91Q0 
30732       UJKSL           jktisddbnq      7o8B91Sg1qrIZSvG3AcL
4669472     HDLJNBWPJ       qgtfkjdyku      xUoLAZ4pAnpn        
26347       DNAKERR         vlehbnampb      NBv08yJdKb75ybhaFqED
6084965     LJPMZMEU        ccigzyfwnf      MPxQ2t8jjmv0IT45yVcR
6619851     FEHKGHTUW       wswuefehsp      40n7Ttg7H5YtVPF     
781         LRWKVDUV        bywoxqizju      UxIp2O4Jb82Ts       
52237       XXNPBL          beqxrgstdo      Uf9j7tCB4W2         
876150      ZDRABW          fvvinypvqa      uo8zfRx07s6d0EP     

And since this is still an inline table-valued function, performance are pretty darn good. Tests I’ve made shown creation of a 5 columns table with 1,000 rows average time of just half a second.

You can see a live demo on rextester.

Posted on by:

peledzohar profile

Zohar Peled

@peledzohar

By day, try to work. By night, try to sleep.

Discussion

pic
Editor guide