Sunday, November 30, 2014

SQL query

First of all, the root problem here is the database design. This column should have been several columns from the start. Or at least XML...


Next thing to observe is that as you trawl through the data, you will undoubtedly find rows where the format deviates. Maybe fields are in different order. Maybe colons are missing. Or maybe the space after the colon is missing. Or even worse, there are colons in the values, and in fact this is the case in your sample (the URL).


Kalman suggested to use a split function with ': ' as the separator, but that would give you values like Address, Newway 5 Zip Code, 52880 Place, so this is a non-started.


As Uri hints the code to crack this in T-SQL would be ugly, even we naïvely assume that this sample reflects all complications we can encounter.


No, there are better languages than T-SQL to deal with this mess. Assuming that it is not in your powers to change the table definition (but you could still raise the point to the power that be), your best bet might be to parse the columns client-side, if that fits with your application. Or write a CLR procedure in C# (or VB .NET) cracking the field into columns, using RegEx class and all the other powers of string manipulation in .NET which exceeds what T-SQL offers.





Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

No comments:

Post a Comment