Today I want to show off a quick trick I learned many years ago that is useful to this day.
The Scenario: you have a spreadsheet, CSV file, or some other delimited flat file and need to load that into a database. There are many options to load data into SQL Server; however, a quick method I am fond of is to use Microsoft Excel and manipulate the data to craft insert statements then paste into SSMS and execute.
Read on to see how to parse and load a CSV flat file, character delimited flat file, and columnar text.
Surprises exist in SQL Server. There are “gotchas” to be aware of. One thing that can trip up your code is how tables, indexes, and other objects are named. They’re called identifiers. If you step outside the bound of a regular identifier then you must be careful how it is referenced.
For example adding a datetimestamp suffix to a table – let’s see how this can cause trouble.
Null marks are a fact of life in relational databases. Sometimes it is good to revisit the basics of how they are treated in basic system functions and T-SQL because they can be tricky. Look at the following scenarios of how null marks are treated different than the unsuspecting eye might see.
Using T-SQL to insert, update, or delete large amounts of data from a table will results in some unexpected difficulties if you’ve never taken it to task.
Let’s say you have a table in which you want to delete millions of records. If the goal was to remove all then we could simply use TRUNCATE. However, if we want to remove records which meet some certain criteria then executing something like this will cause more trouble that it is worth.
Why not run the following in a production environment?
Simple delete - no concern for size
Transaction log growth – expect a large transaction like this to cause the transaction log to expand. Calls to expand the database files are some of the most expensive operations SQL Server makes to the OS. Hopefully the initial sizing was good and the growth settings are reasonable.
Blocking – concurrency is going to increase the likelihood of lock escalation. That will lead to the possibility of excessive blocking.
Rollbacks – if the SPID is killed then the transaction will enter a rollback state. For very large transactions this can take a lot of time. Don’t even think about killing or stopping SQL Services during a rollback unless you want a corrupt database!
What can we do to improve this?
We follow an important maxim of computer science – break large problems down into smaller problems and solve them.
We break up the transaction into smaller batches to get the job done. Each of the above points can be relived in this manner. Let’s take a look at some examples.