Here, I will include various little tidbits of interesting information that I pick up along my journey of SQL Server discovery. This page serves as an index to relevant posts; more details can be found by clicking the index links or by using the menu to the right.
1. Copying / Moving Files using SQL Server Agent Jobs
Moving backup files from one location to another is a common DBA task. Obviously we could do this manually, but our goal as DBAs is to automate as much of the monkey work as possible, so that we can get on with the really interesting challenges (or just go and get a coffee).
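As a minimal sketch of the idea, an Agent job with a CmdExec step can run a copy command on a schedule. The job name, paths, and robocopy options below are placeholders for illustration:

```sql
-- Sketch: an Agent job with a CmdExec step that copies backup files.
-- Job name and paths are placeholders.
USE msdb;
GO
EXEC dbo.sp_add_job @job_name = N'Copy nightly backups';

EXEC dbo.sp_add_jobstep
    @job_name       = N'Copy nightly backups',
    @step_name      = N'Copy .bak files',
    @subsystem      = N'CmdExec',
    @command        = N'robocopy "D:\Backups" "\\FileServer\Backups" *.bak /Z /NP',
    @retry_attempts = 2,
    @retry_interval = 5;

EXEC dbo.sp_add_jobserver @job_name = N'Copy nightly backups';
GO
```

One gotcha worth knowing: robocopy returns exit code 1 on a successful copy, which CmdExec treats as failure by default, so the step's success condition usually needs adjusting.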
2. MCSA Exam Progress
Now that I am embarking on the exciting career path of a SQL Server DBA, obtaining my Microsoft MCSA certification is top priority. This series of blogs will track my progress through the practice exams as I go.
3. Restoring and Moving Databases
I am practicing backup and restore techniques for SQL Server databases. There are three scenarios that I find myself constantly looking up, so I thought I would blog them here: Restore with MOVE, Restore from production, and Simple Restore.
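The Restore with MOVE scenario, sketched with placeholder database and file names:

```sql
-- Sketch: restore a database to new file locations using WITH MOVE.
-- Database name, backup path, and file paths are placeholders.

-- First, find the logical file names inside the backup.
RESTORE FILELISTONLY FROM DISK = N'D:\Backups\Sales.bak';

-- Then restore, relocating each logical file.
RESTORE DATABASE Sales
FROM DISK = N'D:\Backups\Sales.bak'
WITH MOVE N'Sales_Data' TO N'E:\Data\Sales.mdf',
     MOVE N'Sales_Log'  TO N'F:\Logs\Sales.ldf',
     RECOVERY;
```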
4. Connection Pooling, a DBA’s perspective
Error: 18056, Severity: 20, State: 29. The client was unable to reuse a session with SPID 56, which had been reset for connection pooling.
On any normal day, we receive this error with regular frequency. Our standard response is to register it and move on. This error would not normally fall under a DBA's umbrella, but if it is important enough to be logged by SQL Server then it is important enough for me to understand. By understanding the events that lead to the error, I can make an informed decision about how to deal with it, based on whether or not it impacts the performance of SQL Server.
5. SQL Server & Lock Pages In Memory
“SQL Server has encountered “X” occurrences of I/O requests taking longer than 15 seconds to complete on file “Y”…”.
All of a sudden today, our system was flooded with this error from one of our clients – so I jumped on to have a look. What I found was a slightly controversial debate on the use of “Lock Pages in Memory”.
There are times when you need to apply a routine to all rows in a table. T-SQL provides a handful of declarative constructs for transforming, filtering and aggregating data, and syntactic sugar like CROSS APPLY allows you to interweave function application into queries. However, because T-SQL functions are stateless, you can’t use them to update, insert or delete data directly, or to call external procedures. I was surprised that there is no standard syntax for applying stored procedures in the same manner that CROSS APPLY does for functions. sp_Map and sp_ForEach provide a mechanism to invoke stored procedures on each row in a small table.
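The core of the idea can be sketched with a cursor that calls a procedure once per row; the table, column, and dbo.ProcessCustomer procedure below are hypothetical stand-ins, not the actual sp_ForEach implementation:

```sql
-- Sketch of the idea behind sp_ForEach: invoke a stored procedure
-- once per row of a small table. dbo.ProcessCustomer is hypothetical.
DECLARE @CustomerID INT;

DECLARE row_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT CustomerID FROM dbo.Customers;

OPEN row_cursor;
FETCH NEXT FROM row_cursor INTO @CustomerID;

WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC dbo.ProcessCustomer @CustomerID = @CustomerID;
    FETCH NEXT FROM row_cursor INTO @CustomerID;
END;

CLOSE row_cursor;
DEALLOCATE row_cursor;
```

Because each EXEC is a separate call, this only makes sense for small tables, which is exactly the constraint noted above.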
I sat (and passed!) the 462 exam today. Yay! Glad to have it done, but also glad to have gone through it. It was a lot harder than I was expecting. It was stressful, challenging and most of all it was a really important eye-opener. Let me explain what I mean…
I have been thinking about the various specialisations and opportunities within the world of data, and there are two areas that really appeal to me: data analysis and ETL. As part of our KPI package, we are all allowed to choose a personal KPI. The personal KPI should be in an area of interest to us that will give us the opportunity to develop our skills. For my personal KPI, I have chosen to explore ETL and SSIS.
Replication is becoming increasingly common. Recently at work, it seems that nearly all new clients have replication as part of their environment, such that replication is becoming a part of our everyday life here and I can no longer leave it to the more experienced DBAs in the team. So this afternoon I have resolved to work through SQL Server Central’s Stairway to SQL Server Replication.
Most DBAs have a healthy number of test labs on hand. These are great for experimenting in areas you are unfamiliar with, or simply a safety net if you want to really dig deep into a topic. However, I think the traditional test lab is limited, and I suggest that encrypted stored procedures may help to enhance the learning experience by enabling a scenario-based approach to labs.
My first foray into SQL Server Data Tools left me sitting on the fence as to its usefulness (see my previous post, Beginning SSIS). As I dig into the literature on data integration, I am beginning to appreciate the scale of the problems faced, and to realise that my initial doubts about SSIS and SQL Server Data Tools were probably quite naive. The real benefit of such tools is their ability to scale gracefully as the size and complexity of the project increases.
Recently I got the chance to do some query tuning for a client. The client had contacted us because their users were experiencing particularly slow performance running reports from their BI dashboard. Their internal teams thoroughly reviewed their network and SSRS packages and couldn’t find anything untoward. When they dug deeper into one particularly slow report, they realised that it was quite heavily SQL-based, and they asked us to investigate. After reviewing the server activity and CPU / memory performance and not finding anything to be concerned about, I turned my attention to the main query within the report and found two really juicy targets for refactoring: LEFT OUTER JOINs and DATETIME conversions in the WHERE clause.
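The DATETIME issue is worth a quick illustration. Wrapping the column in a conversion makes the predicate non-sargable, so the optimiser cannot seek on an index; comparing the raw column against a range fixes that. Table and column names below are invented for the example:

```sql
-- Non-sargable: CONVERT is applied to every row, defeating index seeks.
SELECT OrderID
FROM dbo.Orders
WHERE CONVERT(DATE, OrderDate) = '2014-06-01';

-- Sargable rewrite: compare the untouched column against a half-open range.
SELECT OrderID
FROM dbo.Orders
WHERE OrderDate >= '2014-06-01'
  AND OrderDate <  '2014-06-02';
```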
This afternoon I got a call from a client with a problem with their replication setup: transactions were not being replicated across to the subscriber. Below is a record of the steps I went through to investigate this issue.
Query tuning is a lot of fun. At times it can be challenging and frustrating but ultimately getting good performance gains is immensely satisfying. I have been doing a fair bit of this at work recently, and one of the challenges I have been facing is to quantify the improvements. Total running time is simple to measure, but I have been looking for a way to quantify the improvements in I/O and CPU usage as well. Thankfully, the SQL Profiler does a great job of this.
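The post uses SQL Profiler for this; as a quick alternative directly in the query window, the STATISTICS settings give per-query figures. The query under test here is just a placeholder:

```sql
-- Sketch: capture I/O and CPU figures for a query before and after tuning.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT COUNT(*) FROM dbo.Orders;  -- placeholder for the query under test

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
-- The Messages tab then reports logical reads per table,
-- plus CPU time and elapsed time in milliseconds.
```

Running the same query before and after a change gives directly comparable logical-read and CPU numbers.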
Being able to design, manage and maintain data pipelines is big business these days. Every major vendor offers a suite of tools to manage ETL and data pipelines: Microsoft, IBM, Oracle, SAP, SAS all have their own flavours. But I wonder how many people are tempted to hack it together themselves using their favourite programming language?
This is brief – it is an area that I need to follow up on and dive into deeper. Below are my notes on the caching of ad hoc queries developed during recent work for a client.
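As a starting point for that follow-up, the plan cache DMV shows how many ad hoc plans are cached and how much memory they occupy:

```sql
-- Sketch: how many ad hoc plans are cached, and their total size.
SELECT objtype,
       COUNT(*) AS plan_count,
       SUM(CAST(size_in_bytes AS BIGINT)) / 1024 / 1024 AS size_mb
FROM sys.dm_exec_cached_plans
WHERE objtype IN ('Adhoc', 'Prepared')
GROUP BY objtype;
```

A large count of single-use Adhoc plans (usecounts = 1) is the usual sign of ad hoc plan cache bloat.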
Rough and ready notes relating to question I have about Data Warehousing and ETL
More frequently, I am coming across code that incorporates the various ranking functions now available in T-SQL. These functions have been around for a while and provide ways to include explicit ranking columns in query results. Unfortunately, I am also beginning to see queries that use these ranking functions to select the “top x” or most recent results. Not only do I find this a little contrived, it is also not a scalable solution.
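For reference, this is the pattern in question, selecting the most recent order per customer; table and column names are invented for illustration:

```sql
-- The "ranking function as top-x filter" pattern being critiqued:
-- rank rows within each partition, then keep only rank 1.
WITH ranked AS (
    SELECT CustomerID, OrderID, OrderDate,
           ROW_NUMBER() OVER (PARTITION BY CustomerID
                              ORDER BY OrderDate DESC) AS rn
    FROM dbo.Orders
)
SELECT CustomerID, OrderID, OrderDate
FROM ranked
WHERE rn = 1;
```

The whole table must be ranked before the filter is applied, which is where the scalability concern comes from.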
In this blog post, I modify Paul Randal’s Wait Stats query and compare the relative efficiency of the two versions. It feels a little audacious to go head-to-head with Paul Randal, who is a legend in the SQL Server community. However, when I ran his query I was surprised by how long it took. So I began to dissect it, eliminating the most expensive operations to develop a useful, cut-back version.
Steve Jones has just written a great editorial for SQL Server Central titled The Decline of SQL Server. In this post he compares the market position of Microsoft with its competitors (Oracle, DB2, MySQL, PostgreSQL and NoSQL) and questions whether Microsoft have enough punch to continue to claim market share. This is particularly timely for me, as I have been exploring the future of data management in general. There is no doubt in my mind that it is not a question of SQL Server over PostgreSQL, or relational platforms vs. NoSQL; I see a working, scalable and agile future being created by blending these technologies together.
Have you got transactional replication in your SQL environment? And do you need to add a new table to your publication, but can’t afford the time necessary to create a full new snapshot? Then this article is a blow-by-blow how-to just for you! Sometimes it is just easier to reinitialise the whole subscription, but at 11 PM that’s the last thing you want to do. Thankfully, adding a single table is easier than I thought.
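The usual recipe, sketched with placeholder publication and table names: turn off the publication properties that force a full snapshot, add the article, then run the Snapshot Agent, which will generate a snapshot for the new article only:

```sql
-- Sketch: add one table to a transactional publication without a full
-- snapshot. Publication and table names are placeholders. Run at the publisher.
EXEC sp_changepublication
     @publication = N'MyPublication',
     @property    = N'allow_anonymous',
     @value       = 'false';

EXEC sp_changepublication
     @publication = N'MyPublication',
     @property    = N'immediate_sync',
     @value       = 'false';

EXEC sp_addarticle
     @publication   = N'MyPublication',
     @article       = N'NewTable',
     @source_object = N'NewTable';

-- The Snapshot Agent now only snapshots the newly added article.
EXEC sp_startpublication_snapshot @publication = N'MyPublication';
```

With allow_anonymous and immediate_sync disabled, the Snapshot Agent skips the articles that existing subscribers already have.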