Posts tagged Connections

Shadows Rock, Filtering Platform not so much!

RDP remote control (shadowing) of multiple sessions is a great way to allow geographically separated teams to work on the same Server console.  You can do this from task manager

Control your co-workers!

Control your co-workers!

 

Today I had a new install of windows 2008 that was rejecting the attempts at remote control, the error was ”remote control failed”, nothing was logged in the System or Application event logs. In the Security event log was only one error: “The Windows Filtering Platform has blocked a bind to a local port”

After plenty of fiddling and making sure there was no ”firewall” or reason for the filtering platform to be enabled, I came across this command I never knew existed “shadow”  

Apparently whatever had the filtering platform angry and blocking access was ok with that simple command.   So in this case going to a command window and running “shadow 3” worked perfectly, I could once again see both terminals and the windows filtering platform allowed me to actually work, instead of impeding me at every turn.

The Windows Filtering Platform on Server 2008 and 2008 R2 has been the culprit more times than I can count lately when the “gremlins” are inhabiting our servers, If only there were a way you could turn it off totally, but I guess that’s sort of like Internet Explorer, it cant be unbundled from the OS.

Sql Server and SSPI handshake failed error hell

The infamous SSPI Failed error strikes again!

One of our SQL servers was generating these errors for “some” Windows logins but not all.

Error: 17806, Severity: 20, State: 2.

SSPI handshake failed with error code 0x8009030c while establishing a connection with integrated security; the connection has been closed. [CLIENT: 192.168.1.1]

Error: 18452, Severity: 14, State: 1.

Login failed for user ”. The user is not associated with a trusted SQL Server connection. [CLIENT: 192.168.1.1]

After exhausting all of the normal troubleshooting for this error (accounts locked, disabled, Sql Service accts, bad connection strings, SPN’s, etc.) I spent the next few hours learning more about the way SQL handles authentication requests than I had ever wanted to know.

The Scenario –

A couple of separate individual Windows ID’s started generating these errors while attempting connections, all other windows logins were working properly. The connections were initially happening through applications, but also occurred through sqlcmd. When logged in to the server locally with the offending ID’s the connections to SQL would succeed.

The Troubleshooting process –

Check all the regular SSPI issues, I wont bore you with the details as they are easily searchable

  • A relatively easy way of checking the “easy” authentication issues If possible/appropriate is to log into the SQL Server locally with the offending ID and fire up sqlcmd and connect to the server via sqlcmd –Sservername,port –E  (by specifying the port you force TCP/IP instead of LPC, thereby forcing the network into the equation)

Verify whether the login is trying to use NTLM or Kerberos (many ways to do this but simplest is to see if there are any other KERBEROS connections on the machine)

  • SELECT DISTINCT auth_scheme FROM sys.dm_exec_connections
  • If Kerberos is in use, there are a few additional things to verify related to SPN’s, since only NTLM was in use on this server I skipped that

Determine if the accounts were excluded from connecting to the machine through the network through a group policy or some other AD setting

After all of these checked out OK, I began to try and figure out what the error code 0x8009030c meant, turns out, its fairly obvious what the description is : sec_e_logon_denied.  This description was so helpful I thought about making this server into a boat anchor but, luckily for my employer the server room is located many miles away and has armed guards.

Since I knew we could logon locally to the SQL Server with the ID that SQL was rejecting with logon denied something else was trying to make my life miserable.

We didn’t have logon failure security auditing turned on so, I had no way of getting a better error description, As luck would have it though this would prove instrumental in finding the root cause. To get a better error message, I found this handy KB article detailing steps needed to put net logon into debug mode.

Say hello to my new best friend!  — nltest.exe

After downloading nltest & using it to enable netlogon debugging on the SQL Server, I got this slightly better message in the netlogon.log file

06/15 14:15:39 [LOGON] SamLogon: Network logon of DOMAIN\USER from Laptop Entered

06/15 14:15:39 [CRITICAL] NlPrintRpcDebug: Couldn’t get EEInfo for I_NetLogonSamLogonEx: 1761 (may be legitimate for 0xc0000064)

06/15 14:15:39 [LOGON] SamLogon: Network logon of DOMAIN\USER from Laptop Returns 0xC0000064

The error code 0XC0000064 maps to “NO_SUCH_USER”

Since I was currently logged in to the server with the ID that was returning no such user, something else was obviously wrong, and luckily at this point I knew it wasn’t SQL.

Running “set log” on the server revealed that a local DC (call it DC1) was servicing the local logon request.

After asking our AD guys about DC1 and its synchronization status, as well as whether the user actually existed there, everything still looked OK.

After looking around a bit more I discovered this gem of a command for nltest to determine which DC will handle a logon request

C:\>nltest /whowill:Domain Account

[16:32:45] Mail message 0 sent successfully (\MAILSLOT\NET\GETDC579)
[16:32:45] Response 0: DC2 D:Domain A:Account (Act found)
The command completed successfully

Even though this command returned “act found” it was returning from DC2.  (I dont exactly understand why the same account would authenticate against 2 different DC’s based on a local desktop login or a SQL login but it apparently can)

After asking the AD guys about DC2 the light bulbs apparently went off for them as that server actually exists behind a different set of firewalls, in a totally different location. While DC2 would return a ping, the console wouldn’t allow logons for some reason. After a quick reboot of DC2, and some magic AD pixie dust (I am not an AD admin, if it wasn’t totally obvious from my newfound friend nltest) the windows Id’s that were having trouble started authenticating against DC3 and our SSPI errors went away.

Interesting tidbit — During troubleshooting, I found that this particular SQL Server was authenticating accounts against at least 5 different DC’s. Some of this might be expected since there are different domains at play but, I haven’t heard a final answer from the AD guys about whether it should work that way.

The solution

Reboot the misbehaving DC, of course there may be other ways to fix this by redirecting requests to a different DC without a reboot but, since it was misbehaving anyway, and the AD experts wanted to reboot so we went with that. A reboot of SQL would have likely solved this problem too but, I hate reboot fixes of issues, they always seem to come back!

Whose got my DAC?

What is the DAC?

The Dedicated Admin Connection, Commonly called the DAC is used to manage SQL Server when a regular connection wont succeed.  Here’s what SQL Books Online (BOL) has to say about the DAC “This diagnostic connection allows an administrator to access SQL Server to execute diagnostic queries and troubleshoot problems even when SQL Server is not responding to standard connection requests.”

DAC Errors

Occasionally, while troubleshooting SQL servers in a large environment, especially one thats managed from many different geographic locations you could come up with this error, if more than 1 person is using the DAC.  It should also be noted this only happens if you have remote DAC enabled in your environment

Error 17810

Could not connect because the maximum number of ‘1’ dedicated administrator connections already exists. Before a new connection can be made, the existing dedicated administrator connection must be dropped, either by logging off or ending the process. [CLIENT: 127.0.0.1]

Since I could still connect with a regular connection currently, I set out looking for a query to determine who was using the DAC connection.  I whipped this up, and since I couldnt find anything in search, I thought id blog it

select conn.session_id, sess.login_name, sess.nt_domain, sess.nt_user_name, conn.connect_time, conn.last_read, conn.last_write, sess.host_name, conn.client_net_address
    from sys.dm_exec_connections conn
    join sys.endpoints edp
        on conn.endpoint_id = edp.endpoint_id
    join sys.dm_exec_sessions sess
        on sess.session_id = conn.session_id   
    where edp.is_admin_endpoint = 1

This should return everything you need to know about who is using your DAC connection so you can ask them to disconnect, or KILL their connection.

Using Aliases with SQL Server

Even an old dog can learn new tricks

Where Can I get a dog like this?
I had an Aha! moment recently.  For my entire career as a DBA I have generally considered aliases for connections a workaround for bad behaving applications.  Whenever someone said “alias” my mind immediately heads to SQL Server client configuration aliases which I try to avoid if at all possible (since they are configured on each client)  It never snapped to me until recently that DNS aliases may be a good solution to a few problems  we’re currently experiencing.

For disaster recovery reasons, as well as for manageability reasons we have decided to start using DNS aliases for every application connection to database servers.  This should allow us to have the luxury of moving databases from server to server without having to reconfigure multiple applications which would normally be a whole process in itself since the code was already migrated to production which is locked.

Using DNS aliases should also allow us to swap highly important applications over individually to a remote datacenter, which could have less computing power, without having to switch every application and thus kill the performance of that standby server.

There are certainly limitations to this, if for instance you want to move applications from one named instance to another.  In our current environment this isnt much of an issue since many of our production instances are indeed default.  The other major limitation to this is any change will have a small amount of downtime while the DNS changes are propgated throughout the network.

One more “gotcha” that weve already run into is vendor applications (surprise surprise) that resolve the DNS name to an ip address and then store that inside the application configuration.

We decided on working out a naming standard that looks like this:
For direct database access, where an application only connects to 1 database the following is used
dsDBNameEnvName
i.e. dsCustomerProd
for an application (like sharepoint) where many databases are going to be accessed we change it a bit
dsAppNameEnvName
i.e. dsSharepointProd

While this isnt necessarily a “new” idea, it was certainly a different idea in our environment and I suspect there are other “DBA’s”  out there like myself that dont have a habit for using our network skills on a regular basis

Photo courtesy: Ronn Ashore

Go to Top