Thread: Studying a SQL Server Connection Failure

  1. #1

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Studying a SQL Server Connection Failure

    This is an intermittent problem that has plagued me for a decade, at least. I have a program that makes use of a SQL Server Express instance running on a different computer. This might work fine for a long time, then one day the computer running the program simply refuses to see the SQL Server instance.

    I started playing around with WireShark to see what was going on, but when I did that, all was well. I got several dumps of the connection working as expected.

    Today, it stopped working, so I started looking into it again. For one thing, I see a UDP request go out asking for information about the SQL Server instance. A reply comes back promptly with all the information. So, THAT much is working just fine. In fact, the program will (eventually) show all the available SQL Server instances, of which there are three on the target computer (for no good reason). Those three are available, but I still can't connect to them.

    If I ping the other computer, the ping times out, either with the computer name or the IP address.

    Looking at the WireShark traffic around the connection attempt, after the UDP request and response, the computer attempting to connect tries to establish a TCP connection (though not to the SQL Server port, which is odd). This attempt is repeated every second or two for a good long time. In among those repeated attempts, the computer trying to connect makes a few mDNS requests asking about the computer that has the SQL Server. There never seems to be a response to those requests.
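A minimal sketch of the UDP discovery exchange described above, assuming the request seen in the capture is the SQL Server Browser query on UDP port 1434. The reply lists each instance together with its TCP port. Named instances such as SQL Server Express normally listen on a dynamic port rather than 1433, which may be why the TCP attempts in the trace do not go to 1433. The host name passed to `query_browser` is whatever the server is called on your network; nothing here comes from the actual capture.

```python
import socket

SSRP_PORT = 1434          # UDP port the SQL Server Browser service listens on
CLNT_UCAST_EX = b"\x03"   # request: "list every instance on this host"
SVR_RESP = 0x05           # first byte of a Browser reply


def parse_ssrp_response(data: bytes) -> list[dict[str, str]]:
    """Turn a raw Browser reply into one dict per instance."""
    if len(data) < 3 or data[0] != SVR_RESP:
        raise ValueError("not a SQL Server Browser response")
    size = int.from_bytes(data[1:3], "little")
    text = data[3:3 + size].decode("ascii", errors="replace")
    instances = []
    for chunk in text.split(";;"):          # instances are separated by ';;'
        fields = chunk.strip(";").split(";")
        if len(fields) < 2:
            continue
        # Fields alternate key, value, key, value, ...
        instances.append(dict(zip(fields[0::2], fields[1::2])))
    return instances


def query_browser(host: str, timeout: float = 2.0) -> list[dict[str, str]]:
    """Ask the Browser service on host for its instance list."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(CLNT_UCAST_EX, (host, SSRP_PORT))
        data, _ = sock.recvfrom(65535)
    return parse_ssrp_response(data)
```

Each returned dict should carry a `tcp` key when TCP/IP is enabled for that instance; comparing that value with the destination port in the capture shows whether the client is at least aiming at the right port.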

    At a later point in the program, once I have the SQL Server instances, I attempt to connect to one of them. Looking at the WireShark traffic around that attempt, it's just a long sequence of requests to mDNS and other services asking for information about the other computer. No reply is ever received.

    So, it appears that the other computer is sometimes responding (the UDP request for the DB is replied to promptly), but other times not responding (those TCP calls and mDNS calls).

    Does this suggest anything?
    My usual boring signature: Nothing

  2. #2
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,261

    Re: Studying a SQL Server Connection Failure

    Firewall?
    You know, kinda like "UDP is fine, TCP is no-no"

    Though mentioning mDNS.... sounds like some "misconfiguration" or one of those blasted Microsoft updates that resets your custom config
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  3. #3
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,167

    Re: Studying a SQL Server Connection Failure

    Quote Originally Posted by Shaggy Hiker View Post
    Does this suggest anything?
    NIC problems on *local* computer? Connectivity issues are often h/w related.

    Might be worth disabling "TCP chimney offload" in netsh and similar h/w helpers on both machines so that NICs are less involved in TCP (if you don't do heavy traffic).

    cheers,
    </wqw>

  4. #4
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,261

    Re: Studying a SQL Server Connection Failure

    Something else: Do you use VLAN-Segments on that network?
    If you are on (different) VLANs, mDNS needs a repeater/reflector

  5. #5

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    I don't use VLANs as far as I know. The one computer is my main computer hooked to the router using an Ethernet cable. The other computer is a portable computer that is used in a variety of places, but when at home, it connects via WiFi. I haven't done anything special about that, just let it connect, which it does.

    The idea that some update altered something sounds pretty plausible, but not really actionable. I'm comparing WireShark dumps that worked (from a couple months back...I haven't tried in the intervening months) to ones that failed yesterday. Ultimately, I'm trying to better understand what CAN happen. This is not the network where this program would ultimately reside. Therefore, what I'm really trying to do is build up a better understanding of how this kind of thing works such that I could better diagnose issues in actual deployment...if that ever happens, which is quite a big if.

    This does not appear to be well documented at all. For example, I have never found any resource that talked about how connections work, aside from TCP being involved on (by default) port 1433. I've figured out how some discovery works, which is the UDP calls I'm seeing, and that DOES work, which makes the situation a bit more peculiar.

    I think that the ultimate question might hinge on why ping doesn't find the other computer. I'll be studying that question today...when I get the time.

  6. #6

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Ping from mobile to desktop works fine. Ping from desktop to mobile...times out.

    That makes me think that this is a desktop issue. And now that I think about it, it's the desktop that has changed since when it was working. I rebuilt this system with new MOBO, memory, CPU, video card, power supply and case (same type of case as the old one, just one with a working power switch), just kept the HD, so it feels like the same...ish computer.

    That means that the NIC changed, as that's built into the MOBO.

  7. #7

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Okay, some progress.

    Turning off the firewall on the mobile device, which I'd rather not do, solved the issue completely...if not ideally. So, the firewall was blocking incoming requests. Now, what rules do I need to set up to keep the firewall mostly there, but allow SQL Server requests through it?
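One way to tell "a firewall is dropping the request" apart from "nothing is listening" is to look at how a plain TCP connection attempt fails. The sketch below is an illustration, not part of the program discussed here. It assumes that a firewall silently dropping packets usually shows up as a timeout, while a reachable host with no listener on that port usually answers with a refusal. The host and port you pass in are your own; 1433 is only the default for a default instance.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connection attempt.

    Returns 'open', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all: often a firewall dropping the packets,
        # or the host not being reachable.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Running it once with the firewall on and once with it off, against the port the instance actually uses, gives a quick before/after comparison without needing a full capture.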

  8. #8
    Administrator Steve R Jones's Avatar
    Join Date
    Apr 2012
    Location
    Clearwater, FL.
    Posts
    2,345

    Re: Studying a SQL Server Connection Failure

    Open port 1433 within the firewall settings.

  9. #9

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    That was done a long time back. It had no effect.

    However, this turned out to be the solution: https://superuser.com/questions/1303...ql-server-2014

    That was a surprise. The remote couldn't connect to the server, so the solution was to allow the SQL Server application on the server. That did create the inbound rules, and those inbound rules did solve the problem. There was some debate in the comments on the solution as to whether or not all of that was necessary, and it appears that it was. If I don't include the rule for the SQL Server Browser, then the connection still fails, even though the browser did appear to be working without the rule in place (the UDP ping got a response with the correct information in it).

    One thing that had me a bit confused at first was the mention of "Open up services". It's imprecise language, so I'll clarify it a bit. What is meant is opening SQL Server Configuration Manager (which was shown in the original post). The services in question are the SQL Server Services node shown in the Configuration Manager. Right clicking on the server instance you want to allow connections to and choosing properties will take you to a form that has the path on one of the tabs. The path is beyond ugly, and it doesn't appear that it can be copied out of that tab, but at least you know where to browse to when choosing the application to allow through in Windows Firewall.

  10. #10

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    So, there are still questions that remain. The computer that changed was the desktop, NOT the server where the rule had to be added. That hasn't changed in many years. It's a Surface Pro, so it's a bit hard to change. Why did it EVER work? Did changing the desktop somehow change the behavior on the Surface? The answer to that is probably. After all, ping is STILL failing. When I ping the Surface from the desktop, it fails. When I ping the desktop from the Surface, it works. Clearly there is still something going on that is not resolved. Unfortunately, I never pinged either way back when it was working...cause it was working, so I don't know whether ping was working before I updated all the hardware on the desktop.

    The suggestion is that there is still some handshaking going on, which is probably captured by WireShark, and which is probably still not working the way it had been working. I'd like to better understand what handshaking is happening with SQL Server. As it stands, I have a solution, I just don't know WHY it works, and that's bugging me. Something is still not working, since ping is not working, and a change to the desktop appears to have broken something on a computer that didn't change, or at least required a different configuration on that computer. It's more of a black box than I like. If there is a resource for understanding that, I'd like to see it. I've read up on the TCP/IP layers, but this goes a bit further.

  11. #11
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,261

    Re: Studying a SQL Server Connection Failure

    Ok, nice that I was in a way right about the firewall. OTOH, your issue with ping: I remember having something similar. I couldn't ping/connect FROM a particular computer on the network, but I could ping/connect TO it. My solution then (some 10 years ago) was to remove the NIC in Device Manager on that particular computer, then reboot to let the NIC reinstall itself. Some update had borked the driver.

  12. #12
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,167

    Re: Studying a SQL Server Connection Failure

    Windows Firewall does not explain intermittent problems with client connections to SQL Server -- it looks like a red herring to me.

    cheers,
    </wqw>

  13. #13

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Unfortunately, I don't have enough data to say definitively that it IS intermittent. There are two problems. The first is that the testing itself is intermittent, the second is that too many variables change between each test to be certain that they are comparable. For example, this WAS working the last time I tried it, but a whole lot happened between then and now...including the months of August, September, October, and November, but also I changed pretty nearly ALL the hardware on the desktop.

    This issue of too many variables changing exists between any pair of failed and working tests. The earlier failures were using a different server, and sometimes got resolved by simply rebooting. When it worked, it would work for a long time. For example, when it was last working and I was looking at network traffic using WireShark, it worked for days. I moved on to other things and there were no issues.

    It hasn't been good testing, either. Partly, I only sought solutions when it failed, which might be once a year. One should change only one variable at a time, and I wasn't being good about that. In this case, I tried to be methodical. I'd make a change, run the test, make a new change, run the test again, and so forth. Then, once it was working, I'd undo a step and see whether that changed things. That's how I decided that both applications (SQL Server and SQL Server Browser) had to be allowed through. I wasn't always so methodical, or I ended up chasing other issues at the same time. For instance, I was having an issue on one server and not a different one, but at the same time, I realized that I ultimately had to stop using the first server for other reasons, so I was highly motivated to switch over to the new server. That meant that I switched hardware without fully understanding why one worked and the other did not.
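Since the difficulty here is that tests happen rarely and several things change between them, one option is to record every check in a small log together with a note on what changed. The sketch below is only an illustration of that idea: the file name, the `note` field and the choice of a plain TCP connect as the check are all assumptions, not something the program in this thread does.

```python
import csv
import socket
from datetime import datetime, timezone


def check_once(host: str, port: int, timeout: float = 3.0) -> str:
    """Return 'open' if a TCP connection succeeds, else the exception name."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except OSError as exc:
        return type(exc).__name__


def log_check(path: str, host: str, port: int, note: str = "") -> list[str]:
    """Append one timestamped result row to a CSV file and return the row."""
    row = [
        datetime.now(timezone.utc).isoformat(timespec="seconds"),
        host,
        str(port),
        check_once(host, port),
        note,  # e.g. "after reboot" or "Browser rule removed"
    ]
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(row)
    return row
```

Run on a schedule, or by hand after each single change, this builds up the history needed to say whether a failure really is intermittent and which change it followed.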

    It may well be a red herring. It isn't the total solution, either, because ping still isn't working. I also haven't compared network traffic now to the saved traffic patterns I have from the earlier success. I still have to do that to see if it is the same. Lots of chaff in those WireShark dumps, though, so it's a bit slow to get through, and I won't be able to get back to it until Sunday, at this rate.

    I'd say there is something NIC related going on, probably due in some way to the new hardware being used. I would say I have learned something, but not quite enough, yet.
    Last edited by Shaggy Hiker; Dec 19th, 2025 at 06:29 PM.

  14. #14

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Yeah, it was something of a red herring, though perhaps suggestive. For two days, it worked fine. I was able to get a series of network traces. I realized that some of what I had been doing before was a bit wrong. I thought I was watching the traffic for a successful connection, without realizing that it was actually failing. SSMS was connecting, so I thought my program would too. It was, just not when I put breakpoints in places to catch the traffic. I think I put the breakpoint in such a place that the connection attempt timed out and failed, at which point I started tracking. Once I realized that, I moved the breakpoints and was able to see that success for SSMS and my program looked essentially the same. A bunch of TLS packets that acted perhaps as a handshake or negotiation, along with a bunch of TCP packets (though not to 1433, which was interesting) that had data that was not meaningful to me.

    So, today I decided to check out a few other aspects...and neither SSMS nor the program will connect at all. The UDP discovery goes through, but the connection just does not....until I started writing this and it worked. Rebooting the desktop seems to have sorted the issue, which strongly suggests that there's some kind of hardware issue going on. The fact that pinging from desktop to surface is consistently failing, also suggests that. The problem is that every piece of hardware along the way has been replaced, and my spotty data is not sufficient to say whether or not the NIC in the desktop always worked before it was replaced.

    The one thing I have found is that sometimes when the Surface sleeps when a connection has not been established, connectivity will fail. Not always, but it's fairly often.

    I think I need to study why the ping from the desktop to the surface is consistently failing, while the ping from the surface to the desktop is consistently working. One can always see the other, the other can never see the one. That seems likely to be relevant.

  15. #15
    PowerPoster jdc2000's Avatar
    Join Date
    Oct 2001
    Location
    Idaho Falls, Idaho USA
    Posts
    2,525

    Re: Studying a SQL Server Connection Failure

    On the Surface, make sure the power settings are not set to turn off the network hardware when on AC power. The ping issue could still be a firewall problem.

  16. #16

  17. #17

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Yeah, it was a firewall issue. It took me all bloody morning to track it down. There are LOADS of tutorials, AI suggestions, and videos, on how to enable network discovery. It's simple, you just flip this switch! Of course, when I flipped the switch, it didn't stay flipped, but was unflipped the next time I looked at it. It would remain visibly flipped only so long as the switch was visible, but it wasn't changing any setting, so the next time the page displayed, it was back to the not set state.

    Some AI and videos talked about this being due to a firewall setting where Network Discovery rules (or rules in that group) were not set, or perhaps some services were not enabled. I enabled the services, but there was no network discovery rule group and every tutorial would then just tell me to flip that stupid switch.

    After lengthy study, and even resetting the hardware on the Surface, I found a thread with a single mention of this:

    netsh advfirewall reset

    That works. Resetting the firewall restored the network discovery group that had been missing. It also blew away any other rules I had, including the SQL Server rules, but it did restore the discovery group. After a reboot, the ping from the desktop to the surface is now working. SQL Server is connecting promptly. Perhaps it is now all solved?

    So what happened? One suggestion was that a Windows Update might have borked the firewall rules. Maybe so, and somebody already suggested it on here (I haven't looked back to see who that was, yet...it was zvoni). Likely I will never know, though.

  18. #18

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    I'm not going to mark this resolved just because every test is passing. I've fallen for that before.

  19. #19
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,167

    Re: Studying a SQL Server Connection Failure

    > including the SQL Server rules

    You should keep these simple: in the basic view of firewall.cpl, just allow both apps, sqlservr.exe and sqlbrowser.exe

    On our db servers we do it from a batch file with something like this:

    netsh advfirewall firewall add rule name="SQL Server (INS67)" dir=in action=allow program="D:\MSSQL15.INS67\MSSQL\Binn\sqlservr.exe" enable=yes
    netsh advfirewall firewall add rule name="SQL Browser" dir=in action=allow program="C:\Program Files (x86)\Microsoft SQL Server\90\Shared\sqlbrowser.exe" enable=yes

    No fiddling with protocols or ports whatsoever. We then configure sql server surface from within sql server config and all ports we assign there are open on the firewall by default i.e. one less config to think about.
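If the rules have to be recreated after something like `netsh advfirewall reset`, or generated by a setup program, the same two command lines can be built from a list. This is a rough sketch: the rule names and paths are simply the ones from the post above and will differ per machine, and the script only prints the commands rather than running them, since they need an elevated prompt.

```python
def allow_program_rule(name: str, program: str) -> str:
    """Build one inbound allow-by-program netsh command line."""
    return (
        "netsh advfirewall firewall add rule "
        f'name="{name}" dir=in action=allow '
        f'program="{program}" enable=yes'
    )


# Example values taken from the post above; adjust to the local install paths.
RULES = [
    ("SQL Server (INS67)", r"D:\MSSQL15.INS67\MSSQL\Binn\sqlservr.exe"),
    ("SQL Browser", r"C:\Program Files (x86)\Microsoft SQL Server\90\Shared\sqlbrowser.exe"),
]

if __name__ == "__main__":
    for rule_name, program_path in RULES:
        print(allow_program_rule(rule_name, program_path))
```

The printed lines can be pasted into a batch file or an elevated command prompt. The path for each instance is the one shown in the service properties in SQL Server Configuration Manager.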

  20. #20
    PowerPoster
    Join Date
    Nov 2017
    Posts
    3,630

    Re: Studying a SQL Server Connection Failure

    Quote Originally Posted by Shaggy Hiker View Post
    I'm not going to mark this resolved just because every test is passing. I've fallen for that before.
    Having opined about this seemingly ongoing issue in your past threads myself, I share your skepticism that this is actually finally resolved.

  21. #21

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Quote Originally Posted by wqweto View Post
    > including the SQL Server rules

    You should keep these simple: in the basic view of firewall.cpl, just allow both apps, sqlservr.exe and sqlbrowser.exe

    On our db servers we do it from a batch file with something like this:

    netsh advfirewall firewall add rule name="SQL Server (INS67)" dir=in action=allow program="D:\MSSQL15.INS67\MSSQL\Binn\sqlservr.exe" enable=yes
    netsh advfirewall firewall add rule name="SQL Browser" dir=in action=allow program="C:\Program Files (x86)\Microsoft SQL Server\90\Shared\sqlbrowser.exe" enable=yes

    No fiddling with protocols or ports whatsoever. We then configure sql server surface from within sql server config and all ports we assign there are open on the firewall by default i.e. one less config to think about.
    That's a good suggestion. I have some setup to do on a new computer, though the program does most of it. I'll add that into the setup.

  22. #22

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Quote Originally Posted by OptionBase1 View Post
    Having opined about this seemingly ongoing issue in your past threads myself, I share your skepticism that this is actually finally resolved.
    Yeah. I know there are some people who have been around through the whole saga. I'd do something, and all would be well for some indeterminate length of time, at which point I'd think I had it solved....then it wouldn't be, and the cycle would start again. I'm somewhat more hopeful, this time. I feel like I learned a whole lot, at the very least, and am better able to study the problem if it arises again, as I can compare new failures to past failures and past success.

  23. #23

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Well, that didn't exactly last.

    For the last several days, all has been well. Today, while the desktop would ping the Surface, SQL Server wouldn't connect, at least not right away. Rebooting solved the problem, but it still exists to some extent. Things seem to be better, but there's clearly something remaining. The desktop could see the Surface, but SQL Server could not connect. However, I didn't really study the situation any. If it happens again, I'll take the time to get a bit of network traffic around the connection attempt. It's pretty interesting that the two will connect for four days, then kind of lose it, until a reboot.

  24. #24

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    Once again, the initial connection to the Surface failed, but rebooting the Surface solved the problem.

    That's pretty annoying. It seems that there's something going on with the Surface such that it usually has things working right, and occasionally it just decides to not start out right. If the solution is always to reboot the Surface and all is well, then that's only annoying. If it becomes more complicated than that, then it would be much more problematic.

  25. #25
    PowerPoster
    Join Date
    Nov 2017
    Posts
    3,630

    Re: Studying a SQL Server Connection Failure

    Does the version of SQL server you are using have a limited number of simultaneous connections? If so, is it possible that you are experiencing some sort of issue where connections aren't being terminated properly, and you are unable to connect because of that? Have you looked in the Event Logs for specific events related to SQL Server on the Surface?

  26. #26

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,104

    Re: Studying a SQL Server Connection Failure

    I don't believe that could be the issue. If it fails, it fails on the first and only connection after starting the computer.

    One thing I did note is that there is zero traffic from the desktop to the Surface if I start tracking before attempting to create a connection with SSMS. The trace for a good connection shows TCP and TLS packets flying from desktop to Surface and back right away. Therefore, the absence of any attempt suggests to me that the issue precedes SQL Server. It acts like the Surface isn't announcing itself in some fashion, such that SSMS doesn't even know which IP to send messages to. I do see a series of requests by the desktop asking who has the IP that would be the Surface, but it gets no reply.
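If the unanswered "who has" requests are ARP, then the desktop already holds an IP address for the Surface and is failing one step later, at finding the machine that owns that address. A quick way to see which address the desktop is using is to ask the OS resolver directly. This is a minimal sketch; the host name you pass is whatever the Surface is called on the network, and the interpretation in the comments is an assumption about this setup, not a confirmed diagnosis.

```python
import socket


def resolve(host: str) -> list[str]:
    """Return the addresses the OS resolver gives for host, or [] if none."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name resolution itself failed: the problem is before ARP and TCP.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Comparing the result with the address the Surface reports for itself (for example from `ipconfig` on the Surface) would show whether the desktop is asking about a stale address, or about the right address that simply is not being answered.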
