Another reason NOT to use a public DNS name as your internal domain namespace
I was called in to work with a client this week who was having some trouble with employees who were connecting to the network via VPN. The basic problem was this: when the employees made a VPN connection and tried to load the companyweb web site, they got directed to someplace else altogether. When they tried to connect to companyweb from machines on the internal network, no problems.
The core problem boiled down to the internal domain name space. It was the same as their public DNS name. I.e., their internal domain was smallbizco.net (not their real domain) and their public domain was smallbizco.net.
I was able to give them a workaround (use the URL https://SBSserverIPaddress:444/) since they couldn’t implement the real solution, which is to rename the internal domain with a private, non-routable namespace (such as smallbizco.local or smallbizco.lan).
Every SBS consultant worth his or her salt will tell you that you never, EVER use a public domain name for your internal domain name. DNS lookup failures, like the ones experienced here, are the reason why. And had this client set up the internal domain name correctly, they could have avoided this problem.
However, the real reason it was failing is what I now believe to be a flaw in the way Windows handles VPN connections, not only the fact that they used a public DNS namespace for their internal domain. What follows is how I determined that the problem lies with Windows and not solely with the client.
Part of what first threw me off about this problem was that when the client described the situation to me, I created a VPN connection to their network and brought up companyweb immediately, much to my surprise. Then I remembered that I needed to replicate their setup exactly, so I created a VPN connection from a Windows PC instead of from my Mac. Sure enough, loading companyweb failed from the PC, even though it worked from my Mac.
It didn’t take long to realize the problem was a name resolution issue. When I did an IPCONFIG /ALL from the PC, I saw the settings for the VPN connection had an IP address on their internal network, I had a DNS server entry for their SBS server running DNS, and I got a WINS server entry that also pointed to their SBS server. The default DNS lookup zone was their public DNS name, not a private name, so I knew that would come into play.
I then attempted to ping the internal companyweb site. Surprisingly, I got a ping response, but not from an internal machine. Instead, I was pinging a public IP address. So I went and looked at the DNS zones on the SBS server. Sure enough, only one zone was listed, and it was the public DNS namespace. I looked for the companyweb CNAME in the DNS configuration, and sure enough, there it was, and it pointed to the A record for the internal server, which in turn had the internal IP address of the server. So why, when I tried to ping the internal name, did I get a public IP?
That’s when I went to nslookup and found what I believe to be the core problem. When you launch nslookup, it tells you which DNS server it is going to query by default. It will look something like this:
C:\Documents and Settings\Administrator>nslookup
Default Server: sbs.smallbizco.lan
Address: 192.168.19.2
And that’s exactly what I saw – nslookup was pointing to my default internal DNS server, the one on my network, not the DNS server on the remote network. That somehow didn’t seem right. And when I did an nslookup query on the full FQDN for the companyweb reference on what should have been their internal domain, I got returned a public IP address. They had actually created a public DNS entry for companyweb, apparently in an earlier attempt to fix the problem.
But I was still bothered that my Windows XP workstation was trying to look at my internal DNS server for queries instead of using the information provided by the VPN connection first. That just didn’t seem right. So I created a new XP box in a virtual environment and created the VPN connectoid on that and tried again. Same result on loading the incorrect companyweb page, and nslookup still defaulted to the internal DNS server instead of the VPN-connected DNS server.
On a whim, I created a VPN connectoid to another client that does have a private internal domain namespace and tried the same thing. I connected in, opened my web browser, went to companyweb, and got his internal companyweb page to load correctly. Then I went and checked nslookup – still pointed to my internal DNS server.
That’s when I got out the network tracers and started looking at the network level to see what was going on. I started collecting traces on the second VPN connection, the one with the private internal domain namespace, and found something fascinating. Sure enough, when I loaded companyweb, my XP workstation did a DNS query (after I cleared the local DNS cache) to my internal DNS server trying to find the internal domain name of my remotely connected client. Only when that lookup failed did the workstation then generate a second DNS query against the DNS server on the VPN-connected network. I then ran several of these tests using several different workstations, making sure I flushed the local DNS cache before every attempt, against several of my clients with VPN connections enabled, and sure enough, I saw exactly the same behavior.
Windows XP attempts to do DNS lookups against the DNS server specified on the internal NIC before it attempts a lookup on the DNS server provided by the VPN connection.
And this is why the client with the same public and internal DNS namespace was failing to get a page. Since he was using his public DNS name as his internal name, when XP attempted the lookup, it first went to the DNS server on my network, which returned a valid public IP address for the public name and never went to look at the DNS server listed in the VPN connection. And this is why he would not have had this problem if he had used a private internal domain name. The initial lookup to my internal DNS server would have failed to return a record, and the second query to the VPN-supplied DNS server would have succeeded.
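The lookup order described above can be sketched in a few lines of Python. This is only a toy model of the behavior I saw in the traces, not how the Windows resolver is actually implemented. Each DNS server is reduced to a dictionary, and every name and address in it (the 203.0.113.10 public address, the 192.168.16.2 internal address, the resolve function itself) is made up for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical records: the DNS server on my own network can resolve public names,
# including the public companyweb record the client had created.
LOCAL_DNS: Dict[str, str] = {
    "companyweb.smallbizco.net": "203.0.113.10",  # assumed public address
}

# Hypothetical records on the client's SBS server, reached through the VPN.
VPN_DNS_PUBLIC_NAMESPACE: Dict[str, str] = {
    "companyweb.smallbizco.net": "192.168.16.2",  # assumed internal address
}
VPN_DNS_PRIVATE_NAMESPACE: Dict[str, str] = {
    "companyweb.smallbizco.lan": "192.168.16.2",  # assumed internal address
}


def resolve(name: str, servers: List[Dict[str, str]]) -> Optional[str]:
    """Ask each server in order; move to the next one only if the lookup fails."""
    for records in servers:
        address = records.get(name)
        if address is not None:
            return address  # first answer wins, later servers are never asked
    return None


# Order observed on XP: local DNS server first, VPN DNS server second.
print(resolve("companyweb.smallbizco.net", [LOCAL_DNS, VPN_DNS_PUBLIC_NAMESPACE]))
print(resolve("companyweb.smallbizco.lan", [LOCAL_DNS, VPN_DNS_PRIVATE_NAMESPACE]))
```

With the public namespace, the local server answers first and the internal address is never seen. With the private namespace, the local lookup fails and the query falls through to the VPN-supplied server. Swapping the order of the list so the VPN server is asked first returns the internal address in both cases.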
That’s when I remembered that the first time I tried to replicate his problem I actually got it to work. My Mac VPN connection was able to look up and connect to the companyweb page on his internal network on the first try. So I did a little more digging. I reconnected back to this client’s VPN from my Mac and opened nslookup (yes, we do have command-line tools on the Mac, too – you PC folks actually took network tools like nslookup from UNIX, but I digress). When I opened nslookup, it defaulted to the DNS server on the VPN network, not my local DNS server. I then ran a few network traces from my Mac. Sure enough, while I was connected to the client’s site via VPN, every DNS query the apps on my Mac executed went to the DNS server on the remote network. My Mac never looked at my local DNS server. I tested this also against all of the same clients I had used to test the XP workstation and got the same results. When connected via VPN, all the network activities from my Mac defaulted to the VPN-connected network first. There were no local network connections attempted.
That started my quest to figure out how to make the VPN connection on the PC be the default connection used for all network traffic when the VPN connection was active. Bottom line, I could not. I tried marking the VPN connectoid as the default connection, I set the network priority order on the workstation so the Remote Connection was primary, I manually configured the DNS server and DNS namespace on the VPN connectoid, and so on. Every time I did any sort of DNS query while a VPN connection was active, the XP OS first directed the query against my internal DNS server and then only looked at the VPN DNS server if the lookup on the internal server failed.
This is why I now believe that the VPN software on XP is broken. When a VPN connection is active, there should be no reason for the workstation to have any network activity on the local network. All network activity should be handled by the VPN connection: name lookups, traffic routing, etc. And that was one of the interesting items I noted in all the testing I did. Even though the DNS lookups hit my internal DNS server first and got a public IP address, the route the PC took to get to that public IP address was through the VPN network. I believe that was because I enabled the “Use default gateway on remote network” checkbox in the VPN connectoid. But why on earth would DNS traffic not get routed across the VPN network first as well? Again, I think this is what’s broken.
Had the XP VPN code worked in the way I believe it should, this particular client would not have had the trouble in the first place. Again, when I connected with my Mac, the DNS query for companyweb went to his SBS server and got a valid internal IP address, even though the internal domain namespace was the same as his public namespace. Had the XP boxes behaved in the same way, the DNS query for companyweb would have gone to the SBS server on the remote network instead of the DNS server on my network, and I would have received the internal IP address for his SBS server and not some public IP address that actually went nowhere.
Does this mean that I’m not advocating the use of a private domain namespace for internal networks? Absolutely not! If anything, this situation has given me yet another reason to use a private domain name for SBS (and other) networks. But the real problem here is that the VPN implementation provided by Microsoft for Windows XP is broken. I haven’t had a chance to test this on a Windows 2000 workstation, nor have I had a chance to use a third-party VPN client to see how they behave. But I’ve tested on the MS VPN tools with XP enough to know that they don’t work as they should. Now, it doesn’t have a huge impact on the world – most of the folks I talked with about this didn’t see any ill effects of the problem. And in this case, if my client had not put a public DNS record for companyweb with his public DNS provider, he still might not have had the problem. This is probably a 1 in 1000 or possibly 1 in 10,000 situation. But nonetheless, the software is broken, and now I have another scenario to use when troubleshooting VPN connection problems for clients.
1 Comment
November 12th, 2005 at 11:09 pm
I’ve seen this behaviour as well with the two SBS machines I’m responsible for (one at work, one personally owned.) I recently tried setting up a VPN client on one of the work XP systems to connect back to my home box.
http://companyweb would resolve to the local (at work) SBS Sharepoint site rather than the remote site (my server), and pings verified this.
IPCONFIG revealed some anomalous entries, notably a DNS server listed twice (e.g. under Primary DNS Server is listed “myserver.dom.loc myserver.dom.loc”)
This happened on all XP clients on the workplace SBS server. This did NOT happen, oddly enough, on my home XP box when connecting to the workplace VPN. (My home XP machine is connected to the home SBS box *but it is not joined to its domain*. Emphasis added, I think this may be an important fact.)
We have two Macs at work but I haven’t yet tried their VPN clients. I wonder what ifconfig would produce?
When I’m next at work, I’ll do some tests and post the results in my own blog (http://spaces.msn.com/dmoisan)