Archive for Troubleshooting

One of the significant differences in the minimum specs for installing SBS 2008 versus SBS 2003 was the minimum size of the C: partition needed for installation and operation. SBS 2008 requires a minimum of 60GB in the install partition or it won’t go. Those of us who were used to fighting the 12GB C: partition implemented by OEM vendors in SBS 2003 initially looked at that and thought “yeah, that’s a good change.” Well, as it turns out, kinda like the 4GB RAM minimum spec, the 60GB C: partition may not be big enough after all.

If you ask around those who have been doing SBS 2008 deployments, one of the best practices adopted by most is to use the Move Data Wizards in the Server Storage tab of the SBS 2008 Console and get the key data components off the C: partition and onto another partition (Exchange, SharePoint, User’s folders, User’s redirected documents, and WSUS content). And if you take the step that some do of installing third-party software to a partition other than C:, we should be ending up with a fairly pristine C: partition with minimal dynamic data on it. In theory.

I’ve been deploying my SBS 2008 installs with a 100GB C: partition simply because I figured that over time, something would find a way to suck up all the space on C: and we’d eventually get to a point where we’d have to deal with resizing partitions or doing manual data cleanup. I didn’t expect that I’d hit that scenario just over a year after my first SBS 2008 production deployment.

In the last couple of weeks, my monitoring tools have started chirping about low disk space on C: on a couple of installs. Sure enough, one installation had 17GB remaining of a 100GB partition, another had 3.5GB remaining on an 80GB partition (my own production box, and yeah, it really needs an overhaul, but that’s another story). I started digging around and found the most common disk hog that’s been complained about across the net, the winsxs folder. Based on everything I’ve been able to read about winsxs, including a post from the Windows Server Core Team, that’s something that we’ll just have to live with, and really isn’t the point of this post anyway. Still, on my boxes, the winsxs folder only amounted to about 12GB (bigger than what I’d like, but certainly not the primary culprit), which is only about 10% of my standard install C: space. Something else had been sucking away space and keeping it from me.

We use TreeSize from JAM Software as a standard utility on our server deployments to help monitor disk space usage, as this is something that comes up from time to time. [NOTE: this is not a specific endorsement of TreeSize, just a note that it's one of the many tools that we use in our operation.] So in the case of these low-free-space servers, I fired up TreeSize and went looking for the disk hog. Surprisingly, I couldn’t find it. I did clear up some areas that showed a larger-than-expected usage, but couldn’t find the smoking gun. A few weeks have gone by, and while I’ve been monitoring the state of these servers to ensure that free space didn’t get critically low, other tasks moved up on the priority list.

Then a discussion on one of my private lists cropped up regarding this exact topic, and I learned two valuable tidbits from that discussion.

The first is that in order for TreeSize to see the contents of ALL folders on the C: partition, it must be Run As Administrator. Upon reflection, this makes sense, but I know it’s catching a lot of experienced system admins off-guard. Some are advocating disabling UAC on the server to avoid this kind of issue, and I’m honestly not fully decided where I stand on that, so I won’t comment either way on that. But it does serve as a reminder that many system tools we may have been using for years on 2003 servers might not behave the same way under 2008 if you don’t use the almighty Run As Admin option.
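
The same effect is easy to see with a few lines of script. What follows is only a minimal sketch in Python (not TreeSize, and not how TreeSize works internally): it totals file sizes per top-level folder and, more importantly, records every path it was not allowed to read. The function name `folder_sizes` is made up for this illustration. If the skipped list is not empty, the totals are understated, which is exactly what happens when a disk tool is run without elevation.

```python
import os
from pathlib import Path


def folder_sizes(root):
    """Return (sizes, skipped): bytes per top-level folder under root,
    plus every path that could not be read."""
    skipped = []

    def remember(error):
        # os.walk calls this when it cannot list a directory (e.g. access denied).
        skipped.append(error.filename)

    root = Path(root)
    sizes = {}
    for current, _dirs, files in os.walk(root, onerror=remember):
        rel = Path(current).relative_to(root)
        # Files directly in root are grouped under "."
        top = rel.parts[0] if rel.parts else "."
        for name in files:
            full = os.path.join(current, name)
            try:
                sizes[top] = sizes.get(top, 0) + os.path.getsize(full)
            except OSError:
                skipped.append(full)
    return sizes, skipped
```

Calling `folder_sizes("C:\\")` from a normal prompt and again from an elevated prompt, then comparing the two `skipped` lists, should show which folders were invisible the first time.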

The second is that the WSUS site in IIS has been logging an OBSCENE amount of data into the IIS logs folder. One of my servers had nearly 30GB (yes, that’s 30 gigabytes) of data in the WSUS log folder (C:\inetpub\logs\LogFiles\W3SVC1372222313). Another had just over 20GB. And in looking in the folder, I saw numerous DAILY log files that were well over 100MB each, with some well over 200MB each.

Once I cleared out the old log files (honestly, how far back am I going to need to look at WSUS logs anyway?) the free space on C: increased to a reasonable level, and my monitoring stopped yelling at me quite so often.

There are multiple lessons learned from this experience for me. The first is the whole reminder about Run As Administrator in the Server 2008 era. I’ve even taken to labeling some shortcuts with “Run As Administrator” in the icon name just to serve as a reminder. The second lesson is that 60GB is certainly NOT going to be sufficient as a minimum partition size on a production SBS 2008 server, even if all other data is moved off to different volumes (and I haven’t even covered the option of moving the WSUS SQL database files off of C: to another partition, which can’t be done through wizards but must be done by hand). With winsxs and the WSUS logs as two items that will definitely be grabbing disk space unexpectedly (well, it’s expected now anyway), we can be sure that over time there will be others. And as stated on the Core Team blog, you can only expect that winsxs will continue to grow over time. If it’s 12GB now, how large will it be in a couple of years? The third lesson is that some logging that happens automatically on the server probably should not just be left unchecked. If you enable SMTP logging (which I do and recommend for troubleshooting purposes), you should clean out old SMTP logs on a regular basis. Well, now you can add WSUS/IIS logs to that approach as well. There are numerous posts out there for ways to script this process, and I’m evaluating the approach we’re going to take within our operation to make this happen for our customer base.
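
As one illustration of what such a cleanup script might look like, here is a minimal Python sketch. It is not the approach we have settled on, just an example under stated assumptions: the function names `old_logs` and `purge` are invented for this post, the retention period is whatever you pass in, and it only looks at `*.log` files directly inside the folder you point it at. It defaults to a dry run so nothing is deleted until you ask for it.

```python
import time
from pathlib import Path


def old_logs(log_dir, max_age_days, now=None):
    """Return *.log files in log_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(p for p in Path(log_dir).glob("*.log") if p.stat().st_mtime < cutoff)


def purge(log_dir, max_age_days, dry_run=True, now=None):
    """Delete old logs (or only measure them when dry_run is True).

    Returns the total size in bytes of the files that matched."""
    total = 0
    for path in old_logs(log_dir, max_age_days, now):
        total += path.stat().st_size
        if not dry_run:
            path.unlink()
    return total
```

For example, `purge(r"C:\inetpub\logs\LogFiles\W3SVC1372222313", 30)` would report how many bytes a 30-day retention would free in the WSUS log folder mentioned above (the site ID in that path will differ per server, and 30 days is an arbitrary choice); adding `dry_run=False` performs the deletion. Run it from an elevated prompt, and schedule it however you normally schedule maintenance tasks.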

If you’ve been struggling with low disk space issues on SBS 2008 C: partitions, hopefully this information will help you get a better handle on the immediate actions as well as the long term strategy that you’ll develop for your particular environment.

Comments (12)

Earlier this month an associate pinged me about an unusual situation. He had an SBS 2003 server that was shutting itself down periodically, claiming that it was doing so because there was another SBS server in the domain. Well, this is expected behavior if there is, in fact, another SBS server in the domain, but this particular network had only one server, the SBS server, and not a single other server or history of another server in the network. Another unusual symptom of the behavior is that the server would remain up for a little over 24 hours before it would shut itself down because of the phantom SBS server. According to MS KB 925652 the SBS server will shut down every hour if it detects another SBS server in the domain, so clearly a different set of events was causing this behavior. The server was logging SBCore 1011 errors in the event logs, but only after the server had been online for about a day.

On a tip from a colleague at MS, we started to look for a possible memory leak in the system. I worked with my colleague to set up perfwiz and poolmon to try to identify the process (or processes) that were leaking. The theory was that a runaway leak could strip the server of valuable non-paged pool memory which could cause the SBCore check to fail and generate the errors and shutdown event. I must admit, perfwiz and poolmon never were my strong points, so even after we got some results back, the review didn’t come up with a smoking gun.

Then my associate found a tip that I’d not heard of before, even though I regularly modify settings where this tip was found. He opened the Task Manager on the server, selected the Processes tab, then opened Select Columns under the View menu. In here, he enabled the “Memory – Non-paged Pool” column and then sorted the Task Manager process list by that column. Sure enough, he not only quickly found the culprit, but also could sit and watch the Non-paged Pool count grow steadily right before his eyes. The service causing the problem? spoolsv.exe, the print spooler service.
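
Watching the column by eye works, but the same "which process only ever grows?" check can be sketched in code if you have periodic readings to feed it. The Python below is an illustration only: `steady_growers` is a made-up name, and it assumes you have already collected the samples yourself (for instance by noting the Task Manager values at intervals, or by exporting the per-process "Pool Nonpaged Bytes" counter from Performance Monitor). It does not read anything from the system.

```python
def steady_growers(samples, min_growth_bytes):
    """Find processes whose non-paged pool usage only ever grows.

    samples: list of {process_name: nonpaged_pool_bytes} dicts, oldest first.
    Returns {name: growth_in_bytes}, largest growth first, for processes that
    appear in every sample, never shrink, and grew by at least min_growth_bytes.
    """
    if len(samples) < 2:
        return {}
    growers = {}
    for name in samples[0]:
        series = [s[name] for s in samples if name in s]
        if len(series) != len(samples):
            continue  # process was missing from at least one sample; skip it
        never_shrinks = all(later >= earlier for earlier, later in zip(series, series[1:]))
        growth = series[-1] - series[0]
        if never_shrinks and growth >= min_growth_bytes:
            growers[name] = growth
    return dict(sorted(growers.items(), key=lambda item: item[1], reverse=True))
```

A healthy process tends to bounce up and down, so it drops out of the result; a leaking one shows up at the top with the amount it has gained over the sampling window.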

A quick bit of Googling on his part ultimately led him to this post from Tek-Tips which helped him identify the root cause of the problem: HP Standard TCP/IP ports for printers on the server. He changed the port types for the printers from HP Standard TCP/IP ports to Standard TCP/IP ports, and the server hasn’t shut down again since.

Turns out, there is a KB on this situation, too, MS KB 933999. And in going back and looking further, the server was logging the Srv 2019 errors in the event logs as well. Since we were sidetracked by the anomalous SBCore behavior, we overlooked the 2019 errors as a possible factor.

In the end, I learned two things from this. One, you can track non-paged pool memory usage in Task Manager (which really isn’t a *revelation* per se, just something that I wouldn’t have necessarily deliberately gone out and looked for), and two, memory leak issues can cause anomalous SBCore errors and the shutdown of an SBS server. The good news is that the server was shutting down “normally” because of the SBCore misfire instead of totally running out of non-paged pool memory and crashing, as MS KB 933999 points out can happen. Bottom line, customer happy, and tech support further educated!

Categories : How To, SBS, Troubleshooting
Comments (0)
Dec
09

Windows Activation Errors

Posted by: Q | Comments (0)

One of the advantages of the activation process in newer versions of Windows is that you can install the OS in evaluation mode for 60 days without having to use a license key. Additionally, you can extend this evaluation for more than 60 days by following steps outlined in several public posts (I’m including this link to Sean Daniel’s post on this).

A critical step in this process, however, is the restart of the box AFTER the slmgr.vbs -rearm command has been run. If the system is NOT restarted after this process, some unusual behaviors can be observed. This post is to identify the specific errors that can result from this specific set of circumstances so that should someone run across this situation you can see what may be going on.

The Windows Activation Error from an slmgr -rearm without a restart.

I recently ran into this issue with an SBS 2008 server. When signing into the server, the above error dialog appeared on the server. Closing the error allowed continued normal use of the server, both from an interactive login point of view as well as from a remote resource use point of view. Checking the state of the activation window using the slmgr.vbs script generated the error below:

The error appears quickly (unlike the normal response of the slmgr.vbs script) and the key element is the error code. The 0xC004D302 indicates that an slmgr.vbs -rearm has been run, but the server has not been restarted. In the case of this system, a normal restart of the system returned the box to normal operation without Activation errors and slmgr.vbs ran correctly.

NOTE: This does not cover ALL possible causes for the Windows Activation Errors tied in with slmgr.vbs script errors. It is possible that this behavior could indicate other issues. But if you can log in and use the system “normally” after seeing this error (other activation errors prevent you from completing the login process and you never get to a desktop), chances are you just need to restart the server to return to normal behavior.
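
If you check activation state from a script, the error code is the thing to look for. Here is a small Python sketch of that idea. The names are invented for this example, and two things are assumptions on my part rather than facts from the incident above: that `-dli` is the slmgr.vbs option you want to query with, and that slmgr.vbs lives in `C:\Windows\System32`. The check itself just searches whatever text slmgr returns for the 0xC004D302 code.

```python
import subprocess

# Error code described above: rearm was run but the box has not been restarted.
REARM_PENDING_RESTART = "0xC004D302"


def rearm_restart_pending(slmgr_output):
    """True if the given slmgr output mentions the rearm-without-restart code."""
    return REARM_PENDING_RESTART.lower() in slmgr_output.lower()


def query_license_info():
    """Run slmgr.vbs through cscript and return its text output (Windows only).

    The -dli option and the script path are assumptions; adjust for your system."""
    result = subprocess.run(
        ["cscript", "//Nologo", r"C:\Windows\System32\slmgr.vbs", "-dli"],
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr
```

On a server, `rearm_restart_pending(query_license_info())` returning True would suggest a restart is the next step, with the same caveat as the note above: other problems can produce activation errors too.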

Categories : Troubleshooting
Aug
18

More Fun with SBS 2008 and Sharepoint Updates

Posted by: Q | Comments (0)

Anyone who has been dealing with SBS 2008 for the last couple of months knows that there have been issues with recent Sharepoint and SBS 2008 updates:

Companyweb Inaccessible After Sharepoint 3.0 Service Pack 2

Files in Companyweb are Opening Read-Only After SBS 2008 UR2

Sharepoint Service 3 Search event errors after an SBS 2008 Update Rollup

Event 2436 for Sharepoint Services 3 Search

Bottom line, it’s not been an easy road. Fortunately, the SBS team have done a good job of documenting the issues as they come up. Unfortunately, not everything has been caught yet. As I found out this week.

I’ve had two new SBS 2008 deployments in the last two months. One a migration (won’t go there), and the other a clean install. Ironically, the clean install is the one that’s caused me the most grief. The initial install went smoothly, and we’ve been keeping up to date with all the updates. Based on the information above, we knew to install the Sharepoint 3 SP2 before installing SBS 2008 UR2, and flipped the database off of Read Only.

Yesterday, I went to create a new security group. I launched the Add Group Wizard from the SBS 2008 console and was immediately greeted with:

“Windows SBS 2008 Add Group Wizard has stopped working”

The first wizard screen never even launched. Of course, I started digging through the addgroup.log file in C:\Program Files\Windows Small Business Server\Logs, and found the following after hunting for several minutes:

An exception of type 'Type: System.Data.SqlClient.SqlException, System.Data, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089' has occurred.

Message: Access to table dbo.Versions is blocked because the signature is not valid.

In the stack dump that followed, many of the references were to Sharepoint. “Ah ha!” I thought. “The Add Group Wizard also does some things in Sharepoint!” and I went off to look at Sharepoint. Sure enough, companyweb wouldn’t come up. So, I went back to the Companyweb Inaccessible After Sharepoint 3.0 Service Pack 2 post and went through those steps again. I verified that the database was not read-only, then I went through and followed the steps to re-run the setup wizard from the command line. Uh, oh, got errors. Fortunately, the psconfig command had me look at the PSCDiagnostics log in C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\LOGS. Unfortunately, those logs didn’t really tell me anything useful. What I found was this:

08/17/2009 17:12:59  1  ERR        One or more configuration tasks has failed to execute

08/17/2009 17:12:59  1  INF        Entering function TaskDriver.Stop

08/17/2009 17:12:59  1  INF          Entering function StringResourceManager.GetResourceString

08/17/2009 17:12:59  1  INF            Resource id to be retrieved is PostSetupConfigurationFailedEventLog for language English (United States)

08/17/2009 17:12:59  1  INF            Resource retrieved id PostSetupConfigurationFailedEventLog is Configuration of SharePoint Products and Technologies failed.  Configuration must be performed in order for this product to operate properly.  To diagnose the problem, review the extended error information located at {0}, fix the problem, and run this configuration wizard again.

08/17/2009 17:12:59  1  INF          Leaving function StringResourceManager.GetResourceString

08/17/2009 17:12:59  1  ERR          Configuration of SharePoint Products and Technologies failed.  Configuration must be performed in order for this product to operate properly.  To diagnose the problem, review the extended error information located at C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\LOGS\PSCDiagnostics_8_17_2009_17_7_9_507_298886299.log, fix the problem, and run this configuration wizard again.

I actually found a reference to the solution in the comments on the Companyweb Inaccessible After Sharepoint 3.0 Service Pack 2 post. Not directly, but one of the comments mentions that an account name was changed after the initial setup. I haven’t renamed any accounts, but I was reminded that I was running the psconfig command under a different account than had been used to initially install the Sharepoint SP2 update. I logged out of that account and logged back in with the account that was used to install the update, and the psconfig command completed successfully.

Woo hoo! Got it working! Only, http://companyweb and the Sharepoint Central Administration 3.0 sites still would not come up. I once again connected to the database via SQL Management Studio (reminder: run that with elevated permissions or you’ll never authenticate successfully) and verified that it was not read only. And the services were running. I checked the web site configuration in IIS and found the issue – all of the web sites had stopped. That’s when I remembered getting all the alerts overnight about the World Wide Web Publishing Service and the TS Gateway service being stopped. I had started them again first thing this morning and promptly forgot about them. Sure enough, when I checked again, they were both stopped (not surprised that the TS Gateway service stopped since it’s dependent upon the WWW Publishing service). I started both services and both companyweb and Sharepoint Central Administration were back online.

And I was able to finally add the one security group I needed to get added.

Takeaways from this process that aren’t documented in the SBS blog posts:

  1. If the Sharepoint SP2 update doesn’t take the first time and you need to run the psconfig command manually to complete the install, make sure you are running the command from the same user account that was used to attempt to install SP2 in the first place.
  2. Note that the psconfig command stops the World Wide Web Publishing Service (and TS Gateway) and does NOT restart them automatically.
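
For the second takeaway, a post-psconfig check is simple to script. This Python sketch is one way to do it, not an official procedure: it asks `sc query` for each service's state and calls `net start` for anything that is not running. The short service names `W3SVC` and `TSGateway` are my assumption for the World Wide Web Publishing Service and TS Gateway; confirm them in the Services console before relying on this. The function names are made up for the example.

```python
import re
import subprocess

# Assumed short names; W3SVC is listed first because TS Gateway depends on it.
SERVICES = ["W3SVC", "TSGateway"]


def parse_state(sc_output):
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output, or None."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def ensure_running(name):
    """Start the service with `net start` unless `sc query` already reports RUNNING."""
    output = subprocess.run(["sc", "query", name], capture_output=True, text=True).stdout
    state = parse_state(output)
    if state != "RUNNING":
        subprocess.run(["net", "start", name], check=False)
    return state
```

Running `for name in SERVICES: ensure_running(name)` from an elevated prompt after psconfig finishes would bring both services back without waiting for the monitoring alerts.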
Jun
10

Getting your IP back

Posted by: Q | Comments (0)

So you’re having trouble getting to the Internet? Can’t ping the Internet gateway? Can’t ping your own IP address? Have network adapters that refuse to enable or disable? Could be a corrupt IP stack. You can take a look at MSKB 299357, or you can follow these steps:

  1. Make sure you’re logged in with a local administrator account.
  2. Open a command prompt.
  3. Run the following command:
    netsh int ip reset logfile.txt
    where logfile.txt is the name of a file where the command can write its output.
  4. When the command completes, run it again with a different filename for the output file.
  5. When that run completes, run it one more time, again with a different filename for the log file.
  6. Restart the computer in Safe Mode with Networking.

This will reset the TCP/IP settings back to sane defaults, which means all adapters in the computer will be set for DHCP. If you’re doing this on an SBS server, restarting in Safe Mode with Networking is absolutely crucial in order to avoid the dreaded 30 minute reboot. When the computer comes back up, set the network settings as needed, then reboot normally.

You may still have other issues, but these steps will get you a nice, clean, DHCP-enabled set of network adapters in the system.
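
If you would rather not type the command three times, steps 3 through 5 can be sketched as a short Python script. This is an illustration only: the log file names (`ipreset1.txt` and so on) and both function names are invented here, and the script still has to be run from an elevated prompt under a local administrator account. It does not handle the Safe Mode restart in step 6; that part stays manual.

```python
import subprocess


def reset_commands(runs=3, prefix="ipreset"):
    """Build one `netsh int ip reset` command per run, each with its own log file."""
    return [["netsh", "int", "ip", "reset", f"{prefix}{n}.txt"] for n in range(1, runs + 1)]


def run_resets(commands):
    """Run the commands in order and return the exit code of each one."""
    return [subprocess.run(command, check=False).returncode for command in commands]
```

`run_resets(reset_commands())` performs the three passes in order, writing a separate log file for each in the current directory, so you can compare them afterward if something looks off.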

Categories : How To, Troubleshooting
Apr
22

SSL Certificate Validation

Posted by: Q | Comments (0)

I put up a post this morning regarding SSL certificate request validation over on the Third Tier web site. If you’ve been wondering how SSL certificates work in SBS 2008 or if you’re about to renew an SSL certificate on an SBS 2003 box, you might want to check out that post.


You can almost always count on interesting things happening during Update Weekend. Sometimes a patch will yield unexpected results, sometimes you lose access to the server after initiating a restart (and yet the server doesn’t actually restart), and so on. Well, this past weekend was no different, but the types of issues encountered were.

As such, I’m going to start a new series of posts in the vein of demonstrating how troubleshooting was approached during a particular situation to help others identify other possible troubleshooting steps or avenues when encountering problems. We’ll start with a rather typical behavior (restarted a server remotely and could not get access back to the server when it should have come up) that had a very unusual root problem.

Read More→

Categories : Troubleshooting
Comments (0)
Apr
15

Remotely Installing This Month’s ISA Update

Posted by: Q | Comments (0)

Just a heads-up for those of you who remotely install security updates for your customers. This month includes an update for ISA, and if you don’t know about it beforehand, you could end up in a bit of a jam.

As expected, when installing the ISA update, access to the Internet through the server is interrupted. Unlike some previous updates, however, when the installation of this update completes, Internet access is NOT restored. You don’t get Internet back until you restart the server.

So if you don’t have some mechanism in place for restarting the server automatically after updates install, you could find yourself, and your customer, in a rather unexpected place.


As more and more anti-spam solutions start doing “interesting” things with SMTP and mail delivery, there is an increased chance of users reporting that mail messages to certain domains are delayed. Unlike a full non-delivery report (NDR) which will list the SMTP error codes for easy identification of the reason for the rejection, a delayed delivery report could be the result of an Internet connection issue, spam filter, offline server, or any number of other causes. The remainder of this post details how to track down possible causes for Internet delivery issues. Read More→

Categories : How To, Troubleshooting
Comments (0)
Mar
28

Restoring SBS 2008 to Different Hardware

Posted by: Q | Comments (1)

While doing some testing on the restore capabilities of SBS 2008 using the native Server 2008 backup and restore tools, I ran across an interesting tidbit regarding the restore process. Once I thought about it, it made sense, but not having tested a full system restore before, I hadn’t encountered it.

When doing a bare metal restore of SBS 2008 using the native Windows Backup tools, your restore system must match the disk configuration of the source server as closely as possible. Specifically, if you have your backup from a server with two partitions on a single volume, you must restore to a single volume whose size is at least as large as the source volume. You cannot restore the two partitions from the original backup to a system with two volumes and expect that one partition would restore to one volume and the second partition would restore to the second volume. If your backup came from a system with a single volume and two partitions, you must restore to a system with a single volume so the backup can put two partitions on it.

I’m assuming that the reverse is true (if you have two volumes as the source for the backup, you must have two volumes for the restore) but have not had the ability to test this yet.

Again, this holds for a bare metal restore using the recovery method available when booting from the SBS 2008 installation CD. Using the native tools when SBS 2008 is running, you have the option to restore to alternate locations.
