My new site vmaware.net has finally gone live. This is my new digital playground for all things virtualization and related topics.
Not much content yet, but that is due to change pretty quickly. :)
---
Update: Well, it turns out vm-aware.net was already occupied by someone with interests very similar to mine, so I've renamed the site to vNinja.net to avoid confusion. Enjoy!
In an effort to clean up my online presence, I've set up a new site, blog.opticalpork.com. This site will, from now on, be my main photography/portfolio site.
This is the first of several planned changes in how/where I publish my online content; more stuff to come later on!
A little while ago, disaster struck. What seemed like a normal day at work suddenly turned into a frenzy unlike anything I had experienced before.
What happened? We realized something was wrong when we lost
contact with one of our non-virtualized servers.
I couldn’t contact it at all; it had just vanished from the face
of our network.
My natural reaction was to run into our server room, to check what had happened. I figured it would be a power supply failure, or NIC failure.
Boy, was I wrong.
It turns out that a plastic pipe going through the wall, providing shielding for the power cables that provide power for the outdoor unit of the cooling system, led water straight into the server room. When I entered the server room, and heard splish-splashy sounds as soon as my feet hit the floor, I immediately grabbed a bucket and held it under the aforementioned pipe. While I stood there, trying to do some damage control, several other people rushed to my assistance.
As soon as there were enough hands on deck trying to get rid of the water, I grabbed the file server and brought it downstairs for some open heart surgery.
It’s a well-known fact that water and servers don’t really mix that well. Even less so when the water in question flows down the walls in your server room, right on top of your main file server. That’s right: water, meet server.
Of course, the very last of our non-rack based servers was located in a straight line below the pipe. Everything else was fine; the rack servers aren’t located directly on the floor, nor is anything else. We did have a good 2cm of water on the floor, but that wasn’t enough to hit the rack servers or UPS’s.
So, what was the end result? One pretty dead server. It did try
to get our hopes up, and initially it did.
At first, things looked good. I removed the HDDs and the power
supplies, opened the cabinet and looked for water damage. The
power supplies seemed to have gotten a bit wet, which is probably
why the server went MIA in the first place. Other than that,
everything looked good. I still had some hope that the data on
the HDDs was undamaged. Considering that I had removed the HDDs,
I tried powering on the server. And, yay, it started up, went
through the BIOS OK and generally seemed like a happy little
server again.
I let it run for a while with no apparent errors or hiccups, so I decided to try booting it with the disks in again. At first, the RAID controller complained that its logical drive(s) were missing, but that was expected after I had started it without the drives. I tried setting the logical drive to online, but then it complained about missing information. My next move was to copy the RAID/logical drive information from the drives to the controller, and that worked perfectly. The server rebooted and started without problems. I let it run for a while with no problems whatsoever; it seemed we had caught a lucky break and could continue running.
Sadly that was not the case, as it only lasted a good 20 minutes
before the server died completely, breaking the RAID as a result.
The drives died, the power supply died, and our inventory is now
one physical file server smaller.
Next, restore from backup. Like most small companies/IT departments, we
do backups to tape. We even have a pretty decent LTO3 based
changer, and we run Tivoli Storage Manager as our backup
software. As this was a physical server that was due to be
replaced with a VM, we decided to restore its data to a new
pre-provisioned VM. That should be a breeze, right?
As anyone that has attempted to restore large amounts of data from a tape library will attest to, things can, and will, fail. Tapes can go bad, drives can go nuts and changers can decide that they don’t want to change anymore. We experienced two of the above;
In the end, we were 100% successful in recovering the data from
our latest backup set. We restored nearly 1 000 000 files (which
also increased the restore time by a huge amount), but the entire
restore process took us close to 56 hours in total.
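To put those numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It is only a sketch: the function name is made up for illustration, the file count is rounded up to a full million, and it assumes a steady restore rate, which a real tape restore almost certainly does not have.

```python
# Rough average restore rate, using the figures quoted above.
FILES_RESTORED = 1_000_000  # "nearly 1 000 000 files", so treat this as an upper bound
RESTORE_HOURS = 56          # "close to 56 hours in total"


def average_rate(files: int, hours: float) -> float:
    """Return the average number of files restored per second."""
    return files / (hours * 3600)


rate = average_rate(FILES_RESTORED, RESTORE_HOURS)
print(f"{rate:.1f} files per second on average")
```

That works out to roughly five files per second, averaged over the whole 56 hours.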
Of course, in hindsight this whole mess could pretty easily have
been avoided, on several different levels:
The most significant thing we could have done, before disaster struck, was to have a proper disaster recovery site in place. Irony has it that we got the quote on the hardware from HP, and software from Veeam, on Tuesday, two days before “the incident”. We have the DR location in place, and the lease contracts have been signed. We even have 100Mbit direct access to the DR site being installed as we speak. If this had happened a month or two from now, we would have been up and running through the whole ordeal. Of course, it could not have happened at a worse time, but when would something like this be well timed, really?
Now, we were already in the process of getting a DR site in place, so both IT and Management knew about the need for a secondary location. What surprised us though, was the sheer amount of files we had to restore from tape, and how much time it took. 56 hours is an extremely long time, especially when you are looking at restore jobs...
This means that our DR site setup won’t be based on tape backups. We can’t rely on tape as the primary medium for restores; it simply takes too long and is too error prone for us to base our business on. The fact of the matter is that even small businesses now have so many files and so much critical data floating around that tape just isn’t feasible anymore. Don’t get me wrong, I’m glad we had tape backups, as we don’t really have the storage space available to do disk based backups right now.
As soon as the DR site is up and running, tape is dead as far as I’m concerned.
I’ll outline our DR site setup later, when we have it in place, but I’m definitely looking into using Virtual Tape Libraries (VTL) with dedup built-in for the new setup. And of course, snapshot based VM backups using Veeam Backup and Replication to the DR location, you know, for those really critical VMs that we can’t live without.
I for one will have backups everywhere from now on.
I initially bought a HP Proliant ML 115 server as a cheap test/lab server for VMware vSphere and miscellaneous rollout projects at work, but all of a sudden I needed it for some other project that required that I install Windows Server 2008 directly on the hardware itself.
As is the story with most HP Proliant servers, you should install it with the tools that HP provides. In the case of the ML 115, you can't use the normal SmartStart setup, but rather its little cousin, the Easy Set-up CD.
The installation started fine, after running through the initial
HP wizard, but when the time came to actually get the
installation started it went all blue screened on me, complaining
about nvstor.sys.
I knew that the Windows 2008 installation medium doesn't include
support for the built-in nVidia NFP3400 SATA storage controller
in RAID mode, but I wasn't running a RAID based setup on it
anyway so that shouldn't cause the problem.
Next I tried installing Windows Server 2008 without using the Easy Set-up CD, in other words just booting from the Windows Server 2008 installation CD, and initially it seemed like it was running OK. That is, until it just stopped at 0% progress at the "Expanding files" section of the installation.
So, there I was. Using the HP tools, the installation ends in a big old BSOD; using the "native" Windows Server 2008 installation, it just stops without any indication of what might be wrong.
As it turns out, the solution was pretty weird. The HDD shipped with the server causes the problem (160GB NHP SATA). I have no idea how, but replacing it with another SATA drive and starting the installation again, with the Easy Set-up CD, fixed it.
The HDD shipped with the server makes the installation of Windows Server 2008 crash, replacing it with a "generic" Western Digital AV-GP 1.5TB SATA drive lets me install without problems.
Obviously the nvstor.sys driver shipped with Windows Server 2008 has problems with some drives, but not all. Imagine that a cheap server, that can run VMware ESX/ESXi right out of the box, can't run Windows Server 2008 with the HDD it came shipped with.
Now, how weird is that? Note that this wasn't tested with Windows Server 2008 R2, so the nvstor.sys file shipped with that version might not have the same problem. Also, I did not try loading newer nVidia drivers during the Windows installation procedure, because a) when using the Easy Set-up CD you don't get the option to load third party drivers, and b) after I figured out that changing the HDD helped, I didn't want to try another manual installation.
Remind me again, why don't we just virtualize everything? In this instance, it would actually be easier (and quicker!) to install ESXi on the bare metal hardware, create a VM and install Windows Server 2008 in that instead of installing Windows Server 2008 on the hardware directly. How the world has indeed changed.
After finishing the installation, I did run into another problem that quite possibly is also related to the nvstor.sys driver. Windows would fail to create partitions if the amount of space used by the partitions exceeded approximately 1TB in total.
Upgrading the server to Windows Server 2008 R2 fixed this issue, and I was able to utilize the full disk. This leads me to think that had I installed Server 2008 R2 from the get-go I would not have seen the installation issues with the original drive at all.
Microsoft has re-released the previously revoked Windows 7 USB/DVD Download Tool. This time around, it's GPL licensed with source-code.
The tool had previously been released and subsequently pulled after Microsoft was made aware that the tool, developed by a third party, included GPL licensed code in the compiled binary.
Personally I'm happy that the tool is available again, and that Microsoft "did the right thing ®" and released it with the proper license.
Maish Saidel-Keesing has revisited his previous post "Hot Add and "Need have have"" where he (like I did) pokes some fun at a rather strange error message in ESXi 4.0. Now that Update 1 is out, Maish tries again, this time with better results.
Read the whole post: "Need have have" - revisited.
I'm glad to say we don't need have have any more!
Yesterday I had to reinstall my home computer due to a botched BIOS flash (don't ask, long story...), and decided that it was time I installed Windows 7 on that computer as well.
Remembering Microsoft's Windows 7 USB/DVD Download Tool, I went looking for the download, only to be met by a 404 (page not found) error when I tried to download it. The whole information/documentation section was still available on the Microsoft Store site, but the downloadable file was missing. No information was given, so I assumed it was a glitch on Microsoft's part and located an alternative download site (CNet) that still had it available.
The tool did its job, and I got Windows 7 Enterprise installed from a USB pendrive without any problems at all, just as expected.
Today, however, all information regarding the tool has been removed. All you get now is a "Sorry, the page you are looking for cannot be found." 404 error when you try to access its previous location, and no explanation is given.
Turns out, Microsoft has indeed pulled the tool from the site. According to Rafael Rivera Jr., this is because he discovered that the Microsoft tool was using code from "CodePlex-hosted (yikes) GPLv2-licensed ImageMaster".
Clearly a breach of the GPL as the Microsoft tool wasn't GPL'ed itself.
Read all the details in Rafael's post "Microsoft lifts GPL code, uses in Microsoft Store tool". I guess that means we are back to using Novicorp WinToFlash again. For more details on WinToFlash, check out my post called "Installing Windows from a USB Stick".
How did this ever slip through Microsoft's QA?
On November 13th Microsoft confirmed that their own internal code review of the tool had uncovered that Rafael Rivera Jr. was indeed right. The tool does contain GPL code. The tool was developed for Microsoft by a third party, but still, this could, and should, have been avoided if Microsoft had conducted a proper code review before releasing the tool into the wild.
So, Microsoft, now what? Well, it seems like they intend to do the only thing they can do, which is to release the whole tool under the GPL:
As a result, we will be making the source code as well as binaries for this tool available next week under the terms of the General Public License v2 as described here, and are also taking measures to apply what we have learned from this experience for future code reviews we perform.
Read the whole statement from Microsoft: Update on the Windows 7 USB/DVD Tool
I must say that even if this shouldn't have happened, Microsoft did the right thing here. They admitted what happened and accepted the natural consequences. Well played.
Over time the boot partition on a Windows Server 2003 installation might just turn out to be too small. There can be various reasons for this, but the fact remains that over time you will accumulate data on the boot drive that you didn't account for when you set it up initially.
Luckily I run almost all of my servers in a VMware based virtualized environment, where it's easy to expand the virtual disks. The problem is that Windows Server 2003 doesn't let you easily expand the boot volume, at least not without downtime. I've previously talked about using tools like GParted to expand the boot volume, but there are easier ways to do it and prevent downtime at the same time!
All you need is love. No, wait, that's something else entirely! All you need is ExtPart. ExtPart is a lovely little 36KB tool that Dell has provided to expand partitions on Dell based servers and storage systems. It is a little-known fact that ExtPart can do the job in any 32 bit Windows Server 2000 or 2003 based install (no 64 bit support, sadly). In Server 2008 there are other methods of doing this.
Enough talk, let's get down to the business at hand.
That's it. The following screenshots outline the process very well, without having to guide you through each step. Have a look!
It can't get much simpler than this, honestly.
A little while ago I mentioned a great little tool called Novicorp WinToFlash.
Seems like Microsoft figured out that was a great little idea, and in conjunction with today's official Windows 7 release, they've also made the Windows 7 USB/DVD Download Tool available.
Since you can buy Windows 7 and then download the ISO directly from the new online Microsoft Store (Can anyone say Apple?!), it makes sense that they have created their own little tool that enables you to install Windows 7 from a USB stick. The tool makes it easy to copy the ISO to a USB stick, and then use that to boot your computer and install from it. Nothing more, nothing less.
I love utilities like these, you know the ones that do one task and do it well?
Now this is something I don't often do as this is mostly a tech blog, but this is huge. Last night Temple of the Dog reunited when Chris Cornell joined Pearl Jam on stage.
Temple of the Dog was Chris Cornell, Jeff Ament, Stone Gossard, Matt Cameron, Mike McCready and Eddie Vedder, all of whom were present at Los Angeles’ Gibson Amphitheatre performing “Hunger Strike” from the self-titled album released in 1991.
Now, can Pearl Jam please come play in Bergen, Norway? And, yes, I wouldn't mind it much if Chris Cornell came along for the ride too...