On Dec 3, 2013 a server crash wiped out the database...

Posted: Thu Dec 05, 2013 6:43 pm
by webmaster
Some astute individuals out there may have noticed that the website hasn't been accessible since about late Sunday night... well, funny thing, turns out there was a monumental system failure which corrupted just about everything possible, so we pretty much lost all email and all database tables. Backups? Yes, well... normally we're very good at keeping backups, but the backups that were on the disk also got corrupted, and the external backup was scheduled to be hooked up the next morning.

It's likely there is a semi-recent backup on some hard drive somewhere, but it's going to take time to find. So I figured there probably wouldn't be any better time to start fresh with the forums, and when we can locate a backup of all the forum threads then we can post it as an archive.

I'll let Gary explain in more detail when he has the time.

Posted: Fri Dec 06, 2013 9:59 am
by I've commented.
I'd like it if you'd restore Absolute Anime to normal. In the 'anime profiles' section and the 'character profiles' section, the comments are missing, and so is the comment box. I hope you fix Absolute Anime back to normal.

Re: What a week!

Posted: Fri Dec 06, 2013 12:59 pm
by Tyche
Starting fresh! Whether you can get the forums back to normal matters not to me. It's kinda nice not having all the dead threads anymore.

Re: What a week!

Posted: Fri Dec 06, 2013 1:00 pm
by Tyche
^That was me. Forgot I had to activate my account first xD

Re: What a week!

Posted: Sat Dec 07, 2013 12:27 am
by Theo
First off, let me apologize for everything that has happened. As Ken can probably attest, “protect the data” is a motto that I’ve been living by for some years. My second job had that written on a whiteboard with permanent marker.

So here is the situation as it sits. About 3 weeks ago I took one of our two iSCSI servers offline. It was the backup. The purpose of this was to upgrade the drive capacity of this unit and to then make it the primary, and then do the same to the old one. When I say old, I don’t mean 10-year-old hardware. The upgrade was actually scheduled to go in place that next morning.

At exactly 2:00 am PST Sunday morning, the virtual server I had running as the iSCSI target initiated a check, which immediately caused the box to freeze. At around 5:00 am PST I received a call from a client about email being down. Checking into it, I had one of the techs from the colo reboot the box. It came back just fine. Mind you, this wasn’t a hardware failure (as this is a very redundant machine) but rather a software hiccup.

I spent the next several hours attempting to fix the issue. The iSCSI is built on Sun Solaris 11 running zpool and ZFS. The underlying disk drives are virtual (4 of them, each 512GB – VMware limitation), creating a 2TB iSCSI storage system. It sounds small, but the underlying drives are high-speed, very redundant drives to prevent failures. The problem is that the 4 “virtual” drives that make up the 2TB share were marked as bad. Again, the underlying disks were 100% fine.

I worked with a Solaris tech who guided me through attempting to fix this problem. In the end, the commands we worked through managed to get the drive working again, just with no data. So we stopped at that point to ensure that we didn’t cause any additional damage to the unit, and we have contacted a data recovery specialist.

We are in the process of arranging the terms of the contract and the shipping of the hardware to the consultant. At the end of this, we’re looking at about $6,000 in total fees, with no guarantee of data recovery. It will also take 10 days or so to find out whether the data is viable.

Besides the web site, I myself had amassed 10+ years’ worth of data, emails, etc. My data was on a second virtual disk array (so we’re actually talking about 2 x 2TB iSCSI shares’ worth of data).

This is a series of unfortunate events in that we’ve been very diligent about leasing redundant hardware (high-end Sun boxes, commercial RAID solutions, etc.) and software, and using as many underlying protections as we can, and yet we still failed you, our community.

The hosting side of this business has been generating little revenue for us over the years. We have kept it in place mostly because it breaks even, and we have a lot of friends that utilize our services (as well as a few valued customers). In 2013 alone we’ve upgraded the bulk of our hardware to plan for the future of the business and to better support the existing clients that we have. Unfortunately that’s put us in a tough situation. We will continue to attempt to recover data until it is deemed that we cannot recover it. There is no ETC for that at this time. In 15 years we’ve never lost any significant data and have been able to recover from any outages in less than a day (worst case scenario) with 100% of that data intact. This is Murphy’s Law; the small period of time when you’re not protected is when you will fail. That was a 3 week span out of 15 years.

I have been working painstakingly to get everything up to at least a usable state (I’m literally 100 hours into a 6-day week right now). For the AA site, there will need to be some performance tweaking, and I’m going to have one of my guys on it once we get the remainder of our clients back online.

Ken has done an outstanding job with AA over the years and I know he’s put his heart and soul into it, and as a personal friend since college (a lifetime ago) it hits me hard to know that I’ve impacted him (and his users) in this way. I was watching the site when it received its 1 millionth hit after 2 years, and I was there when it hit 1 million hits per day (2005-ish?).

With all of this, you have my heartfelt apology.

From the Hold Stead perspective, we will be winding down the commercial hosting. My friend’s sister company will continue to run this site (as well as a few others), as they are the ones that leased us the hardware. So this will go on…

As we will rebuild, bigger and better, we must…

AKA Gary Smith


Posted: Sat Dec 07, 2013 2:09 am
by webmaster
I've commented. wrote:I'd like it if you'd restore Absolute Anime to normal. In the 'anime profiles' section and the 'character profiles' section, the comments are missing, and so is the comment box. I hope you fix Absolute Anime back to normal.
Yeah, I'd really like that as well! I just wish it were possible. All the comments were in the database, so for now they are gone and we'll just have to wait and see how the data recovery goes. Once we get phpMyAdmin installed I'll be able to set up the database so we can get the comment feature going again.

Posted: Sun Dec 08, 2013 8:51 am
by I've commented.
I had to re-register.

Posted: Sun Dec 08, 2013 9:00 am
by I've commented.
That new "posting review" thing is making posting difficult.

Posted: Sun Dec 08, 2013 9:40 am
by I've commented.
I hope there is a working backup so that Absolute Anime can be fully restored.

Posted: Sun Dec 08, 2013 9:44 am
by I've commented.
Fix the clock on Absolute Anime. The time displayed is incorrect.