Frogboy

Hair pulling

This is just unbelievable. I am sure many of you, like us, have dealt with plenty of machines that run fine 24/7 for years and years.

Today WinCustomize lost another server to total hardware failure. This is the THIRD server in 3 months. This one in some ways was worse and the timing was particularly bad.

The SQL database was on that machine. Luckily, because of our previous experiences, we have backups. But because of an additional glitch, the auto-backup didn't work, which means we lost all the database changes since our last manual backup, which was just before we left for the holidays. The net result is that we lost any uploads between the time we went away for Christmas (22nd?) and yesterday.

You can imagine how frustrated we are. Oh, we have a pretty good idea of the commonality here (all 3 servers were quite new, from the same provider, with the same OS). But we won't get into public finger-pointing.

It's just incredible though. We have el-cheapo servers here that have run 24/7 for the past 5 years without any problems.

We'll have the libraries restored with what we have. If authors who have recently submitted stuff could resubmit it, we would appreciate it, and we apologize for the inconvenience. Ironically, we have the files (the crash in November wiped out the physical files but the database was fine). I'm just glad we keep backups.

Thanks for your patience and support. Unfortunately, we are now down a main server again. This one was our fastest/newest (and most expensive) machine.

If anyone needs me, I'll be over here pulling my hair out.

18,290 views 47 replies
Reply #27
Sorry to hear that, Frogboy.

If you need to, you can go here: http://www.hairclub.com/google/

Reply #28
Not sure which backup software you are using; we are using Backup Exec. In case you want to share more information, I don't mind if you contact me directly as well.

Besides bringing in new servers, below are some of my comments:

1. Consider having a dedicated SQL server. Currently I observe that the skin library is unavailable due to several issues (1. web server issue: can't handle further requests; 2. DB issue; 3. hardware issue).

I believe the majority of the weekend downtime (not counting these three special cases) is due to web front-end issues. You can still provide DB access to the other web front-end servers even if one particular web server is down.

2. Consider separating the web applications onto different servers, or using the 'application pool' feature of IIS6 (from Windows Server 2003). I understand you are using IIS5/W2K? This can prevent the situation where the 'skin library' site brings down other applications as well (e.g. customer page, Stardock Central, etc.).

Also, there are many performance tuning areas on both the web servers and the DB servers.

3. Hmm...errr... any plan to migrate the front end to ASP.NET? I believe compiled code is faster than the ASP engine, which tries to render the ASP code every time?

4. This may be off-topic, but currently the logon to Stardock Central still needs to validate the account against three servers (1. sdcentral.stardock.com 2. wincustomize.com 3. galciv.com). I would prefer that we could still access Stardock Central even when the latter two websites are not available.
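
To make point 4 concrete, here is a minimal sketch of a login that tolerates the two secondary sites being down. It is only an illustration, not how Stardock Central actually works: the `login` function, the `check` callback and its use of `ConnectionError` are all hypothetical, and only the three host names come from the post above.

```python
from typing import Callable, List, Tuple

# Host names as listed in the post; everything else here is hypothetical.
PRIMARY = "sdcentral.stardock.com"
SECONDARY = ["wincustomize.com", "galciv.com"]

# Assumed checker: (host, user, password) -> True if the account is valid.
# It is assumed to raise ConnectionError when the host cannot be reached.
Checker = Callable[[str, str, str], bool]


def login(user: str, password: str, check: Checker) -> Tuple[bool, List[str]]:
    """Return (logged_in, unreachable_hosts).

    The primary host must be reachable and accept the credentials.
    A secondary host that is unreachable is recorded and skipped;
    a secondary host that answers and rejects the account fails the login.
    """
    unreachable: List[str] = []
    try:
        if not check(PRIMARY, user, password):
            return False, unreachable
    except ConnectionError:
        # Without the primary server there is nothing to log in to.
        return False, [PRIMARY]

    for host in SECONDARY:
        try:
            if not check(host, user, password):
                return False, unreachable
        except ConnectionError:
            # Degrade gracefully: note the outage and carry on.
            unreachable.append(host)
    return True, unreachable
```

The caller can use the returned list of unreachable hosts to show a notice (for example, that skin downloads are temporarily unavailable) while still letting the user in.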
Reply #29
Gee, you'd think the terrorists would have better things to attack than Stardock's servers. Taking our fun away like that, it's downright dirty.

We'll be here patiently waiting for your return. It's a shame this had to happen. I'm sure WC will pull out of this better than ever.
Reply #30
Once a series begins, it may go on... I hope you don't get any more major trouble and that things will be as secure as possible with the new servers...
Reply #31
Have you tried hitting the server with a hammer yet? Sometimes that works with the old IBM PC.

And in this day and age, I thought reliability would be taken for granted.
Reply #32
Hope things go well with the repair, and thanks for all the hard work to those dealing with the server problems.
Reply #34
The first thing I would do is point the finger... Do you have an agreement with the "brandname poopmachine" that you can't mention that their POS machines crash all the time?

I would label them loud and clear - Get their attention - You don't get millions of hits for nothing.

Reply #35
Brad......Dreadful news but as always you will bounce back. You must take heart from all the messages on this thread and don't be shy in looking at some of the offers that people are making from their hearts. It's at times like this that you realise how many people are counting on you and the rest of the team there.....and what is more everyone is 100 percent behind you all the way.......no moans no groans....just support and that says a million things about this wonderful community that you have worked so hard to build. You must be feeling like someone has it in for you at the moment but you will come out of all this with flying colours. You are a fighter and you have a whole army behind you
Reply #36
That's the way computers go, isn't it.... Just when u think it's all working, something crashes and u're back to zip-zero... Happens to every one of us, doesn't it?

Well, I cannot offer my help, being a poor uni student and having no contacts to IT pros at all, but I want to tell u that I feel your pain either way! Good wishes from Germany! Get WinCust back up as soon as ya can, coz it rocks! Keep it up, Frogboy!!
Reply #37

Grumble, mutter, mutter, grouch.....


Still....'twas a little forced holiday.....


Good luck with resolving everything, guys....

Reply #38
Curious whether you are using RAID, mirroring, or striping?
Reply #40
You know, if you'd just downgrade the quality of the products and web site to the "below average" range, a lot of folks would stop visiting and the load on your POS servers would decrease to the point where maybe they could manage the traffic.

Yup. I think that's the cheapest solution. Don't allow any high-quality art on the site and riddle the Stardock apps with bugs. Presto! Everything becomes manageable again!





Jeff
Reply #41
Ay caramba! That's some bad news indeed.
But as has been said before, we're all behind you and would help in any way humanly possible to get things back to *normal* again.
I don't think another subscription drive is a bad idea. Considering the circumstances, it may be your best bet.

I hope everything goes well from now on.
Reply #42

Cartier, thanks for your suggestions, but believe me, we know what we're doing.


1) It takes several servers to run this site including multiple dedicated SQL boxes.


2) Every server that has gone down was running Win 2K3. 


3) We plan to migrate to ASP.NET, but you have to understand: our IT dept. (Stardock's IT dept.) is salaried, which means cost is an issue. Compiled code is much faster, but it is not a major issue at this time.


4) SDCentral was unaffected.

Reply #43
Ugh....another dead server. Maybe it's time to send them to wherever you guys got 'em from and demand your money back.

Frog, save yourself the aggravation. Build your OWN boxes, and maintain them right at Stardock. It'll be cheaper and better for you in the long run. The best server I've seen ran 6 1/2 years without a crash, and the guy built it himself. Course, that was NT4 on a Xeon 533 with 2 gigs of RAM...

Seriously. I think you'll do better with some homebrew boxes. My $.02
Reply #44
Brad, I had trouble with SDC last night... I couldn't log in (tried about 10 times over about 2 hours). Of course, it could be that this has nothing to do with the server failure, but I thought I'd report it anyway.
Reply #46
Maybe it's just me, but it seems like computers (both servers and PCs) keep getting cheaper and cheaper in their construction (not just price). Quality just continues to go down as more and more components are mfr'd in Asia to ever lower standards. Which might explain why your 5 yr old servers have performed more reliably than your brand-spanking-new turds...

Sad, but I would have happily bought a new system from one of the big guys a few years ago. Now I'd rather just purchase individual components at a premium, just to know what exactly I was putting in my machine. Profits from computer sales continue to decline and I'm now thoroughly convinced that mfr's are looking for any way possible to cheapen their product. Hope I'm wrong, but I've been disappointed with just about every one of the machines I've purchased in the last 3 years.

Sorry to ramble at such length. Great job WC on getting things up and running again!!

-C77
Reply #47
"2) Every server that has gone down was running Win 2K3."

That explains the software problem.