myITforum.com Community Forum

Disaster Recover / High Availabity

 
Disaster Recover / High Availabity - 10/5/2009 2:19:01 PM   
rspinelli682

 

Posts: 314
Score: 18
Joined: 1/24/2006
Status: offline
Does anyone have any recommendations regarding DR or HA for a primary?  We have had 2 cases in the last few months in which drives went bad and primaries had to be restored from backup.  It's a whole long story, and if the region had the HW we suggested (RAID 10 vs RAID 5) it wouldn't have been an issue.  Taking HW out of the equation, is there a best way to recover from a primary failure that won't take hours?  In the case of the last primary, it took us about 8 hours to get back up and running:

1) Install SQL - 30 mins
2) Install SCCM - 1 hour
3) Run site repair wizard.  We noticed the number of adverts will impact how long this step takes to finish.  We had around 8k adverts and the repair would do around 1500 adverts an hour (you can see this in smsprov.log while the restore is going) - 6 hrs
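
For anyone wanting to plug in their own numbers, the three steps above can be added up with a rough sketch like the one below. It assumes the repair wizard works through adverts at a constant rate, which is only an assumption based on the roughly 1500 per hour seen in smsprov.log. The function name is made up for illustration.

```python
# Rough restore-time estimate using the figures quoted in the post.
SQL_INSTALL_HOURS = 0.5    # step 1
SCCM_INSTALL_HOURS = 1.0   # step 2
ADVERTS = 8000             # approximate advertisement count
ADVERTS_PER_HOUR = 1500    # repair rate observed in smsprov.log


def estimate_restore_hours(adverts, adverts_per_hour,
                           sql_hours=SQL_INSTALL_HOURS,
                           sccm_hours=SCCM_INSTALL_HOURS):
    """Return (repair_hours, total_hours), assuming a constant repair rate."""
    repair_hours = adverts / adverts_per_hour
    return repair_hours, sql_hours + sccm_hours + repair_hours


repair, total = estimate_restore_hours(ADVERTS, ADVERTS_PER_HOUR)
print(f"repair wizard: {repair:.1f} h, total: {total:.1f} h")
```

Under that linear assumption, every 1,000 adverts removed saves about 40 minutes of repair time. The itemized steps come out a little under the 8 hours reported, so the remainder is presumably rounding plus work not listed in the steps.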

The problem with the primary going down is that all of my secondaries below the primary (in this case 10) are also useless until the primary is restored.  I think 8 hours is pretty good to get everything going again, but business doesn't agree.

The business keeps pushing us to go to SAN, since all our primaries have direct attached storage, but we believe SAN will impact performance.  Does anyone have any input on SAN?

Thank you.
Post #: 1
RE: Disaster Recover / High Availabity - 10/5/2009 3:11:24 PM   
skissinger


Posts: 3592
Score: 285
Joined: 9/13/2001
From: Sherry Kissinger
Status: offline
This is anecdotal, and by NO means scientific, but I've had SAN-attached drives go walkabout, and be completely unrecoverable, requiring a restore from backup anyway.

If they are pushing for SAN, and the costs involved in that, I'd push for RAID 10 local storage instead. Much more reliable to me... but that's based on personal experience again, no actual data.

I know you said not to focus on hardware as the answer, and I'm sorry. But really, 8 hours restoring onto completely different hardware is quite robust to me. The last time I had to do a "real DR" (not a planned for DR test scenario, where I could get all my notes in order, verify I knew where the backups were, etc. before we even really started), it took me much longer than 8 hrs. I think it was closer to 12 to 14 hours before I felt comfortable snatching a few hours of sleep.

Another thought: 8 *thousand* advertisements? Maybe it's time for a "let's clean up old advertisements" initiative? Do you really have 8 thousand applications being deployed?

_____________________________

mofmaster@myitforum.com
My Blog
Microsoft MVP - ConfigMgr

(in reply to rspinelli682)
Post #: 2
RE: Disaster Recover / High Availabity - 10/5/2009 4:55:30 PM   
rspinelli682

 

Posts: 314
Score: 18
Joined: 1/24/2006
Status: offline
Sherry, I totally agree with you, 8 hrs in my opinion is fine, but LOB doesn't agree.

We actually do have 8k applications.  I agree there are probably 1k or so that could be cleaned up, but those 1k will come back pretty quickly.  We have a lot of adverts that are based on user group, so they will never go away.

SCCM is pretty bad when it comes to recovering quickly, and there seems to be no real way around it.  What the LOB wants is that if SCCM goes down, another server kicks in and takes over, no downtime... not going to happen.

(in reply to skissinger)
Post #: 3
RE: Disaster Recover / High Availabity - 10/5/2009 5:55:26 PM   
gjones


Posts: 1589
Score: 108
Joined: 6/5/2001
From: Ottawa, Ontario, Canada
Status: offline
I have to ask what need they are trying to solve, where 8 hours is too long a downtime? If 8 hours is too long, then what about moving each secondary site to a primary child site?

IMHO, I don’t see how moving to a SAN will “fix” the downtime issue.


_____________________________

Garth@enhansoft.com

For a List of my Articles
http://www.myitforum.com/contrib/default.asp?cid=116
Blogs:
http://smsug.ca/blogs/garth_jones/default.aspx
http://myitforum.com/cs2/blogs/gjones/default.aspx


(in reply to rspinelli682)
Post #: 4
RE: Disaster Recover / High Availabity - 10/5/2009 8:16:31 PM   
skissinger


Posts: 3592
Score: 285
Joined: 9/13/2001
From: Sherry Kissinger
Status: offline
Well, if the LOB doesn't agree, you're likely going to need to bring in a top-shelf consultant to help you design the most robust, recoverable design. I'm not sure if you can even get 5 9's with ConfigMgr. You probably can. But I doubt it will be cheap. Your first activity though would be to tell them that in order to design the type of reliability they are looking for you'll need a Microsoft Consultant to come in, evaluate the hierarchy, and make a recommendation based upon their stated need to have "no downtime".

With so many variables in every organization, on the forums we can give some basic recommendations, but each company is different; and what I've seen in my limited career will often not apply to your company. You'll need someone to come in and spend a few days evaluating and coming up with a recommendation.
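
To put "5 9's" next to the 8-hour restore discussed in this thread, here is a small back-of-the-envelope calculation. It is only a sketch: it assumes a 365-day year and treats availability as a simple uptime fraction, and the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_minutes(availability_percent):
    """Yearly downtime budget, in minutes, for a given availability target."""
    return HOURS_PER_YEAR * 60 * (1 - availability_percent / 100)


def availability_percent(outage_hours):
    """Availability over one year given total outage hours in that year."""
    return 100 * (1 - outage_hours / HOURS_PER_YEAR)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} min/year")

print(f"one 8 h outage -> {availability_percent(8):.3f}% for the year")
```

Five nines leaves roughly five minutes of downtime a year, while a single 8-hour outage already puts the year at about 99.9%. That gap is why a "no downtime" requirement is a design exercise and not a tuning exercise.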


(in reply to gjones)
Post #: 5
RE: Disaster Recover / High Availabity - 10/18/2009 3:12:46 AM   
jnelson993


Posts: 910
Score: 145
Joined: 2/18/2005
From: Minneapolis, MN
Status: offline
Yeah, I know I'm coming to this late, but if they want an easy speed up, kill the RAID 5 b.s. and go with RAID 10.  Like Sherry said, SAN won't fix squat, especially since you'll likely be lumped in with everyone else's SAN traffic and won't have any control over limiting the use of those spindles.  And if the read/write heads are moving while your transaction log is trying to write, it can't do its big sequential writes as fast and inserts/updates/deletes are slowed.  RAID 10 is really a giant leap in helping you out here.  Aside from providing better redundancy, it provides much better performance.  Trust me, I've spent months and months on RAID and spindles and logical disks and partition offsets and allocation unit sizes, etc.

And I also agree with Sherry, 8 hours seems very quick.  I suppose you've done it so many times, you've got the process down.  You could speed that up by having an unattended build of the server, including SQL and all the configuration stuff, then you won't spend so long clicking NEXT->NEXT->NEXT.
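
A rough way to see the RAID 5 vs RAID 10 performance gap is the textbook write-penalty rule of thumb: each small random write costs about 4 back-end I/Os on RAID 5 (read data, read parity, write data, write parity) and 2 on RAID 10 (one write per mirror side). The sketch below applies that rule. The disk count, per-disk IOPS and read/write mix are hypothetical values chosen for illustration, and the model ignores controller cache and full-stripe sequential writes.

```python
# Textbook write penalties for small random writes.
WRITE_PENALTY = {"RAID 5": 4, "RAID 10": 2}


def effective_iops(disks, iops_per_disk, write_fraction, penalty):
    """Front-end IOPS an array can sustain for a given write mix.

    Each read costs 1 back-end I/O and each write costs `penalty`
    back-end I/Os, so the raw IOPS are divided by the average cost.
    """
    raw = disks * iops_per_disk
    read_fraction = 1 - write_fraction
    return raw / (read_fraction + write_fraction * penalty)


# Hypothetical array: 8 spindles, 150 IOPS each, 50% writes.
for level, penalty in WRITE_PENALTY.items():
    iops = effective_iops(8, 150, 0.5, penalty)
    print(f"{level}: about {iops:.0f} IOPS")
```

With these made-up inputs the same eight spindles deliver noticeably more usable IOPS as RAID 10 than as RAID 5, and the gap widens as the write fraction grows. Real numbers depend on the controller and workload, so treat this as an illustration only.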


_____________________________

Number2 (John Nelson)
MyITForum - Blog
MyITForum - Forum Posts

(in reply to skissinger)
Post #: 6
RE: Disaster Recover / High Availabity - 10/18/2009 1:42:20 PM   
rspinelli682

 

Posts: 314
Score: 18
Joined: 1/24/2006
Status: offline
Guys, thanks for the info.

John, thanks for the SAN info, that's very helpful.

(in reply to jnelson993)
Post #: 7