server outage

For the discussion of infrastructure improvements and changes.

Moderator: Moderators

abma
Spring Developer
Posts: 3798
Joined: 01 Jun 2009, 00:08

server outage

Post by abma »

we had a short outage today from 12:36 to 16:41. The host machine crashed, and after the reboot the spring vm didn't start because of a config error (which is fixed now). This is why it took this long.

my/our lesson from this is to set up a backup server with a sync script:

http://springrts.com/mantis/view.php?id=4619
MetalSucker
Posts: 98
Joined: 22 Sep 2014, 20:29

Re: server outage

Post by MetalSucker »

mysqlhotcopy/mysqldump + dar
qray
Posts: 377
Joined: 02 Feb 2009, 18:49

Re: server outage

Post by qray »

Maybe I don't get how the setup works (or is supposed to work), but wouldn't multi-master replication be an idea to keep the server databases synced in both directions (each server is master and slave to the other)?
abma
Spring Developer
Posts: 3798
Joined: 01 Jun 2009, 00:08

Re: server outage

Post by abma »

i don't think the current lobby server handles updates from the database nicely... so a script which dumps the db, copies it to the backup server and restarts the backup lobby would be better atm.

user renames / logging are the problematic parts i guess.
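
A rough sketch of what such a sync script could look like, in Python. Everything named here is an assumption for illustration: the database name `lobby`, the host `backup.example.org`, the dump path and the `lobbyserver` service are placeholders, not the real setup. It also assumes MySQL credentials come from `~/.my.cnf`, that the backup lobby runs under systemd, and that the dump gets loaded into the backup database before the restart (a step the post does not spell out). It does not solve the rename / logging problem.

```python
import shlex
import subprocess
from typing import List

# Hypothetical placeholders, not the real Spring infrastructure.
DB_NAME = "lobby"                    # assumed database name
BACKUP_HOST = "backup.example.org"   # assumed backup server
DUMP_PATH = "/tmp/lobby.sql"         # assumed dump location on both hosts
LOBBY_SERVICE = "lobbyserver"        # assumed systemd unit on the backup host


def build_sync_commands(db: str = DB_NAME, host: str = BACKUP_HOST,
                        dump: str = DUMP_PATH,
                        service: str = LOBBY_SERVICE) -> List[List[str]]:
    """Return the commands for one sync run, in execution order."""
    return [
        # 1. Dump the db; --single-transaction takes a consistent
        #    InnoDB snapshot without locking the tables.
        ["mysqldump", "--single-transaction", f"--result-file={dump}", db],
        # 2. Copy the dump to the backup server.
        ["rsync", "-az", dump, f"{host}:{dump}"],
        # 3. Load the dump into the backup database (assumed step).
        ["ssh", host, f"mysql {shlex.quote(db)} < {shlex.quote(dump)}"],
        # 4. Restart the backup lobby so it starts from the fresh data.
        ["ssh", host, f"systemctl restart {shlex.quote(service)}"],
    ]


def run_sync(dry_run: bool = True) -> List[str]:
    """Run the sync; with dry_run=True only return the command lines."""
    lines = []
    for cmd in build_sync_commands():
        lines.append(shlex.join(cmd))
        if not dry_run:
            # check=True aborts the run as soon as one step fails.
            subprocess.run(cmd, check=True)
    return lines


if __name__ == "__main__":
    for line in run_sync(dry_run=True):
        print(line)
```

Run from cron on the live server, something like this would keep the backup at most one interval behind. The dry-run default is there so the commands can be checked before anything is touched.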
dansan
Server Owner & Developer
Posts: 1203
Joined: 29 May 2010, 23:40

Re: server outage

Post by dansan »

afaik stock mysql doesn't support multi-master replication. imo having just 1 live server at a time and master-slave replication to the backup server would be enough. the servers could be connected via vpn.
PicassoCT
Journeywar Developer & Mapper
Posts: 10450
Joined: 24 Jan 2006, 21:12

Re: server outage

Post by PicassoCT »

Mom-client, dadDev, I'm a server. There, it's out in the open. Hate all you want. I'm gonna leave now.
dansan
Server Owner & Developer
Posts: 1203
Joined: 29 May 2010, 23:40

Re: server outage

Post by dansan »

There will be a planned (though late-announced ;) lobby server offline time today at 10:30 CEST for approximately 20-60 minutes.

The hosting company will check the BIOS and exchange the RAM in the hope of fixing the recurring system crashes.
dansan
Server Owner & Developer
Posts: 1203
Joined: 29 May 2010, 23:40

Re: server outage

Post by dansan »

Server is back up. Let's hope it stays that way.
abma
Spring Developer
Posts: 3798
Joined: 01 Jun 2009, 00:08

Re: server outage

Post by abma »

thanks a lot! *crossing fingers*