sync data between clustered servers

HOW TO SYNC DATA BETWEEN CLUSTERED WEB SERVERS

Suppose you have a cluster of load-balanced web servers. The data on these servers has to be identical. One way of doing that is to upload new content to every web server whenever something is added. Wouldn’t it be better if you only needed to update a single web server and the new content was then propagated to all the other servers? This can be done with rsync, run over ssh to make it secure.

Let’s choose one of the web servers in the cluster as the main server, the only one where you’ll be updating content. I chose www1.myhomelab.net, which is one of the servers being load balanced for www.myhomelab.net. The other server, the one I’ll be propagating updates to, is www2.myhomelab.net.

www1.myhomelab.net – the main server where updates/changes are to be made
www2.myhomelab.net – will get synchronized with www1.

All my web content lives in “/var/data/www/html”. Normally you can run this command from the www1 server to sync www2 with www1:

[root@www1]# rsync -ave "ssh -l root" --delete /var/data/www/ www2:/var/data/www/

It will then ask you for root’s password (because of ssh -l root). Type it, press Enter, and you’ll get output similar to this:
root@www2's password:
building file list ... done
html/
html/favicon.ico
html/testdb.php
html/asdf/
html/asdf/.form.php.swp
html/asdf/del.php
html/asdf/form.php
html/asdf/view.php
html/asdf/db_stuff/
html/asdf/db_stuff/db_close.php
html/asdf/db_stuff/db_config.php
html/asdf/db_stuff/db_open.php

sent 30873 bytes  received 368 bytes  6942.44 bytes/sec
total size is 1121503  speedup is 35.90
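Because --delete removes anything on www2 that no longer exists on www1, it can be worth previewing a run before doing it for real. The following is a minimal sketch, assuming rsync's -n (--dry-run) flag; the function name preview_sync is made up for illustration, and the paths are the ones used in this post.

```shell
# preview_sync [USER]: show what a sync would transfer or delete on www2
# without changing anything. USER defaults to root, as in the command above.
preview_sync() {
    user="${1:-root}"
    # -n (--dry-run) makes rsync list its actions instead of performing them.
    # -e is given separately so that its argument is the ssh command.
    rsync -avn -e "ssh -l $user" --delete /var/data/www/ www2:/var/data/www/
}
```

Calling preview_sync prints the same kind of file list as a real run; if the list looks right, run the actual command without -n.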

The contents of /var/data/www/html on both servers will now be identical. Sounds good, right? Just save the command to a script, schedule it to run at a specific interval, and you’re set.

But wait, there’s more: you’re doing the upload as root, and that doesn’t sound good. You should assign a non-admin user to do the uploads instead. I’m gonna assign the user “pilip”, which exists on www2, to do the uploading. Before switching over, make sure that user has write access to the folder that’ll be synchronized. Once permissions have been taken care of, this command can be executed:

[root@www1]# rsync -ave "ssh -l pilip" --delete /var/data/www/ www2:/var/data/www/
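If the command above fails with permission errors, the write-access requirement is the first thing to check. Here is a rough sketch of such a check, meant to be run on www2 while logged in as pilip; the function name check_writable is hypothetical and not part of rsync or ssh.

```shell
# check_writable DIR: report whether the current user can write to DIR.
# Returns 0 when DIR exists and is writable, 1 otherwise.
check_writable() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "writable: $1"
    else
        echo "NOT writable: $1"
        return 1
    fi
}
```

For example, check_writable /var/data/www/html should report the folder as writable before you try the rsync. How you grant the access (ownership, group membership) is up to you and your setup.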

The next thing to do is to cron this. But wait, there’s more: a cron job has to be fully automated, and with the command above you’ll still be asked for pilip’s password every time it’s executed. To overcome this, set up a passwordless ssh connection. On www1, as any user (this is the user that will later run the rsync command above on a schedule), run this command to create your ssh key pair:

[anyuser@www1]$ ssh-keygen -t rsa

When asked to enter or change default settings, just press Enter. Do the same when asked for a passphrase; an empty passphrase is what you want for passwordless ssh. Then, to copy your ssh public key over to www2, issue this command:

[anyuser@www1]$ ssh-copy-id -i ~/.ssh/id_rsa.pub pilip@www2

Enter pilip’s password when asked to. The next time you ssh to www2 as pilip, you’ll log on automatically without being asked for a password. Neat? Great.
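Since cron can’t type a password, it helps to confirm that the key login really works without any prompt. A small sketch, assuming OpenSSH’s BatchMode and ConnectTimeout options; the function name check_passwordless is made up for this example.

```shell
# check_passwordless USER@HOST: verify that ssh can log in without prompting.
check_passwordless() {
    # BatchMode=yes makes ssh fail instead of asking for a password,
    # which is the situation a cron job would be in.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true; then
        echo "key login OK for $1"
    else
        echo "key login FAILED for $1"
        return 1
    fi
}
```

Run check_passwordless pilip@www2 as the same user that will own the cron job, because the key lives in that user’s ~/.ssh folder.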

Then schedule it to run at a specific interval; I set mine to run rsync every 2 minutes. Using cron, run:

[anyuser@www1]$ crontab -e

Then enter this:

*/2 * * * * rsync -ave "ssh -l pilip" --delete /var/data/www/ www2:/var/data/www/

Save and exit. The crontab entry above will execute the rsync command every 2 minutes. This is only for testing purposes; the interval may need to be longer depending on how often updates are made to the site.
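With an interval this short, a slow transfer could still be running when cron starts the next one. One possible way around that is to put the rsync command in a small wrapper script with a lock, sketched below. The script layout, the sync_www function, and the lock path under /tmp are all assumptions for illustration, not something rsync or cron provides.

```shell
#!/bin/sh
# Hypothetical wrapper for the cron job; paths are the ones used in this post.
SRC="/var/data/www/"
DEST="www2:/var/data/www/"
LOCKDIR="${LOCKDIR:-/tmp/sync-www.lock}"   # assumed lock location

sync_www() {
    # mkdir is atomic: it fails if a previous run still holds the lock.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "previous sync still running, skipping"
        return 0
    fi
    rsync -ave "ssh -l pilip" --delete "$SRC" "$DEST"
    status=$?
    # Release the lock. If the script is killed mid-run, the lock
    # directory stays behind and has to be removed by hand.
    rmdir "$LOCKDIR"
    return $status
}
```

To use it, add a final line that calls sync_www, make the script executable, and have the crontab entry run the script instead of the bare rsync command.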

good times.
