Posted by: ZPop
Original: 5/21/2009 11:02 AM

Thursday, May 21, 2009

Backing Up Your Server Using PHP

        by ngungo and lgpiper

Introduction


When the Ma.gnolia site went down a few weeks ago, most people's data were irretrievably lost. One can learn two lessons from this incident. One is that a robust, off-site backup plan is necessary. The other is that backups need to be redundant: if one backup becomes corrupt or unavailable, the data are still available from the second. We deal here with the first of these two problems: developing a routine that ensures regular backups. Our concern is with moderate-sized databases. We are developing a few small web applications, Monpage.com and bookMarkR.us. They may well grow slowly, but they will never grow at all if we don't protect our users' data from the beginning.

Background


We wish to develop a generalized backup routine. The choice of language is of little consequence; we have chosen to program our backup routine in PHP. The core of the backup will be a function that looks something like the following:
 
function dirBK($from, $dest) { 
// This function will recursively copy all files and sub directories
// from a single directory, $from, on a hosting server
// to a single directory, $dest, on a remote storage disk.
 
// rsync seems the best candidate for the file-copying protocol.
}

If there are a lot of data in a huge number of directories (e.g. one directory per monpage account, or one directory per site that each account monitors), using rsync is not without its own problems. It can tie up the I/O on the server, so that websites, and anything else on the hosting server for that matter, become temporarily unavailable. All activity would freeze whilst waiting for rsync to finish. Depending on the amount of data, this shut-down could last up to several hours. One way to solve the problem is to run the backup in piecemeal fashion, backing up one small directory at a time, with a pause between each backup segment. Thus, the code that calls dirBK() must incorporate a loop which spreads the many small backups over an extended period of time, e.g. hours or days.
 
Another problem with rsync arises under non-ideal conditions, such as a disrupted broadband connection, when rsync can hang and do nothing. We must, therefore, find a solution for this problem too.
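One way to limit how long a stalled transfer can hang, offered here as an assumption and not as the approach this article goes on to develop, is to give rsync and ssh their own time limits. rsync's --timeout option aborts the transfer after the given number of seconds without I/O, and ssh's ConnectTimeout option bounds the connection attempt. The helper name rsyncWithTimeouts() and the default values are hypothetical:

```php
<?php
// Hypothetical helper: builds the rsync option string with time limits.
// $ioTimeout      - seconds without I/O before rsync gives up (--timeout)
// $connectTimeout - seconds ssh may spend connecting (-o ConnectTimeout)
// Assumes $keyFile contains no spaces, since it is embedded in the ssh command.
function rsyncWithTimeouts($rsyncBin, $keyFile, $ioTimeout = 60, $connectTimeout = 15) {
    $ssh = 'ssh -i ' . $keyFile . ' -o ConnectTimeout=' . (int) $connectTimeout;
    return $rsyncBin
        . ' -azq --delete'
        . ' --timeout=' . (int) $ioTimeout
        . ' -e ' . escapeshellarg($ssh);
}

// Paths are the placeholders used elsewhere in this article.
echo rsyncWithTimeouts('/path/to/rsync', '/.ssh/ss'), "\n";
```

Suitable values depend on the connection; a limit that is too short will abort healthy but slow transfers.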
 
Given the above considerations, we now have three major tasks, each of which may include some subtasks:
  1. Developing the dirBK($from, $dest) routine.
  2. Developing a general driver to run this routine in a loop.
  3. Scheduling the driver (e.g. running the backup routine regularly using cron).

This article only deals with the first two tasks because the scheduling task is server specific and also because, once the general routine functions properly, automatic scheduling is relatively simple to implement.

Back-up Routine Development


Developing dirBK is rsync specific and secondary to the topic here, so we will treat it last. Just be aware that it can stall and hold up the whole scheme built on our driver routine. We will first describe the driver in simple terms, then add detail as we go along.
 
We begin with the variable, $dirs, which is an array of all directories that need to be backed up. The driver routine then becomes:
 
foreach ($dirs as $dir) {
    $from = 'path/to/dir/'.$dir;
    $dest = 'path/to/destination/'.$dir;
    dirBK($from, $dest);
    sleep(5);
}

This seems to be a good approach: back up each directory, one at a time, pausing a few seconds in between. The length of the pause is a tuning parameter that depends on the efficiency of the I/O. What might still cause problems, however, is the situation in which the dirBK() function halts in the middle of the file transfer process. One solution is to run each individual directory backup as an independent process. Thus, if one transfer breaks down, the others will still be able to continue normally. To do this, we need to implement process spawning, using the pcntl_fork() function. Our modified routine now becomes:
 
foreach ($dirs as $dir) {
    $from = 'path/to/dir/'.$dir;
    $dest = 'path/to/destination/'.$dir;
    $pid = pcntl_fork(); // spawn child process
    if (!$pid) { // pcntl_fork() returns zero in the child process
        dirBK($from, $dest);
        exit;
    }
    sleep(5);
}

We further need to check two things. First, we will want to exclude some directories. For example, a normal directory scan returns the "dot" and "double dot" entries, which must be skipped. Secondly, we must be sure that what we are backing up is, indeed, a directory. So the next version of the routine is:
 
foreach ($dirs as $dir) {
    if ($dir == '.') continue; //exclude '.' and '..' directories
    if ($dir == '..') continue;
    $from = 'path/to/dir/'.$dir;
    if (is_dir($from)) { //make sure array element is a directory
        $dest = 'path/to/destination/'.$dir;
        $pid = pcntl_fork(); // spawn child process
        if (!$pid) { // pcntl_fork() returns zero in the child process
            dirBK($from, $dest);
            exit;
        }
    }
    sleep(5);
}

At the end, the driver quits. The command, sleep(5), gives some breathing room between each backup increment. Sometimes, however, five seconds will not be long enough to ensure that the last cycle is complete. Thus, we give it a little extra time, perhaps an extra 30 seconds. Note that on Unix-like systems the backup processes running as the driver's children are not terminated automatically when the driver exits: a child that is still running, or that has hung for whatever reason, is re-parented and carries on. A hung transfer therefore needs a timeout or an explicit kill to be cleaned up.
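As a rough sketch of how the driver could wait for its children instead of sleeping a fixed time, the following polls them with pcntl_waitpid() and kills any that are still running after a deadline. It assumes the pcntl and posix extensions are loaded in the command-line PHP; waitForChildren() and the two stand-in children are hypothetical and not part of the routine developed in this article.

```php
<?php
// Hypothetical helper: wait up to $deadlineSeconds for the children in $pids,
// then kill whatever is still running. Returns the PIDs that were killed.
function waitForChildren(array $pids, $deadlineSeconds) {
    $deadline = time() + $deadlineSeconds;
    $killed = array();
    while ($pids && time() < $deadline) {
        foreach ($pids as $i => $pid) {
            // WNOHANG: return at once instead of blocking on a hung child
            $done = pcntl_waitpid($pid, $status, WNOHANG);
            if ($done == $pid || $done == -1) {
                unset($pids[$i]);       // finished (or already gone)
            }
        }
        if ($pids) {
            usleep(200000);             // poll five times a second
        }
    }
    foreach ($pids as $pid) {           // still running after the deadline
        posix_kill($pid, SIGKILL);
        pcntl_waitpid($pid, $status);   // reap it so no zombie is left
        $killed[] = $pid;
    }
    return $killed;
}

// Demonstration with stand-in children: one quick, one that "hangs".
$pids = array();
foreach (array(0, 60) as $seconds) {
    $pid = pcntl_fork();
    if ($pid == -1) {                   // fork failed, no child was created
        fwrite(STDERR, "fork failed\n");
        exit(1);
    }
    if ($pid == 0) {                    // child: stands in for dirBK()
        sleep($seconds);
        exit(0);
    }
    $pids[] = $pid;                     // parent: remember the child's PID
}
$killed = waitForChildren($pids, 2);
echo count($killed), " child(ren) had to be killed\n";
```

One caveat: if the child starts rsync through system(), killing the PHP child does not by itself stop the rsync process it launched, so this sketch would need to be combined with a process group or with rsync's own time limits.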
 
Above, we specified an array of the directories to back up, $dirs. There are a number of ways to populate this array. If all of the directories are in the rootBK directory, we can use the scandir() function to populate the array. Our basic routine is now as follows:
 
<?php
$root = 'path/to/rootBK/';
$dirs = scandir($root); //populate the array, $dirs
foreach ($dirs as $dir) {
    if ($dir == '.') continue; //exclude '.' and '..' directories
    if ($dir == '..') continue;
    $from = $root.$dir; //same directory that was scanned
    if (is_dir($from)) { //make sure array element is a directory
        $dest = 'path/to/destination/'.$dir;
        $pid = pcntl_fork(); // spawn child process
        if (!$pid) { // pcntl_fork() returns zero in the child process
            dirBK($from, $dest);
            exit;
        }
    }
    sleep(5);
}
sleep(30);
?>
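The scanning and filtering step can also be pulled out into a small function, which makes it easy to test on its own. This is only a sketch: listSubdirs() is a hypothetical name, and it relies on scandir() returning false when the directory cannot be read.

```php
<?php
// Hypothetical helper: returns the names of the sub-directories of $root.
function listSubdirs($root) {
    $root = rtrim($root, '/') . '/';
    if (!is_dir($root)) {               // missing directory: nothing to back up
        return array();
    }
    $entries = scandir($root);
    if ($entries === false) {           // unreadable directory
        return array();
    }
    $dirs = array();
    foreach ($entries as $entry) {
        if ($entry == '.' || $entry == '..') continue;
        if (is_dir($root . $entry)) {   // test the path under $root, not the bare name
            $dirs[] = $entry;
        }
    }
    return $dirs;
}

// Lists the sub-directories of the current working directory.
print_r(listSubdirs('.'));
```

The driver loop could then iterate over listSubdirs('path/to/rootBK/') without repeating the dot and is_dir() checks.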

It now remains to develop the dirBK routine. The whole routine can be just a system call to the rsync command:
rsync -r /path/to/existingsite/ username@newsite.com:/path/to/newsite/

It may be necessary to extend this with a few more parameters, for example to name the SSH identity (private key) file used to log in:
rsync -azq --delete -e "ssh -i /users/home/myaccount/.ssh/ss" /path/to/backups/ myaccount@myaccount.strongspace.com:/home/myaccount/backups

We won't go into the details of the rsync command here; its manual page (man rsync) documents every option.

We will use the second method above, i.e. with the identity file specified. To break up the command into more comprehensible components, we have:
$rsync = '/path/to/rsync -azq --delete -e "ssh -i /.ssh/ss"';
$from = '/path/to/backups/';
$remote = 'myaccount@myaccount.strongspace.com:/home/myaccount/backups';

Then, as mentioned above, we embed these variables into a PHP system call, remembering that we need a space between each of the three strings, $rsync, $from, and $remote:
system($rsync. ' ' .$from. ' ' .$remote); 
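Because system() hands the whole string to a shell, a directory name containing spaces or shell metacharacters would be split or misread. A hedged sketch of one way around this is to quote the two paths with escapeshellarg() and to check rsync's exit status. buildRsyncCommand() is a hypothetical helper, and this dirBK() takes the rsync option string as a third parameter, which is an assumption of the sketch, not the signature described earlier.

```php
<?php
// Hypothetical helper: quote the two paths so that each reaches rsync as a
// single argument. $rsync (the program and its options) is passed through as is.
function buildRsyncCommand($rsync, $from, $dest) {
    return $rsync . ' ' . escapeshellarg($from) . ' ' . escapeshellarg($dest);
}

// Sketch of dirBK(): run the command and report whether rsync succeeded.
function dirBK($from, $dest, $rsync) {
    exec(buildRsyncCommand($rsync, $from, $dest), $output, $status);
    return $status === 0;   // rsync exits with 0 on success
}

// Show the command that would be run, using this article's placeholder paths.
$rsync = '/path/to/rsync -azq --delete -e "ssh -i /.ssh/ss"';
echo buildRsyncCommand(
    $rsync,
    '/path/to/backups/my site/',
    'myaccount@myaccount.strongspace.com:/home/myaccount/backups'
), "\n";
```

The quoting only protects the local shell. A remote path with spaces is interpreted again on the remote side, so remote directory names are best kept free of special characters.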

Placing this call into the main scheme produces the final version:
<?php
$rsync = '/path/to/rsync -azq --delete -e "ssh -i /.ssh/ss"';
$remote = 'myaccount@myaccount.strongspace.com:/home/myaccount/backups';
$root = 'path/to/rootBK/';

$dirs = scandir($root);
foreach ($dirs as $dir) {
    if ($dir == '.') continue;
    if ($dir == '..') continue;
    $from = $root.$dir;
    if (is_dir($from)) {
        $dest = $remote.'/'.$dir; // one remote directory per local directory
        $pid = pcntl_fork();
        if (!$pid) {
            // trailing slash on the source: copy the contents of $from into $dest
            system($rsync. ' ' .$from. '/ ' .$dest);
            // echo $dir;
            exit;
        }
    }
    sleep(5);
}
sleep(30);
?>

The commented-out echo $dir line is there for debugging purposes. To begin, one might wish to comment out the system call and un-comment the echo line. Then, when the routine appears to be processing the expected directories properly, reverse the comments so that the full backup is working.
 
If we name the file containing our routine backUp.php, we can run it manually from the command line, making sure to be in the proper working directory:
# php5 backUp.php

Sometimes it is necessary to specify the full path to the php command, and also the php.ini file. In that case the command might look as follows:
/usr/local/bin/php -c /users/home/userName/etc/php5/ /users/home/userName/backUp.php

The specifics, naturally, depend on one's particular server configuration.
 
Once the routine works as intended from the command line, then one should write a cron job to automate the process. That is a subject for another time.

Posted via email from Slothoughts Hypolite
