If a file is frequently rewritten by many different users, two nearly simultaneous accesses to that file can interfere with each other. For example, imagine a chat history that keeps only the last 25 chat lines. Adding a line also means deleting the very first one, so the whole file has to be rewritten. While that rewrite is in progress, another user might also add a line: they read the file, which at that moment is incomplete because it is still being rewritten, then rewrite that incomplete file and append their own line to it. The result is data loss.
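To make the problem concrete, here is a rough sketch of the naive read-modify-write pattern described above (the filename, line limit and variable names are just examples, not part of the functions below):

<?php
// Naive read-modify-write of a 25-line chat history.
// If two requests run this at nearly the same moment, both read the same
// 25 lines and the slower writer silently throws away the faster writer's line.
$newline = "user: hello";                                   // the line to be added
$lines   = file("chat.txt", FILE_IGNORE_NEW_LINES);         // read the current history
$lines[] = $newline;                                        // append the new line
$lines   = array_slice($lines, -25);                        // keep only the last 25 lines
file_put_contents("chat.txt", implode("\n", $lines)."\n");  // rewrite the whole file
?>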
If flock() worked reliably, it would be the obvious way to prevent such interference, but in my experience flock() rarely behaves as expected (at least on any Linux web server I have tried), and writing your own file-locking functions brings a lot of potential issues that can ultimately result in corrupted files. Even though that is very unlikely, it is not impossible, and it has already happened to me.
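For comparison, the conventional approach with advisory locking would look roughly like this (shown only as a sketch; as said above, I have not found it dependable everywhere):

<?php
// Conventional advisory locking with flock() - for comparison only.
$fp = fopen("chat.txt", "c+");             // open (and create if missing) for read/write
if ($fp !== false && flock($fp, LOCK_EX)) {
    // ... read, modify and rewrite the file here ...
    fflush($fp);                           // flush output before releasing the lock
    flock($fp, LOCK_UN);                   // release the lock
}
if ($fp !== false) fclose($fp);
?>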
So I came up with another solution to the file-interference problem:
1. A file that is to be accessed is first copied into a temp-file directory, and its current filemtime() is stored in a PHP variable. The temp file gets a random filename, ensuring no other process can interfere with this particular temp file.
2. After the temp file has been changed/rewritten/whatever, there is a check whether the filemtime() of the original file has changed since we copied it into the temp directory.
2.1. If the filemtime() is still the same, the temp file is simply renamed/moved to the original filename, ensuring the original file is never in a half-written state: it only ever holds the complete previous contents or the complete new contents.
2.2. But if the filemtime() has changed while our PHP process was working on its copy, the temp file is simply deleted and our file-close function returns FALSE, so the caller can retry the whole operation (e.g. up to 5 times, until it returns TRUE); a usage sketch follows the functions below.
These are the functions I've written for that purpose:
<?php
$dir_fileopen = "../AN/INTERNAL/DIRECTORY/fileopen";

// Builds a reasonably unique ID from the current time plus a slice of an md5 hash.
function randomid() {
    return time().substr(md5(microtime()), 0, rand(5, 12));
}

function cfopen($filename, $mode, $overwriteanyway = false) {
    global $dir_fileopen;
    clearstatcache();
    // Find a temp filename that is not already in use.
    do {
        $id = md5(randomid());
        $tempfilename = $dir_fileopen."/".$id.md5($filename);
    } while (file_exists($tempfilename));
    if (file_exists($filename)) {
        $newfile = false;
        copy($filename, $tempfilename);
    } else {
        $newfile = true;
    }
    $fp = fopen($tempfilename, $mode);
    // The returned "handle" is an array carrying everything cfclose() needs:
    // file pointer, original filename, temp-file ID, original mtime, new-file flag, overwrite flag.
    return $fp ? array($fp, $filename, $id, @filemtime($filename), $newfile, $overwriteanyway) : false;
}

function cfwrite($fp, $string) {
    return fwrite($fp[0], $string);
}

function cfclose($fp, $debug = "off") {
    global $dir_fileopen;
    $success = fclose($fp[0]);
    clearstatcache();
    $tempfilename = $dir_fileopen."/".$fp[2].md5($fp[1]);
    // Move the temp file into place only if the original is unchanged,
    // the file is new, or the caller asked to overwrite anyway.
    if ((@filemtime($fp[1]) == $fp[3]) or ($fp[4] == true and !file_exists($fp[1])) or $fp[5] == true) {
        rename($tempfilename, $fp[1]);
    } else {
        unlink($tempfilename);
        if ($debug != "off") echo "While writing, another process accessed $fp[1]. To ensure file integrity, your changes were rejected.";
        $success = false;
    }
    return $success;
}
?>
$overwriteanyway, the third parameter of cfopen(), means: if cfclose() is called and the original file has changed in the meantime, the script won't care and will overwrite the original file with the new temp file anyway. Either way there won't be any write interference between two PHP processes, assuming no two (or more) processes can act at exactly the same instant.
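A typical call then looks something like this (the filename, content and retry count are just examples, following the "up to 5 times" idea from step 2.2):

<?php
// Keep trying the write until cfclose() confirms nobody interfered,
// giving up after 5 attempts.
$newcontent = "the complete new file contents\n";     // whatever should end up in the file
for ($try = 0; $try < 5; $try++) {
    $fp = cfopen("somefile.txt", "w");
    if ($fp === false) break;                         // temp file could not be opened
    cfwrite($fp, $newcontent);
    if (cfclose($fp) === true) break;                 // rename succeeded, we are done
    // FALSE: another process modified the original meanwhile - try again
}
?>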