How to dynamically filter website content using PHP

minkia

I'm looking for a way to dynamically filter website content. By "dynamic" I mean calculating the percentage of bad words, e.g. \[code\]shit\[/code\], \[code\]f**k\[/code\], etc., out of all the words on the first page. Say the website is allowed if that percentage is no more than 30%. How do I check each word on the first page against the bad-words list, then divide the number of matches by the total number of words to get the percentage? The rationale is not to build a content filter, but to block the website if even a single word on the page matches the bad-words list. I have this so far, but it is static:

\[code\]
$filename = "filters.txt";
$fp = @fopen($filename, 'r');
if ($fp) {
    $array = explode("\n", fread($fp, filesize($filename)));
    fclose($fp);
    foreach ($array as $key => $val) {
        // Skip blank lines, e.g. the one left by a trailing newline.
        if (trim($val) === '') {
            continue;
        }
        // Each line has the form "pattern~replacement".
        // explode() replaces split(), which was removed in PHP 7.
        list($before, $after) = explode("~", trim($val));
        $input = preg_replace($before, $after, $input);
    }
}
\[/code\]

*filters.txt contains the list of bad words.

Thanks, Erisco! I tried this, but it doesn't seem to work:
\[code\]
function get_content($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Return the body as a string instead of printing it,
    // so no output buffering is needed.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $string = curl_exec($ch);
    curl_close($ch);
    return $string === false ? '' : $string;
}

/* $toLoad is from Browse.php */
$sourceOfWebpage = get_content($toLoad);
$textOfWebpage = strip_tags($sourceOfWebpage);

/* array: Obtained by your filters.txt file */
// Read the filters file into an array of bad words.
// Assumes each line is "word~replacement" and the part before "~" is a plain word;
// if it is a regex pattern, count matches with preg_match_all() instead.
$filename = "filters.txt";
$badWords = array();
$lines = @file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
if ($lines !== false) {
    foreach ($lines as $val) {
        list($before) = explode("~", $val);
        if (trim($before) !== '') {
            $badWords[] = trim($before);
        }
    }
}

/* float: Some decimal value */
$allowedBadWordsPercent = 0.30;
$numberOfWords = str_word_count($textOfWebpage);
$numberOfBadWords = 0;
// The fourth argument receives the number of replacements performed.
// Note: this counts every occurrence, including matches inside longer words.
str_ireplace($badWords, '', $textOfWebpage, $numberOfBadWords);

// Bad words divided by total words (the original had the ratio inverted).
if ($numberOfWords != 0) {
    $badWordsPercent = $numberOfBadWords / $numberOfWords;
} else {
    $badWordsPercent = 0;
}

if ($badWordsPercent > $allowedBadWordsPercent) {
    echo 'This is a naughty webpage';
}
\[/code\]
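
For the whole-word matching described in the question (match each word against the list, then divide by the total word count), a minimal sketch might look like the following. It is not taken from the thread: the function name bad_word_ratio is hypothetical, the bad-words list is assumed to be an array of plain words rather than regex patterns, and a "word" is assumed to be any run of letters, digits, or apostrophes.

```php
<?php
// Hypothetical helper (not from the thread): returns the fraction (0.0 to 1.0)
// of words in $text that appear in $badWords, compared case-insensitively
// as whole words rather than as substrings.
function bad_word_ratio($text, array $badWords)
{
    // Assumption: a word is a run of letters, digits, or apostrophes.
    $words = preg_split("/[^\p{L}\p{N}']+/u", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        // No words at all (or invalid UTF-8 input): nothing to flag.
        return 0.0;
    }

    // Flip the list into keys so each lookup is a cheap isset() check.
    $lookup = array_flip(array_map('strtolower', $badWords));

    $bad = 0;
    foreach ($words as $word) {
        if (isset($lookup[$word])) {
            $bad++;
        }
    }

    // Cast so the result is always a float, even when the division is exact.
    return (float) $bad / count($words);
}

// Usage sketch: block the page when more than 30% of its words are on the list.
$allowedRatio = 0.30;
$pageText = strip_tags('<p>An example page with some text</p>');
$blocked = bad_word_ratio($pageText, array('example')) > $allowedRatio;
```

Unlike the str_ireplace() count, this only counts exact word matches, so a listed word that merely appears inside a longer word is not counted. To block on a single match instead of a percentage, the same function could be compared against 0 rather than 0.30.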
 