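I just wrote up a post, "A Website Monitoring Program I Wrote Myself" (《自己編寫的網站監控程序》), which can run fairly complex inspections across several series of websites; setting its second parameter to sitemap.xml makes it check the sitemaps.
However, I noticed that I had earlier written an even simpler sitemap.xml checker, monitor_xmlsitemap.php, so here is its PHP source code as well: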
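<?php
// Fetch one sitemap URL and report its size, fetch time, and type.
function check($host) {
    //$keyword = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
    // A sitemap index and a URL set differ only in their root element.
    $keyword_index = 'sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
    $keyword_urlset = 'urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
    $url = "https://$host";
    print "----------\n";
    print "url = $url\n";
    // Time only the fetch itself.
    $time_start = microtime(true);
    // file_get_contents() returns false on failure; @ hides the PHP warning.
    $content = @file_get_contents($url);
    $time_end = microtime(true);
    $time_long = $time_end - $time_start;
    if ($content === false) {
        print "time = $time_long s\n";
        print "error: fetch failed!\n\n";
        return;
    }
    print 'content length = '.strlen($content)."\n";
    print "time = $time_long s\n";
    // strpos() returns 0 for a match at offset 0, so compare strictly with false.
    if (strpos($content, $keyword_index) !== false) {
        print "index file\n";
    } elseif (strpos($content, $keyword_urlset) !== false) {
        print "urlset file\n";
    } else {
        print "error: not sitemap file!\n";
    }
    print "\n";
}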
print "monitor start\n";
$sites = array(
    'www.jamesqi.com',
    'jamesqi.com',
    'www.youbianku.cn',
    'w.youbianku.cn',
    'wyoubianku.cn',
    'www.baidu.com',
    'www.google.com'
);
//print_r($sites);
$uri = '/sitemap.xml';
foreach ($sites as $site) {
    $host = "$site$uri";
    //print "host = $host\n";
    check($host);
}
print "monitor end\n";
?>
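For each site, the run prints the URL, the content length in bytes, the fetch time, and the nature of the content (index file / URL-set file / error content). It is suited to being run manually on its own.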
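The type check can also be tried without any network access. The following is only a minimal sketch: `classify_sitemap()` is a hypothetical helper that is not part of monitor_xmlsitemap.php, and the sample strings are made up for illustration. It applies the same two keyword tests as the script, using strict comparison against `false`.

```php
<?php
// Hypothetical helper (not in the original script): classify content that
// has already been fetched, using the same keyword test as check().
function classify_sitemap($content) {
    $ns = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
    // A failed fetch yields false rather than a string.
    if (!is_string($content)) {
        return 'error';
    }
    // strpos() may legitimately return 0, so compare strictly with false.
    if (strpos($content, "sitemapindex $ns") !== false) {
        return 'index';
    }
    if (strpos($content, "urlset $ns") !== false) {
        return 'urlset';
    }
    return 'error';
}

// Made-up sample inputs: an index file, a URL set, an HTML error page,
// and a failed fetch.
$index  = '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></sitemapindex>';
$urlset = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>';
$html   = '<html><body>Not Found</body></html>';

foreach (array($index, $urlset, $html, false) as $sample) {
    print classify_sitemap($sample) . "\n";
}
```

Keeping the classification separate from the fetch makes it possible to check the keyword logic against saved files before pointing the script at live sites.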