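I just posted "A Website Monitoring Program I Wrote Myself" (《自己編寫的網站監控程序》), which handles fairly complex inspections across several series of websites. Setting its second parameter to sitemap.xml makes it check the sitemaps.
However, I noticed that I had earlier written a simpler sitemap.xml checker, monitor_xmlsitemap.php, so here is its PHP source code as well: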
<?php
// Fetch one sitemap URL and print its size, fetch time, and file type.
function check($host) {
    //$keyword = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
    $keyword_index = 'sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
    $keyword_urlset = 'urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
    $url = "https://$host";
    $time_start = microtime(true);
    print "----------\n";
    print "url = $url\n";
    $content = file_get_contents($url);
    if ($content === false) {
        // Fetch failed: treat as empty so it is reported as an error below.
        $content = '';
    }
    print 'content length = '.strlen($content)."\n";
    $time_end = microtime(true);
    $time_long = $time_end - $time_start;
    print "time = $time_long s\n";
    // strpos() returns an offset or false, so compare strictly against false.
    if (strpos($content, $keyword_index) !== false) {
        print "index file\n";
    } elseif (strpos($content, $keyword_urlset) !== false) {
        print "urlset file\n";
    } else {
        print "error: not sitemap file!\n";
    }
    print "\n";
}

print "monitor start\n";
$sites = array(
    'www.jamesqi.com',
    'jamesqi.com',
    'www.youbianku.cn',
    'w.youbianku.cn',
    'wyoubianku.cn',
    'www.baidu.com',
    'www.google.com'
);
//print_r($sites);
$uri = '/sitemap.xml';
foreach ($sites as $site) {
    $host = "$site$uri";
    //print "host = $host\n";
    check($host);
}
print "monitor end\n";
?>
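For each site, the output shows the URL, the content length in bytes, the fetch time, and the content type (index file / urlset file / error). The script is suited to being run manually on its own.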
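The script decides the content type by searching for a fixed substring, which depends on the exact spacing and attribute order of the root tag. As a possible alternative, here is a minimal sketch that parses the XML and looks at the root element name instead. The function name classify_sitemap is hypothetical and not part of the original program, and the sketch assumes PHP's SimpleXML extension is available.

```php
<?php
// Hypothetical helper (not in the original script): classify sitemap content
// by its XML root element name instead of by substring search.
// Returns 'index', 'urlset', or 'error'.
function classify_sitemap(string $content): string
{
    // Collect libxml errors internally so malformed input does not emit warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = ($content === '') ? false : simplexml_load_string($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return 'error'; // empty or not well-formed XML
    }

    // getName() returns the root element's name.
    switch ($xml->getName()) {
        case 'sitemapindex':
            return 'index';
        case 'urlset':
            return 'urlset';
        default:
            return 'error';
    }
}
```

Inside check(), the two strpos() tests could then be replaced by a single call such as classify_sitemap($content), with the three return values mapped to the existing "index file", "urlset file", and "error" messages. Unlike the substring test, this sketch does not verify the sitemaps.org namespace, so it is slightly more permissive.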