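Recently I have been adding Baidu MIP (Mobile Instant Page, Baidu's mobile page accelerator) versions to our websites. After changing a site, I first validate it with the MIP Validator and check it with Preview. If there are no major problems, you can simply wait for the Baidu spider to crawl the pages, but you can also actively submit the MIP versions in the Baidu Webmaster Platform (百度站長平台) so that the spider learns about them faster and more completely.
After logging in to the Baidu Webmaster Platform, pick an existing site (a site that is not listed yet has to be verified and added first) and choose "移動專區、MIP引入" (Mobile Zone, MIP Import) from the menu. You first have to accept the agreement "《百度MIP資源接入内容責任書》", and then you see two options: "手動提交" (manual submission) and "主動推送(實時)" (active push, real-time). Manual submission accepts 20 MIP page links at a time. Active push is done by a program and allows 10,000 MIP page links per day, with no more than 2,000 links per run.
We used the PHP approach: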
<?php
$urls = array(
    'https://xunren.longren.com/?mip',
    'https://xunren.longren.com/node?mip',
);
$api = 'http://data.zz.baidu.com/urls?site=xunren.longren.com&token=xxxxxx&type=mip';
$ch = curl_init();
$options = array(
    CURLOPT_URL => $api,
    CURLOPT_POST => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POSTFIELDS => implode("\n", $urls),
    CURLOPT_HTTPHEADER => array('Content-Type: text/plain'),
);
curl_setopt_array($ch, $options);
$result = curl_exec($ch);
echo $result;
?>
Running it returns the following:
{"remain":9998,"success":2}
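To build the URL list, copy the contents of sitemap.xml into a text file and do the necessary search-and-replace so that you end up with one MIP URL per line. Then split the list into suitably sized batches and paste each batch into the program above to run it.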
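The copy-and-replace step can also be scripted. The following is only a minimal sketch, not the method used in this post: it assumes a standard sitemap whose page URLs sit in <loc> elements, PHP 7 or later with the DOM extension, and a hypothetical helper name sitemap_to_mip_urls.

```php
<?php
// Hypothetical helper (not part of the original scripts): extract every <loc>
// value from a sitemap document and append the MIP suffix to it.
function sitemap_to_mip_urls(string $xml, string $suffix = '?mip'): array
{
    if ($xml === '') {
        return array();
    }
    $doc = new DOMDocument();
    if (!@$doc->loadXML($xml)) {
        return array(); // not well-formed XML
    }
    $urls = array();
    foreach ($doc->getElementsByTagName('loc') as $loc) {
        $url = trim($loc->textContent);
        if ($url !== '') {
            $urls[] = $url . $suffix;
        }
    }
    return $urls;
}

// Example with an inline sitemap; a real run might read the file
// with file_get_contents() instead.
$sitemap = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    . '<url><loc>https://xunren.longren.com/</loc></url>'
    . '<url><loc>https://xunren.longren.com/node</loc></url>'
    . '</urlset>';
$mip_urls = sitemap_to_mip_urls($sitemap);

// Split into batches of at most 2000 URLs, the per-run limit mentioned above.
$batches = array_chunk($mip_urls, 2000);
print implode("\n", $batches[0]) . "\n";
```

Each element of $batches can then be used as the $urls array of the submission program.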
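Some time ago I added a MIP version to my personal blog. Looking at the Baidu Webmaster Platform today, the indexed and validated counts are normal, but impressions and clicks are very low. I will keep watching.
Addendum, June 23, 2017: Baidu's rate limit on MIP submissions is a nuisance, especially for sites like ours with a huge number of pages. A few hundred thousand pages take dozens of days to submit, which costs a lot of manual work. Yesterday I wrote a program that reads a prepared link list file and submits one part of it. It is scheduled with Linux cron, so after a one-time setup it automatically submits 10,000 URLs a day in 5 runs, and someone only has to check that the log file looks normal. The current submission result looks like this: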
{"remain":4996000,"success":2000,"success_mip":2000,"remain_mip":6000}
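Since the only manual step left is reading the log, the response can be condensed into a single line that is easy to scan. This is a sketch under assumptions: the function name summarize_push_result is hypothetical, only the fields visible in the sample responses in this post are read, and any response without a "success" field is treated as a failure because the error format is not shown here.

```php
<?php
// Hypothetical helper: turn one API response body into a one-line log summary.
// Only the fields that appear in the sample responses in this post are read.
function summarize_push_result(string $result): string
{
    $data = json_decode($result, true);
    if (!is_array($data) || !isset($data['success'])) {
        // Assumption: no "success" field means the submission did not go through.
        return 'FAILED: unexpected response: ' . $result;
    }
    $summary = 'OK: success=' . $data['success'];
    foreach (array('remain', 'success_mip', 'remain_mip') as $field) {
        if (isset($data[$field])) {
            $summary .= ", $field=" . $data[$field];
        }
    }
    return $summary;
}

$ok  = summarize_push_result('{"remain":4996000,"success":2000,"success_mip":2000,"remain_mip":6000}');
$bad = summarize_push_result('not json');
print $ok . "\n";
print $bad . "\n";
```

Searching the log for lines that start with FAILED is then enough for the daily check.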
If you have many sites, you can write one schedule file per site and put it in the /etc/cron.d directory, for example xihanhanxicidian.cron.txt:
10 17 25 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 0001 2000
15 17 25 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 2001 2000
20 17 25 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 4001 2000
25 17 25 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 6001 2000
30 17 25 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 8001 2000
30 8 26 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 10001 2000
35 8 26 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 12001 2000
40 8 26 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 14001 2000
45 8 26 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 16001 2000
50 8 26 8 * root php /root/mip/submit_mip.php xihanhanxi.18dao.cn /root/mip/mip_xihanhanxi.18dao.cn.txt 18001 2000
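Writing these entries by hand gets tedious for long lists, so they could be generated. The sketch below is an assumption-laden illustration, not the author's tooling: mip_cron_lines is a hypothetical name, it requires PHP 7 or later, it hard-codes the /root/mip/submit_mip.php path used above, and unlike the hand-written file it starts every day at the same time of day.

```php
<?php
// Hypothetical generator for /etc/cron.d entries: one entry per batch,
// $perDay batches per day, $gapMinutes apart, starting at $firstRun.
function mip_cron_lines(string $site, string $file, int $total, DateTimeImmutable $firstRun,
                        int $batch = 2000, int $perDay = 5, int $gapMinutes = 5): array
{
    $lines = array();
    $batches = (int) ceil($total / $batch);
    for ($i = 0; $i < $batches; $i++) {
        $day  = intdiv($i, $perDay);   // which day this batch runs on
        $slot = $i % $perDay;          // which run within that day
        $when = $firstRun->modify("+$day day")
                         ->modify('+' . ($slot * $gapMinutes) . ' minutes');
        $start = $i * $batch + 1;      // 1-based start line, as submit_mip.php expects
        $lines[] = sprintf(
            '%d %d %d %d * root php /root/mip/submit_mip.php %s %s %d %d',
            (int) $when->format('i'),  // minute
            (int) $when->format('G'),  // hour
            (int) $when->format('j'),  // day of month
            (int) $when->format('n'),  // month
            $site, $file, $start, $batch
        );
    }
    return $lines;
}

$lines = mip_cron_lines(
    'xihanhanxi.18dao.cn',
    '/root/mip/mip_xihanhanxi.18dao.cn.txt',
    12000,
    new DateTimeImmutable('2017-08-25 17:10')
);
print implode("\n", $lines) . "\n";
```

With 12,000 URLs this prints six entries: five on August 25 from 17:10 to 17:30, and one on August 26 at 17:10 starting at line 10001.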
Contents of submit_mip.php:
<?php
/*
* submit mip links to baidu zhanzhang
* james qi 2017-6-22
* command line: php submit_mip.php $site_domain $file_name $start_number $length_number
* for example: php submit_mip.php jamesqi.com jamesqi.com.txt 10001 2000
*/
ini_set('memory_limit','1024M');
/*
print "
command line: php submit_mip.php $site_domain $file_name $start_number $length_number
for example: php submit_mip.php jamesqi.com jamesqi.com.txt 10001 2000
";
*/
if ( isset( $argv[1] ) ) {
    $site_domain = $argv[1];
} else {
    print "please provide mip links site domain (for example: jamesqi.com) arg\n";
    exit;
}
if ( isset( $argv[2] ) ) {
    $file_name = $argv[2];
} else {
    print "please provide mip links file name (for example: jamesqi.com.txt) arg\n";
    exit;
}
if ( isset( $argv[3] ) ) {
    $start_number = $argv[3];
} else {
    print "please provide mip links start number (for example: 10001) arg\n";
    exit;
}
if ( isset( $argv[4] ) ) {
    $length_number = $argv[4];
} else {
    print "please provide mip links length number (for example: 2000) arg\n";
    exit;
}
$log_file_name = "$file_name.log";
$log_file = fopen("$log_file_name", "a");
$date_time = "--------\n";
$date_time .= date("Y-m-d").' ';
$date_time .= date("h:i:sa");
$date_time .= "\n";
$args = "site_domain=$site_domain, file_name=$file_name, start_number=$start_number, length_number=$length_number\n";
$command = '/alidata/server/php/bin/php '.$argv[0].' '.$argv[1].' '.$argv[2].' '.$argv[3].' '.$argv[4]."\n";
print $date_time;
print $args;
print $command;
fwrite($log_file, $date_time);
fwrite($log_file, $args);
fwrite($log_file, $command);
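// Read the whole link list and split it into lines:
// try Windows line endings first, then fall back to Unix line endings.
$content = file_get_contents($file_name);
$array = explode("\r\n", $content);
$count = count($array);
if ($count <= 1) {
    $array = explode("\n", $content);
    $count = count($array);
    if ($count <= 1) {
        print "exit, count = $count\n";
        exit;
    }
}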
print "count = $count\n";
fwrite($log_file, "count = $count\n");
/*
$urls = array(
'http://www.example.com/1.html',
'http://www.example.com/2.html',
);
*/
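// Take $length_number lines, starting at the 1-based line $start_number.
$array_length = array_slice($array,$start_number-1,$length_number);
//print_r($array_length);
$urls = array();
foreach ($array_length as $value) {
    $value = trim($value);
    if ($value === '') {
        continue; // skip blank lines, e.g. the one after a trailing newline
    }
    $urls[] = $value."?mip";
}
// Drop anything in front of "https" in the first URL
// (for example stray bytes at the very start of the file).
if (isset($urls[0]) && strpos($urls[0],'https') !== false) {
    $urls[0] = substr($urls[0],strpos($urls[0],'https'));
}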
//print_r($urls);
$api = 'http://data.zz.baidu.com/urls?site=https://'.$site_domain.'&token=your_token&type=mip';
print "api=$api\n";
fwrite($log_file, "api=$api\n");
$ch = curl_init();
$options = array(
CURLOPT_URL => $api,
CURLOPT_POST => true,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_POSTFIELDS => implode("\n", $urls),
CURLOPT_HTTPHEADER => array('Content-Type: text/plain'),
);
curl_setopt_array($ch, $options);
$result = curl_exec($ch);
print "result = $result\n";
fwrite($log_file, "result = $result\n");
fclose($log_file);
?>
A sample from the data file mip_xihanhanxi.18dao.cn.txt:
https://xinhuazidian.18dao.cn/zidian/%E8%82%A6
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E8%82%A6
https://xinhuazidian.18dao.cn/zidian/%E8%82%A7
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E8%82%A7
https://xinhuazidian.18dao.cn/zidian/%E8%82%A8
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E8%82%A8
https://xinhuazidian.18dao.cn/zidian/%E8%82%A9
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E8%82%A9
https://xinhuazidian.18dao.cn/zidian/%E4%B8%8B
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E4%B8%8B
https://xinhuazidian.18dao.cn/zidian/%E4%BA%80
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E4%BA%80
https://xinhuazidian.18dao.cn/zidian/%E5%8C%A2
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E5%8C%A2
https://xinhuazidian.18dao.cn/zidian/%E8%82%AA
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E8%82%AA
https://xinhuazidian.18dao.cn/zidian/%E8%82%AB
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E8%82%AB
https://xinhuazidian.18dao.cn/zidian/%E8%82%AC
https://xinhuazidian.18dao.cn/zh-hant/zidian/%E8%82%AC
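Each run writes to a .log file, which records the response returned by the submission so you can see whether it succeeded.
Addendum, August 28, 2017: I submitted URLs with the method above for more than two months and finished the MIP URLs of quite a few sites, but strangely there was hardly any traffic from MIP. Last week a colleague discovered that when I wrote the submission program I had typed the appended suffix ?mip as ?amp, so of course the pages could not pass validation. We also did not check the submission feedback regularly, and Baidu is not like Google, which discovers AMP pages automatically and sends an email alert when it finds a problem with one. As a result, every link submitted over those two-plus months was wrong. So much wasted time! All we could do was fix the program and submit the correct MIP URLs again.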
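A cheap guard against this kind of mistake is to check the URL list before it is sent. The following is a hedged sketch rather than part of the original program: find_bad_mip_urls is a hypothetical name, and the rule it applies (same host as the submitted site, URL ends with the expected suffix) is an assumption about what counts as a valid entry.

```php
<?php
// Hypothetical pre-flight check: return the URLs that should NOT be submitted,
// because they are on the wrong host or lack the expected suffix.
function find_bad_mip_urls(array $urls, string $site_domain, string $suffix = '?mip'): array
{
    $bad = array();
    foreach ($urls as $url) {
        $host = parse_url($url, PHP_URL_HOST);
        // An empty suffix means "no suffix required" (e.g. a separate MIP subdomain).
        $has_suffix = ($suffix === '') || (substr($url, -strlen($suffix)) === $suffix);
        if ($host !== $site_domain || !$has_suffix) {
            $bad[] = $url;
        }
    }
    return $bad;
}

$bad = find_bad_mip_urls(
    array(
        'https://xunren.longren.com/node?mip',
        'https://xunren.longren.com/node?amp',   // the typo described above
        'https://other.example.com/?mip',        // wrong host
    ),
    'xunren.longren.com'
);
print count($bad) . " bad URL(s):\n" . implode("\n", $bad) . "\n";
```

If the returned list is not empty, the script could log the offending URLs and stop instead of calling the API.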
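For our Drupal sites, the MIP version is always the original URL with a ?mip suffix. For our MediaWiki sites we instead set up a separate second-level or third-level subdomain. For example, the MIP version of the web page "https://www.jamesqi.com/首頁" is "https://mip.jamesqi.com/首頁". For batch submission, change submit_mip.php so that it no longer appends the suffix to the submitted URLs and save the result as submit_subdomain.php. The URLs to submit can be taken from https://www.jamesqi.com/sitemap.xml. The commonly used namespaces (Magic words) are: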
ns:-2 - ns:Media
ns:-1 - ns:Special
ns:0 - main
ns:1 - ns:Talk
ns:2 - ns:User
ns:3 - ns:User talk
ns:4 - ns:Project
ns:5 - ns:Project talk
ns:6 - ns:File
ns:7 - ns:File talk
ns:8 - ns:MediaWiki
ns:9 - ns:MediaWiki talk
ns:10 - ns:Template
ns:11 - ns:Template talk
ns:12 - ns:Help
ns:13 - ns:Help talk
ns:14 - ns:Category
ns:15 - ns:Category talk
The sitemap links we need to copy:
https://www.jamesqi.com/sitemap-jamesqi_www-jingle-NS_0-0.xml
https://www.jamesqi.com/sitemap-jamesqi_www-jingle-NS_4-0.xml
https://www.jamesqi.com/sitemap-jamesqi_www-jingle-NS_6-0.xml
https://www.jamesqi.com/sitemap-jamesqi_www-jingle-NS_12-0.xml
https://www.jamesqi.com/sitemap-jamesqi_www-jingle-NS_14-0.xml
Clean up and merge their contents, replace the subdomain with mip.jamesqi.com, and save the result as the text file mip.jamesqi.com.txt. A sample of its contents:
https://mip.jamesqi.com/027.cn%E7%9A%84%E5%AD%90%E5%9F%9F%E5%90%8D%E8%A2%AB%E7%99%BE%E5%BA%A6%E8%A7%A3%E5%B0%81%E4%BA%86
https://mip.jamesqi.com/027%E5%8D%9A%E5%AE%A2%E5%B0%86%E5%8D%87%E7%BA%A7%E4%B8%BA%E4%B8%AA%E4%BA%BA%E9%97%A8%E6%88%B7%EF%BC%8C%E5%8A%9F%E8%83%BD%E5%85%88%E7%9D%B9%E4%B8%BA%E5%BF%AB%EF%BC%81
https://mip.jamesqi.com/%E5%88%86%E7%B1%BB:%E9%BB%84%E9%A1%B5
https://mip.jamesqi.com/%E5%88%86%E7%B1%BB:%E9%BC%A0%E6%A0%87
https://mip.jamesqi.com/%E5%88%86%E7%B1%BB:%E9%BD%90%E8%BE%BE%E5%86%85
https://mip.jamesqi.com/%E5%88%86%E7%B1%BB:%E9%BE%99%E4%BA%BA
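The subdomain replacement itself is a one-line search-and-replace in an editor, but it can be scripted as well. This is a small sketch under assumptions: swap_host is a hypothetical helper, it expects https URLs, and it silently drops any URL that is not on the source host.

```php
<?php
// Hypothetical helper: rewrite URLs from the web host to the MIP host.
// URLs that do not start with https://$from_host/ are dropped.
function swap_host(array $urls, string $from_host, string $to_host): array
{
    $prefix = 'https://' . $from_host . '/';
    $out = array();
    foreach ($urls as $url) {
        $url = trim($url);
        if (strncmp($url, $prefix, strlen($prefix)) === 0) {
            $out[] = 'https://' . $to_host . '/' . substr($url, strlen($prefix));
        }
    }
    return $out;
}

$mip = swap_host(
    array(
        'https://www.jamesqi.com/%E5%88%86%E7%B1%BB:%E9%BE%99%E4%BA%BA',
        'https://elsewhere.example.com/x',
    ),
    'www.jamesqi.com',
    'mip.jamesqi.com'
);
print implode("\n", $mip) . "\n";
```

The output lines can be written directly to mip.jamesqi.com.txt with file_put_contents().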
Then run:
php /root/mip/submit_subdomain.php mip.jamesqi.com /root/mip/mip.jamesqi.com.txt 0001 2000
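When there are more than 10,000 URLs to submit, the submission has to be spread over several days. You can schedule it with cron as described earlier: write a schedule file and put it in the /etc/cron.d directory, for example mip.jamesqi.com.cron.txt: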
10 17 25 8 * root php /root/mip/submit_subdomain.php mip.jamesqi.com /root/mip/mip.jamesqi.com.txt 0001 2000
15 17 25 8 * root php /root/mip/submit_subdomain.php mip.jamesqi.com /root/mip/mip.jamesqi.com.txt 2001 2000