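I recently needed to write a program that reads a text field from page nodes on a Drupal site, processes and matches its content, and then classifies each page under a term of a taxonomy vocabulary. When I first started using Drupal 6 I wrote a similar classification program (see the post 《Drupal中讓Node歸類的PHP程序》, "A PHP program for classifying nodes in Drupal"). Later, with Drupal 7, most classification happened automatically when the site was created and the data imported, using a Term reference field with the Autocomplete term widget (tagging). There were also cases, though, where the data was imported into a text field and a PHP program was run afterwards to do the classification. The Drupal 7 program differs somewhat from the Drupal 6 one. I did not blog about it at the time, and digging up the old programs later was a chore, so here is a belated record. The sample program is as follows: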
<?php
// Classifies "poi" nodes into the "category" vocabulary, based on the text field field_classification.
// Command-line arguments: offset, limit.
$province = "anhui"; // Hard-coded here; it could also be passed in as a PHP command-line argument
$offset = (int) $argv[1];
$limit = (int) $argv[2];

$_SERVER['HTTP_HOST'] = "ditu.mingluji.com.$province"; // This is how HTTP_HOST is written for a subdirectory-style site
$_SERVER['SCRIPT_NAME'] = "/ditu_category.php";
$_SERVER['REMOTE_ADDR'] = '127.0.0.1';
$drupal_path = '/usr/local/apache2/htdocs/ditu.mingluji.com/';
chdir($drupal_path);
define('DRUPAL_ROOT', $drupal_path);
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$node_type = "poi";
//$sql = "SELECT node.nid FROM {node} WHERE node.type = '$node_type' LIMIT $limit OFFSET $offset";
// The LEFT JOIN excludes nodes that already have a category, so a re-run does not process them again.
// The {node} table gets an explicit alias so the query also works with a table prefix.
$sql = "SELECT n.nid FROM {node} n LEFT JOIN {field_data_field_category} fc ON n.nid = fc.entity_id AND fc.entity_type = 'node' WHERE n.type = :type AND fc.field_category_tid IS NULL";
//echo $_SERVER['HTTP_HOST'];
//echo "\n";
//echo $sql;
//echo "\n";
$result = db_query_range($sql, $offset, $limit, array(':type' => $node_type));
//print_r ($result);
//echo "\n";

$count = 0;
$count_added = 0;   // Not updated in this script
$count_not_add = 0; // Not updated in this script
$count_adding = 0;  // Not updated in this script
$vid = taxonomy_vocabulary_machine_name_load('category')->vid; // Get the vid of the vocabulary

while ($anode = $result->fetch()) {
    $count++;
    /*
    print_r ($anode);
    echo "\n";
    */
    $nid = $anode->nid;
    $entity = node_load($nid);
    //print_r ($node);

    // Read the text field into a variable (empty string if the field has no value)
    $classification = isset($entity->field_classification['und'][0]['value']) ? $entity->field_classification['und'][0]['value'] : '';
    print "nid=$nid,classification=$classification\n";

    // Transform the content of the text field
    $category = $classification;
    $category = preg_replace("/\[\d{1,3},/", '', $category);
    $category = str_replace('[', '', $category);
    $category = str_replace(']', '', $category);
    $category = str_replace('<\/font>', '', $category);
    $category = str_replace('委,辦,局', '委/辦/局', $category);
    $array = explode(',', $category);

    $i = 0;
    foreach ($array as $name) { // Handle the category words one by one
        $name = trim($name);
        if ($name === '') {
            continue; // Skip empty fragments instead of creating a nameless term
        }
        // Restrict the lookup to the "category" vocabulary so that terms
        // with the same name in other vocabularies are not picked up
        $term = taxonomy_get_term_by_name($name, 'category');
        //print_r ($term);
        $tid = key($term);
        if (empty($term)) { // If the term does not exist, create it first
            $term = new stdClass();
            $term->name = $name;
            $term->vid = $vid;
            taxonomy_term_save($term);
            $tid = $term->tid;
        }
        print "i=$i,name=$name,tid=$tid\n";
        $entity->field_category['und'][$i]['tid'] = $tid; // Set the tid the node is classified under
        $i++;
    }
    //print $category;
    //$entity->field_category['und'][0]['value'] = $category;
    //$entity->field_classification['und'][0]['value'] = "classification:$classification category:$category";
    node_save($entity); // Save
}

echo "\n------------------------\n";
print "Done!\n";
print "count=$count\n";
print "count_added=$count_added\n";
print "count_adding=$count_adding\n";
print "count_not_add=$count_not_add\n";
?>
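The ditu_category.php script above was written in October 2014 to classify a series of map sites. The earlier programs, written in June 2013, classify a series of per-country business directory sites by region and by industry. The region classification program is as follows: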
<?php
// Start of the header comment
/*
General-purpose region classification program

How to run:
1. SSH into the server hosting the site: 69.64.43.200
2. Change to the directory containing this program: cd /root/drupal7.bizdirlib.com-php
3. Upload the file that maps categories to regions (explained below), e.g. dza.txt
4. Run this program with 3 arguments (explained below), e.g. php refresh_category_area.php dza 0 10000
5. Watch the on-screen output during and after the run to follow progress and statistics (explained below)
6. In Views you can add a "test area (test industry)" view to inspect unclassified nodes, spot new patterns, edit dza.txt, upload it and run again

Arguments:
Argument 1: country code, i.e. the leading part of the site domain; for dza.bizdirlib.com the country code is 'dza'
Argument 2: start offset, i.e. the index at which processing starts; normally 0, meaning from the beginning. Note that this is not a node id but an offset into the pages to be processed
Argument 3: limit, i.e. the number of records to process. Use 1 while debugging to process a single record, or 10 for ten; a real run can use 10000 or more, but generally not more than 50,000, otherwise PHP may run out of memory and abort with an error. If more than 50,000 records need processing, run the program several times with 50,000 each

Mapping file:
The region mapping file is named dza.txt, where dza is the leading part of dza.bizdirlib.com. Its content looks like this:
"Algiers","Draria"
"Algiers","Bordj El Kiffan"
"Algiers","Ouled Fayet"
"Algiers","Beni Messous"
"Algiers","Sidi M'hamed"
"Algiers","El Biar"
"Algiers","EL Marsa"
"Algiers","Ain Benian"
"Algiers","Hamma Anassers"
"Algiers","Zeralda"
"Algiers","Bab El Oued"
"Algiers","Baraki"
"Sétif","Setif"
"Tizi Ouzou","Freha"
"Tizi Ouzou","Tigzirt"
"Tipaza","Cherchell"
"Mascara","Tighenif"
"Tlemcen","Chetouane"
"Bejaia","Akbou"
"Bouira","Lakhdaria"
"Ouargla","Touggourt"
"Djasr Kassentina","Djasr Kassentina"
"Hussein Dey","Hussein Dey"
"M'sila","M'sila"
"Beijing","Beijing"
"Hubei","Hubei"
"Hubei","Wuhan"
"Hubei","Shiyan"
Each line has two parts: the "category" before the comma and the "keyword" to match after it. Whenever the keyword is found in the address field, the node is assigned to the category.
Note: save dza.txt as UTF-8 with Unix line endings, otherwise all non-English characters turn into garbage

Statistics:
count          // total number of nodes handled in this run
count_added    // nodes that had already been classified before
count_not_add  // nodes that were not classified in this run
count_adding   // nodes that were classified in this run
*/
// End of the header comment

// Definition section: the three command-line arguments and the country code to country name map
$country_code = $argv[1]; // Country code
$offset = (int) $argv[2]; // Start offset
$limit = (int) $argv[3];  // Limit

// Lookup table replacing the original long switch statement; same codes, same names
$country_names = array(
    "ae" => "United Arab Emirates",
    "are" => "United Arab Emirates",
    "afg" => "Afghanistan",
    "arm" => "Armenia",
    "cn" => "China",
    "chn" => "China",
    "hkg" => "Hong Kong",
    "ind" => "India",
    "idn" => "Indonesia",
    "mys" => "Malaysia",
    "aze" => "Azerbaijan",
    "bhr" => "Bahrain",
    "bgd" => "Bangladesh",
    "btn" => "Bhutan",
    "brn" => "Brunei",
    "khm" => "Cambodia",
    "irn" => "Iran",
    "irq" => "Iraq",
    "jpn" => "Japan",
    "jor" => "Jordan",
    "kaz" => "Kazakhstan",
    "kwt" => "Kuwait",
    "kgz" => "Kyrgyzstan",
    "lao" => "Laos",
    "lbn" => "Lebanon",
    "mac" => "Macau",
    "mdv" => "Maldives",
    "mmr" => "Myanmar",
    "npl" => "Nepal",
    "omn" => "Oman",
    "pak" => "Pakistan",
    "pse" => "Palestine",
    "phl" => "Philippines",
    "qat" => "Qatar",
    "sau" => "Saudi Arabia",
    "sg" => "Singapore",
    "sgp" => "Singapore",
    "lka" => "Sri lanka",
    "syr" => "Syria",
    "tw" => "Taiwan",
    "twn" => "Taiwan",
    "tjk" => "Tajikistan",
    "tha" => "Thailand",
    "uzb" => "Uzbekistan",
    "vnm" => "Vietnam",
    "yem" => "Yemen",
    "kor" => "South Korea",
    "au" => "Australia",
    "aus" => "Australia",
    "nz" => "New Zealand",
    "nzl" => "New Zealand",
    "fji" => "Fiji",
    "png" => "Papua New Guinea",
    "wsm" => "Samoa",
    "alaska" => "Alaska",
    "abw" => "Aruba",
    "canada" => "Canada",
    "can" => "Canada",
    "bhs" => "Bahamas",
    "brb" => "Barbados",
    "bmu" => "Bermuda",
    "cym" => "Cayman Islands",
    "cub" => "Cuba",
    "dom" => "Dominican Republic",
    "grd" => "Grenada",
    "gtm" => "Guatemala",
    "hti" => "Haiti",
    "jam" => "Jamaica",
    "pan" => "Panama",
    "mex" => "Mexico",
    "tto" => "Trinidad and Tobago",
    "vir" => "Virgin Islands US",
    "unitedstates" => "United States",
    "dza" => "Algeria",
    "ago" => "Angola",
    "ben" => "Benin",
    "bfa" => "Burkina Faso",
    "bdi" => "Burundi",
    "cmr" => "Cameroon",
    "tcd" => "Chad",
    "cog" => "Congo",
    "dji" => "Djibouti",
    "egy" => "Egypt",
    "gha" => "Ghana",
    "ken" => "Kenya",
    "mdg" => "Madagascar",
    "mli" => "Mali",
    "mar" => "Morocco",
    "nga" => "Nigeria",
    "sdn" => "Sudan",
    "zaf" => "South Africa",
    "tza" => "Tanzania",
    "eth" => "Ethiopia",
    "lby" => "Libya",
    "and" => "Andorra",
    "at" => "Austria",
    "aut" => "Austria",
    "be" => "Belgium",
    "bel" => "Belgium",
    "deu" => "Germany",
    "it" => "Italy",
    "ita" => "Italy",
    "nld" => "Netherlands",
    "blr" => "Belarus",
    "bgr" => "Bulgaria",
    "hrv" => "Croatia",
    "cyp" => "Cyprus",
    "cze" => "Czech",
    "dnk" => "Denmark",
    "est" => "Estonia",
    "fin" => "Finland",
    "fr" => "France",
    "fra" => "France",
    "geo" => "Georgia",
    "grc" => "Greece",
    "hun" => "Hungary",
    "isl" => "Iceland",
    "irl" => "Ireland",
    "lva" => "Latvia",
    "lie" => "Liechtenstein",
    "ltu" => "Lithuania",
    "lux" => "Luxembourg",
    "mlt" => "Malta",
    "mda" => "Moldova",
    "mco" => "Monaco",
    "nor" => "Norway",
    "pol" => "Poland",
    "prt" => "Portugal",
    "rus" => "Russia",
    "srb" => "Serbia",
    "svk" => "Slovakia",
    "svn" => "Slovenia",
    "swe" => "Sweden",
    "ch" => "Switzerland",
    "che" => "Switzerland",
    "tur" => "Turkey",
    "ukr" => "Ukraine",
    "unitedkingdom" => "United Kingdom",
    "gb" => "United Kingdom",
    "gbr" => "United Kingdom",
    "es" => "Spain",
    "esp" => "Spain",
    "rou" => "Romania",
    "mkd" => "Macedonia",
    "aia" => "Anguilla",
    "arg" => "Argentina",
    "bol" => "Bolivia",
    "bra" => "Brazil",
    "chl" => "Chile",
    "col" => "Colombia",
    "cuw" => "Curacao, Netherlands Antilles",
    "ecu" => "Ecuador",
    "slv" => "El Salvador",
    "glp" => "Guadeloupe French",
    "guy" => "Guyana",
    "hnd" => "Honduras",
    "nic" => "Nicaragua",
    "per" => "Peru",
    "pri" => "Puerto Rico",
    "sxm" => "Sint Maarten (Dutch)",
    "sur" => "Suriname",
    "ven" => "Venezuela",
    "mtq" => "Martinique French",
    "cri" => "Costa Rica",
);
$country_name = isset($country_names[$country_code]) ? $country_names[$country_code] : "Country Name";
// End of the definition section; nothing below needs to be changed

//print_r($area_array);
//print "offset=$offset\n";
//print "limit=$limit\n";
//print "country_name=$country_name\n";
//print "country_code=$country_code\n";

$file = "$country_code.txt";
$fp = fopen($file, "r"); // Open the file read-only
if ($fp === false) {
    exit("Cannot open $file\n");
}
$count_line = 0;
$file_array = array();
while (!(feof($fp))) {
    $text = fgets($fp); // Read one line of the file
    $text = str_replace("\n", '', $text);
    $text = str_replace("\r", '', $text);
    // print "text=$text\n";
    if ($text != '') {
        $len = strpos($text, '","');
        if ($len !== false) { // Ignore lines that are not of the form "category","keyword"
            $term = substr($text, 1, $len - 1);
            $area = substr($text, $len + 3, -1);
            $file_array[$count_line]['term'] = $term;
            $file_array[$count_line]['area'] = $area;
            // print "file_array[$count_line]['term']=".$file_array[$count_line]['term']."\n";
            // print "file_array[$count_line]['area']=".$file_array[$count_line]['area']."\n";
        }
    }
    // print_r ($file_array);
    //$vocalbulary->vid='area';
    /*
    $term_area = taxonomy_get_term_by_name ($term);
    if ($term_area==NULL) { // If the category does not exist yet, create it
        $term_object->vid='area';
        $term_object->name=$term;
        taxonomy_term_save($term_object);
        $term_area = taxonomy_get_term_by_name ($term);
        print "term '$term' saved\n";
    } else {
        print "term '$term' exist\n";
    }
    */
    $count_line++;
}
fclose($fp);
// print_r ($file_array);

$_SERVER['HTTP_HOST'] = "$country_code.bizdirlib.com";
$_SERVER['SCRIPT_NAME'] = "/refresh_category_area.php";
$_SERVER['REMOTE_ADDR'] = '127.0.0.1';
$drupal_path = '/var/www/html/drupal7.bizdirlib.com/';
chdir($drupal_path);
define('DRUPAL_ROOT', $drupal_path);
#require_once './includes/bootstrap.inc';
require_once DRUPAL_ROOT.'includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// Make sure every category named in the mapping file exists as a term
foreach ($file_array as $file) {
    //print "key=$key,term=$term\n";
    $term = $file['term'];
    $area = $file['area'];
    print "term='$term',area='$area'\n";
    $term_area = taxonomy_get_term_by_name($term);
    // print_r ($term_area);
    if (empty($term_area)) { // If the category does not exist yet, create it
        $term_object = new stdClass();
        $term_object->vid = 2; // Hard-coded vocabulary id used on this site
        $term_object->name = $term;
        // print_r ($term_object);
        taxonomy_term_save($term_object);
        $term_area = taxonomy_get_term_by_name($term);
        print "term '$term' saved\n";
    } else {
        print "term '$term' exist\n";
    }
}

$node_type = "country";
//$limit = 10000;
//$offset = 0;
//$sql = "SELECT node.nid FROM {node} WHERE node.type = '$node_type' LIMIT $limit OFFSET $offset";
// Exclude nodes that have already been classified.
// The {node} table gets an explicit alias so the query also works with a table prefix.
$sql = "SELECT n.nid FROM {node} n LEFT JOIN {field_data_field_area} fa ON n.nid = fa.entity_id AND fa.entity_type = 'node' WHERE n.type = :type AND fa.field_area_tid IS NULL";
//print "sql=$sql\n";
$result = db_query_range($sql, $offset, $limit, array(':type' => $node_type));

$count = 0;         // Total number of nodes handled in this run
$count_added = 0;   // Nodes that had already been classified before
$count_not_add = 0; // Nodes that were not classified in this run
$count_adding = 0;  // Nodes that were classified in this run

while ($anode = $result->fetch()) {
    $count++;
    /*
    print_r ($anode);
    echo '<br>\n';
    */
    $node = node_load($anode->nid);
    print "\n node $anode->nid \n";
    /*
    print_r ($node);
    echo '<br>';
    */
    // Read the CCK fields into variables
    // Basic information; the field mapping is accurate
    // ('und' is quoted: a bare und constant is an error in PHP 8)
    $address = isset($node->field_address['und'][0]['value']) ? $node->field_address['und'][0]['value'] : '';
    //$phone=$node->field_phone['und'][0]['value'];
    //$area=$node->field_area['und'][0]['tid']; // Always empty here: the query only returns nodes without an area
    //$industry=$node->field_industry['und'][0]['tid'];
    $address = str_ireplace($country_name, "", $address); // Strip the country name, case-insensitively
    //$address=str_ireplace("other text to strip","",$address); // Strip other text as well if necessary
    //print "address=$address\n";
    //$tids=array();
    //$terms=array("Bouzareah","Ben Aknoun");
    $i = 0;
    $adding = false;
    foreach ($file_array as $file) {
        //print "key=$key,term=$term\n";
        $term = $file['term'];
        $area = $file['area'];
        // print "term='$term',area='$area'\n";
        $term_area = taxonomy_get_term_by_name($term);
        // print_r ($term_area);
        /*
        if ($term_area==NULL) { // If the category does not exist yet, create it
            $term_object->vid='area';
            $term_object->name=$term;
            taxonomy_term_save($term_object);
            $term_area = taxonomy_get_term_by_name ($term);
            print "term '$term' saved\n";
        } else {
            print "term '$term' exist\n";
        }
        */
        $tid_area = key($term_area);
        // print "tid_area=$tid_area\n";
        //print "stristr $address $area ".stristr($address,$area)."\n";
        if (stristr($address, $area) !== false) { // The keyword was found in the address
            $node->field_area['und'][$i]['tid'] = $tid_area;
            $i++;
            node_save($node);
            $count_adding++;
            echo "country_code=$country_code,count=$count,count_adding=$count_adding \n";
            $adding = true;
            break; // Only the first matching keyword is used
        }
    } //end foreach
    //print "count_adding=$count_adding\n";
    if (!$adding) {
        $count_not_add++;
        print "country_code=$country_code,count=$count,count_not_add=$count_not_add,address=$address\n";
    }
} //end while

// The run is finished; print the statistics
print "\n------------------------\n";
print "country_code=$country_code\n";
print "Done!\n";
print "count=$count\n";
print "count_adding=$count_adding\n";
//print "count_added=$count_added\n";
print "count_not_add=$count_not_add\n";
?>
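The mapping file format described in the header comment (one "category","keyword" pair per line) and the case-insensitive keyword search in the address can be tried out without a Drupal installation. The following is only a minimal sketch: the function names `parse_area_map` and `match_area` are made up for illustration, and it uses PHP's built-in `str_getcsv()` in place of the manual `strpos()`/`substr()` parsing in the script above.

```php
<?php
// Hypothetical helper (not part of the original script): turns lines such as
// "Algiers","Draria" into an array of term/keyword pairs.
function parse_area_map(array $lines) {
    $map = array();
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // Skip blank lines
        }
        // Delimiter, enclosure and escape are passed explicitly
        $fields = str_getcsv($line, ',', '"', '\\');
        if (count($fields) >= 2) {
            $map[] = array('term' => $fields[0], 'area' => $fields[1]);
        }
    }
    return $map;
}

// Hypothetical helper: returns the category of the first keyword found in
// the address (case-insensitive), or null when nothing matches.
function match_area($address, array $map) {
    foreach ($map as $entry) {
        if ($entry['area'] !== '' && stripos($address, $entry['area']) !== false) {
            return $entry['term'];
        }
    }
    return null;
}

// The two mapping lines come from the dza.txt sample; the address is invented.
$map = parse_area_map(array('"Algiers","Draria"', '', '"Hubei","Wuhan"'));
var_dump(match_area('12 Example Street, DRARIA', $map));
```

As in the script, the first keyword that matches wins, so the order of the lines in the mapping file matters when keywords overlap.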
Industry classification:
<?php
// Start of the header comment
/*
General-purpose industry classification program

How to run:
1. SSH into the server hosting the site: 69.64.43.200
2. Change to the directory containing this program: cd /root/drupal7.bizdirlib.com-php
3. Upload the file that lists the industries (explained below), e.g. mdg-industry.txt
4. Run this program with 3 arguments (explained below), e.g. php refresh_category_industry.php mdg 0 10000
5. Watch the on-screen output during and after the run to follow progress and statistics (explained below)
6. In Views you can add a "test area (test industry)" view to inspect unclassified nodes, spot new patterns, edit mdg-industry.txt, upload it and run again

Arguments:
Argument 1: country code, i.e. the leading part of the site domain; for dza.bizdirlib.com the country code is 'dza'
Argument 2: start offset, i.e. the index at which processing starts; normally 0, meaning from the beginning. Note that this is not a node id but an offset into the pages to be processed
Argument 3: limit, i.e. the number of records to process. Use 1 while debugging to process a single record, or 10 for ten; a real run can use 10000 or more, but generally not more than 50,000, otherwise PHP may run out of memory and abort with an error. If more than 50,000 records need processing, run the program several times with 50,000 each

Mapping file:
The industry file is named mdg-industry.txt, where mdg is the leading part of mdg.bizdirlib.com. Its content looks like this:
ABATTOIRS and VIANDE EN GROS
ADDUCTION D'EAU and VRD
ADMINISTRATIONS
AEROPORTS
AEROPORTS,SECURITE AERIENNE
AGENCEMENT and DECORATION
AGENCES DE PRESSE and D'INFORMATION
AGENCES DE PUBLICITE and DE COMMUNICATION
Each line contains one industry.
Note: save mdg-industry.txt as UTF-8 with Unix line endings, otherwise all non-English characters turn into garbage

Statistics:
count          // total number of nodes handled in this run
count_added    // nodes that had already been classified before
count_not_add  // nodes that were not classified in this run
count_adding   // nodes that were classified in this run
*/
// End of the header comment

// Definition section: the three command-line arguments
$country_code = $argv[1]; // Country code
$offset = (int) $argv[2]; // Start offset
$limit = (int) $argv[3];  // Limit
// End of the definition section; nothing below needs to be changed

//print_r($industry_array);
//print "offset=$offset\n";
//print "limit=$limit\n";
//print "country_name=$country_name\n";
//print "country_code=$country_code\n";

$file = "$country_code-industry.txt";
$fp = fopen($file, "r"); // Open the file read-only
if ($fp === false) {
    exit("Cannot open $file\n");
}
$count_line = 0;
$file_array = array();
while (!(feof($fp))) {
    $text = fgets($fp); // Read one line of the file
    $text = str_replace("\n", '', $text);
    $text = str_replace("\r", '', $text);
    // print "text=$text\n";
    if ($text != '') {
        // $len=strpos($text,'","');
        // $term=substr($text,1,$len-1);
        // $industry=substr($text,$len+3,-1);
        $term = $text;
        $industry = $text;
        $file_array[$count_line]['term'] = $term;
        $file_array[$count_line]['industry'] = $industry;
        // print "file_array[$count_line]['term']=".$file_array[$count_line]['term']."\n";
        // print "file_array[$count_line]['industry']=".$file_array[$count_line]['industry']."\n";
    }
    // print_r ($file_array);
    //$vocalbulary->vid='industry';
    $count_line++;
}
fclose($fp);
// print_r ($file_array);

$_SERVER['HTTP_HOST'] = "$country_code.bizdirlib.com";
$_SERVER['SCRIPT_NAME'] = "/refresh_category_industry.php";
$_SERVER['REMOTE_ADDR'] = '127.0.0.1';
$drupal_path = '/var/www/html/drupal7.bizdirlib.com/';
chdir($drupal_path);
define('DRUPAL_ROOT', $drupal_path);
#require_once './includes/bootstrap.inc';
require_once DRUPAL_ROOT.'includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// Make sure every industry named in the file exists as a term
foreach ($file_array as $file) {
    //print "key=$key,term=$term\n";
    $term = $file['term'];
    $industry = $file['industry'];
    // print "term='$term',industry='$industry'\n";
    $term_industry = taxonomy_get_term_by_name($term);
    // print_r ($term_industry);
    if (empty($term_industry)) { // If the category does not exist yet, create it
        $term_object = new stdClass();
        $term_object->vid = 3; // Hard-coded vocabulary id used on this site
        $term_object->name = $term;
        // print_r ($term_object);
        taxonomy_term_save($term_object);
        $term_industry = taxonomy_get_term_by_name($term);
        print "term '$term' saved\n";
    } else {
        print "term '$term' exist\n";
    }
}

$node_type = "country";
//$limit = 10000;
//$offset = 0;
//$sql = "SELECT node.nid FROM {node} WHERE node.type = '$node_type' LIMIT $limit OFFSET $offset";
// Exclude nodes that have already been classified.
// The {node} table gets an explicit alias so the query also works with a table prefix.
$sql = "SELECT n.nid FROM {node} n LEFT JOIN {field_data_field_industry} fi ON n.nid = fi.entity_id AND fi.entity_type = 'node' WHERE n.type = :type AND fi.field_industry_tid IS NULL";
//print "sql=$sql\n";
$result = db_query_range($sql, $offset, $limit, array(':type' => $node_type));

$count = 0;         // Total number of nodes handled in this run
$count_added = 0;   // Nodes that had already been classified before
$count_not_add = 0; // Nodes that were not classified in this run
$count_adding = 0;  // Nodes that were classified in this run

while ($anode = $result->fetch()) {
    $count++;
    /*
    print_r ($anode);
    echo '<br>\n';
    */
    $node = node_load($anode->nid);
    print "\n node $anode->nid \n";
    /*
    print_r ($node);
    echo '<br>';
    */
    // Read the CCK fields into variables
    // Basic information; the field mapping is accurate
    // ('und' is quoted: a bare und constant is an error in PHP 8)
    $category = isset($node->field_category_activities['und'][0]['value']) ? $node->field_category_activities['und'][0]['value'] : '';
    //$phone=$node->field_phone['und'][0]['value'];
    //$area=$node->field_area['und'][0]['tid'];
    //$industry=$node->field_industry['und'][0]['tid']; // Always empty here: the query only returns nodes without an industry
    // $address=str_ireplace($country_name,"",$address); // Strip the country name, case-insensitively
    //$address=str_ireplace("other text to strip","",$address); // Strip other text as well if necessary
    //print "address=$address\n";
    //$tids=array();
    //$terms=array("Bouzareah","Ben Aknoun");
    $i = 0;
    $adding = false;
    $category_count = substr_count($category, "^^");
    //print "category1=$category1 \n";
    // print "category_count=$category_count,category=$category,category1=$category1\n";
    if ($category == NULL) {
        // print "country_code=$country_code,count=$count,count_not_add=$count_not_add,category=$category \n";
        /*
        } elseif ($category1==$category) {
            $term_industry=taxonomy_get_term_by_name ($category1);
            if ($term_industry!==NULL) {
                $tid_industry=key($term_industry);
                $node->field_industry['und'][$i]['tid']=$tid_industry;
                $i++;
                node_save($node);
                $count_adding++;
                echo "country_code=$country_code,count=$count,count_adding=$count_adding \n";
                $adding=true;
            }
        */
    } else {
        $count_adding++;
        // explode() splits on the whole "^^" string; strtok() would treat
        // each "^" character as a separate delimiter
        $categories = explode("^^", $category);
        for ($i = 0; $i <= $category_count; $i++) {
            $category1 = $categories[$i];
            $term_industry = taxonomy_get_term_by_name($category1);
            if ($term_industry !== array()) {
                $tid_industry = key($term_industry);
                $node->field_industry['und'][$i]['tid'] = $tid_industry;
                //echo "country_code=$country_code,count=$count,count_adding=$count_adding,i=$i \n";
                echo "country_code=$country_code,count=$count,count_adding=$count_adding,i=$i,category1=$category1 \n";
                $adding = true;
            } else {
                // Unknown industry name: stop processing this node's categories
                $count_adding--;
                break;
            }
        }
        node_save($node);
    }
    if (!$adding) {
        $count_not_add++;
        print "country_code=$country_code,count=$count,count_not_add=$count_not_add \n";
        //print "country_code=$country_code,count=$count,count_not_add=$count_not_add,category=$category \n";
    }
} //end while

// The run is finished; print the statistics
print "\n------------------------\n";
print "country_code=$country_code\n";
print "Done!\n";
print "count=$count\n";
print "count_adding=$count_adding\n";
//print "count_added=$count_added\n";
print "count_not_add=$count_not_add\n";
?>
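In the industry script, the text field field_category_activities can hold several industry names joined by `^^`. One detail worth knowing when splitting such a value: PHP's `strtok()` treats its delimiter argument as a set of single characters, so `"^^"` behaves like `"^"`, whereas `explode()` splits on the whole string. The sketch below shows the splitting step on its own; the function name `split_categories` and the sample values are made up for illustration.

```php
<?php
// Hypothetical helper (not part of the original script): splits a
// "^^"-separated value into trimmed, non-empty industry names.
function split_categories($category) {
    $names = array();
    foreach (explode('^^', (string) $category) as $name) {
        $name = trim($name);
        if ($name !== '') {
            $names[] = $name;
        }
    }
    return $names;
}

// Invented sample values
print_r(split_categories('HOTELS^^RESTAURANTS'));
print_r(split_categories('A^B^^C')); // The single "^" stays inside the first name
```

Each returned name can then be looked up with `taxonomy_get_term_by_name()` as in the script.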
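That is the main content of the programs. Each site needs its own modifications in actual use, especially the checks and processing applied after the text field has been read. Also, the programs above are quite messy, and some statements used for debugging along the way are still in there, commented out, so treat them as reference only.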
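Since the text handling is the part that changes from site to site, one option is to isolate it in a small function that can be tested from the command line before touching any nodes. The sketch below repeats the cleanup steps of ditu_category.php; the function name `classification_to_term_names` and the sample input are made up for illustration, and the real field content may look different.

```php
<?php
// Hypothetical helper (not part of the original script): turns the raw text
// of the classification field into a list of term names.
function classification_to_term_names($classification) {
    // Same cleanup steps as in ditu_category.php
    $category = preg_replace('/\[\d{1,3},/', '', (string) $classification);
    $category = str_replace(array('[', ']', '<\/font>'), '', $category);
    $category = str_replace('委,辦,局', '委/辦/局', $category);

    $names = array();
    foreach (explode(',', $category) as $name) {
        $name = trim($name);
        // Drop empty fragments and duplicates
        if ($name !== '' && !in_array($name, $names, true)) {
            $names[] = $name;
        }
    }
    return $names;
}

// Invented sample input in the "[number,name]" shape that the regex suggests
print_r(classification_to_term_names('[12,Hotel],[7,Restaurant]'));
```

The Drupal-specific part of the loop then only has to look up or create a term for each returned name.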