For example, given the HTML fragment below, how do we extract each img tag and its image path?
<div class="good-item"><ul class="clearfix"data-product-list=""><li><a href="/mobile/index/shenruDeatil/id/741"style="position: relative;"><em class="hot-label">热</em><img src="/upload/gallery/thumbnail/4CEDD57F-8D89-3346-129883902F59-tbl.jpg"><div class="good-text"><div class="good-name">1号牛皮纸盒</div><span class="pcolor">¥</span><span class="singlePrice"><!--0.720-->0.72</span><span>/个</span></div></a></li><li><a href="/mobile/index/goodsDeatil/id/257"style="position: relative;"><em class="hot-label">热</em><img src="/upload/gallery/thumbnail/10B4AE18-7B30-7873-D4F03F0842E2-tbl.jpg"><div class="good-text"><div class="good-name">手挽袋</div><span class="pcolor">¥</span><span class="singlePrice"><!--250.00-->250</span><span>/件</span></div></a></li><li><a href="/mobile/index/goodsDeatil/id/249"style="position: relative;"><em class="hot-label">热</em><img src="/upload/gallery/thumbnail/51AAFC9A-188E-2934-937CC221BBF2-tbl.jpg"><div class="good-text"><div class="good-name">4#牛皮纸袋</div><span class="pcolor">¥</span><span class="singlePrice"><!--160.00-->160</span><span>/件</span></div></a></li><li><a href="/mobile/index/goodsDeatil/id/661"style="position: relative;"><em class="hot-label">热</em><img src="/upload/gallery/thumbnail/BEB3E265-347A-5135-789673024100-tbl.jpg"><div class="good-text"><div class="good-name">双童艺术吸管</div><span class="pcolor">¥</span><span class="singlePrice"><!--100.00-->100</span><span>/件</span></div></a></li></ul></div>
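We use regular expressions to get the img tags and the image paths:
- 1. Regex for matching an img tag: /<img(.*?)>/
- 2. Regex for capturing the src attribute of an image: it starts at "<img" and allows for whitespace and for single or double quotes around the value. The result is: /<img.+src=\s*["']?(.+\.(jpg|jpeg|gif|bmp|png))["']?.*>/i
The PHP code is as follows: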
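// $html holds the HTML text shown above, e.g. assigned with a heredoc: $html = <<<HTML ... HTML;
$arr = [];
// Step 1: collect every complete <img ...> tag; $match[0] holds the full matches.
preg_match_all('/<img(.*?)>/', $html, $match);
$images = $match[0];
foreach ($images as $key => $val) {
    // Step 2: capture the src value; group 1 is the image path.
    if (preg_match('/<img.+src=\s*["\']?(.+\.(jpg|jpeg|gif|bmp|png))["\']?.*>/i', $val, $res)) {
        $arr[$key]['img_tag'] = $val;
        $arr[$key]['img_path'] = $res[1];
    }
}
var_dump($arr);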
The printed result is as follows:
array(4) {
  [0]=>
  array(2) {
    ["img_tag"]=>
    string(77) "<img src="/upload/gallery/thumbnail/4CEDD57F-8D89-3346-129883902F59-tbl.jpg">"
    ["img_path"]=>
    string(65) "/upload/gallery/thumbnail/4CEDD57F-8D89-3346-129883902F59-tbl.jpg"
  }
  [1]=>
  array(2) {
    ["img_tag"]=>
    string(77) "<img src="/upload/gallery/thumbnail/10B4AE18-7B30-7873-D4F03F0842E2-tbl.jpg">"
    ["img_path"]=>
    string(65) "/upload/gallery/thumbnail/10B4AE18-7B30-7873-D4F03F0842E2-tbl.jpg"
  }
  [2]=>
  array(2) {
    ["img_tag"]=>
    string(77) "<img src="/upload/gallery/thumbnail/51AAFC9A-188E-2934-937CC221BBF2-tbl.jpg">"
    ["img_path"]=>
    string(65) "/upload/gallery/thumbnail/51AAFC9A-188E-2934-937CC221BBF2-tbl.jpg"
  }
  [3]=>
  array(2) {
    ["img_tag"]=>
    string(77) "<img src="/upload/gallery/thumbnail/BEB3E265-347A-5135-789673024100-tbl.jpg">"
    ["img_path"]=>
    string(65) "/upload/gallery/thumbnail/BEB3E265-347A-5135-789673024100-tbl.jpg"
  }
}
The image paths are not real paths and are for reference only.
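The src pattern in step 2 is meant to tolerate whitespace and either quote style, which the sample HTML does not exercise because every tag there uses double quotes. The following is a minimal sketch for checking those variants. The helper name `extractImgSrc` and the sample tags are made up for illustration, and the pattern is a lightly adjusted variant (the character class is written as `["\']` and the trailing `.+` is relaxed to `.*` so that unquoted values can also match). It is not a robust HTML parser.

```php
<?php
// Hypothetical helper for illustration; the name is not from the original post.
// Returns the captured image path, or null when the tag has no matching src.
function extractImgSrc(string $tag): ?string
{
    // Assumption: trailing ".*" instead of ".+" so an unquoted src right before ">" matches.
    $pattern = '/<img.+src=\s*["\']?(.+\.(jpg|jpeg|gif|bmp|png))["\']?.*>/i';
    return preg_match($pattern, $tag, $m) === 1 ? $m[1] : null;
}

// Made-up sample tags covering the quoting variants mentioned in step 2.
$samples = [
    '<img src="/a/one.jpg">',            // double quotes
    "<img src='/a/two.PNG' alt='x'>",    // single quotes, upper-case extension
    '<img class="c" src= /a/three.gif>', // no quotes, space after "="
    '<img alt="no source">',             // no src attribute at all
];

foreach ($samples as $tag) {
    var_dump(extractImgSrc($tag));
}
```

Because `.+` is greedy, a tag whose later attributes also contain something like `.png` would make the capture run past the real src value, so for untrusted or irregular markup a stricter pattern or a DOM parser is likely the safer choice.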