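When you run into a programming problem, the first thing to do is simplify it. Once it is reduced to the simplest possible problem, write the simplest code that solves it, and pay only the smallest possible testing cost.
A simple piece of HTML source: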
1<!--The loneliest number-->
                        <a>2<!--Can be as bad as one--><b>3
Extract the comments from the HTML above:
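from bs4 import BeautifulSoup, Comment

# Name the parser explicitly so the result does not depend on what is installed.
soup = BeautifulSoup("""1<!--The loneliest number-->
                        <a>2<!--Can be as bad as one--><b>3""", "html.parser")
# Comments are text nodes of type Comment, so filter on that type.
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
for comment in comments:
    print(comment)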
Output:
The loneliest number
Can be as bad as one
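The same idea can be wrapped in a small reusable function. This is only a sketch: the name `extract_comments` is hypothetical and not part of Beautiful Soup, and it assumes the built-in `html.parser` backend is good enough for the input.

```python
from bs4 import BeautifulSoup, Comment


def extract_comments(html):
    """Return the text of every HTML comment in `html`, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    found = soup.find_all(string=lambda node: isinstance(node, Comment))
    # Comment is a str subclass; strip() yields a plain str without padding.
    return [comment.strip() for comment in found]


sample = """1<!--The loneliest number-->
                        <a>2<!--Can be as bad as one--><b>3"""
print(extract_comments(sample))
```

Returning plain strings rather than `Comment` nodes keeps callers independent of the parse tree, at the cost of losing each comment's position in the document.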
Remove the comments from the HTML above:
from bs4 import BeautifulSoup, Comment

soup = BeautifulSoup("""1<!--The loneliest number-->
                        <a>2<!--Can be as bad as one--><b>3""", "html.parser")
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
# extract() detaches each comment node from the parse tree.
for comment in comments:
    comment.extract()
print(soup)
Output:
1
<a>2<b>3</b></a>
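Comment removal can be packaged the same way. Again a minimal sketch under assumptions: `strip_comments` is a hypothetical helper name, and the parser is fixed to `html.parser`, which does not wrap the fragment in `<html>` or `<body>` tags.

```python
from bs4 import BeautifulSoup, Comment


def strip_comments(html):
    """Return `html` re-serialized with all comment nodes removed."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns a list, so detaching nodes inside the loop is safe.
    for comment in soup.find_all(string=lambda node: isinstance(node, Comment)):
        comment.extract()
    return str(soup)


sample = """1<!--The loneliest number-->
                        <a>2<!--Can be as bad as one--><b>3"""
print(strip_comments(sample))
```

Because the result is re-serialized from the parse tree, unclosed tags in the input come back closed, so the output is cleaned-up HTML rather than a byte-for-byte copy with the comments cut out.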