鎬庝箞浣跨敤BeautifulSoup瑙f瀽HTML鏂囨。
浣跨敤BeautifulSoup瑙f瀽HTML鏂囨。鐨勫熀鏈楠ゅ涓嬶細
- 瀵煎叆BeautifulSoup搴擄細
from bs4 import BeautifulSoup
- 鍒涘缓BeautifulSoup瀵硅薄骞朵紶鍏TML鏂囨。鍜岃В鏋愬櫒锛?/li>
html_doc = """
<html>
<head>
<title>Example HTML Document</title>
</head>
<body>
<p>This is an example paragraph.</p>
</body>
</html>
"""
soup = BeautifulSoup(html_doc, 'html.parser')
- 浣跨敤BeautifulSoup瀵硅薄鏌ユ壘鍜屾彁鍙栭渶瑕佺殑淇℃伅锛?/li>
# 鑾峰彇鏂囨。鏍囬
title = soup.title
print(title.text)
# 鑾峰彇绗竴涓钀?/span>
paragraph = soup.p
print(paragraph.text)
- 浣跨敤BeautifulSoup瀵硅薄鏌ユ壘鐗瑰畾鏍囩鎴栧睘鎬х殑鍐呭锛?/li>
# 鏌ユ壘鎵€鏈夌殑娈佃惤鏍囩
paragraphs = soup.find_all('p')
for p in paragraphs:
print(p.text)
# 鏌ユ壘鍖呭惈鐗瑰畾class灞炴€х殑鏍囩
div = soup.find('div', class_='example_class')
print(div.text)
浠ヤ笂鏄娇鐢˙eautifulSoup瑙f瀽HTML鏂囨。鐨勫熀鏈柟娉曪紝鍙互鏍规嵁鍏蜂綋鐨勯渶姹傚拰HTML鏂囨。缁撴瀯鏉ヨ繘涓€姝ュ簲鐢˙eautifulSoup鐨勫姛鑳姐€?/p>
相关问答