How to evaluate a language model with the NLTK library
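NLTK (Natural Language Toolkit) is a Python library for natural language processing, and it can be used to evaluate language models. The simple example below runs NLTK's tokenizer and part-of-speech tagger over a sample sentence as a first step: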
First, install the NLTK library:
pip install nltk
Then import NLTK and download the required corpora and models:
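import nltk
# Tokenizer models used by word_tokenize
nltk.download('punkt')
# Part-of-speech tagger model used by pos_tag
nltk.download('averaged_perceptron_tagger')
# Tag set documentation, useful for looking up what each tag means
nltk.download('tagsets')
# Newer NLTK releases may ask for 'punkt_tab' and 'averaged_perceptron_tagger_eng'
# instead; the LookupError message names the resource to download.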
Next, define a sample text and process it with NLTK's tokenizer and part-of-speech tagger:
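from nltk.tokenize import word_tokenize
from nltk import pos_tag
# Sample text to analyze
text = "This is a simple sentence."
# Split the text into tokens, then tag each token with its part of speech
tokens = word_tokenize(text)
tags = pos_tag(tokens)
# Print the tagging result as a list of (token, tag) pairs
print(tags)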
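The tokenization and tagging example shows how NLTK's tokenizer and part-of-speech tagger can be used to inspect a simple piece of text. Note that these steps analyze the text itself; a quantitative score such as perplexity requires a trained model, which NLTK supports through its `nltk.lm` module. You can also use other NLTK features and modules, such as stemming and named entity recognition, when working with more complex language models. Which methods and tools are appropriate depends on your specific requirements and task.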
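To make the evaluation step concrete, here is a minimal sketch that trains a small bigram model with `nltk.lm` and scores a held-out sentence by perplexity. The toy corpus, the variable names, and the choice of Laplace smoothing are assumptions made for illustration; they are not taken from the example above. The sentences are already split into tokens, so no NLTK data download should be needed for this part.

```python
from nltk.lm import Laplace
from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
from nltk.util import ngrams

# Toy, pre-tokenized corpora (illustrative data only).
train_sents = [
    ["this", "is", "a", "simple", "sentence"],
    ["this", "is", "another", "simple", "sentence"],
    ["a", "sentence", "is", "simple"],
]
test_sent = ["this", "is", "a", "sentence"]

ORDER = 2  # bigram model

# Build padded training n-grams (orders 1..ORDER) and the vocabulary stream.
train_ngrams, vocab = padded_everygram_pipeline(ORDER, train_sents)

# Laplace (add-one) smoothing keeps unseen bigrams from getting zero probability,
# which would otherwise make the perplexity infinite.
model = Laplace(ORDER)
model.fit(train_ngrams, vocab)

# Pad the held-out sentence the same way and score its bigrams.
test_bigrams = list(ngrams(pad_both_ends(test_sent, n=ORDER), ORDER))
perplexity = model.perplexity(test_bigrams)
print(f"Perplexity on the held-out sentence: {perplexity:.2f}")
```

Lower perplexity means the model finds the held-out text less surprising. On a real task the training and test sentences would come from a proper corpus, tokenized with `word_tokenize` rather than written out by hand.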