How to read files on HDFS with Python

扬州沐宇科技
2023-10-13 04:25:08
python

To read a file stored on HDFS from Python, you can use one of two libraries that provide access to the Hadoop file system: pyarrow or hdfs3.

To read a file on HDFS with pyarrow, install the pyarrow library and set up the Hadoop environment variables. You can then read the file with code like the following:

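import pyarrow as pa

# Connect to the HDFS file system.
# Note: pa.hdfs.connect is pyarrow's legacy HDFS interface and is
# deprecated in recent pyarrow releases.
fs = pa.hdfs.connect(host="namenode_host", port=8020, user="hdfs_user")

# Read the file on HDFS as raw bytes
with fs.open("/path/to/file.txt", mode='rb') as f:
    data = f.read()

# Print the file contents
print(data.decode('utf-8'))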
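
Newer pyarrow releases expose HDFS through the pyarrow.fs module rather than pa.hdfs.connect. The following is a minimal sketch of the same read using pyarrow.fs.HadoopFileSystem, assuming the same placeholder host, port, user, and path as above and a working libhdfs setup. The helper names read_text and read_hdfs_text are hypothetical and introduced only for illustration.

```python
import io


def read_text(open_stream, path, encoding="utf-8"):
    """Read a whole file through open_stream(path) and decode it to str.

    open_stream is any callable that returns a binary file-like object
    usable as a context manager (hypothetical helper, not part of pyarrow).
    """
    with open_stream(path) as f:
        return f.read().decode(encoding)


def read_hdfs_text(path, host="namenode_host", port=8020, user="hdfs_user"):
    """Read a text file from HDFS via pyarrow.fs (needs pyarrow and libhdfs)."""
    # Imported lazily so this module can be loaded without pyarrow installed.
    from pyarrow import fs

    hdfs = fs.HadoopFileSystem(host, port=port, user=user)
    return read_text(hdfs.open_input_stream, path)


# The decoding helper can be tried without a cluster by using an in-memory stream.
sample = read_text(
    lambda _path: io.BytesIO("hello hdfs\n".encode("utf-8")),
    "/path/to/file.txt",
)
print(sample)
```

On a machine with access to the cluster, read_hdfs_text("/path/to/file.txt") would perform the actual HDFS read.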

To read a file on HDFS with hdfs3, install the hdfs3 library and set up the Hadoop environment variables. You can then read the file with code like the following:

import hdfs3

# Connect to the HDFS file system
fs = hdfs3.HDFileSystem(host="namenode_host", port=8020, user="hdfs_user")

# Read the file on HDFS as raw bytes
with fs.open("/path/to/file.txt", 'rb') as f:
    data = f.read()

# Print the file contents
print(data.decode('utf-8'))
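
Calling f.read() with no argument loads the entire file into memory, which can be a problem for large files. As a rough sketch, assuming the file object accepts a size argument to read() the way ordinary Python binary files do, the content can instead be consumed in fixed-size chunks. The function name iter_chunks and the 1 MiB default are illustrative choices, not part of hdfs3 or pyarrow.

```python
import io


def iter_chunks(fileobj, chunk_size=1024 * 1024):
    """Yield successive byte chunks from a binary file-like object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of file
            break
        yield chunk


# Demonstration with an in-memory stream standing in for an HDFS file handle.
demo = io.BytesIO(b"abcdefghij")
chunks = list(iter_chunks(demo, chunk_size=4))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```

If text rather than bytes is needed, decode incrementally (for example with codecs.getincrementaldecoder) instead of decoding each chunk separately, since a multi-byte UTF-8 character may be split across two chunks.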

Replace namenode_host with the hostname or IP address of your HDFS NameNode, 8020 with the NameNode's port (8020 is the default), and hdfs_user with your HDFS user name. Then replace /path/to/file.txt with the path of the file you want to read.
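
One way to avoid hard-coding these values is to collect them in a small settings object. The sketch below assumes the values are supplied through environment variables. The names HDFS_HOST, HDFS_PORT, and HDFS_USER and the HdfsSettings class are hypothetical and are not defined by Hadoop, pyarrow, or hdfs3.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HdfsSettings:
    """Connection parameters for an HDFS NameNode (illustrative only)."""
    host: str
    port: int = 8020
    user: str = "hdfs_user"


def settings_from_env(env=None):
    """Build HdfsSettings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return HdfsSettings(
        host=env.get("HDFS_HOST", "namenode_host"),  # hypothetical variable name
        port=int(env.get("HDFS_PORT", "8020")),      # hypothetical variable name
        user=env.get("HDFS_USER", "hdfs_user"),      # hypothetical variable name
    )


# Example with an explicit mapping instead of the real environment.
settings = settings_from_env({"HDFS_HOST": "10.0.0.5", "HDFS_PORT": "9000"})
print(settings)
```

The fields can then be passed to either library, for example hdfs3.HDFileSystem(host=settings.host, port=settings.port, user=settings.user).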
