How to clean data in Python
Python offers a range of libraries and tools for cleaning data. Some commonly used methods are:
- Removing duplicates: the pandas drop_duplicates() function removes duplicate rows.
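import pandas as pd
df = pd.DataFrame({'col1': ['A', 'B', 'A', 'C', 'B'],
                   'col2': [1, 2, 3, 4, 5]})
# No two rows match in every column, so compare on col1 only;
# drop_duplicates() returns a new DataFrame, so assign the result
df = df.drop_duplicates(subset=['col1'])  # keeps the first 'A' and 'B' rows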
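Before dropping anything, it can help to see which rows pandas would treat as duplicates. The sketch below reuses the sample frame and assumes that only col1 should decide whether two rows count as the same record; the variable names dup_mask and deduped are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B', 'A', 'C', 'B'],
                   'col2': [1, 2, 3, 4, 5]})

# Boolean mask: True for rows whose col1 value already appeared earlier
dup_mask = df.duplicated(subset=['col1'])
print(dup_mask.tolist())  # [False, False, True, False, True]

# Keep the last occurrence of each col1 value instead of the first
deduped = df.drop_duplicates(subset=['col1'], keep='last')
print(deduped['col2'].tolist())  # [3, 4, 5]
```

Without subset, rows are compared on every column, and this sample frame has no fully identical rows.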
- Handling missing values: the pandas fillna() function fills in missing values, and dropna() removes rows that contain missing values.
import pandas as pd
df = pd.DataFrame({'col1': [1, 2, None, 4],
                   'col2': [None, 2, 3, 4]})
filled = df.fillna(0)  # replace missing values with 0
dropped = df.dropna()  # drop rows that contain missing values
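Filling every gap with 0 can distort numeric columns. As one possible alternative, the sketch below first counts the missing values per column and then fills each column with its own mean. Whether the mean is a sensible fill value is an assumption that depends on the data.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, None, 4],
                   'col2': [None, 2, 3, 4]})

# Count missing values per column before deciding how to handle them
missing_counts = df.isna().sum()
print(missing_counts)

# Fill each column's gaps with that column's own mean
# (df.mean() skips missing values by default)
filled = df.fillna(df.mean())
print(filled['col2'].tolist())  # [3.0, 2.0, 3.0, 4.0]
```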
- Transforming data: the pandas apply() function transforms data; passing it a custom function lets you implement all kinds of cleaning operations.
import pandas as pd
df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd'],
                   'col2': [1, 2, 3, 4]})
def convert_to_uppercase(x):
    return x.upper()
df['col1'] = df['col1'].apply(convert_to_uppercase)  # convert the values in col1 to uppercase
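For simple text clean-up, the built-in .str accessor is often enough and avoids writing a custom function. The sketch below uses a hypothetical variant of the sample data with stray whitespace and one missing entry to show how the two approaches differ.

```python
import pandas as pd

# Hypothetical messy input: extra spaces and a missing value
df = pd.DataFrame({'col1': [' a', 'b ', None, 'd'],
                   'col2': [1, 2, 3, 4]})

# .str methods work on the whole column and leave missing values as missing,
# whereas calling x.upper() on None inside apply() would raise AttributeError
df['col1'] = df['col1'].str.strip().str.upper()
print(df['col1'].tolist())
```

apply() remains the more flexible choice when the transformation cannot be expressed with the vectorized string methods.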
- Converting data types: the pandas astype() function converts data to a specified type.
import pandas as pd
df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [1.1, 2.2, 3.3, 4.4]})
df['col2'] = df['col2'].astype(int)  # convert col2 to integers; fractions are truncated (1.1 -> 1)
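astype() fails as soon as a value cannot be converted, which is common when numbers arrive as text. A minimal sketch of one way around this, assuming a hypothetical raw column that contains an unparseable entry, is to coerce with pd.to_numeric() first:

```python
import pandas as pd

# Hypothetical raw column: numbers stored as text, with one unparseable entry
raw = pd.Series(['1', '2', 'n/a', '4'])

# astype(int) would raise ValueError on 'n/a'; errors='coerce' turns it into NaN
numeric = pd.to_numeric(raw, errors='coerce')

# Drop the unparseable entry, then the remaining values convert to int cleanly
clean = numeric.dropna().astype(int)
print(clean.tolist())  # [1, 2, 4]
```

Dropping the bad entry is only one option; filling it with fillna() before converting would work as well.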
- Standardizing data: the StandardScaler class from sklearn standardizes data.
from sklearn.preprocessing import StandardScaler
data = [[1, 2], [3, 4], [5, 6]]
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)  # standardize each column to zero mean and unit variance
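To make clear what standardization computes, the sketch below reproduces the same calculation with pandas alone: subtract each column's mean and divide by its population standard deviation, which is what StandardScaler does with its default settings. The column names x and y are placeholders.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['x', 'y'])

# z = (value - column mean) / column standard deviation
# ddof=0 selects the population standard deviation
standardized = (df - df.mean()) / df.std(ddof=0)

# After standardization each column has mean ~0 and standard deviation ~1
print(standardized.mean().round(6).tolist())
print(standardized.std(ddof=0).round(6).tolist())
```

Note that pandas std() defaults to ddof=1 (the sample standard deviation), so leaving out ddof=0 would give slightly different values.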
These are only some of the common data cleaning methods. In practice, the specific operations and steps depend on the data types and requirements involved, so choose the methods that fit your situation.