What                                           Code
single column                                  df['food']
multiple columns                               df[['color', 'food', 'score']]
single row                                     df.loc['Niko']
multiple rows                                  df.loc[['Niko', 'Penelope']]
slice notation to select a range of rows       df.loc['Niko':'Dean']
                                               df.loc[:'Aaron']
stepping by 2                                  df.loc['Niko':'Christina':2]
rows and columns                               df.loc[row_selection, column_selection]
                                               df.loc['Jane':'Penelope', ['state', 'color']]
single row                                     df.iloc[3]
multiple rows                                  df.iloc[[5, 2, 4]]
                                               df.iloc[3:5]
                                               df.iloc[[2,3], [0,…
Continue reading Selecting Subsets of Data in Pandas
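The difference between label-based and position-based selection is easier to see on a small frame. The sketch below uses made-up data (the values, and which names appear in the index, are assumptions chosen only to echo the labels in the excerpt) and shows that a `.loc` label slice includes both endpoints while an `.iloc` slice excludes the stop position.

```python
import pandas as pd

# Hypothetical sample data; only the column and index labels echo the excerpt.
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AK", "AL"],
        "color": ["blue", "green", "red", "white", "gray"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese"],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)

food = df["food"]                        # single column, returned as a Series
subset = df[["color", "food", "score"]]  # list of columns, returned as a DataFrame
niko = df.loc["Niko"]                    # single row selected by label

# Label slices include both endpoints: Niko, Aaron and Penelope.
by_label = df.loc["Niko":"Penelope", ["state", "color"]]

# Position slices exclude the stop: rows at positions 1 and 2 only.
by_position = df.iloc[1:3]

# Lists of positions pick rows 2 and 3 and columns 0 and 1.
block = df.iloc[[2, 3], [0, 1]]
```

Keeping `.loc` for labels and `.iloc` for integer positions avoids ambiguity when the index itself happens to contain integers.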
Tag: pandas
python, Pandas
Categorize the range
df['PriceBin'] = pd.cut(df['PriceAvg'], bins=3)
df['PriceBin'].value_counts()
(54060.0, 2040000.0]      209
(2040000.0, 4020000.0]      4
(4020000.0, 6000000.0]      1
Name: PriceBin, dtype: int64
df['PriceBin'] = pd.qcut(df['PriceAvg'], q=3)
df['PriceBin'].value_counts().sort_index()
(59999.999, 210000.0]     77
(210000.0, 315000.0]      66
(315000.0, 6000000.0]     71
Name: PriceBin, dtype: int64
h = df.groupby('PriceBin', as_index=False).median()['SalesAvg']
h = pd.DataFrame(h)
h.reset_index(inplace=True)
h
   PriceBin                 SalesAvg
0  (59999.999, 210000.0]    42.000000
1  (210000.0, 315000.0]     145.166667
2  (315000.0, 6000000.0]    114.200000
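The output above shows why the two functions behave so differently on skewed prices: `pd.cut` makes equal-width bins, so a few large values leave most rows in the first bin, while `pd.qcut` makes bins with roughly equal counts. A minimal sketch on made-up numbers (the values are assumptions; only the column names `PriceAvg`, `SalesAvg`, and `PriceBin` come from the excerpt), including one way to get a per-bin median that keeps the bin as a column:

```python
import pandas as pd

# Hypothetical, deliberately skewed prices with one large outlier.
df = pd.DataFrame(
    {
        "PriceAvg": [1, 2, 3, 4, 5, 6, 7, 8, 100],
        "SalesAvg": [10, 20, 30, 40, 50, 60, 70, 80, 90],
    }
)

# Equal-width bins: the outlier stretches the range, so 8 of 9 rows share bin 0.
equal_width = pd.cut(df["PriceAvg"], bins=3)

# Equal-frequency bins: each of the 3 bins receives 3 rows.
df["PriceBin"] = pd.qcut(df["PriceAvg"], q=3)

# Median of SalesAvg per bin; selecting the column before aggregating
# keeps PriceBin in the result after reset_index().
medians = (
    df.groupby("PriceBin", observed=True)["SalesAvg"]
    .median()
    .reset_index()
)
print(medians)
```

Selecting `SalesAvg` before calling `median()` also avoids aggregating unrelated or non-numeric columns.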
Useful Python pandas codes
– Rename a data frame column
df.rename(columns={"contract_id": "deal_id"}, inplace=True)
– Where statement
tips[tips['time'] == 'Dinner'].head(5)
– vlookup
mg = pd.merge(df, AgReg, on="deal_id", how="left")
– Choose the first column of an array or the first part of a string with a delimiter
df["cat"] = df["CategoryID"].str.split(',', n=1).str[0]
– Fill NA, NaN, or null values
df["CategoryID"].fillna("", inplace=True)
– Convert to datetime
pd.to_datetime(df["start_date"], errors='ignore')
combination of where and select some.… Continue reading Useful Python pandas codes
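The snippets above can be chained into one short, runnable pass. This is only a sketch on invented data: the frame contents and the `region` column are assumptions, while `contract_id`, `deal_id`, `CategoryID`, `start_date`, and `AgReg` come from the excerpt. It assigns results back instead of using `inplace=True`, and it swaps in `errors="coerce"` so that unparseable dates become `NaT` rather than being left as strings.

```python
import pandas as pd

# Hypothetical input frames for illustration only.
df = pd.DataFrame(
    {
        "contract_id": [1, 2, 3],
        "CategoryID": ["10,20", None, "30"],
        "start_date": ["2021-01-05", "2021-02-10", "not a date"],
    }
)
AgReg = pd.DataFrame({"deal_id": [1, 2], "region": ["North", "South"]})

# Rename a column (returns a new frame).
df = df.rename(columns={"contract_id": "deal_id"})

# "vlookup": a left merge keeps every row of df, matched or not.
mg = pd.merge(df, AgReg, on="deal_id", how="left")

# Fill missing values before doing string work on the column.
mg["CategoryID"] = mg["CategoryID"].fillna("")

# Keep the part of the string before the first comma.
mg["cat"] = mg["CategoryID"].str.split(",", n=1).str[0]

# Parse dates; values that cannot be parsed become NaT.
mg["start_date"] = pd.to_datetime(mg["start_date"], errors="coerce")
```

Passing `n=1` as a keyword matters on recent pandas versions, where the split limit can no longer be given positionally.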
Pandas V.S SQL
If you already know SQL and are moving to Python, this article maps common queries to their pandas equivalents.
Columns: Title, SQL, Pandas, Desc
Title: Simple
SQL: SELECT total_bill, tip, smoker, time FROM tips LIMIT 5;
Pandas: tips[['total_bill', 'tip', 'smoker', 'time']].head(5)
Title: Where
SQL: SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5;
Pandas: tips[tips['time'] == 'Dinner'].head(5)
Title: Multiple conditions
SQL: SELECT * FROM tips WHERE… Continue reading Pandas V.S SQL
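To try the two complete rows of the table without a database, the sketch below builds a small stand-in for the `tips` table (the rows are invented for illustration; only the column names come from the excerpt) and runs the pandas side of each query.

```python
import pandas as pd

# Hypothetical stand-in for the tips table.
tips = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
        "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
        "smoker": ["No", "No", "Yes", "No", "Yes", "No"],
        "time": ["Dinner", "Lunch", "Dinner", "Dinner", "Lunch", "Dinner"],
    }
)

# SELECT total_bill, tip, smoker, time FROM tips LIMIT 5;
selected = tips[["total_bill", "tip", "smoker", "time"]].head(5)

# SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5;
# The comparison builds a boolean mask; indexing with it keeps matching rows.
dinners = tips[tips["time"] == "Dinner"].head(5)
```

Note that `head(5)` returns at most five rows, so a filter with fewer matches simply returns all of them, as `LIMIT 5` would.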