apply
aggregate
transform
filter
diff
df.groupby(pd.cut(df['val'],[0,3000,5000,7000,10000])).size()
df.groupby(
df['sales rep'].apply(lambda x: 'william' in x)
).size().sort_values(ascending = False)
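A minimal sketch of the two groupings above, on made-up data (the rows are hypothetical; only the column names 'sales rep' and 'val' come from these notes). observed=False is passed explicitly so that empty bins would still be listed:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the notes.
df = pd.DataFrame({
    "sales rep": ["william smith", "anna lee", "william jones", "bob ray", "carol diaz"],
    "val": [1000, 4000, 6000, 8000, 2500],
})

# Group by value range: (0, 3000], (3000, 5000], (5000, 7000], (7000, 10000].
bins = [0, 3000, 5000, 7000, 10000]
by_range = df.groupby(pd.cut(df["val"], bins), observed=False).size()

# Group by a boolean key: does the rep's name contain 'william'?
is_william = df["sales rep"].apply(lambda x: "william" in x)
by_name = df.groupby(is_william).size().sort_values(ascending=False)

print(by_range)
print(by_name)
```

The group key does not have to be a column name: any Series aligned with df (bins, a boolean mask) works.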
sums all rows of each column (axis = 0 applies the function column by column; axis = 1 would apply it row by row)
df.apply(sum, axis = 0)
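To see the effect of the axis argument, here is a small sketch on a hypothetical numeric frame (df_num and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical numeric data.
df_num = pd.DataFrame({"val": [10, 20, 30], "sales": [1, 2, 3]})

# axis=0: the function receives each column, so the result has one total per column.
col_totals = df_num.apply(sum, axis=0)

# axis=1: the function receives each row, so the result has one total per row.
row_totals = df_num.apply(sum, axis=1)

print(col_totals)
print(row_totals)
```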
Aggregation:
size
sum
mean / median
max / min
idxmax / idxmin ==> index of the maximum / minimum
agg({'orderId':'size' , 'val':['sum','mean'] , 'sale':['sum','mean']})
with names:
df.groupby('Column0').agg(Name1=('Column1','count') ,
Name2=('Column2' , 'nunique' ))
gr = {'name':('column1','sum') , 'name2':('column2','size')}
df.agg(**gr)
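A short sketch comparing the dict form and the named form on a groupby. The orders frame and the output names n_orders / total_val are hypothetical; the unpacked dict is applied to a groupby here, which is an assumption about the intended use:

```python
import pandas as pd

# Hypothetical order data.
orders = pd.DataFrame({
    "sales rep": ["anna", "anna", "bob", "bob", "bob"],
    "orderId": [1, 2, 3, 4, 4],
    "val": [100, 300, 50, 150, 400],
})

# Dict form: a column can take several functions; the result has two-level columns.
multi = orders.groupby("sales rep").agg({"orderId": "size", "val": ["sum", "mean"]})

# Named form: output_name=(input_column, function) gives flat, readable column names.
spec = {"n_orders": ("orderId", "nunique"), "total_val": ("val", "sum")}
named = orders.groupby("sales rep").agg(**spec)

print(multi)
print(named)
```

Keeping the spec in a dict and unpacking it with ** is handy when the output names are built programmatically.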
Transform:
df.groupby('sales rep')['val'].transform(lambda x:x/sum(x))
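Unlike agg, transform returns a result with the same length as the input, so it can be assigned straight back as a new column. A minimal sketch with made-up rows (the 'share' column name is hypothetical):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "sales rep": ["anna", "anna", "bob", "bob"],
    "val": [100, 300, 50, 150],
})

# Each row's share of its own rep's total; the result is aligned with df's index.
df["share"] = df.groupby("sales rep")["val"].transform(lambda x: x / x.sum())

print(df)
```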
Filter:
it works like HAVING in SQL: it keeps all rows of the groups that satisfy the condition and drops the other groups
df.groupby('sales rep').filter(lambda x: (x['val']* x['sales']).sum() > 200000 )
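A small sketch of the same filter on made-up numbers (the rows are hypothetical; the threshold is the one from the line above). The function is called once per group and must return a single True/False:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "sales rep": ["anna", "anna", "bob", "bob"],
    "val": [1000, 2000, 100, 200],
    "sales": [50, 80, 10, 20],
})

# anna: 1000*50 + 2000*80 = 210000 -> kept; bob: 100*10 + 200*20 = 5000 -> dropped.
kept = df.groupby("sales rep").filter(lambda g: (g["val"] * g["sales"]).sum() > 200000)

print(kept)
```

The result contains the original rows (not one row per group), restricted to the groups that passed.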
diff()
df.groupby('UserID')['ShamsiDate'].diff()
Within each group, the second row holds its difference from the first row (the first row of each group is NaN).
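A minimal sketch of a per-group diff. The visits frame is hypothetical, and a plain numeric 'day' column stands in for ShamsiDate as an assumption to keep the example simple:

```python
import pandas as pd

# Hypothetical visit log, already sorted by day within each user.
visits = pd.DataFrame({
    "UserID": [1, 1, 1, 2, 2],
    "day": [3, 5, 10, 7, 8],
})

# Difference from the previous row of the same user; each user's first row is NaN.
visits["gap"] = visits.groupby("UserID")["day"].diff()

print(visits)
```

The rows should be sorted within each group first, since diff compares each row with the one physically before it in that group.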