Mohammad Hossein Ebrahimzadeh – Page 3 – Personal blog of Mohammad Hossein Ebrahimzadeh Esfahani

Pandas Groupby

Install Python and Jupyter Notebook to Windows 10 (64 bit)

This blog post is a step-by-step tutorial for installing Python and Jupyter Notebook on Windows 10 (64-bit). Python 3.3 or greater, or Python 2.7, is required to install Jupyter Notebook.
1. Download Python 3.7.4 from https://www.python.org/downloads/release/python-374/
2. Select the "x86-64 executable installer" for a 64-bit Windows 10 computer.
3. Select a location to save the executable…
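Once the installer has finished, a short check can confirm that the interpreter meets the version requirement quoted above and is a 64-bit build. This is only a minimal sketch: the helper names `meets_jupyter_requirement` and `pointer_bits` are hypothetical and do not come from the original tutorial.

```python
import struct
import sys


def meets_jupyter_requirement(version_info=sys.version_info):
    """Return True for Python 3.3+ or Python 2.7 (the requirement quoted above)."""
    major, minor = version_info[0], version_info[1]
    if major == 3:
        return minor >= 3
    if major == 2:
        return minor == 7
    return major > 3


def pointer_bits():
    """Pointer size in bits: 64 on a 64-bit interpreter, 32 on a 32-bit one."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Meets requirement:", meets_jupyter_requirement())
    print("Interpreter bits:", pointer_bits())
```

Run it with the newly installed interpreter; if the reported bit width is 32, the 32-bit installer was probably selected by mistake.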

Performance Indicators for Data Analysts

Before getting to this topic, it helps to review the differences between a few professions. Analysis and statistics are different. Analysis is not news about data. It is not marketing. It is not recounting events. Analysis is not decision-making. The difference between analysis and statistics: although the tools and equations used in statistics and analysis are very similar, statisticians and analysts, training…

Protected: Getting Through the Crisis

There is no excerpt because this is a protected post.

Pandas Transform and Filter

Topics covered: Split-Apply-Combine; Filter Data with Transform; Transform with Lambda; Filter with Pandas Groupby; Use Map to Create a New Column; See Also
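The topics listed above can be illustrated with a small sketch. The data and column names (`group`, `sales`, `label`) are made up for illustration and are not taken from the original post.

```python
import pandas as pd

# Hypothetical data: three rows in group "a", two rows in group "b".
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b"],
    "sales": [10, 20, 30, 5, 15],
})

# transform: returns one value per original row, here the group mean.
df["group_mean"] = df.groupby("group")["sales"].transform("mean")

# transform with a lambda: each row's deviation from its group mean.
df["deviation"] = df.groupby("group")["sales"].transform(lambda s: s - s.mean())

# Filter rows using the transformed column (rows above their group mean).
above_mean = df[df["sales"] > df["group_mean"]]

# filter: keeps or drops whole groups based on a group-level condition.
big_groups = df.groupby("group").filter(lambda g: g["sales"].sum() > 25)

# map: build a new column from a lookup dictionary (labels are made up).
df["label"] = df["group"].map({"a": "Group A", "b": "Group B"})
```

The key difference is that `transform` keeps the original shape, so its result can be assigned back as a column, while `filter` returns a subset of the original rows.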

Calculating the Maximum Discount That Can Be Offered to Customers to Win Them Back

The formula (in our view) is:
Maximum allocatable discount = average revenue per order + probability of further returns * average number of subsequent purchases * average revenue per order
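As a worked illustration, the formula can be written as a small function. The function name, parameter names, and input values below are hypothetical; only the arithmetic follows the formula above.

```python
def max_discount(avg_revenue_per_order, return_probability, avg_future_purchases):
    """Maximum allocatable discount, following the formula above.

    max discount = avg revenue per order
                   + return probability * avg future purchases * avg revenue per order
    """
    return avg_revenue_per_order + (
        return_probability * avg_future_purchases * avg_revenue_per_order
    )


# Hypothetical inputs, for illustration only.
example = max_discount(
    avg_revenue_per_order=100_000,
    return_probability=0.25,
    avg_future_purchases=4,
)
print(example)  # 200000.0
```

With these made-up inputs, the expected revenue from future orders (0.25 * 4 * 100,000) equals one more order, so the ceiling is twice the average order revenue.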

Selecting Subsets of Data in Pandas

What | Code
single column | df['food']
multiple columns | df[['color', 'food', 'score']]
single row | df.loc['Niko']
multiple rows | df.loc[['Niko', 'Penelope']]
slice notation to select a range of rows | df.loc['Niko':'Dean'], df.loc[:'Aaron']
stepping by 2 | df.loc['Niko':'Christina':2]
rows and columns | df.loc[row_selection, column_selection], df.loc['Jane':'Penelope', ['state', 'color']]
single row | df.iloc[3]
multiple rows | df.iloc[[5, 2, 4]], df.iloc[3:5], df.iloc[[2,3], [0,…
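The selections listed above can be tried on a small frame. The data below is hypothetical; the row labels and column names only echo those used in the excerpt.

```python
import pandas as pd

# Hypothetical data; the names echo the labels used in the excerpt above.
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK"],
        "color": ["blue", "green", "red", "white", "gray"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese"],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)

food = df["food"]                                  # single column -> Series
subset = df[["color", "food", "score"]]            # multiple columns -> DataFrame
niko = df.loc["Niko"]                              # single row by label
two_rows = df.loc[["Niko", "Penelope"]]            # multiple rows by label
label_slice = df.loc["Niko":"Dean"]                # label slices include the end label
block = df.loc["Jane":"Penelope", ["state", "color"]]  # rows and columns together
fourth = df.iloc[3]                                # single row by position
picked = df.iloc[[4, 2, 3]]                        # multiple rows by position
pos_slice = df.iloc[3:5]                           # position slices exclude the end
```

Note the asymmetry: `.loc` slices include the last label, while `.iloc` slices stop before the end position, as ordinary Python slices do.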

Asking Questions About our Data

Thanks to Super Data Science, we can look at a training data set with some sales data and gain some insights from it. Let's take a look at the data as it appears in Excel. Upon initial inspection of the data, we can start thinking of some questions about it…

python, Pandas Categorize the range

    import pandas as pd

    # Equal-width bins: the price range is split into 3 intervals of equal size.
    df['PriceBin'] = pd.cut(df['PriceAvg'], bins=3)
    df['PriceBin'].value_counts()

    (54060.0, 2040000.0]      209
    (2040000.0, 4020000.0]      4
    (4020000.0, 6000000.0]      1
    Name: PriceBin, dtype: int64

    # Quantile bins: each interval holds roughly the same number of rows.
    df['PriceBin'] = pd.qcut(df['PriceAvg'], q=3)
    df['PriceBin'].value_counts().sort_index()

    (59999.999, 210000.0]     77
    (210000.0, 315000.0]      66
    (315000.0, 6000000.0]     71
    Name: PriceBin, dtype: int64

    # Median SalesAvg per price bin, returned as a two-column DataFrame.
    h = df.groupby('PriceBin')['SalesAvg'].median().reset_index()
    h

                    PriceBin    SalesAvg
    0  (59999.999, 210000.0]   42.000000
    1   (210000.0, 315000.0]  145.166667
    2  (315000.0, 6000000.0]  114.200000
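The difference between the two binning calls above is easier to see on a tiny series with one outlier. The values below are made up for illustration and are not the data from the post.

```python
import pandas as pd

# Hypothetical prices: eight small values and one large outlier.
prices = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 100], name="PriceAvg")

# pd.cut splits the value range into 3 equal-width intervals,
# so the outlier stretches the bins and most rows fall into the first one.
equal_width = pd.cut(prices, bins=3)

# pd.qcut splits at the quantiles, so each bin holds about the same number of rows.
equal_freq = pd.qcut(prices, q=3)

width_counts = equal_width.value_counts().sort_index().tolist()
freq_counts = equal_freq.value_counts().sort_index().tolist()
print(width_counts)
print(freq_counts)
```

This mirrors the counts in the post: equal-width bins put 209 of 214 rows into the first interval, while quantile bins give 77, 66, and 71.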

Useful Python pandas codes

– To rename a data frame column: df.rename(columns={"contract_id": "deal_id"}, inplace=True)
– Where statement: tips[tips['time'] == 'Dinner'].head(5)
– vlookup: mg = pd.merge(df, AgReg, on="deal_id", how="left")
– Choose the first column of an array, or the first part of a string with a delimiter: df["cat"] = df["CategoryID"].str.split(',', n=1).str[0]
– Filling na, NaN, or Null values: df["CategoryID"].fillna("", inplace=True)
– Convert to datetime: pd.to_datetime(df["start_date"], errors='ignore')
– Combination of where and select some.…
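The snippets above can be chained on a small example. The two tables below are hypothetical; only the column names (`contract_id`, `deal_id`, `CategoryID`, `start_date`) and the `AgReg` name follow the snippets, and the `region` column is made up.

```python
import pandas as pd

# Hypothetical tables; column names follow the snippets above.
df = pd.DataFrame({
    "contract_id": [1, 2, 3],
    "CategoryID": ["10,20", None, "30"],
    "start_date": ["2020-01-05", "2020-02-10", "2020-03-15"],
})
AgReg = pd.DataFrame({"deal_id": [1, 3], "region": ["north", "south"]})

# Rename a column so both tables share the join key.
df.rename(columns={"contract_id": "deal_id"}, inplace=True)

# Fill missing values first so the string methods below see no gaps.
df["CategoryID"] = df["CategoryID"].fillna("")

# Keep only the part before the first comma.
df["cat"] = df["CategoryID"].str.split(",", n=1).str[0]

# Parse the date strings into datetime values.
df["start_date"] = pd.to_datetime(df["start_date"])

# Left join, similar to a spreadsheet VLOOKUP: unmatched rows get NaN.
mg = pd.merge(df, AgReg, on="deal_id", how="left")

# "Where" filter: rows starting on or after 1 February 2020.
late = mg[mg["start_date"] >= "2020-02-01"]
```

Assigning the result of `fillna` back to the column, rather than calling it with `inplace=True` on a column selection, avoids chained-assignment warnings in recent pandas versions.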