A beginner’s guide to Linear Regression in Python with Scikit-Learn

There are two types of supervised machine learning algorithms: regression and classification. The former predicts continuous-valued outputs, while the latter predicts discrete outputs. For instance, predicting the price of a house in dollars is a regression problem, whereas predicting whether a tumor is malignant or benign is a classification problem. In this…
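As a rough illustration of the regression case described above, here is a minimal sketch using scikit-learn's LinearRegression. The house sizes and prices are made-up numbers (generated from price = 2000 * size + 50000), not data from the post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: house size in square meters vs. price in dollars.
# Prices follow price = 2000 * size + 50000 exactly, so the fit is easy to check.
sizes = np.array([[50.0], [80.0], [100.0], [120.0], [150.0]])
prices = 2000.0 * sizes.ravel() + 50000.0

# Fit a straight line to the data.
model = LinearRegression()
model.fit(sizes, prices)

# Predict a continuous value (a price) for an unseen house size.
predicted_price = model.predict(np.array([[110.0]]))[0]
print("slope:", model.coef_[0])
print("intercept:", model.intercept_)
print("predicted price for 110 m^2:", predicted_price)
```

The output is a continuous number, which is what makes this a regression problem; a classifier would return a label such as malignant or benign instead.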

Install Python and Jupyter Notebook to Windows 10 (64 bit)

This blog post is a step-by-step tutorial for installing Python and Jupyter Notebook on Windows 10 (64 bit). Installing the Jupyter Notebook requires Python 3.3 or greater, or Python 2.7.
1. Download Python 3.7.4 from https://www.python.org/downloads/release/python-374/
2. Choose “x86-64 executable installer” for a 64-bit Windows 10 computer.
3. Select a location to save the executable…
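Once Python is installed, the version requirement stated above can be checked from Python itself. This is a small sketch; the helper name meets_jupyter_requirement is hypothetical and only encodes the rule quoted in the post (Python 2.7, or 3.3 and later).

```python
import sys


def meets_jupyter_requirement(version_info=sys.version_info):
    """Return True for Python 2.7 or for Python 3.3 and later."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    return (major, minor) >= (3, 3)


if __name__ == "__main__":
    print("Running Python", sys.version.split()[0])
    print("Meets the stated requirement:", meets_jupyter_requirement())
```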

Calculating the maximum discount that can be offered to customers to bring them back

The calculation formula (in our view) is:
Maximum allocatable discount = average revenue per order + probability of further returns * average number of subsequent purchases * average revenue per order
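The formula above can be sketched as a small Python function. The function and parameter names are illustrative, and the example numbers are invented rather than taken from real sales data.

```python
def max_allocatable_discount(avg_revenue_per_order, return_probability, avg_future_orders):
    """Average revenue per order plus the expected revenue from later orders.

    Expected later revenue = probability of further returns
                             * average number of subsequent purchases
                             * average revenue per order.
    """
    expected_future_revenue = (
        return_probability * avg_future_orders * avg_revenue_per_order
    )
    return avg_revenue_per_order + expected_future_revenue


# Hypothetical inputs: 100 revenue per order, 50% chance of returning,
# 2 further purchases on average.
example_discount = max_allocatable_discount(100.0, 0.5, 2.0)
print(example_discount)
```

With these inputs the expected later revenue is 0.5 * 2 * 100 = 100, so the ceiling is 100 + 100 = 200.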

Selecting Subsets of Data in Pandas

What                                          Code
single column                                 df['food']
multiple columns                              df[['color', 'food', 'score']]
single row                                    df.loc['Niko']
multiple rows                                 df.loc[['Niko', 'Penelope']]
slice notation to select a range of rows      df.loc['Niko':'Dean']
                                              df.loc[:'Aaron']
stepping by 2                                 df.loc['Niko':'Christina':2]
rows and columns                              df.loc[row_selection, column_selection]
                                              df.loc['Jane':'Penelope', ['state', 'color']]
single row                                    df.iloc[3]
multiple rows                                 df.iloc[[5, 2, 4]]
                                              df.iloc[3:5]
                                              df.iloc[[2,3], [0,…
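To try the selections listed above, here is a minimal sketch on a small made-up DataFrame. The row labels and column names match the ones used in the excerpt, but the values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, indexed by name so that .loc can select by label.
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

single_column = df["food"]                       # a Series
multiple_columns = df[["color", "food", "score"]]

# .loc selects by label; label slices include BOTH endpoints.
label_slice = df.loc["Niko":"Dean"]
stepped = df.loc["Niko":"Christina":2]
rows_and_columns = df.loc["Jane":"Penelope", ["state", "color"]]

# .iloc selects by integer position; position slices exclude the stop.
position_slice = df.iloc[3:5]
picked_rows = df.iloc[[5, 2, 4]]

print(label_slice.index.tolist())
print(position_slice.index.tolist())
```

The difference worth noting is that the label slice keeps 'Dean', while df.iloc[3:5] stops before position 5.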

python, Pandas Categorize the range

df['PriceBin'] = pd.cut(df['PriceAvg'], bins=3)
df['PriceBin'].value_counts()
(54060.0, 2040000.0]      209
(2040000.0, 4020000.0]      4
(4020000.0, 6000000.0]      1
Name: PriceBin, dtype: int64

df['PriceBin'] = pd.qcut(df['PriceAvg'], q=3)
df['PriceBin'].value_counts().sort_index()
(59999.999, 210000.0]     77
(210000.0, 315000.0]      66
(315000.0, 6000000.0]     71
Name: PriceBin, dtype: int64

   PriceBin                  SalesAvg
0  (59999.999, 210000.0]     42.000000
1  (210000.0, 315000.0]      145.166667
2  (315000.0, 6000000.0]     114.200000
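The counts above show the difference between the two functions: pd.cut splits the value range into equal-width bins, while pd.qcut picks bin edges so that each bin holds roughly the same number of rows. Here is a minimal sketch on made-up prices with one large outlier; the values are assumptions, not the post's data.

```python
import pandas as pd

# Hypothetical prices: five small values and one large outlier.
prices = pd.Series([1, 2, 3, 4, 5, 100], name="PriceAvg")

# Equal-width bins: the outlier stretches the range, so most rows
# end up in the first bin.
equal_width = pd.cut(prices, bins=3)
equal_width_counts = equal_width.value_counts().sort_index()

# Equal-frequency bins: edges are taken from quantiles, so each bin
# receives about the same number of rows.
equal_freq = pd.qcut(prices, q=3)
equal_freq_counts = equal_freq.value_counts().sort_index()

print(equal_width_counts)
print(equal_freq_counts)
```

For skewed data such as prices, qcut usually gives more balanced groups, which is the pattern visible in the 209 / 4 / 1 versus 77 / 66 / 71 split above.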

Useful Python pandas codes

– Rename a column of the data frame: df.rename(columns={"contract_id": "deal_id"}, inplace=True)
– Where statement: tips[tips['time'] == 'Dinner'].head(5)
– vlookup: mg = pd.merge(df, AgReg, on="deal_id", how="left")
– Choose the first column of an array, or the first part of a string split on a delimiter: df["cat"] = df["CategoryID"].str.split(',', n=1).str[0]
– Fill NA, NaN, or null values: df["CategoryID"].fillna("", inplace=True)
– Convert to datetime: pd.to_datetime(df["start_date"], errors='ignore')
– Combination of where and select some.…
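The snippets above can be chained on a small made-up data set to see how they fit together. The contents of df and AgReg below are assumptions for illustration; only the column names come from the post. The sketch assigns results back instead of using inplace=True, which is a style choice rather than the post's approach.

```python
import pandas as pd

# Hypothetical input tables.
df = pd.DataFrame(
    {
        "contract_id": [1, 2, 3],
        "CategoryID": ["10,20", None, "30"],
        "start_date": ["2019-01-05", "2019-02-10", "2019-03-15"],
    }
)
AgReg = pd.DataFrame({"deal_id": [1, 3], "region": ["north", "south"]})

# Rename a column.
df = df.rename(columns={"contract_id": "deal_id"})

# vlookup-style left join: keep every row of df, attach region where it exists.
mg = pd.merge(df, AgReg, on="deal_id", how="left")

# Fill missing values, then keep the part of the string before the first comma.
mg["CategoryID"] = mg["CategoryID"].fillna("")
mg["cat"] = mg["CategoryID"].str.split(",", n=1).str[0]

# Convert text dates to datetime values.
mg["start_date"] = pd.to_datetime(mg["start_date"])

print(mg)
```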

Pandas V.S SQL

If you already know SQL and want to move to Python, this article can help.
Simple
  SQL:    SELECT total_bill, tip, smoker, time FROM tips LIMIT 5;
  Pandas: tips[['total_bill', 'tip', 'smoker', 'time']].head(5)
Where
  SQL:    SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5;
  Pandas: tips[tips['time'] == 'Dinner'].head(5)
Multiple conditions
  SQL:    SELECT * FROM tips WHERE…
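To check that the SQL and pandas forms above return the same rows, here is a minimal sketch that loads a tiny made-up tips table into an in-memory SQLite database and runs both versions. The rows are invented; only the table and column names follow the excerpt.

```python
import sqlite3

import pandas as pd

# Hypothetical sample of a tips table.
tips = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77],
        "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00],
        "smoker": ["No", "No", "Yes", "No", "Yes", "No", "No"],
        "time": ["Dinner", "Lunch", "Dinner", "Dinner", "Lunch", "Dinner", "Dinner"],
    }
)

conn = sqlite3.connect(":memory:")
tips.to_sql("tips", conn, index=False)

# Simple SELECT with LIMIT vs. column selection with head().
sql_simple = pd.read_sql_query(
    "SELECT total_bill, tip, smoker, time FROM tips LIMIT 5", conn
)
pandas_simple = tips[["total_bill", "tip", "smoker", "time"]].head(5)

# WHERE clause vs. boolean mask.
sql_where = pd.read_sql_query(
    "SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5", conn
)
pandas_where = tips[tips["time"] == "Dinner"].head(5)

conn.close()

print(sql_where)
print(pandas_where)
```

The pandas result keeps the original row labels (0, 2, 3, ...), while the SQL result is renumbered from 0; the row contents are the same.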