Tag: data science
10 Top Types of Data Analysis Methods
Data is everywhere around us. A report shows that people, things, and organizations are generating 2.5 quintillion bytes of data each day. It is a staggering figure indeed, but there is a clear explanation for it. For example, you are not only reading this post right now but also leaving digital traces about your content interests…
Pandas Transform and Filter
Split, Apply, Combine; Filter Data with Transform; Transform with Lambda; Filter with Pandas Groupby; Use Map to Create a New Column; See also
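The topics in this excerpt can be sketched in a few lines. This is a minimal illustration rather than the post's own code: the `store`/`sales` data, the `region_by_store` lookup, and the thresholds are all invented for the example.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B", "C"],
    "sales": [10, 30, 5, 15, 100],
})

# Split-apply-combine: split by store, apply mean, combine into one Series.
mean_by_store = df.groupby("store")["sales"].mean()

# transform returns a result aligned to the original rows,
# so it can be stored as a column or used as a boolean mask.
df["store_mean"] = df.groupby("store")["sales"].transform("mean")

# Transform with a lambda: each sale as a share of its store's total.
df["share"] = df.groupby("store")["sales"].transform(lambda s: s / s.sum())

# Filter data with transform: keep rows from stores with at least two sales.
row_counts = df.groupby("store")["sales"].transform("count")
multi_sale_rows = df[row_counts >= 2]

# Filter with groupby: keep whole groups whose total sales exceed 30.
big_stores = df.groupby("store").filter(lambda g: g["sales"].sum() > 30)

# Use map to create a new column from a lookup dictionary (hypothetical).
region_by_store = {"A": "north", "B": "south", "C": "north"}
df["region"] = df["store"].map(region_by_store)
```

The practical difference is that `transform` keeps one output row per input row, while `groupby(...).filter(...)` drops or keeps entire groups.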
Number Formatting in Pandas
To format numbers with thousands separators and show up to two decimal places in Python pandas, use this code. Or use this code for each column.
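The code itself is not part of this excerpt. One possible way to get that formatting is sketched below, as an assumption rather than the post's original snippet: a display option for all float columns, and a per-column `map` that produces formatted strings. The `amount` and `qty` columns are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({"amount": [1234567.891, 2500.5], "qty": [1200000, 35]})

# Option 1: a global display option. It changes how floats are printed,
# not the stored values.
pd.options.display.float_format = "{:,.2f}".format

# Option 2: format one column at a time. The result is a column of strings,
# so keep the numeric column if further calculations are needed.
df["amount_fmt"] = df["amount"].map("{:,.2f}".format)
df["qty_fmt"] = df["qty"].map("{:,}".format)
```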
pandas create new column based on values from other columns
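As a rough sketch of what this title describes, and not taken from the post, three common patterns are column arithmetic, a conditional value with `np.where`, and string concatenation. All column names and the threshold of 30 are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical order data, invented for illustration.
df = pd.DataFrame({
    "first": ["Ada", "Alan", "Grace"],
    "last": ["Lovelace", "Turing", "Hopper"],
    "price": [10.0, 20.0, 5.0],
    "quantity": [3, 1, 10],
})

# Arithmetic on whole columns.
df["total"] = df["price"] * df["quantity"]

# A conditional value derived from another column.
df["size"] = np.where(df["total"] >= 30, "large", "small")

# Combining two text columns.
df["full_name"] = df["first"] + " " + df["last"]
```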
Selecting Subsets of Data in Pandas
What | Code
single column | df['food']
multiple columns | df[['color', 'food', 'score']]
single row | df.loc['Niko']
multiple rows | df.loc[['Niko', 'Penelope']]
slice notation to select a range of rows | df.loc['Niko':'Dean'], df.loc[:'Aaron']
stepping by 2 | df.loc['Niko':'Christina':2]
rows and columns | df.loc[row_selection, column_selection], df.loc['Jane':'Penelope', ['state', 'color']]
single row | df.iloc[3]
multiple rows | df.iloc[[5, 2, 4]], df.iloc[3:5]
df.iloc[[2,3], [0,…
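To try these selections, a small DataFrame is needed. The one below is a hypothetical stand-in: the names and column labels come from the excerpt, but the row order and every value are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; only the labels appear in the excerpt.
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK", "TX"],
        "color": ["blue", "green", "red", "white", "gray", "black"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon"],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina"],
)

food = df["food"]                        # single column, returns a Series
subset = df[["color", "food", "score"]]  # list of columns, returns a DataFrame
niko = df.loc["Niko"]                    # single row by label
label_slice = df.loc["Niko":"Dean"]      # label slices include both endpoints
stepped = df.loc["Niko":"Christina":2]   # every second row in the range
block = df.loc["Jane":"Penelope", ["state", "color"]]  # rows and columns
fourth = df.iloc[3]                      # single row by integer position
picked = df.iloc[[5, 2, 4]]              # rows in the order requested
pos_slice = df.iloc[3:5]                 # position slices exclude the stop
```

Note the asymmetry: `.loc` slices are inclusive of the end label, while `.iloc` slices follow Python's usual half-open convention.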
Useful Python pandas codes
– Rename a column of the data frame: df.rename(columns={"contract_id": "deal_id"}, inplace=True)
– Where statement: tips[tips['time'] == 'Dinner'].head(5)
– vlookup: mg = pd.merge(df, AgReg, on="deal_id", how="left")
– Choose the first column of an array or the first part of a string with a delimiter: df["cat"] = df["CategoryID"].str.split(',', n=1).str[0]
– Filling NA, NaN, or null values: df["CategoryID"].fillna("", inplace=True)
– Convert to datetime: pd.to_datetime(df["start_date"], errors='ignore')
– Combination of where and select some.…
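Several of these snippets can be chained into one small runnable sketch. The contents of `df` and `AgReg` are invented for illustration, and the example assigns results back instead of using `inplace=True`, which is a stylistic choice here and not something the post prescribes.

```python
import pandas as pd

# Hypothetical tables; only the column names come from the snippets above.
df = pd.DataFrame({
    "contract_id": [1, 2, 3],
    "CategoryID": ["10,20", None, "30"],
    "start_date": ["2021-01-05", "2021-02-10", "2021-03-15"],
})
AgReg = pd.DataFrame({"deal_id": [1, 3], "region": ["east", "west"]})

# Rename the key column so both tables share "deal_id".
df = df.rename(columns={"contract_id": "deal_id"})

# Fill missing categories before splitting, so the string methods see text.
df["CategoryID"] = df["CategoryID"].fillna("")

# Keep the part before the first comma.
df["cat"] = df["CategoryID"].str.split(",", n=1).str[0]

# Parse the date strings into datetime values.
df["start_date"] = pd.to_datetime(df["start_date"])

# Left merge: every row of df is kept, like a spreadsheet VLOOKUP.
mg = pd.merge(df, AgReg, on="deal_id", how="left")
```

Rows of `df` with no match in `AgReg` stay in `mg` with a missing `region`.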
Process mining – Introduction 1
Process mining is the combination of data mining and business process management. It works with log files. Every log file must have:
– Case ID (order ID)
– Activity (purchased, Request, rejected, …)
– Time stamp
Process mining, Internet of events, Big data, Internet of contents (Google, Wikipedia), Social media, Internet of people, Cloud, Internet of things, Mobility, Internet…
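The three required fields can be illustrated with a tiny event log. This is a sketch under assumptions: the column names `case_id`, `activity`, and `timestamp` and all the rows are hypothetical, and pandas is used only as a convenient container, not as a process mining tool.

```python
import pandas as pd

# Hypothetical event log with the three required fields.
log = pd.DataFrame({
    "case_id": [1, 2, 1, 2],
    "activity": ["Request", "Request", "purchased", "rejected"],
    "timestamp": pd.to_datetime([
        "2021-01-01 09:00",
        "2021-01-01 09:05",
        "2021-01-01 10:00",
        "2021-01-02 08:00",
    ]),
})

# Order events within each case by time.
log = log.sort_values(["case_id", "timestamp"])

# A trace is the ordered list of activities for one case.
traces = log.groupby("case_id")["activity"].agg(list)

# Duration of each case: last timestamp minus first timestamp.
stamps = log.groupby("case_id")["timestamp"]
duration = stamps.max() - stamps.min()
```

The case ID groups events into one process instance, the activity names each step, and the timestamp gives the order, which is why all three are needed.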
Pandas vs. SQL
If you already know SQL and want to move to Python, this article can help.
Title | SQL | Pandas | Desc
Simple SELECT | SELECT total_bill, tip, smoker, time FROM tips LIMIT 5; | tips[['total_bill', 'tip', 'smoker', 'time']].head(5)
Where | SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5; | tips[tips['time'] == 'Dinner'].head(5)
Multiple conditions | SELECT * FROM tips WHERE…
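The mapping can be tried on a small stand-in for the `tips` table. The rows below are invented for illustration, and since the multiple-conditions query is cut off in the excerpt, the condition used for it here is a hypothetical example and not the article's.

```python
import pandas as pd

# Hypothetical stand-in for the tips table; values are invented.
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
    "smoker": ["No", "No", "No", "Yes", "No", "Yes"],
    "time": ["Dinner", "Lunch", "Dinner", "Dinner", "Lunch", "Dinner"],
})

# SELECT total_bill, tip, smoker, time FROM tips LIMIT 5;
selected = tips[["total_bill", "tip", "smoker", "time"]].head(5)

# SELECT * FROM tips WHERE time = 'Dinner' LIMIT 5;
dinner = tips[tips["time"] == "Dinner"].head(5)

# Hypothetical multiple-condition filter (not from the article):
# SELECT * FROM tips WHERE time = 'Dinner' AND tip > 3.00;
# Each condition needs its own parentheses, joined with & (AND) or | (OR).
dinner_big_tip = tips[(tips["time"] == "Dinner") & (tips["tip"] > 3.00)]
```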