Data Frame Data structure in Python pandas.pptx

Data Frames
A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table with
rows and columns.
import pandas as pd
data = {
"Marks": [80, 75, 90],
"Sub": ['Python', 'Java', 'Database']
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
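As a small sketch of what "two-dimensional" means here, the same data can be inspected through its shape and its row and column labels. Nothing below goes beyond standard DataFrame attributes; the comments describe what each one holds.

```python
import pandas as pd

df = pd.DataFrame({"Marks": [80, 75, 90],
                   "Sub": ['Python', 'Java', 'Database']})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels, taken from the dictionary keys
print(list(df.index))    # default row labels, counted from 0
```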

Data Frames
Locate Row
As you can see from the result above, the DataFrame is like a table with rows and
columns.
Pandas uses the loc attribute to return one or more specified rows.
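import pandas as pd
data = {
"Marks": [80, 75, 90],
}
df = pd.DataFrame(data)
# a single label returns that row as a Series
print(df.loc[0])
# a list of labels returns those rows as a DataFrame
print(df.loc[[0, 1]])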
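A minimal sketch of the difference between the two loc calls, reusing the Marks data from the slide: a single label gives back a Series, while a list of labels gives back a DataFrame. The names one_row and two_rows are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"Marks": [80, 75, 90]})

# A single label returns the row as a Series
one_row = df.loc[0]
print(type(one_row).__name__)

# A list of labels returns a DataFrame containing those rows
two_rows = df.loc[[0, 1]]
print(type(two_rows).__name__)
```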

Data Frames
Named Index
import pandas as pd
data = {
"Marks": [80, 75, 90],
}
df = pd.DataFrame(data,index= ["day1","day2","day3"])
print(df)
Locate Named Indexes
Use the named index in the loc attribute to return the specified row(s).
Example
Return "day2":
#refer to the named index:
print(df.loc["day2"])
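The same idea extends to several named labels. The sketch below reuses the day1..day3 index from the slide and shows a list of labels and a label slice; subset and span are illustrative names. Note that a label slice with loc includes both endpoints.

```python
import pandas as pd

df = pd.DataFrame({"Marks": [80, 75, 90]}, index=["day1", "day2", "day3"])

# Several named labels at once: pass a list
subset = df.loc[["day1", "day3"]]
print(subset)

# A label slice with loc includes BOTH endpoints
span = df.loc["day1":"day2"]
print(span)
```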

Data Frames
Load Files Into a DataFrame
If your data sets are stored in a file, Pandas can load them into a DataFrame.
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
# maximum number of rows pandas prints before it truncates the output
print(pd.options.display.max_rows)
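To try this without a file on disk, the sketch below feeds read_csv an in-memory text buffer. The CSV text and its column names are hypothetical stand-ins for the contents of 'data.csv'. It also shows one way to change the row limit temporarily with pd.option_context.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for the contents of 'data.csv'
csv_text = "Duration,Pulse\n60,110\n60,117\n45,109\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Temporarily raise the row limit used when printing large DataFrames;
# the previous setting is restored when the with-block ends
with pd.option_context("display.max_rows", 500):
    print(pd.options.display.max_rows)
```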

Data Frames
Read JSON
Big data sets are often stored or extracted as JSON.
JSON is plain text in the format of an object, and it is widely supported in the world of
programming, including by Pandas.
In our examples we will be using a JSON file called 'data.json'.
Use to_string() to print the entire DataFrame.
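A minimal sketch of reading JSON and printing it in full. The JSON text below is a hypothetical stand-in for the contents of 'data.json', wrapped in an in-memory buffer so the example runs without a file.

```python
import io
import pandas as pd

# Hypothetical JSON text standing in for the contents of 'data.json'
json_text = (
    '{"Duration": {"0": 60, "1": 60, "2": 45},'
    ' "Pulse": {"0": 110, "1": 117, "2": 109}}'
)

# read_json accepts a file path or a file-like object
df = pd.read_json(io.StringIO(json_text))

# to_string() renders every row and column, ignoring the display limits
print(df.to_string())
```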

Data Frames
import pandas as pd
data = {
"Duration":{
"0":60,
"1":60,
"2":60,
"3":45,
"4":45,
"5":60
},
"Pulse":{
"0":110,
"1":117,
"2":103,
"3":109,
"4":117,
"5":102
},
"Maxpulse":{
"0":130,
"1":145,
"2":135,
"3":175,
"4":148,
"5":127
},
"Calories":{
"0":409,
"1":479,
"2":340,
"3":282,
"4":406,
"5":300
}
}
# load the nested dictionary into a DataFrame
df = pd.DataFrame(data)
print(df)

Viewing the Data
• One of the most used methods for getting a quick overview of the
DataFrame is the head() method.
• The head() method returns the headers and a specified number of rows,
starting from the top.
import pandas as pd
df = pd.read_csv('data.csv')
# print the first 10 rows of the DataFrame:
print(df.head(10))
# print the first 5 rows of the DataFrame (the default):
print(df.head())

• There is also a tail() method for viewing the last rows of the
DataFrame.
• The tail() method returns the headers and a specified number of
rows, starting from the bottom.
• Example
• Print the last 5 rows of the DataFrame:
• print(df.tail())
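The sketch below runs head() and tail() on a small frame built in memory, so it does not depend on 'data.csv'. The 20-row frame and its column name n are hypothetical.

```python
import pandas as pd

# Hypothetical 20-row frame standing in for the CSV data
df = pd.DataFrame({"n": range(20)})

first_three = df.head(3)  # the first three rows
last_three = df.tail(3)   # the last three rows
print(first_three)
print(last_three)

# With no argument, both methods return 5 rows
print(len(df.head()), len(df.tail()))
```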

import pandas as pd
# making data frame from csv file
data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving the row at integer position 3 (the fourth row) with iloc
row2 = data.iloc[3]
print(row2)
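For comparison, here is a rough sketch of how iloc (integer position) relates to loc (index label) when a column is used as the index. The tiny frame below is a hypothetical stand-in for nba.csv; its names and numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv, indexed by a "Name" column
data = pd.DataFrame(
    {"Name": ["A", "B", "C", "D"], "Number": [4, 7, 11, 23]}
).set_index("Name")

by_position = data.iloc[3]  # fourth row, positions count from 0
by_label = data.loc["D"]    # the same row, looked up by its index label
print(by_position)
print(by_position.equals(by_label))
```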

# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists
# named "data" rather than "dict" so the built-in dict type is not shadowed
data = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, np.nan, 45, 56],
'Third Score':[52, 40, 80, 98],
'Fourth Score':[np.nan, np.nan, np.nan, 65]}
# creating a dataframe from dictionary
df = pd.DataFrame(data)
print(df)

• Dropping missing values using dropna():
• To drop null values from a DataFrame, we use the dropna() function. It drops rows or columns that contain null
values, and its parameters control how this is done.
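import pandas as pd
import numpy as np
data = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, np.nan, 45, 56],
'Third Score':[52, 40, 80, 98],
'Fourth Score':[np.nan, np.nan, np.nan, 65]}
df = pd.DataFrame(data)
print(df)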

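• Now we drop rows with at least one NaN value (null value).
import pandas as pd
import numpy as np
data = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, np.nan, 45, 56],
'Third Score':[52, 40, 80, 98],
'Fourth Score':[np.nan, np.nan, np.nan, 65]}
df = pd.DataFrame(data)
# using dropna() function
print(df.dropna())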
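The "different ways" of dropping can be sketched with the axis, thresh, and how parameters of dropna(), applied to the same score data. The result variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'First Score': [100, 90, np.nan, 95],
                   'Second Score': [30, np.nan, 45, 56],
                   'Third Score': [52, 40, 80, 98],
                   'Fourth Score': [np.nan, np.nan, np.nan, 65]})

rows_complete = df.dropna()            # keep rows with no NaN at all
cols_complete = df.dropna(axis=1)      # keep columns with no NaN at all
rows_thresh = df.dropna(thresh=3)      # keep rows with at least 3 non-NaN values
rows_not_empty = df.dropna(how='all')  # drop only rows that are entirely NaN

print(rows_complete)
print(cols_complete)
print(rows_thresh)
print(rows_not_empty)
```

None of these calls modifies df; each returns a new DataFrame.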

• Iterating over rows and columns
• Iteration is a general term for taking each item of something, one
after another. A Pandas DataFrame consists of rows and columns, and
iterating over the DataFrame directly yields its column labels, much like
iterating over a dictionary.
• Iterating over rows:
• To iterate over rows, we can use iterrows() or itertuples(). The related
items() method (named iteritems() before pandas 2.0) iterates over columns
instead.
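A minimal sketch of dictionary-style iteration over a DataFrame. It assumes pandas 2.0 or later, where items() is the column iterator (older releases also offered iteritems() for the same purpose). The two-row frame is a shortened, illustrative version of the data used in the examples.

```python
import pandas as pd

df = pd.DataFrame({'name': ["aparna", "pankaj"], 'score': [90, 40]})

# Iterating the DataFrame directly yields column labels, like dict keys
labels = [label for label in df]
print(labels)

# items() yields (column label, column as a Series) pairs
for label, column in df.items():
    print(label, column.tolist())
```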

import pandas as pd
data = {'name':["aparna", "pankaj", "sudhir", "Geeku"],
'degree': ["MBA", "BCA", "M.Tech", "MBA"],
'score':[90, 40, 80, 98]}
# creating a dataframe from a dictionary
df = pd.DataFrame(data)
print(df)

Now we apply the iterrows() function to get each row in turn, as a pair of index label and row contents.
import pandas as pd
data = {'name':["aparna", "pankaj", "sudhir", "Geeku"],
'degree': ["MBA", "BCA", "M.Tech", "MBA"],
'score':[90, 40, 80, 98]}
# creating a dataframe from a dictionary
df = pd.DataFrame(data)
# iterating over rows using iterrows() function
# i is the index label, j is the row as a Series
for i, j in df.iterrows():
    print(i, j)
    print()
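itertuples() can be sketched on the same data. Each row arrives as a namedtuple, so values are read by attribute. The pass mark of 50 and the list name passed are assumptions made only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                   'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                   'score': [90, 40, 80, 98]})

# Each row is a namedtuple; index=False leaves the index label out of it
passed = []
for row in df.itertuples(index=False):
    if row.score >= 50:  # hypothetical pass mark, chosen for illustration
        passed.append(row.name)
print(passed)
```

For plain read-only loops like this one, itertuples() is generally faster than iterrows(), because it does not build a Series for every row.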
