Class 12 IP NCERT Solutions Chapter 2 | Data Handling Using Pandas Solutions

1. What is a Series and how is it different from a 1-D array, a list and a dictionary?
  • A Series is a one-dimensional data structure provided by the Python pandas library.
  • It holds a sequence of homogeneous values of any one data type, such as int, float or str.
  • It is value mutable but size immutable.
  • Every element of a Series is associated with a data label called its index.
  • Index labels need not be integers; values of other data types, such as strings, can also be assigned as the index.

The following table shows how a Series differs from a 1-D array, a list and a dictionary.

| Series | 1-D array | List | Dictionary |
| --- | --- | --- | --- |
| Contains homogeneous data | Contains homogeneous data | Can contain heterogeneous data | Can contain heterogeneous data |
| Default indexing begins with numerical value 0 | Default indexing begins with numerical value 0 | Default indexing begins with numerical value 0 | Each value is associated with a key defined manually |
| Values of other data types can also be assigned as index | Values of other data types cannot be assigned as index | Values of other data types cannot be assigned as index | The key is treated as the index and can be a value of any hashable type |
| Size immutable | Size immutable | Size mutable | Size mutable |
| Mathematical operations can be performed directly | Mathematical operations can be performed directly | Mathematical operations cannot be performed directly | Mathematical operations cannot be performed directly |
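
The last row of the table can be checked with a short sketch. It assumes pandas and NumPy are installed; the variable names and values are only illustrative and are not part of the textbook answer.

```python
import numpy as np
import pandas as pd

values = [10, 20, 30]

lst = values * 2                      # list: * repeats the list, no arithmetic
arr = np.array(values) * 2            # 1-D array: element-wise multiplication
ser = pd.Series(values, index=["x", "y", "z"]) * 2   # Series: element-wise, labels kept
dct = dict(zip(["x", "y", "z"], values))              # dictionary: values reached by key only

print(lst)           # [10, 20, 30, 10, 20, 30]
print(arr.tolist())  # [20, 40, 60]
print(ser["y"])      # 40
print(dct["y"])      # 20
```

The Series behaves like the array for arithmetic and like the dictionary for label-based access.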
2. What is a DataFrame and how is it different from a 2-D array?
  • A DataFrame is a two-dimensional data structure provided by the Python pandas library.
  • It can hold heterogeneous data in tabular form, like a spreadsheet or a table in MySQL.
  • It is both value mutable and size mutable.
  • It is a labelled data structure in which both rows and columns are indexed.
  • Values of other data types, such as strings, can also be assigned as row and column labels.

The following table shows how a DataFrame differs from a 2-D array:

| DataFrame | 2-D array |
| --- | --- |
| Has a default numerical index that can be relabelled with values of other types | Has a default numerical index that cannot be relabelled with values of other types |
| Can store heterogeneous data | Stores homogeneous data |
| Can deal with dynamic data and mixed data types | Better suited to numerical data |
| Size mutable | Size immutable |
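
A minimal sketch of these differences, assuming pandas and NumPy are installed. The names, marks and labels used here are made up for illustration.

```python
import numpy as np
import pandas as pd

# DataFrame: every column keeps its own data type, rows and columns are labelled
df = pd.DataFrame({"name": ["Asha", "Ravi"], "marks": [82.5, 91.0]},
                  index=["r1", "r2"])
df["passed"] = [True, True]               # adds a column in place (size mutable)

# 2-D array: one data type for all elements, fixed size
arr = np.array([[1, 2], [3, 4]])
arr2 = np.append(arr, [[5, 6]], axis=0)   # builds a new array; arr is unchanged

print(df.dtypes)
print(df.shape, arr.shape, arr2.shape)    # (2, 3) (2, 2) (3, 2)
```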
3. How are DataFrames related to Series?

A DataFrame is related to a Series in the following ways:

  • Both are data structures of the Python pandas library.
  • A DataFrame can be created from one or more Series, and each column of a DataFrame is itself a Series.
  • Both can be labelled with values of different types.
  • Both can deal with dynamic data and mixed data types.
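
The relationship can be sketched as follows, assuming pandas is installed. The subject names and marks are illustrative data only.

```python
import pandas as pd

# Two Series sharing the same index labels
maths = pd.Series([78, 91, 65], index=["ram", "hari", "kabir"])
science = pd.Series([88, 72, 95], index=["ram", "hari", "kabir"])

# Each Series becomes one column of the DataFrame
marks = pd.DataFrame({"maths": maths, "science": science})

col = marks["science"]          # selecting one column gives back a Series
print(type(col).__name__)       # Series
print(col["hari"])              # 72
```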
4. What do you understand by the size of (i) a Series, (ii) a DataFrame?

(i) The size of a Series is the total number of elements present in it, including NaN values. Consider the following example:
import pandas as pd
import numpy as np
s = pd.Series([2, 5, 6, np.nan, 8])
print(s.size)

output:
5

(ii) The size of a DataFrame is the total number of elements in it, which is the product of its number of rows and its number of columns. Consider the following example:
df = pd.DataFrame({'a': [4, np.nan, 7], 'b': [6, 2, np.nan]})
print(df.size)

output:
6
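
As a further sketch (assuming pandas and NumPy are installed), size can be compared with count(), which leaves out missing values, and with shape, whose product gives the size of a DataFrame.

```python
import numpy as np
import pandas as pd

s = pd.Series([2, 5, 6, np.nan, 8])
df = pd.DataFrame({"a": [4, np.nan, 7], "b": [6, 2, np.nan]})

print(s.size, s.count())      # 5 4  (size counts NaN, count() does not)
print(df.shape, df.size)      # (3, 2) 6  (3 rows x 2 columns)
print(df.count().sum())       # 4  (non-missing values over all columns)
```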

5. Create the following Series and do the specified operations:
a) EngAlph, having 26 elements with the alphabets as values and default index values.

import pandas as pd
# chr(97) is 'a' and chr(122) is 'z'
EngAlph = pd.Series([chr(i) for i in range(97, 123)])
print(EngAlph)

b) Vowels, having 5 elements with index labels ‘a’, ‘e’, ‘i’, ‘o’ and ‘u’ and all the five values set to zero. Check if it is an empty series.

import pandas as pd
Vowels = pd.Series(0, index=['a', 'e', 'i', 'o', 'u'])
print(Vowels)
if Vowels.empty:
    print("Empty Series")
else:
    print("Series is not empty")

c) Friends, from a dictionary having roll numbers of five of your friends as data and their first name as keys.

import pandas as pd
Friends = pd.Series({'ram': 1, 'hari': 2, 'raheem': 3, 'kabir': 4, 'rasool': 5})
print(Friends)

d) MTseries, an empty Series. Check if it is an empty series.

import pandas as pd
# An explicit dtype avoids a warning in some pandas versions
MTseries = pd.Series(dtype='float64')
if MTseries.empty:
    print("Empty Series")
else:
    print("Series is not empty")
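
A small sketch, assuming pandas and NumPy are installed, of what the empty attribute actually checks: it is True only when the Series has no elements at all. A Series that holds only NaN values is not empty. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

no_values = pd.Series(dtype="float64")     # no elements at all
only_nan = pd.Series([np.nan, np.nan])     # two elements, both missing

print(no_values.empty, no_values.size)     # True 0
print(only_nan.empty, only_nan.size)       # False 2
```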

e) MonthDays, from a numpy array having the number of days in the 12 months of a year. The labels should be the month numbers from 1 to 12.

import pandas as pd
import numpy as np
MonthDays = pd.Series(np.array([31,28,31,30,31,30,31,31,30,31,30,31]),range(1,13))
print(MonthDays)

6. Using the Series created in Question 5, write commands for the following:
a) Set all the values of Vowels to 10 and display the Series.

Vowels[:] = 10
print(Vowels)

b) Divide all values of Vowels by 2 and display the Series.

Vowels = Vowels/2
print(Vowels)

c) Create another series Vowels1 having 5 elements with index labels ‘a’, ‘e’, ‘i’, ‘o’ and ‘u’ having values [2,5,6,3,8] respectively.

import pandas as pd
Vowels1 = pd.Series([2, 5, 6, 3, 8], index=['a', 'e', 'i', 'o', 'u'])
print(Vowels1)

d) Add Vowels and Vowels1 and assign the result to Vowels3.

# Vowels and Vowels1 are the Series from parts (b) and (c)
Vowels3 = Vowels + Vowels1
print(Vowels3)

e) Subtract, Multiply and Divide Vowels by Vowels1.

print(Vowels - Vowels1)
print(Vowels * Vowels1)
print(Vowels / Vowels1)

f) Alter the labels of Vowels1 to [‘A’, ‘E’, ‘I’, ‘O’, ‘U’].

Vowels1.index = ['A', 'E', 'I', 'O', 'U']
print(Vowels1)
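
Arithmetic between two Series matches values by index label, not by position, so changing the labels changes the result of the operations above. A minimal sketch, assuming pandas is installed and that Vowels holds 5.0 for every label, as it does after parts (a) and (b):

```python
import pandas as pd

labels = ["a", "e", "i", "o", "u"]
Vowels = pd.Series(5.0, index=labels)
Vowels1 = pd.Series([2, 5, 6, 3, 8], index=labels)

same_labels = Vowels + Vowels1             # labels match, so values are added

Vowels1.index = ["A", "E", "I", "O", "U"]
different_labels = Vowels + Vowels1        # no label in common

print(same_labels.tolist())                # [7.0, 10.0, 11.0, 8.0, 13.0]
print(different_labels.size)               # 10
print(different_labels.isna().all())       # True
```

After the labels are altered, the sum has ten labels (the union of both indexes) and every value is NaN, because no label is present in both Series.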

7. Using the Series created in Question 5, write commands for the following:
a) Find the dimensions, size and values of the Series EngAlph, Vowels, Friends, MTseries, and MonthDays

To find the dimensions, size and values of the Series object we can use shape, size and values attributes respectively as given below:

print("Dimension, size and values of EngAlph", EngAlph.shape, EngAlph.size, EngAlph.values)
print("Dimension, size and values of Vowels", Vowels.shape, Vowels.size, Vowels.values)
print("Dimension, size and values of Friends", Friends.shape, Friends.size, Friends.values)
print("Dimension, size and values of MTseries", MTseries.shape, MTseries.size, MTseries.values)
print("Dimension, size and values of MonthDays", MonthDays.shape, MonthDays.size, MonthDays.values)

b) Rename the Series MTseries as SeriesEmpty.

We can rename the Series MTseries as SeriesEmpty using the name attribute as given below:

MTseries.name = 'SeriesEmpty'

c) Name the index of the Series MonthDays as monthno and that of Series Friends as Fname.

To name the index of the MonthDays as monthno we can write:

MonthDays.index.name = “monthno”

And to name the index of the Friends as Fname we can write:

Friends.index.name = "Fname"

d) Display the 3rd and 2nd value of the Series Friends, in that order.

We can display the 3rd and 2nd values of the Series Friends, in that order, in two ways. Because Friends has string labels, positions are given through iloc:

Using index positions:

print("3rd and 2nd values of the Series Friends are", Friends.iloc[2], Friends.iloc[1])

Using a slice:

print("3rd and 2nd values of the Series Friends are", Friends.iloc[2:0:-1])
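
The same two values can be picked by position or by label. The sketch below assumes pandas is installed and uses the Friends data from Question 5(c); by_position and by_label are illustrative names.

```python
import pandas as pd

Friends = pd.Series({"ram": 1, "hari": 2, "raheem": 3, "kabir": 4, "rasool": 5})

by_position = Friends.iloc[[2, 1]]            # 3rd and 2nd elements, by position
by_label = Friends.loc[["raheem", "hari"]]    # the same elements, by label

print(by_position.tolist())                   # [3, 2]
print(by_position.equals(by_label))           # True
```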

e) Display the alphabets ‘e’ to ‘p’ from the Series EngAlph.

To display the alphabets ‘e’ to ‘p’ from the Series EngAlph, we can write:

print(EngAlph[4:16])

f) Display the first 10 values in the Series EngAlph.

We can display the first 10 values in the Series EngAlph in following ways:

print(EngAlph.head(10))

OR

print(EngAlph[:10])

g) Display the last 10 values in the Series EngAlph.

We can display the last 10 values in the Series EngAlph as:

print(EngAlph.tail(10))

h) Display the MTseries

print(MTseries)

8. Using the Series created in Question 5, write commands for the following:
a) Display the names of the months 3 through 7 from the Series MonthDays.

# Label-based slice: both month 3 and month 7 are included
print(MonthDays.loc[3:7])

b) Display the Series MonthDays in reverse order.

print(MonthDays[::-1])

9. Create the following DataFrame Sales containing year wise sales figures for five sales persons in INR. Use the years as column labels and sales person names as row labels.
(Image Source: NCERT Textbook)

We can create the DataFrame Sales in various ways, as given below:

Using a 2D dictionary whose values are lists:

import pandas as pd
D = {2014: [100.5, 150.8, 200.9, 30000, 40000],
     2015: [12000, 18000, 22000, 30000, 45000],
     2016: [20000, 50000, 70000, 100000, 125000],
     2017: [50000, 60000, 70000, 80000, 90000]}
Sales = pd.DataFrame(D, index=['Madhu', 'Kusum', 'Kinshuk', 'Ankit', 'Shruti'])

Using a 2D dictionary whose values are dictionary objects:

import pandas as pd
D = {2014: {'Madhu': 100.5, 'Kusum': 150.8, 'Kinshuk': 200.9, 'Ankit': 30000, 'Shruti': 40000},
     2015: {'Madhu': 12000, 'Kusum': 18000, 'Kinshuk': 22000, 'Ankit': 30000, 'Shruti': 45000},
     2016: {'Madhu': 20000, 'Kusum': 50000, 'Kinshuk': 70000, 'Ankit': 100000, 'Shruti': 125000},
     2017: {'Madhu': 50000, 'Kusum': 60000, 'Kinshuk': 70000, 'Ankit': 80000, 'Shruti': 90000}}
Sales = pd.DataFrame(D)
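
Both methods should give the same table. The sketch below, assuming pandas is installed, checks this for the first two years only to keep it short; from_lists and from_dicts are illustrative names.

```python
import pandas as pd

names = ["Madhu", "Kusum", "Kinshuk", "Ankit", "Shruti"]

# Method 1: dictionary of lists plus an explicit row index
from_lists = pd.DataFrame({2014: [100.5, 150.8, 200.9, 30000, 40000],
                           2015: [12000, 18000, 22000, 30000, 45000]},
                          index=names)

# Method 2: dictionary of dictionaries, where the inner keys become row labels
from_dicts = pd.DataFrame({
    2014: {"Madhu": 100.5, "Kusum": 150.8, "Kinshuk": 200.9, "Ankit": 30000, "Shruti": 40000},
    2015: {"Madhu": 12000, "Kusum": 18000, "Kinshuk": 22000, "Ankit": 30000, "Shruti": 45000},
})

# check_like=True ignores row/column order, check_dtype=False ignores int/float differences
pd.testing.assert_frame_equal(from_lists, from_dicts, check_like=True, check_dtype=False)
print(from_dicts.loc["Kinshuk", 2015])
```

assert_frame_equal raises an error if the two DataFrames differ, so reaching the print statement means they match.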

10. Use the DataFrame created in question 9 above to do the following:
a) Display the row labels of Sales

print(Sales.index)

b) Display the column labels of Sales

print(Sales.columns)

c) Display the data types of each column of Sales

print(Sales.dtypes)

d) Display the dimensions, shape, size and values of Sales

We can use the ndim, shape, size and values attributes of the DataFrame to display its dimensions, shape, size and values as given below:

print("Dimensions, shape, size and values of Sales:", Sales.ndim, Sales.shape, Sales.size, Sales.values)

e) Display the last two rows of Sales

print(Sales.tail(2))

f) Display the first two columns of Sales

print(Sales.iloc[:, :2])

g) Create a dictionary using the following data. Use this dictionary to create a DataFrame Sales2.
(Image Source: NCERT Textbook)

import pandas as pd
D = {2018: {'Madhu': 160000, 'Kusum': 110000, 'Kinshuk': 500000, 'Ankit': 340000, 'Shruti': 900000}}
Sales2 = pd.DataFrame(D)

OR

import pandas as pd
D = {2018: [160000, 110000, 500000, 340000, 900000]}
Sales2 = pd.DataFrame(D, index=['Madhu', 'Kusum', 'Kinshuk', 'Ankit', 'Shruti'])

h) Check if Sales2 is empty or it contains data

if Sales2.empty:
    print('Sales2 is empty')
else:
    print('It contains data')

11. Use the DataFrame created in Question 9 above to do the following:
a) Append the DataFrame Sales2 to the DataFrame Sales.

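In earlier versions of pandas, the append() method was used to add the rows of one DataFrame to another, as given below:

Sales = Sales.append(Sales2)

However, append() was deprecated in pandas 1.4 and removed in pandas 2.0. The concat() function is now used instead to join two DataFrames, as given below:

Sales = pd.concat([Sales, Sales2], axis=0)

With axis=0 the rows of Sales2 are placed below the rows of Sales, just as append() did. Since Sales2 holds a new year for the same sales persons, passing axis=1 instead adds 2018 as a new column without repeating the row labels.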
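
The effect of the axis argument can be seen on a cut-down version of the two DataFrames. This is only a sketch, assuming pandas is installed; it uses two sales persons and two years from the data above, and rows and cols are illustrative names.

```python
import pandas as pd

names = ["Madhu", "Kusum"]
Sales = pd.DataFrame({2016: [20000, 50000], 2017: [50000, 60000]}, index=names)
Sales2 = pd.DataFrame({2018: [160000, 110000]}, index=names)

rows = pd.concat([Sales, Sales2], axis=0)   # stacks rows, as append() did
cols = pd.concat([Sales, Sales2], axis=1)   # aligns on row labels, adds 2018 as a column

print(rows.shape)                   # (4, 3)
print(cols.shape)                   # (2, 3)
print(rows.isna().sum().sum())      # 6
print(cols.isna().sum().sum())      # 0
```

With axis=0 each row label appears twice and the cells that have no data are filled with NaN. With axis=1 the rows line up by name and no values are missing.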

b) Change the DataFrame Sales such that it becomes its transpose

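The T attribute gives the transpose of Sales. To display it:

print(Sales.T)

Sales.T does not change Sales itself. To make Sales its own transpose, assign the result back with Sales = Sales.T.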

c) Display the sales made by all sales persons in the year 2017.

print(Sales[2017])

OR

print(Sales.loc[:, 2017])

d) Display the sales made by Madhu and Ankit in the year 2017 and 2018.

print(Sales.loc[['Madhu', 'Ankit'], [2017, 2018]])

e) Display the sales made by Shruti in 2016.

print(Sales.at['Shruti', 2016])

OR

print(Sales[2016]['Shruti'])

OR

print(Sales.loc['Shruti', 2016])

f) Add data to Sales for salesman Sumeet where the sales made are [196.2, 37800, 52000, 78438, 38852] in the years [2014, 2015, 2016, 2017, 2018] respectively.

Sales.loc['Sumeet', :] = [196.2, 37800, 52000, 78438, 38852]

g) Delete the data for the year 2014 from the DataFrame Sales.

del Sales[2014]

OR

Sales = Sales.drop(2014, axis=1)

h) Delete the data for sales man Kinshuk from the DataFrame Sales.

Sales = Sales.drop('Kinshuk')

i) Change the name of the salesperson Ankit to Vivaan and Madhu to Shailesh.

Sales.rename(index={'Ankit': 'Vivaan', 'Madhu': 'Shailesh'}, inplace=True)

j) Update the sale made by Shailesh in 2018 to 100000.

Sales.loc['Shailesh', 2018] = 100000

OR

Sales.at['Shailesh', 2018] = 100000

k) Write the values of DataFrame Sales to a comma separated file SalesFigures.csv on the disk. Do not write the row labels and column labels.

Sales.to_csv('e:\\programs\\python\\SalesFigures.csv', header=False, index=False)

l) Read the data in the file SalesFigures.csv into a DataFrame SalesRetrieved and Display it. Now update the row labels and column labels of SalesRetrieved to be the same as that of Sales.

SalesRetrieved = pd.read_csv('e:\\programs\\python\\SalesFigures.csv', header=None)
print(SalesRetrieved)
SalesRetrieved.index = Sales.index        # same row labels as Sales
SalesRetrieved.columns = Sales.columns    # same column labels as Sales
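
Parts (k) and (l) together form a round trip. The sketch below, assuming pandas is installed, shows it on a small two-row DataFrame and uses an in-memory text buffer in place of the file SalesFigures.csv, so that it runs without writing to disk.

```python
import io
import pandas as pd

Sales = pd.DataFrame({2017: [50000, 60000], 2018: [160000, 110000]},
                     index=["Madhu", "Kusum"])

buffer = io.StringIO()                               # stands in for SalesFigures.csv
Sales.to_csv(buffer, header=False, index=False)      # writes the values only
buffer.seek(0)                                       # go back to the start to read

SalesRetrieved = pd.read_csv(buffer, header=None)    # gets default labels 0, 1, ...
print(SalesRetrieved)

SalesRetrieved.index = Sales.index                   # restore the row labels
SalesRetrieved.columns = Sales.columns               # restore the column labels
print(SalesRetrieved.equals(Sales))                  # True
```

Passing header=None tells read_csv that the first line of the file is data, not column names, so no row of values is lost.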
