Data Handling Using Pandas - 2 Preeti Arora Class 12 Informatics Practices (IP) Solution
Q1. What do the quantile() and var() functions do?
Q2. What is a quartile? How is it different from quantile?
Q3. How do you create quantiles and quartiles in Python Pandas?
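As a minimal sketch for Q1 to Q3: the `marks` series below is made-up illustration data, not taken from the textbook. `Series.quantile()` accepts a fraction between 0 and 1, quartiles are the quantiles at 0.25, 0.5 and 0.75, and `var()` returns the sample variance (pandas divides by n - 1 by default).

```python
import pandas as pd

# Hypothetical marks, used only for illustration.
marks = pd.Series([10, 20, 30, 40, 50])

# A single quantile: the 0.5 quantile is the median.
median = marks.quantile(0.5)

# Quartiles are the quantiles at 25%, 50% and 75%.
quartiles = marks.quantile([0.25, 0.5, 0.75])

# Sample variance (ddof=1 by default).
variance = marks.var()

print(median)
print(quartiles)
print(variance)
```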
Q4. What is pivoting? How is it useful?
Q5. Which pivoting function can work with duplicate values?
Q6. What is the use of aggregation?
Q7. How useful is sorting and grouping?
Q8. How is pivot_table() different from pivot() when both perform pivoting?
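For Q4 to Q8, the difference between the two pivoting functions can be sketched with a small frame that contains a duplicate (Tutor, Country) pair. The rows reuse three records from the Q22 data; the variable names are illustrative. With duplicates, `pivot()` raises a `ValueError`, while `pivot_table()` combines them with an aggregation function (here `'sum'`, chosen only for the example).

```python
import pandas as pd

# Tahira/USA appears twice, so the index/column pair is not unique.
sales = pd.DataFrame({
    'Tutor': ['Tahira', 'Tahira', 'Jacob'],
    'Country': ['USA', 'USA', 'UK'],
    'Classes': [28, 32, 36],
})

# pivot() cannot reshape when an index/column pair is duplicated.
try:
    sales.pivot(index='Tutor', columns='Country', values='Classes')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicates instead of failing.
table = sales.pivot_table(index='Tutor', columns='Country',
                          values='Classes', aggfunc='sum')
print(pivot_failed)
print(table)
```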
Q9. Write a program to create two dataframes with the following data:
df1
Emp_code  Name
110       Taksh
112       Jeet Arora
114       Shubham Jain
df2
Emp_code  Name          Salary
110       Taksh         45000
112       Jeet Arora    56000
114       Shubham Jain  55000
Store these two dataframes as two separate table files inside the same database.
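One possible sketch for Q9 follows. It assumes an in-memory SQLite database so that it runs without a database server; the textbook may intend MySQL instead, and the table names `employee` and `salary` are hypothetical. `DataFrame.to_sql()` writes each dataframe as its own table over the same connection.

```python
import sqlite3
import pandas as pd

df1 = pd.DataFrame({
    'Emp_code': [110, 112, 114],
    'Name': ['Taksh', 'Jeet Arora', 'Shubham Jain'],
})
df2 = pd.DataFrame({
    'Emp_code': [110, 112, 114],
    'Name': ['Taksh', 'Jeet Arora', 'Shubham Jain'],
    'Salary': [45000, 56000, 55000],
})

# Assumption: SQLite in memory stands in for the real database.
conn = sqlite3.connect(':memory:')

# Two separate tables inside the same database (names are hypothetical).
df1.to_sql('employee', conn, index=False, if_exists='replace')
df2.to_sql('salary', conn, index=False, if_exists='replace')

# Read back to confirm both tables exist.
tables = pd.read_sql_query(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name", conn)
stored = pd.read_sql_query('SELECT * FROM salary', conn)
conn.close()

print(tables)
print(stored)
```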
Q10. What is the use of creating groups?
Q11. How are agg() and transform() similar and different?
Q12. How is reindexing useful?
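A small sketch of reindexing for Q12: `reindex()` conforms a dataframe to a new set of labels, reordering existing rows and filling labels that were not present with NaN. The house names and points come from the Q19 table; the label `'Nehru'` is hypothetical and is added only to show the NaN fill.

```python
import pandas as pd

points = pd.DataFrame({'Points': [500, 600, 500]},
                      index=['Raman', 'Tagore', 'Ashok'])

# Reorder the rows and ask for one label that does not exist yet.
reordered = points.reindex(['Ashok', 'Raman', 'Tagore', 'Nehru'])
print(reordered)
```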
Q13. How can we print a specific number of rows from a dataframe?
Q14. Write a program to print data from a column and find out the maximum value.
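Q13 and Q14 can be sketched together. The frame reuses the df2 data from Q9 (an arbitrary choice for illustration): `head(n)` and `tail(n)` return the first and last n rows, and `max()` on a column returns its largest value.

```python
import pandas as pd

df = pd.DataFrame({
    'Emp_code': [110, 112, 114],
    'Name': ['Taksh', 'Jeet Arora', 'Shubham Jain'],
    'Salary': [45000, 56000, 55000],
})

# Q13: a specific number of rows from the top or the bottom.
first_two = df.head(2)
last_one = df.tail(1)
print(first_two)
print(last_one)

# Q14: print one column, then its maximum value.
print(df['Salary'])
highest = df['Salary'].max()
print('Maximum salary:', highest)
```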
Q15. Give examples to implement the functions pipe(), apply(), aggregation (groupby), transform() and applymap().
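A hedged sketch for Q15 follows. The marks come from the first three rows of the Q20 table, while the `section` column and the `add_bonus` helper are hypothetical. Newer pandas releases renamed `DataFrame.applymap()` to `DataFrame.map()`, so the sketch uses whichever one the installed version provides.

```python
import pandas as pd

marks = pd.DataFrame({
    'section': ['A', 'A', 'B'],        # hypothetical grouping column
    'physics': [90, 40, 50],
    'chem': [75, 80, 60],
})
scores = marks[['physics', 'chem']]

def add_bonus(frame, bonus):
    """Hypothetical helper: add the same bonus to every mark."""
    return frame + bonus

# pipe(): pass the whole dataframe through a function.
piped = scores.pipe(add_bonus, bonus=5)

# apply(): run a function on each column (axis=0 by default).
spread = scores.apply(lambda col: col.max() - col.min())

# Aggregation with groupby: one value per group.
section_mean = marks.groupby('section')['physics'].agg('mean')

# transform(): one value per original row, aligned with the input.
section_avg = marks.groupby('section')['physics'].transform('mean')

# applymap(): element-wise function (called map() in newer pandas).
elementwise = scores.map if hasattr(scores, 'map') else scores.applymap
halved = elementwise(lambda value: value / 2)

print(piped, spread, section_mean, section_avg, halved, sep='\n\n')
```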
Q16. Write a Python program to select the 'name' and 'score' columns from the following dataframe.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Q17. Write a Python program to select the specified columns and rows from a given dataframe.
Select 'name' and 'score' columns in rows 1, 3, 5, 6 from the following dataframe.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], "qualify": ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Q18. Write a Python program to select the rows where the number of attempts in the examination is greater than 2.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
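The three selection exercises (Q16 to Q18) can be sketched on one dataframe. Two assumptions are made here: the third label is taken to be 'c', and "rows 1, 3, 5, 6" in Q17 is read as zero-based positions, which is why `iloc` is used.

```python
import numpy as np
import pandas as pd

exam_data = {
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
             'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
    'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
    'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes'],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# Q16: select two columns by name.
name_score = df[['name', 'score']]

# Q17: rows at positions 1, 3, 5, 6 and the first two columns.
picked = df.iloc[[1, 3, 5, 6], [0, 1]]

# Q18: boolean mask keeps rows with more than 2 attempts.
many_attempts = df[df['attempts'] > 2]

print(name_score)
print(picked)
print(many_attempts)
```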
Q19. For the given dataframe df, write a Python statement to sort the dataframe in ascending order of Points:
   House   Year  Points
0  Raman   2010  500
1  Tagore  2010  600
2  Raman   2011  300
3  Tagore  2011  400
4  Ashok   2010  500
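A short sketch for Q19: `sort_values()` sorts in ascending order by default, so naming the column is enough. The variable name `by_points` is illustrative. Two rows tie at 500 points, so their relative order is not asserted here.

```python
import pandas as pd

df = pd.DataFrame({
    'House': ['Raman', 'Tagore', 'Raman', 'Tagore', 'Ashok'],
    'Year': [2010, 2010, 2011, 2011, 2010],
    'Points': [500, 600, 300, 400, 500],
})

# ascending=True is the default, written out here for clarity.
by_points = df.sort_values('Points', ascending=True)
print(by_points)
```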
Q20. For the following dataframe df, what will be the output of the given statement?
   rollno  name  physics  chem
0  101     Pat   90       75
1  101     Sid   40       80
2  103     Tom   50       60
3  102     Kim   90       85
4  104     Ray   65       60
df_desc = df.sort_values('physics', ascending=False)
Q21. Predict the output of the following code:
import pandas as pd
d1 = {'rollno': [101, 101, 103, 102, 104], 'name': ['Pat', 'Sid', 'Tom', 'Kim', 'Ray'],
      'physics': [90, 40, 50, 90, 65], 'chem': [75, 80, 60, 85, 60]}
df = pd.DataFrame(d1)
print(df)
print(' ------ Basic aggregate functions min(), max(), sum() and mean()')
print('minimum is:', df['physics'].min())
print('maximum is:', df['physics'].max())
print('sum is:', df['physics'].sum())
print('average is:', df['physics'].mean())
Q22. Consider the following dataframe, df1:
Classes  Country  Quarter  Tutor
28       USA      1        Tahira
36       UK       1        Jacob
41       Japan    2        Venkat
32       USA      2        Tahira
40       USA      3        Venkat
40       UK       3        Tahira
df1 = df.groupby(['Tutor', 'Country'])
print(df1.groups)
print(df1.get_group(('Tahira', 'USA')))
print(df1.size())
print(df1.count())
print(df1['Classes'].head())
print(df1.get_group(('Jacob', 'UK')))
(a) What will be the output of the following statement upon execution:
print(df1.groupby('Tutor').transform(np.mean))
(b) Differentiate between agg() and transform()
(c) Find the output:
print(df2.groupby('Tutor')['Classes'].transform(np.mean))
df2['Classmean'] = df1.groupby('Tutor')['Classes'].transform(np.mean)
print(df2)
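For part (b), and for Q11 earlier, the contrast between agg() and transform() can be sketched on the Q22 data. The frame is named `classes` here to avoid the df/df1/df2 names used in the question, and the string `'mean'` is used in place of `np.mean`, which is a substitution on my part. agg() returns one row per group, whereas transform() returns a result with the same length as the original dataframe, so it can be stored as a new column.

```python
import pandas as pd

classes = pd.DataFrame({
    'Classes': [28, 36, 41, 32, 40, 40],
    'Country': ['USA', 'UK', 'Japan', 'USA', 'USA', 'UK'],
    'Quarter': [1, 1, 2, 2, 3, 3],
    'Tutor': ['Tahira', 'Jacob', 'Venkat', 'Tahira', 'Venkat', 'Tahira'],
})

# agg(): collapses each group to a single value (3 tutors -> 3 rows).
per_tutor = classes.groupby('Tutor')['Classes'].agg('mean')

# transform(): broadcasts the group value back to every row (6 rows).
classes['Classmean'] = classes.groupby('Tutor')['Classes'].transform('mean')

print(per_tutor)
print(classes)
```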
Q23. Write a Python program to count the number of rows and columns of a dataframe. Sample data:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Q24. Write a Python program to select the rows where the score is missing, i.e., NaN.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Q25. Write a Python program to calculate the mean score for each student in the dataframe.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
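Q23 to Q25 can be sketched on the same sample data. Since each student has a single score, Q25 is read here as the mean of the `score` column, which is an assumption about the intended answer. `mean()` skips NaN values by default.

```python
import numpy as np
import pandas as pd

exam_data = {
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
             'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
    'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
    'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes'],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# Q23: shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape
print('Rows:', n_rows, 'Columns:', n_cols)

# Q24: isna() marks missing scores; the mask keeps only those rows.
missing = df[df['score'].isna()]
print(missing)

# Q25: mean of the score column, ignoring NaN.
mean_score = df['score'].mean()
print('Mean score:', mean_score)
```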
Q26. Write a Python program to sort the dataframe first by 'name' in descending order, then by 'score' in ascending order.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
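A sketch for Q26: `sort_values()` accepts a list of columns together with a matching list of sort directions. All names in the sample are distinct, so the second key (`score`) only matters if names were to repeat.

```python
import numpy as np
import pandas as pd

exam_data = {
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
             'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
    'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
    'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes'],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# 'name' descending first, then 'score' ascending as the tie-breaker.
ordered = df.sort_values(by=['name', 'score'], ascending=[False, True])
print(ordered)
```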
Q27. Why do you need connection to an SQL database in order to get data from a table?
Q28. Which libraries do you need in order to interact with databases (and dataframes) from within Python?