{"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.10.14","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"},"kaggle":{"accelerator":"none","dataSources":[],"isInternetEnabled":true,"language":"python","sourceType":"notebook","isGpuEnabled":false}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# Please \"Copy & Edit\" this notebook to your profile and write your solutions in each cell.","metadata":{"_uuid":"8f2839f25d086af736a60e9eeb907d3b93b6e0e5","_cell_guid":"b1076dfc-b9ad-4769-8c92-a6c4dae69d19"}},{"cell_type":"markdown","source":"> Based on notebook: [Dive into Python-Section 2](https://www.kaggle.com/code/rouzbeh/dive-into-python-section-2)","metadata":{}},{"cell_type":"markdown","source":"## Install and Import Libraries in Python","metadata":{}},{"cell_type":"markdown","source":"#### Q1: Check if the library `pandas` is installed by using `!pip show`. If it is not installed, use `!pip install pandas` to install it.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q2: Check the version of the NumPy library. If it is outdated, upgrade NumPy to the latest version. After upgrading, remember to restart the kernel to ensure the new version is in use.\n\n> **Explanation:** Sometimes, libraries need to be updated for compatibility with newer features or other packages. If the installed version of NumPy is outdated, you can upgrade it with !pip install numpy --upgrade. 
After upgrading, restart the kernel (in Jupyter Notebook, go to Kernel > Restart Kernel; in a Kaggle notebook, right-click the notebook and select Restart Kernel) to apply the new version.\n","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{"execution":{"iopub.status.busy":"2024-10-27T10:53:39.249498Z","iopub.execute_input":"2024-10-27T10:53:39.250487Z","iopub.status.idle":"2024-10-27T10:53:39.265358Z","shell.execute_reply.started":"2024-10-27T10:53:39.250433Z","shell.execute_reply":"2024-10-27T10:53:39.264142Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q3: Import the `pandas` library with the alias `pd` and the `matplotlib.pyplot` module with the alias `plt`.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q4: Use the `random` library to generate a random integer between 10 and 50, then generate a random floating-point number between 0 and 1.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q5: Use the `os` module to print the current working directory and list all files in that directory.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q6: Import the `math` module and use it to calculate the square root of 64 and the cosine of an angle in radians (use `π/4`).","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q7: Use `collections.Counter` to count the occurrences of each element in this list: `['apple', 'orange', 'apple', 'banana', 'orange', 'apple']`.","metadata":{}},{"cell_type":"code","source":"# Your 
code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q8: Use `datetime` to print the current date and time, then create a specific date (e.g., October 1, 2023) and print it.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## Pandas (Read and manipulate tabular data)","metadata":{}},{"cell_type":"markdown","source":"#### Q9: Create a `DataFrame` from the dictionary below and display it.\n\n`data = {\n    'Name': ['Ali', 'Sara', 'Hooman', 'Mina', 'Maryam', 'Siavash', 'Zahra'],\n    'Age': [28, 24, 35, 30, 23, 19, 34],\n    'City': ['Tehran', 'Shiraz', 'Tabriz', 'Mashhad', 'Shiraz', 'Tehran', 'Tehran'],\n    'Course': ['Statistics', 'Python', 'Python', 'Machine Learning', 'Statistics', 'Python', 'Machine Learning'],\n    'Grade': [14, 17, 12, 13, 12, 16, 19]\n}\n`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q10: Use the `.info()` method to display the data types of columns and the number of non-null entries in each column of df.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q11: Use the `.describe()` method to display the statistical report for `all` data in dataframe.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q12: Select and display JUST the `Age` and `City` columns from the df DataFrame. 
Then, display the `Grade` column as a `DataFrame` (with column labels) and as a `Series` (without column labels).","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q13: Filter and display rows where `Grade` is greater than 15.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q14: Add a new column called `Status` to df with the value `Active` for all rows.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q15: Find and display the unique values in the `City` and `Course` columns.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q16: Find the maximum and minimum values in the `Age` and `Grade` columns.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q17: Count the unique values in the `City` column to see how many unique cities are present.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q18: Find the most frequently occurring value in the `Course` column.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q19: Using `.loc`, select the `City` of the first row. 
Then, using `.iloc`, select the `Age` in the second row.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q20: Set the `Name` column as the index and select the row for 'Mina' using `.loc`.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q21: Filter and display rows where `Grade` is greater than 15.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q22: Sort the DataFrame by `Grade` in descending order.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q23: Group the data by `City` and calculate the average `Grade` for each city.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q24: Add a new column `Grade_Level` with values `'Pass'` if `Grade` is 15 or higher, and `'Fail'` if it is below 15.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q25: Find the row where the student has the highest grade.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q26: Create a new column called `Age_Norm` by dividing each value in the `Age` column by the maximum `Age` using the `.apply()` method.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## NumPy (Working with arrays and linear algebra)","metadata":{}},{"cell_type":"markdown","source":"#### Q27: Create a 1D NumPy array 
with numbers from 1 to 10 and a 2D array with numbers from 1 to 12 organized into 3 rows and 4 columns. Print each array.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q28: Check the `shape`, `size`, and `data type` of the arrays created in `Question 27`.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q29: Create a list of the squares of numbers from 1 to 100. Convert this list into a `2D NumPy` array with shape (10, 10). Print the resulting array.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q30: Change the data type of the array in `Question 29` to `int16` and print the new data type.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q31: Using the 2D array from `Question 29`, retrieve the element in the 5th row and 3rd column.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q32: Slice a `subarray` from the 2D array that includes rows 3 to 5 and columns 2 to 4.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q33: Select only the first two rows and last three columns from the array in `Question 29`.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q34: Calculate and print the `maximum`, `minimum`, and `unique values` in the 2D array from `Question 29`.","metadata":{}},{"cell_type":"code","source":"# Your 
code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q35: Find the `most frequent value` in the 2D array.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q36: Create three arrays:\n\n> `A 3x2 array filled with zeros  \n> A 3x2 array filled with ones  \n> A 3x2 empty array`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q37: Write a Python function that reshapes a given matrix into a specified shape.  \n\n> `Example:\n>         input: a = [[1,2,3,4],[5,6,7,8]], new_shape = (4, 2)\n>         output: [[1, 2], [3, 4], [5, 6], [7, 8]]\n>         reasoning: The given matrix is reshaped from 2x4 to 4x2.`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q38: Write a Python function that calculates the `mean` of a matrix either by `row` or by `column`, based on a given mode. The function should take a matrix (list of lists) and a mode ('row' or 'column') as input and return a list of means according to the specified mode.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## Matplotlib and Seaborn (Visualize data)","metadata":{}},{"cell_type":"markdown","source":"#### Q39: Create a basic `line plot` with `matplotlib.pyplot` that shows the change in values over a sequence. Use `x = [0, 1, 2, 3, 4]` and `y = [0, 1, 4, 9, 16]`. Add a title and labels for both axes.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q40: Create a figure with two `subplots` in one row. 
In the first subplot, create a `line plot` of `y = [1, 4, 9, 16, 25]` over `x = [1, 2, 3, 4, 5]`. In the second subplot, create a `scatter plot` using the same x and y values.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q41: Use `matplotlib.pyplot` to create a `histogram` of the list `data = [3, 7, 8, 5, 9, 4, 3, 5, 6, 7, 7, 5, 9]`. Add labels for the x-axis and y-axis, and set a title. Adjust the color and transparency.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q42: Create a simple DataFrame from the following data and use `Seaborn` to create a `scatter plot` showing the relationship between Age and Score. Add titles and labels for clarity.\n\n> `data = {\n>     'Age': [22, 25, 30, 35, 40, 45, 50],\n>     'Score': [88, 92, 85, 90, 95, 91, 89]\n> }`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q43: Using the data below, create a DataFrame and use `Seaborn` to create a `histogram` with a `KDE plot` for the column Scores. Customize the plot by adding labels and a title.\n\n> `data = {\n>     'Scores': [65, 70, 68, 75, 80, 85, 88, 90, 92, 85, 88, 90, 95, 78, 85]\n> }`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q44: Using the following dictionary, create a DataFrame and plot a `line plot` **directly** with `Pandas` for the columns `Year` and `Sales`. 
Customize the plot by adding a title, labels for the x and y axes, and adjusting the color.\n\n> `data = {\n>     'Year': [2017, 2018, 2019, 2020, 2021],\n>     'Sales': [250, 270, 290, 310, 330]\n> }`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q45: Create a `bar plot` **directly** with `Pandas` to visualize the total `Revenue` for each `Product` category. Use the dictionary below to create a DataFrame and customize the plot by changing the color and adding a title.\n\n> `data = {\n>     'Product': ['A', 'B', 'C', 'D'],\n>     'Revenue': [150, 120, 180, 210]\n> }`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## SciKit-Learn (Preprocessing, modeling and evaluations)","metadata":{}},{"cell_type":"markdown","source":"#### Q46: Given the following DataFrame, `encode` the categorical column `Color` using `LabelEncoder` to prepare it for model training.  \n\n> `data = {'Color': ['Red', 'Blue', 'Green', 'Red', 'Green', 'Blue']}`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q47: Using the data provided below, `split` the data into `features (X)` and `target (y)`. 
Then split the data into `training` and `test` sets with a 70-30 ratio.\n \n> `data = {'Age': [25, 45, 35, 50, 23, 37],\n>         'Salary': [50000, 80000, 75000, 60000, 52000, 67000],\n>         'Purchased': [1, 0, 1, 0, 1, 1]}`","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q48: `Scale` the feature columns `Age` and `Salary` using `StandardScaler` so that both columns have a mean of 0 and a standard deviation of 1.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q49: Now, `Train` a `RandomForestClassifier` using the scaled training data. Then make predictions on the test set.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"#### Q50: Now, `Evaluate` the model’s `accuracy` on the test data and display a `confusion matrix` and `classification report` for further analysis.","metadata":{}},{"cell_type":"code","source":"# Your code...","metadata":{},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Wishing you the best!","metadata":{}}]}