{
"cells": [
{
"cell_type": "markdown",
"id": "6da5789a",
"metadata": {},
"source": [
"## Problématique :\n"
]
},
{
"cell_type": "markdown",
"id": "f0c31a3f",
"metadata": {},
"source": [
"# <span style=\"color: #FF0000\">**Qu'est ce qui fait qu'une voiture est vendue plus chère qu'une autre ?**</span>\n"
]
},
{
"cell_type": "markdown",
"id": "f64fb802",
"metadata": {},
"source": [
"## I/ Charger et explorer les données\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c0f0ed8f",
"metadata": {},
"outputs": [],
"source": [
"# On charge les données, avec la librairie Pandas:\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"df = pd.read_csv(\"carDetailsOld.csv\", encoding=\"latin-1\")"
]
},
{
"cell_type": "markdown",
"id": "every-islam",
"metadata": {},
"source": [
"Nous affichons notre **DataFrame** pandas.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "65ea7cfb",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Model</th>\n",
" <th>Price</th>\n",
" <th>Year</th>\n",
" <th>Kilometer</th>\n",
" <th>Fuel Type</th>\n",
" <th>Transmission</th>\n",
" <th>Location</th>\n",
" <th>Color</th>\n",
" <th>Owner</th>\n",
" <th>Seller Type</th>\n",
" <th>Engine</th>\n",
" <th>Max Power</th>\n",
" <th>Max Torque</th>\n",
" <th>Drivetrain</th>\n",
" <th>Length</th>\n",
" <th>Width</th>\n",
" <th>Height</th>\n",
" <th>Seating Capacity</th>\n",
" <th>Fuel Tank Capacity</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Honda</td>\n",
" <td>Amaze 1.2 VX i-VTEC</td>\n",
" <td>505000</td>\n",
" <td>2017</td>\n",
" <td>87150</td>\n",
" <td>Petrol</td>\n",
" <td>Manual</td>\n",
" <td>Pune</td>\n",
" <td>Grey</td>\n",
" <td>First</td>\n",
" <td>Corporate</td>\n",
" <td>1198 cc</td>\n",
" <td>87 bhp @ 6000 rpm</td>\n",
" <td>109 Nm @ 4500 rpm</td>\n",
" <td>FWD</td>\n",
" <td>3990.0</td>\n",
" <td>1680.0</td>\n",
" <td>1505.0</td>\n",
" <td>5.0</td>\n",
" <td>35.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Maruti Suzuki</td>\n",
" <td>Swift DZire VDI</td>\n",
" <td>450000</td>\n",
" <td>2014</td>\n",
" <td>75000</td>\n",
" <td>Diesel</td>\n",
" <td>Manual</td>\n",
" <td>Ludhiana</td>\n",
" <td>White</td>\n",
" <td>Second</td>\n",
" <td>Individual</td>\n",
" <td>1248 cc</td>\n",
" <td>74 bhp @ 4000 rpm</td>\n",
" <td>190 Nm @ 2000 rpm</td>\n",
" <td>FWD</td>\n",
" <td>3995.0</td>\n",
" <td>1695.0</td>\n",
" <td>1555.0</td>\n",
" <td>5.0</td>\n",
" <td>42.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Hyundai</td>\n",
" <td>i10 Magna 1.2 Kappa2</td>\n",
" <td>220000</td>\n",
" <td>2011</td>\n",
" <td>67000</td>\n",
" <td>Petrol</td>\n",
" <td>Manual</td>\n",
" <td>Lucknow</td>\n",
" <td>Maroon</td>\n",
" <td>First</td>\n",
" <td>Individual</td>\n",
" <td>1197 cc</td>\n",
" <td>79 bhp @ 6000 rpm</td>\n",
" <td>112.7619 Nm @ 4000 rpm</td>\n",
" <td>FWD</td>\n",
" <td>3585.0</td>\n",
" <td>1595.0</td>\n",
" <td>1550.0</td>\n",
" <td>5.0</td>\n",
" <td>35.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Toyota</td>\n",
" <td>Glanza G</td>\n",
" <td>799000</td>\n",
" <td>2019</td>\n",
" <td>37500</td>\n",
" <td>Petrol</td>\n",
" <td>Manual</td>\n",
" <td>Mangalore</td>\n",
" <td>Red</td>\n",
" <td>First</td>\n",
" <td>Individual</td>\n",
" <td>1197 cc</td>\n",
" <td>82 bhp @ 6000 rpm</td>\n",
" <td>113 Nm @ 4200 rpm</td>\n",
" <td>FWD</td>\n",
" <td>3995.0</td>\n",
" <td>1745.0</td>\n",
" <td>1510.0</td>\n",
" <td>5.0</td>\n",
" <td>37.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Toyota</td>\n",
" <td>Innova 2.4 VX 7 STR [2016-2020]</td>\n",
" <td>1950000</td>\n",
" <td>2018</td>\n",
" <td>69000</td>\n",
" <td>Diesel</td>\n",
" <td>Manual</td>\n",
" <td>Mumbai</td>\n",
" <td>Grey</td>\n",
" <td>First</td>\n",
" <td>Individual</td>\n",
" <td>2393 cc</td>\n",
" <td>148 bhp @ 3400 rpm</td>\n",
" <td>343 Nm @ 1400 rpm</td>\n",
" <td>RWD</td>\n",
" <td>4735.0</td>\n",
" <td>1830.0</td>\n",
" <td>1795.0</td>\n",
" <td>7.0</td>\n",
" <td>55.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2054</th>\n",
" <td>Mahindra</td>\n",
" <td>XUV500 W8 [2015-2017]</td>\n",
" <td>850000</td>\n",
" <td>2016</td>\n",
" <td>90300</td>\n",
" <td>Diesel</td>\n",
" <td>Manual</td>\n",
" <td>Surat</td>\n",
" <td>White</td>\n",
" <td>First</td>\n",
" <td>Individual</td>\n",
" <td>2179 cc</td>\n",
" <td>138 bhp @ 3750 rpm</td>\n",
" <td>330 Nm @ 1600 rpm</td>\n",
" <td>FWD</td>\n",
" <td>4585.0</td>\n",
" <td>1890.0</td>\n",
" <td>1785.0</td>\n",
" <td>7.0</td>\n",
" <td>70.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2055</th>\n",
" <td>Hyundai</td>\n",
" <td>Eon D-Lite +</td>\n",
" <td>275000</td>\n",
" <td>2014</td>\n",
" <td>83000</td>\n",
" <td>Petrol</td>\n",
" <td>Manual</td>\n",
" <td>Ahmedabad</td>\n",
" <td>White</td>\n",
" <td>Second</td>\n",
" <td>Individual</td>\n",
" <td>814 cc</td>\n",
" <td>55 bhp @ 5500 rpm</td>\n",
" <td>75 Nm @ 4000 rpm</td>\n",
" <td>FWD</td>\n",
" <td>3495.0</td>\n",
" <td>1550.0</td>\n",
" <td>1500.0</td>\n",
" <td>5.0</td>\n",
" <td>32.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2056</th>\n",
" <td>Ford</td>\n",
" <td>Figo Duratec Petrol ZXI 1.2</td>\n",
" <td>240000</td>\n",
" <td>2013</td>\n",
" <td>73000</td>\n",
" <td>Petrol</td>\n",
" <td>Manual</td>\n",
" <td>Thane</td>\n",
" <td>Silver</td>\n",
" <td>First</td>\n",
" <td>Individual</td>\n",
" <td>1196 cc</td>\n",
" <td>70 bhp @ 6250 rpm</td>\n",
" <td>102 Nm @ 4000 rpm</td>\n",
" <td>FWD</td>\n",
" <td>3795.0</td>\n",
" <td>1680.0</td>\n",
" <td>1427.0</td>\n",
" <td>5.0</td>\n",
" <td>45.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2057</th>\n",
" <td>BMW</td>\n",
" <td>5-Series 520d Luxury Line [2017-2019]</td>\n",
" <td>4290000</td>\n",
" <td>2018</td>\n",
" <td>60474</td>\n",
" <td>Diesel</td>\n",
" <td>Automatic</td>\n",
" <td>Coimbatore</td>\n",
" <td>White</td>\n",
" <td>First</td>\n",
" <td>Individual</td>\n",
" <td>1995 cc</td>\n",
" <td>188 bhp @ 4000 rpm</td>\n",
" <td>400 Nm @ 1750 rpm</td>\n",
" <td>RWD</td>\n",
" <td>4936.0</td>\n",
" <td>1868.0</td>\n",
" <td>1479.0</td>\n",
" <td>5.0</td>\n",
" <td>65.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2058</th>\n",
" <td>Mahindra</td>\n",
" <td>Bolero Power Plus ZLX [2016-2019]</td>\n",
" <td>670000</td>\n",
" <td>2017</td>\n",
" <td>72000</td>\n",
" <td>Diesel</td>\n",
" <td>Manual</td>\n",
" <td>Guwahati</td>\n",
" <td>White</td>\n",
" <td>First</td>\n",
" <td>Individual</td>\n",
" <td>1493 cc</td>\n",
" <td>70 bhp @ 3600 rpm</td>\n",
" <td>195 Nm @ 1400 rpm</td>\n",
" <td>RWD</td>\n",
" <td>3995.0</td>\n",
" <td>1745.0</td>\n",
" <td>1880.0</td>\n",
" <td>7.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>2059 rows × 20 columns</p>\n",
"</div>"
],
"text/plain": [
" Make Model Price Year \\\n",
"0 Honda Amaze 1.2 VX i-VTEC 505000 2017 \n",
"1 Maruti Suzuki Swift DZire VDI 450000 2014 \n",
"2 Hyundai i10 Magna 1.2 Kappa2 220000 2011 \n",
"3 Toyota Glanza G 799000 2019 \n",
"4 Toyota Innova 2.4 VX 7 STR [2016-2020] 1950000 2018 \n",
"... ... ... ... ... \n",
"2054 Mahindra XUV500 W8 [2015-2017] 850000 2016 \n",
"2055 Hyundai Eon D-Lite + 275000 2014 \n",
"2056 Ford Figo Duratec Petrol ZXI 1.2 240000 2013 \n",
"2057 BMW 5-Series 520d Luxury Line [2017-2019] 4290000 2018 \n",
"2058 Mahindra Bolero Power Plus ZLX [2016-2019] 670000 2017 \n",
"\n",
" Kilometer Fuel Type Transmission Location Color Owner \\\n",
"0 87150 Petrol Manual Pune Grey First \n",
"1 75000 Diesel Manual Ludhiana White Second \n",
"2 67000 Petrol Manual Lucknow Maroon First \n",
"3 37500 Petrol Manual Mangalore Red First \n",
"4 69000 Diesel Manual Mumbai Grey First \n",
"... ... ... ... ... ... ... \n",
"2054 90300 Diesel Manual Surat White First \n",
"2055 83000 Petrol Manual Ahmedabad White Second \n",
"2056 73000 Petrol Manual Thane Silver First \n",
"2057 60474 Diesel Automatic Coimbatore White First \n",
"2058 72000 Diesel Manual Guwahati White First \n",
"\n",
" Seller Type Engine Max Power Max Torque \\\n",
"0 Corporate 1198 cc 87 bhp @ 6000 rpm 109 Nm @ 4500 rpm \n",
"1 Individual 1248 cc 74 bhp @ 4000 rpm 190 Nm @ 2000 rpm \n",
"2 Individual 1197 cc 79 bhp @ 6000 rpm 112.7619 Nm @ 4000 rpm \n",
"3 Individual 1197 cc 82 bhp @ 6000 rpm 113 Nm @ 4200 rpm \n",
"4 Individual 2393 cc 148 bhp @ 3400 rpm 343 Nm @ 1400 rpm \n",
"... ... ... ... ... \n",
"2054 Individual 2179 cc 138 bhp @ 3750 rpm 330 Nm @ 1600 rpm \n",
"2055 Individual 814 cc 55 bhp @ 5500 rpm 75 Nm @ 4000 rpm \n",
"2056 Individual 1196 cc 70 bhp @ 6250 rpm 102 Nm @ 4000 rpm \n",
"2057 Individual 1995 cc 188 bhp @ 4000 rpm 400 Nm @ 1750 rpm \n",
"2058 Individual 1493 cc 70 bhp @ 3600 rpm 195 Nm @ 1400 rpm \n",
"\n",
" Drivetrain Length Width Height Seating Capacity Fuel Tank Capacity \n",
"0 FWD 3990.0 1680.0 1505.0 5.0 35.0 \n",
"1 FWD 3995.0 1695.0 1555.0 5.0 42.0 \n",
"2 FWD 3585.0 1595.0 1550.0 5.0 35.0 \n",
"3 FWD 3995.0 1745.0 1510.0 5.0 37.0 \n",
"4 RWD 4735.0 1830.0 1795.0 7.0 55.0 \n",
"... ... ... ... ... ... ... \n",
"2054 FWD 4585.0 1890.0 1785.0 7.0 70.0 \n",
"2055 FWD 3495.0 1550.0 1500.0 5.0 32.0 \n",
"2056 FWD 3795.0 1680.0 1427.0 5.0 45.0 \n",
"2057 RWD 4936.0 1868.0 1479.0 5.0 65.0 \n",
"2058 RWD 3995.0 1745.0 1880.0 7.0 NaN \n",
"\n",
"[2059 rows x 20 columns]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# display(df) produit un affichage \"spécial jupyter\" du contenu du DataFrame df\n",
"# Taper le nom d'une variable à la dernière ligne d'une cellule est un raccourci pour display.\n",
"df #.head(5) permet d'afficher juste les 5 premiers "
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d846d8e4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1198 cc\n",
"1 1248 cc\n",
"2 1197 cc\n",
"3 1197 cc\n",
"4 2393 cc\n",
" ... \n",
"2054 2179 cc\n",
"2055 814 cc\n",
"2056 1196 cc\n",
"2057 1995 cc\n",
"2058 1493 cc\n",
"Name: Engine, Length: 2059, dtype: object\n"
]
}
],
"source": [
"print(df[\"Engine\"])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2aea6e9f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Type de données par colonne : \n",
" Make object\n",
"Model object\n",
"Price int64\n",
"Year int64\n",
"Kilometer int64\n",
"Fuel Type object\n",
"Transmission object\n",
"Location object\n",
"Color object\n",
"Owner object\n",
"Seller Type object\n",
"Engine object\n",
"Max Power object\n",
"Max Torque object\n",
"Drivetrain object\n",
"Length float64\n",
"Width float64\n",
"Height float64\n",
"Seating Capacity float64\n",
"Fuel Tank Capacity float64\n",
"dtype: object\n",
"\n",
"\n",
"Nb de lignes : 2059\n",
"Nb de colonnes : 20\n",
"\n",
"\n",
"Les colonnes les plus importantes pour nous sont : la marque, le modèle, le prix, le kilométrage et la puissance\n"
]
}
],
"source": [
"print(\"Type de données par colonne : \\n\", df.dtypes)\n",
"print(\"\\n\")\n",
"print(\"Nb de lignes : \", len(df))\n",
"print(\"Nb de colonnes : \", len(df.columns))\n",
"print(\"\\n\")\n",
"print(\"Les colonnes les plus importantes pour nous sont : la marque, le modèle, le prix, le kilométrage et la puissance\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "b7406055",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAEFCAYAAAAPCDf9AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAARzklEQVR4nO3df4xlZ13H8feHbltElC3tpNbdla260RQiUjelSkII649SCNtEIEuMLLhmo1ZFMZFFExsxJCUaK/gDs6GVxZBCLWhXKOKmLUETWpgilP4AOpYf3U2hI/0BWBUXv/5xn623w8zOnbmzd+74vF/JzZzzPM8953tPZj/3zHPPPZuqQpLUhyetdwGSpMkx9CWpI4a+JHXE0Jekjhj6ktSRTetdwMmcc845tX379vUuQ5I2lNtvv/3fqmpmsb6pDv3t27czOzu73mVI0oaS5ItL9Tm9I0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHZnqb+SOa/uBD4w07gtXvvgUVyJJ08EzfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6siyoZ/kmiQPJrlzqO0Pk3wmyR1J/jbJ5qG+NySZS/LZJD8z1H5Ja5tLcmDNX4kkaVmjnOm/A7hkQdsR4FlV9SPA54A3ACS5ANgDPLM95y+SnJbkNODPgRcBFwCvbGMlSRO0bOhX1UeAhxa0/WNVHW+rtwJb2/Ju4N1V9V9V9XlgDrioPeaq6r6q+ibw7jZWkjRBazGn/wvAB9vyFuD+ob6jrW2p9m+TZH+S2SSz8/Pza1CeJOmEsUI/ye8Cx4F3rU05UFUHq2pnVe2cmZlZq81Kkhjj1spJXg28BNhVVdWajwHbhoZtbW2cpF2SNCGrOtNPcgnw28BLq+qxoa7DwJ4kZyY5H9gBfAz4OLAjyflJzmDwYe/h8UqXJK3Usmf6Sa4FXgCck+QocAWDq3XOBI4kAbi1qn6pqu5Kch1wN4Npn8ur6lttO78KfAg4Dbimqu46Ba9HknQSy4Z+Vb1ykearTzL+TcCbFmm/EbhxRdVJktaU38iVpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSPLhn6Sa5I8mOTOobanJzmS5N7286zWniRvTTKX5I4kFw49Z28bf2+Svafm5UiSTmaUM/13AJcsaDsA3FRVO4Cb2jrAi4Ad7bEfeBsM3iSAK4DnAhcBV5x4o5AkTc6yoV9VHwEeWtC8GzjUlg8Blw21v7MGbgU2JzkP+BngSFU9VFUPA0f49jcSSdIptto5/XOr6oG2/GXg3La8Bbh/aNzR1rZU+7dJsj/JbJLZ+fn5VZYnSVrM2B/kVlUBtQa1nNjewaraWVU7Z2Zm1mqzkiRWH/pfadM2tJ8PtvZjwLahcVtb21LtkqQJWm3oHwZOXIGzF7hhqP1V7Sqei4FH2zTQh4CfTnJW+wD3p1ubJGmCNi03IMm1wAuAc5IcZXAVzpXAdUn2AV8EXtGG3whcCswBjwGvAaiqh5L8AfDxNu6NVbXww2FJ0im2bOhX1SuX6Nq1yNgCLl9iO9cA16yoOknSmvIbuZLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1ZKzQT/KbSe5KcmeSa5M8Ocn5SW5LMpfkPUnOaGPPbOtzrX/7mrwCSdLIVh36SbYAvw7srKpnAacBe4A3A1dV1Q8CDwP72lP2AQ+39qvaOEnSBI07vbMJ+I4km4CnAA8ALwSub/2HgMva8u
62TuvflSRj7l+StAKrDv2qOgb8EfAlBmH/KHA78EhVHW/DjgJb2vIW4P723ONt/NkLt5tkf5LZJLPz8/OrLU+StIhxpnfOYnD2fj7wvcB3ApeMW1BVHayqnVW1c2ZmZtzNSZKGjDO985PA56tqvqr+G3gf8Dxgc5vuAdgKHGvLx4BtAK3/acBXx9i/JGmFxgn9LwEXJ3lKm5vfBdwN3AK8rI3ZC9zQlg+3dVr/zVVVY+xfkrRC48zp38bgA9lPAJ9u2zoIvB54XZI5BnP2V7enXA2c3dpfBxwYo25J0ipsWn7I0qrqCuCKBc33ARctMvY/gZePsz9J0nj8Rq4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0JakjY4V+ks1Jrk/ymST3JPnxJE9PciTJve3nWW1skrw1yVySO5JcuDYvQZI0qnHP9N8C/ENV/TDwbOAe4ABwU1XtAG5q6wAvAna0x37gbWPuW5K0QqsO/SRPA54PXA1QVd+sqkeA3cChNuwQcFlb3g28swZuBTYnOW+1+5ckrdw4Z/rnA/PAXyX5lyRvT/KdwLlV9UAb82Xg3La8Bbh/6PlHW9sTJNmfZDbJ7Pz8/BjlSZIWGif0NwEXAm+rqucA/87/TeUAUFUF1Eo2WlUHq2pnVe2cmZkZozxJ0kLjhP5R4GhV3dbWr2fwJvCVE9M27eeDrf8YsG3o+VtbmyRpQlYd+lX1ZeD+JD/UmnYBdwOHgb2tbS9wQ1s+DLyqXcVzMfDo0DSQJGkCNo35/F8D3pXkDOA+4DUM3kiuS7IP+CLwijb2RuBSYA54rI2VJE3QWKFfVZ8Edi7StWuRsQVcPs7+JEnj8Ru5ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHVk7NBPclqSf0ny/rZ+fpLbkswleU+SM1r7mW19rvVvH3ffkqSVWYsz/dcC9wytvxm4qqp+EHgY2Nfa9wEPt/ar2jhJ0gSNFfpJtgIvBt7e1gO8ELi+DTkEXNaWd7d1Wv+uNl6SNCHjnun/CfDbwP+09bOBR6rqeFs/Cmxpy1uA+wFa/6Nt/BMk2Z9kNsns/Pz8mOVJkoatOvSTvAR4sKpuX8N6qKqDVbWzqnbOzMys5aYlqXubxnju84CXJrkUeDLw3cBbgM1JNrWz+a3AsTb+GLANOJpkE/A04Ktj7F+StEKrPtOvqjdU1daq2g7sAW6uqp8DbgFe1obtBW5oy4fbOq3/5qqq1e5fkrRyp+I6/dcDr0syx2DO/urWfjVwdmt/HXDgFOxbknQS40zvPK6qPgx8uC3fB1y0yJj/BF6+FvuTJK2O38iVpI4Y+pLUkTWZ3tnoth/4wEjjvnDli09xJZJ0anmmL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjqw69JNsS3JLkruT3JXkta396UmOJLm3/TyrtSfJW5PMJbkjyYVr9SIkSaMZ50z/OPBbVXUBcDFweZILgAPATVW1A7iprQO8CNjRHvuBt42xb0nSKqw69Kvqgar6RFv+OnAPsAXYDRxqww4Bl7Xl3cA7a+BWYHOS81a7f0nSyq3JnH6S7cBzgNuAc6vqgdb1ZeDctrwFuH/oaUdb28Jt7U8ym2R2fn5+LcqTJDVjh36SpwLvBX6jqr423FdVBdRKtldVB6tqZ1XtnJmZGbc8SdKQsUI/yekMAv9dVfW+1vyVE9M27e
eDrf0YsG3o6VtbmyRpQsa5eifA1cA9VfXHQ12Hgb1teS9ww1D7q9pVPBcDjw5NA0mSJmDTGM99HvDzwKeTfLK1/Q5wJXBdkn3AF4FXtL4bgUuBOeAx4DVj7FuStAqrDv2q+mcgS3TvWmR8AZevdn/TYPuBD4w07gtXvvgUVyJJq+M3ciWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI6Mc8M1LcF79EiaVp7pS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI54yeY68tJOSZPmmb4kdcQz/Q1g1L8IwL8KJJ2cZ/qS1BFDX5I6YuhLUkcmPqef5BLgLcBpwNur6spJ1/D/mVcESTqZiYZ+ktOAPwd+CjgKfDzJ4aq6e5J1yDcHqVeTPtO/CJirqvsAkrwb2A0Y+lNqJVcOrYf1fFNa6zdO34g1CZMO/S3A/UPrR4HnDg9Ish/Y31a/keSzq9zXOcC/rfK562Ej1Ts1tebNIw1b13pHrPGEZWtd4fZOtan5XRjBRqoVxqv3GUt1TN11+lV1EDg47naSzFbVzjUoaSI2Ur0bqVbYWPVupFphY9W7kWqFU1fvpK/eOQZsG1rf2tokSRMw6dD/OLAjyflJzgD2AIcnXIMkdWui0ztVdTzJrwIfYnDJ5jVVddcp2t3YU0QTtpHq3Ui1wsaqdyPVChur3o1UK5yielNVp2K7kqQp5DdyJakjhr4kdWTDh36SS5J8NslckgOL9J+Z5D2t/7Yk29ehzBO1LFfrq5PMJ/lke/zietQ5VM81SR5McucS/Uny1vZ67khy4aRrHKpluVpfkOTRoWP7e5OucaiWbUluSXJ3kruSvHaRMdN0bEepdyqOb5InJ/lYkk+1Wn9/kTHTlAmj1Lu2uVBVG/bB4MPgfwW+HzgD+BRwwYIxvwL8ZVveA7xnimt9NfBn631ch+p5PnAhcOcS/ZcCHwQCXAzcNsW1vgB4/3of01bLecCFbfm7gM8t8rswTcd2lHqn4vi24/XUtnw6cBtw8YIxU5EJK6h3TXNho5/pP35bh6r6JnDitg7DdgOH2vL1wK4kmWCNJ4xS61Spqo8AD51kyG7gnTVwK7A5yXmTqe6JRqh1alTVA1X1ibb8deAeBt9WHzZNx3aUeqdCO17faKunt8fCq1WmJRNGrXdNbfTQX+y2Dgt/GR8fU1XHgUeBsydS3RJ1NIvVCvCz7c/565NsW6R/moz6mqbFj7c/oz+Y5JnrXQxAm1p4DoMzvGFTeWxPUi9MyfFNclqSTwIPAkeqaslju86ZAIxUL6xhLmz00P//5u+B7VX1I8AR/u9sROP7BPCMqno28KfA361vOZDkqcB7gd+oqq+tdz3LWabeqTm+VfWtqvpRBt/4vyjJs9arllGMUO+a5sJGD/1Rbuvw+Jgkm4CnAV+dSHVL1NF8W61V9dWq+q+2+nbgxyZU22ptmNtqVNXXTvwZXVU3AqcnOWe96klyOoMAfVdVvW+RIVN1bJerd9qOb6vjEeAW4JIFXdOSCU+wVL1rnQsbPfRHua3DYWBvW34ZcHO1T0cmbNlaF8zZvpTB3Ok0Owy8ql1pcjHwaFU9sN5FLSbJ95yYt01yEYPf/XX5h97quBq4p6r+eIlhU3NsR6l3Wo5vkpkkm9vydzD4vzs+s2DYtGTCSPWudS5M3V02V6KWuK1DkjcCs1V1mMEv618nmWPwQd+eKa7115O8FDjean31etR6QpJrGVyVcU6So8AVDD5ooqr+EriRwVUmc8BjwGvWp9KRan0Z8MtJjgP/AexZr3/owPOAnwc+3eZyAX4H+D6YvmPLaPVOy/E9DziUwX/Y9CTguqp6/zRmQjNKvWuaC96GQZI6stGndyRJK2DoS1JHDH1J6oihL0kdMfQlaUpkmRsHLhh71dBN2D6X5JGR9uHVO5I0HZI8H/gGg/sujfxN4iS/Bjynqn5hubGe6UvSlFjsxoFJfiDJPy
S5Pck/JfnhRZ76SuDaUfaxob+cJUkdOAj8UlXdm+S5wF8ALzzRmeQZwPnAzaNszNCXpCnVbnL3E8DfDN39+cwFw/YA11fVt0bZpqEvSdPrScAj7S6cS9kDXL6SDUqSplC7hfXnk7wcHv9vNJ99or/N758FfHTUbRr6kjQl2o0DPwr8UJKjSfYBPwfsS/Ip4C6e+D/u7QHevZKb23nJpiR1xDN9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I68r9w7h0NWQ6N5QAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prix moyen d'un véhicule : 1702992 ₹\n",
"Nombre de voitures en FWD : 1330 nombre de voitures en RWD : 321 nombre de voitures en AWD : 272\n",
"Soit en pourcentage : 64.59446333171442 % de FWD, 15.590092277804759 % de RWD et 13.210296260320545 % de AWD\n"
]
}
],
"source": [
"prix = df[\"Price\"].to_list()\n",
"plt.hist(prix, bins=30)\n",
"plt.show()\n",
"\n",
"print(\"Prix moyen d'un véhicule : \", int(df[\"Price\"].mean().round()), \"₹\")\n",
"\n",
"print(\"Nombre de voitures en FWD : \", len(df[df[\"Drivetrain\"] == \"FWD\"]), \"nombre de voitures en RWD : \", len(df[df[\"Drivetrain\"] == \"RWD\"]), \"nombre de voitures en AWD : \", len(df[df[\"Drivetrain\"] == \"AWD\"]))\n",
"print(\"Soit en pourcentage : \", len(df[df[\"Drivetrain\"] == \"FWD\"])/len(df)*100, \"% de FWD, \", len(df[df[\"Drivetrain\"] == \"RWD\"])/len(df)*100, \"% de RWD et \", len(df[df[\"Drivetrain\"] == \"AWD\"])/len(df)*100, \"% de AWD\")"
]
},
{
"cell_type": "markdown",
"id": "5cc5c8ff",
"metadata": {},
"source": [
"## III/ Nettoyage et présentation de données\n"
]
},
{
"cell_type": "markdown",
"id": "3eedcf6a",
"metadata": {},
"source": [
"### Supprimer les colonnes non pertinentes"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "c068815f",
"metadata": {},
"outputs": [
{
"ename": "FileNotFoundError",
"evalue": "[Errno 2] No such file or directory: 'carDetailsV4.csv'",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)",
"Input \u001b[0;32mIn [6]\u001b[0m, in \u001b[0;36m<cell line: 4>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mdel\u001b[39;00m df[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mColor\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[1;32m 2\u001b[0m \u001b[38;5;28;01mdel\u001b[39;00m df[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mLocation\u001b[39m\u001b[38;5;124m'\u001b[39m]\n\u001b[0;32m----> 4\u001b[0m df \u001b[38;5;241m=\u001b[39m \u001b[43mpd\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mread_csv\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mcarDetailsV4.csv\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mlatin-1\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 5\u001b[0m df\u001b[38;5;241m=\u001b[39mdf\u001b[38;5;241m.\u001b[39mdropna(axis\u001b[38;5;241m=\u001b[39mInteger(\u001b[38;5;241m0\u001b[39m))\n\u001b[1;32m 7\u001b[0m \u001b[38;5;66;03m#Permet d'afficher le dataframe\u001b[39;00m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/util/_decorators.py:311\u001b[0m, in \u001b[0;36mdeprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 305\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(args) \u001b[38;5;241m>\u001b[39m num_allow_args:\n\u001b[1;32m 306\u001b[0m warnings\u001b[38;5;241m.\u001b[39mwarn(\n\u001b[1;32m 307\u001b[0m msg\u001b[38;5;241m.\u001b[39mformat(arguments\u001b[38;5;241m=\u001b[39marguments),\n\u001b[1;32m 308\u001b[0m \u001b[38;5;167;01mFutureWarning\u001b[39;00m,\n\u001b[1;32m 309\u001b[0m stacklevel\u001b[38;5;241m=\u001b[39mstacklevel,\n\u001b[1;32m 310\u001b[0m )\n\u001b[0;32m--> 311\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/io/parsers/readers.py:680\u001b[0m, in \u001b[0;36mread_csv\u001b[0;34m(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)\u001b[0m\n\u001b[1;32m 665\u001b[0m kwds_defaults \u001b[38;5;241m=\u001b[39m _refine_defaults_read(\n\u001b[1;32m 666\u001b[0m dialect,\n\u001b[1;32m 667\u001b[0m delimiter,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 676\u001b[0m defaults\u001b[38;5;241m=\u001b[39m{\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdelimiter\u001b[39m\u001b[38;5;124m\"\u001b[39m: \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m,\u001b[39m\u001b[38;5;124m\"\u001b[39m},\n\u001b[1;32m 677\u001b[0m )\n\u001b[1;32m 678\u001b[0m kwds\u001b[38;5;241m.\u001b[39mupdate(kwds_defaults)\n\u001b[0;32m--> 680\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43m_read\u001b[49m\u001b[43m(\u001b[49m\u001b[43mfilepath_or_buffer\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/io/parsers/readers.py:575\u001b[0m, in \u001b[0;36m_read\u001b[0;34m(filepath_or_buffer, kwds)\u001b[0m\n\u001b[1;32m 572\u001b[0m _validate_names(kwds\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mnames\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mNone\u001b[39;00m))\n\u001b[1;32m 574\u001b[0m \u001b[38;5;66;03m# Create the parser.\u001b[39;00m\n\u001b[0;32m--> 575\u001b[0m parser \u001b[38;5;241m=\u001b[39m \u001b[43mTextFileReader\u001b[49m\u001b[43m(\u001b[49m\u001b[43mfilepath_or_buffer\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 577\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m chunksize \u001b[38;5;129;01mor\u001b[39;00m iterator:\n\u001b[1;32m 578\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m parser\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/io/parsers/readers.py:934\u001b[0m, in \u001b[0;36mTextFileReader.__init__\u001b[0;34m(self, f, engine, **kwds)\u001b[0m\n\u001b[1;32m 931\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39moptions[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mhas_index_names\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m=\u001b[39m kwds[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mhas_index_names\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[1;32m 933\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mhandles: IOHandles \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[0;32m--> 934\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_engine \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_make_engine\u001b[49m\u001b[43m(\u001b[49m\u001b[43mf\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mengine\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/io/parsers/readers.py:1218\u001b[0m, in \u001b[0;36mTextFileReader._make_engine\u001b[0;34m(self, f, engine)\u001b[0m\n\u001b[1;32m 1214\u001b[0m mode \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrb\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1215\u001b[0m \u001b[38;5;66;03m# error: No overload variant of \"get_handle\" matches argument types\u001b[39;00m\n\u001b[1;32m 1216\u001b[0m \u001b[38;5;66;03m# \"Union[str, PathLike[str], ReadCsvBuffer[bytes], ReadCsvBuffer[str]]\"\u001b[39;00m\n\u001b[1;32m 1217\u001b[0m \u001b[38;5;66;03m# , \"str\", \"bool\", \"Any\", \"Any\", \"Any\", \"Any\", \"Any\"\u001b[39;00m\n\u001b[0;32m-> 1218\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mhandles \u001b[38;5;241m=\u001b[39m \u001b[43mget_handle\u001b[49m\u001b[43m(\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# type: ignore[call-overload]\u001b[39;49;00m\n\u001b[1;32m 1219\u001b[0m \u001b[43m \u001b[49m\u001b[43mf\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1220\u001b[0m \u001b[43m \u001b[49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1221\u001b[0m \u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moptions\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mencoding\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1222\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mcompression\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moptions\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mcompression\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1223\u001b[0m \u001b[43m \u001b[49m\u001b[43mmemory_map\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moptions\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mmemory_map\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1224\u001b[0m \u001b[43m \u001b[49m\u001b[43mis_text\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mis_text\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1225\u001b[0m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moptions\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mencoding_errors\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mstrict\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1226\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mstorage_options\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moptions\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mstorage_options\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1227\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1228\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mhandles \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 1229\u001b[0m f \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mhandles\u001b[38;5;241m.\u001b[39mhandle\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/io/common.py:786\u001b[0m, in \u001b[0;36mget_handle\u001b[0;34m(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)\u001b[0m\n\u001b[1;32m 781\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(handle, \u001b[38;5;28mstr\u001b[39m):\n\u001b[1;32m 782\u001b[0m \u001b[38;5;66;03m# Check whether the filename is to be opened in binary mode.\u001b[39;00m\n\u001b[1;32m 783\u001b[0m \u001b[38;5;66;03m# Binary mode does not support 'encoding' and 'newline'.\u001b[39;00m\n\u001b[1;32m 784\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m ioargs\u001b[38;5;241m.\u001b[39mencoding \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mb\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;129;01min\u001b[39;00m ioargs\u001b[38;5;241m.\u001b[39mmode:\n\u001b[1;32m 785\u001b[0m \u001b[38;5;66;03m# Encoding\u001b[39;00m\n\u001b[0;32m--> 786\u001b[0m handle \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mopen\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[1;32m 787\u001b[0m \u001b[43m \u001b[49m\u001b[43mhandle\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 788\u001b[0m \u001b[43m \u001b[49m\u001b[43mioargs\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 789\u001b[0m \u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mioargs\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mencoding\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 790\u001b[0m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 791\u001b[0m \u001b[43m \u001b[49m\u001b[43mnewline\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 792\u001b[0m \u001b[43m 
\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 793\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 794\u001b[0m \u001b[38;5;66;03m# Binary mode\u001b[39;00m\n\u001b[1;32m 795\u001b[0m handle \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mopen\u001b[39m(handle, ioargs\u001b[38;5;241m.\u001b[39mmode)\n",
"\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: 'carDetailsV4.csv'"
]
}
],
"source": [
"\n",
"del df[\"Color\"]\n",
"del df['Location']\n",
"\n",
"df = pd.read_csv(\"carDetailsV4.csv\", encoding=\"latin-1\")\n",
"df=df.dropna(axis=0)\n",
"\n",
"#Permet d'afficher le dataframe\n",
"display(df[30:33])\n",
"\n",
"df1=df\n",
"# Permet de suppr les NAN\n",
"df1[\"Engine\"] =df1[\"Engine\"].dropna()\n",
"# Permet d'enlever les deux caractères cc\n",
"df1[\"Engine\"] = df1[\"Engine\"].replace('cc', '')\n",
"df1[\"Engine\"] = df1[\"Engine\"].astype(str).apply(lambda x: x[:-3])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69d69464",
"metadata": {},
"outputs": [],
"source": [
"df1[\"Engine\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ee792795",
"metadata": {},
"outputs": [],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6704d8d5",
"metadata": {},
"outputs": [],
"source": [
"print(df[df['Engine']==''])"
]
},
{
"cell_type": "markdown",
"id": "24c19ef3",
"metadata": {},
"source": [
"#### Les données comprennent-elles des caractéristiques pertinentes pour la problématique ?\n"
]
},
{
"cell_type": "markdown",
"id": "fcfa439d",
"metadata": {},
"source": [
"Oui, de nombreuses caractériqtiques présente dans notre base de données peuvent influer sur le prix tel que la réputation de la marque, le nombre de kilomètrage, la puissance du vehicule, son type de carburant, son type de boite de vitesse.\n"
]
},
{
"cell_type": "markdown",
"id": "3845c3c4",
"metadata": {},
"source": [
"## V/ Choisir les variables explicatives et la variable à expliquer : faire une régression et commenter les resultats\n"
]
},
{
"cell_type": "markdown",
"id": "8fbf061b",
"metadata": {},
"source": [
"les variables explicatives sont :\n",
"\n",
"- les marques et son model\n",
"- l'année\n",
"- le kilometrage\n",
"- le type de carburant\n",
"- le type de transmission (boite et type de motorisation)\n",
"- la ville où elle est disponible\n",
"- la puissance\n",
"- la taille\n",
"- la capacité de carburant\n",
"\n",
"La variable à expliquer sera le **prix**\n"
]
},
{
"cell_type": "markdown",
"id": "ad3c25e3",
"metadata": {},
"source": [
"### Afficher le pourcentage de chaque marque dans un camembert "
]
},
{
"cell_type": "markdown",
"id": "3ec4b075",
"metadata": {},
"source": [
"### Quels sont les types de données présents (symbolique, numérique, etc.) ?"
]
},
{
"cell_type": "markdown",
"id": "cae0b428",
"metadata": {},
"source": [
"Les types de données présentes sont des valeurs numériques avec des unités (exemple : la puissance du véhicule), des chaînes de caractère (exemple : nom de la marque) ou des valeurs numériques simples (exemple : le prix)."
]
},
{
"cell_type": "markdown",
"id": "4fcb98c8",
"metadata": {},
"source": [
"### Est-il possible de ne garder que les colonnes pertinentes ?\n"
]
},
{
"cell_type": "markdown",
"id": "a3bf3a6f",
"metadata": {},
"source": [
"Oui, on peut supprimer les colonnes qui nous paraissent non pertinentes car elles n'affectent pas ou très peu le prix tels que la couleur et la ville de vente."
]
},
{
"cell_type": "markdown",
"id": "6cd3a984",
"metadata": {},
"source": [
"## Qu'est ce que la régression ?"
]
},
{
"cell_type": "markdown",
"id": "60a1ce8d",
"metadata": {},
"source": [
"Évolution qui ramène à un degré moindre.\n",
"\n",
"Une régression est basée sur l'idée qu'une variable dépendante est déterminée par une ou plusieurs variables indépendantes\n",
"\n",
"# Exemple de régression :"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "1b0173e3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"la taille de notre échantillon est : (50,)\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"rng = np.random.RandomState(42) #pour générer les mêmes données\n",
"\n",
"#constituer un exmple de points aléatoires \n",
"x = 10 * rng.rand(50) #genere un tbl de 50\n",
"print('la taille de notre échantillon est :',x.shape)\n",
"\n",
"y=2*x-1 + rng.randn(50) # définir une relation entre x et y + bruit \n",
"\n",
"#afficher data y=f(x) [y en fonction de x] comme un nuage de points \n",
"plt.scatter(x, y);"
]
},
{
"cell_type": "markdown",
"id": "15f5561e",
"metadata": {},
"source": [
"## Comparer deux véhicules"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "846e7e8f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prix moyen d'une Audi : 2703134 ₹\n",
"\n",
"Prix moyen d'une BMW : 3768967 ₹\n",
"\n",
"En moyenne, les BMW sont plus chers que les Audi\n",
"\n",
"Année moyenne d'un Audi : 2016 \n",
"\n",
"Année moyenne d'un BMW : 2017 \n",
"\n",
"La BMW est plus récente que l'Audi en moyenne.\n",
"\n",
"Kilométrage moyen d'un Audi : 54319 km\n",
"\n",
"Kilométrage moyen d'un BMW : 50453 km\n",
"\n",
"En moyenne,l'Audi a plus de kilomètres que le BMW\n",
"\n"
]
},
{
"ename": "TypeError",
"evalue": "can only concatenate str (not \"int\") to str",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"Input \u001b[0;32mIn [8]\u001b[0m, in \u001b[0;36m<cell line: 33>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 28\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mEn moyenne, la BMW a plus de kilomètres que l\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mAudi\u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 30\u001b[0m \u001b[38;5;66;03m# print(vehicule1[\"Engine\"])\u001b[39;00m\n\u001b[1;32m 31\u001b[0m \n\u001b[1;32m 32\u001b[0m \u001b[38;5;66;03m# print(\"Puissance moyenne d'un Audi : \", int(vehicule1[\"Engine\"].mean().round()), \"ch\\n\")\u001b[39;00m\n\u001b[0;32m---> 33\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mPuissance moyenne d\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mun BMW : \u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28mfloat\u001b[39m(\u001b[43mvehicule2\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mEngine\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmean\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241m.\u001b[39mround()), \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mch\u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 34\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mligne : \u001b[39m\u001b[38;5;124m\"\u001b[39m, df[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mEngine\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n\u001b[1;32m 36\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m vehicule1[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mEngine\u001b[39m\u001b[38;5;124m\"\u001b[39m]\u001b[38;5;241m.\u001b[39mmean() \u001b[38;5;241m>\u001b[39m vehicule2[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mEngine\u001b[39m\u001b[38;5;124m\"\u001b[39m]\u001b[38;5;241m.\u001b[39mmean():\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/generic.py:11119\u001b[0m, in \u001b[0;36mNDFrame._add_numeric_operations.<locals>.mean\u001b[0;34m(self, axis, skipna, level, numeric_only, **kwargs)\u001b[0m\n\u001b[1;32m 11101\u001b[0m \u001b[38;5;129m@doc\u001b[39m(\n\u001b[1;32m 11102\u001b[0m _num_doc,\n\u001b[1;32m 11103\u001b[0m desc\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mReturn the mean of the values over the requested axis.\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 11117\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs,\n\u001b[1;32m 11118\u001b[0m ):\n\u001b[0;32m> 11119\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mNDFrame\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmean\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mlevel\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/generic.py:10689\u001b[0m, in \u001b[0;36mNDFrame.mean\u001b[0;34m(self, axis, skipna, level, numeric_only, **kwargs)\u001b[0m\n\u001b[1;32m 10681\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mmean\u001b[39m(\n\u001b[1;32m 10682\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[1;32m 10683\u001b[0m axis: Axis \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m|\u001b[39m lib\u001b[38;5;241m.\u001b[39mNoDefault \u001b[38;5;241m=\u001b[39m lib\u001b[38;5;241m.\u001b[39mno_default,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 10687\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs,\n\u001b[1;32m 10688\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m Series \u001b[38;5;241m|\u001b[39m \u001b[38;5;28mfloat\u001b[39m:\n\u001b[0;32m> 10689\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_stat_function\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 10690\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mmean\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnanops\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnanmean\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mlevel\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\n\u001b[1;32m 10691\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/generic.py:10641\u001b[0m, in \u001b[0;36mNDFrame._stat_function\u001b[0;34m(self, name, func, axis, skipna, level, numeric_only, **kwargs)\u001b[0m\n\u001b[1;32m 10631\u001b[0m warnings\u001b[38;5;241m.\u001b[39mwarn(\n\u001b[1;32m 10632\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mUsing the level keyword in DataFrame and Series aggregations is \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 10633\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdeprecated and will be removed in a future version. Use groupby \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 10636\u001b[0m stacklevel\u001b[38;5;241m=\u001b[39mfind_stack_level(),\n\u001b[1;32m 10637\u001b[0m )\n\u001b[1;32m 10638\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agg_by_level(\n\u001b[1;32m 10639\u001b[0m name, axis\u001b[38;5;241m=\u001b[39maxis, level\u001b[38;5;241m=\u001b[39mlevel, skipna\u001b[38;5;241m=\u001b[39mskipna, numeric_only\u001b[38;5;241m=\u001b[39mnumeric_only\n\u001b[1;32m 10640\u001b[0m )\n\u001b[0;32m> 10641\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_reduce\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 10642\u001b[0m \u001b[43m \u001b[49m\u001b[43mfunc\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mname\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mname\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnumeric_only\u001b[49m\n\u001b[1;32m 10643\u001b[0m 
\u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/series.py:4471\u001b[0m, in \u001b[0;36mSeries._reduce\u001b[0;34m(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)\u001b[0m\n\u001b[1;32m 4467\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mNotImplementedError\u001b[39;00m(\n\u001b[1;32m 4468\u001b[0m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mSeries.\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mname\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m does not implement \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mkwd_name\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 4469\u001b[0m )\n\u001b[1;32m 4470\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m np\u001b[38;5;241m.\u001b[39merrstate(\u001b[38;5;28mall\u001b[39m\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mignore\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[0;32m-> 4471\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mop\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdelegate\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/nanops.py:93\u001b[0m, in \u001b[0;36mdisallow.__call__.<locals>._f\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 91\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m 92\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m np\u001b[38;5;241m.\u001b[39merrstate(invalid\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mignore\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[0;32m---> 93\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mf\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 94\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m e:\n\u001b[1;32m 95\u001b[0m \u001b[38;5;66;03m# we want to transform an object array\u001b[39;00m\n\u001b[1;32m 96\u001b[0m \u001b[38;5;66;03m# ValueError message to the more typical TypeError\u001b[39;00m\n\u001b[1;32m 97\u001b[0m \u001b[38;5;66;03m# e.g. this is normally a disallowed function on\u001b[39;00m\n\u001b[1;32m 98\u001b[0m \u001b[38;5;66;03m# object arrays that contain strings\u001b[39;00m\n\u001b[1;32m 99\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_object_dtype(args[\u001b[38;5;241m0\u001b[39m]):\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/nanops.py:155\u001b[0m, in \u001b[0;36mbottleneck_switch.__call__.<locals>.f\u001b[0;34m(values, axis, skipna, **kwds)\u001b[0m\n\u001b[1;32m 153\u001b[0m result \u001b[38;5;241m=\u001b[39m alt(values, axis\u001b[38;5;241m=\u001b[39maxis, skipna\u001b[38;5;241m=\u001b[39mskipna, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwds)\n\u001b[1;32m 154\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m--> 155\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[43malt\u001b[49m\u001b[43m(\u001b[49m\u001b[43mvalues\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 157\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m result\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/nanops.py:410\u001b[0m, in \u001b[0;36m_datetimelike_compat.<locals>.new_func\u001b[0;34m(values, axis, skipna, mask, **kwargs)\u001b[0m\n\u001b[1;32m 407\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m datetimelike \u001b[38;5;129;01mand\u001b[39;00m mask \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 408\u001b[0m mask \u001b[38;5;241m=\u001b[39m isna(values)\n\u001b[0;32m--> 410\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[43mvalues\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mmask\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmask\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 412\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m datetimelike:\n\u001b[1;32m 413\u001b[0m result \u001b[38;5;241m=\u001b[39m _wrap_results(result, orig_values\u001b[38;5;241m.\u001b[39mdtype, fill_value\u001b[38;5;241m=\u001b[39miNaT)\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/pandas/core/nanops.py:698\u001b[0m, in \u001b[0;36mnanmean\u001b[0;34m(values, axis, skipna, mask)\u001b[0m\n\u001b[1;32m 695\u001b[0m dtype_count \u001b[38;5;241m=\u001b[39m dtype\n\u001b[1;32m 697\u001b[0m count \u001b[38;5;241m=\u001b[39m _get_counts(values\u001b[38;5;241m.\u001b[39mshape, mask, axis, dtype\u001b[38;5;241m=\u001b[39mdtype_count)\n\u001b[0;32m--> 698\u001b[0m the_sum \u001b[38;5;241m=\u001b[39m _ensure_numeric(\u001b[43mvalues\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msum\u001b[49m\u001b[43m(\u001b[49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdtype\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdtype_sum\u001b[49m\u001b[43m)\u001b[49m)\n\u001b[1;32m 700\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m axis \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mgetattr\u001b[39m(the_sum, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mndim\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mFalse\u001b[39;00m):\n\u001b[1;32m 701\u001b[0m count \u001b[38;5;241m=\u001b[39m cast(np\u001b[38;5;241m.\u001b[39mndarray, count)\n",
"File \u001b[0;32m/usr/local/lib/python3.9/dist-packages/numpy/core/_methods.py:48\u001b[0m, in \u001b[0;36m_sum\u001b[0;34m(a, axis, dtype, out, keepdims, initial, where)\u001b[0m\n\u001b[1;32m 46\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m_sum\u001b[39m(a, axis\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mNone\u001b[39;00m, dtype\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mNone\u001b[39;00m, out\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mNone\u001b[39;00m, keepdims\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m,\n\u001b[1;32m 47\u001b[0m initial\u001b[38;5;241m=\u001b[39m_NoValue, where\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m):\n\u001b[0;32m---> 48\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mumr_sum\u001b[49m\u001b[43m(\u001b[49m\u001b[43ma\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdtype\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mout\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mkeepdims\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43minitial\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mwhere\u001b[49m\u001b[43m)\u001b[49m\n",
"\u001b[0;31mTypeError\u001b[0m: can only concatenate str (not \"int\") to str"
]
}
],
"source": [
"# On compare le prix, l'année, le kilomètrage et la puissance de deux vehicules de marques différentes de df\n",
"\n",
"vehicule1 = df[df[\"Make\"] == \"Audi\"]\n",
"vehicule2 = df[df[\"Make\"] == \"BMW\"]\n",
"\n",
"print(\"Prix moyen d'une Audi : \", int(vehicule1[\"Price\"].mean().round()), \"₹\\n\")\n",
"print(\"Prix moyen d'une BMW : \", int(vehicule2[\"Price\"].mean().round()), \"₹\\n\")\n",
"\n",
"if vehicule1[\"Price\"].mean() > vehicule2[\"Price\"].mean():\n",
" print(\"En moyenne, les Audi sont plus cher que les BMW\\n\")\n",
"else:\n",
" print(\"En moyenne, les BMW sont plus chers que les Audi\\n\")\n",
"\n",
"print(\"Année moyenne d'un Audi : \", int(vehicule1[\"Year\"].mean().round()), \"\\n\")\n",
"print(\"Année moyenne d'un BMW : \", int(vehicule2[\"Year\"].mean().round()), \"\\n\")\n",
"\n",
"if vehicule1[\"Year\"].mean() > vehicule2[\"Year\"].mean():\n",
" print(\"L'Audi est plus récente que le BMW en moyenne.\\n\")\n",
"else:\n",
" print(\"La BMW est plus récente que l'Audi en moyenne.\\n\")\n",
"\n",
"print(\"Kilométrage moyen d'un Audi : \", int(vehicule1[\"Kilometer\"].mean().round()), \"km\\n\")\n",
"print(\"Kilométrage moyen d'un BMW : \", int(vehicule2[\"Kilometer\"].mean().round()), \"km\\n\")\n",
"\n",
"if vehicule1[\"Kilometer\"].mean() > vehicule2[\"Kilometer\"].mean():\n",
" print(\"En moyenne,l'Audi a plus de kilomètres que le BMW\\n\")\n",
"else:\n",
" print(\"En moyenne, la BMW a plus de kilomètres que l'Audi\\n\")\n",
" \n",
"# print(vehicule1[\"Engine\"])\n",
"\n",
"# print(\"Puissance moyenne d'un Audi : \", int(vehicule1[\"Engine\"].mean().round()), \"ch\\n\")\n",
"#print(\"Puissance moyenne d'un BMW : \", float(vehicule2[\"Engine\"].mean().round()), \"ch\\n\")\n",
"#print(\"ligne : \", df[\"Engine\"])\n",
"\n",
"#if vehicule1[\"Engine\"].mean() > vehicule2[\"Engine\"].mean():\n",
"# print(\"En moyenne, l'Audi a plus de puissance que le BMW\\n\")\n",
"#else:\n",
"# print(\"En moyenne, la BMW a plus de puissance que l'Audi\\n\")"
]
},
{
"cell_type": "markdown",
"id": "104b1ad6",
"metadata": {},
"source": [
"**On peut voir, que le prix est influé par l'année de sortie et le kilometrage. Ici, les BMW sont en moyenne plus récentes, ont en moyenne moins de kilomètrage et sont plus puissante ce qui peut expliquer leur prix plus élévé.**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc0a7b57",
"metadata": {},
"outputs": [],
"source": [
"# Export moi le dataframe en csv\n",
"df.to_csv('carDetails.csv', index=False)\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "SageMath 9.2",
"language": "sage",
"name": "sagemath"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}