Skip to main content

Streamlit | Car Price Prediction using machine learning

ML | PREDICTING PRICE OF PRE-OWNED CARS


In this article we are going to discuss a practical application of machine learning using a case study on Linear Regression. Linear Regression is a supervised learning algorithm. It is used to predict the real- valued output y based on the given input value x. it depicts the relationship between the dependent variable y and the independent variable x.


Let’s look at the problem statement:


Strom Motors is an e-commerce company who act as mediators between parties interested in selling and buying pre-owned cars.

Storm Motors wishes to develop an algorithm to predict the price of the cars based on various attributes associated with the car.

we have downloaded the dataset for this problem statement from kaggle.com. Kaggle is an online community devoted to Data Science and Machine Learning founded by Google in 2010. The dataset is named as CAR DETAILS FROM CAR DEKHO.csv. The dataset can be found here.

Data Cleaning

Let us understand the problem statement using the dataset. As we know that Storm Motors act as mediators, and they wish to set price of car depending on the various car attributes. So the selling price is our output variable y and rest other variables are our input x to our model.

Data comes in very messy and unstructured form in order to bring it to workable or structured form, we need to “clean” our data. Some common cleaning includes parsing, converting to one-hot, removing unnecessary data etc. In our case, our data has four categorical independent variables namely [fuel, seller type, transmission, owner]. Our algorithm requires numbers; we can’t work with alphabets popping up in our data. So we have converted our data to numerical using .cat.codes function.


Machine learning dummies

Training model

Once the data is cleaned, it can be used as an input to our Linear Regression model. We will use Scikit-learn’s linear regression model to train our dataset. We have split our data into training set and test set so that we can monitor the performance of our model. Once the model is trained, we have tested the model performance using test data. Now we know that our model is performing well, we will use this model to build an app.

Training machine learning model

Building ML App using streamlit

Building app using streamlit follows the same steps as we have shown in our previous articles. In this app we have written the code to take the various attributes of car as input. We can type these attributes in the GUI of the app and click on Predict Car Price to predict the price of the car. These features include year of registration, KM car has travelled, fuel type, seller type, transmission, and type of owner. These inputs are processed by the price_pred() function and result obtained is the predicted price of the car.

Machine learning web application



UFuncTypeError

We would also like to share the error occurred while running this app, and how we have tackled this error. The error was named as:

UFuncTypeError: ufunc ‘matmul’ did not contain a loop with signature matching types (dtype(‘<U32’), dtype(‘<U32’)) -> dtype(‘<U32’)

Machine learning python error


In our model case we have two numerical input feature that is year and km_driven. We have taken these numerical inputs using st.text_input() function. Although, our main function was taking these numerical input, price_pred() function was still treating it as categorical input. So we have changed the function as st.number_input() for both these features and the error automatically get resolved.















Comments

Popular posts from this blog

Salary Prediction Web App using Streamlit

Salary Prediction Web App In this article, we are going to discuss how to predict the salary based on various attributes related to salary  using Random Forest Regression. This study focuses on a system that predicts the salary of a candidate based on candidate’s qualifications, historical data, and work experience. This app uses a machine learning algorithm to give the result. The algorithm used is Random Forest Regression. In this problem, the target variable (or output), y, takes value of salary for a given set of input features (or inputs), X. The dataset contains gender, secondary school percentage, higher secondary school percentage, higher secondary school stream, degree percentage, degree type, work experience and specialization of candidate. Below is the step-by-step Approach: Step 1: Import the necessary modules and read the dataset we are going to use for this analysis. Below is a screenshot of the dataset we used in our analysis. Step 2: Now before moving ...

STREAMLIT MULTIPAGE WEB APPLICATION | AREA CALCULATOR

Multipage Web App So far, we have worked with python streamlit library and we have built machine learning web applications using streamlit. In this blog we will see how to build a multi-page web app using streamlit. Streamlit multipage web app We can create multiple apps and navigate across each of them in a main app using a radio button. First, we have created separate apps for each shape to calculate the area of that particular shape example app1.py, app2.py, app3.py etc. Then we have created a main app and added a navigator using radio buttons. Now we just have to run the main app and navigate through the desired web page. Area Calculator This particular multipage web app we named it as area calculator. We have included introduction page and ten shapes of which we can calculate the area by putting required inputs. We have downloaded the multiapp.py framework from GitHub, as we have a greater number of web pages. Each shape in the navigation bar indicates new web p...