
Let's model this housing price data! Before we can do this, however, we need to split the data into training and test sets. Remember that the response vector (housing prices) lives in the target attribute. A random seed is set here so that we can deterministically reproduce the same split later if we want to re-test our results and track down potential bugs. Use the train_test_split function to hold out 10% of the data for the test set. Call the resulting splits X_train, X_test, Y_train, Y_test.

Answer:

fichoh

The program reads a dataset into a pandas DataFrame and uses the train_test_split function from scikit-learn's model_selection module to split the data into training and test sets. The code is as follows:

import pandas as pd

#import the pandas library and alias it as pd

from sklearn.model_selection import train_test_split

#import the train_test_split function

housing_df = pd.read_csv('housing price.csv')

#read in the housing data

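features_df = housing_df.iloc[:,1:]

#separate the features from the label (this assumes the label is in the first column and every other column is a feature)

target_df = housing_df.iloc[:,0]

#put the label column into its own variable as well (a single column selected this way is a pandas Series)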

X_train, X_test, Y_train, Y_test = train_test_split(features_df, target_df, test_size = 0.1, random_state = 1)

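#train_test_split shuffles the rows and returns four pieces; tuple unpacking assigns them to the 4 variables.

#test_size = 0.1 holds out 10% of the entire dataset for the test set; random_state = 1 fixes the random seed so the same split is produced on every run.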
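
To check that a split like the one above behaves as expected, here is a minimal sketch run on made-up data. The DataFrame toy_df and its column names (price, rooms, area) are hypothetical stand-ins for the real CSV, which is not shown in the question; only the train_test_split call mirrors the answer.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the housing data: 100 rows, label in the first column.
toy_df = pd.DataFrame({
    "price": range(100),
    "rooms": [i % 5 + 1 for i in range(100)],
    "area": [50 + i for i in range(100)],
})

features_df = toy_df.iloc[:, 1:]  # every column except the first
target_df = toy_df.iloc[:, 0]     # first column as the label

X_train, X_test, Y_train, Y_test = train_test_split(
    features_df, target_df, test_size=0.1, random_state=1
)

# 10% of 100 rows goes to the test set, the remaining 90 rows to training.
print(X_train.shape, X_test.shape)

# Features and labels stay aligned row by row after the shuffle.
rows_aligned = X_train.index.equals(Y_train.index)

# Repeating the call with the same random_state selects the same test rows.
_, X_test_again, _, _ = train_test_split(
    features_df, target_df, test_size=0.1, random_state=1
)
same_split = X_test.index.equals(X_test_again.index)
print(rows_aligned, same_split)
```

If same_split comes back True, the seed is doing its job: anyone who reruns the split with random_state = 1 on the same data gets the same training and test rows.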

Learn more: https://brainly.com/question/4257657?referrer=searchResults