B.   A dataset has 1000 records and 50 variables, with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove every record that has at least one missing value. About how many records would you expect to be removed? (20 points)
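
One way to set up the estimate, as a sketch rather than an official solution: assume each of the 50 values in a record is missing independently with probability 0.05 (this reading of "spread randomly" is an assumption). A record then survives only if all 50 values are present, which happens with probability 0.95^50, or about 0.077, so roughly 92% of the records would be dropped. The function name `expected_removed` is hypothetical.

```python
# Assumption: every value is missing independently with probability 0.05.
N_RECORDS = 1000
N_VARIABLES = 50
P_MISSING = 0.05


def expected_removed(n_records, n_variables, p_missing):
    """Expected number of records that contain at least one missing value."""
    # A record is complete only if none of its values is missing.
    p_complete = (1 - p_missing) ** n_variables
    return n_records * (1 - p_complete)


if __name__ == "__main__":
    print(round(expected_removed(N_RECORDS, N_VARIABLES, P_MISSING)))
```

Under this assumption, listwise deletion discards most of the dataset even though only 5% of the individual values are missing.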

C. Given a database table containing weather data as follows:

Outlook    Temperature  Humidity  Windy  Class: Play
Sunny      Hot          High      False  No
Sunny      Hot          High      True   No
Overcast   Hot          High      False  Yes
Rainy      Mild         High      False  Yes
Rainy      Cool         Normal    False  Yes
Rainy      Cool         Normal    True   No
Overcast   Cool         Normal    True   Yes
Sunny      Mild         High      False  No
Sunny      Cool         Normal    False  Yes
Rainy      Mild         Normal    False  Yes
Sunny      Mild         Normal    True   Yes
Overcast   Mild         High      True   Yes
Overcast   Hot          Normal    False  Yes
Rainy      Mild         High      True   No

Here Outlook, Temperature, Humidity, and Windy are the input variables (predictors), and Play is the output variable (response).

a.    Compute the prior probabilities:

      P(Play = 'Yes') =

      P(Play = 'No') =

b.    Compute the conditional probabilities:

      P(Outlook = 'Sunny' | Play = 'Yes') =

      P(Outlook = 'Sunny' | Play = 'No') =

      P(Temperature = 'Mild' | Play = 'Yes') =

      P(Temperature = 'Mild' | Play = 'No') =

      P(Humidity = 'High' | Play = 'Yes') =

      P(Humidity = 'High' | Play = 'No') =

      P(Windy = 'False' | Play = 'Yes') =

      P(Windy = 'False' | Play = 'No') =

c.    Use the naïve Bayes classification method to classify the following unknown record and indicate whether to play or not:

      (Outlook = 'Sunny', Temperature = 'Mild', Humidity = 'High', Windy = 'False')

(20 points)
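
The counting behind parts a to c can be checked with a short script. This is a minimal sketch, not a prescribed solution: it uses plain relative frequencies with no Laplace smoothing (an assumption, since the question does not mention smoothing), stores Windy as the strings "True"/"False", and the helper names `prior`, `conditional`, `score`, and `classify` are hypothetical.

```python
from fractions import Fraction

# Rows copied from the weather table: (Outlook, Temperature, Humidity, Windy, Play).
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]
FEATURES = ("Outlook", "Temperature", "Humidity", "Windy")


def prior(label):
    """P(Play = label), estimated as a relative frequency."""
    return Fraction(sum(r[-1] == label for r in ROWS), len(ROWS))


def conditional(feature, value, label):
    """P(feature = value | Play = label), without smoothing."""
    idx = FEATURES.index(feature)
    in_class = [r for r in ROWS if r[-1] == label]
    return Fraction(sum(r[idx] == value for r in in_class), len(in_class))


def score(record, label):
    """Unnormalized naive Bayes score: prior times the product of conditionals."""
    result = prior(label)
    for feature, value in zip(FEATURES, record):
        result *= conditional(feature, value, label)
    return result


def classify(record):
    """Return the label with the larger unnormalized score."""
    return max(("Yes", "No"), key=lambda label: score(record, label))


if __name__ == "__main__":
    unknown = ("Sunny", "Mild", "High", "False")
    for label in ("Yes", "No"):
        print(label, score(unknown, label), float(score(unknown, label)))
    print("Prediction:", classify(unknown))
```

Using `Fraction` keeps every probability exact, so the printed values can be compared directly with hand-computed fractions such as 9/14 and 5/14 for the priors. The scores are not normalized; only their relative size matters for the classification.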

D. Association Rule Mining: (20 points)

Given a transaction database for mining association rule as follows:

Database D

TID   Items
100   A C D
200   B C E
300   A B C E
400   B E

 

      Please use the Apriori algorithm to mine association rules with a minimum support count of 2.

      (Please show the derivation process step by step with candidate itemsets.)
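
The level-by-level derivation (candidate itemsets, support counting, pruning) can be sketched as follows. This is an illustrative implementation of the frequent-itemset phase only: the question gives no minimum confidence, so rule generation is left out rather than guessing a threshold. The function names `support_count` and `apriori` are hypothetical.

```python
from itertools import combinations

# Database D from the question: TID -> set of items.
TRANSACTIONS = {
    100: {"A", "C", "D"},
    200: {"B", "C", "E"},
    300: {"A", "B", "C", "E"},
    400: {"B", "E"},
}


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


def apriori(transactions, min_count):
    """Return a list of dicts, one per level k, mapping frequent k-itemsets to counts."""
    items = sorted({i for t in transactions.values() for i in t})
    candidates = [frozenset([i]) for i in items]  # C1
    levels = []
    k = 1
    while candidates:
        counts = {c: support_count(c, transactions) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_count}  # Lk
        if not frequent:
            break
        levels.append(frequent)
        k += 1
        prev = set(frequent)
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        ]
    return levels


if __name__ == "__main__":
    for level, frequent in enumerate(apriori(TRANSACTIONS, 2), start=1):
        pretty = {"".join(sorted(s)): n for s, n in frequent.items()}
        print(f"L{level}:", dict(sorted(pretty.items())))
```

Each printed level corresponds to one frequent-itemset table (L1, L2, L3) in a hand derivation. Candidate rules would then be formed from the frequent itemsets and filtered by whatever confidence threshold is chosen.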
