QUESTION 51

Which word or phrase completes the statement? A Data Scientist would consider that an RDBMS is to a Table as R is to a ______________.

 A. Data frame B. List C. Matrix D. Array

Correct Answer: A

QUESTION 52

Refer to the exhibit. In the exhibit, the x-axis represents the derived probability of a borrower defaulting on a loan. Also in the exhibit, the pink represents borrowers that are known to have not defaulted on their loan, and the blue represents borrowers that are known to have defaulted on their loan. Which analytical method could produce the probabilities needed to build this exhibit?

 A. Logistic Regression B. Linear Regression C. Discriminant Analysis D. Association Rules

Correct Answer: A

QUESTION 53

You are building a logistic regression model to predict whether a tax filer will be audited within the next two years. Your training set population is 1000 filers. The audit rate in your training data is 4.2%. What is the sum of the probabilities that the model assigns to all the filers in your training set that have been audited?

 A. 42 B. 4.2 C. 0.42 D. 0.042

Correct Answer: A
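
The keyed answer can be checked against a property of maximum-likelihood logistic regression: when the model includes an intercept, the fitted probabilities over the whole training set sum to the number of observed positives, which here is 1000 x 0.042 = 42. The sketch below illustrates that property on a hypothetical ten-row data set. The data, the function name fit_logistic, and the Newton-Raphson fitting routine are assumptions made for illustration and are not part of the exam material.

```python
import math

def fit_logistic(xs, ys, iterations=25):
    """Fit p = sigmoid(b0 + b1 * x) by Newton-Raphson (hypothetical helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the 2x2 negative Hessian.
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical training data: one predictor, binary outcome (1 = audited).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
ys = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]

positives = sum(ys)                # observed number of audited filers
total_probability = sum(fitted)    # sum of fitted probabilities, all rows
print(positives, round(total_probability, 6))
```

The two printed values agree up to numerical precision because the intercept's score equation forces the sum of (y - p) to zero at the optimum.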

QUESTION 54

In linear regression modeling, which action can be taken to improve the linearity of the relationship between the dependent and independent variables?

 A. Apply a transformation to a variable B. Use a different statistical package C. Calculate the R-Squared value D. Change the units of measurement on the independent variable

Correct Answer: A
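
As a minimal sketch of why a transformation helps, the example below uses a hypothetical exponential relationship and compares the Pearson correlation before and after taking the logarithm of the dependent variable. The data and the helper name pearson_r are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y grows exponentially with x, so the raw relationship is curved.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

r_raw = pearson_r(xs, ys)                         # curved relationship
r_log = pearson_r(xs, [math.log(y) for y in ys])  # linear after the log transform
print(r_raw, r_log)
```

After the transform, log(y) = log(2) + 0.5 x is exactly linear in x, so the correlation rises to 1 (up to rounding), whereas changing units or software would leave the curvature untouched.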

QUESTION 55

Refer to the exhibit. You are building a decision tree. In this exhibit, four variables are listed with their respective values of info-gain. Based on this information, on which attribute would you expect the next split to be in the decision tree?

 A. Credit Score B. Age C. Income D. Gender

Correct Answer: A
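
The exhibit's info-gain values are not reproduced here, but the selection rule is that the tree splits on the attribute with the largest information gain. The sketch below shows that rule on a hypothetical four-row data set; the rows, the column names, and the helper functions entropy and info_gain are assumptions for illustration and do not reflect the exhibit's numbers.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def info_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical toy data, not taken from the exhibit.
rows = [
    {"credit_score": "low",  "gender": "f", "default": "yes"},
    {"credit_score": "low",  "gender": "m", "default": "yes"},
    {"credit_score": "high", "gender": "f", "default": "no"},
    {"credit_score": "high", "gender": "m", "default": "no"},
]

gains = {a: info_gain(rows, a, "default") for a in ("credit_score", "gender")}
best = max(gains, key=gains.get)  # attribute chosen for the next split
print(gains, best)
```

In this toy data, credit_score separates the classes perfectly (gain of 1 bit) while gender carries no information (gain of 0), so credit_score is selected.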

QUESTION 56

Refer to the exhibit. The graph represents an ROC space with four classifiers labelled A through D. Which point in the graph represents a perfect classification?

 A. S B. P C. Q D. R

Correct Answer: A
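
Without the exhibit, the point letters cannot be checked, but the underlying rule is that a perfect classifier sits at the top-left corner of ROC space: a false positive rate of 0 and a true positive rate of 1. The sketch below computes ROC coordinates from confusion-matrix counts; the counts and the helper name roc_point are hypothetical.

```python
def roc_point(tp, fp, tn, fn):
    """Return (false positive rate, true positive rate) for one classifier."""
    tpr = tp / (tp + fn)  # share of actual positives correctly flagged
    fpr = fp / (fp + tn)  # share of actual negatives wrongly flagged
    return fpr, tpr

# Hypothetical confusion-matrix counts for two classifiers.
perfect = roc_point(tp=50, fp=0, tn=50, fn=0)         # no errors at all
coin_flip = roc_point(tp=25, fp=25, tn=25, fn=25)     # no better than chance
print(perfect, coin_flip)
```

A classifier with no errors lands at (0.0, 1.0), while one that guesses at random lands on the diagonal, here at (0.5, 0.5).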

QUESTION 57

You have been assigned to do a study of the daily revenue effect of a pricing model of online transactions. You have tested all the theoretical models in the previous model planning stage, and all tests have yielded statistically insignificant results. What is your next step?

 A. Report that the results are insignificant, and reevaluate the original business question. B. Run all the models again against a larger sample, leveraging more historical data. C. Move forward on the model with the highest significance scores relative to the others. D. Modify samples used by the models and iterate until a significant result occurs.

Correct Answer: A

QUESTION 58

In data visualization, what is used to focus the audience on a key part of a chart?

 A. Emphasis colors B. Detailed text C. Pastel colors D. A data table

Correct Answer: A

QUESTION 59

When would you prefer a Naive Bayes model to a logistic regression model for classification?

 A. When you are using several categorical input variables with over 1000 possible values each. B. When you need to estimate the probability of an outcome, not just which class it is in. C. When all the input variables are numerical. D. When some of the input variables might be correlated.

Correct Answer: A

QUESTION 60

In which phase of the analytic lifecycle would you expect to spend most of the project time?

 A. Discovery B. Data preparation C. Communicate Results D. Operationalize

Correct Answer: B

