It0095 F1
Data exploration is a mandatory initial step, whether or not more formal analysis follows.
2. Which of the following is not one of the major visualizations and operations, according to
data mining goals?
3. Examine scatterplots with added color/panels/size to determine the need for interaction
terms.
4. Data visualization can help determine which variables to include in the analysis and which
might be redundant.
7. Visualization techniques fit into the data mining process, as described so far.
8. Using data conversion techniques, such as converting categorical variables into numerical
variables, is necessary in data exploration.
9. It is the process of transferring continuous functions, models, variables, and equations into
discrete counterparts.
10. Dimension reduction uses data summaries to detect information overlap between
variables (and remove or combine redundant variables or categories)
11. The more data we are dealing with, the greater the chance of encountering erroneous
values resulting from measurement error, data entry error, or the like.
12. It is the process of discovering meaningful correlations, patterns and trends by sifting
through large amounts of data stored in repositories.
15. These are algorithms used when the value of the outcome of interest (e.g., purchase or
no purchase) is known.
16. The more data we are dealing with, the greater the chance of encountering erroneous
values resulting from measurement error, data entry error, or the like.
21. We explore data in order to bring important aspects of that data into focus for further
analysis.
22. The process of visualization methods can help determine, using a sample, which variables
and metrics are useful.
23. Data exploration is essential for understanding
24. Visualization techniques fit into the data mining process, as described so far.
6. The process of reducing the number of random variables under consideration by obtaining a
set of principal variables.
7. The process of visualization methods can help determine, using a sample, which variables
and metrics are useful.
8. It is the science of analytical reasoning supported by interactive visual interfaces.
9. It is the ability to see and interpret (analyze and give meaning to) the visual information that
surrounds us.
10. There are different visualizations and operations that can support data mining tasks.
11. A common task in data mining that examines data where the classification is unknown or
will occur in the future, with the goal of predicting what that classification is or will be.
14. People make better decisions when they’re based on understanding.
15. These are algorithms used where there is no outcome variable to predict or
classify. Hence, there is no "learning" from cases where such an outcome variable is known.
17. It is the ability to see and interpret (analyze and give meaning to) the visual information
that surrounds us.
18. If many different models are being tried out, it is prudent to save a third sample of known
outcomes (the test data) to use with the model finally selected to predict how well it will do.
19. These are algorithms used when the value of the outcome of interest (e.g., purchase or
no purchase) is known.
20. These are values that lie far away from the bulk of the data.
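To make the idea of values that lie far away from the bulk of the data concrete, here is a minimal sketch of one common screening rule, the 1.5 x IQR rule. The function name `iqr_outliers`, the multiplier `k`, and the sample `ages` list are illustrative assumptions, not part of the source material, and other outlier rules are equally valid.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sample: 240 is the kind of value a data entry error produces.
ages = [23, 25, 27, 29, 31, 33, 35, 240]
print(iqr_outliers(ages))
```

A flagged value is only a candidate for review; whether it is an error or a genuine extreme case still requires domain judgment.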
1. Generate a parallel coordinate plot to identify clusters of observations.
2. There are different visualizations and operations that can support data mining tasks.
3. Overlay trend lines of different types to determine adequate modeling choices.
4. We focus on the use of graphical presentations for the purpose of data exploration, in
particular with relation to predictive analytics.
5. Data exploration is essential for understanding
6. Data exploration is the most human-centric step of data science; it allows us to investigate
the characteristics of the data through data visualization.
7. It is about describing the data by means of statistical and visualization techniques.
10. Numerical variables can be handled by most routines, but often require special handling.
15. This data mining technique is more complex, using attributes of data to move them into
discernable categories, helping you draw further conclusions.
16. It is a mix of skills in the areas of statistics, machine learning, math, programming, business,
and IT.
17. Overfitting are values that lie far away from the bulk of the data.
18. Training data are the data from which the classification or prediction algorithm "learns", or
is "trained," about the relationship between predictor variables and the outcome variable.
18. Data reduction is the process of consolidating a large number of variables (or cases) into a
smaller set.
19. The purpose of preparation is to transform data sets so that their information content is
best exposed to the mining tool.
20. The prediction error rate should be lower (or the same) after preparation than before it.
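The items on data reduction and on detecting information overlap between variables can be illustrated with a small sketch that flags highly correlated pairs of numerical variables. Everything named here (`pearson`, `redundant_pairs`, the 0.95 threshold, and the sample columns) is an assumption made for illustration; correlation is only one of several data summaries that could be used for this purpose.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def redundant_pairs(columns, threshold=0.95):
    """Return pairs of column names whose absolute correlation meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if abs(pearson(columns[a], columns[b])) >= threshold
    ]

# Hypothetical data: height is recorded twice, in different units.
data = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "income": [40, 22, 65, 31, 50],
}
print(redundant_pairs(data))
```

A flagged pair suggests that one of the two variables could be removed, or the two combined, before modeling.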
9. The exploration of a dataset, which is the number of variables, must be reduced for the data
mining algorithms to operate efficiently.
11. We explore data in order to bring important aspects of that data into focus for further
analysis.
12. If many different models are being tried out, it is prudent to save a third sample of known
outcomes (the test data) to use with the model finally selected to predict how well it will do.
13. This process of consolidating a large number of variables (or cases) into a smaller set is
termed _____
14. It refers to data visualization and reporting for understanding “what happened and what is
happening.”
15. In variable selection, the more variables we include, the greater the number of records we
will need to assess relationships among the variables.
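The items on training data and on saving a third sample of known outcomes (the test data) describe a three-way partition of the records. A minimal sketch follows, assuming a random 50/30/20 split into training, validation, and test sets; the function name `partition`, the proportions, and the seed are illustrative choices, not values given in the source.

```python
import random

def partition(records, train=0.5, valid=0.3, seed=1):
    """Randomly split records into training, validation, and test lists."""
    rows = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train)
    n_valid = int(len(rows) * valid)
    return (
        rows[:n_train],                   # used to fit each candidate model
        rows[n_train:n_train + n_valid],  # used to compare candidate models
        rows[n_train + n_valid:],         # held back to assess the final model
    )

train_set, valid_set, test_set = partition(range(100))
print(len(train_set), len(valid_set), len(test_set))
```

The test set is touched only once, after a model has been selected, so that its error estimate is not influenced by the selection process.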