Week 12

Revision 2

Question 1

Consider the following string. Which command would you use to replace each x with a blank (whitespace)?

string <- c("169 millimeters x 117 millimeters x 9.1 millimeters")

A. chartr(string, x)
B. chartr(string, "x", "~")
C. chartr(string, old = "x", new=" ")
D. chartr(string, "x", " - ")

CORRECT ANSWER: C
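
A minimal sketch of why option C works: chartr() has the signature chartr(old, new, x), so naming old and new lets the unnamed string fall through to x. The variable name result is ours, not part of the quiz.

```r
string <- c("169 millimeters x 117 millimeters x 9.1 millimeters")

# Named arguments bind old and new; the unnamed string is matched to x.
result <- chartr(string, old = "x", new = " ")
print(result)

# chartr() swaps characters one for one, so the string length is unchanged.
print(nchar(result) == nchar(string))
```

Because the swap is one for one, option D (a three-character replacement for a one-character old) cannot work with chartr().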

Question 2

What is the result of the following R code?

df1 <- c("VIC", "NSW", "TAS", "WA", "SA")
df2 <- c("WA", "SA", "NSW", "TAS", "VIC")
identical(df1, df2)

A. TRUE
B. FALSE
C. "WA", "SA", "NSW"
D. "TAS", "VIC"

CORRECT ANSWER: B
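
As a short illustration, identical() compares element by element and in order, so two vectors with the same values in a different order are not identical. The names same_order, same_set and same_sorted are ours.

```r
df1 <- c("VIC", "NSW", "TAS", "WA", "SA")
df2 <- c("WA", "SA", "NSW", "TAS", "VIC")

same_order  <- identical(df1, df2)              # order matters: FALSE
same_set    <- setequal(df1, df2)               # same values, any order: TRUE
same_sorted <- identical(sort(df1), sort(df2))  # TRUE once both are sorted

print(c(same_order, same_set, same_sorted))
```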

Question 3

Which one of the following is NOT one of the print functions?

A. cat()
B. print()
C. noquote()
D. quote()

CORRECT ANSWER: D
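
A small sketch of the difference, using a made-up string msg: the first three functions display a value, while quote() captures an expression without evaluating or printing it.

```r
msg <- "hello"

print(msg)           # shows the value with quotes and an index
print(noquote(msg))  # shows the value without quotes
cat(msg, "\n")       # writes the raw text to the console

# quote() returns the unevaluated expression as a language object.
expr <- quote(1 + 2)
print(class(expr))   # "call"
print(eval(expr))    # 3, only once it is evaluated explicitly
```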

Question 4

Which one of the following removes all punctuation in the vector x?

x <- c("hello!", "good-day.", "hi 5:)")

A. str_subset(x, "[:alnum:]")
B. str_extract(x, "[:alnum:]")
C. str_remove(x, "[:punct:]")
D. str_replace_all(x, "[:punct:]", "")

CORRECT ANSWER: D
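
The distinction between options C and D is first match versus all matches. The sketch below uses base R's sub() and gsub() as stand-ins for str_remove() and str_replace_all(); that mapping is our analogy, chosen so the example runs without stringr.

```r
x <- c("hello!", "good-day.", "hi 5:)")

# sub() drops only the first punctuation mark in each string.
first_only <- sub("[[:punct:]]", "", x)

# gsub() drops every punctuation mark in each string.
all_removed <- gsub("[[:punct:]]", "", x)

print(first_only)   # "hello"  "goodday."  "hi 5)"
print(all_removed)  # "hello"  "goodday"   "hi 5"
```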

Question 5

According to the following code, what will be the result of y?

x <- "Now, I am HAPPY"
y <- length(x)
y

A. 4
B. 1
C. 2
D. 5

CORRECT ANSWER: B
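
A quick sketch of what each counting function measures; n_chars and n_words are names we introduce for illustration.

```r
x <- "Now, I am HAPPY"

y       <- length(x)                    # number of elements in the vector: 1
n_chars <- nchar(x)                     # number of characters: 15
n_words <- length(strsplit(x, " ")[[1]])  # pieces after splitting on spaces: 4

print(c(y, n_chars, n_words))
```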

Question 6

Which one of the following functions from the lubridate package will convert z into a date format?

z <- c("08.06.2018", "29062018", "23/03/2018", "30-01-2018")

A. ymd(z)
B. dmy(z)
C. ydm(z)
D. hms(z)

CORRECT ANSWER: B
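
Every element of z is in day, month, year order; only the separators differ. The base R sketch below mimics that idea by stripping the separators first. It is an illustration for this particular input, not a description of how lubridate parses dates.

```r
z <- c("08.06.2018", "29062018", "23/03/2018", "30-01-2018")

# Remove everything that is not a digit, leaving ddmmyyyy in each element.
digits_only <- gsub("[^0-9]", "", z)

# Parse as day, month, four-digit year.
dates <- as.Date(digits_only, format = "%d%m%Y")
print(dates)  # "2018-06-08" "2018-06-29" "2018-03-23" "2018-01-30"
```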

Question 7

In which one of the following are values divided by their standard deviation (or root mean square)?

A. Box-Cox transformation
B. logarithmic transformation
C. z-score standardisation
D. square root transformation

CORRECT ANSWER: C
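
A sketch with made-up values showing both divisors mentioned in the question: base R's scale() divides by the standard deviation when the data are centred, and by the root mean square when they are not.

```r
x <- c(2, 4, 6, 8)  # illustrative values, not from the course data

# z-score by hand: subtract the mean, divide by the standard deviation.
z_manual <- (x - mean(x)) / sd(x)
z_scale  <- as.numeric(scale(x))

# Without centring, scale() divides by sqrt(sum(x^2) / (n - 1)).
rms        <- sqrt(sum(x^2) / (length(x) - 1))
rms_manual <- x / rms
rms_scale  <- as.numeric(scale(x, center = FALSE))

print(all.equal(z_manual, z_scale))
print(all.equal(rms_manual, rms_scale))
```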

Question 8

According to the following code, what will be the result of y?

minmaxnormalise <- function(x) {(x - min(x)) / (max(x) - min(x))}
x <- c(5, 4, NA, 2, 5)
y <- minmaxnormalise(x)
y

A. 1.00 1.00 NA 1.00 1.00
B. 1.00 0.67 NA 0.00 1.00
C. NA NA NA NA NA
D. 0.00 0.00 NA 1.00 1.00

CORRECT ANSWER: C
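
The result is all NA because min(x) and max(x) return NA when x contains a missing value. One possible fix, sketched below under the name minmaxnormalise_na (our own, hypothetical helper), is to ignore missing values when finding the range.

```r
minmaxnormalise <- function(x) {(x - min(x)) / (max(x) - min(x))}

# Hypothetical NA-tolerant variant: compute the range without the NAs.
minmaxnormalise_na <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

x <- c(5, 4, NA, 2, 5)
y_original <- minmaxnormalise(x)     # NA NA NA NA NA
y_fixed    <- minmaxnormalise_na(x)  # 1.00 0.67 NA 0.00 1.00

print(y_original)
print(round(y_fixed, 2))
```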

Question 9

Which one of the following packages has a function to detect multivariate outliers?

A. library(dplyr)
B. library(MVN)
C. library(tidyr)
D. library(validate)

CORRECT ANSWER: B
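
To give a feel for what multivariate outlier detection involves, here is a base R sketch using Mahalanobis distances with a chi-square cutoff. This is a swapped-in, simplified technique on made-up data, not the MVN package's interface or implementation; the 0.975 quantile is an assumed threshold.

```r
# Ten points on the line y = x, plus one planted point far from that line.
x <- c(1:10, 5.5)
y <- c(1:10, 0.5)
X <- cbind(x, y)

# Squared Mahalanobis distance of every row from the centre of the data.
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Flag rows whose distance exceeds the chi-square quantile (df = 2 variables).
cutoff  <- qchisq(0.975, df = ncol(X))
flagged <- which(d2 > cutoff)
print(flagged)  # 11, the planted point
```

Neither coordinate of the planted point is extreme on its own; it stands out only when both variables are considered together.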

Question 10

Which of the following can be used to deal with outliers?

A. Capping
B. Transforming
C. Imputing
D. All of them

CORRECT ANSWER: D
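
As one example of these approaches, the sketch below caps values at the Tukey fences (1.5 times the IQR beyond the quartiles). The function name cap_outliers, the choice of fences as the caps, and the data are all assumptions made for illustration.

```r
# Hypothetical helper: pull values outside the fences back to the fences.
cap_outliers <- function(x) {
  q     <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  iqr   <- q[2] - q[1]
  lower <- unname(q[1] - 1.5 * iqr)
  upper <- unname(q[2] + 1.5 * iqr)
  pmin(pmax(x, lower), upper)
}

x      <- c(1, 2, 3, 4, 100)  # made-up data with one extreme value
capped <- cap_outliers(x)
print(capped)  # 1 2 3 4 7
```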

Question 11

Which one of the following is the reason for the error produced by the code below?

df <- data.frame(col1 = c(2, 0 / 0, NA, 1 / 0,-Inf, Inf),
                 col2 = c(NA, Inf / 0, 2 / 0, NaN,-Inf, 4))
is.infinite(df)

A. The is.infinite() function accepts only vector input.
B. There is no infinite value in the data frame.
C. The data frame has missing values.
D. There is a division-by-zero problem in the data frame.

CORRECT ANSWER: A
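
One way around the error, sketched here, is to apply is.infinite() to each column in turn with sapply(). The names inf_flags and inf_per_col are ours.

```r
df <- data.frame(col1 = c(2, 0 / 0, NA, 1 / 0, -Inf, Inf),
                 col2 = c(NA, Inf / 0, 2 / 0, NaN, -Inf, 4))

# Each column is a vector, so is.infinite() works column by column.
inf_flags   <- sapply(df, is.infinite)  # logical matrix, one column per variable
inf_per_col <- colSums(inf_flags)

print(inf_flags)
print(inf_per_col)  # col1 = 3, col2 = 3
```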

Question 12

Consider the following data frame. What command would you use to find the total number of missing values in each column?

df <- data.frame(col1 = c(1:3, NA),
                 col2 = c("this", NaN, "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE, TRUE),
                 col4 = c(2.5, 4.2, 3.2, NA))

A. sum(is.na(df))
B. is.na(df)
C. is.nan(df)
D. colSums(is.na(df))

CORRECT ANSWER: D
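
A short sketch contrasting options A and D on the quiz data. Note that the NaN in col2 sits inside a character vector, so it is stored as the text "NaN" and is not counted as missing.

```r
df <- data.frame(col1 = c(1:3, NA),
                 col2 = c("this", NaN, "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE, TRUE),
                 col4 = c(2.5, 4.2, 3.2, NA))

total_missing   <- sum(is.na(df))      # one number for the whole data frame: 2
missing_per_col <- colSums(is.na(df))  # one count per column: 1 0 0 1

print(total_missing)
print(missing_per_col)
```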

Question 13

According to the following code, what will be the result of y?

x <- c(1:3, NA, 5, NA)
y <- which(is.na(x))
y

A. 4 6
B. TRUE
C. FALSE FALSE FALSE TRUE FALSE TRUE
D. NA

CORRECT ANSWER: A
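
For comparison, a sketch of what each step returns: is.na() gives a logical vector (option C), and which() turns it into positions. The names na_flags and n_missing are ours.

```r
x <- c(1:3, NA, 5, NA)

na_flags  <- is.na(x)         # FALSE FALSE FALSE TRUE FALSE TRUE
y         <- which(na_flags)  # positions of the TRUE values: 4 6
n_missing <- sum(na_flags)    # how many values are missing: 2

print(y)
print(n_missing)
```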

Dataset scenario for Questions 14 & 15

A relational database contains two data sets, namely sales and employees.

The sales data set describes each sale: a sales id, followed by the salesperson id, customer id, product id, quantity of the item and payment type. Here is the sales data set:

sales

## # A tibble: 4 x 6
##   sales_id sales_person_id customer_id product_id quantity payment_type
##      <dbl> <chr>                 <dbl>      <dbl>    <dbl> <chr>       
## 1      201 A1                        1        102        2 Debit       
## 2      202 B3                        2        101        3 Credit      
## 3      203 A1                        3        101        1 Cash        
## 4      204 A2                        1        103        5 Debit

The employees data set allows you to look up the name and surname of the sales person using the sales person id. Here is the employees data set:

employees

## # A tibble: 6 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>    
## 1 A1              John       Doe      
## 2 A2              Jane       Smith    
## 3 A3              Micheal    Brown    
## 4 B1              Jim        Johnson  
## 5 B2              Karen      Wilson   
## 6 B3              Kate       Taylor

employees connects to sales via the sales_person_id variable.

# Q14: How would you find the names of sales people who made a sale while dropping all the information in the `sales` data set?
employees %>% semi_join(sales)

## Joining, by = "sales_person_id"

## # A tibble: 3 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>    
## 1 A1              John       Doe      
## 2 A2              Jane       Smith    
## 3 B3              Kate       Taylor

# Q15: How would you find the names of sales people who didn't make a sale?
employees %>% anti_join(sales)

## Joining, by = "sales_person_id"

## # A tibble: 3 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>    
## 1 A3              Micheal    Brown    
## 2 B1              Jim        Johnson  
## 3 B2              Karen      Wilson

Question 14

According to the given information, how would you find the names of sales people (employees) who made a sale while dropping all the information in the sales data set?

A. anti_join(employees, sales)
B. semi_join(employees, sales)
C. union(employees, sales)
D. bind_cols(employees, sales)

CORRECT ANSWER: B

Question 15

According to the given information, how would you find the names of sales people who didn't make a sale?

A. anti_join(employees, sales)
B. semi_join(employees, sales)
C. union(employees, sales)
D. bind_cols(employees,sales)

CORRECT ANSWER: A
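
Both semi_join() and anti_join() are filtering joins: they keep or drop rows of employees and add no columns from sales. The base R sketch below reproduces that behaviour with %in% so it runs without dplyr; only the key column of sales is recreated, and the names made_sale and no_sale are ours.

```r
sales <- data.frame(sales_id = 201:204,
                    sales_person_id = c("A1", "B3", "A1", "A2"),
                    stringsAsFactors = FALSE)

employees <- data.frame(
  sales_person_id = c("A1", "A2", "A3", "B1", "B2", "B3"),
  first_name = c("John", "Jane", "Micheal", "Jim", "Karen", "Kate"),
  last_name  = c("Doe", "Smith", "Brown", "Johnson", "Wilson", "Taylor"),
  stringsAsFactors = FALSE)

# TRUE for employees whose id appears at least once in sales.
has_sale <- employees$sales_person_id %in% sales$sales_person_id

made_sale <- employees[has_sale, ]   # like semi_join(employees, sales)
no_sale   <- employees[!has_sale, ]  # like anti_join(employees, sales)

print(made_sale)
print(no_sale)
```

John Doe appears once in made_sale even though he has two sales, which is the same behaviour semi_join() shows in the output above.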

For Questions 16 and 17

Picture 1:

Picture 2:

Picture 3:

Picture 4:

Question 16

Consider the id_lookup and ratings data sets. What would be the result of:

ratings %>% left_join(id_lookup)
#OR
left_join(ratings, id_lookup)

A. Picture 1
B. Picture 2
C. Picture 3
D. Picture 4

CORRECT ANSWER: A

Question 17

Consider the id_lookup and ratings data sets. What would be the result of:

id_lookup %>% anti_join(ratings)
#OR
anti_join(id_lookup, ratings)

A. Picture 1
B. Picture 2
C. Picture 3
D. Picture 4

CORRECT ANSWER: D

Question 18

Which one of the following will order this data frame in ascending order by col2, col3 and col1, respectively?

df <- data.frame(col1 = c(4, 3, 1),
                 col2 = c(81, 12, 4),
                 col3 = c(54, 22, 66))

A. df %>% select(col1, col2, col3)
B. df %>% filter(col1, col2, col3)
C. df %>% arrange(col1, col2, col3)
D. df %>% arrange(col2, col3, col1)

CORRECT ANSWER: D
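
The order of the column names is the sort priority: col2 first, with col3 and col1 used only to break ties. A base R sketch of the same ordering, using order() in place of arrange() so it runs without dplyr (the name sorted_df is ours):

```r
df <- data.frame(col1 = c(4, 3, 1),
                 col2 = c(81, 12, 4),
                 col3 = c(54, 22, 66))

# order() sorts by its first argument, then uses later ones to break ties.
sorted_df <- df[order(df$col2, df$col3, df$col1), ]
print(sorted_df)  # rows now run col2 = 4, 12, 81
```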

Question 19

According to the following code, what will be the class of df?

df <- data.frame(col1 = 1:3,
                 col2 = c("this", "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE),
                 col4 = c(25.5, 44.2, 54.9))
df <- as.matrix(df)
class(df)

A. list
B. vector
C. matrix
D. data.frame

CORRECT ANSWER: C
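
A sketch of what the conversion does. A matrix holds a single type, so the mixed columns are all coerced to character. In R 4.0.0 and later, class() on a matrix returns both "matrix" and "array"; the answer above refers to the first of these. The name m is ours.

```r
df <- data.frame(col1 = 1:3,
                 col2 = c("this", "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE),
                 col4 = c(25.5, 44.2, 54.9))

m <- as.matrix(df)

print(class(m))   # includes "matrix"
print(typeof(m))  # "character": every cell was coerced to text
print(dim(m))     # 3 rows, 4 columns
```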

Question 20

According to the following code, what will be the ordering of the levels for y?

y <- factor(c("low", "moderate", "low", "severe", "low", "high", "moderate", "severe"), 
             levels = c("low" , "moderate", "high" , "severe"), 
             ordered = TRUE) 
y

A. moderate < high < severe < low
B. low < severe < high < moderate
C. low < moderate < high < severe
D. severe < high < moderate < low

CORRECT ANSWER: C
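
The ordering comes from the levels argument, not from the data or the alphabet. A short sketch of what an ordered factor then allows; below_high and worst are names we introduce.

```r
y <- factor(c("low", "moderate", "low", "severe", "low", "high", "moderate", "severe"),
            levels = c("low", "moderate", "high", "severe"),
            ordered = TRUE)

print(levels(y))  # "low" "moderate" "high" "severe"

# Ordered factors support comparisons and max()/min().
below_high <- sum(y < "high")        # the low and moderate entries: 5
worst      <- as.character(max(y))   # "severe"

print(below_high)
print(worst)
```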
