Week 12

Revision 2

Question 1

Consider the following string. Which command would you use to replace each x with a blank (whitespace)?

string <- c("169 millimeters x 117 millimeters x 9.1 millimeters")
  • A. chartr(string, x)
  • B. chartr(string, "x", "~")
  • C. chartr(string, old = "x", new=" ")
  • D. chartr(string, "x", " - ")

CORRECT ANSWER: C
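
A minimal sketch of why option C works: chartr() has the signature chartr(old, new, x), so once old and new are named, the string is matched to the remaining x argument. The variable name `cleaned` is introduced here only for illustration.

```r
string <- c("169 millimeters x 117 millimeters x 9.1 millimeters")

# `old` and `new` are named, so `string` is matched to the `x` argument.
cleaned <- chartr(string, old = "x", new = " ")
cleaned

# The same call with every argument in its documented position.
same <- chartr("x", " ", string)
identical(cleaned, same)
```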


Question 2

What is the result of the following R code?

df1 <- c("VIC", "NSW", "TAS", "WA", "SA")
df2 <- c("WA", "SA", "NSW", "TAS", "VIC")
identical(df1, df2)
  • A. TRUE
  • B. FALSE
  • C. "WA", "SA", "NSW"
  • D. "TAS", "VIC"

CORRECT ANSWER: B
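
As a quick check of this answer, the sketch below contrasts identical(), which compares element by element in order, with two order-insensitive comparisons. The setequal() and sort() calls are additions for illustration and are not part of the question.

```r
df1 <- c("VIC", "NSW", "TAS", "WA", "SA")
df2 <- c("WA", "SA", "NSW", "TAS", "VIC")

identical(df1, df2)              # FALSE: same elements, different order
setequal(df1, df2)               # TRUE: order is ignored
identical(sort(df1), sort(df2))  # TRUE once both vectors are sorted
```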


Question 3

Which one of the following is NOT one of the print functions?

  • A. cat()
  • B. print()
  • C. noquote()
  • D. quote()

CORRECT ANSWER: D
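
A short sketch of the difference, assuming a console session: the first three functions display a value, whereas quote() captures an expression without evaluating or printing it. The names `msg` and `expr` are illustrative only.

```r
msg <- "hello"

print(msg)      # prints the string with quotes
noquote(msg)    # prints the string without quotes
cat(msg, "\n")  # writes the raw text to the console

# quote() returns an unevaluated expression rather than printing anything.
expr <- quote(1 + 2)
class(expr)
eval(expr)
```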


Question 4

Which one of the following removes all punctuation from the vector x?

x <- c("hello!", "good-day.", "hi 5:)")
  • A. str_subset(x, "[:alnum:]")
  • B. str_extract(x, "[:alnum:]")
  • C. str_remove(x, "[:punct:]")
  • D. str_replace_all(x, "[:punct:]", "")

CORRECT ANSWER: D
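
The question uses stringr. As a base R stand-in (a swapped-in technique, chosen so the sketch runs without extra packages), sub() removes only the first match in each string, much like str_remove(), while gsub() removes every match, much like str_replace_all(). Note that base R regular expressions write the class as "[[:punct:]]".

```r
x <- c("hello!", "good-day.", "hi 5:)")

# First punctuation mark only: some punctuation survives.
first_only <- sub("[[:punct:]]", "", x)
first_only

# Every punctuation mark removed.
all_gone <- gsub("[[:punct:]]", "", x)
all_gone
```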


Question 5

According to the following code, what will be the result of y?

x <- "Now, I am HAPPY"
y <- length(x)
y
  • A. 4
  • B. 1
  • C. 2
  • D. 5

CORRECT ANSWER: B
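
The sketch below shows why the answer is 1: length() counts the elements of the vector, not characters or words. The nchar() and strsplit() calls are added for contrast, and `words` is an illustrative name.

```r
x <- "Now, I am HAPPY"

length(x)  # 1: x is a character vector with a single element
nchar(x)   # 15: number of characters in that element

# Splitting on spaces gives one element per word.
words <- strsplit(x, " ")[[1]]
length(words)  # 4
```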


Question 6

Which one of the following functions from the lubridate package will convert z into a date format?

z <- c("08.06.2018", "29062018", "23/03/2018", "30-01-2018")
  • A. ymd(z)
  • B. dmy(z)
  • C. ydm(z)
  • D. hms(z)

CORRECT ANSWER: B
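
All four strings are in day-month-year order and differ only in their separators, which is why dmy() is the right choice. The base R sketch below illustrates the same idea without lubridate (a stand-in, not lubridate's parser): strip the separators, then parse with a day-month-year format. The names `digits` and `dates` are illustrative.

```r
z <- c("08.06.2018", "29062018", "23/03/2018", "30-01-2018")

# Drop everything that is not a digit so all entries look like "ddmmyyyy".
digits <- gsub("[^0-9]", "", z)

# Parse as day, month, four-digit year.
dates <- as.Date(digits, format = "%d%m%Y")
dates
```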


Question 7

In which one of the following are values divided by their standard deviation (or root mean square)?

  • A. Box-Cox transformation
  • B. logarithmic transformation
  • C. z-score standardisation
  • D. square root transformation

CORRECT ANSWER: C
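
A minimal sketch of z-score standardisation, using a made-up vector `x` (an assumption for illustration). The manual formula is compared with base R's scale(), which centres by the mean and divides by the standard deviation by default.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # illustrative data

# Manual z-scores: subtract the mean, divide by the standard deviation.
z <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix, so flatten it before comparing.
z_scaled <- as.numeric(scale(x))
all.equal(z, z_scaled)

# Without centring, scale() divides by the root mean square instead.
rms_scaled <- as.numeric(scale(x, center = FALSE))
```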


Question 8

According to the following code, what will be the result of y?

minmaxnormalise <- function(x) {(x - min(x)) / (max(x) - min(x))}
x <- c(5, 4, NA, 2, 5)
y <- minmaxnormalise(x)
y
  • A. 1.00 1.00 NA 1.00 1.00
  • B. 1.00 0.67 NA 0.00 1.00
  • C. NA NA NA NA NA
  • D. 0.00 0.00 NA 1.00 1.00

CORRECT ANSWER: C
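
The result is all NA because min(x) and max(x) return NA when x contains a missing value. The sketch below reproduces this and shows one possible fix, using a hypothetical variant named `minmaxnormalise_narm` that passes na.rm = TRUE; it is not part of the original question.

```r
minmaxnormalise <- function(x) {(x - min(x)) / (max(x) - min(x))}
x <- c(5, 4, NA, 2, 5)

# min() and max() are NA here, so every element becomes NA.
y <- minmaxnormalise(x)
y

# Hypothetical variant: ignore NA when finding the range.
minmaxnormalise_narm <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}
y_narm <- minmaxnormalise_narm(x)
round(y_narm, 2)
```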


Question 9

Which one of the following packages has a function to detect multivariate outliers?

  • A. library(dplyr)
  • B. library(MVN)
  • C. library(tidyr)
  • D. library(validate)

    CORRECT ANSWER: B
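
The MVN package named in option B is not demonstrated on the slide. As a rough base R sketch of one common idea behind multivariate outlier detection (not a reproduction of MVN's own method), squared Mahalanobis distances can be compared with a chi-square cutoff. The built-in mtcars data, the two chosen columns, and the 0.975 quantile are assumptions made only for illustration.

```r
# Two numeric columns from the built-in mtcars data (illustrative choice).
dat <- as.matrix(mtcars[, c("mpg", "hp")])

# Squared Mahalanobis distance of every row from the column means.
d2 <- mahalanobis(dat, center = colMeans(dat), cov = cov(dat))

# Flag rows whose distance exceeds a chi-square quantile (assumed cutoff).
cutoff <- qchisq(0.975, df = ncol(dat))
flagged <- names(d2)[d2 > cutoff]
flagged
```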

Question 10

Which of the following can be used to deal with outliers?

  • A. Capping
  • B. Transforming
  • C. Imputing
  • D. All of them

    CORRECT ANSWER: D

Question 11

Which one of the following is the reason for the error produced by the code below?

df <- data.frame(col1 = c(2, 0 / 0, NA, 1 / 0, -Inf, Inf),
                 col2 = c(NA, Inf / 0, 2 / 0, NaN, -Inf, 4))
is.infinite(df)
  • A. The is.infinite() function accepts only vector input.
  • B. There is no infinite value in the data frame.
  • C. The data frame has missing values.
  • D. There is a division-by-zero problem in the data frame.

    CORRECT ANSWER: A
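
A sketch of the failure and one common workaround: since is.infinite() has no data frame method, it can be applied column by column with sapply(). The names `res` and `inf_flags` are illustrative.

```r
df <- data.frame(col1 = c(2, 0 / 0, NA, 1 / 0, -Inf, Inf),
                 col2 = c(NA, Inf / 0, 2 / 0, NaN, -Inf, 4))

# Calling is.infinite() on the whole data frame raises an error;
# tryCatch() captures the message instead of stopping the script.
res <- tryCatch(is.infinite(df), error = function(e) conditionMessage(e))
res

# Apply the function to each column (a vector) instead.
inf_flags <- sapply(df, is.infinite)
colSums(inf_flags)  # number of infinite values per column
```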

Question 12

Consider the following data frame. Which command would you use to find the total number of missing values in each column?

df <- data.frame(col1 = c(1:3, NA),
                 col2 = c("this", NaN, "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE, TRUE),
                 col4 = c(2.5, 4.2, 3.2, NA))
  • A. sum(is.na(df))
  • B. is.na(df)
  • C. is.nan(df)
  • D. colSums(is.na(df))

    CORRECT ANSWER: D
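
The sketch below runs the options that work on this data frame. One detail worth noting: inside a character vector, NaN is stored as the text "NaN", so col2 has no missing values. The names `per_column` and `overall` are illustrative.

```r
df <- data.frame(col1 = c(1:3, NA),
                 col2 = c("this", NaN, "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE, TRUE),
                 col4 = c(2.5, 4.2, 3.2, NA))

# is.na(df) gives a logical matrix; colSums() counts TRUE per column.
per_column <- colSums(is.na(df))
per_column

# sum() collapses everything into a single total instead.
overall <- sum(is.na(df))
overall
```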

Question 13

According to the following code, what will be the result of y?

x <- c(1:3, NA, 5, NA)
y <- which(is.na(x))
y
  • A. 4 6
  • B. TRUE
  • C. FALSE FALSE FALSE TRUE FALSE TRUE
  • D. NA

    CORRECT ANSWER: A
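
A brief sketch of the steps involved: is.na() returns a logical vector (option C), and which() converts it to the positions that are TRUE. The sum() call is added for contrast.

```r
x <- c(1:3, NA, 5, NA)

is.na(x)              # logical vector, one value per element
y <- which(is.na(x))  # positions of the TRUE values
y
sum(is.na(x))         # how many values are missing
```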

Dataset scenario for Questions 14 & 15

A relational database contains two data sets, namely sales and employees.

The sales data set describes each sale: a sale ID, followed by the salesperson ID, customer ID, product ID, quantity of the item, and payment type. Here is the sales data set:

sales
## # A tibble: 4 x 6
##   sales_id sales_person_id customer_id product_id quantity payment_type
##      <dbl> <chr>                 <dbl>      <dbl>    <dbl> <chr>
## 1      201 A1                        1        102        2 Debit
## 2      202 B3                        2        101        3 Credit
## 3      203 A1                        3        101        1 Cash
## 4      204 A2                        1        103        5 Debit

Dataset scenario for Questions 14 & 15 Cont.

The employees data set lets you look up the first and last name of each salesperson using the salesperson ID. Here is the employees data set:

employees
## # A tibble: 6 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>
## 1 A1              John       Doe
## 2 A2              Jane       Smith
## 3 A3              Micheal    Brown
## 4 B1              Jim        Johnson
## 5 B2              Karen      Wilson
## 6 B3              Kate       Taylor
  • employees connects to sales via the sales_person_id variable.

# Q14: How would you find the names of sales people who made a sale while dropping all the information in the `sales` data set?
employees %>% semi_join(sales)
## Joining, by = "sales_person_id"
## # A tibble: 3 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>
## 1 A1              John       Doe
## 2 A2              Jane       Smith
## 3 B3              Kate       Taylor
# Q15: How would you find the names of sales people who didn't make a sale?
employees %>% anti_join(sales)
## Joining, by = "sales_person_id"
## # A tibble: 3 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>
## 1 A3              Micheal    Brown
## 2 B1              Jim        Johnson
## 3 B2              Karen      Wilson

Question 14

According to the given information, how would you find the names of sales people (employees) who made a sale while dropping all the information in the sales data set?

  • A. anti_join(employees, sales)
  • B. semi_join(employees, sales)
  • C. union(employees, sales)
  • D. bind_cols(employees, sales)

    CORRECT ANSWER: B

Question 15

According to the given information, how would you find the names of sales people who didn't make a sale?

  • A. anti_join(employees, sales)
  • B. semi_join(employees, sales)
  • C. union(employees, sales)
  • D. bind_cols(employees, sales)

    CORRECT ANSWER: A
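
To make the two filtering joins from Questions 14 and 15 concrete, here is a base R sketch that mimics them with %in% (a stand-in for dplyr's semi_join() and anti_join(), so it runs without extra packages). The data frames are rebuilt from the scenario tables, with sales reduced to two columns; the names `made_sale`, `semi_like`, and `anti_like` are illustrative.

```r
sales <- data.frame(
  sales_id = c(201, 202, 203, 204),
  sales_person_id = c("A1", "B3", "A1", "A2")
)
employees <- data.frame(
  sales_person_id = c("A1", "A2", "A3", "B1", "B2", "B3"),
  first_name = c("John", "Jane", "Micheal", "Jim", "Karen", "Kate"),
  last_name = c("Doe", "Smith", "Brown", "Johnson", "Wilson", "Taylor")
)

# TRUE for employees whose ID appears at least once in sales.
made_sale <- employees$sales_person_id %in% sales$sales_person_id

semi_like <- employees[made_sale, ]   # like semi_join(employees, sales)
anti_like <- employees[!made_sale, ]  # like anti_join(employees, sales)
semi_like
anti_like
```

Both results keep only the columns of employees, which is what distinguishes filtering joins from joins that add columns.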

For Questions 16 and 17

  • Picture 1:
  • Picture 2:
  • Picture 3:
  • Picture 4:

Question 16

Consider the id_lookup and ratings data sets. What would be the result of:

ratings %>% left_join(id_lookup)
# OR
left_join(ratings, id_lookup)
  • A. Picture 1
  • B. Picture 2
  • C. Picture 3
  • D. Picture 4

    CORRECT ANSWER: A

Question 17

Consider the id_lookup and ratings data sets. What would be the result of:

id_lookup %>% anti_join(ratings)
# OR
anti_join(id_lookup, ratings)
  • A. Picture 1
  • B. Picture 2
  • C. Picture 3
  • D. Picture 4

    CORRECT ANSWER: D

Question 18

Which one of the following will sort this data frame in ascending order by col2, then col3, then col1?

df <- data.frame(col1 = c(4, 3, 1),
                 col2 = c(81, 12, 4),
                 col3 = c(54, 22, 66))
  • A. df %>% select(col1, col2, col3)
  • B. df %>% filter(col1, col2, col3)
  • C. df %>% arrange(col1, col2, col3)
  • D. df %>% arrange(col2, col3, col1)

    CORRECT ANSWER: D
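
The order of the arguments sets the sort priority: col2 first, with col3 and col1 used only to break ties. As a base R stand-in for dplyr's arrange() (chosen so the sketch runs without extra packages), order() takes its keys in the same priority order. The name `sorted` is illustrative.

```r
df <- data.frame(col1 = c(4, 3, 1),
                 col2 = c(81, 12, 4),
                 col3 = c(54, 22, 66))

# Sort by col2, then col3, then col1, all ascending.
sorted <- df[order(df$col2, df$col3, df$col1), ]
sorted
```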

Question 19

According to the following code, what will be the class of df?

df <- data.frame(col1 = 1:3,
                 col2 = c("this", "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE),
                 col4 = c(25.5, 44.2, 54.9))
df <- as.matrix(df)
class(df)
  • A. list
  • B. vector
  • C. matrix
  • D. data.frame

    CORRECT ANSWER: C
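
A short sketch of what the conversion does. Because a matrix holds a single type, the mixed columns are coerced to character. Note that in R 4.0.0 and later, class() on a matrix reports both "matrix" and "array". The name `m` is illustrative.

```r
df <- data.frame(col1 = 1:3,
                 col2 = c("this", "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE),
                 col4 = c(25.5, 44.2, 54.9))

m <- as.matrix(df)
class(m)   # includes "matrix"
typeof(m)  # "character": every column was coerced to text
dim(m)     # 3 rows, 4 columns
```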

Question 20

According to the following code, what will be the ordering of the levels for y?

y <- factor(c("low", "moderate", "low", "severe", "low", "high", "moderate", "severe"),
            levels = c("low", "moderate", "high", "severe"),
            ordered = TRUE)
y
  • A. moderate < high < severe < low
  • B. low < severe < high < moderate
  • C. low < moderate < high < severe
  • D. severe < high < moderate < low

    CORRECT ANSWER: C
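
The ordering comes from the levels argument, not from the order in which values appear in the data. The sketch below confirms this and shows that comparisons become meaningful once the factor is ordered; the comparison and table() calls are additions for illustration.

```r
y <- factor(c("low", "moderate", "low", "severe", "low", "high", "moderate", "severe"),
            levels = c("low", "moderate", "high", "severe"),
            ordered = TRUE)

levels(y)    # level order as given in `levels`
y[1] < y[4]  # "low" < "severe", so TRUE
table(y)     # counts are reported in level order
```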