Data Engineering Practitioner Exam
What to know:
Only Playfair+ Core and Premium members are eligible
You will need the Playfair+ Exam Dataset
This test is designed to take 60 – 90 minutes
You must answer 20 of 25 questions correctly to pass
Following the exam, we’ll review your answers and respond within 3 – 5 business days.
Which of the following does not apply to conditional formatting?
A. It produces rudimentary data visualization
B. It can be a tool to clean datasets
C. It can be used to train machine learning models
You are tasked with performing an analysis on a dataset. What is the preferred data structure for this dataset?
A. Structured
B. Unstructured
True or False: If you perform a join on two datasets, you can have more records than you started with.
A. True
B. False
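For reference, a minimal sketch of how a join's row count relates to its inputs. The two tables and the `inner_join` helper are hypothetical, not part of the exam dataset; the point is only that when a key repeats on both sides, every matching pair becomes an output row.

```python
# Hypothetical tables: the key "k1" appears twice on the left and three times on the right.
left = [("k1", "a"), ("k1", "b")]
right = [("k1", "x"), ("k1", "y"), ("k1", "z")]


def inner_join(left_rows, right_rows):
    """Pair each left row with every right row that shares its key."""
    return [
        (left_key, left_val, right_val)
        for left_key, left_val in left_rows
        for right_key, right_val in right_rows
        if left_key == right_key
    ]


joined = inner_join(left, right)
print(len(left), len(right), len(joined))
```

With a strict one-to-one key, the same helper would return at most as many rows as the smaller table.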
Which of these data types would best be used to classify the value ‘HELLO’?
A. String
B. Float
C. Integer
D. Boolean
The process of using ETL to configure a dataset into the proper output for analysis is a key part of data engineering. What does the term ETL stand for?
A. Extract, Transform, Load
B. Expose, Transform, Lift
C. Extract, Transpose, Load
D. Examine, Transpose, Line up
Which of the following could be a use case for a Primary Key?
A. To clean a dataset
B. To optimize a dataset
C. To join one or more datasets
D. To assign a unique identifier to each record
E. All of the above
If you are provided a dataset with 5,000,000 rows, which of the following outputs would not work?
A. Database
B. Excel
C. Tableau Hyper
D. Data Cloud
APIs are used commonly to extract data from various sources. What does API stand for?
A. Automated Programming Interaction
B. Automatically Processed Input
C. Application Processing Interface
D. Application Programming Interface
What is the high-level goal of a data pipeline?
A. To pull and process data into the intended output
B. To make insights on a dataset
C. To optimize a database
D. To create data visualizations
What is a data lake and when would it be used?
A. Hierarchically organized dataset, when you have a large dataset
B. Raw unstructured host of data, when you have a large dataset
C. Raw unstructured host of data, when you have a small dataset
D. Hierarchically organized dataset, when you have a small dataset
Which would create more rows in the final dataset, a union or a join? (Assume there are no duplicates and the two datasets have a one-to-one relationship.)
A. Union
B. Join
You have Table A and Table B. You are doing an INNER JOIN on the tables. What will be the result?
A. All records from both Table A and Table B
B. All records from Table A and only matching records from Table B
C. Only matching records from both Table A and Table B
D. All records from Table B and only matching records from Table A
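As an illustration of join types, here is a small sketch using Python's built-in `sqlite3` module. The tables `table_a` and `table_b`, their columns, and their rows are made up for this example. It runs an INNER JOIN next to a LEFT JOIN so the difference in returned rows is visible.

```python
import sqlite3

# In-memory database with two hypothetical tables that share ids 2 and 3.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_a (id INTEGER, a_val TEXT);
    CREATE TABLE table_b (id INTEGER, b_val TEXT);
    INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table_b VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
    """
)

# INNER JOIN: rows are returned only where the id exists in both tables.
inner_rows = conn.execute(
    "SELECT a.id, a.a_val, b.b_val FROM table_a AS a "
    "INNER JOIN table_b AS b ON a.id = b.id ORDER BY a.id"
).fetchall()

# LEFT JOIN, for contrast: every row of table_a, with NULL where table_b has no match.
left_rows = conn.execute(
    "SELECT a.id, a.a_val, b.b_val FROM table_a AS a "
    "LEFT JOIN table_b AS b ON a.id = b.id ORDER BY a.id"
).fetchall()

print(inner_rows)
print(left_rows)
```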
What is the result of the following formula when Order Date = 11/8/2021 and Ship Date = 11/11/2021? Order Date >= Ship Date
A. TRUE
B. FALSE
C. 11/8/2021
D. 11/11/2021
Which of these data fields is PII (Personally Identifiable Information)?
A. Income
B. Full Name
C. IP Address
D. Date of Birth
E. Phone Number
F. All of the above
Using SQL, how would you select all of the columns in a dataset?
A. SELECT ALL COLUMNS
B. SELECT ALL
C. SELECT TOTAL
D. SELECT *
Which of the following cannot be used as a delimiter in a CSV File?
A. Space
B. Comma
C. Column
D. Pipe
True or False: A NULL and a Blank are the same within a database.
A. True
B. False
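A quick way to explore how NULL and an empty string behave is to store both and query them separately. The sketch below uses SQLite through Python's `sqlite3` module with a hypothetical table `t`; behavior can differ by engine (Oracle, for example, treats an empty string as NULL), so treat this as one data point rather than a universal rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, note TEXT)")
# Row 1 stores NULL, row 2 stores an empty string, row 3 stores ordinary text.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, None), (2, ""), (3, "text")],
)

# IS NULL matches only the missing value.
null_ids = [row[0] for row in conn.execute("SELECT id FROM t WHERE note IS NULL")]
# An equality test against '' matches only the empty string.
blank_ids = [row[0] for row in conn.execute("SELECT id FROM t WHERE note = ''")]

print(null_ids, blank_ids)
```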
If Postal Code = 42420 and Region = South, which of the following IF statements would produce ‘X’?
A. IF Postal Code = '90032' OR '42420' THEN 'X' END
B. IF Region = 'West' AND Postal Code = '42420' THEN 'X' ELSE 'N/A' END
C. IF Region = 'West' OR Postal Code = '42420' THEN 'X' END
D. IF Region = 'West' OR Postal Code = '42420' THEN 'X' ELSE 'N/A' END
If Sales = $261, $731, $958, and $49, which of the following would produce $500?
A. SUM(Sales)
B. AVG(Sales)
C. MAX(Sales)
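The three aggregates can be checked by hand or with a few lines of Python. This sketch applies the built-in `sum` and `max` and `statistics.mean` to the four Sales values from the question; the variable names are illustrative only.

```python
from statistics import mean

# The four Sales values given in the question, in dollars.
sales = [261, 731, 958, 49]

total = sum(sales)      # equivalent of SUM(Sales)
average = mean(sales)   # equivalent of AVG(Sales)
largest = max(sales)    # equivalent of MAX(Sales)

print(total, average, largest)
```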
What is the result of DATEADD(month, 2, '2021-11-20')?
A. 2022-01-01
B. 2022-01-20
C. 2021-11-22
D. 2021-11-01
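Month arithmetic of this kind can be sketched in Python, whose standard library has no direct DATEADD equivalent. The `add_months` helper below is hypothetical; it assumes the day of month is kept and clamped to the last day of the target month when needed, which is how SQL Server's DATEADD handles month ends.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return d shifted by a number of months, clamping the day to the month's end."""
    month_index = d.month - 1 + months           # zero-based month count from January
    year = d.year + month_index // 12            # carry whole years
    month = month_index % 12 + 1                 # back to a 1-12 month number
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


shifted = add_months(date(2021, 11, 20), 2)
print(shifted.isoformat())
```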
If Order Date = 1/3/2019, 1/4/2019, 1/5/2019, 1/6/2019, how was the data sorted?
A. Descending
B. Ascending
Which of the following is not an intended use of DISTINCT?
A. Remove duplicates
B. Take each unique instance of a value
C. Take most common instance of a value
What transformation steps would you use to get each unique category of a dataset and the sum of sales for each category?
A. DISTINCT, SUM
B. GROUP BY, AVG
C. GROUP BY, SUM
D. COUNT DISTINCT, SUM
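To see a per-category total in practice, the sketch below runs a grouped aggregation in SQLite through Python's `sqlite3` module. The `sales` table and its rows are invented for the example and are not taken from the exam dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (category TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('Furniture', 100.0),
        ('Furniture', 50.0),
        ('Technology', 200.0),
        ('Office Supplies', 25.0);
    """
)

# One output row per category, with that category's amounts added together.
totals = conn.execute(
    "SELECT category, SUM(amount) FROM sales GROUP BY category ORDER BY category"
).fetchall()

print(totals)
```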
The following questions must be answered using the provided dataset. If you haven't already downloaded the dataset, use the link below.
Download here
What were the average sales in the Furniture category for the second-best-performing region (by total overall sales)?
A. $351
B. $299
C. $309
D. $311
What is the Order ID for the second largest order (in terms of Profit) on 9/4 in the East Region?
A. CA-2021-120761
B. CA-2021-109757
C. CA-2021-109757
D. US-2021-166394
Let us know who is taking the test:
First Name (Required)
Last Name (Required)
Playfair+ Email (Required)
Phone
Company Name