Data Prep Advanced Exam
What to know:
Only Playfair+ Core and Premium members are eligible
You will need the Proxy dataset
This test is designed to take 60 – 90 minutes
You must answer 20 of 25 questions correctly to pass
Following the exam, we’ll review your answers and respond within 3 – 5 business days.
1. Which is an example of building with scale in mind?
A. Minimizing hardcoding and limiting the need for human intervention.
B. Moving calculations earlier in the process to simplify queries.
C. Moving calculations later in the process to reduce storage required.
D. All of the above
2. What is pseudocode?
A. A command that temporarily grants the security access of another user.
B. A programming language.
C. A technique for writing code.
D. All of the above
3. Which icon commonly denotes a database?
A. a
B. b
C. c
D. d
4. Which icon commonly denotes a filter?
A. a
B. b
C. c
D. d
5. When working with client data, which of the following do we NOT need to consider?
A. Integrity of the data
B. Data security
C. How large the dataset is
D. None of the above
6. Which law is most relevant when working with data related to student records?
A. HIPAA
B. GDPR
C. FERPA
D. CCPA
7. Which querying language is most likely to be used to query big data?
A. SQL
B. JSON
C. R
D. Java
8. Which data structure works best for big data?
A. An array
B. A hash table
C. A list
D. All of the above
9. What is the benefit of storing data in a Cube / Parquet format?
A. Efficiency
B. Ease of use
C. Doesn't require setup
D. There aren't any
10. Which dataset would be most likely to be stored in a Cube / Parquet format?
A. Small
B. Medium
C. Large
D. Medium and small
11. What is a benefit of storing data in JSON?
A. JSON can easily hold unstructured data like audio files and images.
B. JSON data is encrypted and more secure.
C. JSON is compatible with most programming languages, making it ideal for transferring data between systems.
D. JSON keeps data more organized, making it easier to navigate long term storage.
12. What is a drawback of JSON?
A. JSON cannot store unstructured data like audio files and images.
B. JSON is not flexible, and requires data to follow a strict schema.
C. JSON data is hard to transfer between systems.
D. All of the above
13. How can you reduce the size of your data?
A. Remove columns
B. Filter rows
C. Aggregate records
D. All of the above
14. What issues could be created with an append-only ETL pipeline?
A. Historical data does not get updated (for example, historical pricing adjustments).
B. The table will continue to add multiple copies of the data set with each load.
C. Some records may be missing.
D. All of the above
15. What are the benefits of using an append or partial refresh?
A. Better backup records
B. Faster run time
C. More accurate data
D. Smaller storage requirements
16. Why is data governance important?
A. To ensure consistency over time and alignment between stakeholders.
B. To ensure data accuracy & quality.
C. To trace where data comes from, ensuring alignment between input and analysis.
D. All of the above
17. What are the benefits of using a view over a table?
A. Eliminates data lags
B. Faster processing
C. Less storage
D. All of the above
18. Which of the following would be a reason to automate a process?
A. Reduces inconsistencies from typos and other human errors.
B. Fewer dependencies: for example, the automated process can still run when the office is closed for holidays.
C. By running the process during off hours, you can balance compute loads.
D. All of the above
19. What is OAuth?
A. A password manager
B. An authorization service
C. A coding language
D. None of the above
20. Which regex statement would you use to find email domains?
A. /\w+$/
B. .\S+
C. [@]\w+[.]\w+
D. {.}
21. Which regex statement would you use to identify phone numbers in the following format: 123-456-7890?
A. \d{3}-\d{3}-\d{4}
B. \d{4}-\d{3}-\d{4}
C. ^(\+\d{1,2}\s?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$
D. None of the above
22. What does the regex statement (\D{8}) do?
A. Parse 8 numbers from the string.
B. Delete the first 8 characters from a string.
C. Retrieve the first 8 characters from a string.
D. Retrieve the first 8 consecutive non-numbers from a string.
23. Which regex snippet would work best to isolate 12-digit order numbers from open-text feedback forms?
A. (\d{12})
B. .\S+12
C. a*{12}
D. a{0,12}
The final two questions in this exam are based on Playfair Data's Proxy dataset. Download the latest version before answering them.
24. Which query could be used to find duplicate orders in the Sales Table?
A. SELECT * FROM Sales WHERE COUNT(OrderID) > 1
B. SELECT * FROM Sales HAVING COUNT(OrderID) > 1
C. SELECT DISTINCT OrderID FROM Sales GROUP BY OrderID HAVING COUNT(OrderID) > 1
D. None of the above
25. Which of these queries would not change the number of rows of the sales table?
A. SELECT * FROM SalesTable LEFT JOIN AttributionTable ON SalesTable.CampaignID = AttributionTable.CampaignID
B. SELECT * FROM SalesTable WHERE Product = 'On Demand'
C. SELECT SUM(Sales), Country FROM SalesTable GROUP BY Country
D. None of the above