Results for "DP-500 Practice Test 4"
Question 1 of 75
1. Question
You are designing a streaming data solution that will ingest variable volumes of data. You need to ensure that you can change the partition count after creation. Which service should you use to ingest the data?
You are designing a date dimension table in an Azure Synapse Analytics dedicated SQL pool. The date dimension table will be used by all the fact tables. Which distribution type should you recommend to minimize data movement?
Correct
A replicated table has a full copy of the table available on every Compute node. Queries run fast on replicated tables since joins on replicated tables don't require data movement. Replication requires extra storage, though, and isn't practical for large tables. Incorrect Answers: A: A hash-distributed table is designed to achieve high performance for queries on large tables. C: A round-robin table distributes table rows evenly across all distributions. The rows are distributed randomly. Loading data into a round-robin table is fast. Keep in mind that queries can require more data movement than the other distribution methods. Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview
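To make the recommendation concrete, a minimal sketch of a replicated date dimension in a dedicated SQL pool is shown below; the table and column names are illustrative, not taken from the question.
CREATE TABLE dbo.DimDate
(
    DateKey      int      NOT NULL,
    CalendarDate date     NOT NULL,
    CalendarYear smallint NOT NULL,
    MonthNumber  tinyint  NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,     -- full copy on every Compute node, so joins need no data movement
    CLUSTERED COLUMNSTORE INDEX
);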
Question 3 of 75
3. Question
You are designing a security model for an Azure Synapse Analytics dedicated SQL pool that will support multiple companies. You need to ensure that users from each company can view only the data of their respective company. Which two objects should you include in the solution? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
You have a SQL pool in Azure Synapse that contains a table named dbo.Customers. The table contains a column named Email. You need to prevent nonadministrative users from seeing the full email addresses in the Email column. The users must see values in a format of [email protected] instead. What should you do?
You are designing an Azure Synapse solution that will provide a query interface for the data stored in an Azure Storage account. The storage account is only accessible from a virtual network. You need to recommend an authentication mechanism to ensure that the solution can access the source data. What should you recommend?
You are developing an application that uses Azure Data Lake Storage Gen2. You need to recommend a solution to grant permissions to a specific application for a limited time period. What should you include in the recommendation?
Correct
A shared access signature (SAS) provides secure delegated access to resources in your storage account. With a SAS, you have granular control over how a client can access your data. For example: What resources the client may access. What permissions they have to those resources. How long the SAS is valid. Reference: https://docs.microsoft.com/en-us/azure/storage/common/storage-sas-overview
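In a Synapse context, a SAS is typically supplied to the pool through a database scoped credential that an external data source then references. The following is a hedged sketch only; the credential name, storage URL, and SAS token are placeholders, not values from the question.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL SasCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = 'sv=...&ss=b&srt=co&sp=rl&se=...';   -- SAS token, without the leading '?'

CREATE EXTERNAL DATA SOURCE SalesLake
WITH (
    LOCATION = 'https://contosostorage.blob.core.windows.net/sales',
    CREDENTIAL = SasCredential
);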
Question 7 of 75
7. Question
You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Contacts. Contacts contains a column named Phone. You need to ensure that users in a specific role only see the last four digits of a phone number when querying the Phone column. What should you include in the solution?
Correct
Dynamic data masking helps prevent unauthorized access to sensitive data by enabling customers to designate how much of the sensitive data to reveal with minimal impact on the application layer. It's a policy-based security feature that hides the sensitive data in the result set of a query over designated database fields, while the data in the database is not changed. Reference: https://docs.microsoft.com/en-us/azure/azure-sql/database/dynamic-data-masking-overview
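A minimal sketch of such a mask, assuming the Contacts table from the question and a hypothetical role that is allowed to see full numbers, might look like this:
-- Show only the last four digits of the phone number to non-privileged users
ALTER TABLE dbo.Contacts
ALTER COLUMN Phone ADD MASKED WITH (FUNCTION = 'partial(0, "XXX-XXX-", 4)');

-- Grant the ability to see unmasked values only to the hypothetical privileged role
GRANT UNMASK TO SupportManagers;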
Question 8 of 75
8. Question
A company purchases IoT devices to monitor manufacturing machinery. The company uses an Azure IoT Hub to communicate with the IoT devices. The company must be able to monitor the devices in real-time. You need to design the solution. What should you recommend?
Correct
In a real-world scenario, you could have hundreds of these sensors generating events as a stream. Ideally, a gateway device would run code to push these events to Azure Event Hubs or Azure IoT Hubs. Your Stream Analytics job would ingest these events from Event Hubs and run real-time analytics queries against the streams. Create a Stream Analytics job: In the Azure portal, select + Create a resource from the left navigation menu. Then, select Stream Analytics job from Analytics. Reference: https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-get-started-with-azure-stream-analytics-to-process-data-from-iot-devices
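A minimal sketch of a Stream Analytics query for such a job is shown below; the input and output aliases and the field names (deviceId, temperature, eventTime) are assumptions for the example, not part of the question.
SELECT
    deviceId,
    AVG(temperature) AS avgTemperature,
    System.Timestamp() AS windowEnd
INTO
    powerbioutput
FROM
    iothubinput TIMESTAMP BY eventTime
GROUP BY
    deviceId,
    TumblingWindow(second, 30)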
Question 9 of 75
9. Question
You are designing a dimension table for a data warehouse. The table will track the value of the dimension attributes over time and preserve the history of the data by adding new rows as the data changes. Which type of slowly changing dimension (SCD) should you use?
Correct
A Type 2 SCD supports versioning of dimension members. Often the source system doesn't store versions, so the data warehouse load process detects and manages changes in a dimension table. In this case, the dimension table must use a surrogate key to provide a unique reference to a version of the dimension member. It also includes columns that define the date range validity of the version (for example, StartDate and EndDate) and possibly a flag column (for example, IsCurrent) to easily filter by current dimension members. Incorrect Answers: B: A Type 1 SCD always reflects the latest values, and when changes in source data are detected, the dimension table data is overwritten. D: A Type 3 SCD supports storing two versions of a dimension member as separate columns. The table includes a column for the current value of a member plus either the original or previous value of the member. So Type 3 uses additional columns to track one key instance of history, rather than storing additional rows to track each change like in a Type 2 SCD. Reference: https://docs.microsoft.com/en-us/learn/modules/populate-slowly-changing-dimensions-azure-synapse-analytics-pipelines/3-choose-between-dimension-types
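A minimal sketch of a Type 2 dimension table in a dedicated SQL pool, with assumed column names for the surrogate key, validity range, and current-row flag, could be:
CREATE TABLE dbo.DimCustomer
(
    CustomerSK   int           NOT NULL,   -- surrogate key for each version of the member
    CustomerID   nvarchar(20)  NOT NULL,   -- business key from the source system
    CustomerName nvarchar(100) NOT NULL,
    City         nvarchar(50)  NULL,
    StartDate    date          NOT NULL,   -- validity range of this version
    EndDate      date          NULL,
    IsCurrent    bit           NOT NULL    -- flag to filter current members easily
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED COLUMNSTORE INDEX
);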
Question 10 of 75
10. Question
You have data stored in thousands of CSV files in Azure Data Lake Storage Gen2. Each file has a header row followed by properly formatted carriage return (\r) and line feed (\n) characters.
You are implementing a pattern that batch loads the files daily into an enterprise data warehouse in Azure Synapse Analytics by using PolyBase.
You need to skip the header row when you import the files into the data warehouse. Before building the loading pattern, you need to prepare the required database objects in Azure Synapse Analytics.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
NOTE: Each correct selection is worth one point
Select and Place:
Correct
"Create external table as select (CETAS)" makes no sense in this case because we would need to include a SELECT to populate the external table; however, this data must come from files and not from other tables. An "external table" is not the same as an "external table as select": with the first, the data comes from files, while with the second, the data comes from a SQL query and is exported into files. https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-external-tables?tabs=hadoop#create-external-data-source
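Put together, the three database objects might be created in the order sketched below. The storage location, object names, and column list are assumptions for illustration, and any required credential is omitted; the FIRST_ROW option in the file format is what skips the header row.
CREATE EXTERNAL DATA SOURCE LakeSource
WITH (
    TYPE = HADOOP,
    LOCATION = 'abfss://files@contosolake.dfs.core.windows.net'
);

CREATE EXTERNAL FILE FORMAT CsvSkipHeader
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2)   -- start reading at row 2 to skip the header
);

CREATE EXTERNAL TABLE ext.DailySales
(
    SaleId     int,
    SaleAmount decimal(18, 2)
)
WITH (
    LOCATION = '/sales/',
    DATA_SOURCE = LakeSource,
    FILE_FORMAT = CsvSkipHeader
);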
Question 11 of 75
11. Question
You are building an Azure Synapse Analytics dedicated SQL pool that will contain a fact table for transactions from the first half of the year 2020.
You need to ensure that the table meets the following requirements:
Minimizes the processing time to delete data that is older than 10 years
Minimizes the I/O for queries that use year-to-date values
How should you complete the Transact-SQL statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: PARTITION
RANGE RIGHT FOR VALUES is used with PARTITION.
Box 2: [TransactionDateID]
Partition on the date column.
Example: Creating a RANGE RIGHT partition function on a datetime column
The following partition function partitions a table or index into 12 partitions, one for each month of a year's worth of values in a datetime column.
CREATE PARTITION FUNCTION [myDateRangePF1] (datetime)
AS RANGE RIGHT FOR VALUES ('20030201', '20030301', '20030401',
'20030501', '20030601', '20030701', '20030801',
'20030901', '20031001', '20031101', '20031201');
Reference: https://docs.microsoft.com/en-us/sql/t-sql/statements/create-partition-function-transact-sql
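Applying the same pattern to the fact table in the question, a hedged sketch of the completed CREATE TABLE statement, with illustrative column names and monthly boundary values for the first half of 2020, would resemble:
CREATE TABLE dbo.FactTransactions
(
    TransactionID     int   NOT NULL,
    TransactionDateID int   NOT NULL,
    Amount            money NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(TransactionID),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION ( TransactionDateID RANGE RIGHT FOR VALUES
                (20200201, 20200301, 20200401, 20200501, 20200601) )
);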
Question 12 of 75
12. Question
You are performing exploratory analysis of the bus fare data in an Azure Data Lake Storage Gen2 account by using an Azure Synapse Analytics serverless SQL pool.
You execute the Transact-SQL query shown in the following exhibit.
What do the query results include?
Correct
Only CSV files whose names begin with "tripdata_2020" are included in the results.
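That filtering behavior comes from the wildcard in the BULK path. A minimal sketch of such a serverless SQL pool query, with an assumed storage path, is:
SELECT TOP 100 *
FROM OPENROWSET(
        BULK 'https://contosolake.dfs.core.windows.net/fares/csv/tripdata_2020*.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        HEADER_ROW = TRUE
     ) AS rows;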
Question 13 of 75
13. Question
You are processing streaming data from vehicles that pass through a toll booth.
You need to use Azure Stream Analytics to return the license plate, vehicle make, and hour the last vehicle passed during each 10-minute window.
How should you complete the query? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: MAX
The first step of the query finds the maximum time stamp in 10-minute windows, that is, the time stamp of the last event for that window. The second step joins the results of the first query with the original stream to find the events that match the last time stamps in each window.
Query:
WITH LastInWindow AS
(
SELECT
MAX(Time) AS LastEventTime
FROM
Input TIMESTAMP BY Time
GROUP BY
TumblingWindow(minute, 10)
)
SELECT
Input.License_plate,
Input.Make,
Input.Time
FROM
Input TIMESTAMP BY Time
INNER JOIN LastInWindow
ON DATEDIFF(minute, Input, LastInWindow) BETWEEN 0 AND 10
AND Input.Time = LastInWindow.LastEventTime
Box 2: TumblingWindow
Tumbling windows are a series of fixed-sized, non-overlapping and contiguous time intervals.
Box 3: DATEDIFF
DATEDIFF is a date-specific function that compares and returns the time difference between two DateTime fields; for more information, refer to date functions.
Reference: https://docs.microsoft.com/en-us/stream-analytics-query/tumbling-window-azure-stream-analytics
Question 14 of 75
14. Question
A company plans to use Platform-as-a-Service (PaaS) to create the new data pipeline process. The process must meet the following requirements:
Ingest:
Access multiple data sources.
Provide the ability to orchestrate workflow.
Provide the capability to run SQL Server Integration Services packages.
Store:
Optimize storage for big data workloads.
Provide encryption of data at rest.
Operate with no size limits.
Prepare and Train:
Provide a fully-managed and interactive workspace for exploration and visualization.
Provide the ability to program in R, SQL, Python, Scala, and Java.
Provide seamless user authentication with Azure Active Directory.
Model & Serve:
Implement native columnar storage.
Support for the SQL language
Provide support for structured streaming.
You need to build the data integration pipeline.
Which technologies should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Ingest: Azure Data Factory
Azure Data Factory pipelines can execute SSIS packages.
In Azure, the following services and tools will meet the core requirements for pipeline orchestration, control flow, and data movement: Azure Data Factory, Oozie on HDInsight, and SQL Server Integration Services (SSIS).
Store: Data Lake Storage
Data Lake Storage Gen1 provides unlimited storage.
Note: Data at rest includes information that resides in persistent storage on physical media, in any digital format. Microsoft Azure offers a variety of data storage solutions to meet different needs, including file, disk, blob, and table storage. Microsoft also provides encryption to protect Azure SQL Database, Azure Cosmos DB, and Azure Data Lake.
Prepare and Train: Azure Databricks
Azure Databricks provides enterprise-grade Azure security, including Azure Active Directory integration.
With Azure Databricks, you can set up your Apache Spark environment in minutes, autoscale, and collaborate on shared projects in an interactive workspace.
Azure Databricks supports Python, Scala, R, Java and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch and scikit-learn.
Model and Serve: Azure Synapse Analytics
Azure Synapse Analytics (formerly SQL Data Warehouse) stores data in relational tables with columnar storage.
The Azure SQL Data Warehouse connector offers efficient and scalable structured streaming write support for SQL Data Warehouse. Access SQL Data Warehouse from Azure Databricks using the SQL Data Warehouse connector.
Note: As of November 2019, Azure SQL Data Warehouse is now Azure Synapse Analytics.
Reference: https://docs.microsoft.com/bs-latn-ba/azure/architecture/data-guide/technology-choices/pipeline-orchestration-data-movement https://docs.microsoft.com/en-us/azure/azure-databricks/what-is-azure-databricks
Question 15 of 75
15. Question
You have an enterprise-wide Azure Data Lake Storage Gen2 account. The data lake is accessible only through an Azure virtual network named VNET1. You are building a SQL pool in Azure Synapse that will use data from the data lake. Your company has a sales team. All the members of the sales team are in an Azure Active Directory group named Sales. POSIX controls are used to assign the Sales group access to the files in the data lake. You plan to load data to the SQL pool every hour. You need to ensure that the SQL pool can load the sales data from the data lake. Which three actions should you perform? Each correct answer presents part of the solution. NOTE: Each area selection is worth one point.
You have an Azure Synapse Analytics dedicated SQL pool that contains the users shown in the following table.
User1 executes a query on the database, and the query returns the results shown in the following exhibit.
User1 is the only user who has access to the unmasked data.
Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: 0
The YearlyIncome column is of the money data type.
The Default masking function: Full masking according to the data types of the designated fields.
Use a zero value for numeric data types (bigint, bit, decimal, int, money, numeric, smallint, smallmoney, tinyint, float, real).
Box 2: the values stored in the database
Users with administrator privileges are always excluded from masking, and see the original data without any mask.
Reference: https://docs.microsoft.com/en-us/azure/azure-sql/database/dynamic-data-masking-overview
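A hedged sketch of applying and testing the default mask follows; the table, column, and user names echo the scenario but are assumptions for the example. A non-privileged user would see 0 for the money column, while User1 continues to see the stored values.
ALTER TABLE dbo.DimCustomer
ALTER COLUMN YearlyIncome ADD MASKED WITH (FUNCTION = 'default()');

-- Impersonate a non-privileged user to confirm the mask returns 0 for the money column
EXECUTE AS USER = 'User2';
SELECT CustomerName, YearlyIncome FROM dbo.DimCustomer;
REVERT;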
Question 17 of 75
17. Question
You have an enterprise data warehouse in Azure Synapse Analytics.
Using PolyBase, you create an external table named [Ext].[Items] to query Parquet files stored in Azure Data Lake Storage Gen2 without importing the data to the data warehouse.
The external table has three columns.
You discover that the Parquet files have a fourth column named ItemID.
Which command should you run to add the ItemID column to the external table?
You have an Azure Data Lake Storage Gen2 container that contains 100 TB of data. You need to ensure that the data in the container is available for read workloads in a secondary region if an outage occurs in the primary region. The solution must minimize costs. Which type of data redundancy should you use?
Correct
Geo-redundant storage (with GRS or GZRS) replicates your data to another physical location in the secondary region to protect against regional outages. However, that data is available to be read only if the customer or Microsoft initiates a failover from the primary to the secondary region. When you enable read access to the secondary region, your data is available to be read at all times, including in a situation where the primary region becomes unavailable. Incorrect Answers: A: While geo-redundant storage (GRS) is cheaper than read-access geo-redundant storage (RA-GRS), GRS does not make the secondary region readable unless a failover occurs. C, D: Locally redundant storage (LRS) and zone-redundant storage (ZRS) provide redundancy only within a single region. Reference: https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy
Question 19 of 75
19. Question
You plan to implement an Azure Data Lake Gen 2 storage account. You need to ensure that the data lake will remain available if a data center fails in the primary Azure region. The solution must minimize costs. Which type of replication should you use for the storage account?
Correct
Zone-redundant storage (ZRS) replicates your data synchronously across three availability zones in the primary region, so the data remains available if a single data center fails, and it is the cheapest option that provides this protection. Locally redundant storage (LRS) replicates your data three times within a single data center in the primary region. LRS is the lowest-cost redundancy option and offers the least durability compared to other options. LRS protects your data against server rack and drive failures. However, if a disaster such as fire or flooding occurs within the data center, all replicas of a storage account using LRS may be lost or unrecoverable. To mitigate this risk, Microsoft recommends using zone-redundant storage (ZRS), geo-redundant storage (GRS), or geo-zone-redundant storage (GZRS). Reference: https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy#locally-redundant-storage
Question 20 of 75
20. Question
You have a SQL pool in Azure Synapse.
You plan to load data from Azure Blob storage to a staging table. Approximately 1 million rows of data will be loaded daily. The table will be truncated before each daily load.
You need to create the staging table. The solution must minimize how long it takes to load the data to the staging table.
How should you configure the table? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.
FactPurchase will have 1 million rows of data added daily and will contain three years of data.
Transact-SQL queries similar to the following query will be executed daily.
SELECT
SupplierKey, StockItemKey, IsOrderFinalized, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101
AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey, IsOrderFinalized
Which table distribution will minimize query times?
Correct
Hash-distributed tables improve query performance on large fact tables.
To balance the parallel processing, select a distribution column that:
Has many unique values. The column can have duplicate values. All rows with the same value are assigned to the same distribution. Since there are 60 distributions, some distributions can have more than one unique value while others may end with zero values.
Does not have NULLs, or has only a few NULLs.
Is not a date column.
Incorrect Answers:
C: Round-robin tables are useful for improving loading speed.
Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute
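A minimal sketch of a FactPurchase definition that follows this guidance is shown below; the distribution column is assumed to be a high-cardinality key such as PurchaseKey, and the column list is abbreviated.
CREATE TABLE dbo.FactPurchase
(
    PurchaseKey      bigint NOT NULL,
    DateKey          int    NOT NULL,
    SupplierKey      int    NOT NULL,
    StockItemKey     int    NOT NULL,
    IsOrderFinalized bit    NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(PurchaseKey),   -- many unique values, few NULLs, not a date column
    CLUSTERED COLUMNSTORE INDEX
);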
Question 22 of 75
22. Question
From a website analytics system, you receive data extracts about user interactions such as downloads, link clicks, form submissions, and video plays.
The data contains the following columns.
You need to design a star schema to support analytical queries of the data. The star schema will contain four tables including a date dimension.
To which table should you add each column? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: DimEvent
Box 2: DimChannel
Box 3: FactEvents
Fact tables store observations or events, and can be sales orders, stock balances, exchange rates, temperatures, etc.
Reference: https://docs.microsoft.com/en-us/power-bi/guidance/star-schema
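To make the shape of the model concrete, a hedged sketch of the four tables follows; the column names are purely illustrative because the original column list from the extract is not reproduced here.
CREATE TABLE dbo.DimDate    (DateKey int NOT NULL, CalendarDate date NOT NULL);
CREATE TABLE dbo.DimChannel (ChannelKey int NOT NULL, ChannelName nvarchar(50) NOT NULL);
CREATE TABLE dbo.DimEvent   (EventKey int NOT NULL, EventName nvarchar(50) NOT NULL);

CREATE TABLE dbo.FactEvents
(
    DateKey    int NOT NULL,   -- foreign keys to the three dimensions
    ChannelKey int NOT NULL,
    EventKey   int NOT NULL,
    EventCount int NOT NULL    -- measure
);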
Question 23 of 75
23. Question
You have an Azure Storage account that contains 100 GB of files. The files contain rows of text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB. You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics. You need to prepare the files to ensure that the data copies quickly. Solution: You convert the files to compressed delimited text files. Does this meet the goal?
You have an Azure Storage account that contains 100 GB of files. The files contain rows of text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB. You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics. You need to prepare the files to ensure that the data copies quickly. Solution: You copy the files to a table that has a columnstore index. Does this meet the goal?
You have an Azure Storage account that contains 100 GB of files. The files contain rows of text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB. You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics. You need to prepare the files to ensure that the data copies quickly. Solution: You modify the files to ensure that each row is more than 1 MB. Does this meet the goal?
You build a data warehouse in an Azure Synapse Analytics dedicated SQL pool. Analysts write a complex SELECT query that contains multiple JOIN and CASE statements to transform data for use in inventory reports. The inventory reports will use the data and additional WHERE parameters depending on the report. The reports will be produced once daily. You need to implement a solution to make the dataset available for the reports. The solution must minimize query times. What should you implement?
Correct
Materialized views for dedicated SQL pools in Azure Synapse provide a low-maintenance method for complex analytical queries to get fast performance without any query change. Incorrect Answers: C: One daily execution does not make use of result set caching. Note: When result set caching is enabled, dedicated SQL pool automatically caches query results in the user database for repetitive use. This allows subsequent query executions to get results directly from the persisted cache so recomputation is not needed. Result set caching improves query performance and reduces compute resource usage. In addition, queries that use cached result sets do not use any concurrency slots and thus do not count against existing concurrency limits. Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-materialized-views https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-result-set-caching
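A minimal sketch of such a materialized view in a dedicated SQL pool is shown below; the table, columns, and aggregation stand in for the analysts' actual SELECT query, which is not given in the question.
CREATE MATERIALIZED VIEW dbo.InventoryReportMV
WITH ( DISTRIBUTION = HASH(ProductKey) )
AS
SELECT  ProductKey,
        WarehouseKey,
        SUM(Quantity) AS TotalQuantity,
        COUNT_BIG(*)  AS RowCnt       -- keep a COUNT_BIG(*) alongside the aggregates
FROM    dbo.FactInventory
GROUP BY ProductKey, WarehouseKey;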
Question 27 of 75
27. Question
You have an Azure Synapse Analytics workspace named WS1 that contains an Apache Spark pool named Pool1. You plan to create a database named DB1 in Pool1. You need to ensure that when tables are created in DB1, the tables are available automatically as external tables to the built-in serverless SQL pool. Which format should you use for the tables in DB1?
You are planning a solution to aggregate streaming data that originates in Apache Kafka and is output to Azure Data Lake Storage Gen2. The developers who will implement the stream processing solution use Java.
Which service should you recommend using to process the streaming data?
Correct
The tables in the referenced documentation summarize the key differences in general capabilities for stream processing technologies in Azure. For developers working in Java, Azure Databricks (Spark Structured Streaming) is the natural fit, since Azure Stream Analytics queries are written in a SQL-based language rather than Java.
You plan to implement an Azure Data Lake Storage Gen2 container that will contain CSV files. The size of the files will vary based on the number of events that occur per hour. File sizes range from 4 KB to 5 GB. You need to ensure that the files stored in the container are optimized for batch processing. What should you do?
Correct
Avro supports batch processing and is very relevant for streaming. Note: Avro is a framework developed within the Apache Hadoop project. It is a row-based storage format that is widely used for serialization. Avro stores its schema in JSON format, making it easy to read and interpret by any program. The data itself is stored in a binary format, making it compact and efficient. Reference: https://www.adaltas.com/en/2020/07/23/benchmark-study-of-different-file-format/
Question 30 of 75
30. Question
You store files in an Azure Data Lake Storage Gen2 container. The container has the storage policy shown in the following exhibit.
Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: moved to cool storage
The ManagementPolicyBaseBlob.TierToCool property gets or sets the function to tier blobs to cool storage. It supports blobs currently at the Hot tier.
Box 2: container1/contoso.csv
As defined by prefixMatch.
prefixMatch: An array of strings for prefixes to be matched. Each rule can define up to 10 case-sensitive prefixes. A prefix string must start with a container name.
Reference: https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.management.storage.fluent.models.managementpolicybaseblob.tiertocool
Question 31 of 75
31. Question
You are designing a financial transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns:
TransactionType: 40 million rows per transaction type
CustomerSegment: 4 million rows per customer segment
TransactionMonth: 65 million rows per month
AccountType: 500 million rows per account type
You have the following query requirements:
Analysts will most commonly analyze transactions for a given month.
Transactions analysis will typically summarize transactions by transaction type, customer segment, and/or account type.
You need to recommend a partition strategy for the table to minimize query times. On which column should you recommend partitioning the table?
Correct
For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed. Before partitions are created, dedicated SQL pool already divides each table into 60 distributed databases. Example: Any partitioning added to a table is in addition to the distributions created behind the scenes. Using this example, if the sales fact table contained 36 monthly partitions, and given that a dedicated SQL pool has 60 distributions, then the sales fact table should contain 60 million rows per month, or 2.1 billion rows when all months are populated. If a table contains fewer than the recommended minimum number of rows per partition, consider using fewer partitions in order to increase the number of rows per partition.
Question 32 of 75
32. Question
You have an Azure Data Lake Storage Gen2 account named account1 that stores logs as shown in the following table.
You do not expect that the logs will be accessed during the retention periods.
You need to recommend a solution for account1 that meets the following requirements:
Automatically deletes the logs at the end of each retention period
Minimizes storage costs
What should you include in the recommendation? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: Store the infrastructure logs in the Cool access tier and the application logs in the Archive access tier
For infrastructure logs: Cool tier – An online tier optimized for storing data that is infrequently accessed or modified. Data in the cool tier should be stored for a minimum of 30 days. The cool tier has lower storage costs and higher access costs compared to the hot tier.
For application logs: Archive tier – An offline tier optimized for storing data that is rarely accessed, and that has flexible latency requirements, on the order of hours.
Data in the archive tier should be stored for a minimum of 180 days.
Box 2: Azure Blob storage lifecycle management rules
Blob storage lifecycle management offers a rule-based policy that you can use to transition your data to the desired access tier when your specified conditions are met. You can also use lifecycle management to expire data at the end of its life.
Reference: https://docs.microsoft.com/en-us/azure/storage/blobs/access-tiers-overview
Question 33 of 75
33. Question
You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a partitioned fact table named dbo.Sales and a staging table named stg.Sales that has the matching table and partition definitions. You need to overwrite the content of the first partition in dbo.Sales with the content of the same partition in stg.Sales. The solution must minimize load times. What should you do?
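The answer choices for this question are not reproduced above. For context only, partition switching is the usual minimal-load way to overwrite a partition when the staging table shares the target's table and partition definitions; a hedged sketch:
-- Switch partition 1 of stg.Sales into partition 1 of dbo.Sales, truncating the target partition first
ALTER TABLE stg.Sales SWITCH PARTITION 1 TO dbo.Sales PARTITION 1 WITH (TRUNCATE_TARGET = ON);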
You are designing a slowly changing dimension (SCD) for supplier data in an Azure Synapse Analytics dedicated SQL pool.
You plan to keep a record of changes to the available fields.
The supplier data contains the following columns.
Which three additional columns should you add to the data to create a Type 2 SCD? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
Correct
In order to support Type 2 changes, we need to add four columns to our table:
· Surrogate Key – the original ID will no longer be sufficient to identify the specific record we require, so we need to create a new ID that the fact records can join to directly.
· Current Flag – a quick way of returning only the current version of each record.
· Start Date – the date from which the specific historical version is active.
· End Date – the date to which the specific historical version is active.
Reference: https://adatis.co.uk/introduction-to-slowly-changing-dimensions-scd-types/
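A minimal sketch of a Type 2 supplier dimension with these columns; the supplier attributes shown are assumptions, since the source column list is not reproduced here.
CREATE TABLE dbo.DimSupplier
(
    SupplierKey  int IDENTITY(1,1) NOT NULL,  -- surrogate key
    SupplierID   int           NOT NULL,      -- business key from the source system
    SupplierName nvarchar(100) NOT NULL,      -- assumed tracked attribute
    StartDate    date          NOT NULL,      -- date from which the version is active
    EndDate      date          NULL,          -- date to which the version is active (NULL for the current row)
    IsCurrent    bit           NOT NULL       -- current-version flag
);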
Question 35 of 75
35. Question
You have a Microsoft SQL Server database that uses a third normal form schema.
You plan to migrate the data in the database to a star schema in an Azure Synapse Analytics dedicated SQL pool.
You need to design the dimension tables. The solution must optimize read operations.
What should you include in the solution? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: Denormalize to a second normal form
Denormalization transforms higher normal forms into lower normal forms by storing the join of higher-normal-form relations as a base relation. It improves data-retrieval performance at the cost of introducing update anomalies.
Box 2: New identity columns
The collapsing-relations strategy can be used in this step to collapse classification entities into component entities, producing flat dimension tables with single-part keys that connect directly to the fact table. The single-part key is a surrogate key generated to ensure it remains unique over time.
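As an illustration of collapsing classification entities into a flat dimension with a generated single-part key, here is a hedged sketch; the table and column names are assumptions.
CREATE TABLE dbo.DimProduct
(
    ProductKey      int IDENTITY(1,1) NOT NULL,  -- new identity (surrogate) key
    ProductID       int            NOT NULL,     -- original source key
    ProductName     nvarchar(200)  NOT NULL,
    CategoryName    nvarchar(100)  NOT NULL,     -- collapsed from a separate category table
    SubcategoryName nvarchar(100)  NOT NULL      -- collapsed from a separate subcategory table
);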
You are designing a partition strategy for a fact table in an Azure Synapse Analytics dedicated SQL pool. The table has the following specifications:
Contains sales data for 20,000 products.
Uses hash distribution on a column named ProductID.
Contains 2.4 billion records for the years 2019 and 2020.
Which number of partition ranges provides optimal compression and performance for the clustered columnstore index?
Correct
40. Each partition should have around 1 million records per distribution, and dedicated SQL pools already split every table into 60 distributions. So: Records / (Partitions × 60) ≥ 1 million, which gives Partitions ≤ Records / (1 million × 60) = 2,400,000,000 / 60,000,000 = 40. Note: Having too many partitions can reduce the effectiveness of clustered columnstore indexes if each partition has fewer than 1 million rows. Dedicated SQL pools automatically distribute your data across 60 databases, so a table created with 100 partitions effectively results in 6,000 partitions. Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/best-practices-dedicated-sql-pool
Question 37 of 75
37. Question
You are creating dimensions for a data warehouse in an Azure Synapse Analytics dedicated SQL pool.
You create a table by using the Transact-SQL statement shown in the following exhibit.
Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.
NOTE: Each correct selection is worth one point.
Hot Area:
Correct
Box 1: Type 2 –
A Type 2 SCD supports versioning of dimension members. Often the source system doesn't store versions, so the data warehouse load process detects and manages changes in a dimension table. In this case, the dimension table must use a surrogate key to provide a unique reference to a version of the dimension member. It also includes columns that define the date range validity of the version (for example, StartDate and EndDate) and possibly a flag column (for example, IsCurrent) to easily filter by current dimension members.
Incorrect Answers:
A Type 1 SCD always reflects the latest values; when changes in source data are detected, the dimension table data is overwritten.
Box 2: a surrogate key –
"In data warehousing, IDENTITY functionality is particularly important as it makes easier the creation of surrogate keys."
Why ProductKey is certainly not a business key: "The IDENTITY value in Synapse is not guaranteed to be unique if the user explicitly inserts a duplicate value with 'SET IDENTITY_INSERT ON' or reseeds IDENTITY." A business key is an index that identifies the uniqueness of a row, and here Microsoft says that IDENTITY doesn't guarantee uniqueness.
Reference: https://azure.microsoft.com/en-us/blog/identity-now-available-with-azure-sql-data-warehouse/ https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-identity
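To illustrate why an IDENTITY column behaves as a surrogate key rather than a business key, here is a small hedged sketch (the demo table and values are assumptions) showing how SET IDENTITY_INSERT can introduce an explicit, potentially duplicate value:
CREATE TABLE dbo.DimDemo
(
    ProductKey  int IDENTITY(1,1) NOT NULL,
    ProductName nvarchar(100)     NOT NULL
);

INSERT INTO dbo.DimDemo (ProductName) VALUES ('Generated key');                -- engine assigns ProductKey = 1

SET IDENTITY_INSERT dbo.DimDemo ON;
INSERT INTO dbo.DimDemo (ProductKey, ProductName) VALUES (1, 'Explicit key');  -- duplicates ProductKey = 1
SET IDENTITY_INSERT dbo.DimDemo OFF;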
Question 38 of 75
38. Question
You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.
FactPurchase will have 1 million rows of data added daily and will contain three years of data.
Transact-SQL queries similar to the following query will be executed daily.
SELECT SupplierKey, StockItemKey, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101
  AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey
Which table distribution will minimize query times?
Correct
Hash-distributed tables improve query performance on large fact tables. Round-robin tables are useful for improving loading speed.
Incorrect:
Not D: Do not use a date column. All data for the same date lands in the same distribution, so if several users all filter on the same date, only 1 of the 60 distributions does all the processing work.
Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute
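A hedged sketch of a hash-distributed declaration for FactPurchase; the distribution column and the column list are assumptions, since the full column table from the question is not reproduced here. The key points are hash distribution on a high-cardinality column and avoiding the date column.
CREATE TABLE dbo.FactPurchase
(
    PurchaseKey   bigint NOT NULL,   -- assumed high-cardinality key used for distribution
    DateKey       int    NOT NULL,   -- not suitable as the distribution column
    SupplierKey   int    NOT NULL,
    StockItemKey  int    NOT NULL,
    OrderedQty    int    NOT NULL    -- assumed measure column
)
WITH
(
    DISTRIBUTION = HASH(PurchaseKey),
    CLUSTERED COLUMNSTORE INDEX
);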
Question 39 of 75
39. Question
You need to build a solution to ensure that users can query specific files in an Azure Data Lake Storage Gen2 account from an Azure Synapse Analytics serverless SQL pool.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.
Select and Place:
Correct
Step 1: Create an external data source
You can create external tables in Synapse SQL pools by using the following steps:
1. CREATE EXTERNAL DATA SOURCE to reference external Azure storage and specify the credential that should be used to access it.
2. CREATE EXTERNAL FILE FORMAT to describe the format of the CSV or Parquet files.
3. CREATE EXTERNAL TABLE on top of the files placed on the data source, using the same file format.
Step 2: Create an external file format object
Creating an external file format is a prerequisite for creating an external table.
Step 3: Create an external table
Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-external-tables
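A hedged end-to-end sketch of the three steps for a serverless SQL pool; the storage account, container, path, file format, and column list are placeholder assumptions.
CREATE EXTERNAL DATA SOURCE LogsDataSource
WITH ( LOCATION = 'https://account1.dfs.core.windows.net/container1' );

CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH ( FORMAT_TYPE = PARQUET );

CREATE EXTERNAL TABLE dbo.LogEvents
(
    EventId   int,
    EventText nvarchar(4000)
)
WITH
(
    LOCATION    = '/logs/',
    DATA_SOURCE = LogsDataSource,
    FILE_FORMAT = ParquetFormat
);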
Question 40 of 75
40. Question
You are designing a data mart for the human resources (HR) department at your company. The data mart will contain employee information and employee transactions. From a source system, you have a flat extract that has the following fields: EmployeeID, FirstName, LastName, Recipient, GrossAmount, TransactionID, GovernmentID, NetAmountPaid, TransactionDate. You need to design a star schema data model in an Azure Synapse Analytics dedicated SQL pool for the data mart. Which two tables should you create? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
Correct
C: Dimension tables contain attribute data that might change but usually changes infrequently. For example, a customer's name and address are stored in a dimension table and updated only when the customer's profile changes. To minimize the size of a large fact table, the customer's name and address don't need to be in every row of a fact table. Instead, the fact table and the dimension table can share a customer ID. A query can join the two tables to associate a customer's profile and transactions.
E: Fact tables contain quantitative data that are commonly generated in a transactional system and then loaded into the dedicated SQL pool. For example, a retail business generates sales transactions every day and then loads the data into a dedicated SQL pool fact table for analysis.
Reference: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview
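A hedged sketch of how the flat extract could be split into the two tables; the assignment of fields to tables and the data types are assumptions.
CREATE TABLE dbo.DimEmployee
(
    EmployeeKey  int IDENTITY(1,1) NOT NULL,  -- surrogate key
    EmployeeID   int           NOT NULL,
    FirstName    nvarchar(50)  NOT NULL,
    LastName     nvarchar(50)  NOT NULL,
    GovernmentID nvarchar(20)  NOT NULL
);

CREATE TABLE dbo.FactTransaction
(
    TransactionID   bigint        NOT NULL,
    EmployeeKey     int           NOT NULL,  -- joins to DimEmployee
    TransactionDate date          NOT NULL,
    GrossAmount     decimal(18,2) NOT NULL,
    NetAmountPaid   decimal(18,2) NOT NULL
)
WITH ( DISTRIBUTION = HASH(TransactionID), CLUSTERED COLUMNSTORE INDEX );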
Question 41 of 75
41. Question
You are designing an application that will store petabytes of medical imaging data.
When the data is first created, the data will be accessed frequently during the first week. After one month, the data must be accessible within 30 seconds, but files will be accessed infrequently. After one year, the data will be accessed infrequently but must be accessible within five minutes.
You need to select a storage strategy for the data. The solution must minimize costs.
Which storage strategy should you use to access the data during the first week?
Correct
Azure storage offers different access tiers, which allow you to store blob object data in the most cost-effective manner. The available access tiers include:
Hot – Optimized for storing data that is accessed frequently.
Cool – Optimized for storing data that is infrequently accessed and stored for at least 30 days.
Archive – Optimized for storing data that is rarely accessed and stored for at least 180 days with flexible latency requirements (on the order of hours). https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers?tabs=azure-portal
Question 42 of 75
42. Question
You have a requirement to process data once a month. Which processing strategy should you recommend?
Correct
Data processing is simply the conversion of raw data to meaningful information through a process. Depending on how the data is ingested into your system, you could process each data item as it arrives, or buffer the raw data and process it in groups. Processing data as it arrives is called streaming. Buffering and processing the data in groups is called batch processing. https://docs.microsoft.com/en-us/learn/modules/explore-core-data-concepts/4-describe-difference
Incorrect answers:
Stream Processing – This is real-time processing, not once a month.
Analytical Processing – This could also be acceptable if multiple answers were allowed; under the hood, analytical processing is batch processing.
Question 43 of 75
43. Question
You develop data engineering solutions for a company.
You must integrate the company’s on-premises Microsoft SQL Server data with Microsoft Azure SQL Database. Data must be transformed incrementally.
You need to implement the data integration solution.
Which tool should you use to configure a pipeline to copy data?
Correct
Requirement:
Company’s on-premises Microsoft SQL Server data ——> Microsoft Azure SQL Database.
Use Azure PowerShell with SQL Server linked service as a source – True. The source linked service must be SQL Server. https://docs.microsoft.com/en-us/azure/data-factory/scripts/hybrid-copy-powershell
Use the Copy Data tool with Blob storage linked service as the source – False. The source linked service cannot be Blob storage.
Use Azure Data Factory UI with Blob storage linked service as a source – False. The source linked service cannot be Blob storage.
Use the .NET Data Factory API with Blob storage linked service as the source – False. The source linked service cannot be Blob storage.
Question 44 of 75
44. Question
You want to save images and videos. What type of data store would you recommend?
Correct
Images and videos are examples of unstructured object data. Many applications need to store large, binary data objects, such as images and video streams. Microsoft Azure virtual machines use blob storage for holding virtual machine disk images. These objects can be several hundreds of GB in size. https://docs.microsoft.com/en-us/learn/modules/explore-non-relational-data-offerings-azure/3-explore-azure-blob-storage
Incorrect answers:
columnar – This is used for storing semi-structured data and is not suitable for images and videos.
key/value – This is used for storing semi-structured data and is not suitable for images and videos.
document – This is used for storing semi-structured data and is not suitable for images and videos.
Question 45 of 75
45. Question
You have an application that runs on Windows and requires access to a mapped file share.
Which Azure service would you recommend?
Correct
Azure Files enables you to set up highly available network file shares that can be accessed by using the standard Server Message Block (SMB) protocol. That means that multiple VMs can share the same files with both read and write access.
Incorrect answer: Azure Cosmos DB – Azure Cosmos DB is a fully managed NoSQL database for modern app development. Single-digit-millisecond response times and automatic, instant scalability guarantee speed at any scale. Business continuity is assured with SLA-backed availability and enterprise-grade security, and app development is faster thanks to turnkey multi-region data distribution. https://docs.microsoft.com/en-us/azure/storage/common/storage-introduction?toc=%2fazure%2fstorage%2fblobs%2ftoc.json
Question 46 of 75
46. Question
Select the correct storage option for the following scenario:
Employee data that shows the relationships between employees
Correct
Gremlin API. The Gremlin API implements a graph database interface to Cosmos DB. A graph is a collection of data objects and directed relationships. Data is still held as a set of documents in Cosmos DB, but the Gremlin API enables you to perform graph queries over the data. Using the Gremlin API, you can walk through the objects and relationships in the graph to discover all manner of complex relationships, such as "What is the name of the pet of Sam's landlord?". https://docs.microsoft.com/en-us/learn/modules/explore-non-relational-data-offerings-azure/5-explore-azure-cosmos-database
Incorrect answers:
key/value – A key/value store associates each data value with a unique key. Most key/value stores only support simple query, insert, and delete operations.
object – This is suitable for storing unstructured data like images and videos.
document – A document database stores a collection of documents, where each document consists of named fields and data. The data can be simple values or complex elements such as lists and child collections. Documents are retrieved by unique keys.
Question 47 of 75
47. Question
Select the correct storage for the following scenario:
Medical images and their associated metadata
Correct
Azure Blob storage is Microsoft’s object storage solution for the cloud. Blob storage is ideal for:
– Serving images or documents directly to a browser.
– Storing files for distributed access.
– Streaming video and audio.
– Storing data for backup and restore, disaster recovery, and archiving.
– Storing data for analysis by an on-premises or Azure-hosted service.
Objects in Blob storage can be accessed from anywhere in the world via HTTP or HTTPS. https://docs.microsoft.com/en-us/azure/storage/common/storage-introduction
Incorrect answers:
document – A document database stores a collection of documents, where each document consists of named fields and data. The data can be simple values or complex elements such as lists and child collections. Documents are retrieved by unique keys.
graph – A graph database stores two types of information, nodes and edges. Edges specify relationships between nodes. Nodes and edges can have properties that provide information about that node or edge, similar to columns in a table. Edges can also have a direction indicating the nature of the relationship.
key/value – A key/value store associates each data value with a unique key. Most key/value stores only support simple query, insert, and delete operations. To modify a value (either partially or completely), an application must overwrite the existing data for the entire value. In most implementations, reading or writing a single value is an atomic operation. https://docs.microsoft.com/en-us/azure/architecture/guide/technology-choices/data-store-overview
Question 48 of 75
48. Question
Select the correct storage for the following scenario:
Application users and their default language
Correct
A key-value store is the simplest (and often quickest) type of NoSQL database for inserting and querying data. Each data item in a key-value store has two elements, a key and a value. The key uniquely identifies the item, and the value holds the data for the item.
https://docs.microsoft.com/en-us/learn/modules/explore-concepts-of-non-relational-data/4-describe-types-nosql-databases
Incorrect answers:
object – Object storage is optimized for storing and retrieving large binary objects (images, files, video and audio streams, large application data objects and documents, virtual machine disk images). Large data files are also popularly used in this model, for example, delimiter file (CSV), parquet, and ORC. Object stores can manage extremely large amounts of unstructured data.
document – A document database stores a collection of documents, where each document consists of named fields and data. The data can be simple values or complex elements such as lists and child collections. Documents are retrieved by unique keys.
graph – A graph database stores two types of information, nodes and edges. Edges specify relationships between nodes. Nodes and edges can have properties that provide information about that node or edge, similar to columns in a table. Edges can also have a direction indicating the nature of the relationship. https://docs.microsoft.com/en-us/azure/architecture/guide/technology-choices/data-store-overview
Question 49 of 75
49. Question
Your company is designing an application that will write large amounts of JSON data. Which type of data store should you use?
Correct
A document database represents the opposite end of the NoSQL spectrum from a key-value store. In a document database, each document has a unique ID, but the fields in the documents are transparent to the database management system. Document databases typically store data in JSON format, or the documents can be encoded using other formats such as XML, YAML, or BSON.
https://docs.microsoft.com/en-us/learn/modules/explore-concepts-of-non-relational-data/4-describe-types-nosql-databases
Incorrect answers:
columnar – A column-family database organizes data into rows and columns. In its simplest form, a column-family database can appear very similar to a relational database, at least conceptually. The real power of a column-family database lies in its denormalized approach to structuring sparse data.
key/value – A key/value store associates each data value with a unique key. Most key/value stores only support simple query, insert, and delete operations. To modify a value (either partially or completely), an application must overwrite the existing data for the entire value.
graph – A graph database stores two types of information, nodes and edges. Edges specify relationships between nodes. Nodes and edges can have properties that provide information about that node or edge, similar to columns in a table. Edges can also have a direction indicating the nature of the relationship. https://docs.microsoft.com/en-us/azure/architecture/guide/technology-choices/data-store-overview
Extract, transform, and load (ETL) is a data pipeline used to collect data from various sources, transform the data according to business rules, and load it into a destination data store. The transformation work in ETL takes place in a specialized engine, and often involves using staging tables to temporarily hold data as it is being transformed and ultimately loaded to its destination.
Extract, load, and transform (ELT) differs from ETL solely in where the transformation takes place. In the ELT pipeline, the transformation occurs in the target data store. Instead of using a separate transformation engine, the processing capabilities of the target data store are used to transform data. https://docs.microsoft.com/en-us/azure/architecture/data-guide/relational-data/etl
Incorrect answers:
a target data source powerful enough to transform the data – This describes ELT (Extract, Load & Transform), not ETL (Extract, Transform & Load).
a matching schema in the data source and data target – This is not necessarily required. Source and Target can have different schema types.
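As an illustration of the ELT pattern in the target store, here is a hedged sketch for a dedicated SQL pool: raw files are loaded first and then transformed by the pool itself with CTAS. The storage path, table names, and columns are assumptions.
-- Assumes stg.SalesRaw already exists with columns matching the CSV files
COPY INTO stg.SalesRaw
FROM 'https://account1.blob.core.windows.net/raw/sales/'
WITH ( FILE_TYPE = 'CSV', FIRSTROW = 2 );

-- Transform inside the target data store using CTAS
CREATE TABLE dbo.SalesClean
WITH ( DISTRIBUTION = HASH(SaleKey), CLUSTERED COLUMNSTORE INDEX )
AS
SELECT CAST(SaleKey AS bigint)        AS SaleKey,
       CAST(Amount  AS decimal(18,2)) AS Amount
FROM stg.SalesRaw;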
Question 52 of 75
52. Question
You have data stored in ADLS in text format. You need to load the data into a table in a database in Azure Synapse Analytics.
Which two components should you define in order to use PolyBase?
Your company needs to implement a relational database in Azure. The solution must minimize ongoing maintenance.
Which Azure service should you use?
Correct
Azure HDInsight – Not a database
SQL Server on Azure virtual machines – IaaS
Azure Cosmos DB – Not a relational database
Azure SQL Database installed on a VM – IaaS
Azure SQL Database – PaaS
Ongoing maintenance is minimized with a PaaS solution; with IaaS, virtual machine maintenance and patching must be done by the tenant, not the cloud provider.
Azure SQL Database is a fully managed platform as a service (PaaS) database engine that handles most of the database management functions such as upgrading, patching, backups, and monitoring without user involvement. https://docs.microsoft.com/en-us/azure/azure-sql/database/sql-database-paas-overview#:~:text=Azure%20SQL%20Database%20is%20a,and%20monitoring%20without%20user%20involvement.
Question 54 of 75
54. Question
You can query a graph database in Azure Cosmos DB as a ———————–
Which service in Azure can be used to process the data in real-time having three components: input, query, and output?
Select the correct option.
Correct
An Azure Stream Analytics job consists of an input, query, and an output. Stream Analytics ingests data from Azure Event Hubs (including Azure Event Hubs from Apache Kafka), Azure IoT Hub, or Azure Blob Storage. The query, which is based on SQL query language, can be used to easily filter, sort, aggregate, and join streaming data over a period of time. You can also extend this SQL language with JavaScript and C# user-defined functions (UDFs). You can easily adjust the event ordering options and duration of time windows when performing aggregation operations through simple language constructs and/or configurations.
Reference: https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-introduction
Incorrect answers:
IoT Hub – This can act as an input for Azure Stream Analytics, but it does not itself have an input, query, and output.
Event Hub – This can act as an input for Azure Stream Analytics, but it does not itself have an input, query, and output.
Azure Databricks – Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services platform. It provides a notebook-like experience for data analysis and model building.
Azure Synapse Analytics – Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. It gives you the freedom to query data on your terms, using either serverless or dedicated resources—at scale.
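To make the input, query, and output shape concrete, here is a minimal Stream Analytics query written in its SQL-like query language; TelemetryInput and PowerBIOutput are hypothetical aliases that would be configured as the job's input and output.
-- Average temperature per device over 60-second tumbling windows.
-- TelemetryInput and PowerBIOutput are placeholder input/output aliases.
SELECT
    deviceId,
    AVG(temperature) AS avgTemperature,
    System.Timestamp() AS windowEnd
INTO PowerBIOutput
FROM TelemetryInput TIMESTAMP BY eventTime
GROUP BY deviceId, TumblingWindow(second, 60)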
Question 56 of 75
56. Question
Which are the services in Azure that can be used to process the data?
Select two correct options.
Correct
Azure blob storage – Storage
Synapse Analytics – Processing
IoT Hub – Ingestion
Event Hub – Ingestion
Stream Analytics Job – Processing
Synapse Analytics – Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. It gives you the freedom to query data on your terms, using either serverless or dedicated resources—at scale. https://azure.microsoft.com/en-in/services/synapse-analytics/
Stream Analytics Job – Azure Stream Analytics is a real-time analytics and complex event-processing engine that is designed to analyze and process high volumes of fast streaming data from multiple sources simultaneously. https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-introduction
Question 57 of 75
57. Question
Your company has a reporting solution that has paginated reports. The reports query a dimensional model in a data warehouse.
Which type of processing does the reporting solution use?
Correct
Azure Synapse Analytics (formerly Azure SQL Data Warehouse) is a limitless analytics service that brings together data integration, enterprise data warehousing and big data analytics. It gives you the freedom to query data on your terms, using either serverless or dedicated resources—at scale. Azure Synapse brings these worlds together with a unified experience to ingest, explore, prepare, manage and serve data for immediate BI and machine learning needs.
A data warehouse is an Online Analytical Processing (OLAP) system. https://azure.microsoft.com/en-in/services/synapse-analytics/
Incorrect answers:
Online Transaction Processing (OLTP) – OLTP (Online Transactional Processing) is a category of data processing that is focused on transaction-oriented tasks, not analytics/reporting tasks.
Stream processing – This is real-time processing, closer to OLTP than to OLAP.
Batch processing – Batch processing handles transactions in groups or batches, which is typical of OLAP; it would be a reasonable second choice if the question allowed multiple answers.
Question 58 of 75
58. Question
Which type of database workload is the type of deep analytics used by SQL Server Analysis Services (SSAS), Azure Analysis Services, and similar apps?
Select the correct option.
Correct
SQL Server Analysis Services (SSAS) and Azure Analysis Services are services used for data analytics, which is an example of OLAP (Online Analytical Processing).
Analysis Services is an analytical data engine (Vertipaq) used in decision support and business analytics. It provides enterprise-grade semantic data model capabilities for business intelligence (BI), data analysis, and reporting applications such as Power BI, Excel, Reporting Services, and other data visualization tools. https://docs.microsoft.com/en-us/analysis-services/analysis-services-overview?view=asallproducts-allversions
Azure Analysis Services is a fully managed platform as a service (PaaS) that provides enterprise-grade data models in the cloud. Use advanced mashup and modeling features to combine data from multiple data sources, define metrics, and secure your data in a single, trusted tabular semantic data model. The data model provides an easier and faster way for users to perform ad hoc data analysis using tools like Power BI and Excel. https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-overview
Incorrect answers:
OLTP – OLTP (Online Transactional Processing) is a category of data processing that is focused on transaction-oriented tasks. OLTP typically involves inserting, updating, and/or deleting small amounts of data in a database.
Data warehouse – Enterprise analytics must work at massive scale on any kind of data, whether raw, refined, or highly curated. This typically requires enterprises to stitch together big data and data warehousing technologies into complex data pipelines that work across data in relational stores and data lakes.
Question 59 of 75
59. Question
Transparent Data Encryption (TDE) encrypts
Correct
Transparent data encryption (TDE) helps protect Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics against the threat of malicious offline activity by encrypting data at rest. It performs real-time encryption and decryption of the database, associated backups, and transaction log files at rest without requiring changes to the application. https://docs.microsoft.com/en-us/azure/azure-sql/database/transparent-data-encryption-tde-overview?tabs=azure-portal
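As a brief, hedged illustration, TDE can be checked or toggled per database with T-SQL; in Azure SQL Database it is enabled by default for new databases. The database name below is a placeholder.
-- Enable TDE on a database (MyDatabase is a placeholder name).
ALTER DATABASE [MyDatabase] SET ENCRYPTION ON;

-- Check the encryption state (3 = encrypted).
SELECT DB_NAME(database_id) AS database_name, encryption_state
FROM sys.dm_database_encryption_keys;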
Question 60 of 75
60. Question
Select all that is TRUE:
Correct
Cosmos DB transparently replicates the data to all the regions associated with your Cosmos account. It provides a single system image of your globally distributed Azure Cosmos database and containers that your application can read and write to locally.
Graph data – Gremlin API
JSON documents – MongoDB API
key/value data – Table API
Table API. This interface enables you to use the Azure Table Storage API to store and retrieve documents. The purpose of this interface is to enable you to switch from Table Storage to Cosmos DB without requiring that you modify your existing applications. The Table API stores data as key/value pairs.
MongoDB API. MongoDB is another well-known document database, with its own programmatic interface. Many organizations run MongoDB on-premises. You can use the MongoDB API for Cosmos DB to enable a MongoDB application to run unchanged against a Cosmos DB database. You can migrate the data in the MongoDB database to Cosmos DB running in the cloud, but continue to run your existing applications to access this data. Many document databases use JSON (JavaScript Object Notation) to represent the document structure.
Cassandra API. Cassandra is a column family database management system. This is another database management system that many organizations run on-premises. The Cassandra API for Cosmos DB provides a Cassandra-like programmatic interface for Cosmos DB. Cassandra API requests are mapped to Cosmos DB document requests. As with the MongoDB API, the primary purpose of the Cassandra API is to enable you to quickly migrate Cassandra databases and applications to Cosmos DB.
Gremlin API. The Gremlin API implements a graph database interface to Cosmos DB. A graph is a collection of data objects and directed relationships. Data is still held as a set of documents in Cosmos DB, but the Gremlin API enables you to perform graph queries over data. Using the Gremlin API you can walk through the objects and relationships in the graph to discover all manner of complex relationships https://docs.microsoft.com/en-us/learn/modules/explore-non-relational-data-offerings-azure/5-explore-azure-cosmos-database
Question 62 of 75
62. Question
What will be the correct Object type for Code?
{
  "emp1": {
    "EmpName": "Kavita Singh",
    "EmpAge": "34",
    "Company Code": {
      "Code": "10"
    },
    "onetype": [
      { "id": 1, "name": "John D." },
      { "id": 2, "name": "Don J." }
    ]
  }
}
Correct
emp1 is the root object; the others, including Code, are nested objects.
Select the correct answer
Your company plans to load data from a customer relationship management (CRM) system to a data warehouse by using an extract load, and transform (ELT) process. Where does data processing occur for each stage of the ELT process? Please Match the answer with component correctly
Correct
I. Extract – The CRM system
II. Load – The data warehouse
III. Transform – The data warehouse (in ELT, the transformation takes place in the target data store, not in a separate in-memory data integration tool)
Your company needs to integrate chatbot that will respond to customers queries automatically on the company’s website. Who should perform this task?
Correct
An artificial intelligence engineer is an individual who works with traditional machine learning techniques like natural language processing and neural networks to build models that power AI–based applications.
The bank needs to process the data every month to calculate payroll. Which of the following best categorizes the data?
Correct
Payroll – Batch, non-streaming data, structured data, encrypted data
In batch processing, newly arriving data elements are collected into a group. The whole group is then processed at a future time as a batch. Exactly when each group is processed can be determined in a number of ways.
Data processing is simply the conversion of raw data to meaningful information through a process. Depending on how the data is ingested into your system, you could process each data item as it arrives, or buffer the raw data and process it in groups. Processing data as it arrives is called streaming. Buffering and processing the data in groups is called batch processing. https://docs.microsoft.com/en-us/learn/modules/explore-core-data-concepts/4-describe-difference
Question 67 of 75
67. Question
Which of the following statements is true with respect to the ETL process?
Correct
The ETL process has low load times – False
Since transformation happens before loading, the time from extract to load is high.
The ETL process reduces resource contention on the source systems – False
Since transformation has to happen before loading, resource contention on the source systems is high.
The ETL process requires target systems to transform the data being loaded – False
In ETL, transformation happens before the load. In ELT, transformation happens after the load, so it is the target system that transforms the data being loaded.
The ETL process has very high load times – True
An alternative approach is ELT. ELT is an abbreviation of Extract, Load, and Transform. The process differs from ETL in that the data is stored before being transformed. The data processing engine can take an iterative approach, retrieving and processing the data from storage, before writing the transformed data and models back to storage. ELT is more suitable for constructing complex models that depend on multiple items in the database, often using periodic batch processing.
In ETL, because processing happens before loading, each batch of data has to be transformed before it can be loaded, which leads to high load times. https://docs.microsoft.com/en-us/learn/modules/explore-concepts-of-data-analytics/2-describe-data-ingestion-process
Question 68 of 75
68. Question
Which API of Cosmos DB can be used to work with data in the form of edges and vertices?
Which of the two Azure services can be used to provision Apache Spark clusters?
Correct
Azure Databricks is an Apache Spark environment running on Azure to provide big data processing, streaming, and machine learning.
Azure HDInsight is a big data processing service that provides the platform for technologies such as Spark in an Azure environment. https://docs.microsoft.com/en-us/learn/modules/examine-components-of-modern-data-warehouse/3-explore-azure-data-services-warehousing
Incorrect answers:
Azure Log Analytics – Log Analytics is a tool in the Azure portal to edit and run log queries from data collected by Azure Monitor Logs and interactively analyze their results.
Azure Time Series Insights – Azure Time Series Insights Gen2 is an open and scalable end-to-end IoT analytics service featuring best-in-class user experiences and rich APIs to integrate its powerful capabilities into your existing workflow or application.
Question 70 of 75
70. Question
Which of the following tools can be used to access data stored in an Azure Storage account?
Correct
Microsoft Azure Storage Explorer is a standalone app that makes it easy to work with Azure Storage data on Windows, macOS, and Linux.
You have data stored in ADLS in text format. You need to load the data into a table in one of the databases in Azure Synapse Analytics. Which technology would you use for this purpose?
Correct
PolyBase enables your SQL Server instance to process Transact-SQL queries that read data from external data sources. SQL Server 2016 and higher can access external data in Hadoop and Azure Blob Storage. Starting in SQL Server 2019, you can now use PolyBase to access external data in SQL Server, Oracle, Teradata, and MongoDB.
PolyBase provides these same functionalities for the following SQL products from Microsoft:
SQL Server 2016 and later versions (Windows only)
Analytics Platform System (formerly Parallel Data Warehouse)
Azure Synapse Analytics https://docs.microsoft.com/en-us/sql/relational-databases/polybase/polybase-guide?view=sql-server-ver15
Incorrect answers – The other terms are not related to Azure.
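A minimal T-SQL sketch of the PolyBase objects involved, including the two components that question 52 refers to (an external data source and an external file format), plus an external table over the ADLS text files. All names, paths, and columns are hypothetical, and the database scoped credential is assumed to already exist.
-- AdlsCredential is assumed to be an existing database scoped credential.
CREATE EXTERNAL DATA SOURCE AdlsSource
WITH (
    TYPE = HADOOP,
    LOCATION = 'abfss://data@myaccount.dfs.core.windows.net',
    CREDENTIAL = AdlsCredential
);

CREATE EXTERNAL FILE FORMAT TextFileFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2)
);

-- External table over the text files in ADLS (columns are illustrative).
CREATE EXTERNAL TABLE dbo.ExtSales (
    SaleId   int,
    Amount   decimal(18, 2),
    SaleDate date
)
WITH (
    DATA_SOURCE = AdlsSource,
    LOCATION = '/sales/',
    FILE_FORMAT = TextFileFormat
);

-- Load the external data into a regular table in the dedicated SQL pool.
CREATE TABLE dbo.Sales
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT * FROM dbo.ExtSales;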
Question 72 of 75
72. Question
The production workload is facing a technical issue with one of the servers. You need to collect and analyze the logs to determine the root cause of the issue. What type of analysis would you perform?
Correct
Diagnostic analytics helps answer questions about why things happened. Diagnostic analytics techniques supplement more basic descriptive analytics.
The airline company needs to update the prices of tickets based on customers feedback, fuel price and other factors. It analyses the data which gives the new prices of tickets. What type of analysis does this come under?
Correct
Prescriptive analytics helps answer questions about what actions should be taken to achieve a goal or target. By using insights from predictive analytics, data-driven decisions can be made.
Relational databases use _________________________ to enforce relationships between different data tables and _____________________ to enforce referential integrity.
Correct
A FOREIGN KEY is a key used to link two tables together.
A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table.
The table containing the foreign key is called the child table, and the table containing the candidate key is called the referenced or parent table. https://www.w3schools.com/sql/sql_foreignkey.asp
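A minimal SQL sketch of the relationship the answer describes; the table and column names are illustrative.
-- Parent (referenced) table with a primary key.
CREATE TABLE Customers (
    CustomerId   int PRIMARY KEY,
    CustomerName nvarchar(100) NOT NULL
);

-- Child table; the foreign key links the two tables and enforces referential
-- integrity, so an order cannot reference a customer that does not exist.
CREATE TABLE Orders (
    OrderId    int PRIMARY KEY,
    CustomerId int NOT NULL,
    CONSTRAINT FK_Orders_Customers
        FOREIGN KEY (CustomerId) REFERENCES Customers (CustomerId)
);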
Question 75 of 75
75. Question
An application will use Microsoft Azure Cosmos DB as its data solution. The application will use the Cassandra API to support a column-based database type that uses containers to store items.
You need to provision Azure Cosmos DB. Which container name and item name should you use?