Data Engineering on Microsoft Azure Test 1

Q1. A company is managing several business apps that use SQL databases hosted on an on-premises Microsoft SQL Server. As part of a transformation project, the company is considering moving its on-premises tech stack to Microsoft Azure. What would be the most convenient way to move the SQL databases, with as little migration effort as possible? Please select the correct answer.
Select 1 option(s):
- Azure SQL Database (managed instance)
- Azure SQL Database (single database)
- Azure Cosmos DB
- Azure Synapse Analytics

Q2. A company manages several on-premises Microsoft SQL Server databases. You're hired as a data engineer to migrate the databases to Microsoft Azure by using a Microsoft SQL Server backup process. Which data technology should you use?
Select 1 option(s):
- Azure Cosmos DB
- Azure SQL Database (single database)
- Azure SQL Database Managed Instance
- Azure Synapse Analytics

Q3. A company uses Azure SQL Database to store sensitive company data. You encrypt the data and only allow access to specified users from specified locations. You need to monitor data usage and data copied from the system to prevent data leakage, and to configure Azure SQL Database to email a specific user when data leakage occurs. What should you configure?
Select 1 option(s):
- Add the database to a failover group
- Azure SQL Data Sync
- Active geo-replication
- Enable Advanced Threat Protection

Q4. All Azure SQL databases (single, pooled, and managed instance databases) have a default backup retention period of 7 days. You can change the backup retention period to as long as 35 days. If you need to keep backups longer than the maximum retention period, you can modify the backup properties to add one or more long-term retention (LTR) periods to your database.
Select 1 option(s):
- TRUE
- FALSE

Q5.
Apache Hadoop cluster type is a good choice to optimize for Hive queries used as a batch process in Azure HDInsight.
Select 1 option(s):
- TRUE
- FALSE

Q6. Azure Databricks pools reduce cluster start and auto-scaling times by maintaining a set of idle, ready-to-use instances.
Select 1 option(s):
- TRUE
- FALSE

Q7. Azure SQL Database Managed Instance allows existing SQL Server customers to lift and shift their on-premises applications to the cloud with minimal application and database changes.
Select 1 option(s):
- TRUE
- FALSE

Q8. Azure SQL Database secures customer data by encrypting data in motion with Transport Layer Security (TLS).
Select 1 option(s):
- FALSE
- TRUE

Q9. Azure Synapse Analytics uses server-level IP firewall rules. It doesn't support database-level IP firewall rules.
Select 1 option(s):
- FALSE
- TRUE

Q10. By using the long-term retention (LTR) feature, you can store specified SQL database full backups in Azure Blob storage with read-access geo-redundant storage for up to 5 years.
Select 1 option(s):
- FALSE
- TRUE

Q11. Consider using a hash-distributed table in Synapse SQL pools when:
– The table size on disk is more than 2 GB.
– The table has frequent insert, update, and delete operations.
Select 1 option(s):
- FALSE
- TRUE

Q12. Data Discovery & Classification provides advanced capabilities for discovering, classifying, labelling, and reporting the sensitive data in your databases.
Select 1 option(s):
- FALSE
- TRUE

Q13. Data Factory offers three types of Integration Runtime (IR): Azure, self-hosted, and Azure-SSIS.
Select 1 option(s):
- FALSE
- TRUE

Q14. Does Azure SQL Database perform automated backups?
Select 1 option(s):
- No
- Yes

Q15. High concurrency clusters (Azure Databricks) work only for SQL, Python, and R. The performance, security, and fault isolation of high concurrency clusters are provided by running user code in separate processes, which is not possible in Scala.
Select 1 option(s):
- TRUE
- FALSE

Q16. How does Dynamic Data Masking (DDM) limit the exposure of sensitive data? Please select the correct answer.
Select 1 option(s):
- Using Transport Layer Security (TLS) encryption
- Using round-robin distributed tables
- Using row-data encryption
- Masking it to non-privileged users
- Using hash-distributed tables

Q17. Indexing is helpful for reading tables quickly. For tables with up to 100 million rows, the best-fitting index type is the clustered index.
Select 1 option(s):
- FALSE
- TRUE

Q18. Interactive Query cluster type is a good choice to optimize for ad hoc, interactive queries in Azure HDInsight.
Select 1 option(s):
- TRUE
- FALSE

Q19. Members of the db_accessadmin fixed database role can add or remove access to the database for Windows logins, Windows groups, and SQL Server logins.
Select 1 option(s):
- TRUE
- FALSE

Q20. Members of the db_backupoperator fixed database role can back up the database.
Select 1 option(s):
- TRUE
- FALSE

Q21. Members of the db_owner fixed database role can perform all configuration and maintenance activities on the database, and can also drop the database in SQL Server.
Select 1 option(s):
- TRUE
- FALSE

Q22. Please select the correct statements.
Select 3 option(s):
- Azure SQL Database uses SQL Server technology to create differential backups every 24 hours
- Azure SQL Database uses SQL Server technology to create differential backups every 12 hours
- Azure SQL Database uses SQL Server technology to create transaction log backups every 5 to 10 minutes
- Azure SQL Database uses SQL Server technology to create full backups every week
- Azure SQL Database uses SQL Server technology to create full backups every day

Q23. Please select all correct statements about Transparent Data Encryption (TDE) on Microsoft SQL Server.
Select 4 option(s):
- TDE protects data in transit using the TLS protocol
- TDE performs real-time I/O encryption and decryption of the data and log files.
- TDE protects data "at rest", meaning the data and log files.
- TDE provides the ability to comply with many laws, regulations, and guidelines established in various industries.
- TDE protects data "at rest", meaning the data without log files.
- For encryption, TDE uses a database encryption key (DEK), which is stored in the database boot record for availability during recovery.

Q24. Premium and Business Critical Azure SQL Database service tiers leverage the Premium availability model, which integrates compute resources (the SQL Server database engine process) and storage (locally attached SSD) on a single node.
Select 1 option(s):
- FALSE
- TRUE

Q25. A replicated table is a great fit for small dimension tables in a star schema with less than 2 GB of storage after compression.
Select 1 option(s):
- FALSE
- TRUE

Q26. A round-robin distributed table distributes data evenly across the table but without any further optimization.
Select 1 option(s):
- FALSE
- TRUE

Q27. Select all correct statements about Azure Blob Storage.
Select 3 option(s):
- It stores virtual machine disks
- It stores structured object data
- It stores pictures, videos, and text files
- It stores unstructured data

Q28. SQL Database dynamic data masking (DDM) limits sensitive data exposure by masking it to non-privileged users.
Select 1 option(s):
- TRUE
- FALSE

Q29. Synapse SQL leverages a scale-out architecture where compute is separate from storage, which enables you to scale compute independently of the data in your system.
Select 1 option(s):
- FALSE
- TRUE

Q30. The key benefits of high concurrency clusters are that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies.
Select 1 option(s):
- FALSE
- TRUE

Q31. The compute (Azure Databricks) and storage sources must be in the same region so that jobs don't experience high latency.
Select 1 option(s):
- TRUE
- FALSE

Q32.
Transparent data encryption (TDE) helps protect Azure SQL Database against the threat of malicious offline activity by encrypting data at rest.
Select 1 option(s):
- TRUE
- FALSE

Q33. Using SQL Server technology, Azure SQL Database creates full backups every week, differential backups every 12 hours, and transaction log backups every 5 to 10 minutes.
Select 1 option(s):
- FALSE
- TRUE

Q34. What is the key advantage of choosing an Azure SQL database over SQL Server hosted on a virtual machine (IaaS)?
Select 1 option(s):
- Works better if you want to have complete control over the environment
- If you want to extend your on-prem environment to Azure
- A private IP address within an Azure VNet
- Speeds up the time to market for your application

Q35. What is the most convenient way to store large amounts of unstructured text and binary data such as video, audio, and pictures?
Select 1 option(s):
- Azure Page Blob
- Azure Queue Storage
- Azure Block Blob
- Azure Table Storage
- Azure SQL Database

Q36. When TDE is enabled for an Azure SQL database, backups are also encrypted.
Select 1 option(s):
- TRUE
- FALSE

Q37. Which Azure blob type is optimised for storing cloud objects and streaming?
Select 1 option(s):
- Block
- Standard
- General
- Page

Q38. Which Azure SQL Database feature provides peak performance and stable workloads through continuous performance tuning based on AI and machine learning? Please select the correct answer.
Select 1 option(s):
- Azure SQL Analytics
- Automatic Tuning
- Azure Monitor
- Azure Activity Log

Q39. Which Azure SQL deployment option would you choose in order to:
– have full control over the SQL Server engine
– have a private IP address within an Azure VNet
– manage your own backups and patches
– run SQL Server instances with up to 256 TB of storage, where the instance can support as many databases as needed?
Select 1 option(s):
- Azure SQL Database - Managed Instance
- Azure SQL Elastic Pool
- Azure SQL Database
- Azure SQL Virtual Machine

Q40.
Which component of Azure Data Factory defines the information needed to connect to external resources?
Select 1 option(s):
- Linked Service
- Pipeline
- Activity
- Storage

Q41. Which inputs can Azure Stream Analytics use to stream data from?
Select 3 option(s):
- Event Hubs
- File Storage
- Queue Storage
- Blob Storage
- IoT Hub

Q42. Which service should you use to run a compatibility assessment of an on-premises Microsoft SQL database before moving it to the Azure cloud?
Select 1 option(s):
- Data Migration Assistant (DMA)
- SQL vulnerability assessment
- SQL Server Migration Assistant (SSMA)
- Azure SQL Data Sync

Q43. Which statement is NOT correct?
Select 1 option(s):
- Standard clusters are configured to not terminate automatically
- High concurrency clusters work only for SQL, Python, and R.
- Standard clusters can run workloads developed in Python, R, Scala, and SQL.
- Standard clusters are recommended for a single user.

Q44. Which statement is NOT correct?
Select 1 option(s):
- The hot access tier is used for data that is in active use or expected to be accessed (read from and written to) frequently.
- The cool access tier is optimized for storing data that is infrequently accessed and stored for at least 30 days.
- The hot access tier has lower storage costs and higher access costs compared to cool storage.
- The cool access tier has lower storage costs and higher access costs compared to hot storage.
- Archive storage stores data offline and offers the lowest storage costs but also the highest data rehydration and access costs.

Q45. Which tasks do you need to complete in order to create a regional disaster recovery topology for Azure Databricks? Please select all the correct answers.
Select 3 option(s):
- Once the secondary region is created, you must migrate the users, user folders, notebooks, cluster configuration, jobs configuration, libraries, storage, and init scripts, and reconfigure access control.
- Provision multiple Azure Databricks workspaces in separate Azure regions.
- Once the secondary region is created, you don't need to migrate the users, user folders, notebooks, cluster configuration, jobs configuration, libraries, storage, and init scripts, or reconfigure access control; these tasks are done automatically.
- Use geo-redundant storage (GRS)
- Use locally redundant storage (LRS)
- Provision multiple Azure Databricks workspaces in separate Azure zones.

Q46. Which trigger types does Azure Data Factory support? Select all correct answers.
Select 3 option(s):
- Schedule trigger
- Sliding window trigger
- Session trigger
- Event-based trigger
- Tumbling window trigger

Q47. You are building a data engineering pipeline using Microsoft Azure services. You are asked to deploy an analytical data store to meet the following requirements:
– support for T-SQL and Spark SQL
– enterprise data warehousing and parallel processing
– deep integration with Azure Machine Learning and Power BI
What would you suggest?
Select 1 option(s):
- Azure Synapse
- Azure SQL Database
- HDInsight HBase
- Azure Cosmos DB

Q48. You are building a data engineering pipeline using Microsoft Azure services. You need to provision a data ingestion solution to meet the following requirements:
– can ingest millions of events per second
– fully managed PaaS solution
– integration with Azure Functions
Which technology would you use?
Select 1 option(s):
- Azure HDInsight
- Azure Event Hubs
- Azure HDInsight Spark
- Azure HDInsight Hadoop

Q49. You are building a data engineering pipeline using Microsoft Azure services. You need to provision a data processing solution to meet the following requirements:
– use a fully managed Apache Spark environment
– support autoscaling and auto-termination of clusters
– support the Python, R, Scala, and SQL programming languages
Select 1 option(s):
- Azure Databricks
- Azure Data Lake Gen2
- HDInsight Hadoop
- HDInsight Interactive Query

Q50. You are building a data engineering pipeline using Microsoft Azure services.
You need to provision a data storage solution to meet the following requirements:
– can store petabytes (PB) of data in various formats
– uses a hierarchical namespace to achieve the scalability and cost-effectiveness of object storage
– storage is optimised for big data analytics workloads
Select 1 option(s):
- Azure File Storage
- Azure Blob Storage
- Azure Data Lake Storage Gen2
- Azure Cosmos DB
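Q11, Q25, and Q26 above concern table distributions in Synapse SQL dedicated pools, which spread every table's rows across 60 distributions. As a rough illustration of the difference between the strategies (a minimal sketch; the hash function here is zlib.crc32 for demonstration, not Synapse's actual algorithm):

```python
# Sketch of how a Synapse dedicated SQL pool places rows across its
# 60 distributions. The hash function (zlib.crc32) is an illustrative
# stand-in, not the one Synapse actually uses.
import zlib

DISTRIBUTIONS = 60

def hash_distribution(row, column):
    """Rows with the same value in the distribution column always land
    on the same distribution, so joins/aggregations on that column avoid
    data movement."""
    value = str(row[column]).encode()
    return zlib.crc32(value) % DISTRIBUTIONS

def round_robin_distribution(rows):
    """Rows are dealt out evenly in arrival order, with no relation to
    their content - even spread, but no further optimization."""
    return {i: i % DISTRIBUTIONS for i in range(len(rows))}

rows = [{"customer_id": cid, "amount": cid * 10} for cid in range(1, 7)]

# Hash distribution is deterministic per column value.
assert (hash_distribution(rows[0], "customer_id")
        == hash_distribution({"customer_id": 1}, "customer_id"))

# Round-robin simply cycles through the distributions.
assert round_robin_distribution(rows) == {i: i % 60 for i in range(6)}
```

A replicated table, by contrast, keeps a full copy of the table on every compute node, which is why it suits small dimension tables (under 2 GB compressed) in a star schema.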
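Q16 and Q28 describe dynamic data masking, which masks query results for non-privileged users while leaving the stored data unchanged. The behaviour can be sketched as follows (a simplified illustration; the masked form mimics the SQL Database default email mask, which keeps the first letter and a constant "XX@XXXX.com" suffix):

```python
# Sketch of dynamic data masking (DDM) for email values. Privileged
# users see the real value; non-privileged users see the masked form.
# The stored value itself is never altered - DDM masks only the results.
def mask_email(value: str) -> str:
    # Mimics the SQL Database email() mask: first letter kept,
    # the rest replaced by a constant pattern.
    return value[0] + "XX@XXXX.com"

def read_column(value: str, user_is_privileged: bool) -> str:
    return value if user_is_privileged else mask_email(value)

assert read_column("alice@contoso.com", user_is_privileged=False) == "aXX@XXXX.com"
assert read_column("alice@contoso.com", user_is_privileged=True) == "alice@contoso.com"
```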
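Q22 and Q33 describe Azure SQL Database's automated backup cadence: full backups weekly, differential backups every 12 hours, and transaction log backups every 5 to 10 minutes. A point-in-time restore replays the most recent full backup, then the most recent differential, then the logs up to the restore point. A simplified sketch of that chain (illustrative only, not the service's actual restore logic):

```python
# Sketch of the backup chain used for point-in-time restore:
# weekly full backup -> differentials every 12 hours -> log backups.
from datetime import datetime, timedelta

DIFF_INTERVAL = timedelta(hours=12)

def restore_chain(restore_point, last_full):
    """Return (full backup time, latest differential at or before the
    restore point). Log backups then cover the remaining gap."""
    diffs_since_full = int((restore_point - last_full) / DIFF_INTERVAL)
    last_diff = last_full + diffs_since_full * DIFF_INTERVAL
    return last_full, last_diff

full = datetime(2024, 1, 1, 0, 0)
point = datetime(2024, 1, 3, 7, 30)
f, d = restore_chain(point, full)
assert f == datetime(2024, 1, 1, 0, 0)
assert d == datetime(2024, 1, 3, 0, 0)  # differentials at 00:00 and 12:00 each day
```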