Chapters 7 and 9 covered what you need to know about scaling for the exam. The content here summarizes the three important concepts related to scaling. The first concept concerns the options you have for scaling an Azure Stream Analytics job. The second concept concerns […]
Author: Cassandra Sampson
Handle Interruptions – Troubleshoot Data Storage Processing
An interruption to the processing of the data stream flowing through your Azure Stream Analytics job can take many forms. One of the most catastrophic examples is a storm or other event that results in the closure of all datacenters in a given Azure region. Although these events […]
Monitor Batches and Pipelines – Troubleshoot Data Storage Processing
This section is a follow‐up to Chapter 6. It is placed here so that you can draw on what you have read about logging, monitoring, optimizing, and troubleshooting techniques in this chapter and Chapter 9. Handle Failed Batch Loads: There are many actions you can take within the Azure Batch job itself from a coding perspective. In […]
Scale Resources – Troubleshoot Data Storage Processing
Figure 6.6 shows the node size selected when you provisioned your Azure Batch pool. Notice that the Mode toggle switch is set to Fixed, with a target dedicated nodes value of 2. This means the amount of compute capacity allocated to this pool is fixed and will not scale. If the utilization of the allocated […]
Design and Develop a Batch Processing Solution – Troubleshoot Data Storage Processing
The processes discussed here were introduced in Chapter 6. A reason for introducing them there and following up here is that many of the concepts—such as logging, monitoring, and error handling—had not yet been covered. At this point, however, it is just a matter of connecting the dots and providing more detail within the context […]
Rewrite User‐Defined Functions – Troubleshoot Data Storage Processing
The description of user‐defined functions (UDFs) in Chapter 2 is very informative, so have a look back at it if you need a refresher. In general terms, a UDF is a code snippet that performs some action on your data. These code snippets are most commonly invoked by their method name from within either SQL […]
Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-4
Exceptions will have a major impact on performance, even if handled, so you should log them, set up alerts for when they happen, and work toward avoiding them altogether. Chapter 6, “Create and Manage Batch Processing and Pipelines,” introduced the different execution paths (aka conditions) that can be taken between pipeline activities. As shown in Figure […]
Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-3
The pop‐out window enables you to select the integration runtime to use for the pipeline execution. Data flows often perform very large ingestion and transformational activities, and this additional amount of compute power is required to process them. The default amount of time to keep the IR active is 1 hour, but if you need […]
Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-2
The last topic to mention in the context of slowness has to do with overutilized compute resources. This is one of the most common scenarios you will encounter. The metrics you configure to monitor the health of your data analytics pipeline should target this specifically. When those metrics show that compute resources are under pressure, […]
Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-1
It is almost certain that at some point while running your data analytics procedures, something unexpected will happen. When it does, you must gather information like the symptoms experienced and the log files that will help you get down to the reason for the behavior. Knowing what you read in the last section about the […]