Author: Cassandra Sampson

Scale Resources – Troubleshoot Data Storage Processing

Chapter 7 and Chapter 9 covered what you need to know for the exam concerning scaling. The intention of the content here is to summarize the three important concepts related to scaling. The first concept has to do with understanding the options you have regarding scaling an Azure Stream Analytics job. The second concept concerns […]

10/03/2022 Cassandra Sampson

Handle Interruptions – Troubleshoot Data Storage Processing

An interruption to the processing of the data stream flowing through your Azure Stream Analytics job can take many forms. One of the most catastrophic is an event, such as a storm, that results in the closure of all datacenters in a given Azure region. Although these events […]

08/02/2022 Cassandra Sampson

Monitor Batches and Pipelines – Troubleshoot Data Storage Processing

This section is a follow‐up to Chapter 6. It is placed here so that you can revisit that content having read about the logging, monitoring, optimizing, and troubleshooting techniques in this chapter and Chapter 9. Handle Failed Batch Loads: There are many actions you can take within the Azure Batch job itself from a coding perspective. In […]

07/16/2022 Cassandra Sampson

Scale Resources – Troubleshoot Data Storage Processing

Figure 6.6 shows the node size you selected when you provisioned your Azure Batch pool. Notice that the Mode toggle switch is set to Fixed, with a target dedicated nodes value of 2. This means the amount of compute capacity allocated to this pool is fixed and will not scale. If the utilization of the allocated […]

06/01/2022 Cassandra Sampson

Design and Develop a Batch Processing Solution – Troubleshoot Data Storage Processing

The processes discussed here were introduced in Chapter 6. A reason for introducing them there and following up here is that many of the concepts—such as logging, monitoring, and error handling—had not yet been covered. At this point, however, it is just a matter of connecting the dots and providing more detail within the context […]

04/19/2022 Cassandra Sampson

Rewrite User‐Defined Functions – Troubleshoot Data Storage Processing

The description of user‐defined functions (UDF) in Chapter 2 is very informative, so have a look back at it if you need a refresher. In general terms, a UDF is a code snippet that performs some action on your data. These code snippets are most commonly triggered using the method name from within either SQL […]

03/03/2022 Cassandra Sampson

Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-4

Exceptions will have a major impact on performance, even if handled, so you should log them, set up alerts when they happen, and work toward avoiding them altogether. Chapter 6, “Create and Manage Batch Processing and Pipelines,” introduced the different execution paths (aka conditions) that can be taken between pipeline activities. As shown in Figure […]

01/26/2022 Cassandra Sampson

Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-3

The pop‐out window enables you to select the integration runtime to use for the pipeline execution. Data flows often perform very large ingestion and transformational activities, and this additional amount of compute power is required to process them. The default amount of time to keep the IR active is 1 hour, but if you need […]

12/26/2021 Cassandra Sampson

Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-2

The last topic to mention in the context of slowness has to do with overutilized compute resources. This is one of the most common scenarios you will encounter. The metrics you configure to monitor the health of your data analytics pipeline should target this specifically. When those metrics show that compute resources are under pressure, […]

10/02/2021 Cassandra Sampson

Troubleshoot a Failed Pipeline Run – Troubleshoot Data Storage Processing-1

It is almost certain that at some point while running your data analytics procedures, something unexpected will happen. When it does, you must gather information, such as the symptoms experienced and the log files, that will help you get to the root cause of the behavior. Knowing what you read in the last section about the […]

09/23/2021 Cassandra Sampson
