Showing posts with label Error. Show all posts

Thursday, October 6, 2022

Resolved: Error Creating Storage Event Trigger in Azure Synapse

My client receives external files from a vendor and wants to ingest them into the Data Lake using the integration pipelines in Synapse (Synapse's version of Azure Data Factory). Since the exact time the vendor sends the files can vary greatly day-to-day, he requested that I create a Storage Event Trigger.

I quickly set up the trigger:


Set Type to "Storage events", then select the correct storage account and container from the list associated with the current Azure Subscription. Use the "Blob path begins with" setting to filter on the folder where the files land, and use the "Blob path ends with" setting to filter on the file type (and perhaps the file name, if it's relatively consistent) so that only the right blobs invoke the trigger. Finally, set Event to "Blob created".

Click Continue to go to the next page.

There will be a warning, "Make sure you have specific filters. Configuring filters that are too broad can match a large number of files created/deleted and may significantly impact your cost." which reminds you to check that the filters on the previous page actually return only the desired set of files. Be sure that you have at least one qualifying file in the folder and that the Data Preview can find it. If not, go back to the previous page and adjust the "Blob path begins with" and "Blob path ends with" settings to correct the filtering.
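To sanity-check a pair of filters before relying on the Data Preview, the matching can be approximated locally. This is only a minimal sketch: it assumes the two settings behave as a plain prefix and suffix test on the blob path inside the container, and the function names and sample paths are made up for illustration. It does not reproduce the exact path format the trigger dialog expects.

```python
from typing import Iterable, List


def matches_trigger_filters(blob_path: str, begins_with: str = "", ends_with: str = "") -> bool:
    """Return True if the path passes both filters (assumed prefix/suffix semantics)."""
    return blob_path.startswith(begins_with) and blob_path.endswith(ends_with)


def preview(blob_paths: Iterable[str], begins_with: str = "", ends_with: str = "") -> List[str]:
    """Rough local stand-in for the Data Preview: list the paths that would match."""
    return [p for p in blob_paths if matches_trigger_filters(p, begins_with, ends_with)]


# Hypothetical blob paths, not taken from a real storage account.
sample_paths = [
    "vendor/incoming/sales_2022.csv",
    "vendor/incoming/readme.txt",
    "other/sales.csv",
]
matching = preview(sample_paths, begins_with="vendor/incoming/", ends_with=".csv")
```

If `matching` holds more than the files you expect, the filters are probably too broad, which is exactly what the warning is about.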

Click Continue to go to the next page.

This final page asks for the pipeline parameters to use when the trigger is invoked. 

Click Save to create the trigger. 



Once the trigger has been saved, publish the data factory.

So far so good. 

Then this popped up: 


The trigger needs to create and subscribe to an Event Grid event in order to be activated. Even the error was mysterious: 

"The client '{GUID}' with object id '{GUID}' does not have authorization to perform action 'Microsoft.EventGrid/eventSubscriptions/write' over scope '/subscriptions/{GUID}/resourceGroups/{resourceGroup}/providers/Microsoft.Storage/storageAccounts/{StorageAccount}/providers/Microsoft.EventGrid/eventSubscriptions/{GUID}' or the scope is invalid. If access was recently granted, please refresh your credentials."

I tried numerous searches on how to get the authorization to perform action Microsoft.EventGrid/eventSubscriptions/write and kept hitting dead ends. 

Finally, I started poking around in the Subscription settings to see if something needed to be set in there. Under "Resource Providers", I found that Microsoft.Synapse, Microsoft.Storage, Microsoft.DataLakeStore and Microsoft.EventGrid were all registered. So that felt like a dead end. 

After a bit more muddling around searching, I entered "Failed to Subscribe" in the search and found my savior: Cathrine Wilhelmsen. She had experienced exactly the same issue and had the same difficulty I had locating information on how to resolve it. She even mentioned the same articles that I read in my attempts to figure out what to do! The only thing I had not done was visit the Microsoft Q&A thread about running event triggers in Synapse - probably because I stumbled upon her blog post first! Thank you, Cathrine!!

So what was the magic trick?


The Microsoft.DataFactory resource provider was not registered. 

I hadn't expected that because we didn't have Azure Data Factory installed in this subscription, but now we know that it is required for event triggers.

Once the Admin registered Microsoft.DataFactory, I was able to successfully publish the storage event trigger. 😀 
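If you want to check this before publishing, the registration state of the providers can be read with the Azure CLI (`az provider list`), and a missing one can be registered with `az provider register --namespace Microsoft.DataFactory` by someone with sufficient rights on the subscription. The sketch below is only an illustration: the helper names and the list of required namespaces are my own assumptions based on this post, and it assumes the CLI is installed, logged in, and callable as `az`.

```python
import json
import subprocess
from typing import Iterable, List

# Namespaces mentioned in this post; treat the list as an assumption, not an official requirement.
REQUIRED = (
    "Microsoft.Synapse",
    "Microsoft.Storage",
    "Microsoft.EventGrid",
    "Microsoft.DataFactory",
)


def unregistered_providers(providers: Iterable[dict], required: Iterable[str] = REQUIRED) -> List[str]:
    """Return the required namespaces whose registrationState is not 'Registered'."""
    states = {p["namespace"].lower(): p.get("registrationState", "") for p in providers}
    return [ns for ns in required if states.get(ns.lower(), "NotRegistered") != "Registered"]


def list_providers_via_cli() -> List[dict]:
    """Fetch the provider list for the current subscription through the Azure CLI."""
    result = subprocess.run(
        ["az", "provider", "list", "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `unregistered_providers(list_providers_via_cli())` in a subscription like mine would have pointed at Microsoft.DataFactory right away.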


Thursday, November 11, 2021

CI/CD with Azure Synapse Notebooks - Error Resolved

 


Some features of Azure Synapse are mysterious. Recently, I was working on deploying Azure Synapse artifacts from development to production using the "Synapse Workspace Deployment" extension in Azure DevOps and received an odd error: 

2021-11-10T21:20:14.8670075Z For artifact: AzureSQLQueryTool: Checkstatus: 202; status message: Accepted
2021-11-10T21:20:44.9656242Z For artifact: AzureSQLQueryTool: Checkstatus: 200; status message: OK
2021-11-10T21:20:44.9661205Z For artifact: AzureSQLQueryTool: Artifact Deployment status: Failed
2021-11-10T21:20:44.9673543Z Error during execution: Error: Failed to fetch the deployment status {"code":"400","message":"Failed Component = DataFactoryResourceProvider, ErrorCode = 400, Error = BadRequest "}
2021-11-10T21:20:44.9723399Z ##[error]Encountered with exception:Error: Failed to fetch the deployment status {"code":"400","message":"Failed Component = DataFactoryResourceProvider, ErrorCode = 400, Error = BadRequest "}
2021-11-10T21:20:44.9945300Z ##[section]Finishing: Synpase deployment task for workspace: myWorkspace_prod

The new items I had added to Synapse were several Spark notebooks for ingesting data. I had tested them individually and they all appeared to be working, yet Azure DevOps' CI/CD gave me an error when it attempted to deploy the release to production. I had followed the instructions provided by Microsoft to set up the CI/CD pipeline, yet it was failing.

I attempted to add override parameters for the notebooks - each notebook was linked to the spark pool in dev which was named "sp_dev". The Production spark pool was called "sp_prod", so with parameters for the pool's name it should work, right? 

No. Same error. 

After numerous other unsuccessful attempts at deployment, I deleted the production spark pool and recreated it with the same name as the dev spark pool. The notebooks deployed without a hitch. 

If you see the above error messages in your CI/CD logs and have Spark notebooks in your Synapse deployment, the fix is to always give the Spark pools the same name in every environment.
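A quick pre-deployment check can catch the mismatch before the release runs. This is a rough sketch only: it assumes each notebook's JSON definition stores its attached pool under properties.bigDataPool.referenceName, which is how the notebook files looked in my repository but is not something I have seen formally documented, and the function names and sample data are made up for illustration.

```python
from typing import Dict, Iterable, Optional


def notebook_pool(notebook: dict) -> Optional[str]:
    """Return the Spark pool a notebook definition references, or None if it has none."""
    pool = notebook.get("properties", {}).get("bigDataPool") or {}
    return pool.get("referenceName")


def notebooks_with_unknown_pool(notebooks: Dict[str, dict], target_pools: Iterable[str]) -> Dict[str, str]:
    """Map notebook name -> referenced pool for pools that do not exist in the target workspace."""
    targets = set(target_pools)
    problems = {}
    for name, definition in notebooks.items():
        pool = notebook_pool(definition)
        if pool is not None and pool not in targets:
            problems[name] = pool
    return problems


# Hypothetical notebook definitions, trimmed to the part this check reads.
sample_notebooks = {
    "ingest_a": {"properties": {"bigDataPool": {"referenceName": "sp_dev", "type": "BigDataPoolReference"}}},
    "ingest_b": {"properties": {"bigDataPool": {"referenceName": "sp_prod", "type": "BigDataPoolReference"}}},
    "detached": {"properties": {}},
}
mismatches = notebooks_with_unknown_pool(sample_notebooks, target_pools=["sp_prod"])
```

Any entry in `mismatches` is a notebook that points at a pool the production workspace does not have, which is the situation that produced the BadRequest above.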

