Skip to content

Conversation

cheenamalhotra
Copy link
Member

@cheenamalhotra cheenamalhotra commented Jul 9, 2025

Description

This PR fixes a couple of issues:

  1. Concurrency issue with AKV cache: If parallel threads access the Local or Global CEK caches at the same time, they end up all making requests to Azure Key Vault to acquire key information. This causes unnecessary requests over the wire for unwrapping key which is already "supposedly" in cache.
    • This is addressed by usage of Semaphore Slim to limit cache access to one thread at a time.
  2. Sync over Async: As of now requests to cache Key are made asynchronously, causing thread starvation due to sync-over-async issue. Since AKV provider is purely sync, we should not be making any async calls synchronously.
    • This is addressed by calling sync APIs to fetch key information from AKV.

Issues

Fixes above issues, and additionally:

Testing

Repro can be found here: https://gist.github.com/cheenamalhotra/79f8ae709147e0b52239fd54c3049932

When registering AKV provider globally:

  • With current driver: The number of 'unwrap' key requests triggered by Azure Key Vault SDK are more than 1 for parallel operations tests.
  • With fix: Only 1 Unwrap call is made by the driver as Cached Encryption key is available.

Guidelines

Please review the contribution guidelines before submitting a pull request:

P.S. We might want to consider it for 6.1.0 as it's critically needed for a customer.

@Copilot Copilot AI review requested due to automatic review settings July 9, 2025 00:37
@cheenamalhotra cheenamalhotra requested a review from a team as a code owner July 9, 2025 00:37
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the Azure Key Vault (AKV) provider cache design to address cache TTL enforcement, remove sync-over-async calls, and prevent redundant cache fetches under concurrency.

  • Modernized SqlSymmetricKeyCache to use target-typed new, inline TryGetValue, and removed forced TTL reset.
  • Switched SqlConnection to C# 12 collection expressions and dropped global TTL zeroing.
  • Refactored SqlColumnEncryptionAzureKeyVaultProvider and AzureSqlKeyCryptographer from async task-based caching to synchronous patterns with a semaphore guard.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlSymmetricKeyCache.cs Modernized cache lookup and entry options; removed TTL override
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlConnection.cs Replaced new List<string>(…) with [ .. keys]; removed TTL zeroing
src/Microsoft.Data.SqlClient/add-ons/AzureKeyVaultProvider/SqlColumnEncryptionAzureKeyVaultProvider.cs Added SemaphoreSlim around local cache access
src/Microsoft.Data.SqlClient/add-ons/AzureKeyVaultProvider/LocalCache.cs Adjusted using order; removed unused Contains method
src/Microsoft.Data.SqlClient/add-ons/AzureKeyVaultProvider/AzureSqlKeyCryptographer.cs Replaced async fetch/dictionaries with sync KeyVaultKey cache
Comments suppressed due to low confidence (5)

src/Microsoft.Data.SqlClient/add-ons/AzureKeyVaultProvider/AzureSqlKeyCryptographer.cs:31

  • [nitpick] The _keyFetchTaskDictionary now stores KeyVaultKey instances rather than tasks. Consider renaming it to better reflect its contents, e.g. _keyCacheDictionary or _fetchedKeys.
        private readonly ConcurrentDictionary<string, KeyVaultKey> _keyFetchTaskDictionary = new();

src/Microsoft.Data.SqlClient/add-ons/AzureKeyVaultProvider/AzureSqlKeyCryptographer.cs:175

  • FetchKeyFromKeyVault retrieves keyClient but never initializes it. You need to call CreateKeyClient(vaultUri) before accessing _keyClientDictionary or handle the case where keyClient is null.
            return keyClient?.GetKey(keyName, keyVersion);

src/Microsoft.Data.SqlClient/add-ons/AzureKeyVaultProvider/AzureSqlKeyCryptographer.cs:81

  • With the async/task-based guard removed, parallel calls for a missing key will all invoke FetchKeyFromKeyVault. Consider adding a lock or double-checked locking to prevent N+1 fetches under concurrency.
            if (_keyFetchTaskDictionary.TryGetValue(keyIdentifierUri, out KeyVaultKey keyVaultKey))

src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlConnection.cs:279

  • The method signature returns List<string>, but the collection expression [...] produces an array by default. This will cause a type mismatch. Either construct a List<string> explicitly or verify that C# 12 collection expressions target the declared return type.
                return [.. s_systemColumnEncryptionKeyStoreProviders.Keys];

src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlConnection.cs:294

  • Same as above: returning a collection expression for a method declared to return List<string> may not compile. Consider new List<string>(...) or confirm the collection expression creates a List<string>.
                return [.. _customColumnEncryptionKeyStoreProviders.Keys];

@cheenamalhotra cheenamalhotra added this to the 6.1.0 milestone Jul 9, 2025
@cheenamalhotra cheenamalhotra requested a review from a team July 10, 2025 06:01
Copy link
Contributor

@paulmedynski paulmedynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A variety of comments.

@paulmedynski paulmedynski requested a review from David-Engel July 14, 2025 15:30
paulmedynski
paulmedynski previously approved these changes Jul 14, 2025
mdaigle
mdaigle previously approved these changes Jul 14, 2025
@paulmedynski paulmedynski dismissed stale reviews from mdaigle and themself via a1ec8c3 July 15, 2025 11:59
Copy link

codecov bot commented Jul 15, 2025

Codecov Report

Attention: Patch coverage is 84.28571% with 11 lines in your changes missing coverage. Please review.

Project coverage is 66.86%. Comparing base (8eb9f32) to head (a1ec8c3).
Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
.../AzureKeyVaultProvider/AzureSqlKeyCryptographer.cs 75.00% 7 Missing ⚠️
...a.SqlClient/add-ons/AzureKeyVaultProvider/Utils.cs 0.00% 2 Missing ⚠️
...core/src/Microsoft/Data/SqlClient/SqlConnection.cs 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3464      +/-   ##
==========================================
- Coverage   68.86%   66.86%   -2.00%     
==========================================
  Files         280      276       -4     
  Lines       62417    62236     -181     
==========================================
- Hits        42982    41615    -1367     
- Misses      19435    20621    +1186     
Flag Coverage Δ
addons 90.82% <76.31%> (-1.76%) ⬇️
netcore 68.93% <93.75%> (-3.81%) ⬇️
netfx 69.31% <100.00%> (+1.27%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@David-Engel David-Engel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed trace logs are lacking connection context. But it was already like that, so not going to hold this up for that.

@mdaigle mdaigle merged commit 8435d00 into main Jul 15, 2025
251 checks passed
@mdaigle mdaigle deleted the dev/cheena/fix-azurekeyvault branch July 15, 2025 15:54
mdaigle pushed a commit that referenced this pull request Jul 15, 2025
* Fix AzureSqlKeyCryptographer to be pure sync

* Fix Symmetric key cache design + Key vault cache

* Fix tests

* Add back multi-user cache scenario

* Add cache lock in global CEK cache to prevent concurrency issues.

* Touch-ups

* More changes to streamline getKey calls

* More updates

* - Addressed review comments.

* - Fixing tests that expect certain localized strings that have changed.

---------

Co-authored-by: Paul Medynski <31868385+paulmedynski@users.noreply.github.com>
@paulmedynski paulmedynski modified the milestones: 6.1.0, 7.0-preview1 Jul 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Always encrypted: 401 errors from Azure Key Vault is cached forever
4 participants