Skip to content

Conversation

stone-98
Copy link
Contributor

@stone-98 stone-98 commented Mar 20, 2024

What is the purpose of the change

Nacos supports fuzzy config listening.

Config a fuzzy listen scheme(配置模糊监听方案)

Brief changelog

Nacos supports fuzzy config listening.

Verifying this change

XXXX

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a Github issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Format the pull request title like [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write necessary unit-test to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add integration-test in test module.
  • Run mvn -B clean package apache-rat:check findbugs:findbugs -Dmaven.test.skip=true to make sure basic checks pass. Run mvn clean install -DskipITs to make sure unit-test pass. Run mvn clean test-compile failsafe:integration-test to make sure integration-test pass.

@codecov-commenter
Copy link

codecov-commenter commented Mar 20, 2024

Codecov Report

Attention: Patch coverage is 15.03532% with 842 lines in your changes are missing coverage. Please review.

Project coverage is 66.61%. Comparing base (64b9e8d) to head (7f84376).
Report is 13 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             develop   #11856      +/-   ##
=============================================
- Coverage      67.81%   66.61%   -1.20%     
- Complexity      8893     8945      +52     
=============================================
  Files           1237     1254      +17     
  Lines          40443    41399     +956     
  Branches        4286     4422     +136     
=============================================
+ Hits           27425    27580     +155     
- Misses         11046    11805     +759     
- Partials        1972     2014      +42     
Files Coverage Δ
...acos/client/config/impl/ConfigTransportClient.java 86.11% <ø> (ø)
...n/java/com/alibaba/nacos/api/common/Constants.java 53.84% <0.00%> (-4.49%) ⬇️
...emote/response/ConfigBatchFuzzyListenResponse.java 0.00% <0.00%> (ø)
...mote/response/FuzzyListenNotifyChangeResponse.java 0.00% <0.00%> (ø)
...remote/response/FuzzyListenNotifyDiffResponse.java 0.00% <0.00%> (ø)
...m/alibaba/nacos/config/server/model/CacheItem.java 76.38% <0.00%> (-3.33%) ⬇️
...java/com/alibaba/nacos/client/utils/ParamUtil.java 86.17% <62.50%> (-5.88%) ⬇️
...onfig/server/configuration/ConfigCommonConfig.java 72.72% <40.00%> (-27.28%) ⬇️
...mote/request/AbstractFuzzyListenNotifyRequest.java 0.00% <0.00%> (ø)
...acos/config/server/service/ConfigCacheService.java 75.34% <0.00%> (-1.09%) ⬇️
... and 17 more

... and 32 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 64b9e8d...7f84376. Read the comment docs.

@KomachiSion KomachiSion added the version/3.x Nacos 3.0 Architecture Evolution label Mar 25, 2024
@KomachiSion
Copy link
Collaborator

Is this PR should be merged into Nacos 3.0 branch? @shiyiyue1102

// Check if the context is consistent with the server
if (context.getIsConsistentWithServer().get()) {
// Skip if a full synchronization is not needed
if (!needAllSync) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么不在前面直接返回,要在for循环来continue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

因为ClientWorker中包含所有的模糊监听上下文(多个模糊监听表达式),每个都有属于自身的isConsistentWithServer字段,所以需要在for循环里面continue。(可能存在有些已经与服务端同步的,有些没有同步的)

/**
* Flag indicating whether the context is discarded.
*/
private volatile boolean isDiscard = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isDiscard字段的存在是不是让逻辑更复杂了,取消订阅的时候直接删除订阅的数据是不是就可以了

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isDiscard如果为空时,代表此模糊监听上下文已经没有Listener,需要通知服务端取消订阅,让服务端不再进行通知。
如果去除该字段,直接用listener == null判断其实也可行。

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不过原有普通订阅也有这个字段,保持一致没毛病

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

// Get the connection for the client
Connection connection = connectionManager.getConnection(event.getClientId());
if (connection == null) {
// If connection is not available, return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里增加一些日志更好,不然可能客户端任务订阅成了,但是服务端这边又没有日志验证

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

ConfigExecutor.getClientConfigNotifierServiceExecutor()
.schedule(retryTask, retryTask.tryTimes * 2L, TimeUnit.SECONDS);
} else {
// Client is already offline, ignore the task.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

建议补充client 离线的 log,方便后续问题排查

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

hujun-w-2
hujun-w-2 previously approved these changes Apr 1, 2024
// Execute the fuzzy listen operation
ConfigBatchFuzzyListenResponse listenResponse = (ConfigBatchFuzzyListenResponse) requestProxy(
rpcClient, configBatchFuzzyListenRequest);
if (listenResponse != null && listenResponse.isSuccess()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的设计是只要请求服务端成功就认为本地的context列表与服务端一致?

Copy link
Contributor Author

@stone-98 stone-98 Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FuzzyListenContext的字段如下:
isConsistentWithServer:是否与服务端同步
isInitializing:是否在初始化中
当请求服务端成功,isConsistentWithServer = true,代表本地的context与服务端同步了(但是列表未一致),但是此时isInitializing = true,需要等到服务端将所有的配置推送到客户端后,才将isInitializing = false,代表已经完成了初始化,此时本地的context的配置列表才与服务端一致。
然后在断开连接时,会将isConsistentWithServer = false,代表需要重新同步。

@hujun-w-2
Copy link
Collaborator

目前的实现是垮了 namespace ,这个与 Nacos 的隔离理念是否不符

@stone-98
Copy link
Contributor Author

stone-98 commented Apr 3, 2024

目前的实现是垮了 namespace ,这个与 Nacos 的隔离理念是否不符

大佬,我没太理解这句话的意思,目前的实现客户端是不跨namespace的,服务端是跨namespace的,只是说request带了namespace。

@hujun-w-2
Copy link
Collaborator

目前的实现是垮了 namespace ,这个与 Nacos 的隔离理念是否不符

大佬,我没太理解这句话的意思,目前的实现客户端是不跨namespace的,服务端是跨namespace的,只是说request带了namespace。

嗯,我是担心不同namespace的监听数据冲突,我刚刚又仔细看了一下,确实客户端维护的 fuzzyListenContextMap 是每个ns都是单独的,服务端的唯一key也包含了namespace。

@hujun-w-2
Copy link
Collaborator

@KomachiSion 是否考虑先合并到2.x呢?

@hujun-w-2
Copy link
Collaborator

以前单个配置监听定时批量监听的时候会返回change的配置列表,相当于有个兜底的与服务端对账,模糊监听的当时有考虑这个吗

}

// Execute fuzzy listen operation for addition
doExecuteConfigFuzzyListen(addContextMap, true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里先分类成 addContextMap和removeContextMap,再调用两次doExecuteConfigFuzzyListen 看着有点绕,为什么不直接遍历fuzzyListenContextMap,执行一次doExecuteConfigFuzzyListen呢,反正doExecuteConfigFuzzyListen内部都会判断分类

Copy link
Contributor Author

@stone-98 stone-98 Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

确实,仔细看了下,这个确实有点复杂了,后续我更新下。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

/**
* Set of fuzzy listening contexts.
*/
private Set<Context> contexts = new HashSet<>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的context为什么不直接用FuzzyListenContext ,相当于两个类做同样的事情

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里理论上使用FuzzyListenContext是ok的,但是FuzzyListenContext 不单单是一个实体,还承载着通知listener的能力,所以这个请求里面就直接使用了一个内部类去存关键字段,这里是否有必要直接使用FuzzyListenContext我也有点纠结......

@stone-98
Copy link
Contributor Author

stone-98 commented Apr 6, 2024

以前单个配置监听定时批量监听的时候会返回change的配置列表,相当于有个兜底的与服务端对账,模糊监听的当时有考虑这个吗

这里的ConfigBatchFuzzyListenRequest分别有两个作用:

  • 创建以及取消模糊监听
  • 对账(客户端此上下文匹配的配置与服务端此上下文匹配的配置进行对账,然后将变更的配置会推送至SDK)

@stone-98 stone-98 changed the title Nacos supports fuzzy config listening. Nacos supports fuzzy config listen. Apr 7, 2024
@hujun-w-2
Copy link
Collaborator

以前单个配置监听定时批量监听的时候会返回change的配置列表,相当于有个兜底的与服务端对账,模糊监听的当时有考虑这个吗

这里的ConfigBatchFuzzyListenRequest分别有两个作用:

  • 创建以及取消模糊监听
  • 对账(客户端此上下文匹配的配置与服务端此上下文匹配的配置进行对账,然后将变更的配置会推送至SDK)

目前对账的逻辑是每隔5分钟服务端会把所有配置重新推一遍,这个开销是不是太大了,如果多个客户端同时监听服务端多个配置,那推送的量是很大的,这块感觉需要优化一下

@hujun-w-2
Copy link
Collaborator

以前单个配置监听定时批量监听的时候会返回change的配置列表,相当于有个兜底的与服务端对账,模糊监听的当时有考虑这个吗

这里的ConfigBatchFuzzyListenRequest分别有两个作用:

  • 创建以及取消模糊监听
  • 对账(客户端此上下文匹配的配置与服务端此上下文匹配的配置进行对账,然后将变更的配置会推送至SDK)

目前对账的逻辑是每隔5分钟服务端会把所有配置重新推一遍,这个开销是不是太大了,如果多个客户端同时监听服务端多个配置,那推送的量是很大的,这块感觉需要优化一下

优化思路:客户端批量对比的时候将同一个groupPattern下的配置 md5 集合再计算一个 md5 提供给服务端,服务端如果对比发现有变更的话告诉客户端需要变更,客户端再将同一个groupPattern下的所有 md5 传给服务端,服务端再将不一致的重新推一编。这个只是一个简单的思路,应该还有更好的方案

@stone-98
Copy link
Contributor Author

以前单个配置监听定时批量监听的时候会返回change的配置列表,相当于有个兜底的与服务端对账,模糊监听的当时有考虑这个吗

这里的ConfigBatchFuzzyListenRequest分别有两个作用:

  • 创建以及取消模糊监听
  • 对账(客户端此上下文匹配的配置与服务端此上下文匹配的配置进行对账,然后将变更的配置会推送至SDK)

目前对账的逻辑是每隔5分钟服务端会把所有配置重新推一遍,这个开销是不是太大了,如果多个客户端同时监听服务端多个配置,那推送的量是很大的,这块感觉需要优化一下

优化思路:客户端批量对比的时候将同一个groupPattern下的配置 md5 集合再计算一个 md5 提供给服务端,服务端如果对比发现有变更的话告诉客户端需要变更,客户端再将同一个groupPattern下的所有 md5 传给服务端,服务端再将不一致的重新推一编。这个只是一个简单的思路,应该还有更好的方案

后续有时间会处理一下。

@shiyiyue1102 shiyiyue1102 changed the title Nacos supports fuzzy config listen. Config fuzzy watch support Dec 26, 2024
@shiyiyue1102 shiyiyue1102 added area/Config area/Client Related to Nacos Client SDK labels Dec 26, 2024
@shiyiyue1102 shiyiyue1102 changed the base branch from develop to v3.0-develop December 26, 2024 10:25
@shiyiyue1102 shiyiyue1102 changed the base branch from v3.0-develop to v3.0-develop-support-fuzzy-watch December 27, 2024 02:07
@shiyiyue1102 shiyiyue1102 merged commit 17bd04e into alibaba:v3.0-develop-support-fuzzy-watch Dec 27, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/Client Related to Nacos Client SDK area/Config version/3.x Nacos 3.0 Architecture Evolution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants