Fangliding
Member

@Fangliding Fangliding commented Jul 26, 2025

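When the query fails, stuffing an invalid config in makes TLS fail, which is the expected behavior. The error reported is even tls: malformed ECHConfigList, which is quite reasonable.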

@RPRX
Member

RPRX commented Jul 26, 2025

Let me think about it; I'll decide before the next release. By the way, does the current ECH configuration work with quic-go?

@Fangliding
Member Author

Yes, it does. I said so a long time ago; even xhttp3 works.

@patterniha
Collaborator

patterniha commented Jul 27, 2025

For MitM-serverless we can use ECH for websites that have a Cloudflare IP.

But not all websites behind a Cloudflare IP support ECH.

So we should use fragment only when a website does not support ECH.

///

I will check what the best way to do this is.

@RPRX
Member

RPRX commented Jul 27, 2025

I feel the ECH feature is mainly for our friends abroad, unlike China, which has almost blocked all overseas DoH.

@patterniha
Collaborator

patterniha commented Jul 27, 2025

> which has almost blocked all overseas DoH

Use a DoH server behind Cloudflare that supports ECH.

@patterniha
Collaborator

> Use a DoH server behind Cloudflare that supports ECH.

I looked at the code: we can't use ECH for the DoH server that is itself used to query ECH.

So we should add this feature.

But for its key we can't use another DoH server (if we could, we would use that one from the beginning), so we should use a fixed key.

Does the ECH key change frequently? If not, we can use a fixed key for the DoH server, something like:

"example.com+https://doh.website.com/dns-query+ech=..."

so that we can use ECH for the DoH server that is used for ECH.

@RPRX
Member

RPRX commented Jul 27, 2025

Here comes another brain-teaser.

@RPRX
Member

RPRX commented Jul 27, 2025

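I think that until ECH generally allows custom domain names, there is no need to over-develop ECH-related features. It would be a joke if SNIs like cloudflare-ech.com got blocked one day.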

@Fangliding
Member Author

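First, this can also be queried over UDP DNS, and running your own DoH forwarder is very simple. The option to change the queried domain also prevents exposing the original domain. If ECH really were targeted, banning the outer SNI would be a more effective method for the censor than blocking type 65 records, and then there would be nothing left to play with.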

@patterniha
Collaborator

patterniha commented Jul 27, 2025

My suggestion is only for MitM+serverless, where the SNI changes every time.

If we want to connect to our own Xray server, the SNI is fixed, so we can use a fixed key from the beginning and don't need DoH at all.

///

Unfortunately, the DoH used for ECH is a "local+https" DoH.

So we can either:

  1. add "inbound-tag" and use the dispatcher
  2. simply add dialerProxy support, like dns-outbound-skip

We also need to add "h2c" for the ECH DoH.

@RPRX
Member

RPRX commented Jul 27, 2025

@patterniha Actually, just supporting h2c:// would be enough.

@RPRX
Member

RPRX commented Jul 27, 2025

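Also, if the default behavior of the various proxy software is to fall back to plaintext SNI when no ECH record can be found, I think enforcement should only be turned on by an explicit keyword such as force.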

@RPRX
Member

RPRX commented Jul 27, 2025

@Fangliding Something like https+force://

@Fangliding
Member Author

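Surely nobody wants a connection they configured with ECH to end up connecting without ECH. Besides, failures are not cached, so every request will try to query the record. I also designed it so that when several requests come in at once, only one real request goes out and the others get the cached result. Since failures do not enter the cache, all connections would be serialized, each one getting through only after a five-second timeout. That is not normal behavior at all.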
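
For illustration, the de-duplication described above (concurrent callers share one outgoing query, and failures are not stored) could be sketched roughly as follows. This is a hand-rolled sketch, not the Xray-core implementation; `flightGroup`, `call` and `Do` are hypothetical names.

```go
package main

import "sync"

// call tracks one in-flight lookup that other callers can wait on.
type call struct {
	done chan struct{} // closed when the lookup has finished
	val  []byte
	err  error
}

// flightGroup (hypothetical name) lets concurrent callers for the same key
// share a single lookup and caches only successful results.
type flightGroup struct {
	mu       sync.Mutex
	inflight map[string]*call
	cache    map[string][]byte
}

func newFlightGroup() *flightGroup {
	return &flightGroup{inflight: map[string]*call{}, cache: map[string][]byte{}}
}

// Do returns the cached value for key, joins a lookup that is already
// running, or runs fetch itself. Failed lookups are not stored.
func (g *flightGroup) Do(key string, fetch func() ([]byte, error)) ([]byte, error) {
	g.mu.Lock()
	if v, ok := g.cache[key]; ok {
		g.mu.Unlock()
		return v, nil
	}
	if c, ok := g.inflight[key]; ok {
		g.mu.Unlock()
		<-c.done // wait for the caller that is doing the real request
		return c.val, c.err
	}
	c := &call{done: make(chan struct{})}
	g.inflight[key] = c
	g.mu.Unlock()

	c.val, c.err = fetch()

	g.mu.Lock()
	delete(g.inflight, key)
	if c.err == nil {
		g.cache[key] = c.val
	}
	g.mu.Unlock()
	close(c.done) // publishes c.val and c.err to the waiters
	return c.val, c.err
}
```

In this sketch a caller arriving after a failed lookup starts a fresh one, so when every lookup times out, connections keep paying the timeout, which is the effect described above.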

@RPRX
Member

RPRX commented Jul 27, 2025

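The main point is that ECH requires a lookup, and in some places that DNS may not be usable. It depends on what the person sharing the config wants: to make sure it connects whenever possible, or not.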

@Fangliding
Member Author

Fangliding commented Jul 27, 2025

The draft also says the client must not fall back to unencrypted unless ECH is explicitly rejected by a trusted server:

   Unless ECH is disabled as a result of successfully establishing a
   connection to the public name, the client MUST NOT fall back to using
   unencrypted ClientHellos, as this allows a network attacker to
   disclose the contents of this ClientHello, including the SNI.  It MAY
   attempt to use another server from the DNS results, if one is
   provided.

@patterniha
Collaborator

patterniha commented Jul 27, 2025

I will open a PR soon to add these features:

  1. use the Chrome fingerprint for the ECH DoH

  2. add h2c for the ECH DoH

  3. add echDialerProxy, so we can change the dial address (use built-in DNS, ...), use ECH for the ECH DoH, ...

///

Changing the dial address is also useful for other purposes; currently the ECH DoH domain can only be resolved with the system DNS.

@Fangliding
Member Author

....

@RPRX
Member

RPRX commented Jul 27, 2025

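I thought about it again. The v25.7.26 behavior fits reality better: ech has only just been added to share links, and it will take a year for clients to catch up. If we change the default to force, we could end up with new clients failing to connect while old clients connect using the original SNI, which does not help privacy either. If the original SNI is not blocked by the GFW, connect when you can, on a best-effort basis.

For those whose privacy needs really require ECH and who do not want to expose the SNI, offering a force option is enough.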

@Fangliding
Member Author

Fangliding commented Jul 27, 2025

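I see this more as a feature only self-hosters use, a toy really. If ECH is actually configured on the server side, just share the fixed config. Letting it fail and carry on like this, so that every connection stalls for a few seconds on a timeout and still works, is rather impressive. Old clients connecting directly with a plaintext ClientHello cannot be avoided. The reason I merged the query server and the directly encoded config into a single option in the first place was to avoid bloat. Stacking yet another plus sign on top is too much: sometimes you cannot even tell the order, and parsing is a hassle. If you think the previous behavior was right, or would rather not merge this at all, that is fine with me. I don't mind how it behaves; I just don't want to pile up too many plus signs.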

@CrazyBoyFeng

CrazyBoyFeng commented Jul 27, 2025

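I don't know how ECH is used on the Iranian side; his setup is too complicated.
From a Chinese user's point of view, anyone who configures an EchConfig in the client is certainly doing so to hide the server's SNI. If the SNI ends up not being hidden, doesn't that contradict the expected result?

Also, the logic in 7.26 is not perfect. As Fangliding said, the client queries type 65 before every connection to the target server, which degrades the experience. To implement fallback to plaintext SNI, an extra "ECH query failed" state would have to be cached, which means more work. It would be better to simply disconnect and tell the user in the log that the ECH query failed, then let the user investigate the cause or just turn ECH off.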

@likejia1

@CrazyBoyFeng What about the case of sitting behind Cloudflare? Cloudflare's ECH config changes often.

@patterniha
Collaborator

patterniha commented Jul 27, 2025

For using ECH in a MitM+serverless config:
if a domain has an ECH key, we don't use fragment; otherwise we use fragment.

To achieve this, because the domain changes every time and we don't know which domains support ECH, I should use redirect-socks-in/out-with-sniff to find out whether applyEch was successful: if the sniffed domain is "cloudflare-ech.com", I don't use fragment; otherwise I do.

As a result, force should be optional.

@RPRX
Member

RPRX commented Jul 29, 2025

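@CrazyBoyFeng That is exactly why there will be an optional force. Those who need force can configure it themselves. I haven't decided yet whether to add it to share links.

The main issue is that for share links, the large base of old clients, and other implementations, there is in practice no way to enforce force; it can only be configured locally, unless we break compatibility with something like vlessech://.

Also, the biggest drawback of ECH is that it needs a DNS lookup, and that DNS is not reachable everywhere. If force were the default, it would work in some places and not in others.

When the ECH parameters are pinned directly, or when they were found by the query but the connection failed, there should be no fallback to plaintext SNI. That behavior is fine.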
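
The fallback rules discussed so far can be summarized in a small decision function. This is only a sketch of the rules as stated in this thread, not the merged code; `decideECH`, `echSource`, `errECHRequired` and the `force` parameter are hypothetical names.

```go
package main

import "errors"

// errECHRequired is a hypothetical sentinel for "ECH was required but no
// config could be obtained".
var errECHRequired = errors.New("ech: config required but not available")

// echSource says where the ECH config is supposed to come from.
type echSource int

const (
	sourcePinned echSource = iota // config written directly in the client settings
	sourceQuery                   // config looked up through DNS (type 65 record)
)

// decideECH returns the ECH config list to use, or nil to connect with a
// plaintext SNI. A non-nil error means the connection must not be attempted.
func decideECH(src echSource, config []byte, queryErr error, force bool) ([]byte, error) {
	if src == sourcePinned {
		// A pinned config is always used; a later handshake failure is final.
		return config, nil
	}
	if queryErr == nil && len(config) > 0 {
		// The query found a config: use ECH, with no plaintext retry afterwards.
		return config, nil
	}
	if force {
		// The user asked for ECH or nothing.
		return nil, errECHRequired
	}
	// Best effort: no config could be obtained, so fall back to plaintext SNI.
	return nil, nil
}
```

With `force` set to false a failed query yields `(nil, nil)`, meaning "dial without ECH"; with `force` set to true the same failure aborts the dial.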

@RPRX
Member

RPRX commented Jul 29, 2025

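@Fangliding Let's add a config option, tentatively named echForceQuery. It probably won't go into share links, because existing share links cannot enforce this, unless there were something like the "encryption": "none" that VLESS reserved back in the day. #4952 (comment)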

@RPRX
Member

RPRX commented Jul 29, 2025

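Also, although existing share links cannot enforce ECH, forced ECH becomes possible once Xray ships config subscriptions, or in cases such as Serverless-for-Iran where the config is shared directly.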

@RPRX
Member

RPRX commented Jul 29, 2025

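> If you think the previous behavior was right, or would rather not merge this at all, that is fine with me. I don't mind how it behaves; I just don't want to pile up too many plus signs.

@Fangliding echForceQuery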

@patterniha
Collaborator

Also, if echForceQuery is false, we should cache the result even when no ECH record is found
(otherwise, in the MitM usage of ECH, we run an update for every request).

So if echForceQuery is false,

    return []byte{}, 0, errors.New("no ech record found")

should be changed to:

    return []byte{}, dns2.DefaultTTL, nil

We should also have a periodic cache cleanup to remove expired items and control the cache size.
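
A minimal sketch of what caching the empty answer could look like, assuming the lookup reports "no record" as an empty config with a TTL and a nil error, as proposed above. `negCache`, `entry`, `Get` and `Cleanup` are hypothetical names, and the TTL is taken in seconds for simplicity.

```go
package main

import (
	"sync"
	"time"
)

// entry is one cached lookup result; an empty config means "no ECH record".
type entry struct {
	config []byte
	expire time.Time
}

// negCache (hypothetical) stores both positive and empty results until they expire.
type negCache struct {
	mu      sync.Mutex
	entries map[string]entry
}

func newNegCache() *negCache {
	return &negCache{entries: map[string]entry{}}
}

// Get returns the cached config for domain and calls lookup only when there
// is no unexpired entry. For brevity the lock is held during the lookup,
// which serializes callers.
func (c *negCache) Get(domain string, lookup func(string) ([]byte, uint32, error)) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[domain]; ok && time.Now().Before(e.expire) {
		return e.config, nil
	}
	cfg, ttl, err := lookup(domain)
	if err != nil {
		return nil, err // real failures are not cached in this sketch
	}
	c.entries[domain] = entry{
		config: cfg,
		expire: time.Now().Add(time.Duration(ttl) * time.Second),
	}
	return cfg, nil
}

// Cleanup drops expired entries; it is meant to be called periodically.
func (c *negCache) Cleanup() {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := time.Now()
	for d, e := range c.entries {
		if !now.Before(e.expire) {
			delete(c.entries, d)
		}
	}
}
```

A caller would treat a zero-length result as "connect without ECH" and could run `Cleanup` from a ticker to bound the cache size.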

@RPRX
Member

RPRX commented Jul 29, 2025

With "echForceQuery": false, if it fails, it fails; just query again ten minutes later. @Fangliding

@RPRX
Member

RPRX commented Aug 1, 2025

@Fangliding

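  1. config.proto has not been pushed
  2. Put echServerKeys at the top everywhere, because everything added from now on will be a client parameter
  3. With "echForceQuery": false, if the DoH query fails, query again ten minutes later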

@Fangliding
Member Author

Let's go with that, then.

@RPRX
Member

RPRX commented Aug 1, 2025

Let me know when it's ready.

@Fangliding
Member Author

It is already ready as it is. The fp change at the end is a spot in the JSON definition order that had been missed.

@patterniha
Collaborator

@RPRX @Fangliding

#4947 (comment)

Caching the emptyResponse is necessary for the MitM usage of ECH; I will add #4947 (comment) after this PR is merged.

@RPRX
Member

RPRX commented Aug 1, 2025

@patterniha I will merge this PR after merging #4968. Then you can rebase your PR and add whatever you want to add.

@Fangliding
Member Author

Don't any of you read the code? Querying again after ten minutes is precisely caching the failure record.

@RPRX
Member

RPRX commented Aug 1, 2025

Anyway, I haven't looked at the new code. I'll merge it in a moment.

RPRX changed the title from "TLS: Force connection failed if ApplyECH failed" to "TLS ECH client: Add echForceQuery config" Aug 1, 2025
RPRX merged commit b282921 into main Aug 1, 2025
77 of 78 checks passed
@patterniha
Collaborator

patterniha commented Aug 1, 2025

This code has two problems:

  1. In:
    configRecord = &echConfigRecord{
        config: nil,
        expire: time.Now().Add(10 * time.Minute),
        err:    err,
    }

we cache all errors for 10 minutes. This is wrong: suppose our internet connection drops for a moment; then for 10 minutes we don't try to get the ECH key at all.

We should cache only the emptyResponse error and should not cache other errors.

  2. This code:
    if configRecord.expire == (time.Time{}) || configRecord.expire.Add(time.Hour*6).Before(time.Now()) {
        return echConfigCache.Update(domain, server, false, forceQuery)
    } else {
        // If someone already acquired the lock, it means it is updating, do not start another update goroutine
        if echConfigCache.UpdateLock.TryLock() {
            go func() {
                defer echConfigCache.UpdateLock.Unlock()
                echConfigCache.Update(domain, server, true, forceQuery)
            }()
        }
        return configRecord.config, nil
    }

causes us to use an expired key.

///////////////////////////////////////////////////////////////////////////////////////////////

1 is a big problem, and I solve it in my PR.

As for 2, it seems that the ECH key does not change frequently, so it could only be problematic in some rare situations; I leave this matter to you.
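
One way the distinction proposed in point 1 could be expressed is with a sentinel error checked through `errors.Is`. This is a sketch only; `errNoECHRecord`, `errTimeout`, `wrapLookupErr` and `shouldCacheFailure` are hypothetical names, not identifiers from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoECHRecord is a hypothetical sentinel meaning the DNS answer was valid
// but contained no ECH config.
var errNoECHRecord = errors.New("no ech record found")

// errTimeout stands in for any transient failure (timeout, network down).
var errTimeout = errors.New("dns query timed out")

// wrapLookupErr adds context while keeping the cause inspectable via errors.Is.
func wrapLookupErr(domain string, err error) error {
	return fmt.Errorf("ech lookup for %s: %w", domain, err)
}

// shouldCacheFailure reports whether a failed lookup is worth remembering.
// Only a definitive "no record" answer is cached; transient failures are
// retried on the next request.
func shouldCacheFailure(err error) bool {
	return errors.Is(err, errNoECHRecord)
}
```

The cache would then store a failure for ten minutes only when `shouldCacheFailure` returns true.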

@Fangliding
Member Author

Fangliding commented Aug 1, 2025

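1. I can't be bothered with how it failed; it just caches the failure. A temporarily unavailable DNS server would cause the same problem. That is not something I care about.
2. I don't know whether you just asked an AI what is wrong with this code or what you were thinking. Can't you see this is intentional? I even wrote a comment. The rotation window of an ECH config cannot be that short, and within less than a second the entries are replaced with new ones, so using the old one causes no problem. Returning immediately here keeps subsequent requests from blocking; it is a very common caching strategy. Above it there is also a 6-hour upper limit, in case the cache really has expired because there were no requests for a long time.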
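
The strategy described in point 2 is commonly called stale-while-revalidate. A minimal standalone sketch follows, assuming a single cached record and a TTL in seconds; `swrCache`, `record`, `newSWRCache` and `refresh` are hypothetical names, and this is not the merged Xray-core code.

```go
package main

import (
	"sync"
	"time"
)

type record struct {
	config []byte
	expire time.Time
}

// swrCache (hypothetical) serves an expired record immediately while one
// goroutine refreshes it, and blocks callers only when there is no record
// or the record is older than maxStale.
type swrCache struct {
	mu         sync.Mutex // guards rec
	updateLock sync.Mutex // held by whoever is refreshing
	rec        record
	maxStale   time.Duration
	fetch      func() ([]byte, uint32, error) // returns config and TTL in seconds
}

func newSWRCache(fetch func() ([]byte, uint32, error)) *swrCache {
	return &swrCache{maxStale: 6 * time.Hour, fetch: fetch}
}

// refresh fetches a new config and stores it; on error the old record stays.
func (c *swrCache) refresh() ([]byte, error) {
	cfg, ttl, err := c.fetch()
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.rec = record{config: cfg, expire: time.Now().Add(time.Duration(ttl) * time.Second)}
	c.mu.Unlock()
	return cfg, nil
}

func (c *swrCache) Get() ([]byte, error) {
	c.mu.Lock()
	rec := c.rec
	c.mu.Unlock()

	now := time.Now()
	switch {
	case now.Before(rec.expire):
		return rec.config, nil // still fresh
	case rec.expire.IsZero() || rec.expire.Add(c.maxStale).Before(now):
		// Never fetched, or too old to trust: refresh synchronously.
		c.updateLock.Lock()
		defer c.updateLock.Unlock()
		return c.refresh()
	default:
		// Expired but within the staleness limit: return the old config now
		// and let at most one goroutine refresh it in the background.
		if c.updateLock.TryLock() {
			go func() {
				defer c.updateLock.Unlock()
				_, _ = c.refresh()
			}()
		}
		return rec.config, nil
	}
}
```

The trade-off debated here is visible in the `default` branch: callers never wait for the refresh, at the cost of occasionally using a config whose TTL has already elapsed.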

@patterniha
Collaborator

patterniha commented Aug 1, 2025

Usually the TTL is not less than 5 minutes, and it takes less than half a second to get the new key.

So waiting half a second every 5 minutes is not a problem, and it makes sure the key hasn't expired.

Although the probability that the key has actually expired is very low, it is not zero, and if you have seen my PRs, you know I also worry about nearly impossible situations.

///

In short, I think it's not worth sacrificing accuracy for this small speed-up.

Anyway, I respect your code and I won't change it.

@patterniha
Collaborator

Also, the order of forceQuery and isLockedUpdate is wrong in:

    func (c *ECHConfigCache) Update(domain string, server string, forceQuery bool, isLockedUpdate bool) ([]byte, error) {

and I fix it in my PR.
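
Two adjacent bool parameters are easy to swap without the compiler noticing. One possible way to avoid that class of mistake, shown purely as an illustration and not as what the PR does, is to pass named options. `updateOptions` and `describeUpdate` are hypothetical, and the meaning given for `IsLockedUpdate` is my reading of the snippet quoted earlier.

```go
package main

import "fmt"

// updateOptions (hypothetical) names each flag so call sites cannot swap them.
type updateOptions struct {
	ForceQuery     bool // fail instead of falling back when no ECH config is found
	IsLockedUpdate bool // assumed meaning: the caller already holds the update lock
}

// describeUpdate stands in for the real update logic and only reports which
// flags it received.
func describeUpdate(domain, server string, opts updateOptions) string {
	return fmt.Sprintf("%s via %s force=%t locked=%t",
		domain, server, opts.ForceQuery, opts.IsLockedUpdate)
}
```

A call such as `describeUpdate("example.com", "https://doh.website.com/dns-query", updateOptions{ForceQuery: true})` states at the call site which flag is set.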

@RPRX
Member

RPRX commented Aug 1, 2025

@patterniha Roughly how long will it take?

@RPRX RPRX deleted the ECH branch August 1, 2025 20:55
@patterniha
Collaborator

@RPRX

Some users have found new bugs in ECH.

For example, @GFW-knocker found that ECH does not work when the fingerprint is not chrome.

Also, a wrong address like "udp://8.8.4.4/dns-query" causes a panic instead of printing an error.

///

I'm checking them out. It'll probably take a few hours.
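
As an illustration of returning an error for a malformed address instead of panicking, the server string could be validated up front. The accepted schemes and the rule that a udp:// address carries no path are assumptions made for this sketch; `validateDNSServer` is a hypothetical helper, not a function from Xray-core.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateDNSServer checks a DNS server address and returns an error instead
// of letting a malformed value reach the dialing code.
// Assumed rules: https:// needs a host; udp:// needs a host and no path.
func validateDNSServer(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid dns server %q: %w", raw, err)
	}
	if u.Hostname() == "" {
		return fmt.Errorf("invalid dns server %q: missing host", raw)
	}
	switch u.Scheme {
	case "https":
		return nil
	case "udp":
		if u.Path != "" {
			return fmt.Errorf("invalid dns server %q: udp address must not have a path", raw)
		}
		return nil
	default:
		return fmt.Errorf("invalid dns server %q: unsupported scheme %q", raw, u.Scheme)
	}
}
```

With these assumed rules, "udp://8.8.4.4/dns-query" is rejected with a readable error while "udp://8.8.4.4" is accepted.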

@RPRX
Member

RPRX commented Aug 1, 2025

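The non-Chrome fingerprints in uTLS may not have the ECH extension yet.

A runtime panic caused by a misconfiguration is not a big deal; an unexpected panic while processing data at runtime is the real problem.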

@patterniha
Collaborator

patterniha commented Aug 1, 2025

OK, I'll try to finish within 2 hours at most.
