# Analysis of Direct-IP Service Discovery After Replacing ConsulServerList with a Static Server List

> **Date**: 2026-05-05
> **Scope**: `maven-cw-elevator-application`, `maven-intelligent-cwoscomponent`
> **Status**: pending further investigation

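---

## 1. Background

The elevator application (cw-elevator-application) calls three upstream services through Feign clients. With `spring.cloud.consul.discovery.enabled=false`, these clients cannot discover service instances dynamically through Consul. The current workaround pins Ribbon's `ServerList` to `ConfigurationBasedServerList` and configures a static IP list directly, bypassing Consul.

However, V1 production (星中心) has **no** such static IP configuration and still runs normally. This discrepancy needs a thorough investigation.
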
---

## 2. Current State of the Service Discovery Chain

### 2.1 Current V2 Architecture

```
Business code → @FeignClient(name = "ninca-crk-std")
    │
    ▼
Ribbon LoadBalancer
    │
    ▼
ConfigurationBasedServerList         ← forced via application.properties
    │
    ▼
ninca-crk-std.ribbon.listOfServers   ← returns the hardcoded IP:Port directly
    │
    ▼
Target HTTP service
```

### 2.2 V1 Production (星中心) Architecture: Undocumented Differences

```
Business code → @FeignClient(name = "ninca-crk-std")
    │
    ▼
Ribbon LoadBalancer
    │
    ▼
??? (no ConfigurationBasedServerList configured)
    │
    ▼
??? (no listOfServers configured)
```

**V1 production has no `ribbon.listOfServers`, yet Feign calls work normally. The reason is unknown.**

---

## 3. Upstream Services and Feign Clients Involved

### 3.1 `ninca-crk-std` (CRK face recognition, GPU)

| Feign client | Module | `@FeignClient` name | Current resolution |
|-------------|------|-------------------|-------------|
| `VisitorFeignClient.java` | elevator-service | `${feign.ninca-crk-std.name:ninca-crk-std}` | Ribbon → `listOfServers=10.128.161.95:16106` |
| `AcsRecordThreeSendFeignClient.java` | intelligent-cwoscomponent-rest | `${feign.ninca-crk-std.name:ninca-crk-std}` | Ribbon → same as above |

There is also a non-Feign path: `AcsElevatorRecordServiceImpl` connects to CRK directly using `@Value("${ninca-crk-std.ip}")` and `RestTemplate`, independently of Feign.

### 3.2 `ninca-common-component-organization` (organization component)

| Feign client | `@FeignClient` name | `@FeignClient` url | Current resolution |
|-------------|-------------------|-----------------|-------------|
| `PersonFeignClient.java` | `ninca-common-component-organization` | **`http://127.0.0.1:33011`** (hardcoded) | **Bypasses Ribbon, direct URL** |
| `OrganizationFeignClient.java` | `${feign.component-organization.name}` | none | Ribbon → `listOfServers=127.0.0.1:33011` |
| `LabelFeignClient.java` | `${feign.component-organization.name}` | none | Ribbon → same as above |
| `ImageStorePersonFeignClient.java` | `${feign.component-organization.name}` | none | Ribbon → same as above |
| `ImageStoreFeignClient.java` | `${feign.component-organization.name}` | none | Ribbon → same as above |
| `ApplicationImageStoreFeignClient.java` | `${feign.component-organization.name}` | none | Ribbon → same as above |

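The difference between the `url`-based client and the Ribbon-based clients in the table above can be illustrated with a minimal, dependency-free Java sketch. It is not the Feign or Spring Cloud implementation: `FeignTargetSketch`, `resolveBaseUrl`, and the map of static servers are hypothetical names, and taking the first server is a simplification that stands in for whatever balancing rule Ribbon applies.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: mimics how a Feign client chooses between a fixed url and Ribbon. */
final class FeignTargetSketch {

    static String resolveBaseUrl(String name, String url, Map<String, List<String>> staticServers) {
        if (url != null && !url.isEmpty()) {
            // An explicit url wins: the load balancer is never consulted.
            return url;
        }
        List<String> servers = staticServers.getOrDefault(name, Collections.<String>emptyList());
        if (servers.isEmpty()) {
            // No static entry for this client name: the call cannot be routed.
            throw new IllegalStateException("no server available for client: " + name);
        }
        // Simplification: always take the first entry instead of applying a balancing rule.
        return "http://" + servers.get(0);
    }

    public static void main(String[] args) {
        Map<String, List<String>> staticServers = new HashMap<>();
        staticServers.put("ninca-common-component-organization", Arrays.asList("127.0.0.1:33011"));

        // PersonFeignClient: url attribute set, Ribbon bypassed.
        System.out.println(resolveBaseUrl(
                "ninca-common-component-organization", "http://127.0.0.1:33011", staticServers));
        // OrganizationFeignClient: no url, resolved through the static server list.
        System.out.println(resolveBaseUrl(
                "ninca-common-component-organization", null, staticServers));
    }
}
```

In this sketch both calls end up at the same address, which is why the hardcoded `url` is easy to overlook in the test environment: the two paths only diverge once the static list or the service name changes.
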
### 3.3 `ninca-common` (common component)

| Feign client | `@FeignClient` name | Current resolution |
|-------------|-------------------|-------------|
| `ZoneFeignClient.java` | `${feign.ninca-common.name:ninca-common}` | Ribbon → **no `listOfServers` configured** |
| `FileFeign.java` | `${feign.ninca-common.name:ninca-common}` | Ribbon → same as above |
| `SysettingAreaFeignClient.java` | `${feign.ninca-common.name:ninca-common}` | Ribbon → same as above |

> **⚠️ `ninca-common` has no `ribbon.listOfServers` configuration at all. If a Feign call reaches this service, it will fail because the server list is empty.**

---

## 4. Configuration Difference Matrix

### 4.1 V2 deploy/v2-maven/application.properties (current verification environment)

```properties
# Feign service name mapping
feign.ninca-crk-std.name=ninca-crk-std
feign.component-organization.name=ninca-common-component-organization
feign.ninca-common.name=ninca-common

# ninca-crk-std: force ConfigurationBasedServerList + static IP
ninca-crk-std.ribbon.NIWSServerListClassName=com.netflix.loadbalancer.ConfigurationBasedServerList
ninca-crk-std.ribbon.listOfServers=10.128.161.95:16106

# ninca-common-component-organization: same as above
ninca-common-component-organization.ribbon.NIWSServerListClassName=com.netflix.loadbalancer.ConfigurationBasedServerList
ninca-common-component-organization.ribbon.listOfServers=127.0.0.1:33011

# ninca-common: ❌ no Ribbon configuration at all
```

### 4.2 V2 deploy/v1-legacy/application.properties

```properties
ninca-crk-std.ribbon.NIWSServerListClassName=com.netflix.loadbalancer.ConfigurationBasedServerList
ninca-crk-std.ribbon.listOfServers=10.128.161.95:16106
# ❌ no Ribbon configuration for component-organization or ninca-common
```

### 4.3 V1 production (星中心/cw-elevator-application-V1.0.0.20211103/application.properties)

```properties
feign.ninca-crk-std.name=ninca-crk-std
feign.component-organization.name=ninca-common-component-organization
feign.ninca-common.name=ninca-common
ninca-crk-std.ip=10.0.22.102:16106
# ❌ no ribbon.listOfServers configuration at all
# ❌ no NIWSServerListClassName configuration at all
# ✅ discovery.enabled=false (in bootstrap.properties)
```

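The comparison across the three property files can be mechanized. The following sketch assumes that every upstream service is declared through a `feign.<alias>.name` property, as in the files above, and reports the services that have no `<service>.ribbon.listOfServers` entry. `RibbonConfigAudit` and `clientsWithoutStaticList` are hypothetical names and are not part of the codebase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

/** Illustrative audit: which declared Feign services lack a static Ribbon server list? */
final class RibbonConfigAudit {

    static List<String> clientsWithoutStaticList(Properties config) {
        List<String> missing = new ArrayList<>();
        // TreeSet gives a stable, sorted iteration order over the property keys.
        for (String key : new TreeSet<>(config.stringPropertyNames())) {
            if (key.startsWith("feign.") && key.endsWith(".name")) {
                String service = config.getProperty(key).trim();
                if (config.getProperty(service + ".ribbon.listOfServers") == null) {
                    missing.add(service);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Values copied from the V2 deploy/v2-maven/application.properties excerpt.
        Properties v2 = new Properties();
        v2.setProperty("feign.ninca-crk-std.name", "ninca-crk-std");
        v2.setProperty("feign.component-organization.name", "ninca-common-component-organization");
        v2.setProperty("feign.ninca-common.name", "ninca-common");
        v2.setProperty("ninca-crk-std.ribbon.listOfServers", "10.128.161.95:16106");
        v2.setProperty("ninca-common-component-organization.ribbon.listOfServers", "127.0.0.1:33011");
        System.out.println(clientsWithoutStaticList(v2));
    }
}
```

Fed the V2 values from section 4.1, the audit reports only `ninca-common`; fed the V1 production values from section 4.3, it reports all three services. It only inspects properties, so a client that sets the `url` attribute in code (such as `PersonFeignClient`) is outside its view.
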
---

## 5. Deleted NincaCrkStdRibbonConfiguration Source

| File | Status | Location |
|------|------|------|
| `NincaCrkStdRibbonConfiguration.java` | ❌ **Source deleted** | The `cw-elevator-application-starter/src/main/java/cn/cloudwalk/ribbon/` directory exists but is empty |
| `NincaCrkStdRibbonConfiguration.class` | ✅ Still present in the old fat JAR | `deploy/v2-maven/cw-elevator-application-2.0.0/BOOT-INF/classes/cn/cloudwalk/ribbon/` |
| `ElevatorApplication.java` | ❌ Currently has no `@RibbonClient`/`@RibbonClients` | Startup class already cleaned up |

**Impact**: New V2 builds no longer contain `NincaCrkStdRibbonConfiguration.class`. Because `application.properties` already sets `NIWSServerListClassName`, behavior is unaffected: the property-based configuration and the Java configuration serve the same purpose.

**Residual risk**: In the old JAR, `NincaCrkStdRibbonConfiguration` is referenced by `@RibbonClients`. If the `ElevatorApplication.class` in the old JAR carries `@RibbonClients({@RibbonClient(configuration=NincaCrkStdRibbonConfiguration.class)})` while the new build no longer contains that class, the result is a `ClassNotFoundException`. It must be confirmed in which commit `ElevatorApplication.java` fully removed `@RibbonClients`.

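Whether a given build still ships the configuration class can be checked from inside the running application. The sketch below is one possible probe, not existing project code: `ClasspathProbe` and `isPresent` are hypothetical names, and the fully qualified class name is inferred from the `cn/cloudwalk/ribbon/` path in the table above.

```java
/** Illustrative probe: reports whether a class can be located without initializing it. */
final class ClasspathProbe {

    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name assumed from the package path listed in the table.
        System.out.println(isPresent("cn.cloudwalk.ribbon.NincaCrkStdRibbonConfiguration"));
    }
}
```

Run on the classpath of a new V2 build, a `false` result would be consistent with the deleted source; `true` would mean an old class file is still being packaged.
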
---

## 6. Hardcoded PersonFeignClient URL

Commit `ff9a9ed6`: `fix(test): hardcode PersonFeignClient url to local stub for test env`

```java
// Before:
@FeignClient(name = "${feign.component-organization.name:ninca-common-component-organization}",
             path = "/component/person", fallback = PersonFeignClientFallback.class)

// After:
@FeignClient(name = "ninca-common-component-organization",
             url = "http://127.0.0.1:33011",
             path = "/component/person", fallback = PersonFeignClientFallback.class)
```

**Effects of the change**:
- `name` changed from a configurable placeholder to a hardcoded value
- A new `url` attribute forces requests to `127.0.0.1:33011` (the local stub)
- With `name` hardcoded, the Ribbon client name no longer follows the `feign.component-organization.name` setting, even if `url` is removed

**Commit note**: `NOTE: PersonFeignClient url change is test-only, revert before production`

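Why the hardcoded `name` stops following configuration can be seen from how a `${key:default}` placeholder is resolved. The sketch below is a simplified stand-in for Spring's placeholder resolution, handling a single expression only; `PlaceholderSketch`, `resolve`, and the override value `organization-stub` are hypothetical.

```java
import java.util.Properties;

/** Illustrative only: resolves one "${key:default}" or "${key}" expression. */
final class PlaceholderSketch {

    static String resolve(String expression, Properties config) {
        if (!expression.startsWith("${") || !expression.endsWith("}")) {
            // A plain string is returned unchanged: configuration is never consulted.
            return expression;
        }
        String body = expression.substring(2, expression.length() - 1);
        int colon = body.indexOf(':');
        String key = colon >= 0 ? body.substring(0, colon) : body;
        String fallback = colon >= 0 ? body.substring(colon + 1) : null;
        String value = config.getProperty(key, fallback);
        if (value == null) {
            throw new IllegalArgumentException("could not resolve placeholder: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // Hypothetical override, used only to show that the placeholder form follows it.
        config.setProperty("feign.component-organization.name", "organization-stub");

        System.out.println(resolve(
                "${feign.component-organization.name:ninca-common-component-organization}", config));
        System.out.println(resolve("ninca-common-component-organization", config));
    }
}
```

With the override in place, the placeholder form resolves to the overridden name while the hardcoded form stays at `ninca-common-component-organization`, so any `<name>.ribbon.*` properties keyed on the overridden name would no longer apply to `PersonFeignClient`.
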
---

## 7. Open Items

### P0: V1 production service discovery mechanism

In V1 production (星中心 `bootstrap.properties`), `spring.cloud.consul.discovery.enabled=false` is set and there is no `ribbon.listOfServers` configuration of any kind. The following must be confirmed:
- Are the call paths from the V1 production Feign clients to CRK and the organization component actually executed?
- If they are, can ConsulServerList still obtain instances from the Consul HTTP API with `discovery.enabled=false`?
- How exactly does the `ConsulServerList` in Spring Cloud Edgware.SR3 behave when `discovery.enabled=false`?

**Verification**:
```bash
# Find the ServerList implementation Ribbon actually uses in the V1 production process,
# via the RibbonLoadBalancerProbeRunner log (if available)
# or by analyzing the Feign call chain with jstack.

# Check the Spring Cloud Consul version (it determines how ConsulServerList behaves).
# The nested JAR's file name carries the version, so listing the archive entries is enough.
unzip -l V1.jar | grep 'spring-cloud-consul-discovery'
```

### P1: ninca-common lacks a static IP configuration

`ZoneFeignClient` (space service) and `FileFeign` (file service) call `ninca-common`, but no `listOfServers` is configured for it. Required actions:
- Confirm whether these Feign clients are actually invoked in V1/V2
- If they are, add the `ninca-common.ribbon.listOfServers` configuration

### P1: PersonFeignClient production readiness check

- Confirm that `url = "http://127.0.0.1:33011"` is reverted before the production JAR is built
- Restore the placeholder form `name = "${feign.component-organization.name:ninca-common-component-organization}"`

### P2: Clean up NincaCrkStdRibbonConfiguration leftovers

- Decide whether to restore the source (keeping the Java config approach) or rely entirely on properties
- If relying on properties, delete the empty `cn/cloudwalk/ribbon/` directory to remove the leftover

### P2: Document the V1 vs V2 Ribbon configuration alignment

- Record the actual V1 production service discovery mechanism in `docs/architecture/`
- State clearly why V2 adds `ribbon.listOfServers` and where it applies

---

## 8. Key File Locations

| File | Path |
|------|------|
| ElevatorApplication.java | `maven-cw-elevator-application/cw-elevator-application-starter/src/main/java/cn/cloudwalk/elevator/ElevatorApplication.java` |
| NincaCrkStdRibbonConfiguration.java | **Deleted** (old path: `.../ribbon/NincaCrkStdRibbonConfiguration.java`) |
| RibbonLoadBalancerProbeRunner.java | `.../debug/RibbonLoadBalancerProbeRunner.java` |
| ElevatorUpstreamServiceNames.java | `.../debug/ElevatorUpstreamServiceNames.java` |
| PersonFeignClient.java | `maven-intelligent-cwoscomponent/intelligent-cwoscomponent-rest/.../person/feign/PersonFeignClient.java` |
| V2 deploy application.properties | `maven-cw-elevator-application/deploy/v2-maven/application.properties` |
| V2 deploy test properties | `maven-cw-elevator-application/deploy/v2-maven/application-test.properties` |
| V1 production config | `星中心/cw-elevator-application-V1.0.0.20211103/application.properties` |
| V1 production bootstrap | `星中心/cw-elevator-application-V1.0.0.20211103/bootstrap.properties` |
| V1 fat JAR internal props | `星中心/.../cw-elevator-application-V1.0.0.20211103.jar!/application.properties` |
| Commit: clean up @RibbonClients | `373b5501` |
| Commit: hardcode PersonFeignClient url | `ff9a9ed6` |
| Commit: add ZK discovery dependency | `6b5898d0` |
| Commit: add NincaCrkStdRibbonConfiguration | `0a6ac955` |
| Architecture design document | `docs/superpowers/specs/2026-05-01-service-discovery-architecture-design.md` |

---

## 9. Appendix: Default Ribbon ServerList Behavior in Spring Cloud Edgware

In Spring Cloud Edgware.SR3 (the version V1 uses), when `spring.cloud.consul.discovery.enabled=false`:

```
ConsulDiscoveryClient → @ConditionalOnProperty(enabled=true) → bean NOT created
ConsulServerList → depends on DiscoveryClient → can't be created
Fallback → ConfigurationBasedServerList (Ribbon default)
ConfigurationBasedServerList → reads "{name}.ribbon.listOfServers" from Environment
                             → if absent → returns empty list
```

V2 (Spring Cloud Greenwich + Boot 2.1.x) behaves the same way.

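The last two steps of the flow above can be sketched in plain Java. This is a stand-in written for illustration, not Ribbon's `ConfigurationBasedServerList` source: `StaticServerListSketch` and `resolve` are hypothetical names, and a `java.util.Properties` object takes the place of the Spring `Environment`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

/** Illustrative stand-in for a configuration-based server list. */
final class StaticServerListSketch {

    /** Returns the host:port entries configured under "<clientName>.ribbon.listOfServers". */
    static List<String> resolve(Properties config, String clientName) {
        String raw = config.getProperty(clientName + ".ribbon.listOfServers");
        if (raw == null || raw.trim().isEmpty()) {
            // No static list configured: the load balancer has nothing to choose from.
            return Collections.emptyList();
        }
        List<String> servers = new ArrayList<>();
        for (String entry : raw.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                servers.add(trimmed);
            }
        }
        return servers;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("ninca-crk-std.ribbon.listOfServers", "10.128.161.95:16106");

        System.out.println(resolve(config, "ninca-crk-std")); // configured client
        System.out.println(resolve(config, "ninca-common"));  // client without a static list
    }
}
```

Under these assumptions a client without a `listOfServers` entry gets an empty list rather than an error at resolution time; the failure only surfaces when a request needs a server.
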
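
If V1 production really has no `listOfServers` and Feign calls still work, then one of the following must hold:
1. The Feign call paths to CRK and the organization component are not actually executed, or
2. The ConsulServerList in Spring Cloud Edgware has special handling when `discovery.enabled=false` (this requires decompiling `ConsulServerList.class` to confirm), or
3. The V1 production runtime injects `listOfServers` by some other means (for example Consul KV, `SPRING_APPLICATION_JSON`, or JVM arguments)

---

*This document was produced from a code walkthrough and configuration analysis; it must be followed up with runtime evidence from the production environment.*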