Commit 34a5f05

feat: remove shared of openmldb dependencies for openmldb-batchjob (#3849)
* Remove shared of openmldb dependencies for openmldb-batchjob
* Add shade for protobuf in openmldb-batchjob
* Change log to logger in taskmanager
* Update deploy doc for spark distribution
* fix minor typos

---------

Co-authored-by: Siqi Wang <[email protected]>
1 parent fa861e6 commit 34a5f05

4 files changed, +46 -38 lines changed

docs/en/deploy/install_deploy.md

Lines changed: 10 additions & 8 deletions
@@ -24,7 +24,7 @@ If you need to deploy ZooKeeper and TaskManager, you need a Java runtime environ

 Servers needs Java 1.8 or above.

-Zookeeper Client 3.4.14 requires `Java 1.7` - `Java 13`. Java SDK depends on it, so it should use the same Java version, don't run in higher version. If you wish to use zkCli, please use `Java 1.8` or `Java 11`.
+Zookeeper Client 3.4.14 requires `Java 1.7` - `Java 13`. Java SDK depends on the same client, so it should use the same Java version, not a higher version. If you wish to use zkCli, please use `Java 1.8` or `Java 11`.

 ### Hardware

@@ -195,7 +195,7 @@ The environment variables are defined in `conf/openmldb-env.sh`, as shown in the
 | OPENMLDB_VERSION | 0.8.5 | OpenMLDB version |
 | OPENMLDB_MODE | standalone | standalone or cluster |
 | OPENMLDB_HOME | root directory of the release folder | openmldb root directory |
-| SPARK_HOME | $OPENMLDB_HOME/spark | openmldb spark root directory,If the directory does not exist, it will be downloaded automatically.|
+| SPARK_HOME | $OPENMLDB_HOME/spark | Spark root directory, if the directory does not exist, it will be downloaded automatically.|
 | OPENMLDB_TABLET_PORT | 10921 | TabletServer default port |
 | OPENMLDB_NAMESERVER_PORT | 7527 | NameServer default port |
 | OPENMLDB_TASKMANAGER_PORT | 9902 | taskmanager default port |
@@ -205,7 +205,7 @@ The environment variables are defined in `conf/openmldb-env.sh`, as shown in the
 | OPENMLDB_ZK_CLUSTER | auto derived from `[zookeeper]` section in `conf/hosts` | ZooKeeper cluster address |
 | OPENMLDB_ZK_ROOT_PATH | /openmldb | OpenMLDB root directory in ZooKeeper |
 | OPENMLDB_ZK_CLUSTER_CLIENT_PORT | 2181 | ZooKeeper client port, the client port in zoo.cfg |
-| OPENMLDB_ZK_CLUSTER_PEER_PORT | 2888 | ZooKeeper peer portthe first port in settings like "server.1=zoo1:2888:3888" in zoo.cfg |
+| OPENMLDB_ZK_CLUSTER_PEER_PORT | 2888 | ZooKeeper peer port, the first port in settings like "server.1=zoo1:2888:3888" in zoo.cfg |
 | OPENMLDB_ZK_CLUSTER_ELECTION_PORT | 3888 | ZooKeeper election port, the second port in settings like "server.1=zoo1:2888:3888" in zoo.cfg |

 ### Node Configuration
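
The peer and election port rows in this hunk both refer to the `server.N=host:peer:election` form in zoo.cfg. A minimal sketch of reading the two ports out of one such entry, assuming a POSIX shell; `zk_server_ports` is a hypothetical helper for illustration and is not part of OpenMLDB's scripts:

```shell
# Print "<peer_port> <election_port>" for one zoo.cfg server entry.
# Hypothetical helper; the variables it sets are plain globals.
zk_server_ports() {
  addr="${1#*=}"                                      # drop "server.N=" -> host:peer:election
  peer="$(printf '%s\n' "$addr" | cut -d: -f2)"       # first port: peer port
  election="$(printf '%s\n' "$addr" | cut -d: -f3)"   # second port: election port
  printf '%s %s\n' "$peer" "$election"
}

zk_server_ports "server.1=zoo1:2888:3888"   # prints "2888 3888"
```

The two values correspond to `OPENMLDB_ZK_CLUSTER_PEER_PORT` and `OPENMLDB_ZK_CLUSTER_ELECTION_PORT` respectively.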
@@ -252,7 +252,7 @@ If multiple TaskManager instances are deployed on distinct machines, the configu
 bash sbin/init_env.sh
 ```
 Note:
-- This script requires root execution. Other scripts does not require root privileges.
+- This script requires root execution. Other scripts do not require root privileges.
 - The script only modifies limit configurations, disabling swap and THP.

 ### Deployment
@@ -360,7 +360,7 @@ Enter `quit` or `Ctrl+C` to exit the zk client.

 ### Deploy TabletServer

-Note that at least two TabletServer need to be deployed, otherwise errors may occur.
+Note that at least two TabletServers need to be deployed, otherwise errors may occur.

 **1. Download the OpenMLDB deployment package**

@@ -439,7 +439,7 @@ mv openmldb-0.8.5-linux openmldb-tablet-0.8.5-2
 cd openmldb-tablet-0.8.5-2
 ```

-Modify the configuration again and start the TabletServer. Note that if all TabletServers are on the same machine, use different port numbers to avoid "Fail to listen" error in the log (`logs/tablet.WARNING`).
+Modify the configuration again and start the TabletServer. Note that if all TabletServers are on the same machine, use different port numbers to avoid the "Fail to listen" error in the log (`logs/tablet.WARNING`).

 **Note:**

@@ -608,9 +608,11 @@ The results should include information about all TabletServer and NameServer tha

 You can have only one TaskManager, but if you require high availability, you can deploy multiple TaskManagers, taking care to avoid IP and port conflicts. If the TaskManager master node experiences a failure, a slave node will automatically recover and replace the master node. Clients can continue accessing the TaskManager service without any modifications.

-**1. Download the OpenMLDB deployment package and Spark distribution for feature engineering optimization**
+**1. Download the OpenMLDB deployment package and Spark distribution **

-Spark distribution:
+Download the Spark distribution from the [Spark official website](https://spark.apache.org/downloads.html)。 Then unzip it and set the `SPARK_HOME` environment variable.
+
+Alternatively, use the OpenMLDB Spark distribution.

 ```shell
 wget https://github.com/4paradigm/spark/releases/download/v3.2.1-openmldb0.8.5/spark-3.2.1-bin-openmldbspark.tgz

docs/zh/deploy/install_deploy.md

Lines changed: 9 additions & 4 deletions
@@ -192,10 +192,10 @@ cd openmldb-0.8.5-linux
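
 | Environment variable | Default value | Definition |
 | -------------------------------- | ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------- |
-| OPENMLDB_VERSION | 0.8.5 | OpenMLDB version, mainly used for the spark download; usually left unchanged. |
+| OPENMLDB_VERSION | 0.8.5 | OpenMLDB version, mainly used for the Spark download; usually left unchanged. |
 | OPENMLDB_MODE | cluster | standalone or cluster |
 | OPENMLDB_HOME | root directory of the current release | Root directory of the openmldb release; if unset, the current root directory is used, i.e. the directory where openmldb-0.8.5-linux is located. |
-| SPARK_HOME | $OPENMLDB_HOME/spark | Root directory of the openmldb spark distribution; if the directory does not exist, it is downloaded automatically. **This path also becomes the Spark installation directory on the machines that run TaskManager.** |
+| SPARK_HOME | $OPENMLDB_HOME/spark | Root directory of the Spark distribution; if the directory does not exist, it is downloaded automatically. **This path also becomes the Spark installation directory on the machines that run TaskManager.** |
 | RUNNER_EXISTING_SPARK_HOME | | When set, the machines that run TaskManager use this Spark environment, and the OpenMLDB Spark distribution is neither downloaded nor deployed. |
 | OPENMLDB_USE_EXISTING_ZK_CLUSTER | false | Whether to use an already running ZooKeeper cluster. If `true`, deployment and management of the ZooKeeper cluster are skipped. |
 | OPENMLDB_ZK_HOME | $OPENMLDB_HOME/zookeeper | Root directory of the ZooKeeper distribution; if the directory does not exist, it is downloaded automatically. |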
@@ -208,7 +208,7 @@ cd openmldb-0.8.5-linux
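 Generally, the following points need to be confirmed:
 - ZooKeeper cluster address: to use an existing ZooKeeper cluster, set `OPENMLDB_USE_EXISTING_ZK_CLUSTER=true` and configure `OPENMLDB_ZK_CLUSTER`. (If an external ZK cluster is configured in `conf/hosts`, add a comment noting that it is not affected by sbin deployment, to avoid confusion.)
 - To have this tool deploy the ZooKeeper cluster, configure `[zookeeper]` in `conf/hosts`. Listing multiple ZooKeeper nodes deploys a ZooKeeper cluster; no extra configuration is needed.
-- Spark environment: to use a Spark environment that already exists on the running machines, configure `RUNNER_EXISTING_SPARK_HOME` (a path on the machines that run TaskManager). If the deployment machine has a Spark environment and you want to use it on the TaskManager machines, configure `SPARK_HOME` (it is deployed to the same path on the TaskManager machines). If `SPARK_HOME` is not configured, the OpenMLDB Spark distribution is downloaded and used automatically
+- Spark environment: to use a Spark environment that already exists on the running machines, configure `RUNNER_EXISTING_SPARK_HOME` (a path on the machines that run TaskManager). If the deployment machine has a Spark environment and you want to use it on the TaskManager machines, configure `SPARK_HOME` (it is deployed to the same path on the TaskManager machines). If `SPARK_HOME` is not configured, a specific Spark distribution is downloaded and used automatically

 #### Default ports
 | Environment variable | Default value | Definition |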
@@ -628,9 +628,14 @@ curl http://<apiserver_ip>:<port>/dbs/foo -X POST -d'{"mode":"online","sql":"sho
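
 A single TaskManager is sufficient; if you need high availability, you can deploy multiple TaskManagers, taking care to avoid IP and port conflicts. If the TaskManager master node fails, a slave node automatically takes over from it, and clients can keep accessing the TaskManager service without any changes.

-**1. Download the OpenMLDB deployment package and the Spark distribution optimized for feature engineering**
+**1. Download the OpenMLDB deployment package and the Spark distribution**

 Spark distribution:
+
+Download the [Spark distribution](https://spark.apache.org/downloads.html) from the Spark official website, then unzip it and set the `SPARK_HOME` environment variable.
+
+Alternatively, use the OpenMLDB Spark distribution.
+
 ```shell
 wget https://github.com/4paradigm/spark/releases/download/v3.2.1-openmldb0.8.5/spark-3.2.1-bin-openmldbspark.tgz
 # China mirror: https://www.openmldb.com/download/v0.8.5/spark-3.2.1-bin-openmldbspark.tgz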
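
The "unzip it and set `SPARK_HOME`" step added to both deploy docs can be sketched as follows. This is only an illustration under stated assumptions: `install_spark` is a hypothetical helper (not an OpenMLDB script), the tarball has already been downloaded, and the archive has a single top-level folder, as Spark release tarballs normally do.

```shell
# Unpack an already-downloaded Spark tarball and point SPARK_HOME at it.
# install_spark is a hypothetical helper for illustration only.
install_spark() {
  tarball="$1"   # path to a spark-*.tgz file
  target="$2"    # directory that should become SPARK_HOME
  mkdir -p "$target" || return 1
  # --strip-components=1 drops the archive's top-level folder,
  # so bin/, jars/, ... end up directly under $target
  tar -xzf "$tarball" -C "$target" --strip-components=1 || return 1
  SPARK_HOME="$target"
  export SPARK_HOME
}

# Example call (file name taken from the wget line in the docs):
#   install_spark spark-3.2.1-bin-openmldbspark.tgz "$PWD/spark"
```

Because the function exports `SPARK_HOME` into the current shell, it has to be called directly rather than in a subshell for the variable to persist.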

java/openmldb-batchjob/pom.xml

Lines changed: 26 additions & 25 deletions
@@ -23,6 +23,26 @@
     </properties>

     <dependencies>
+
+        <!-- OpenMLDB -->
+        <dependency>
+            <groupId>com.4paradigm.openmldb</groupId>
+            <artifactId>openmldb-native</artifactId>
+            <version>${variant.native.version}</version>
+        </dependency>
+
+        <dependency>
+            <groupId>com.4paradigm.openmldb</groupId>
+            <artifactId>openmldb-batch</artifactId>
+            <version>${project.parent.version}</version>
+            <exclusions>
+                <exclusion>
+                    <groupId>org.apache.spark</groupId>
+                    <artifactId>*</artifactId>
+                </exclusion>
+            </exclusions>
+        </dependency>
+
         <dependency>
             <groupId>junit</groupId>
             <artifactId>junit</artifactId>
@@ -64,27 +84,6 @@
             <scope>${spark.scope}</scope>
         </dependency>

-        <!-- OpenMLDB -->
-        <dependency>
-            <groupId>com.4paradigm.openmldb</groupId>
-            <artifactId>openmldb-native</artifactId>
-            <version>${variant.native.version}</version>
-            <scope>provided</scope>
-        </dependency>
-
-        <dependency>
-            <groupId>com.4paradigm.openmldb</groupId>
-            <artifactId>openmldb-batch</artifactId>
-            <version>${project.parent.version}</version>
-            <exclusions>
-                <exclusion>
-                    <groupId>org.apache.spark</groupId>
-                    <artifactId>*</artifactId>
-                </exclusion>
-            </exclusions>
-            <scope>provided</scope>
-        </dependency>
-
     </dependencies>

     <build>
@@ -129,10 +128,12 @@
             <goal>shade</goal>
         </goals>
         <configuration>
-            <artifactSet>
-                <excludes>
-                </excludes>
-            </artifactSet>
+            <relocations>
+                <relocation>
+                    <pattern>com.google.protobuf</pattern>
+                    <shadedPattern>shade.protobuf</shadedPattern>
+                </relocation>
+            </relocations>
         </configuration>
     </execution>
 </executions>
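
With this relocation, protobuf classes bundled into the shaded jar should appear under `shade/protobuf/` rather than `com/google/protobuf/`. A minimal sketch of checking that from a jar listing, assuming a POSIX shell; `protobuf_is_relocated` is a hypothetical helper and the jar path in the usage comment is a placeholder:

```shell
# Read a jar listing on stdin (for example the output of `jar tf`) and succeed
# only if relocated protobuf classes are present and no unrelocated ones remain.
# Hypothetical helper for illustration; not part of the OpenMLDB build.
protobuf_is_relocated() {
  listing="$(cat)"
  # at least one class must live under the shaded package path
  printf '%s\n' "$listing" | grep -q '^shade/protobuf/' || return 1
  # no class may remain under the original package path
  if printf '%s\n' "$listing" | grep -q '^com/google/protobuf/'; then
    return 1
  fi
  return 0
}

# Example call (placeholder path):
#   jar tf path/to/openmldb-batchjob.jar | protobuf_is_relocated && echo "relocated"
```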

java/openmldb-taskmanager/src/main/java/com/_4paradigm/openmldb/taskmanager/server/impl/TaskManagerImpl.java

Lines changed: 1 addition & 1 deletion
@@ -481,7 +481,7 @@ public TaskManager.SaveJobResultResponse SaveJobResult(TaskManager.SaveJobResult
         }
         // log if save failed
         if (!jobResultSaver.saveFile(request.getResultId(), request.getJsonData())) {
-            log.error("save job result failed(write to local file) for resultId: {}", request.getResultId());
+            logger.error("save job result failed(write to local file) for resultId: " + request.getResultId());
             return TaskManager.SaveJobResultResponse.newBuilder().setCode(StatusCode.FAILED)
                     .setMsg("save job result failed(write to local file)").build();
         }
