You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

4.8 KiB

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

你是ETL专家精通使用seatunel工具实现实时数据同步我将告诉你一个任务然后你会帮我生成seatunnel配置。

seatunel 配置格式示例

seatunel 使用hocon格式,实时同步配置必须包含3个部分env,source,slink。其中env表示全局环境变量,source表示数据来源,slink表示数据去向

例如以下是MySQL-CDC同步到doris的配置

env {
  parallelism = 1
  job.mode = "STREAMING"
}
source {
  MySQL-CDC {
    base-url = "jdbc:mysql://127.0.0.1:3306/test_db"
    username = "root"
    password = "root@123"
	database-names = ["test_db"]
    table-names = ["test_db.test_t_crjry","test_db.test_t_crjjtgj","test_db.test_t_crjry_jhw"]
    startup.mode = "initial"
	schema-changes.enabled = true
	server-id = "6500-7500"
	table-names-config = [
      {
        table = "test_db.test_t_crjry"
        primaryKeys = ["ID"]
      },
	        {
        table = "test_db.test_t_crjjtgj"
        primaryKeys = ["ID"]
      },
	        {
        table = "test_db.test_t_crjry_jhw"
        primaryKeys = ["ID"]
      }
    ]
  }
}
sink {
	Doris {
        fenodes = "127.0.0.1:8030"
        query-port = 9030
        username = root
        password = "root@123"
        schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
        database = "test_db_1"
        table = "${table_name}_cdc_test"
        sink.enable-2pc = "true"
        sink.enable-delete = "true"
        sink.label-prefix = "cdc_test_ms"
        doris.config = {
            format="json"
            read_json_by_line="true"
        }
    }
}

seatunel 配置说明

env 示例

用途:表示全局环境变量 配置项: - parallelism并行度,默认为1 - job.mode任务模式,默认为 STREAMING env { parallelism = 1 job.mode = "STREAMING" }

source

用途数据来源配置包含MySQL-CDC或者Postgres-CDC如果没有特别说明MySQL-CDC和Postgres-CDC的配置项相同且均为必填项 配置项: - base-url数据库的jdbcP链接例如jdbc:mysql://host:port/database - username数据库账号 - password数据库密码 - database-names数据库名称有多个 - table-names: 表名MySQL-CDC格式为:数据库名.表名Postgres-CDC格式为数据库名.模式名称.表名 - startup.mode: 同步模式,默认 initial - schema-changes.enabled: schema变更同步默认 true 。注MySQL-CDC特有配置 - server-id: server-id范围默认 6500-7500。注MySQL-CDC特有配置 - tabletable-names中的表名,单个MySQL-CDC格式为:数据库名.表名Postgres-CDC格式为数据库名.模式名.表名 - primaryKeys表主键有多个 - slot.name同步槽名称格式为字母+下划线长度不超过16位可以随机生成。注Postgres-CDC特有配置 - schema-names模式名称默认public包含多个。注Postgres-CDC特有配置

source { MySQL-CDC { base-url = "" username = "" password = "" database-names = [""] table-names = [""] startup.mode = "initial" schema-changes.enabled = true server-id = "6500-7500" table-names-config = [ { table = "" primaryKeys = [""] }, { table = "" primaryKeys = [""] } ] } }

用途: 数据去向配置 配置项: - fenodes:doris FE节点URL,端口号为8030,例如127.0.0.1:8030 - query-port:doris 查询端口号默认9030 - username:doris用户名 - password:doris密码 - schema_save_mode:schema创建策略默认CREATE_SCHEMA_WHEN_NOT_EXIST - database数据库名 - table:目标表名,可以使用${table_name}占位符 - sink.enable-2pc:开启2阶段提交默认true - sink.enable-delete允许删除默认true - sink.label-prefix导入使用的标签前缀具有唯一性。格式为格式为字母+下划线长度不超过16位可以随机生成 - doris.config:doris解析配置采用默认 sink { Doris { fenodes = "" query-port = 9030 username = password = "" schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST" database = "" table = "" sink.enable-2pc = "true" sink.enable-delete = "true" sink.label-prefix = "" doris.config = { format="json" read_json_by_line="true" } } }

规则

1.禁止使用示例和配置说明以外的配置项 2.直接输出结果,不需要配置说明或其他信息 3.必须为Hocon格式

我的任务是: 帮我把postgresql的test_t_crjry(主键ID)、test_t_crjjtgj(主键PKID)、test_t_crjry_jhw(主键WYBS)三张表同步到doris数据库配置信息如下 postgresql连接信息 host=172.31.51.244 端口=5432 数据库名=test_db 用户名=manager 密码=manager2!@# 模式名称=public

doris连接信息 FE节点host=172.31.51.142 数据库名=bjbj 用户名=admin 密码=6G_FahdUxAh@K