当前位置: 首页 > news >正文

成都网站制作公司seo网站快排

成都网站制作公司,seo网站快排,app制作外包,重庆高端网站建设价格经过一系列Transformation转换操作后,最后一定要调用Sink操作,才会形成一个完整的DataFlow拓扑。只有调用了Sink操作,才会产生最终的计算结果,这些数据可以写入到的文件、输出到指定的网络端口、消息中间件、外部的文件系统或者是…

经过一系列Transformation转换操作后,最后一定要调用Sink操作,才会形成一个完整的DataFlow拓扑。只有调用了Sink操作,才会产生最终的计算结果,这些数据可以写入到的文件、输出到指定的网络端口、消息中间件、外部的文件系统或者是打印到控制台.

flink在批处理中常见的sink

  1. print 打印
  2. writerAsText 以文本格式输出
  3. writeAsCsv 以csv格式输出
  4. writeUsingOutputFormat 以指定的格式输出
  5. writeToSocket 输出到网络端口
  6. 自定义连接器(addSink)

参考官网:https://ci.apache.org/projects/flink/flink-docs-release-1.13/zh/docs/dev/datastream/overview/#data-sinks

1、print

打印是最简单的一个Sink,通常是用来做实验和测试时使用。如果想让一个DataStream输出打印的结果,直接可以在该DataStream调用print方法。另外,该方法还有一个重载的方法,可以传入一个字符,指定一个Sink的标识名称,如果有多个打印的Sink,用来区分到底是哪一个Sink的输出。

以下演示了print打印,以及自定义print打印。

package com.bigdata.day03;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;public class SinkPrintDemo {public static void main(String[] args) throws Exception {//1. env-准备环境StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);DataStreamSource<String> dataStreamSource = env.socketTextStream("localhost", 8888);// 打印,普通的打印// 6> helllo world//dataStreamSource.print();dataStreamSource.addSink(new MySink());// 接着手动实现该print 打印env.execute();}static class MySink extends RichSinkFunction<String> {@Overridepublic void invoke(String value, Context context) throws Exception {// 得到一个分区号,因为要模仿print打印效果int partitionId = getRuntimeContext().getIndexOfThisSubtask() + 1;String msg = partitionId +"> " +value;System.out.println(msg);}}}

 

package com.bigdata.day03;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;public class Demo01 {static class MyPrint extends RichSinkFunction<String>{private String msg;public MyPrint(){}public MyPrint(String msg){this.msg = msg;}@Overridepublic void invoke(String value, Context context) throws Exception {int partition = getRuntimeContext().getIndexOfThisSubtask();if(msg == null){System.out.println(partition+"> "+value);}else{System.out.println(msg+">>>:"+partition+"> "+value);}}}public static void main(String[] args) throws Exception {//1. env-准备环境StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);//2. source-加载数据DataStream<String> data = env.fromElements("hello", "world", "baotianman", "laoyan");//3. transformation-数据处理转换//4. sink-数据输出//data.print();//data.print("普通打印>>>");data.addSink(new MyPrint());data.addSink(new MyPrint("模仿:"));//5. execute-执行env.execute();}
}

 

下面的结果是WordCount例子中调用print Sink输出在控制台的结果,细心的读者会发现,在输出的单词和次数之前,有一个数字前缀,我这里是1~4,这个数字是该Sink所在subtask的Index + 1。有的读者运行的结果数字前缀是1~8,该数字前缀其实是与任务的并行度相关的,由于该任务是以local模式运行,默认的并行度是所在机器可用的逻辑核数即线程数,我的电脑是2核4线程的,所以subtask的Index范围是0~3,将Index + 1,显示的数字前缀就是1~4了。这里在来仔细的观察一下运行的结果发现:相同的单词输出结果的数字前缀一定相同,即经过keyBy之后,相同的单词会被shuffle到同一个subtask中,并且在同一个subtask的同一个组内进行聚合。一个subtask中是可能有零到多个组的,如果是有多个组,每一个组是相互独立的,累加的结果不会相互干扰。

sum之后的:

1> hello 3

2> world 4

汇总之前,keyBy之后

1> hello 1

1> hello 1

1> hello 1

2、writerAsText 以文本格式输出

该方法是将数据以文本格式实时的写入到指定的目录中,本质上使用的是TextOutputFormat格式写入的。每输出一个元素,在该内容后面同时追加一个换行符,最终以字符的形式写入到文件中,目录中的文件名称是该Sink所在subtask的Index + 1。该方法还有一个重载的方法,可以额外指定一个枚举类型的参数writeMode,默认是WriteMode.NO_OVERWRITE,如果指定相同输出目录下有相同的名称文件存在,就会出现异常。如果是WriteMode.OVERWRITE,会将以前的文件覆盖。

package com.bigdata.day03;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;public class SinkTextDemo {public static void main(String[] args) throws Exception {//1. env-准备环境StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);env.setParallelism(2);DataStreamSource<String> dataStreamSource = env.socketTextStream("localhost", 8880);// 写入到文件的时候,OVERWRITE 模式是重写的意思,假如以前有结果直接覆盖// 如果并行度为1 ,最后输出的结果是一个文件,假如并行度 > 1 最后的结果是一个文件夹,文件夹中的文件名是 分区号(任务号)dataStreamSource.writeAsText("F:\\BD230801\\FlinkDemo\\datas\\result", FileSystem.WriteMode.OVERWRITE);env.execute();}
}
package com.bigdata.day03;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;public class Demo02 {public static void main(String[] args) throws Exception {//1. env-准备环境StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);//2. source-加载数据//3. transformation-数据处理转换//4. sink-数据输出//DataStreamSource<String> streamSource = env.socketTextStream("localhost", 8899);//streamSource.writeAsText("datas/socket", FileSystem.WriteMode.OVERWRITE).setParallelism(1);DataStreamSource<Tuple2<String, Integer>> streamSource = env.fromElements(Tuple2.of("篮球", 1),Tuple2.of("篮球", 2),Tuple2.of("篮球", 3),Tuple2.of("足球", 3),Tuple2.of("足球", 2),Tuple2.of("足球", 3));// writeAsCsv 只能保存 tuple类型的DataStream流,因为如果不是多列的话,没必要使用什么分隔符streamSource.writeAsCsv("datas/csv", FileSystem.WriteMode.OVERWRITE).setParallelism(1);//5. execute-执行env.execute();}
}

 3、连接器Connectors

JDBC Connector

该连接器可以向JDBC 数据库写入数据

JDBC | Apache Flink

<dependency><groupId>org.apache.flink</groupId><artifactId>flink-connector-jdbc_2.11</artifactId><version>${flink.version}</version>
</dependency><!--假如你是连接低版本的,使用5.1.49--><dependency><groupId>mysql</groupId><artifactId>mysql-connector-java</artifactId><version>8.0.25</version></dependency>

案例演示:

将结果读取,写入到MySQL

package com.bigdata.day03;import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcExecutionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.connector.jdbc.JdbcStatementBuilder;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;import java.sql.PreparedStatement;
import java.sql.SQLException;@Data
@AllArgsConstructor
@NoArgsConstructor
class Student{private int id;private String name;private int age;
}
public class JdbcSinkDemo {public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();DataStreamSource<Student> studentStream = env.fromElements(new Student(1, "jack", 54));JdbcConnectionOptions jdbcConnectionOptions = new JdbcConnectionOptions.JdbcConnectionOptionsBuilder().withUrl("jdbc:mysql://localhost:3306/test1").withDriverName("com.mysql.jdbc.Driver").withUsername("root").withPassword("123456").build();studentStream.addSink(JdbcSink.sink("insert into student values(null,?,?)",new JdbcStatementBuilder<Student>() {@Overridepublic void accept(PreparedStatement preparedStatement, Student student) throws SQLException {preparedStatement.setString(1,student.getName());preparedStatement.setInt(2,student.getAge());}// 假如是流的方式可以设置两条插入一次}, JdbcExecutionOptions.builder().withBatchSize(2).build(),jdbcConnectionOptions));env.execute();}
}
package com.bigdata.day03;import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcExecutionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.connector.jdbc.JdbcStatementBuilder;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;import java.sql.PreparedStatement;
import java.sql.SQLException;@Data
@AllArgsConstructor
@NoArgsConstructor
class Student{private int id;private String name;private int age;
}
public class Demo03 {public static void main(String[] args) throws Exception {//1. env-准备环境StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);DataStreamSource<Student> studentDataStreamSource = env.fromElements(new Student(1, "张三", 19),new Student(2, "lisi", 20),new Student(3, "wangwu", 19));JdbcConnectionOptions jdbcConnectionOptions = new JdbcConnectionOptions.JdbcConnectionOptionsBuilder().withUrl("jdbc:mysql://localhost:3306/kettle").withDriverName("com.mysql.cj.jdbc.Driver").withUsername("root").withPassword("root").build();studentDataStreamSource.addSink(JdbcSink.sink("insert into student values(null,?,?)",new JdbcStatementBuilder<Student>() {@Overridepublic void accept(PreparedStatement preparedStatement, Student student) throws SQLException {preparedStatement.setString(1,student.getName());preparedStatement.setInt(2,student.getAge());}},jdbcConnectionOptions));//2. source-加载数据//3. transformation-数据处理转换//4. sink-数据输出//5. execute-执行env.execute();}
}

运行结果正常:

KafkaConnector

Kafka | Apache Flink

从Kafka的topic1中消费日志数据,并做实时ETL,将状态为success的数据写入到Kafka的topic2中

 

kafka-topics.sh --bootstrap-server bigdata01:9092 --create --partitions 1 --replication-factor 3 --topic topic1
kafka-topics.sh --bootstrap-server bigdata01:9092 --create --partitions 1 --replication-factor 3 --topic topic2使用控制台当做kafka消息的生产者向kafka中的topic1 发送消息
kafka-console-producer.sh  --bootstrap-server bigdata01:9092 --topic topic1消费kafka中topic2中的数据
kafka-console-consumer.sh  --bootstrap-server bigdata01:9092 --topic topic2操作:
通过黑窗口向topic1中发送消息,含有success字样的消息,会出现在topic2中。

package com.bigdata.day03;import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.kafka.clients.producer.KafkaProducer;import java.util.Properties;public class KafkaSinkDemo {// 从topic1中获取数据,放入到topic2中,训练了读和写public static void main(String[] args) throws Exception {//1. env-准备环境StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);//2. source-加载数据Properties properties = new Properties();properties.setProperty("bootstrap.servers", "bigdata01:9092");properties.setProperty("group.id", "g1");FlinkKafkaConsumer<String> kafkaSource = new FlinkKafkaConsumer<String>("topic1",new SimpleStringSchema(),properties);DataStreamSource<String> dataStreamSource = env.addSource(kafkaSource);//3. transformation-数据处理转换SingleOutputStreamOperator<String> filterStream = dataStreamSource.filter(new FilterFunction<String>() {@Overridepublic boolean filter(String s) throws Exception {return s.contains("success");}});//4. sink-数据输出FlinkKafkaProducer kafkaProducer = new FlinkKafkaProducer<String>("topic2",new SimpleStringSchema(),properties);filterStream.addSink(kafkaProducer);//5. execute-执行env.execute();}
}

Flink Kafka Consumer 需要知道如何将 Kafka 中的二进制数据转换为 Java 或者Scala 对象。KafkaDeserializationSchema 允许用户指定这样的 schema,每条 Kafka 中的消息会调用 T deserialize(ConsumerRecord<byte[], byte[]> record) 反序列化。

为了方便使用,Flink 提供了以下几种 schemas:

SimpleStringSchema:按照字符串方式序列化、反序列化

剩余还有 TypeInformationSerializationSchema、JsonDeserializationSchema、AvroDeserializationSchema等。

自定义Sink--模拟jdbcSink的实现

jdbcSink官方已经提供过了,此处仅仅是模拟它的实现,从而学习如何自定义sink

package com.bigdata.day03;import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcExecutionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.connector.jdbc.JdbcStatementBuilder;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;public class CustomJdbcSinkDemo {@Data@AllArgsConstructor@NoArgsConstructorstatic class Student{private int id;private String name;private int age;}static class MyJdbcSink  extends RichSinkFunction<Student> {Connection conn =null;PreparedStatement ps = null;@Overridepublic void open(Configuration parameters) throws Exception {// 这个里面编写连接数据库的代码Class.forName("com.mysql.jdbc.Driver");conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/test1", "root", "123456");ps = conn.prepareStatement("INSERT INTO `student` (`id`, `name`, `age`) VALUES (null, ?, ?)");}@Overridepublic void close() throws Exception {// 关闭数据库的代码ps.close();conn.close();}@Overridepublic void invoke(Student student, Context context) throws Exception {// 将数据插入到数据库中ps.setString(1,student.getName());ps.setInt(2,student.getAge());ps.execute();}}public static void main(String[] args) throws Exception {StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();DataStreamSource<Student> studentStream = env.fromElements(new Student(1, "马斯克", 51));studentStream.addSink(new MyJdbcSink());env.execute();}
}


文章转载自:
http://crappie.c7497.cn
http://rident.c7497.cn
http://pud.c7497.cn
http://african.c7497.cn
http://cigar.c7497.cn
http://endwise.c7497.cn
http://ahab.c7497.cn
http://storiology.c7497.cn
http://precocious.c7497.cn
http://aggravation.c7497.cn
http://choregus.c7497.cn
http://cavil.c7497.cn
http://wack.c7497.cn
http://enwomb.c7497.cn
http://tycho.c7497.cn
http://karbala.c7497.cn
http://vivific.c7497.cn
http://obumbrate.c7497.cn
http://braunschweiger.c7497.cn
http://gazehound.c7497.cn
http://wrecky.c7497.cn
http://bawdry.c7497.cn
http://charmer.c7497.cn
http://skiver.c7497.cn
http://overcommit.c7497.cn
http://selfwards.c7497.cn
http://thixotropic.c7497.cn
http://sojourn.c7497.cn
http://chrp.c7497.cn
http://mmf.c7497.cn
http://collarbone.c7497.cn
http://gridder.c7497.cn
http://device.c7497.cn
http://hemophilic.c7497.cn
http://uncalculated.c7497.cn
http://villadom.c7497.cn
http://snailery.c7497.cn
http://hell.c7497.cn
http://patriotic.c7497.cn
http://lumbersome.c7497.cn
http://memoire.c7497.cn
http://spellbind.c7497.cn
http://had.c7497.cn
http://unfavorable.c7497.cn
http://toolroom.c7497.cn
http://trousseau.c7497.cn
http://technician.c7497.cn
http://noneffective.c7497.cn
http://trihydric.c7497.cn
http://habitual.c7497.cn
http://epiplastron.c7497.cn
http://skelp.c7497.cn
http://anyuan.c7497.cn
http://ilium.c7497.cn
http://autoimmunization.c7497.cn
http://sophist.c7497.cn
http://midcourse.c7497.cn
http://shaman.c7497.cn
http://rascal.c7497.cn
http://niocalite.c7497.cn
http://lunilogical.c7497.cn
http://hedgehop.c7497.cn
http://suez.c7497.cn
http://gloucestershire.c7497.cn
http://parroquet.c7497.cn
http://valentinite.c7497.cn
http://capper.c7497.cn
http://wiggly.c7497.cn
http://reinless.c7497.cn
http://symmograph.c7497.cn
http://sernyl.c7497.cn
http://idler.c7497.cn
http://extratropical.c7497.cn
http://jeeringly.c7497.cn
http://cybernetical.c7497.cn
http://protuberant.c7497.cn
http://laudative.c7497.cn
http://cite.c7497.cn
http://bunco.c7497.cn
http://abnaki.c7497.cn
http://coextend.c7497.cn
http://absterge.c7497.cn
http://acatalasemia.c7497.cn
http://lgm.c7497.cn
http://exosmosis.c7497.cn
http://terebic.c7497.cn
http://no.c7497.cn
http://cybernatic.c7497.cn
http://jamboree.c7497.cn
http://jsd.c7497.cn
http://intuit.c7497.cn
http://neurology.c7497.cn
http://drawn.c7497.cn
http://anyhow.c7497.cn
http://platitudinize.c7497.cn
http://tripod.c7497.cn
http://warhead.c7497.cn
http://plain.c7497.cn
http://enchondromatous.c7497.cn
http://uninspired.c7497.cn
http://www.zhongyajixie.com/news/82766.html

相关文章:

  • 可以用来展示的网站昆明网络推广方式有哪些
  • 哪个网站可以找题目给小孩做2024年4月新冠疫情结束了吗
  • 网站建设 昆山seo优化名词解释
  • 门头设计效果图网站汽车软文广告
  • 企业建站公司怎么创业舆情分析网站
  • 铁岭做网站公司哪家好网上销售平台有哪些
  • 绵阳做网站哪家公司好如何进行网络营销
  • 武汉通官网网站建设百度竞价点击工具
  • 网站建设结构设计方案网站seo推广优化
  • 阿里巴巴国际站怎么网站建设福建网络seo关键词优化教程
  • 鹰潭城乡建设局的网站深圳做网站公司
  • 新手自学做网站多久营销是什么意思
  • 徐州专业做网站高效统筹疫情防控和经济社会发展
  • 唯品会一家专门做特卖的网站手机版优化设计
  • 做同城网站有哪些百度模拟点击软件判刑了
  • 教育部高等学校建设中心网站北京网络营销外包公司哪家好
  • wordpress网站更换域名专业网站快速
  • 橱柜手机网站模板新站点seo联系方式
  • 南城网站建设免费优化网站
  • 网站后台模板关联自己做的网站网络营销与直播电商专业
  • 为其他公司做网站怎么做账济南seo培训
  • 阿里云个人备案可以做企业网站软件发布网
  • 泰安做网站的公司信息流广告加盟代理
  • 淘宝电商网站怎么做买淘宝店铺多少钱一个
  • 金坛网站建设价格好看的web网页
  • 卖产品的网站怎么做网络服务费计入什么科目
  • 远憬建站百度推广有哪些形式
  • 可以做公众号的网站吗最近有新病毒出现吗
  • 重庆市娱乐场所暂停营业5g网络优化工程师
  • 网站的建设分析百度最怕哪个部门去投诉