Data sinks consume DataSets and are used to store or return them. Data sink operations are described using an OutputFormat. Flink comes with a variety of built-in output formats that are encapsulated behind operations on the DataSet:

  • writeAsText() / TextOutputFormat - Writes elements line-wise as Strings. The Strings are obtained by calling the toString() method of each element.
  • writeAsCsv(...) / CsvOutputFormat - Writes tuples as comma-separated value files. Row and field delimiters are configurable. The value for each field comes from the toString() method of the objects.
  • print() / printToErr() - Prints the toString() value of each element on the standard out / standard error stream.
  • write() / FileOutputFormat - Method and base class for custom file outputs. Supports custom object-to-bytes conversion.
  • output() / OutputFormat - Most generic output method, for data sinks that are not file based (such as storing the result in a database).
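
As a rough illustration of the CSV sink listed above, the following sketch writes a small DataSet of tuples with a custom field delimiter. The object name DataSetCsvSinkSketch, the output path under /tmp/flink-sink-demo and the sample tuples are assumptions made for this example, and it presumes the Flink Scala DataSet API is on the classpath.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.core.fs.FileSystem.WriteMode

// Hypothetical example object; the name and path are illustrative only.
object DataSetCsvSinkSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // A small DataSet of (Int, String) tuples (sample data, assumed).
    val pairs = env.fromElements((1, "a"), (2, "b"))

    // Assumed output location; adjust to your own file system.
    val csvPath = "file:///tmp/flink-sink-demo/pairs.csv"

    // Arguments: path, row delimiter, field delimiter, write mode.
    // Parallelism 1 on the sink produces a single file instead of a directory.
    pairs
      .writeAsCsv(csvPath, "\n", "|", WriteMode.OVERWRITE)
      .setParallelism(1)

    // File sinks are lazy; the job only runs when execute() is called.
    env.execute("DataSetCsvSinkSketch")
  }
}
```

Each tuple field is written using its toString() value, so the file should contain one line per tuple with the fields separated by the chosen delimiter.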

A DataSet can be the input to multiple operations. A program can write or print a data set and, at the same time, run additional transformations on it.
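
A minimal sketch of that idea, assuming the Flink Scala DataSet API: the same DataSet feeds one sink directly and also feeds a map transformation that has its own sink. The object name DataSetMultiSinkSketch and the paths under /tmp/flink-sink-demo are hypothetical.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.core.fs.FileSystem.WriteMode

// Hypothetical example object; the name and paths are illustrative only.
object DataSetMultiSinkSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val numbers = env.fromCollection(1 to 10)

    // First consumer: write the original elements as text.
    numbers
      .writeAsText("file:///tmp/flink-sink-demo/numbers.txt", WriteMode.OVERWRITE)
      .setParallelism(1)

    // Second consumer: transform the same DataSet and write the result.
    numbers
      .map(_ * 2)
      .writeAsText("file:///tmp/flink-sink-demo/doubled.txt", WriteMode.OVERWRITE)
      .setParallelism(1)

    // A single execute() call runs both sinks as part of one job.
    env.execute("DataSetMultiSinkSketch")
  }
}
```

Both sinks are registered before execute() is called, so the job plan contains the shared source once and two outputs.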

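import org.apache.flink.api.scala._
import org.apache.flink.core.fs.FileSystem.WriteMode

object DataSetSinkApp {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val data = 1.to(10)
    val text = env.fromCollection(data)
    val filePath = "file:///F://data/"

    // Set the parallelism of the sink here. With a parallelism of 1, the output
    // is a single file rather than a directory. With a parallelism of 2, a
    // directory named data is created containing two files, 1 and 2.
    text.writeAsText(filePath, WriteMode.OVERWRITE).setParallelism(2)

    // Sinks are lazy; the job runs only when execute() is called.
    env.execute("DataSetSinkApp")
  }
}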


