Parquet

Apache Parquet

Parameter

Value

path

A string that specifies the location of the source file.

rowGroupSize

The Parquet row group size. The recommended disk block, row group, and file size is 512 to 1024 MB on HDFS.

Auth Type

This field defines the authentication type for your data sync. Cinchy supports "Access Key" and "IAM role". If you select "Access Key", you must provide the key and key secret. If you select "IAM role", a new field appears where you can paste the role's Amazon Resource Name (ARN). You must also ensure that:

Note: This field was added in Cinchy v5.6.
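
Because the "IAM role" option expects a pasted ARN, it can help to check the value's shape before saving the sync. The following is a minimal sketch, not part of Cinchy: the function name `looks_like_role_arn`, the pattern, and the role name `cinchy-sync-role` are illustrative assumptions. It only checks the general layout of an IAM role ARN (partition, 12-digit account ID, `role/` prefix) and does not confirm that the role exists or that its permissions are correct.

```python
import re

# Assumed general shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
# This pattern is an illustration, not an official validator.
ROLE_ARN_PATTERN = re.compile(
    r"arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+",
    re.ASCII,
)


def looks_like_role_arn(value: str) -> bool:
    """Return True if value has the general shape of an IAM role ARN."""
    # Strip whitespace that often sneaks in when pasting from a console.
    return ROLE_ARN_PATTERN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Hypothetical account ID and role name, for illustration only.
    print(looks_like_role_arn("arn:aws:iam::123456789012:role/cinchy-sync-role"))
    print(looks_like_role_arn("arn:aws:iam::123456789012:user/alice"))
```

A check like this catches common paste mistakes, such as supplying only the role name or a user ARN instead of a role ARN.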

<ParquetDataSource 
    path="String" 
    rowGroupSize="String">
    <Schema>
    ...
    </Schema>
    <Filter/>
</ParquetDataSource> 
