Parquet Meltano loader

The target-parquet Meltano loader writes data to Parquet files after it has been pulled from a source by an extractor.

Repository: https://github.com/estrategiahq/target-parquet

Maintainer: Estratégia

Alternative variants #

Multiple variants of target-parquet are available. This document describes the default estrategiahq variant, which is recommended for new users.

Getting Started #

Prerequisites #

If you haven't already, follow the initial steps of the Getting Started guide before adding this loader.

Installation and configuration #

Using the Command Line Interface #

Add the target-parquet loader to your project using `meltano add`:
```
meltano add loader target-parquet
```
Configure the settings below using `meltano config`.

Next steps #

Follow the remaining steps of the Getting Started guide:

Run a data integration (EL) pipeline
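
The pipeline step above can be sketched as follows. This is a minimal sketch, assuming a Meltano version that provides `meltano run`; `tap-example` is a hypothetical extractor name, not one named on this page. The guard only invokes Meltano when the CLI and a project file are present, so the snippet is safe to paste outside a project.

```shell
EXTRACTOR="tap-example"   # hypothetical; use the extractor added to your project
LOADER="target-parquet"

# Run the extractor into the Parquet loader only from inside a Meltano project.
if command -v meltano >/dev/null 2>&1 && [ -f meltano.yml ]; then
  meltano run "$EXTRACTOR" "$LOADER"
else
  echo "Skipping: run this from a Meltano project with the meltano CLI installed."
fi
```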

If you run into any issues, learn how to get help.

Capabilities #

target-parquet does not have any capabilities defined in its metadata. Please consider adding them by making a pull request to the YAML file that defines the capabilities for this loader.

Settings #

The supported settings are documented below.

Disable Collection (`disable_collection`) #

Environment variable: TARGET_PARQUET_DISABLE_COLLECTION

Boolean. Whether to disable Singer anonymous tracking.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-parquet set disable_collection true

export TARGET_PARQUET_DISABLE_COLLECTION=true

Logging Level (`logging_level`) #

Environment variable: TARGET_PARQUET_LOGGING_LEVEL

The log level. Default: INFO.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-parquet set logging_level <logging_level>

export TARGET_PARQUET_LOGGING_LEVEL=<logging_level>

Destination Path (`destination_path`) #

Environment variable: TARGET_PARQUET_DESTINATION_PATH

The path to write files to. Default: `.` (the current working directory).

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-parquet set destination_path <destination_path>

export TARGET_PARQUET_DESTINATION_PATH=<destination_path>
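
As an illustration, the destination can be prepared before the loader runs. This is a sketch only: `./output/parquet` is a hypothetical path, and creating the directory up front is a precaution, since this page does not say whether the loader creates missing directories itself.

```shell
DEST_DIR="./output/parquet"   # hypothetical location; pick any writable path

# Create the directory (and any missing parents) before pointing the loader at it.
mkdir -p "$DEST_DIR"

# Equivalent to: meltano config target-parquet set destination_path "$DEST_DIR"
export TARGET_PARQUET_DESTINATION_PATH="$DEST_DIR"
```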

Compression Method (`compression_method`) #

Environment variable: TARGET_PARQUET_COMPRESSION_METHOD

The compression method for the output files, which must be supported by PyArrow. The methods currently available are snappy (recommended), zstd, brotli, and gzip.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-parquet set compression_method <compression_method>

export TARGET_PARQUET_COMPRESSION_METHOD=<compression_method>
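
Because only the methods listed above are accepted, it can help to validate the value before exporting it. The following is a minimal sketch; the `COMPRESSION` variable is a name introduced here for illustration, and the check mirrors the list on this page rather than anything the loader itself runs.

```shell
COMPRESSION="zstd"   # illustrative choice; snappy is the recommended method

case "$COMPRESSION" in
  snappy|zstd|brotli|gzip)
    # Value is one of the methods listed on this page, so hand it to the loader.
    export TARGET_PARQUET_COMPRESSION_METHOD="$COMPRESSION"
    ;;
  *)
    echo "unsupported compression method: $COMPRESSION" >&2
    ;;
esac
```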

Streams In Separate Folder (`streams_in_separate_folder`) #

Environment variable: TARGET_PARQUET_STREAMS_IN_SEPARATE_FOLDER

Whether to write each stream to its own folder, since streams are expected to have different schemas. Default: False.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-parquet set streams_in_separate_folder true

export TARGET_PARQUET_STREAMS_IN_SEPARATE_FOLDER=true
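
To check how streams were laid out after a run, one option is to list the folders under the destination path that contain Parquet files. This is a sketch under an assumption not stated on this page, namely that output files carry a `.parquet` extension; `list_parquet_folders` is a helper name introduced here for illustration.

```shell
# Print each distinct folder under the given root that holds a .parquet file.
list_parquet_folders() {
  find "${1:-.}" -type f -name '*.parquet' -exec dirname {} \; | sort -u
}

# Fall back to the documented default path (.) when the variable is unset.
list_parquet_folders "${TARGET_PARQUET_DESTINATION_PATH:-.}"
```

With `streams_in_separate_folder` enabled, the output would be expected to show one folder per stream instead of a single shared folder.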

File Size (`file_size`) #

Environment variable: TARGET_PARQUET_FILE_SIZE

The number of rows to write per file. By default, all rows are written to a single file.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-parquet set file_size 1234

export TARGET_PARQUET_FILE_SIZE=1234
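
To estimate how many files a given `file_size` produces, divide the row count by the setting and round up. The sketch below assumes each file holds at most `file_size` rows and the last file holds the remainder; `TOTAL_ROWS` is a hypothetical row count for one stream.

```shell
TOTAL_ROWS=5000                        # hypothetical number of rows in one stream
export TARGET_PARQUET_FILE_SIZE=1234   # rows per file, as in the example above

# Ceiling division: add (divisor - 1) before dividing with integer arithmetic.
EXPECTED_FILES=$(( (TOTAL_ROWS + TARGET_PARQUET_FILE_SIZE - 1) / TARGET_PARQUET_FILE_SIZE ))

echo "$EXPECTED_FILES"   # prints 5: four files of 1234 rows and one of 64
```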

Looking for help? #

If you're having trouble getting the target-parquet loader to work, look for an existing issue in its repository, file a new issue, or join the Meltano Slack community and ask for help in the #plugins-general channel.

