Replicating Data to Apache Parquet

With Syniti Data Replication, you can replicate relational data to an Apache Parquet target. Parquet is currently supported as a target in both refresh and mirroring modes. Each session creates either a .ref file (refresh) or a .mir file (mirroring) containing the replicated content. Both file types are in Parquet format.
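As an illustration of the file naming described above, the following minimal Python sketch groups the files in an output folder by extension and checks whether each one looks like a Parquet file. It relies only on the fact that the Parquet format places the four bytes PAR1 at the start and end of every file. The function names are hypothetical and not part of Syniti DR, and the sketch assumes the .ref and .mir files sit directly in the folder you pass in.

```python
import os
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # Parquet files start and end with these four bytes


def looks_like_parquet(path: Path) -> bool:
    """Return True if the file starts and ends with the Parquet magic bytes."""
    # Smallest plausible file: leading magic + 4-byte footer length + trailing magic.
    if path.stat().st_size < 12:
        return False
    with path.open("rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last four bytes
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def list_session_files(output_folder: str) -> dict:
    """Group files in output_folder by session type, based on extension.

    Assumption: the files are written directly into output_folder.
    """
    folder = Path(output_folder)
    return {
        "refresh": sorted(folder.glob("*.ref")),
        "mirroring": sorted(folder.glob("*.mir")),
    }
```

A file that passes this check can typically be opened with any general-purpose Parquet reader, although the check itself only inspects the magic bytes and does not validate the file contents.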

To define a target connection for Parquet output:

  1. In the Metadata Explorer, right-click Targets, then choose Add New Connection.

  2. In the Add Target Connection wizard, type a name for the connection and choose Files - Parquet in the Database field.

  3. In the Set Connection String screen, choose values using the information below.

    Output Folder

    The schema name and location of the Parquet files. Set an Output Folder that is accessible from the system where Syniti DR is running.

    Add Transactional Info

    Make sure that the Add Transactional Info field is set to Yes at the beginning.

    Compression

    The compression method to use. Currently supported values: 0 - None, 1 - Gzip, 2 - Snappy.

    • None applies no compression. This is the fastest way to write files, but the files may end up slightly larger.
    • Snappy is the default and offers a good balance between compression and speed.
    • Gzip is the slowest option, but should produce the smallest files if maximum compression is your top priority.

    Use Nullable Fields

    Set to 1 - True to allow nullable fields in Parquet files.

    Use DateTime Fields

    Set to 1 - True to allow datetime fields in Parquet files. If 0 - False is selected, strings are used instead of datetime values.

    Use One File Per Group

    True by default. If True, Syniti DR uses the same file for all streams within the group, which is the better choice for performance.
  4. Click Next to view the Select Tables screen.
    If this is the first time you have created a Parquet connection using the output folder defined above, the table display will be empty. You can add a representation of target tables after completing the wizard.

  5. Click Next to display the Actions screen.

  6. Optionally choose to continue with creating replications once the wizard is complete.

  7. Click Next to display the summary, then click Finish to create the connection.

  8. Next, add a representation of the target output to the Metadata Explorer. The output is represented as relational tables.

Now you can set up replications from whichever source connection you have defined to the Parquet target.
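To keep a record of the values chosen in the Set Connection String screen, the settings can be summarized in a small structure. The Python sketch below is purely illustrative: the class and attribute names are hypothetical and do not correspond to any Syniti DR API. Only the compression codes and the stated defaults (Snappy compression, Use One File Per Group set to True, Add Transactional Info set to Yes) come from the field descriptions above.

```python
from dataclasses import dataclass

# Compression codes as listed in the Compression field description.
COMPRESSION_CODES = {0: "None", 1: "Gzip", 2: "Snappy"}


@dataclass
class ParquetTargetSettings:
    """Hypothetical record of the Set Connection String fields."""
    output_folder: str
    use_nullable_fields: bool   # no default is stated in the text
    use_datetime_fields: bool   # no default is stated in the text
    add_transactional_info: bool = True  # the text asks for Yes
    compression: int = 2                 # Snappy, the stated default
    use_one_file_per_group: bool = True  # True by default

    def validate(self) -> list:
        """Return a list of problems; an empty list means the values look consistent."""
        problems = []
        if not self.output_folder:
            problems.append("Output Folder must be set")
        if self.compression not in COMPRESSION_CODES:
            problems.append(
                f"Compression must be one of {sorted(COMPRESSION_CODES)}"
            )
        if not self.add_transactional_info:
            problems.append("Add Transactional Info should be set to Yes")
        return problems
```

For example, ParquetTargetSettings("/data/parquet", True, True).validate() returns an empty list, where /data/parquet is a placeholder path.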