Common Design Patterns

Syncing Changes versus Syncing Full Datasets

When creating a data sync, you need to know whether you are synchronizing the full data set or only a subset of it.

Full Data Synchronization

Set <DroppedRecordBehaviour type="DELETE" /> so that any records that are no longer in the source are deleted in the target.

Partial Data Synchronization

Set <DroppedRecordBehaviour type="IGNORE" />; otherwise, the sync will delete any records that are not in your partial record set. Note that you cannot delete any records during a partial data synchronization.

Partitioning your Data Synchronization

You can create a full data synchronization on a partial dataset by filtering both the source and the target. For example, if you want to sync transactions from one system to another but there are a lot of transactions, you can run a data sync that filters by <Filter>[Transaction Date] > 100000 </Filter> in both the source and the target. This way, you sync a smaller dataset while still being able to perform deletions, ensuring a full synchronization of that partition of the data.
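As a sketch, the partitioned sync's source and target sections might both carry the same filter (the element names beyond those shown elsewhere in this guide, and the domain and table names, are assumptions):

```xml
<!-- Sketch only: both source and target are restricted to the same partition,
     so the DELETE behaviour applies within that partition alone. -->
<CinchyTableSource domain="Sales" table="Transactions">
    <Filter>[Transaction Date] > 100000</Filter>
</CinchyTableSource>
...
<CinchyTableTarget model="" domain="Sales" table="Transactions">
    <Filter>[Transaction Date] > 100000</Filter>
    <DroppedRecordBehaviour type="DELETE" />
</CinchyTableTarget>
```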

Syncing Linked Data

If you are syncing linked data, order your data syncs so that the data being linked to is synced first. For example, if you have customers and invoices, and invoices link to customers, sync the customer data first so that when you sync the invoices, the link to customers resolves to a pre-existing customer.

Populating Reference Data

When you want to create a reference data set, for example country codes derived from a set of shipping labels, first run the data against the Country Codes table and then sync the shipping labels into the labels table. In this scenario, set suppressDuplicateErrors="true" when running the countries sync, since duplicates are expected and should not be treated as errors.
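As a sketch, the countries sync target could look like the following (the domain and table names are placeholders):

```xml
<!-- Duplicates are expected when deriving reference data from shipping labels,
     so errors raised by duplicate rows are suppressed rather than reported. -->
<CinchyTableTarget model="" domain="Reference" table="Country Codes" suppressDuplicateErrors="true">
    ...
</CinchyTableTarget>
```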

Creating a Reference Master Data Set

Sometimes different reference data values can mean the same thing, but different systems use different codes. In our country code example, System A uses the full country name (e.g., Canada, United States), System B uses the 2-letter ISO code (e.g., CA, US), and System C uses the 3-letter ISO code (e.g., CAN, USA).

All three of these systems can sync into one shipping labels table, with a link to Country, but depending on the system, we use a different link column reference.

The same column mapping will look slightly different in each data sync to the Shipping Labels table.

Note that you can change the display column in the Shipping Labels table to switch between the columns if you decide on a different code as the golden standard, without having to modify your data sync configurations.

System A Config

<ColumnMapping sourceColumn="Country" targetColumn="Country" linkColumn="ISO 3166 Name" />

System B Config

<ColumnMapping sourceColumn="ShippingCountry" targetColumn="Country" linkColumn="ISO 3166-2 Letter Code" />

System C Config

<ColumnMapping sourceColumn="Country" targetColumn="Country" linkColumn="ISO 3166-3 Letter Code" />

Multiple Sources - Separate Records

When syncing from different sources into the same data set, add the source of the data as a column so that records from other sources are not overwritten. Include that column in your sync key as well. For contacts you might have:

<CalculatedColumn name="Source System" formula="CONCAT('','Salesforce')" dataType="Text" />
...
<SyncKey>
    <SyncKeyColumnReference name="Source System" />
    <SyncKeyColumnReference name="Email Address" />
</SyncKey>

Once you have all your data from the various systems in Cinchy, you can master that dataset and sync the master record back into any source systems you wish. These syncs simply filter the source by records where the [Master Record] column is set to true, and sync on the unique identifier without the source system. You will also likely want to only update records that already exist in the source, rather than deleting unmastered data or adding all records from other systems.

<SyncKey>
           <SyncKeyColumnReference name="Email Address" />
</SyncKey>
<NewRecordBehaviour type="IGNORE" />
<ChangedRecordBehaviour type="UPDATE" />
<DroppedRecordBehaviour type="IGNORE"/>
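The source filter described above might be sketched as follows (the column name and boolean syntax are assumptions):

```xml
<!-- Only records flagged as mastered flow back to the source systems -->
<Filter>[Master Record] = 1</Filter>
```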

Multiple Sources - Enriching Fields

Sometimes you may want to use different sources to enrich different fields on the same record. In this case, set your dropped record behaviour to IGNORE and simply update your columns based on a sync key for the record. Depending on the source system, you may or may not want to allow it to create new records. Usually an internal system, such as customer invoices, should create new customer records, but an external data source, such as a published report of company size and industry, will only add noise to your table if you insert new records.

<SyncKey>
        <SyncKeyColumnReference name="Email Address" />
</SyncKey>
<NewRecordBehaviour type="IGNORE" />
<ChangedRecordBehaviour type="UPDATE" />
<DroppedRecordBehaviour type="IGNORE"/>

Post Processing

You can add post sync scripts to the data sync configuration. These allow you to run post-processing after the data sync completes. For example, you can run a query to update the assignee on a lead after the leads data sync runs. This way you can create more complex logic, such as only updating where the assignee is empty, except for high-value leads, which are reassigned to the most senior person on each team (based on another table in the instance that records each sales director's seniority and team).

<CinchyTableTarget model="" domain="Revenue" table="Leads" suppressDuplicateErrors="true" >
     <PostSyncScripts>
          <PostSyncScript name="Script1" timeout="60">
               <CQL>INSERT CQL HERE</CQL>
          </PostSyncScript>
     </PostSyncScripts>
</CinchyTableTarget>

No Unique Identifier

If you have a file source (e.g., delimited, CSV, or Excel) and you want to sync data into Cinchy that does not have a sync key, you can add a calculated column for the row number of that record.

<CalculatedColumn name="RowNumber" formula="row_number()" dataType="Number" />

You will also want to add a calculated column for the file name (and ensure that it is unique) so that you can re-run the data sync if any failures occur.
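A sketch combining the two calculated columns into a sync key (the literal file name is a placeholder; substitute the actual file name for each run):

```xml
<CalculatedColumn name="RowNumber" formula="row_number()" dataType="Number" />
<!-- Placeholder literal, using the CONCAT pattern shown earlier for static values -->
<CalculatedColumn name="FileName" formula="CONCAT('','contacts-export.csv')" dataType="Text" />
...
<SyncKey>
    <SyncKeyColumnReference name="FileName" />
    <SyncKeyColumnReference name="RowNumber" />
</SyncKey>
```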

Bi-Directional Sync

To run a bi-directional sync, you need to identify a unique identifier in the source system. You will then run the following four data syncs. Note that if one of the systems cannot create new records, you can omit the data sync configuration that creates new records in the other system.

Sync from External Source into Cinchy (New Records)

First we run a data sync from the source into Cinchy filtering the source by records where the Cinchy ID (a custom field we create in the source) is empty. We insert these records into Cinchy and make sure to populate the source unique identifier column in Cinchy.
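A sketch of the relevant configuration pieces (the [Cinchy Id] field name, the filter syntax, and the INSERT behaviour value are assumptions):

```xml
<!-- Only pull source records that have not yet been assigned a Cinchy ID -->
<Filter>[Cinchy Id] IS NULL</Filter>
...
<NewRecordBehaviour type="INSERT" />
<DroppedRecordBehaviour type="IGNORE" />
```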

Sync from Cinchy into External Source (New Records)

We can do the opposite as well by syncing data from Cinchy to the external source by inserting any records where a source unique identifier is empty.

Sync from External Source into Cinchy (Existing Records)

Now that all records exist in both the external system and Cinchy, we can sync data based on the target system's unique ID. In this case, we are syncing data from the external source into Cinchy based on the Cinchy ID. We filter out records where the Cinchy ID is null to avoid errors (those will be picked up the next time the new-records sync runs).
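Sketched out, this sync filters on and keys by the Cinchy ID (the field name and syntax are assumptions):

```xml
<!-- Match existing records via the Cinchy ID stored in the source system -->
<Filter>[Cinchy Id] IS NOT NULL</Filter>
...
<SyncKey>
    <SyncKeyColumnReference name="Cinchy Id" />
</SyncKey>
<NewRecordBehaviour type="IGNORE" />
<ChangedRecordBehaviour type="UPDATE" />
```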

Sync from Cinchy into External Source (Existing Records)

Likewise, we can sync any data changes from Cinchy into the external source using the external unique identifier as the sync key, filtering out records in Cinchy where the external identifier is empty.

Caching Query Data

If there are some intensive summary queries you need to run on the platform, you can create a data sync to cache that data in a Cinchy table. Simply sync the results of your Cinchy query into a Cinchy table with the same schema, and schedule the CLI to run as often as you would like your cache to expire. You can point other queries or reports to query from this table, rather than the complex query itself.
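For example, scheduling can be as simple as a cron entry that invokes the CLI (a sketch; the CLI path, command name, and flags are assumptions that vary by installation and version):

```shell
# Hypothetical crontab entry: rebuild the cached summary table every hour.
# Replace the dll path, command, and flags with those of your actual CLI installation.
0 * * * * dotnet /opt/cinchy-cli/Cinchy.CLI.dll syncdata -f /configs/summary-cache.xml
```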
