Your guide to an Enterprise Data Factory for Salesforce

Introduction


We may call it dummy, mocked or testing data, but it’s a fact: we always need to generate data for any Salesforce project. In this practical guide we’ll see how you can construct an Enterprise Data Factory (EDF) for your main use cases:

  1. Use case 1: create data for Apex tests.

  2. Use case 2: create data for populating an org (e.g. sandbox or scratch org) for demos, UAT or just testing slow queries.

This guide requires some basic Apex knowledge and is based on the Forceea data factory, a native open-source GitHub project (developed in Apex by the author of this guide).


The assumptions


Let’s assume you need to create opportunities with accounts. We'll keep it simple:


Accounts

  1. the MajorAccount (B2B) record type needs Annual Revenue between 1M and 10M.

  2. Industry will get any picklist value except Electronics and Energy.

  3. Shipping address will get random “real” addresses.

Opportunities

  1. the B2bOpportunity record type

  2. Close Date will get a random date between 60 days before and 60 days after today.

  3. Stage will get a random value from Prospecting, Qualification and Needs Analysis.


The steps


Step 1: go to https://github.com/Forceea/Forceea-data-factory and get the details for using the Salesforce CLI command that installs the latest managed package into your org.


Step 2: create a new DataFactory class.

public with sharing class DataFactory {
    // your templates here
} 
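The Templates in the following steps refer to keys such as ACCOUNTS and MAJOR_ACCOUNTS, which are never declared in the snippets of this guide. A minimal sketch, assuming the keys are plain String constants of the DataFactory class (the string values are hypothetical; any unique labels should do):

```apex
public with sharing class DataFactory {
    // Keys (labels) of the Template items used in this guide.
    // Assumption: the values are arbitrary, as long as each one is unique.
    public static final String ACCOUNTS = 'Accounts';
    public static final String MAJOR_ACCOUNTS = 'MajorAccounts';
    public static final String OPPORTUNITIES = 'Opportunities';
    public static final String B2B_OPPORTUNITIES = 'B2bOpportunities';

    // your templates here
}
```

The constants are public because the test code later in this guide reads them from outside the class, e.g. DataFactory.B2B_OPPORTUNITIES.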

Step 3: create the basic Template for accounts.

public static Forceea.Template templateAccounts() {
    // Template details
} 

A Template defines the SObject, the field values, the number of records and many other details. The following basic (simple) Template

  1. returns a new Forceea.Template

  2. adds a new Template item, using the add(key,value) method.

The key is just a label you give for this Template item. The value is always a new FObject (Forceea uses FObject to keep all the details about an SObject).

// Template details 
return new Forceea.Template()
    .add(ACCOUNTS, new FObject(Account.SObjectType)
        // methods
    ); 

With “methods” you can define

  1. variables, e.g. .setVariable('recordType', 'MajorAccount')

  2. the number of records, e.g. .setNumberOfRecords(10)

  3. field definitions, e.g. .setDefinition(Account.Industry, 'random type(picklist) except(Electronics,Energy)')

  4. many other settings and options.

Now let’s see what your first Template contains:

public static Forceea.Template templateAccounts() {
    return new Forceea.Template()
        .add(ACCOUNTS, new FObject(Account.SObjectType)
            .setNumberOfRecords(10)
            .setDefinition(Account.Industry,                
                'random type(picklist) except(Electronics,Energy)')
            .setDefinition(Account.ShippingStreet,                
                'random type(street) group(shipping)')
            // other field definitions
        );
} 


Field definitions with Dadela data generation language


Your EDF is very powerful because of Dadela. Dadela allows you to define complex data values easily.


For example, the definition random type(picklist) except(Electronics,Energy) will create random values from the existing picklist values, excluding Electronics and Energy. If you need serial values, Dadela provides a rich set of serial definitions, e.g. the definition serial type(number) from(1) step(1) scale(0) creates the integers 1, 2, 3, …


Now that we’ve created your basic Template for accounts, let’s go to the next phase:


Step 4: create a basic Template for accounts of the MajorAccount record type.


The new Template will have the same structure as the previous one, with an important difference: it will get and modify the Template you created above:

public static Forceea.Template templateMajorAccounts() {
    return new Forceea.Template()
        .add(MAJOR_ACCOUNTS, templateAccounts().getFObject(ACCOUNTS)
            // methods
        );
} 

Notice line 3: templateAccounts().getFObject(ACCOUNTS). It gets the FObject with the key ACCOUNTS from templateAccounts, and then modifies this FObject adding the field definition for the record type and other details.


Again, the “methods” block contains:

.setVariable('recordType', 'MajorAccount')
.setDefinition(Account.RecordTypeId, 'static value({@recordType})')
.setDefinition(Account.Name, 'static value(Account-)')
.setDefinition(Account.Name, 
    'serial type(number) from(1) step(1) scale(0)')
.setDefinition(Account.AnnualRevenue,
    'random type(number) from(1000000) to(10000000) scale(2)')

  1. The setVariable method declares a variable for the record type (an elegant way to isolate “parameters”). A variable is used with {@variableName}.

  2. Text fields (like Name) are allowed to have multiple definitions (lines 3 and 4), while many other fields (e.g. picklist, date or number fields) are not.

The new templateMajorAccounts practically inherits all the field definitions (and any other settings) of templateAccounts. If you add a new definition to templateAccounts, all Templates that encapsulate it will inherit the new field definition as well.


Now that you’ve prepared the Account Templates, let’s enrich your EDF with opportunities.


Step 5: create a basic Template for opportunities


It’s the same logic as the Template you used for accounts:

public static Forceea.Template templateOpportunities() {
    return new Forceea.Template()
        .add(OPPORTUNITIES, new FObject(Opportunity.SObjectType)
          // methods
        );
} 

where the “methods” block includes:

.setNumberOfRecords(40)
.setVariable('today', Date.today())
.setDefinition('$days', 'random type(number) from(-60) to(60) scale(0)')
.setDefinition(Opportunity.CloseDate, 'static value({@today})')
.setDefinition(Opportunity.CloseDate, 'function-add field($days)')
.setDefinition(Opportunity.StageName,
    'random type(list) value(Prospecting, Qualification, Needs Analysis)')
.setDefinition(Opportunity.Amount,
    'random type(number) from(1000) to(10000) scale(2)')

Many interesting things happen here, especially in lines 3 and 5.

  1. The method setDefinition('$days', 'random ...') introduces the Virtual Field ($days). Virtual Fields start with $ and they represent a convenient way to keep intermediate “field calculations” or feed the same value to multiple field definitions (plus some other special purposes).

  2. The field definition function-add field($days) adds the value of another field (here the Virtual Field $days) to the current field value. Here we add a random integer number of days between -60 and 60 to today’s Date value. In practice, we get a random date between 60 days before and 60 days after today, per our initial requirements.
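To make the date arithmetic concrete, here is a plain-Apex illustration of the same calculation for a single record. It only shows the logic (a random offset added to today's date) and is not how Forceea generates the value internally; the variable names are hypothetical.

```apex
// Illustration only: plain Apex, no Forceea calls.
// Math.random() returns a value in [0, 1), so the offset is a whole
// number of days in the range [-60, 60].
Integer days = (Math.random() * 121).intValue() - 60;

// Adding the (possibly negative) offset to today gives the close date.
Date closeDate = Date.today().addDays(days);

System.debug('days = ' + days + ', closeDate = ' + closeDate);
```

In the Template, the Virtual Field $days plays the role of the days variable, and the two CloseDate definitions correspond to Date.today() and addDays.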

And now a very important note: the field definitions are calculated top-down: the first field definition of an FObject is calculated before the second, etc.


This means that you need to define a field before you use it, for example to copy its value with a definition like copy field(Account.MyField__c).


The final step will provide the Template we’ve been waiting for:


Step 6: create a complex Template for opportunities of the B2bOpportunity record type.


Again the same structure:

public static Forceea.Template templateB2bOpportunitiesWithRelated() {
    return new Forceea.Template()
        .add(templateMajorAccounts())
        .add(B2B_OPPORTUNITIES,
            templateOpportunities().getFObject(OPPORTUNITIES)
          // methods
        );
} 

with the important difference of line 3: you add an existing Template (templateMajorAccounts) before you add the item for B2B Opportunities. Now the “methods” block will be:

.setVariable('recordType', 'B2bOpportunity')
.setDefinition(Opportunity.RecordTypeId, 'static value({@recordType})')
.setDefinition(Opportunity.Name, 'static value(Opportunity-)')
.setDefinition(Opportunity.Name,
    'serial type(number) from(1) step(1) scale(0)')
.setDefinition(Opportunity.AccountId,
    'serial lookup(Account) mode(cyclical) source(forceea)') 

Look at the last line, the serial lookup field definition:

  1. It belongs to a rich set of static/serial/random lookup definitions.

  2. The definition retrieves the previously inserted accounts from Forceea’s internal storage (it doesn’t query the database) and gets the ID of each Account record serially, following a circular order.
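The circular order can be pictured as a round-robin assignment. The following plain-Apex sketch uses list indexes in place of record IDs and the record counts of the Templates above (10 accounts, 40 opportunities); it illustrates the idea only, and Forceea's internal implementation may differ.

```apex
// Illustration only: indexes stand in for Account IDs.
Integer numAccounts = 10;       // records in the accounts Template
Integer numOpportunities = 40;  // records in the opportunities Template

// Counts how many opportunities each account receives.
Map<Integer, Integer> oppsPerAccount = new Map<Integer, Integer>();
for (Integer i = 0; i < numOpportunities; i++) {
    // Opportunity i gets account 0, 1, ..., 9 and then starts again from 0.
    Integer accountIndex = Math.mod(i, numAccounts);
    Integer count = oppsPerAccount.containsKey(accountIndex)
        ? oppsPerAccount.get(accountIndex) : 0;
    oppsPerAccount.put(accountIndex, count + 1);
}
System.debug(oppsPerAccount);
```

With these numbers every account ends up with 4 opportunities, which is why a serial lookup spreads the child records evenly across the parents.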


The use cases: solutions


Let’s review what we’ve achieved so far: a DataFactory class with Forceea Templates for accounts and opportunities.


Is that enough? This is more than enough for our EDF!


Use Case 1: Apex test data


Suppose a developer wants to test that opportunities are updated successfully when we update the Stage field to “Closed Won”. The following test method:

  1. temporarily uses the FObject.setGlobalVerbose('debug') statement, so Forceea will use Debug Log for information about the data generation process (an invaluable help during development!)

  2. generates records from the template and inserts the records with template.insertRecords(true)

  3. updates the values of the StageName field with the methods setDefinitionForUpdate and updateFields

  4. finally updates the records

@IsTest
private static void itShouldUpdateOpportunities() {
    FObject.setGlobalVerbose('debug');
    // GIVEN opportunities with accounts
    Forceea.Template template =
        DataFactory.templateB2bOpportunitiesWithRelated();
    template.insertRecords(true);

    // WHEN we update the Stage field to "Closed Won"
    template.getFObject(DataFactory.B2B_OPPORTUNITIES)
        .setDefinitionForUpdate(Opportunity.StageName,
            'static value(Closed Won)')
        .updateFields()
        .updateRecords(true);

    // THEN ...
} 

Important points:

  1. You don’t have to know how your Data Factory generates the Account and Opportunity records.

  2. You can add a new definition (or replace an existing one) with a very simple step.

So, to add a new definition for a random Account Website and replace the existing definition of Opportunity Amount with a static value of 20000, we can use:

Forceea.Template template =         
    DataFactory.templateB2bOpportunitiesWithRelated();
template.getFObject(DataFactory.MAJOR_ACCOUNTS)
    .setDefinition(Account.Website, 'random type(url)');
template.getFObject(DataFactory.B2B_OPPORTUNITIES)
    .replaceDefinitions(Opportunity.Amount)
    .setDefinition(Opportunity.Amount, 'static value(20000)');
template.insertRecords(true); 

The method replaceDefinitions (line 6) will replace the existing field definition(s) of Amount with a new definition.


Use Case 2: Data for populating an org


This use case has two options:

  1. Insert data synchronously with sfdx force:apex:execute.

  2. Insert data asynchronously with FObjectAsync.

The first approach uses the same logic as Use Case 1. You create Apex scripts and execute them with the Salesforce CLI command. The drawback is that each execution is subject to the Apex transaction limits, so consider this option for a small number of records only.


The second approach is more interesting because it can generate an unlimited number of records (well, always subject to your org’s storage).


Execute this script in any Apex Anonymous Window:

Forceea.Template template = 
    DataFactory.templateB2bOpportunitiesWithRelated();
new FObjectAsync(template)
    .setNumberOfIterations(1000)
    .setNumberOfJobs(20)
    .insertRecords(); 

  1. An iteration inserts the records that are defined in the Template, let’s say 10 accounts and 40 opportunities. When you require 1000 iterations, you actually want to insert 10,000 accounts and 40,000 opportunities.

  2. You may have from 1 to 50 concurrent (Queueable Apex) jobs.

  3. The Template can be modified (we saw it previously), inserting new or replacing existing field definitions.
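The volume arithmetic from the first point can be checked with a few lines of plain Apex. The storage estimate is an assumption (roughly 2 KB of data storage per record, which is how Salesforce counts most records); the variable names are hypothetical.

```apex
// Illustration only: record volumes for the asynchronous script above.
Integer iterations = 1000;
Integer accountsPerIteration = 10;       // from the accounts Template
Integer opportunitiesPerIteration = 40;  // from the opportunities Template

Integer totalAccounts = iterations * accountsPerIteration;
Integer totalOpportunities = iterations * opportunitiesPerIteration;
Integer totalRecords = totalAccounts + totalOpportunities;

// Assumption: about 2 KB of data storage per record.
Integer estimatedStorageKb = totalRecords * 2;

System.debug(totalAccounts + ' accounts, ' + totalOpportunities
    + ' opportunities, about ' + estimatedStorageKb + ' KB of storage');
```

Under that assumption the 50,000 records need about 100,000 KB (roughly 100 MB), which is worth comparing with the storage limit of the target org before you start the jobs.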

Do you want to delete or update the inserted records? Here is the script for doing this:

Forceea.Template template = 
    DataFactory.templateB2bOpportunitiesWithRelated();
new FObjectAsync(template)
    .deleteRecords(); // or .updateRecords() 

You can define a partition field, which stores the information needed to delete or update the inserted records using multiple concurrent jobs (much faster). Partitioning avoids the notorious UNABLE_TO_LOCK_ROW error!


Conclusion


We reviewed how you can construct a best-of-breed Enterprise Data Factory with accounts and opportunities. The same logic applies to any other standard or custom SObject (or even Big Object) that you need.


Some final thoughts:

  1. Constructing your new EDF doesn’t have to be done in one week! Take a step-by-step approach and use it for all new requirements.

  2. Your legacy Data Factory code should be replaced gradually.

  3. The Forceea framework supports many advanced features, for example dependent picklist fields, an extended error messaging system or complex field definitions with permutations. One of its strengths is the Success Guide, a 180-page guide with training, examples and reference material.

Enjoy your new Enterprise Data Factory!
