Sign up for weekly AppOps insights.

How to populate Salesforce sandboxes with data

Hayley Coxon

VP of Marketing

September 2, 2020

Considerations for seeding sandboxes

Seeding sandboxes helps accelerate Salesforce development cycles, increases testing throughout development, and ultimately enables your team to better build new solutions. But how do you actually do it? No one likes having to redo work—especially data migration work—so before attempting to move any data, take a moment to plan your sandbox seeding project. Here are some things to consider:

Limitations

Each type of Salesforce org has a specific purpose in Application Lifecycle Management, and even though most sandboxes can serve many uses, different org types have different limits. For example, scratch orgs and Developer/Developer Pro sandboxes are well suited to starting new projects because they are fairly disposable and keep changes isolated from other projects. However, they also have the least data storage of any org type, so you may only be able to seed a subset of production data. It’s important to understand the limitations of the org you are attempting to seed so you don’t burn through the limits with data that’s irrelevant to the project. What are the limitations of your destination org? What data is essential to the development work, and what’s the right amount of data for development and testing at this stage of the project? How will you filter record data to capture what you need?
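To make the sizing question concrete, here is a minimal sketch of a pre-seeding storage check. It assumes each record counts for roughly 2 KB of data storage, which is a rule of thumb rather than an exact figure for every object, and it uses a hypothetical 200 MB limit and made-up record counts. The function names are illustrative only; check the Storage Usage page of your own destination org for real numbers.

```python
# Rough check that a planned seed data set fits in the destination org.
# Assumption: each record costs about 2 KB of data storage (rule of thumb).
APPROX_KB_PER_RECORD = 2


def estimated_storage_mb(record_counts):
    """Estimate data storage in MB for a mapping of object name -> record count."""
    total_records = sum(record_counts.values())
    return total_records * APPROX_KB_PER_RECORD / 1024


def fits_in_org(record_counts, org_limit_mb, headroom=0.8):
    """Return True if the estimate stays under a share (headroom) of the limit."""
    return estimated_storage_mb(record_counts) <= org_limit_mb * headroom


# Hypothetical seeding plan and a hypothetical 200 MB destination limit.
plan = {"Account": 5_000, "Contact": 20_000, "Opportunity": 15_000}
print(f"{estimated_storage_mb(plan):.1f} MB", fits_in_org(plan, org_limit_mb=200))
```

The headroom factor leaves space for records created during development and testing, so the seed data alone does not fill the org.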

Repeatability and scale

Since sandbox seeding tends to require a sizable time investment upfront, it’s worth thinking about how to make it more scalable. More often than not, you’ll want to re-seed the same data set multiple times, both as a baseline testing data set and as work progresses on larger projects and applications. There’s also a good chance that someone else on the team is trying to achieve similar seeding results. Creating templates for recurring sandbox seeding needs can reduce the time seeding takes and accelerate development cycles across the entire team. Also consider establishing naming conventions and folder structures to keep track of your data sets and templates.

Data relationships and duplicate prevention

When a record is moved from one Salesforce org to another, its ID changes, which makes it hard to tell that Contact Record 1 in the sandbox matches Contact Record A in production. Most data seeding projects involve multiple related and unrelated objects, and related objects must be populated in a specific order. For example, you cannot associate Contacts with their Account if you have not first migrated the Account records. Having a good understanding of the data schema (or a tool that will manage the Salesforce data schema for you) will prevent frustration and rework as you migrate data.
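The ordering and ID problems described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular tool works: the dependency map, the placeholder IDs, and the helper names are all hypothetical, and the code works on in-memory dictionaries rather than calling Salesforce. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical map of each object to the parent objects it looks up to.
dependencies = {
    "Account": set(),
    "Contact": {"Account"},
    "Opportunity": {"Account"},
    "OpportunityContactRole": {"Opportunity", "Contact"},
}


def load_order(deps):
    """Return object names so that every parent comes before its children."""
    return list(TopologicalSorter(deps).static_order())


def remap_lookup(records, field, id_map):
    """Copy records, replacing source-org IDs in a lookup field with destination IDs."""
    return [{**record, field: id_map[record[field]]} for record in records]


# Placeholder IDs: a source Account ID and the ID it received in the sandbox.
account_id_map = {"SRC_ACCOUNT_A": "DST_ACCOUNT_1"}
contacts = [{"LastName": "Lee", "AccountId": "SRC_ACCOUNT_A"}]

print(load_order(dependencies))
print(remap_lookup(contacts, "AccountId", account_id_map))
```

Accounts are loaded first, the new IDs are recorded in a map, and only then are Contacts loaded with their `AccountId` values rewritten to point at the sandbox copies.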

But what if you already have data in the sandbox that you don’t want to or cannot get rid of? Traditionally with a data loader, you would use an Upsert operation, which updates existing records and inserts new ones. While this sounds relatively simple, and it is, it does require maintaining record IDs in the CSV upsert file. Be careful when updating a record, though: any fields that aren’t defined in the CSV file are ignored during the update, so their existing values stay as they are. Once the operation completes, the data loader generates CSV files of your results and any errors.
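As a rough illustration of the upsert behavior described here, the sketch below matches rows on a key, updates only the fields present in the incoming row, and inserts rows with no match. It is an in-memory model, not the data loader itself, and the `Key` column and sample values are hypothetical.

```python
def upsert(existing, incoming, key):
    """Update rows whose key already exists and insert the rest.

    Fields absent from an incoming row are left untouched on the existing row.
    Returns the merged rows plus counts of inserted and updated records.
    """
    by_key = {row[key]: dict(row) for row in existing}
    inserted = updated = 0
    for row in incoming:
        if row[key] in by_key:
            by_key[row[key]].update(row)  # only the supplied fields change
            updated += 1
        else:
            by_key[row[key]] = dict(row)
            inserted += 1
    return list(by_key.values()), inserted, updated


# Hypothetical sandbox data and CSV rows, matched on a "Key" column.
sandbox = [{"Key": "A-1", "Name": "Acme", "Phone": "555-0100"}]
csv_rows = [
    {"Key": "A-1", "Name": "Acme Corp"},  # no Phone column: Phone is kept
    {"Key": "A-2", "Name": "Globex"},     # no match: inserted as new
]

merged, inserted, updated = upsert(sandbox, csv_rows, "Key")
print(merged, inserted, updated)
```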

A final note on data loader. When you run an update, you may receive an error message regarding duplicates, and you will need to go back and remove all of the duplicates manually. Relying on CSV files is also risky: the person who keeps the file on their desktop or thumb drive may be unavailable or on vacation, or, even worse, their computer or drive could be stolen. At that point you are dealing with data loss.

Common gotchas

When a data deployment fails, it’s typically for one of a handful of reasons. Are the user permissions the same in each org? The user will need create/edit access in the destination org for all of the fields and objects in the data set. If working with Campaign data, also make sure that the User record in the source org has the “Marketing User” checkbox checked.

Validation rules might also prevent your data from inserting correctly. You’ll want to turn those off before attempting to seed the org. Finally, you’ll want to think about the schemas in your source and destination orgs. Are the schemas the same? Are any fields missing in your destination org?
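One way to answer the schema questions above before moving any data is to compare field lists for each object. The sketch below assumes you have already exported the field names for each org into plain dictionaries; how you export them is up to you, and the object and field names shown are hypothetical.

```python
def missing_fields(source_schema, dest_schema):
    """Return {object: [fields]} for fields in the source but not the destination."""
    missing = {}
    for obj, fields in source_schema.items():
        absent = set(fields) - set(dest_schema.get(obj, ()))
        if absent:
            missing[obj] = sorted(absent)
    return missing


# Hypothetical field lists for the source and destination orgs.
source = {
    "Account": ["Name", "Industry", "Region__c"],
    "Contact": ["LastName", "Email"],
}
destination = {
    "Account": ["Name", "Industry"],
    "Contact": ["LastName", "Email"],
}

print(missing_fields(source, destination))
```

An empty result means every source field has a home in the destination org. Anything listed needs to be deployed to the destination or dropped from the data set first.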

Data security and org automations

Additional steps must be taken to ensure the safe use of your data. Are you seeding the sandboxes with sensitive data? Does everyone involved in the project have permissions to all of the data? Does it need to be anonymized or scrambled to ensure it’s safeguarded during development and testing? Keep in mind that standard Salesforce Sharing Rules also apply to your data sets.

Org automations can also cause unintended consequences such as erroneous user notifications or emails to your database. Make sure to deactivate unwanted triggers, workflow rules, and Process Builder processes before seeding your org. As an extra measure to prevent emails being sent from a sandbox or Dev org, we also recommend appending “.invalid” to email addresses or setting them to empty, so that even if an email is triggered it will not reach a real customer inbox. However, if the project in question involves sending emails, consider setting all emails in your data set to a predetermined testing address so you can perform accurate testing.
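The email safeguards described above could be applied to a data set before loading with something like the following sketch. The `mask_email` helper and the sample addresses are hypothetical. It covers the two options from the text: appending “.invalid”, or routing everything to one predetermined testing address.

```python
def mask_email(email, test_address=None):
    """Make an email address safe for sandbox use.

    Empty values are returned unchanged. If test_address is given, every
    address is replaced by it; otherwise ".invalid" is appended once.
    """
    if not email:
        return email
    if test_address is not None:
        return test_address
    if email.endswith(".invalid"):
        return email  # already masked, do not append twice
    return email + ".invalid"


# Hypothetical contact rows from a seed data set.
contacts = [{"LastName": "Lee", "Email": "ann.lee@example.com"}]

masked = [{**c, "Email": mask_email(c["Email"])} for c in contacts]
routed = [{**c, "Email": mask_email(c["Email"], "qa-inbox@example.com")} for c in contacts]
print(masked, routed)
```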

Who will be responsible for seeding sandboxes

Given all of the above considerations and potential areas for errors, you’ll want to carefully consider who performs data seeding. Clearly the practice requires a strong understanding of your org’s data schema and data security policies. When we start working with companies, it’s common for this task to be solely entrusted to Data Architects or a more technical team member because of its complexity and propensity for error. In some cases companies are able, with very detailed documentation, to hand off routine data seeding tasks to less technical resources; however, ad hoc seeding needs remain the responsibility of senior staffers.

Unfortunately, sandbox seeding tends to be a time-consuming, tedious process, which means companies find themselves in a Catch-22. The task calls for your most technical employees, yet it is very expensive to have them perform it. Those same employees would much rather be working on new projects or shepherding projects through testing and UAT than moving data between orgs at the start and end of every Salesforce sprint.

Seed sandboxes faster and more reliably with Prodly AppOps

Prodly AppOps Release offers granular control over how and what data to seed, enabling customers to:

  1. Deploy entire complex, relational data schemas at once—AppOps will automatically deploy the objects sequentially
  2. Automatically prevent duplicate records
  3. Seed up to 5 orgs simultaneously
  4. Disable and re-enable org automations automatically during a deployment
  5. Granularly control what data is seeded
  6. Achieve desired data security with robust options for obfuscating data, appending to it, and/or setting default values
  7. Create and share reusable deployment plans that can be easily re-run by any user regardless of technical skill level

To see how easy sandbox seeding can be, watch this step-by-step video.