Synthetic Data Generation. Part 3: Backup and Restore


Implementing a general script for data sanitization and for altering confidential data

In the previous articles (Part 1: Data Copying, Part 2: Data Changing), we looked at simple examples for each type of data alteration:

  1. Changing the date and time;
  2. Changing the numerical value;
  3. Changing the byte sequence;
  4. Changing character data.

However, the examples described above do not meet criteria 2 and 3 for data-altering scripts:


Synthetic Data Generation. Part 2: Data Changing


Character Data Change

Here, we use the English and Russian alphabets as an example, but the same approach works for any other alphabet. The only condition is that its characters can be stored in the NCHAR types.

We need to create a function that accepts a string, replaces every character with a pseudorandom character, and then assembles and returns the result.
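The idea can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the article's own implementation, which targets SQL Server. The function name `scramble_text`, the alphabet constants, and the choices to preserve letter case and to leave non-alphabet characters (digits, spaces, punctuation) unchanged are all assumptions made for this example.

```python
import random
from typing import Optional

# Assumed alphabets for this sketch: English and Russian lowercase letters.
ENGLISH_LOWER = "abcdefghijklmnopqrstuvwxyz"
RUSSIAN_LOWER = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"


def scramble_text(text: str, rng: Optional[random.Random] = None) -> str:
    """Replace each letter with a pseudorandom letter from the same alphabet.

    Assumptions of this sketch (not taken from the article):
    - letter case is preserved;
    - characters outside both alphabets are copied unchanged.
    """
    rng = rng or random.Random()
    result = []
    for ch in text:
        lower = ch.lower()
        if lower in ENGLISH_LOWER:
            replacement = rng.choice(ENGLISH_LOWER)
        elif lower in RUSSIAN_LOWER:
            replacement = rng.choice(RUSSIAN_LOWER)
        else:
            # Not a letter of either alphabet: keep it as-is.
            result.append(ch)
            continue
        # Restore the original letter's case.
        result.append(replacement.upper() if ch.isupper() else replacement)
    # Put the per-character results back together into one string.
    return "".join(result)


if __name__ == "__main__":
    # A fixed seed makes the output repeatable between runs.
    print(scramble_text("Hello, Мир 42!", random.Random(2024)))
```

Passing a seeded `random.Random` instance makes the replacement repeatable, which can be convenient when the same sanitized data set has to be regenerated. Without a seed, each call produces a different result.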


Synthetic Data Generation. Part 1: Data Copying


Introduction

Sooner or later, any information system gets a database, and often more than one. Over time, that database accumulates a large amount of data, from several gigabytes to dozens of terabytes. To understand how the system's functionality will perform as data volumes grow, we need to generate data to fill that database.

All the scripts presented here run against the JobEmplDB database of a recruiting service. The database implementation is available here
