
Synthetic Data

Data that is artificially generated by algorithms that learn the statistical patterns and characteristics of a real-world dataset and then produce new records that share those characteristics. Companies use synthetic data to meet the demand for high volumes of ML training data.

Added Perspectives

“Synthetic data generation techniques can generate training data using deep learning models such as the Generative Adversarial Networks (GANs). Such models can automatically discover and learn the statistical patterns and characteristics in an input dataset and then generate output datasets that share the same characteristics as the original dataset. Hence, synthetic data generation techniques can be used to rapidly produce a high volume of training data, bypassing the requirement to collect data from the real world. Furthermore, because it generates data from scratch, synthetic data techniques do not require labeling or feature engineering process, which is one of the most expensive, time-consuming, and error-prone activities in AI development. The labels are generated along with the dataset from the get go.”

- Wayne Eckerson in Business Monitoring Systems: Using Machine Learning to Analyze Business Metrics

June 15, 2020 (Report)
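The idea in the passage above, learning the statistical patterns of an input dataset and then generating labeled output that shares them, can be illustrated with a much simpler stand-in than a GAN. The sketch below is an assumption made for illustration, not the method described in the report: it fits an independent Gaussian to each feature within each class and samples new rows from those fits. The function names and the toy dataset are hypothetical.

```python
import random
import statistics


def fit_per_class_gaussians(rows, labels):
    """Learn, for each class, its share of the data and per-feature mean/stdev."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    model = {}
    for label, members in by_class.items():
        columns = list(zip(*members))  # transpose rows into feature columns
        model[label] = {
            "weight": len(members) / len(rows),
            "means": [statistics.fmean(col) for col in columns],
            "stdevs": [statistics.pstdev(col) for col in columns],
        }
    return model


def sample_synthetic(model, n, seed=0):
    """Draw n synthetic rows; each row's label is produced along with it."""
    rng = random.Random(seed)
    class_labels = list(model)
    weights = [model[c]["weight"] for c in class_labels]

    rows, labels = [], []
    for _ in range(n):
        label = rng.choices(class_labels, weights=weights)[0]
        params = model[label]
        row = [rng.gauss(m, s) for m, s in zip(params["means"], params["stdevs"])]
        rows.append(row)
        labels.append(label)
    return rows, labels


# Toy "real" dataset (made-up values, for illustration only).
real_rows = [
    [1.0, 10.0], [1.2, 11.0], [0.8, 9.0],
    [5.0, 20.0], [5.5, 22.0], [4.5, 18.0],
]
real_labels = ["a", "a", "a", "b", "b", "b"]

model = fit_per_class_gaussians(real_rows, real_labels)
synthetic_rows, synthetic_labels = sample_synthetic(model, 1000)
```

Because the class is chosen before the features are drawn, every synthetic row arrives already labeled, which mirrors the point that no separate labeling step is needed. This stand-in ignores correlations between features; a GAN or another deep generative model is what captures richer structure.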

