Recommendations on the Use of Synthetic Data to Train AI Models

Philippe de Wilde, Payal Arora, Fernando Buarque, Yik Chin, Mamello Thinyane, Serge Stinckwich, Eleonore Fournier-Tombs and Tshilidzi Marwala (2024). Recommendations on the Use of Synthetic Data to Train AI Models. United Nations University.

  • Attached Files
    Name Use-of-Synthetic-Data-to-Train-AI-Models.pdf
    Description English PDF
    MIMEType application/pdf
    Size 629.50KB
  • Sub-type Policy brief
    Author Philippe de Wilde
    Payal Arora
    Fernando Buarque
    Yik Chin
    Mamello Thinyane
    Serge Stinckwich
    Eleonore Fournier-Tombs
    Tshilidzi Marwala
    Title Recommendations on the Use of Synthetic Data to Train AI Models
    Publication Date 2024-02
    Place of Publication Tokyo
    Publisher United Nations University
    Pages 9
    Language eng
    Abstract Using synthetic, or artificially generated, data to train Artificial Intelligence (AI) algorithms is a burgeoning practice with significant potential to affect society directly. It can address data scarcity, privacy and bias issues, but it raises concerns about data quality, security and ethical implications. While some systems use only synthetic data, synthetic data are most often used together with real-world data to train AI models. Our recommendations in this document apply to any system in which some synthetic data are used. Synthetic data have the potential to enhance existing data, allowing for more efficient and inclusive practices and policies. However, we cannot assume synthetic data to be automatically better than, or even equivalent to, data from the physical world. Using synthetic data carries many risks, including cybersecurity risks, bias propagation and increased model error. This document sets out recommendations for the responsible use of synthetic data in AI training.
    Copyright Holder United Nations University
    Copyright Year 2024
    Copyright type All rights reserved
    ISBN 9789280891546
    Access Statistics: 51 Abstract Views, 303 File Downloads
    Created: Thu, 29 Feb 2024, 15:35:03 JST by Powell, Daniel on behalf of UNU Office of Communications