The Use of Synthetic Data to Train AI Models: Opportunities and Risks for Sustainable Development
Tshilidzi Marwala, Eleonore Fournier-Tombs and Serge Stinckwich (2023). The Use of Synthetic Data to Train AI Models: Opportunities and Risks for Sustainable Development. UNU Technology Brief. United Nations University.
Document type:
Report
Attached Files:
UNU-TB_1-2023_The-Use-of-Synthetic-Data-to-Train-AI-Models.pdf (English PDF, application/pdf, 460.02KB)
UNU-TB_1-2023_Use-of-Synthetic-Data-to-Train-AI-Models_CN.pdf (Chinese PDF, application/pdf, 668.65KB)
UNU-TB_1-2023_Use-of-Synthetic-Data-to-Train-AI-Models_JP.pdf (Japanese PDF, application/pdf, 636.00KB)
Sub-type: Policy brief
Author: Tshilidzi Marwala; Eleonore Fournier-Tombs; Serge Stinckwich
Title: The Use of Synthetic Data to Train AI Models: Opportunities and Risks for Sustainable Development
Series Title: UNU Technology Brief
Volume/Issue No.: 1
Publication Date: 2023-09
Place of Publication: Tokyo
Publisher: United Nations University
Pages: 5
Language: eng; jpn
Abstract:
Using synthetic or artificially generated data in training AI algorithms is a burgeoning practice with significant potential. It can address data scarcity, privacy, and bias issues, but it also raises concerns about data quality, security, and ethical implications. This issue is heightened in the global South, where data scarcity is much more severe than in the global North. Synthetic data, therefore, addresses the problem of missing data, leading, in the best case, to better representation of populations in datasets and more equitable outcomes. However, we cannot consider synthetic data to be better than, or even equivalent to, actual data from the physical world. In fact, there are many risks to using synthetic data, including cybersecurity risks, bias propagation, and simply an increase in model error. This policy brief proposes recommendations for the responsible use of synthetic data in AI training and the associated guidelines to regulate the use of synthetic data.
Copyright Holder: United Nations University
Copyright Year: 2023
Copyright type: Creative commons
ISBN: 9789280891454
Access Statistics: 578 Abstract Views, 1462 File Downloads
Created: Mon, 04 Sep 2023, 16:16:47 JST by Powell, Daniel on behalf of UNU Centre