But while data production rose 22 per cent, data storage increased by only 0.95ZB last year – less than 3 per cent of the amount of data generated in the same period, the survey said.

The survey acknowledged China’s data generation and storage conversion ratio was still low and that “the potential value of data needs further exploration”.

It also noted that use of stored data remained low – nearly 40 per cent of data stored by the companies last year was not read or reused after being stored.

“The insufficiency of data processing capabilities leads to an underestimation and difficulties in uncovering and reusing a large amount of data,” the report said.

Professor Andy Chun from City University of Hong Kong’s College of Business said China’s 3 per cent figure for storing data was consistent with global measures that found only a fraction of data generated was preserved.

According to German statistics portal Statista, 2 per cent of the data created and consumed in 2020 was stored for use in 2021.

“There are many reasons for this selective retention, with data privacy and security being paramount. Most nations enforce regulations that restrict data storage to what is necessary for defined purposes, mandating its deletion once it no longer serves those ends,” Chun said.

“Storing vast quantities of data not only presents security vulnerabilities but also entails significant costs and technological challenges.

“The infrastructure required to store such volumes of real-time data demands continual advancements in storage solutions, which can be cost-prohibitive,” said Chun, who is an adviser to the AI Specialist Group of the Hong Kong Computer Society.

Chun said he anticipated an imminent and substantial rise in China’s data retention rates, propelled by the embrace of generative AI technologies around the world.

Successful AI output depended on both the volume and quality of the underlying data, he said, adding that as the trend veered towards more personalised generative AI applications, it was likely that more personal data would be kept to train AI models.

“To accommodate this growth, it would be prudent for China to channel investments into the advancement of storage technologies, aiming to enhance capacity and drive down costs. This strategic focus could support the burgeoning AI-driven demands while fostering innovation across the industry,” Chun said.

The National Data Resources Survey Report 2023 also urged China’s large enterprises to invest in digital transformation. Some 22 per cent of companies surveyed said they still had no data management system. Among those that had undergone a digital transformation, only 8 per cent reused their data and achieved additional value from it.

“There is still a long way to go to explore the full value of data,” the survey said.

But it also noted that China’s demand for quality data products remained very strong, with demand reaching 1.75 times that of supply, according to the survey’s results relating to China’s data exchange centres.

Chun said that although there was no comparable American survey, he inferred from population size that the volume of data generated in the US was significantly less than in China.

The survey said demand for computing power for large AI model training was expected to remain high, and that demand from scientific institutions, government affairs, finance and other industries had also increased accordingly. It recommended that China accelerate the construction of its national integrated computing power system to meet the demand.

The National Data Resources Survey Report predicted China’s data production would increase more than 25 per cent in 2024, driven by large-scale application of new technologies, such as satellite communications, self-driving cars and generative AI.

Jiang Yan, director of the National Industrial Information Security Development Research Centre, which was in charge of the survey, said China had an initial scale advantage for its data resources.

“But more needs to be done to release the potential of massive data, as China’s data resource management and utilisation are in the initial stage as a whole,” Jiang was quoted as saying by China’s Daily Economic News.

CityU’s Chun warned that the expansion of personal data storage “must be carefully managed, with vigilant adherence to privacy regulations, ethical standards, and robust data protection protocols”.

He added that for sustainable data growth, China must engage in strategic investment beyond merely augmenting storage capacity.

Advancing data management practices that prioritise quality, security and governance was essential. Such a comprehensive approach was vital to fully leverage the capabilities of generative AI, ensuring that the principles of responsible AI were maintained, he said.