Overcoming Data Bottlenecks and Sparsity: Rockfish Data

Information Technology 2025-09-16 GSMA 灰灰

Rockfish Data is helping enterprises to desensitise real-world data and to train AI models

Alternatively, Rockfish AI’s models can be used to alter the real-world data so as to test a particular hypothesis. They can be configured to change specific fields in the real-world data, such as the take-up rate for an autopilot feature in a vehicle, and then see how that would impact other fields.

Executive Summary

Harnessing real-world data to train AI models can face major technical, commercial and regulatory obstacles. These obstacles can result in both data bottlenecks and data sparsity. Rockfish Data is employing generative AI to help enterprises overcome these challenges. The three-year-old company has developed AI models that can analyse real-world data and then either replace it or augment it with synthetic data.

Rockfish Data already has some notable customers, including Ford Motor Company, Conviva, US public sector agencies and Deutsche Telekom. It is initially targeting three sectors – cybersecurity/telecoms, financial services and supply chains – and supporting a wide range of use cases, from countering fraud and predictive maintenance to testing product concepts and anticipating scenarios.

Rockfish Data’s models generate a representation or statistical abstraction of the real-world operational data, which doesn’t leave the customer’s environment. For example, it could analyse the distribution, magnitude, timing and nature of financial transactions, as well as correlations between them, made by people in specific locations. It would then create a synthetic data set that is faithful to these patterns and correlations, but without any information that could be used to infer anything about the real-world transactions or the people that made them.

So far, the synthetic data produced by Rockfish Data has had a major impact on the effectiveness of AI model training.
The company says the accuracy of some models has increased by 20 to 30 percentage points. The solution has also enabled a significant reduction in the time it takes customers to develop new products and services.

Without the right data, artificial intelligence (AI) is far from intelligent. Even the most advanced AI models depend on high-quality and relevant data to work well.

Data sparsity is a different, but equally challenging, issue. In this case, there may be insufficient real-world data to enable an AI model to test a particular hypothesis or run a specific scenario. This can be a problem if a business wants to gauge the impact of relatively rare events. For example, a lack of real-world data can make it difficult for AI models to stress-test cyber-security or the resilience of a supply chain to a particular geopolitical scenario. Other examples include the need to assess a bank’s vulnerability to a new form of fraud or the potential impact of an unusual network fault (in the case of a telecoms operator).

As organisations across the economy strive to harness AI, many are hamstrung by either data bottlenecks or data sparsity or both. That’s the view of Muckai Girish, who has 30 years’ experience in the tech and telecoms sector, including with AT&T and Juniper Networks.

“You need some data, but it is not with you, it is with somebody else, either with a vendor or a partner or a customer, or another division,” Muckai Girish notes. “They are not able to give you the data as is, because of either confidentiality or some sort of regulatory or compliance issue.” In many jurisdictions, the sharing of personally-identifiable data is highly regulated. At the same time, many businesses don’t want to share commercially-sensitive information, even with partners and suppliers.

“Even though there is a lot of data, in many instances it’s like being in an ocean sitting on a boat: You want to drink sparkling water, but the only water that they have is salt water,” Muckai Girish says.
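
The idea of a "statistical abstraction" described in the executive summary can be illustrated with a deliberately simple sketch. The Python below is not Rockfish Data's method (the report says the company uses generative AI models, whose details are not given here). It assumes a toy bivariate Gaussian model, and the names `fit_abstraction` and `sample_synthetic` and the two numeric fields are hypothetical. It only shows the general pattern: summarise the real records as aggregate statistics, then draw fresh records from that summary rather than copying any real row.

```python
import math
import random
import statistics


def fit_abstraction(rows):
    """Summarise two numeric fields as means, standard deviations and correlation.

    Toy stand-in for a "statistical abstraction": only aggregates are kept.
    """
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_synthetic(model, n, rng):
    """Draw n synthetic rows from the fitted bivariate Gaussian."""
    rho = model["rho"]
    # Guard against tiny negative values caused by floating-point rounding.
    resid = math.sqrt(max(0.0, 1.0 - rho ** 2))
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = model["mx"] + model["sx"] * z1
        y = model["my"] + model["sy"] * (rho * z1 + resid * z2)
        out.append((x, y))
    return out


# Hypothetical "real" data: a transaction amount and a correlated second field.
rng = random.Random(0)
real = [(a, 0.5 * a + rng.gauss(0, 5))
        for a in (rng.gauss(100, 20) for _ in range(2000))]

model = fit_abstraction(real)                    # only summary statistics are stored
synthetic = sample_synthetic(model, 2000, rng)   # new rows, none copied from `real`
```

A Gaussian summary preserves only means, spreads and linear correlation. Real operational data has timing, categorical fields and rare events that a model this simple would miss, which is presumably why a generative approach is used in practice.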
Distilling value from data

This analogy is likely to resonate with telecoms operators - their networks generate oceans of data, but they often struggle to distil value from that. “Unfortunately, they are not able to leverage it to the extent that they could,” notes Muckai Girish. “They just haven’t been able to use it for everything from customer experience to operating networks well to overall business outcomes. All of that could be done so much better, and AI actually provides a fabric to make all that happen. But you need to really be able to use the data to do that.”

Telcos may face regulatory and technical challenges transferring and aggregating data from multiple systems and countries. The cost of storage can also mean that operators need to delete large amounts of data every week, while only retaining limited snapshots of the activity on their networks. As a result, much of the granularity that AI models need to become highly sophisticated may be lost.

Dr. Muckai Gi