Typically, this refers to data that comes from outside a company’s own business processes. Examples include social media activity, satellite photos, location tracking data, credit card transactions, and web scraping.
Alternative data can be used across the enterprise, from marketing and sales to finance and strategic planning, but most of this third-party data is owned and managed by IT departments. In 2019, Forrester Research found that CIOs and CDOs in IT were managing 56% of alternative data collection.
Sourcing, storing and managing alternative data presents new challenges for IT managers. It can also entail significant, sometimes unnecessary, costs. Here, we look at five such problems and how to mitigate them.
1. Vendor selection costs
According to Lowenstein’s survey results, the cost of vendor selection (61%) is the biggest concern for alternative data users. This cost is incurred in reviewing alternative data vendors and ensuring that the data a vendor provides is of sufficient quality. This is especially important when the data is a key component of business processes and cannot be easily replaced. In these circumstances, the buyer needs confidence that the vendor will continue to provide the data for the foreseeable future.
One way to mitigate this risk is to research industry consortia to identify trusted data sources. Other companies in the same field are likely to have similar requirements and may share ideas and best practices.
2. Finding appropriately skilled staff
According to a survey by Quanthub, there was a shortage of about 250,000 data scientists in 2020. At the end of April 2022, the job site Indeed.com listed 2,700 postings for data scientists in the UK alone. This talent shortage is driving up salaries and making it more difficult to retain employees. Moreover, having a data scientist on staff does not mean you can integrate alternative data into your business.
Forrester Research recommends that businesses use a ‘data hunter’ service that tracks available alternative data and validates the accuracy and integrity of these sources. European reinsurance company Munich Re, for example, employs a team of 20 data hunters for this purpose.
Other ways to alleviate the skills shortage include training existing staff, who already know the business and its requirements, rather than hiring new people. Partnering with universities that are seeking support for data science courses, student placements and graduate programs is another way to build a skills pipeline.
3. Ascertaining data ownership
Because alternative data is based on non-traditional sources, verifying ownership can be more difficult than it is with data from a trusted vendor. This is especially true when the sourcing is complex, for example when multiple data sources are combined prior to purchase. Issues may arise with respect to licensing, intellectual property rights and data protection regulations. You can alleviate this problem by choosing a trusted vendor that offers customers some degree of transparency about how it sources its data. Of course, using internal data whenever possible is another way to reduce risk.
4. Updating models to process alt data
Maintaining the data model to ensure consistency and handling errors when they occur is expensive. Many companies overlook this. Idera estimates that maintenance accounts for 50 to 80 percent of a development budget. Adding a new data source to a model can add significant cost. Carefully modeling the data from the start and incorporating some flexibility into the model design can facilitate this process.
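One way to read "incorporating some flexibility into the model design" is to keep a small fixed core of fields and route anything source-specific into an open-ended mapping, so that a new source does not force a schema change. The following is a minimal Python sketch under that assumption. The `Observation` class, its field names and the `to_observation` helper are hypothetical illustrations, not something described in the article or taken from a specific product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


# Hypothetical core model: the fixed fields are those every source is
# assumed to supply; anything source-specific goes into `attributes`.
@dataclass
class Observation:
    source: str            # name of the data vendor or feed (illustrative)
    entity_id: str         # the business entity the record describes
    observed_at: datetime  # when the observation was made
    value: float           # the main numeric measurement
    attributes: dict[str, Any] = field(default_factory=dict)


CORE_FIELDS = {"source", "entity_id", "observed_at", "value"}


def to_observation(raw: dict[str, Any]) -> Observation:
    """Build an Observation, keeping unknown fields instead of rejecting them."""
    missing = CORE_FIELDS - raw.keys()
    if missing:
        raise ValueError(f"missing core fields: {sorted(missing)}")
    # Fields the core model does not know about are preserved, not dropped.
    extras = {k: v for k, v in raw.items() if k not in CORE_FIELDS}
    return Observation(
        source=raw["source"],
        entity_id=raw["entity_id"],
        observed_at=raw["observed_at"],
        value=float(raw["value"]),
        attributes=extras,
    )
```

With this shape, a feed that later starts sending an extra field (say, a hypothetical `cloud_cover` value on satellite records) still loads, and the extra field is available in `attributes` until someone decides whether it deserves a place in the core model.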
5. Appropriate tools to store alt data
A quarter of respondents to a Lowenstein survey cited the lack of tools and skills to store alternative data as a serious problem. The problem stems from the lack of consistency between sources in update frequency, APIs and data formats. Organizing data so that models run smoothly and produce consistent, reliable results can be expensive. Across on-premises systems, cloud and hybrid solutions, getting a growing number of storage options to work efficiently with the ingestion requirements of data models adds another layer of complexity and cost.
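Inconsistent formats of the kind described above are often handled by giving each source a small adapter that converts its payload into one shared record shape. The sketch below illustrates that idea in Python using only the standard library. The two sample feeds, their field names and the `normalize_*` and `ingest` helpers are hypothetical and chosen only for illustration.

```python
import csv
import io
import json
from datetime import datetime, timezone


def normalize_csv_feed(text: str) -> list[dict]:
    """Hypothetical vendor A: CSV with 'store', 'date' (YYYY-MM-DD), 'visits'."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "entity_id": row["store"],
            "observed_at": datetime.strptime(row["date"], "%Y-%m-%d").replace(
                tzinfo=timezone.utc
            ),
            "value": float(row["visits"]),
        }
        for row in rows
    ]


def normalize_json_feed(text: str) -> list[dict]:
    """Hypothetical vendor B: JSON list with 'id', 'ts' (Unix seconds), 'count'."""
    return [
        {
            "entity_id": item["id"],
            "observed_at": datetime.fromtimestamp(item["ts"], tz=timezone.utc),
            "value": float(item["count"]),
        }
        for item in json.loads(text)
    ]


# Registry mapping a source name to its adapter; supporting a new source
# means writing one function and adding one entry here.
ADAPTERS = {"vendor_a": normalize_csv_feed, "vendor_b": normalize_json_feed}


def ingest(source: str, payload: str) -> list[dict]:
    """Convert a raw payload from a known source into the shared record shape."""
    try:
        adapter = ADAPTERS[source]
    except KeyError:
        raise ValueError(f"no adapter registered for source {source!r}") from None
    return adapter(payload)
```

Everything downstream of `ingest` sees the same three keys regardless of which vendor supplied the data, which keeps format differences out of the models themselves. Differences in update frequency and API behavior would still need separate handling.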
The importance of alternative data will grow as data continues to provide a source of competitive advantage for companies that can exploit it commercially. While accessing many alternative data sources may incur little or no cost, there can sometimes be significant extra costs associated with making the data fit for purpose and integrating it into existing workflows.
* Martin De Saulles, PhD, is a writer and academic specializing in data-driven innovation and the Internet of Things. He currently works as a Senior Lecturer at the University of Brighton, UK.
Source: ITWorld Korea by www.itworld.co.kr.