The future of DWs in the age of Big Data

Kirsty Garshong, our consultant managing the role
Posting date: 7/27/2013 2:47 PM

Many companies are saddled with data warehouses that weren’t designed to handle big data, but they can evolve their data warehouses into “analytics warehouses” capable of processing structured and unstructured data.

Enterprise data warehouses have reached a crossroads. Companies have spent millions of dollars designing, implementing, and updating them, but few organizations have realized the return they expected from their investments, according to Richard Solari, a director with Deloitte Consulting LLP’s Information Management service line.

The disappointing ROI largely stems from an inherent inadequacy in data warehouses: They were designed to handle the kind of structured data stored in ERP systems, not the unstructured data from social media, mobile devices, Web traffic, and other sources now streaming into enterprises. By Solari’s estimate, 90 percent of the data warehouses he’s observed process just 20 percent of an enterprise’s data. Consequently, many enterprises have only been able to use their data warehouses for historical analysis and past performance reporting.

The bottom line, says Solari: “Companies are using expensive infrastructure to generate back office reports.”

Organizations’ prospects for obtaining an acceptable return from their data warehousing investments may continue to diminish as long as this infrastructure fails to keep pace with big data.

The good news: Vendors are building new generations of data warehouses with advanced statistical capabilities for performing analytics and forecasting, according to Robert Stackowiak, Oracle Corp.’s vice president of information architecture and big data. They’re also improving integration with emerging platforms like Hadoop that process large volumes of unstructured data, he says.

Because newer generations of data warehouses are designed to federate structured and unstructured data, they may provide enterprises with a 360-degree view of their operations and, with that broader perspective, the ability to make better decisions about the future, according to Solari.
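The "federation" Solari describes can be pictured as joining structured transaction records with unstructured signals about the same customer. The sketch below is a minimal, purely illustrative stand-in: the record shapes, field names, and `federate` function are all hypothetical, not any vendor's API.

```python
from collections import defaultdict

# Hypothetical structured records, the kind an ERP/CRM feed provides.
crm_orders = [
    {"customer_id": "C1", "order_total": 120.0},
    {"customer_id": "C1", "order_total": 80.0},
    {"customer_id": "C2", "order_total": 45.0},
]

# Hypothetical unstructured social-media mentions, tagged by customer.
social_mentions = [
    {"customer_id": "C1", "text": "love the new release"},
    {"customer_id": "C2", "text": "shipping was slow"},
    {"customer_id": "C2", "text": "support resolved my issue"},
]

def federate(orders, mentions):
    """Combine structured spend with unstructured mention counts
    into a single per-customer view."""
    view = defaultdict(lambda: {"total_spend": 0.0, "mention_count": 0})
    for order in orders:
        view[order["customer_id"]]["total_spend"] += order["order_total"]
    for mention in mentions:
        view[mention["customer_id"]]["mention_count"] += 1
    return dict(view)

customer_view = federate(crm_orders, social_mentions)
```

The point of the toy example is the shape of the result: one row per customer combining both worlds, which is what gives downstream decision-makers the broader perspective the article describes.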

Companies running legacy data warehouses don’t have to junk their infrastructure and start anew. Solari says they can add capabilities to their existing data warehouse infrastructure that can allow it to grow into an “analytics warehouse.”

“Data warehouses are going to look very different in five years, and organizations should begin preparing for that transition,” says Solari.

Introducing the Analytics Warehouse

Fundamentally, the analytics warehouse functions as a central repository for an enterprise’s structured and unstructured data. In a traditional data warehousing architecture, structured data from ERP systems, CRM systems, file shares, and line-of-business applications is batch processed into the enterprise data warehouse using ETL (extract, transform, load) database processes. Ad hoc query tools and business intelligence systems then draw data from the warehouse environment, which may include operational data stores and data marts, to generate reports for users.
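The traditional batch ETL flow described above can be sketched in a few lines. This is a deliberately simplified illustration: the source feed, field names, and in-memory "warehouse" list are invented for the example, not a real warehouse interface.

```python
def extract(source_rows):
    """Pull raw rows from a source system (here, an in-memory stand-in)."""
    return list(source_rows)

def transform(rows):
    """Conform raw rows to the warehouse schema: normalise keys, cast types."""
    return [
        {"customer_id": r["cust"].strip().upper(), "amount": float(r["amt"])}
        for r in rows
    ]

def load(warehouse_table, rows):
    """Append conformed rows to the warehouse table."""
    warehouse_table.extend(rows)

# A nightly batch: ERP rows flow source -> ETL -> warehouse table.
erp_feed = [{"cust": " c1 ", "amt": "19.99"}, {"cust": "c2", "amt": "5.00"}]
warehouse_sales = []
load(warehouse_sales, transform(extract(erp_feed)))
```

Note the defining property of this classic pipeline: it runs on a schedule, on structured rows only, which is exactly the limitation the analytics warehouse extends.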

The architecture for the analytics warehouse builds on the traditional data warehouse architecture in three primary ways:

1. A distributed file system (such as Hadoop’s HDFS) sits between source data systems and the data warehouse. It collects, aggregates, and processes huge volumes of unstructured data, and stages it for loading into the data warehouse.

2. Structured and unstructured data from back-end systems can be brought into the data warehouse in real time and near-real time.

3. Engines that use statistical and predictive modeling techniques to perform data discovery, visualization, inductive and deductive reasoning, and real-time decision-making reside between the data warehouse and end users. These engines identify patterns in big data. They can also complement and feed traditional ad hoc querying tools and business intelligence applications.
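The three layers above can be sketched end to end as a toy pipeline: a staging step standing in for the distributed file system, a load step for the warehouse, and a simple pattern-finding function standing in for the analytics engine. All names and data here are hypothetical; real systems would use Hadoop, a warehouse platform, and statistical engines, not Python dictionaries.

```python
from collections import Counter

# Layer 1: staging stand-in for the distributed file system, which
# aggregates raw unstructured text before it reaches the warehouse.
def stage_unstructured(raw_docs):
    return [doc.lower().split() for doc in raw_docs]

# Layer 2: a load path that accepts both structured and staged
# unstructured data (in a real system, in near-real time).
warehouse = {"structured": [], "unstructured": []}

def load_realtime(structured_rows, staged_docs):
    warehouse["structured"].extend(structured_rows)
    warehouse["unstructured"].extend(staged_docs)

# Layer 3: a trivial "engine" that surfaces patterns in the unstructured
# data for BI tools and ad hoc query tools to consume.
def top_terms(n):
    counts = Counter(t for doc in warehouse["unstructured"] for t in doc)
    return counts.most_common(n)

load_realtime(
    [{"order_id": 1, "amount": 42.0}],
    stage_unstructured(["Great product", "great service", "slow delivery"]),
)
```

The design point the toy preserves: pattern discovery happens downstream of the warehouse, on data the traditional architecture never captured.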


“In the past, companies couldn’t integrate these disparate technologies with the data warehouse because each technology required different file formats and data schemas,” says Stackowiak. “Today, you can integrate these technologies, and the result is that companies can access more of their data—not just the 20 percent from enterprise systems—and convert it into valuable, profitable information.”

Companies interested in building out their traditional data warehouse infrastructures may consider starting with reporting, if they don’t already have reporting capabilities in place, suggests Solari. Then, they can begin integrating analytics technologies into their reporting framework.

“When companies start bringing this data together and federating it inside a data warehouse, the total cost of ownership for the data warehouse may begin to go down while the ROI goes up,” says Solari. “The ability to integrate big data technologies, analytics technologies, back office systems, and traditional data warehouses has the potential to fundamentally change the economics of data warehousing for the better.”




