Welcome!

Mobile IoT Authors: Jason Bloomberg, Zakia Bouachraoui, Kevin Benedict, Pat Romanski, Yeshim Deniz

Related Topics: Agile Computing, Artificial Intelligence, @CloudExpo, @DXWorldExpo, @ThingsExpo

Agile Computing: Blog Post

Data Warehouse and Data Lake | @ExpoDX @Schmarzo #BigData #Analytics #DataLake #AI #ArtificialIntelligence

Maybe the best way to understand today’s role of the data warehouse is with a bit of history

This blog was written with the thoughtful assistance of David Leibowitz, Dell EMC Director of Business Intelligence, Analytics & Big Data

So data warehousing may not be cool anymore, you say? It’s yesterday’s technology (or 1990’s technology if you’re as old as me) that served yesterday’s business needs. And while it’s true that recent big data and data science technologies, architectures and methodologies seems to have rendered data warehousing to the back burner, it is entirely false that there is not a critical role for the data warehouse and Business Intelligence in digitally transformed organizations.

Maybe the best way to understand today’s role of the data warehouse is with a bit of history. And please excuse us if we take a bit of liberty with history (since we were there for most of this!).

Phase 1: The Data Warehouse Era
Phase 1: In the beginning, Gods (Ralph Kimble and Bill Inmon, depending upon your data warehouse religious beliefs) created the data warehouse. And it was good. The data warehouse, coupled with Business Intelligence (BI) tools, served the management and operational reporting needs of the organization so that executives and line-of-business managers could quickly and easily understand the status of the business, identify opportunities, and highlight potential areas of under-performance (see Figure 1).

Figure 1: The Data Warehouse Era

The data warehouse served as a central integration point; collecting, cleansing and aggregating a variety of data sources from AS/400, relational and file based (such as EDI). For the first time, data from supply chain, warehouse management, AP/AR, HR, point of sale was available in a “single version of the truth.”

Using extraction-transform-load (ETL) processing wasn’t always quick, and could require a degree of technical gymnastics to bring together all of these disparate data sources. At one point, the “enterprise service bus” entered the playing field to lighten the load on ETL maintenance, but routines quickly went from proprietary data sources, to proprietary (and sometimes arcane) middleware business logic code (anyone remember Monk?).

The data warehouse supported reports and interactive dashboards that enabled business management to have a full grasp on the state of the business. That said, report authoring was static and not really enabled for democratizing data. Typically, the nascent concept of self-service BI was limited to cloning a subset of the data warehouse to smaller data marts, and extracts to Excel for business analysis purposes. This proliferation of additional data silos created reporting environments that were out of sync (remember the heated sales meetings where teams couldn’t agree as to which report figures were correct?) and the analysis paralysis caused by spreadmarts meant that more time was spent working the data rather than driving insight. But we all dealt with it, as it was agreed that some information (no matter the effort it took to acquire) was more important that no data.

Phase 2: Optimize the Data Warehouse
But IT man grew unhappy with being held captive by proprietary data warehouse vendors. The costs of proprietary software and expensive hardware (and let’s not even get started on user-defined functions in PL/SQL and proprietary SQL extensions that created architectural lock-in) forced organizations to limit the amount and granularity of data in the data warehouse. IT Man grew restless and looked for ways to reduce the costs associated with operating these proprietary data warehouses while delivering more value to Business Man.

Then Hadoop was born out of the ultra-cool and hip labs of Yahoo. Hadoop provided a low-cost data management platform that leveraged commodity hardware and open sources software that was an estimated to be 20x to 100x cheaper than proprietary data warehouses.

Man soon realized the financial and operational benefits afforded by a commodity-based, natively parallel, open source Hadoop platform to provide an Operational Data Store (now that’s really going old school!) to off-load those nasty Extract Load and Transform (ETL) processes off the expensive data warehouse (see Figure 2).

Figure 2: Optimize the Data Warehouse

The Hadoop-based Operational Data Store was deemed very good as it helped IT Man to decrease spending on the data warehouse (guess not so good if you were a vendor of those proprietary data warehouse solutions…and you know who you are T-man!). Since it’s estimated that ETL consumes 60% to 90% of the data warehouse processing cycles, and since some vendors licensed their products based upon those cycles – this concept of “ETL Offload” could provide substantial cost reductions. So in an environment limited by Service Level Agreements (because outside of Doc Brown’s DeLorean equipped with a flux capacitor, there’s still only 24 hours in a day in which to do all the ETL work), Hadoop provided a low-cost, high-performance environment for dramatically slowing the investment in proprietary data warehouse platforms.

Things were getting better, but still weren’t perfect. While IT Man could shave costs, he couldn’t make the tools easy to use by simple data consumers (like Executive Man). And while Hadoop was great for storing unstructured and semi-structured data, it couldn’t always keep up to the speed relied upon for relational or cube based reporting from traditional transactional systems.

See blog “The Data Warehouse Modernization Act” for more details on the role of the Hadoop-based Operational Data Store and how it has helped to “modernize” today’s existing data warehouse environment.

Phase 3: Introducing Data Science
Then God created the Data Scientists, or maybe it was the Devil based upon one’s perspective. The data scientists needed an environment where they could rapidly ingest high volumes of granular structured (tables), semi-structured (log files) and unstructured data (text, video, images). They realized that data beyond the firewall was needed in order to drive intelligent insight. Data such as weather, social, sensor and third party could be mashed up with the traditional data stores in the EDW and Hadoop to determine customer insight, customer behavior and product effectiveness. This made Marketing Man happy. The scientists needed an environment where they could quickly test new data sources, new data transformations and enrichments, and new analytic techniques in search of those variables and metrics that might be better predictors of business and operational performance. Thusly, the analytic sandbox, which also runs on Hadoop, was born (see Figure 3).

Figure 3: Introducing Data Science

The characteristics of a data science “sandbox” couldn’t be more different than the characteristics of a data warehouse:

Finance Man tried desperately to combine these two environments but the audiences, responsibilities and business outcomes were just too varying to create an cost-effectively business reporting and predictive analytics in single bubble.

Ultimately, the analytic sandbox became one of the drivers for the creation of the data lake that could support both the data science and data warehousing (Operational Data Store) needs.

Data access was getting better for the data scientists but we again were moving towards proprietary process and a technical skill reserved for the elite. Still, things were good as IT Man, Finance Man and Marketing Man could work through the data scientists to drive innovation. But they soon wanted more.

See the following blogs for more details on the complementary nature of the data warehouse and the data lake:

Phase 4: Creating Actionable Dashboards
But Executive Man was still unsatisfied. The Data Scientists were developing wonderful predictions about what was likely to happen and prescriptions about what to do, but the promise of self-service BI was missing. Instead of the old days, and having to run to IT Man for reports, now he was requesting them of the Data Scientist.

The reports and dashboards created to support executive and front-line management in Stage 1 were the natural channel for rendering the predictive and prescriptive insights, effectively closing the loop between the data warehouse and the data lake. With data visualization tools like Tableau and Power BI, IT Man could finally deliver on the promise of self-service BI by providing interactive descriptive and predictive dashboards that even Executive Man could operate (see Figure 4).

Figure 4: Closing the Analytics Loop

See the blog “Creating Actionable Dashboards” for more details on how to convert existing reports and dashboards into actionable reports and dashboards!

And Man was happy (until the advent of Terminator robots began making decisions for us).

The post Data Warehouse and Data Lake Analytics Collaboration appeared first on InFocus Blog | Dell EMC Services.

 

At CloudEXPO Silicon Valley, June 24-26, 2019, Digital Transformation (DX) is a major focus with expanded DevOpsSUMMIT and FinTechEXPO programs within the DXWorldEXPO agenda. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive over the long term. A total of 88% of Fortune 500 companies from a generation ago are now out of business. Only 12% still survive. Similar percentages are found throughout enterprises of all sizes.

Register Today and SAVE ▸ Here

Speaking Opportunities ▸ Here

Sponsorship & Exhibit Opportunities ▸ Here

Silicon Valley Faculty ▸ Here

Silicon Valley Schedule ▸ Here

CloudEXPO Has Been the M&A Capital For Cloud Companies

CloudEXPO has been the M&A capital for Cloud companies for more than a decade with memorable acquisition news stories which came out of CloudEXPO expo floor. DevOpsSUMMIT New York faculty member Greg Bledsoe shared his views on IBM's Red Hat acquisition live from NASDAQ floor. Acquisition news was announced during CloudEXPO New York which took place November 12-13, 2019 in New York City.

Our Silicon Valley 2019 schedule will showcase 200 keynotes, sessions, general sessions, power panels, and hands on tutorials presented by 150 rockstar speakers in 10 hottest conference tracks of 2019:

» CloudEXPO
» DevOpsSUMMIT
» ServerlessSUMMIT
» Kubernetes at CloudEXPO
» FinTechEXPO Blockchain
» DXWorldEXPO Digital Transformation
» AI | ML | DL | Artificial Intelligence
» Big Data | Analytics
» IoT | IIoT | Smart Cities
» Mobility | Security
» Enterprise Cloud Hot Topics

CloudEXPO Silicon Valley 2019 Show Prospectus ▸ HERE

Prospectus At-a-Glance ▸ HERE
Attendee Profile ▸ HERE
Keynote Opportunities ▸ HERE
General Session Opportunities ▸ HERE
Diamond Sponsorship Opportunity ▸ HERE
Platinum Sponsorship Opportunity ▸ HERE
Gold and Silver Sponsorship Opportunities ▸ HERE
Bronze Sponsorship and Exhibitor Packages ▸ HERE
Benefits of Exhibiting at CloudEXPO 2019 ▸ HERE

CloudEXPO is the single event where technology buyers and vendors meet to experience and discus cloud computing and all that it entails. For more than a decade, sponsors and exhibitors of CloudEXPO benefit from unmatched branding, profile building and lead generation opportunities through our following unique tools. For more information on sponsorship, exhibit, and keynote opportunities call us at 954 242-0444 or contact us ▸ Here

  • Featured on-site presentation and ongoing on-demand webcast exposure to a captive audience of industry decision-makers
  • Showcase exhibition during our new extended dedicated expo hours
  • Breakout Session Priority scheduling for Sponsors that have been guaranteed a 40-minute technical session
  • Online advertising on 4,5 million article pages in SYS-CON's leading i-Technology Publications
  • Capitalize on our Comprehensive Marketing efforts leading up to the show with print mailings, e-newsletters and extensive online media coverage
  • Unprecedented PR Coverage: Unmatched editorial coverage on Cloud Computing Journal
  • Tweetup to over 184,000 plus Twitter followers
  • Press releases sent on major wire services to over 500 industry analysts

FinTech and Blockchain Are Now Part of CloudEXPO 2019 Program

Financial enterprises in New York City, London, Singapore, and other world financial capitals are embracing a new generation of smart, automated FinTech that eliminates many cumbersome, slow, and expensive intermediate processes from their businesses.

Accordingly, attendees at the upcoming 23rd CloudEXPO, June 24-26, 2019 at Santa Clara Convention Center in Santa Clara, CA will find fresh new content in full new FinTech & Enterprise Blockchain track.

ServerlessSUMMIT & Kubernetes at CloudEXPO Silicon Valley

As you know, enterprise IT conversation over the past year have often centered upon the open-source Kubernetes container orchestration system. In fact, Kubernetes has emerged as the key technology -- and even primary platform -- of cloud migrations for a wide variety of organizations. Kubernetes is critical to forward-looking enterprises that continue to push their IT infrastructures toward maximum functionality, scalability, and flexibility.

Kubernetes is critical to forward-looking enterprises that continue to push their IT infrastructures toward maximum functionality, scalability, and flexibility.



As they do so, IT professionals are also embracing the reality of Serverless architectures, which are critical to developing and operating real-time applications and services. Serverless is particularly important as enterprises of all sizes develop and deploy Internet of Things (IoT) initiatives.

Serverless and Kubernetes are great examples of continuous, rapid pace of change in enterprise IT. They also raise a number of critical issues and questions about employee training, development processes, and operational metrics.

DevOpsSUMMIT at CloudEXPO Celebrates Its 12th Event in Six Years

Cloud-Native thinking and Serverless Computing are now the norm in financial services, manufacturing, telco, healthcare, transportation, energy, media, entertainment, retail and other consumer industries, as well as the public sector.

The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.

ServerlessSUMMIT and DevOpsSUMMIT at CloudEXPO expands the DevOps community, enable a wide sharing of knowledge, and educate delegates and technology providers alike.

There's a real need for serious conversations about Serverless and Kubernetes among the people who are doing this work and managing it.

So we are very pleased today to announce the ServerlessSUMMIT at CloudEXPO.

DXWorldEXPO Showcases Cutting-Edge IoT, Artificial Intelligence, Machine Learning, and Digital Transformation

Now is the time for a truly global DX event, to bring together the leading minds from the technology world in a conversation about Digital Transformation. DX encompasses the continuing technology revolution, and is addressing society's most important issues throughout the entire $78 trillion 21st-century global economy.

DXWorldEXPO® has organized these issues along 10 tracks, 22 keynotes and general sessions, and a faculty of 222 of the world's top speakers.

DXWorldEXPO® has three major themes on its conference agenda:

Technology - The Revolution Continues
Economy - The 21st Century Emerges
Society - The Big Issues

Global 2000 companies have more than US$40 trillion in annual revenue - more than 50% of the world's entire GDP. The Global 2000 spends a total of US$2.4 trillion annually on enterprise IT. The average Global 2000 company has US$11 billion in annual revenue. The average Global 2000 company spends more than $600 million annually on enterprise IT. Governments throughout the world spend another US$500 billion on IT - much of it dedicated to new Smart City initiatives.

For the past 10 years CloudEXPO® helped drive the migration to modern enterprise IT infrastructures, built upon the foundation of cloud computing. Today's hybrid, multiple cloud IT infrastructures integrate Big Data, analytics, blockchain, the IoT, mobile devices, and the latest in cryptography and enterprise-grade security.

Digital Transformation is the key issue driving the global enterprise IT business. DX is most prominent among Global 2000 enterprises and government institutions.

About DXWorldEXPO LLC

DXWorldEXPO LLC is a Lighthouse Point, Florida-based trade show company and the creator of DXWorldEXPO - Digital Transformation Conference & Expo. The company produces and presents the world's most influential technology events including CloudEXPO, DevOpsSUMMIT, and FinTechEXPO.

Read the original blog entry...

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business” and “Big Data MBA: Driving Business Strategies with Data Science”, is responsible for setting strategy and defining the Big Data service offerings for Hitachi Vantara as CTO, IoT and Analytics.

Previously, as a CTO within Dell EMC’s 2,000+ person consulting organization, he works with organizations to identify where and how to start their big data journeys. He’s written white papers, is an avid blogger and is a frequent speaker on the use of Big Data and data science to power an organization’s key business initiatives. He is a University of San Francisco School of Management (SOM) Executive Fellow where he teaches the “Big Data MBA” course. Bill also just completed a research paper on “Determining The Economic Value of Data”. Onalytica recently ranked Bill as #4 Big Data Influencer worldwide.

Bill has over three decades of experience in data warehousing, BI and analytics. Bill authored the Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements. Bill serves on the City of San Jose’s Technology Innovation Board, and on the faculties of The Data Warehouse Institute and Strata.

Previously, Bill was vice president of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Website analytics products, including the delivery of “actionable insights” through a holistic user experience. Before that, Bill oversaw the Analytic Applications business unit at Business Objects, including the development, marketing and sales of their industry-defining analytic applications.

Bill holds a Masters Business Administration from University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science and Business Administration from Coe College.

IoT & Smart Cities Stories
While the focus and objectives of IoT initiatives are many and diverse, they all share a few common attributes, and one of those is the network. Commonly, that network includes the Internet, over which there isn't any real control for performance and availability. Or is there? The current state of the art for Big Data analytics, as applied to network telemetry, offers new opportunities for improving and assuring operational integrity. In his session at @ThingsExpo, Jim Frey, Vice President of S...
In his keynote at 18th Cloud Expo, Andrew Keys, Co-Founder of ConsenSys Enterprise, provided an overview of the evolution of the Internet and the Database and the future of their combination – the Blockchain. Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life settl...
@CloudEXPO and @ExpoDX, two of the most influential technology events in the world, have hosted hundreds of sponsors and exhibitors since our launch 10 years ago. @CloudEXPO and @ExpoDX New York and Silicon Valley provide a full year of face-to-face marketing opportunities for your company. Each sponsorship and exhibit package comes with pre and post-show marketing programs. By sponsoring and exhibiting in New York and Silicon Valley, you reach a full complement of decision makers and buyers in ...
Two weeks ago (November 3-5), I attended the Cloud Expo Silicon Valley as a speaker, where I presented on the security and privacy due diligence requirements for cloud solutions. Cloud security is a topical issue for every CIO, CISO, and technology buyer. Decision-makers are always looking for insights on how to mitigate the security risks of implementing and using cloud solutions. Based on the presentation topics covered at the conference, as well as the general discussions heard between sessio...
The Internet of Things is clearly many things: data collection and analytics, wearables, Smart Grids and Smart Cities, the Industrial Internet, and more. Cool platforms like Arduino, Raspberry Pi, Intel's Galileo and Edison, and a diverse world of sensors are making the IoT a great toy box for developers in all these areas. In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists discussed what things are the most important, which will have the most profound e...
The Jevons Paradox suggests that when technological advances increase efficiency of a resource, it results in an overall increase in consumption. Writing on the increased use of coal as a result of technological improvements, 19th-century economist William Stanley Jevons found that these improvements led to the development of new ways to utilize coal. In his session at 19th Cloud Expo, Mark Thiele, Chief Strategy Officer for Apcera, compared the Jevons Paradox to modern-day enterprise IT, examin...
Rodrigo Coutinho is part of OutSystems' founders' team and currently the Head of Product Design. He provides a cross-functional role where he supports Product Management in defining the positioning and direction of the Agile Platform, while at the same time promoting model-based development and new techniques to deliver applications in the cloud.
There are many examples of disruption in consumer space – Uber disrupting the cab industry, Airbnb disrupting the hospitality industry and so on; but have you wondered who is disrupting support and operations? AISERA helps make businesses and customers successful by offering consumer-like user experience for support and operations. We have built the world’s first AI-driven IT / HR / Cloud / Customer Support and Operations solution.
LogRocket helps product teams develop better experiences for users by recording videos of user sessions with logs and network data. It identifies UX problems and reveals the root cause of every bug. LogRocket presents impactful errors on a website, and how to reproduce it. With LogRocket, users can replay problems.
Rafay enables developers to automate the distribution, operations, cross-region scaling and lifecycle management of containerized microservices across public and private clouds, and service provider networks. Rafay's platform is built around foundational elements that together deliver an optimal abstraction layer across disparate infrastructure, making it easy for developers to scale and operate applications across any number of locations or regions. Consumed as a service, Rafay's platform elimi...