OLAP is over! What’s next?

OLAP data warehouses play a very important role in today's big data analytics era.

But compared to the last decade, the volume of data used in analytics keeps increasing. AI and ML workloads are data-hungry: they need terabytes of data and run complex queries over it.

Modern BI tools such as Qlik Sense, Power BI, and Tableau can talk to a large-volume data warehouse. But if we want low latency over very large volumes of data, something needs to change in the architecture.

Change in Architecture

The first-generation platform architecture can be used for normal big data analytics.

Ref: http://cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf

In this architecture, as shown, BI tools talk directly to the data warehouse to do any analysis.

But if you think about implementing AI and ML on top of this, we have to think about unstructured data.
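To make the "BI tool talks directly to the warehouse" idea concrete, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for a real data warehouse, and the `sales` table, its columns, and the sample rows are all made up for illustration. A BI dashboard would issue a similar aggregate query behind the scenes.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a toy 'warehouse' (assumption: SQLite stands in for the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("EMEA", "laptop", 300.0),
            ("EMEA", "mouse", 50.0),
            ("APAC", "laptop", 200.0),
        ],
    )
    conn.commit()
    return conn


def revenue_by_region(conn: sqlite3.Connection) -> list:
    """The kind of roll-up query a BI tool sends straight to the warehouse."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for region, total in revenue_by_region(build_warehouse()):
        print(region, total)
```

This works well as long as everything is structured and already loaded into tables, which is exactly the limit of the first-generation design.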

Ref: http://cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf

The two-tier architecture uses both a data lake and data warehouses. This solves the unstructured data problem for AI and ML workloads.

In the first-generation architecture, there is no option to handle unstructured data. A data lake, however, can store both structured and unstructured data.

Still, the two-tier architecture won't solve the problem of very large datasets.
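The two-tier idea can be sketched roughly as follows. This is only an illustration under simple assumptions: a local folder plays the role of the data lake, SQLite plays the role of the warehouse, and the file names (`orders.csv`, `review.txt`) and the `orders` table are hypothetical. The lake accepts any file, while the ETL step copies only the structured part into the warehouse.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def write_lake(lake: Path) -> None:
    """Drop raw files into the lake: one structured, one unstructured."""
    (lake / "orders.csv").write_text("order_id,amount\n1,10.5\n2,4.5\n")
    (lake / "review.txt").write_text("Great product, fast delivery.")


def etl_to_warehouse(lake: Path, conn: sqlite3.Connection) -> int:
    """Load the structured file from the lake into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL)"
    )
    loaded = 0
    with (lake / "orders.csv").open(newline="") as handle:
        for row in csv.DictReader(handle):
            conn.execute(
                "INSERT INTO orders VALUES (?, ?)",
                (int(row["order_id"]), float(row["amount"])),
            )
            loaded += 1
    conn.commit()
    return loaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lake_dir = Path(tmp)
        write_lake(lake_dir)
        warehouse = sqlite3.connect(":memory:")
        print("rows loaded:", etl_to_warehouse(lake_dir, warehouse))
        # The unstructured review stays in the lake for ML use.
        print("raw review:", (lake_dir / "review.txt").read_text())
```

Note that the data now lives in two places, and the warehouse copy is only as fresh as the last ETL run.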

Lakehouse architecture

Ref: http://cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf

The Lakehouse architecture solves both the unstructured data problem and the problem of very large datasets.

In the Lakehouse architecture, the ETL process is continuous. Because the data lake sits inside the architecture, we can access the raw data at any point in time. And because ETL is always active, the BI, AI, and ML applications get near-real-time or real-time data.
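Here is a very small sketch of what "continuous ETL with raw data always available" can look like. It is a toy model, not how any real Lakehouse engine is implemented: the `MiniLakehouse` class, its watermark counter, and the event fields (`region`, `amount`) are all assumptions made for illustration. Each refresh processes only the records that arrived since the previous refresh, and the raw records are never thrown away.

```python
from collections import defaultdict


class MiniLakehouse:
    """Toy model: one raw store plus an incrementally refreshed aggregate."""

    def __init__(self) -> None:
        self.raw_events = []               # lake layer: every raw record is kept
        self.totals = defaultdict(float)   # serving layer read by BI / ML
        self._watermark = 0                # index of the first unprocessed event

    def ingest(self, event: dict) -> None:
        """Append a raw event; nothing is transformed or dropped here."""
        self.raw_events.append(event)

    def refresh(self) -> int:
        """Process only events that arrived since the last refresh."""
        new_events = self.raw_events[self._watermark:]
        for event in new_events:
            self.totals[event["region"]] += event["amount"]
        self._watermark = len(self.raw_events)
        return len(new_events)


if __name__ == "__main__":
    lakehouse = MiniLakehouse()
    lakehouse.ingest({"region": "EMEA", "amount": 100.0})
    lakehouse.ingest({"region": "APAC", "amount": 40.0})
    lakehouse.refresh()
    lakehouse.ingest({"region": "EMEA", "amount": 25.0})
    lakehouse.refresh()
    print(dict(lakehouse.totals))
```

Calling `refresh()` on a short interval keeps the aggregates close to real time, while `raw_events` stays available for any AI or ML job that wants the original data.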

Conclusion

These kinds of architectural changes reduce the cost of data storage, lower the latency of data aggregation, and increase the productivity of data analysis.

» 6+ years of experience in Data engineering, Dashboard designing » 3+ years of experience in Web application development

<muTheTechie/>
