In Data lakehouses for electric coops – Lake Data Insights, I made the case that a lakehouse is a cost-effective, scalable, flexible, maintainable, and easy-to-use way to handle the variety of data sources and analytics in the electric coop utility space. In this post, I want to present an overall architecture showing how you can implement this approach using Azure technologies. Please refer to the diagram below.
For more context on this diagram, please see the following blog posts:
- The foundation of the diagram is the data engineering lifecycle. See Using the data engineering lifecycle for data architecture – Lake Data Insights for background on this.
- The overall design pattern uses a medallion-based approach. See Medallion Architecture – Lake Data Insights for background on this.
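To make the medallion idea a bit more concrete, here is a minimal sketch of the bronze, silver, and gold layering applied to meter readings. It is only an illustration under stated assumptions: the field names (`meter_id`, `read_at`, `kwh`), the sample rows, and the functions `to_silver` and `to_gold` are all hypothetical, and plain Python stands in for whatever Azure services the diagram actually uses.

```python
from collections import defaultdict
from datetime import datetime

# Bronze: raw readings kept as they arrived (hypothetical sample data).
BRONZE = [
    {"meter_id": "M1", "read_at": "2024-01-01T00:00:00", "kwh": "1.5"},
    {"meter_id": "M1", "read_at": "2024-01-01T01:00:00", "kwh": "2.0"},
    {"meter_id": "M1", "read_at": "2024-01-01T01:00:00", "kwh": "2.0"},  # duplicate
    {"meter_id": "M2", "read_at": "2024-01-01T00:00:00", "kwh": "bad"},  # unparseable
    {"meter_id": "M2", "read_at": "2024-01-02T00:00:00", "kwh": "3.25"},
]


def to_silver(bronze_rows):
    """Silver: typed, validated, de-duplicated readings."""
    seen = set()
    silver = []
    for row in bronze_rows:
        try:
            meter_id = row["meter_id"]
            read_at = datetime.fromisoformat(row["read_at"])
            kwh = float(row["kwh"])
        except (KeyError, ValueError):
            # Skip rows that are missing fields or fail to parse.
            continue
        key = (meter_id, read_at)
        if key in seen:
            continue  # drop exact repeats of the same meter and timestamp
        seen.add(key)
        silver.append({"meter_id": meter_id, "read_at": read_at, "kwh": kwh})
    return silver


def to_gold(silver_rows):
    """Gold: daily kWh totals per meter, ready for reporting."""
    totals = defaultdict(float)
    for row in silver_rows:
        day = row["read_at"].date().isoformat()
        totals[(row["meter_id"], day)] += row["kwh"]
    return dict(totals)


silver = to_silver(BRONZE)
gold = to_gold(silver)
for (meter_id, day), kwh in sorted(gold.items()):
    print(meter_id, day, kwh)
```

The point of the layering is that the raw bronze rows are never modified, so the silver and gold layers can be rebuilt at any time if the cleaning rules or the reporting needs change.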
Of course, architecture diagrams are not worth the paper they are printed on if they haven’t been implemented. This lakehouse architecture has been the foundation for implementations at three electric coops. So far, it has delivered on what I asserted in the earlier post: an approach that is cost-effective, scalable, flexible, maintainable, and easy to use. Costs are always a concern when using cloud resources. At one of the coops where I have visibility into the Azure costs, the cloud cost for the daily processing of new data and the storage of over 1 billion meter data points is less than $10 per day.
Future posts will go into more detail on specific areas of the architecture. If you have questions about the architecture, please reach out to me.