A long time has passed since the last post. We have gone through a long and tedious journey adapting what Azure offers to our needs.
Our needs were simple: the current data warehouse (SQL Server on a VM in Azure) served the BI team.
The ML teams worked on GCP, and we wanted both teams to work on Azure, on a platform that can scale and will not fail every two days.
We checked:
- Snowflake on Azure
- Synapse Analytics
- GCP
We decided to go with the full Azure product for the following reasons:
- Migration time
- Support
- Costs
Synapse as a platform contains many components, and the challenge was to find what fit us as an organization and as a group.
The team's knowledge and abilities influenced the plans.
Here's what we planned and what we did:
We started putting everything in the Data Lake in Parquet or Delta format, built on top of Azure ADLS Gen2.
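As a rough sketch, landing a table in the lake from a notebook looks something like this. It assumes a Databricks or Synapse Spark notebook where `spark` already exists and the cluster can reach the storage account; the storage account, container, and table names are placeholders, not our real ones:

```python
# Minimal PySpark sketch: write a table to ADLS Gen2 in Delta format.
# Assumes a notebook where `spark` is predefined and the cluster is
# already authorized against the storage account (e.g. via a service
# principal). All names below are hypothetical.
df = spark.table("staging.orders")

(df.write
   .format("delta")    # or "parquet" for plain Parquet
   .mode("overwrite")
   .save("abfss://raw@mydatalake.dfs.core.windows.net/orders"))
```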
We had to move some data to a T-SQL-compatible platform, which meant setting up a dedicated Synapse SQL pool: a fully managed big data platform that lets us ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs.
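One way to get data from the lake into the dedicated pool is T-SQL's COPY INTO statement. Here is a minimal sketch driven from Python with pyodbc; the server, database, table, and lake path are made-up examples:

```python
import pyodbc

# Hypothetical connection details for a dedicated Synapse SQL pool.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myworkspace.sql.azuresynapse.net;"
    "DATABASE=mydwh;UID=loader;PWD=<secret>"
)

# COPY INTO bulk-loads Parquet files from ADLS Gen2 into a pool table.
conn.execute("""
    COPY INTO dbo.orders
    FROM 'https://mydatalake.dfs.core.windows.net/raw/orders/*.parquet'
    WITH (FILE_TYPE = 'PARQUET', CREDENTIAL = (IDENTITY = 'Managed Identity'))
""")
conn.commit()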
Using Azure Data Factory, we can create and schedule data pipelines to move data from our SQL Server database to the Data Lake.
This enables us to scale our data processing and analysis capabilities and take advantage of the flexibility and power of a big data platform.
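The pipelines themselves are authored in the ADF studio; from code we mainly trigger and monitor runs. A short sketch using the azure-mgmt-datafactory SDK, with made-up resource group, factory, and pipeline names:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Hypothetical identifiers; replace with your own subscription and names.
adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Kick off the copy pipeline that moves SQL Server tables to the lake.
run = adf.pipelines.create_run(
    "my-resource-group", "my-data-factory", "CopySqlToLake", parameters={}
)

# Check the run status (poll until it reaches a terminal state).
status = adf.pipeline_runs.get("my-resource-group", "my-data-factory", run.run_id)
print(status.status)  # e.g. "InProgress", "Succeeded", "Failed"
```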
In many cases we also used Azure Databricks to manage the Data Lake.
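"Managing" the lake here is mostly Delta housekeeping: compacting small files and vacuuming old snapshots. A sketch from a notebook, assuming a recent Delta Lake runtime and a hypothetical table path:

```python
from delta.tables import DeltaTable

# Hypothetical Delta table path in the lake; `spark` comes from the notebook.
table = DeltaTable.forPath(
    spark, "abfss://raw@mydatalake.dfs.core.windows.net/orders"
)

# Compact many small files into fewer, larger ones (Delta Lake 2.0+).
table.optimize().executeCompaction()

# Remove data files no longer referenced by the table, keeping 7 days
# of history (retention is given in hours).
table.vacuum(168)
```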
In the ingestion layer we chose to use Azure Data Factory and Azure Databricks,
and in the analytics layer we provide the ability to query the data with Azure Data Factory, Azure Databricks, or Azure Functions.
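For the Azure Functions route, here is a minimal HTTP-triggered sketch that queries the Synapse serverless SQL endpoint with pyodbc. The connection string, table, and query are placeholders; real secrets would live in app settings:

```python
import json

import azure.functions as func
import pyodbc


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Hypothetical connection to the serverless SQL endpoint; in practice
    # the connection string should come from application settings.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=myworkspace-ondemand.sql.azuresynapse.net;"
        "DATABASE=lakedb;UID=reader;PWD=<secret>"
    )
    rows = conn.execute(
        "SELECT TOP 10 order_id, amount FROM dbo.orders"
    ).fetchall()

    # Serialize the result rows as a JSON array of arrays.
    body = json.dumps([list(row) for row in rows], default=str)
    return func.HttpResponse(body, mimetype="application/json")
```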
Now we have a lot of work ahead across the whole Azure data offering.
We have started to stabilize the system and move a lot of production load to it.
So, in the next posts I will write a lot about these Azure data tools.
And then comes Microsoft Fabric :-).
Have a nice day