Unlock petabyte-scale datasets in Azure with aggregations in Power BI
Description
Christian Wade joins Scott Hanselman to show you how to unlock petabyte-scale datasets in Azure in a way that was not previously possible. Learn how to use the aggregations feature in Power BI to enable interactive analysis over big data.
For more information:
- Power BI Desktop September 2018 Feature Summary (Analytics)
- Aggregations in Power BI Desktop (Preview) docs
- Microsoft Power BI - Interactive Data Visualization BI Tools
- Create a free account (Azure)
Follow @SHanselman Follow @AzureFriday Follow @_christianWade
The Discussion
-
Hi Guys,
Thanks for the great demo and a great feature. Queries that are not cached are processed by Spark, as mentioned, but can you share more details about how a 23-node Spark cluster fits into this ecosystem?
Thanks -
@kbaig: Thanks for the feedback. The Spark cluster is optional. From the Power BI side, it works the same way whether it's HDI Spark, Azure SQL Data Warehouse, Databricks, or various other sources in Azure (that support DirectQuery). The setup and optimization of these systems depends on the system itself and follows standard query performance tuning for that system; there is nothing special about setting up or query-optimizing these systems that is different when using aggregations.
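Conceptually, the aggregations feature answers a query from an in-memory aggregation table when the query's grain and measures are covered, and falls back to DirectQuery against the big-data source otherwise. Here is a minimal Python sketch of that routing decision; the column names, grain, and helper function are illustrative assumptions, not Power BI's actual implementation:

```python
# Illustrative sketch of aggregation hit/miss routing (not Power BI internals).
# An aggregation table can answer a query when the query groups only by
# columns at or above the aggregation grain and requests only measures
# that were pre-aggregated.

AGG_GRAIN = {"date", "product"}           # columns the agg table is grouped by
AGG_MEASURES = {"sales_amount", "units"}  # pre-aggregated measures it stores

def route_query(group_by, measures):
    """Return which storage answers the query: in-memory agg or DirectQuery."""
    if set(group_by) <= AGG_GRAIN and set(measures) <= AGG_MEASURES:
        return "in-memory aggregation table"     # fast, cached in memory
    return "DirectQuery to the big-data source"  # e.g. Spark / SQL DW

print(route_query(["date"], ["sales_amount"]))
print(route_query(["date", "customer"], ["sales_amount"]))
```

The first query groups only by `date`, which the aggregation grain covers, so it is answered from memory; the second groups by `customer`, which is finer than the grain, so it falls through to DirectQuery.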
-
This is super awesome, Christian!
Just curious if there is a plan for aggregations in Analysis Services Tabular?
-
Christian! You are brilliant; we just need to figure out how to travel to Mars and back, combining all of NASA's data. I can set up that appointment if need be, as I know a few smart people there. All the best! I will be using this for a few of our companies.
Ezra Gabay -
@Danaraj: we are currently focusing on going in the other direction: bringing the Analysis Services scalability, manageability, ALM, debugging, etc. to Power BI
-
@Ezra Gabay: Thank you so much Ezra! Glad to be of service! We're all about pushing the boundaries :)
-
@wadecb: That's interesting, looking forward to what's coming next. Thanks Christian, amazing work.
-
re: Spark query. In order for this query to complete in a reasonable time over big data, the data has to be partitioned. But there are limited ways you can partition data in Spark (not more than 100 partitions).
So can you explain a bit how the data is partitioned/bucketed?
-
@aljj: It is stored in Parquet and coalesced into 200 random chunks of rows.
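As a rough illustration of what "coalesced into 200 chunks" means, here is a minimal Python sketch that splits a row set into at most 200 roughly equal, contiguous chunks. In Spark itself this would be along the lines of `df.coalesce(200).write.parquet(path)`; the helper below is only a conceptual stand-in, not the Spark API:

```python
def coalesce_rows(rows, n_chunks=200):
    """Split rows into at most n_chunks roughly equal, contiguous chunks,
    mimicking how coalesce() reduces a dataset to a fixed partition count."""
    n_chunks = min(n_chunks, len(rows)) or 1  # never more chunks than rows
    size, extra = divmod(len(rows), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional row each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks

chunks = coalesce_rows(list(range(1000)))
print(len(chunks))                  # 200 chunks
print(sum(len(c) for c in chunks))  # all 1000 rows preserved
```

Keeping the chunk count fixed (rather than tied to data volume) bounds the number of Parquet files written, which keeps file sizes reasonable for downstream scans.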