Query Folding with Salesforce

Published by Thomas foster on

Background

First some background to both Salesforce and Query Folding. Salesforce is a popular cloud enterprise software provider. Which, more importantly for this post, has a Power Query connector. It actually has two, the object and the report connector. However this only applies to the object connector. The object connector allows connection to the underlying tables that sit behind salesforce. Querying salesforce is done with a SQL-like language called Salesforce Object Query Language [SOQL]. You can see easily with the Salesforce Workbench.

Now for the folding, I’ve mentioned query folding before where you can see it with a more traditional query to a SQL server. A major difference between a SQL server datasource and Salesforce is that you cannot see the native query. This means it doesn’t look like folding is taking place and it is difficult to investigate with just Power BI alone.

However it is not it impossible to see if has been folded or to see the queries being sent. We can do this with either the Power BI query trace, or through analyzing the packets sent to Salesforce with a tool like Fiddler. I prefer using the query trace and will do so for this post. To enable this go into options then diagnostics and tick enable tracing. Keep in mind that this setting resets when you close Power BI.

Power BI Options screen showing  the power quyer tracing option enabled

Once this setting is enabled the next time you load data using the query editor several files will appear in the Power BI traces folder. Within that folder there will be some log files, open the ones that start with “Microsoft.Mashup.Container.NetFX45” then search for “Engine/IO/Web/Request/GetResponse” in those files. You should be able to see some lines that contain something like

https://eu16.salesforce.com/services/data/v29.0/query?q=

This is the SOQL query that is being sent to Salesforce. The number and region code before salesforce.com will depend on your salesforce instance. From this we can see if changes made in the Query Editor are being folded to the source.

The folding

Only a small set of functions can be folded back to Salesforce, basic operations such as filtering rows, selecting columns. As well as sorting, limiting and grouping rows. Some functions such as creating new columns which do get folded with SQL server do not get filtered for Salesforce.

For an example I have loaded the Salesforce sample data that comes with new development environments and then constructed a query that takes advantage of a lot of query folding. I have reduced the number of columns, filtered and sort rows, kept the top 5 and then grouped by two columns with two aggregations. All these operations have gotten folded into the query, this can be seen in the logs in the following line.

Power BI Query Log
SELECT Id%2CName%2CStageName%2CAmount FROM Opportunity WHERE ((StageName%3D’Closed Won’) OR (StageName%3D’Proposal%2FPrice Quote’)) LIMIT 5

This is the SOQL statement and you can clearly see the folded elements. I have tested other functions which fold with SQL server such as new columns, joins, and simple transformations. However I none of these appear to fold and may actually stop folding happening on functions where it works.

Thank you for reading this post, I realise it is longer than what I have written before but hopefully it will help you out.

If you want to read more about the query traces and what can be done, I recommend this blog post by Kasper on BI and this one by Chris Webb.


4 Comments

Andreww Davis · 2019-05-22 at 20:55

THomas, apologies for the incredibly delayed reply. I genuinely believed that the 250 row query limit per call would have thrown out my dream of incrementally refreshing Salesforce data in Power BI, but I was able to get it up and running. I just read your post on Powershell (good stuff!), and while I might not be much of a PS guru myself, I thought to mention that the notion of implementing an incremental refresh policy in Power BI, in conjunction with more direct utilization of the Power BI REST API, might be the least resource intensive methodology for providing near real-time data in PBI! I’m currently unable to get the app in our AD to begin using the API, but it sounds like you’re up and running with it. As an idea, maybe try posting a refresh to a particular dataset which has a functioning incremental refresh policy enabled. Your imports should be near instant, and could be triggered as often as you like with the API.

    Thomas foster · 2019-05-22 at 22:08

    By incremental refresh policies with the REST API, do you mean the hybrid push datasets that the API can create?

    If so I do have a much delayed blog post in the works, in addition to a possible talk on the subject of using MS flow to push data to Power BI.
    The advantage with flow is that you get triggers for things like salesforce, sharepoint, and many more. Setting that up is far easier than developing a custom connector, I never managed to get mine to work beyond the basics on a test SalesForce instance.

Andrew Davis · 2019-04-26 at 17:38

Thomas,

This is a fantastic post! I’ve been working to navigate a solution to implementing Power BI’s incremental refresh policies through the Salesforce Object connector, and have been having a less than stellar time in my attempt. While you are correct, that SOQL, by virtue of being structured similarly to SQL, does fold back to the source with a certain limited set of operations, it seems that this is all for naught when attempting to incrementally refresh against an object with 250+ rows.

This seems to be a unique consequence of Salesforce’s REST API returning a maximum of 250 rows each time you hit the endpoint, forcing the Object connector to iterate calls to the “Query” endpoint until the load is complete. Any thoughts here?

    Thomas foster · 2019-04-26 at 17:49

    Hi Andrew,

    I have had trying to implement incremental refresh as an item on my possible experiment ideas for a while but haven’t ever got around to it.
    As I mostly ran into authentication issues before I could get started.

    Does the 250 row limit interfere with your ability to fold back filtering by dates?
    As while folding other commands back is nice the filtering is the key foldable command.

Leave a Reply

Your email address will not be published. Required fields are marked *