Title: | An Interface to the 'Power BI REST APIs' |
---|---|
Description: | Makes it easy to push data to 'Power BI' using R and the 'Power BI REST APIs' (see <https://docs.microsoft.com/en-us/rest/api/power-bi/>). A set of functions for turning data frames into 'Power BI' datasets and refreshing these datasets are provided. Administrative tasks such as monitoring refresh statuses and pulling metadata about workspaces and users are also supported. |
Authors: | Christian Vermehren [aut, cre] |
Maintainer: | Christian Vermehren <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-10-27 03:33:08 UTC |
Source: | https://github.com/cvermehren/powerbir |
A look-up dimension related to Fact Visitors through the hour_key column.
dim_hour
dim_hour
A data frame with 24 rows and 2 columns:
Primary key (unique identifier of hour).
Hour as a name (character type).
Anonymized data from google analytics.
A fact table showing individual visitors and their transactions on an e-commerce website.
fact_visitors
fact_visitors
A data frame with 10,033 rows and 5 columns:
Unique identifier of the visitor.
Unique identifier of the transactions.
The value of the transaction in USD.
The time of visit in minutes since 1970-01-01.
Foreign key referring to dim_hour.
Anonymized data from google analytics.
This function authenticates your Power BI session using a service principal that represents an application registered in Azure Active Directory.
pbi_auth( tenant = Sys.getenv("PBI_TENANT"), app = Sys.getenv("PBI_APP"), password = Sys.getenv("PBI_PW") )
pbi_auth( tenant = Sys.getenv("PBI_TENANT"), app = Sys.getenv("PBI_APP"), password = Sys.getenv("PBI_PW") )
tenant |
Your Microsoft tenant ID. |
app |
Your Microsoft app ID. |
password |
Your Microsoft app password (client secret). |
The function returns an authentication token invisibly and makes it available to other functions in this package. The token is automatically refreshed upon expiration.
To auto-authenticate, you can specify credentials in environment variables
via an .Renviron file or using Sys.setenv
(see example
below).
pbi_auth() is a wrapper for AzureAuth::get_azure_token(). Currently, only non-interactive authentication is supported. You therefore need to register an Azure Active Directory service-principal application and obtain tenant ID, app ID and app password (client secret).
For reasons of CRAN policy, the first time AzureAuth is loaded, it will prompt you for permission to create a user-specific directory in order to cache the token. The prompt only appears in an interactive session, not in a batch script. For more details, see AzureAuth.
Returns a token invisibly.
## Not run: # Basic authentications pbi_auth( tenant = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", # The tenant ID app = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", # The app ID password = "****" # The client secret ) # Using environment variables Sys.setenv( PBI_TENANT = "my_tenant_id", PBI_APP = "my_app_id", PBI_PW = "my_app_client_secret" ) pbi_auth() ## End(Not run)
## Not run: # Basic authentications pbi_auth( tenant = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", # The tenant ID app = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", # The app ID password = "****" # The client secret ) # Using environment variables Sys.setenv( PBI_TENANT = "my_tenant_id", PBI_APP = "my_app_id", PBI_PW = "my_app_client_secret" ) pbi_auth() ## End(Not run)
Triggers a refresh for the specified dataset from the specified workspace.
pbi_dataset_refresh(group_id, dataset_id)
pbi_dataset_refresh(group_id, dataset_id)
group_id |
The ID of the workspace. |
dataset_id |
The ID of the dataset. |
If successful, the refresh request ID is returned.
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_dataset_refresh(group_id, dataset_id) #> A refresh of dataset xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx was triggered. #> #> To check status, use pbi_dataset_refresh_hist() and the request ID returned #> by this function. #> [1] "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" ## End(Not run)
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_dataset_refresh(group_id, dataset_id) #> A refresh of dataset xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx was triggered. #> #> To check status, use pbi_dataset_refresh_hist() and the request ID returned #> by this function. #> [1] "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" ## End(Not run)
Returns the refresh history for the specified dataset from the specified workspace.
pbi_dataset_refresh_hist(group_id, dataset_id, top = NULL, request_id = NULL)
pbi_dataset_refresh_hist(group_id, dataset_id, top = NULL, request_id = NULL)
group_id |
The workspace ID |
dataset_id |
The dataset ID |
top |
The number of most recent entries in the refresh history. The default is all available entries. |
request_id |
The request ID returned by pbi_dataset_refresh(). If provided the refresh status of the request ID is returned. |
By default the function will return all historical refreshes. You
can reduce the list to the most recent refreshes using the top
argument.
If request_id
is provided the function will return a single refresh
status, but will still query the Power BI API for all historical entries.
If you query the top 5 most recent refreshes using the top
argument,
the function will only return a status if the provided request_id is in
this list.
The status value return can be either 'Completed', 'Failed' or 'Unknown', which means that the refresh is still in progress.
A data frame with status, start and end times of historical refreshes
or a single refresh status message if request_id
is used.
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_dataset_refresh_hist(group_id, dataset_id) ## End(Not run)
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_dataset_refresh_hist(group_id, dataset_id) ## End(Not run)
Deletes the specified dataset from the specified workspace. Applicable to push datasets as well as imported datasets.
pbi_delete_dataset(group_id, dataset_id)
pbi_delete_dataset(group_id, dataset_id)
group_id |
The dataset ID. |
dataset_id |
The workspace ID. |
Deletes the entire dataset.
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_delete_dataset(group_id, dataset_id) ## End(Not run)
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_delete_dataset(group_id, dataset_id) ## End(Not run)
Deletes all rows from the specified table within the specified dataset from the specified workspace (group ID). Only applicable to push datasets.
pbi_delete_rows(group_id, dataset_id, table_name)
pbi_delete_rows(group_id, dataset_id, table_name)
group_id |
The Power BI workspace ID. |
dataset_id |
The Power BI dataset ID. |
table_name |
The Power BI table name. |
All rows will be deleted from the specified table.
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" table_name <- "My Table" pbi_delete_rows(group_id, dataset_id, table_name) ## End(Not run)
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" table_name <- "My Table" pbi_delete_rows(group_id, dataset_id, table_name) ## End(Not run)
Returns the IDs and meta data of all available datasets in the specified Power BI workspace (group ID).
pbi_list_datasets(group_id)
pbi_list_datasets(group_id)
group_id |
The Power BI workspace ID. |
A data.table / data frame with dataset information.
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_list_datasets(group_id) ## End(Not run)
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_list_datasets(group_id) ## End(Not run)
Returns the ids and meta data of all Power BI workspaces to which the service principal app has been granted access.
pbi_list_groups()
pbi_list_groups()
A data frame with workspaces.
## Not run: pbi_list_groups() ## End(Not run)
## Not run: pbi_list_groups() ## End(Not run)
Pushes a dataset schema to the specified Power BI workspace. To add rows to the dataset, use pbi_push_rows().
pbi_push_dataset_schema(schema, group_id, retention = c("none", "basicFIFO"))
pbi_push_dataset_schema(schema, group_id, retention = c("none", "basicFIFO"))
schema |
A push-dataset schema created by pbi_schema_create(). |
group_id |
The ID of the destination Power BI workspace. |
retention |
The retention policy of the dataset. Default is "none". |
A dataset with tables will be created in the specified Power BI workspace.
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" schema <- pbi_schema_create( dt_list = list(iris), dataset_name = "The iris dataset", table_name_list = list(iris) ) pbi_push_dataset_schema(schema, group_id) ## End(Not run)
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" schema <- pbi_schema_create( dt_list = list(iris), dataset_name = "The iris dataset", table_name_list = list(iris) ) pbi_push_dataset_schema(schema, group_id) ## End(Not run)
Adds new data rows to the specified table within the specified dataset from the specified Power BI workspace. Only applicable to push datasets.
pbi_push_rows(dt, group_id, dataset_id, table_name, overwrite = FALSE)
pbi_push_rows(dt, group_id, dataset_id, table_name, overwrite = FALSE)
dt |
A data frame with rows to be added to the specified Power BI table (table_name). The columns and data types must match the specified table. |
group_id |
The ID of the destination Power BI workspace. |
dataset_id |
The ID of the destination Power BI dataset. |
table_name |
The name of the destination Power BI table. |
overwrite |
If TRUE, existing rows will be deleted prior to adding new rows. If FALSE, the new rows will be appended to the existing rows. |
The Power BI REST has a limit of 10K rows per POST rows request. This limit is handled by splitting the data frame into chunks of 10K rows each and pushing these chunks one at a time. However, you should manually observe the other limitations of the API. See https://docs.microsoft.com/en-au/rest/api/power-bi/ for more details.
A dataset with tables and optionally defined relationships will be created in the specified Power BI workspace.
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_push_rows(group_id, dataset_id, "My table") ## End(Not run)
## Not run: group_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" dataset_id <- "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" pbi_push_rows(group_id, dataset_id, "My table") ## End(Not run)
Creates a Power BI dataset schema from a set of data frames. Columns and data types will be inferred from the data frames. Only applicable to push datasets.
pbi_schema_create( dt_list, dataset_name = "My Power BI Dataset", table_name_list, relations_list = NULL, date_format = "yyyy-mm-dd", integer_format = "#,###0", double_format = "#,###.00", sort_by_col = NULL, hidden_col = NULL, default_mode = c("Push", "Streaming", "PushStreaming", "AsOnPrem", "AsAzure") )
pbi_schema_create( dt_list, dataset_name = "My Power BI Dataset", table_name_list, relations_list = NULL, date_format = "yyyy-mm-dd", integer_format = "#,###0", double_format = "#,###.00", sort_by_col = NULL, hidden_col = NULL, default_mode = c("Push", "Streaming", "PushStreaming", "AsOnPrem", "AsAzure") )
dt_list |
A list of data frames which the schema should be inferred from. |
dataset_name |
A custom name of the Power BI dataset. |
table_name_list |
A list of custom names corresponding to the list of data frames. |
relations_list |
A list of relation definitions returned by pbi_schema_relation_create() |
date_format |
The format of date columns (if any). Default is 'yyyy-mm-dd'. |
integer_format |
The format of integer columns (if any). Default is '#,###0'. |
double_format |
The format of double columns (if any). Default is '#,###.00'. |
sort_by_col |
A list of lists of column-sorting definitions. The inner lists must include elements named 'table', 'sort' and 'sort_by'. See example for more details. |
A list of lists columns to be hidden. The inner lists must include elements named 'table' and 'hidden'. See examples for more details. |
|
default_mode |
The dataset mode or type. Defaults to 'Push'. |
A list with schema properties.
# Load package library(powerbiR) # Use data from the powerbiR package data(dim_hour) data(fact_visitors) # Define dataset and its tables table_list <- list(fact_visitors, dim_hour) table_names <- c("visitors", "hour") dataset_name <- c("Online Visitors") # Define relations between tables relation <- pbi_schema_relation_create( from_table = "visitors", from_column = "hour_key", to_table = "hour" ) # Define sorting behavior of columns in the hour table sortlist = list( table = c("hour"), sort = c("hour"), sort_by = c("hour_key") ) # Hide hour_key in the hour and visitors tables hidden <- list( list( table = c("hour"), hidden = c("hour_key") ), list( table = c("visitors"), hidden = c("hour_key", "visitor_id") ) ) # Create schema schema <- pbi_schema_create( dt_list = table_list, dataset_name = dataset_name, table_name_list = table_names, relations_list = list(relation), sort_by_col = list(sortlist), hidden_col = hidden )
# Load package library(powerbiR) # Use data from the powerbiR package data(dim_hour) data(fact_visitors) # Define dataset and its tables table_list <- list(fact_visitors, dim_hour) table_names <- c("visitors", "hour") dataset_name <- c("Online Visitors") # Define relations between tables relation <- pbi_schema_relation_create( from_table = "visitors", from_column = "hour_key", to_table = "hour" ) # Define sorting behavior of columns in the hour table sortlist = list( table = c("hour"), sort = c("hour"), sort_by = c("hour_key") ) # Hide hour_key in the hour and visitors tables hidden <- list( list( table = c("hour"), hidden = c("hour_key") ), list( table = c("visitors"), hidden = c("hour_key", "visitor_id") ) ) # Create schema schema <- pbi_schema_create( dt_list = table_list, dataset_name = dataset_name, table_name_list = table_names, relations_list = list(relation), sort_by_col = list(sortlist), hidden_col = hidden )
Defines a relationship between tables in a Power BI push dataset. To add this definition to a Power BI dataset schema, use pbi_schema_add_relations().
pbi_schema_relation_create( from_table = NULL, from_column = NULL, to_table = NULL, to_column = from_column, direction = c("OneDirection", "BothDirections", "Automatic"), name = paste0(from_table, to_table, from_column) )
pbi_schema_relation_create( from_table = NULL, from_column = NULL, to_table = NULL, to_column = from_column, direction = c("OneDirection", "BothDirections", "Automatic"), name = paste0(from_table, to_table, from_column) )
from_table |
The name of the foreign key table |
from_column |
The name of the foreign key column |
to_table |
The name of the primary key table |
to_column |
The name of the primary key column. Defaults to from_column |
direction |
The filter direction of the relationship. Defaults to 'OneDirection' |
name |
The relationship name and identifier. Defaults to a concatenation of from_table, to_table and from_column |
A data.table
# An example
# An example