Power BI and R: Custom Visuals for Social Network Analysis

Bharat Garg

Verified Expert in Engineering

Bharat is a data scientist and developer who specializes in designing and developing interactive reports and tools to facilitate decision-making. He has worked with small startups and large corporations, such as Comcast, MetLife, UnitedHealth Group/Optum, and Jefferson Health. One of Bharat’s projects delivered $6 million in revenue, and another delivered $10 million in savings.

Expertise

R Data Visualization Data Analysis

Previous Role

Senior Analyst


Social network analysis is quickly becoming an important tool to serve a variety of professional needs. It can inform corporate goals such as targeted marketing and identify security or reputational risks. Social network analysis can also help businesses meet internal goals: It provides insight into employee behaviors and the relationships among different parts of a company.

Organizations can employ a number of software solutions for social network analysis; each has its pros and cons, and is suited for different purposes. This article focuses on Microsoft’s Power BI, one of the most commonly used data visualization tools today. While Power BI offers many social network add-ons, we’ll explore custom visuals in R to create more compelling and flexible results.

This tutorial assumes an understanding of basic graph theory, particularly directed graphs. The later steps work best in Power BI Desktop, which is available only on Windows. Readers on macOS or Linux can use Power BI in a browser, but the browser version does not support certain features, such as importing an Excel workbook.

Structuring Data for Visualization

Creating social networks starts with the collection of connections (edge) data. Connections data contains two primary fields: the source node and the target node—the nodes at either end of the edge. Beyond these nodes, we can collect data to produce more comprehensive visual insights, typically represented as node or edge properties:

1) Node properties

Shape or color: Indicates the type of user, e.g., the user's location/country
Size: Indicates the importance in the network, e.g., the user's number of followers
Image: Operates as an individual identifier, e.g., a user's avatar

2) Edge properties

Color, stroke, or arrowhead connection: Indicates type of connection, e.g., the sentiment of the post or tweet connecting the two users
Width: Indicates strength of connection, e.g., how many mentions or retweets are observed between two users in a given period
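
To make these properties concrete, here is a minimal sketch in base R that stores a few connections as an edge table and derives a node table from it. The user names, counts, and color mapping are invented for illustration and are not taken from the tutorial's dataset.

```r
# Hypothetical edge list: one row per connection (all values are made up).
edges <- data.frame(
  source    = c("alice", "alice", "bob", "carol"),
  target    = c("bob", "carol", "carol", "alice"),
  mentions  = c(5, 1, 3, 2),
  sentiment = c("positive", "neutral", "negative", "positive"),
  stringsAsFactors = FALSE
)

# Edge properties: width from connection strength, color from connection type.
sentiment_colors <- c(positive = "green", neutral = "gray", negative = "red")
edges$width <- edges$mentions
edges$color <- unname(sentiment_colors[edges$sentiment])

# Node list: every user that appears at either end of an edge.
nodes <- data.frame(id = sort(unique(c(edges$source, edges$target))),
                    stringsAsFactors = FALSE)

# Node property: size from the number of incoming connections.
in_degree <- table(factor(edges$target, levels = nodes$id))
nodes$size <- as.integer(in_degree)

print(nodes)
print(edges)
```

Keeping node and edge properties in two separate tables mirrors how most network libraries, including the ones used later in this tutorial, expect their input.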

Let’s inspect an example social network visual to see how these properties function:

A graph of circles connected by lines of varying widths appears with three distinct sections. The left of the graph has six green shapes of various sizes labeled 1, 2, 3, 4, 5, and 6 in a hexagon. Numbers 1-5 are circles, while 6 is a diamond. They are interconnected by green arrows of varying widths and directions, and some arrowheads are filled green while others are not filled. To the right of the green shapes is the next section: three dark blue shapes arranged in a triangle that are labeled 7, 8, and 9, and are interconnected by blue arrows of varying widths and directions (with some arrowheads filled blue). Nodes 7 and 9 are connected to nodes 3 and 4 with gray arrows of varying widths and directions (with some arrowheads filled gray). In the middle of the graph, below the first two shape groups, is a single light blue diamond labeled 10. It is connected to nodes 5, 4, and 9 by dotted gray arrows of varying widths and directions (with some arrowheads filled gray). — Green, light blue, and dark blue nodes and varying circle or diamond shapes demonstrate different node types. Numbers with transparent backgrounds act as the node image identifiers, and larger nodes (such as Node 4) are more important in the network. Different edge types are indicated by color (green, blue, or gray), stroke (solid or dotted), and arrowheads (empty or filled); edge width shows strength (for example, the connection from Node 8 to Node 9 is strong).

We can also use hover text to supplement or replace the above parameters, as it can support other information that cannot be easily expressed through node or edge properties.

Comparing Power BI’s Social Network Extensions

Having defined the different data features of a social network, let’s examine the pros and cons of four popular tools used to visualize networks in Power BI.

Extension	Social Network Graph by Arthur Graus	Network Navigator	Advanced Networks by ZoomCharts (Light Edition)	Custom Visualizations Using R
Dynamic node size	Yes	Yes	Yes	Yes
Dynamic edge size	No	Yes	No	Yes
Node color customization	Yes	Yes	No	Yes
Complex social network processing	No	Yes	Yes	Yes
Profile images for nodes	Yes	No	No	Yes
Adjustable zoom	No	Yes	Yes	Yes
Top N connections filtering	No	No	No	Yes
Custom information on hover	No	No	No	Yes
Edge color customization	No	No	No	Yes
Other advanced features	No	No	No	Yes

Social Network Graph by Arthur Graus, Network Navigator, and Advanced Networks by ZoomCharts (Light Edition) are all suitable extensions to develop simple social networks and get started with your first social network analysis.

Many dark blue, light blue, and orange circles (50+ circles) are connected by thin gray lines on a white background. The circles have a solid color border and are filled with small images of various Pokémon that have a white background, and the circles block the view of most of the gray lines. They form a circular shape overall. — An example visualization made using the Social Network Graph by Arthur Graus extension.

Many blue, purple, and gray circles (50+ circles) are connected by thin gray lines on a white background. The circles are solid and filled, and block the view of some of the gray lines. They form a circular arrangement overall. — An example visualization made using the Network Navigator extension.

An example visualization made using the Advanced Networks by ZoomCharts (Light Edition) extension.

However, if you want to make your data come alive and uncover groundbreaking insights with attention-grabbing visuals, or if your social network is particularly complex, I recommend developing your custom visuals in R.

Many green, blue, and purple circles (50+ circles) are connected by thin lines of varying colors (green, gray, and red) on a white background. The circles are solid and filled with a Pokémon image at their center, and most of the thin lines are visible. They form a spread-out circular shape overall, with the green circles frequently branching out toward smaller blue or purple circles. The top right corner of the chart has the text "Social Network," and below the chart is a legend of lines and circles with related text: a green line with the text "Positive," a gray line with the text "Neutral," a red line with the text "Negative," a blue circle with the text "Mention," and a purple circle with the text "Retweet." — An example visualization made using custom visuals in R.

This custom visualization is the final result of our tutorial’s social network extension in R and demonstrates the large variety of features and node/edge properties offered by R.

Building a Social Network Extension for Power BI Using R

Creating an extension to visualize social networks in Power BI using R comprises five distinct steps. But before we can build our social network extension, we must load our data into Power BI.

Prerequisite: Collect and Prepare Data for Power BI

You can follow this tutorial with a test dataset based on Twitter and Facebook data or proceed with your own social network. Our data has been randomized; you may download real Twitter data if desired. After you collect the required data, add it into Power BI (for example, by importing an Excel workbook or adding data manually). Your result should look similar to the following table:
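
As a rough guide to the expected layout, the sketch below builds a tiny stand-in table in R. The column order is inferred from the positional reads in script.r later in this tutorial (dataset[[1]] through dataset[[9]]) and from the data role description in capabilities.json. Every value, name, and URL here is a placeholder, not real data.

```r
# Made-up rows in the column order that script.r reads by position later on.
connections <- data.frame(
  from          = c("u1", "u1", "u2"),                 # source node
  to            = c("u2", "u3", "u3"),                 # target node
  value         = c(4, 2, 7),                          # number of connections
  col_sentiment = c("green", "gray", "red"),           # edge color
  col_type      = c("blue", "purple", "blue"),         # target node color
  from_name     = c("User One", "User One", "User Two"),
  to_name       = c("User Two", "User Three", "User Three"),
  from_avatar   = c("https://example.com/u1.png",
                    "https://example.com/u1.png",
                    "https://example.com/u2.png"),
  to_avatar     = c("https://example.com/u2.png",
                    "https://example.com/u3.png",
                    "https://example.com/u3.png"),
  stringsAsFactors = FALSE
)

str(connections)
```

Because the script reads columns by position rather than by name, the order of the fields matters more than their headers.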

Once you have your data set up, you are ready to create a custom visualization.

Step 1: Set Up the Visualization Template

Developing a Power BI visualization is not simple: even basic visuals require thousands of files. Fortunately, Microsoft offers a command-line tool called pbiviz (part of the powerbi-visuals-tools npm package), which generates the required supporting files with only a few commands. The pbiviz tool will also package all of our final files into a .pbiviz file that we can load directly into Power BI as a visualization.

The simplest way to install pbiviz is through npm, which ships with Node.js (npm install -g powerbi-visuals-tools). Once pbiviz is installed, we need to initialize our custom R visual via our machine’s command-line interface:

pbiviz new toptalSocialNetworkByBharatGarg -t rhtml
cd toptalSocialNetworkByBharatGarg
npm install 
pbiviz package

Don’t forget to replace toptalSocialNetworkByBharatGarg with the desired name for your visualization. The -t rhtml option tells pbiviz to create a template for R-based HTML visualizations. You will see errors because we have not yet specified fields such as the author’s name and email in our package, but we will resolve these later in the tutorial. If the pbiviz script won’t run at all in PowerShell, you may first need to allow scripts by running Set-ExecutionPolicy RemoteSigned.

On successful execution of the code, you will see a folder with the following structure:

A File Explorer listing containing eight subfolders (.tmp, .vscode, assets, dist, node_modules, r_files, src, and style) and eight files (capabilities.json, dependencies.json, package.json, package-lock.json, pbiviz.json, script.r, tsconfig.json, and tslint.json). All of the files are 1 KB, except for capabilities.json (2 KB) and package-lock.json (23 KB).

Once we have the folder structure ready, we can write the R code for our custom visualization.

Step 2: Code the Visualization in R

The directory created in the first step contains a file named script.r, which consists of default code. (The default code creates a simple Power BI extension, which uses the iris sample dataset available in R to plot a histogram of Petal.Length by Species.) We will update the code but retain its default structure, including its commented sections.

Our project uses three R libraries:

DiagrammeR: Creates graphs from text
visNetwork: Provides interactive network visualizations
data.table: Assists with data organization, similar to data.frame

Let’s replace the code in the Library Declarations section of script.r to reflect our library usage:

libraryRequireInstall("DiagrammeR")
libraryRequireInstall("visNetwork")
libraryRequireInstall("data.table")

Next, we will replace the code in the Actual code section with our R code. Before creating our visualization, we must first read and process our data. We will take two inputs from Power BI:

num_records: The numeric input N, such that we will select only the top N connections from our network (to limit the number of connections displayed)
dataset: Our social network nodes and edges

To calculate the N connections that we will plot, we need to aggregate the num_records value because Power BI will provide a vector by default instead of a single numeric value. An aggregation function like max achieves this goal:

limit_connection <- max(num_records)
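
The effect of this aggregation can be checked outside Power BI. This small sketch assumes that Power BI hands the measure to R as a table with the same value repeated once per row; the value 25 and the three rows are invented.

```r
# Assumption: Power BI repeats the single num_records value once per row.
num_records <- data.frame(num_records = c(25, 25, 25))

# max() collapses the repeated values into one number we can use as a limit.
limit_connection <- max(num_records)
print(limit_connection)
```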

We will now read dataset as a data.table object with custom columns. We sort the dataset by value in decreasing order to place the most frequent connections at the top of the table. This ensures that we choose the most important records to plot when we limit our connections with num_records:

# Name the Power BI columns by position, sort by connection strength,
# and keep only the top N rows.
dataset <- data.table(from = dataset[[1]]
                      ,to = dataset[[2]]
                      ,value = dataset[[3]]
                      ,col_sentiment = dataset[[4]]
                      ,col_type = dataset[[5]]
                      ,from_name = dataset[[6]]
                      ,to_name = dataset[[7]]
                      ,from_avatar = dataset[[8]]
                      ,to_avatar = dataset[[9]])[
  order(-value)][
  seq_len(min(.N, limit_connection))]
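
The sort-then-truncate pattern is easier to follow on a toy table. The sketch below uses data.table with invented users and counts. It is a simplified stand-in for the real dataset, not the tutorial's code.

```r
library(data.table)

# Toy connections table; the names and counts are invented.
toy <- data.table(from  = c("a", "b", "c", "d"),
                  to    = c("b", "c", "d", "a"),
                  value = c(3, 9, 1, 5))
limit_connection <- 2

# Sort by strength, then keep the first N rows (or all rows if N is larger).
top_n <- toy[order(-value)][seq_len(min(.N, limit_connection))]
print(top_n)
```

With a limit of 2, only the two strongest connections (b to c and d to a) remain.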

Next, we must prepare our user information by creating and allocating unique user IDs (uid) to each user, storing these in a new table. We also calculate the total number of users and store that information in a separate variable called num_nodes:

user_ids <- data.table(id = unique(c(dataset$from, 
                                     dataset$to)))[, uid := 1:.N]

num_nodes <- nrow(user_ids)
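
Here is a small worked example of the same ID assignment, using an invented edge list with four distinct users.

```r
library(data.table)

# Invented edge list with four distinct users.
toy <- data.table(from = c("ann", "ann", "bo"),
                  to   = c("bo", "cy", "di"))

# One row per distinct user, numbered 1..N in order of first appearance.
user_ids <- data.table(id = unique(c(toy$from, toy$to)))[, uid := 1:.N]
num_nodes <- nrow(user_ids)

print(user_ids)
print(num_nodes)
```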

Let’s update our user information with additional properties, including:

The number of followers (size of node).
The number of records.
The type of user (color codes).
Avatar links.

We will use R’s merge function to update the table:

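# Node size: count each user's distinct outgoing connections, boost connected
# users by 50, and add a base size of 10 so every node stays visible.
user_ids <- merge(user_ids,
                  dataset[, .(num_follower = uniqueN(to)), by = from],
                  by.x = 'id', by.y = 'from', all.x = TRUE)[
  is.na(num_follower), num_follower := 0][
  , size := num_follower][
  num_follower > 0, size := size + 50][
  , size := size + 10]

# Node color: for each target user, keep the col_type with the largest total value.
user_ids <- merge(user_ids,
                  dataset[, .(sum_val = sum(value)), by = .(to, col_type)][
                    order(-sum_val)][
                    , id := 1:.N, by = to][
                    id == 1, .(to, col_type)],
                  by.x = 'id', by.y = 'to', all.x = TRUE)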

user_ids[id %in% dataset$from, col_type := '#42f548']

user_ids <- merge(user_ids, unique(rbind(dataset[, .('id' = from, 'Name' = from_name, 'avatar' = from_avatar)],
      dataset[, .('id' = to, 'Name' = to_name, 'avatar' = to_avatar)])),
      by = 'id')
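
The sizing arithmetic in the first merge is compact, so it may help to see it restated as a plain function. This is only an illustration of the rule (start from the follower count, add 50 for users with at least one follower, then add 10 for everyone). The helper name node_size is hypothetical.

```r
# Restates the node sizing rule as a standalone function (hypothetical helper).
node_size <- function(num_follower) {
  size <- num_follower
  boosted <- num_follower > 0
  size[boosted] <- size[boosted] + 50  # connected users get a visible boost
  size + 10                            # every node gets a minimum size
}

print(node_size(c(0, 1, 12)))
```

A user with no followers ends up at size 10, while users with 1 and 12 followers end up at 61 and 72, so connected users are always clearly larger than isolated ones.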

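We also add our created uid to the original dataset so that we can retrieve the from and to user IDs later in the code. Finally, we re-sort user_ids by uid: merge reorders rows by its key column, and the node data frame we build next numbers nodes by row position, so row i must hold the user with uid i: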

dataset <- merge(dataset, user_ids[, .(id, uid)],
                                by.x = "from", by.y = "id")

dataset <- merge(dataset, user_ids[, .(id, uid_retweet = uid)],
                                by.x = "to", by.y = "id")

user_ids <- user_ids[order(uid)]
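
To see what the two merges contribute, the following sketch performs the same name-to-ID lookup with base R's match() on invented data. It swaps in match() purely for illustration and does not replace the merge-based code above.

```r
# Invented lookup table and edge list.
user_ids <- data.frame(id = c("ann", "bo", "cy"), uid = 1:3,
                       stringsAsFactors = FALSE)
dataset <- data.frame(from = c("ann", "cy"), to = c("bo", "ann"),
                      stringsAsFactors = FALSE)

# match() returns the row position of each name in user_ids$id.
dataset$uid <- user_ids$uid[match(dataset$from, user_ids$id)]
dataset$uid_retweet <- user_ids$uid[match(dataset$to, user_ids$id)]

print(dataset)
```

Each edge now carries a numeric source ID (uid) and target ID (uid_retweet), which is the form the edge data frame needs.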

Next, we create node and edge data frames for the visualization. We choose the style and shape of our nodes (filled circles), and select the correct columns of our user_ids table to populate our nodes’ color, data, value, and image attributes:

nodes <- create_node_df(n = num_nodes, 
                        type = "lower",
                        style = "filled",
                        color = user_ids$col_type, 
                        shape = 'circularImage',
                        data = user_ids$uid,
                        value = user_ids$size,
                        image = user_ids$avatar,
                        title = paste0("Name: ", user_ids$Name, "<br>",
                                       "Super UID ", user_ids$id, "<br>",
                                       "# followers ", user_ids$num_follower, "<br>")
                        )

Similarly, we pick the dataset table columns that correspond to our edges’ from, to, and color attributes:

edges <- create_edge_df(from = dataset$uid,
                        to = dataset$uid_retweet,
                        arrows = "to",
                        color = dataset$col_sentiment)
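
The edge data frame above sets direction and color only. As a possible extension that is not part of the author's code, visNetwork can scale edge width from a value column and show a title column on hover. The sketch below prepares both columns on an invented edge table; the strength column and its counts are assumptions.

```r
# Invented edges using integer node ids, as in the tutorial's edge data frame.
edges <- data.frame(from     = c(1, 1, 2),
                    to       = c(2, 3, 3),
                    arrows   = "to",
                    color    = c("green", "gray", "red"),
                    strength = c(4, 2, 7))

# Assumption: reuse the connection count for both edge width and hover text.
edges$value <- edges$strength
edges$title <- paste0(edges$strength, " connections")

print(edges[, c("from", "to", "value", "title")])
```

In the tutorial's own code, the same two attributes could presumably be supplied as extra arguments to create_edge_df, for example value = dataset$value, but that variation is untested here.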

Finally, with the node and edge data frames ready, let’s create our visualization using the visNetwork library and store it in a variable the default code will use later, called p:

p <- visNetwork(nodes, edges) %>%
  visOptions(highlightNearest = list(enabled = TRUE, degree = 1, hover = TRUE)) %>%
  visPhysics(stabilization = list(enabled = FALSE, iterations = 10),
             adaptiveTimestep = TRUE,
             barnesHut = list(avoidOverlap = 0.2, damping = 0.15,
                              gravitationalConstant = -5000))

Here, we customize a few network visualization settings in visOptions and visPhysics. Feel free to look through the documentation pages and update these options as desired. Our Actual code section is now complete. Because we stored our own visualization in p, we should also update the Create and save widget section by removing the line p = ggplotly(g);.
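
Before packaging, it can be useful to preview a network in a regular R session. The sketch below is one way to do that under stated assumptions: the three-node network is invented, the output file name is arbitrary, and rendering is skipped if visNetwork or htmlwidgets is not installed.

```r
# Invented three-node network; ids, labels, sizes, and colors are placeholders.
nodes <- data.frame(id    = 1:3,
                    label = c("ann", "bo", "cy"),
                    value = c(63, 61, 10),
                    color = c("#42f548", "blue", "purple"))
edges <- data.frame(from   = c(1, 1, 2),
                    to     = c(2, 3, 3),
                    arrows = "to",
                    color  = c("green", "gray", "red"))

# Render only if both packages are available, so the script never fails hard.
if (requireNamespace("visNetwork", quietly = TRUE) &&
    requireNamespace("htmlwidgets", quietly = TRUE)) {
  p <- visNetwork::visNetwork(nodes, edges)
  p <- visNetwork::visOptions(
    p, highlightNearest = list(enabled = TRUE, degree = 1, hover = TRUE))

  # Write the widget to a temporary HTML file that opens in any browser.
  out_file <- file.path(tempdir(), "network_preview.html")
  htmlwidgets::saveWidget(p, out_file, selfcontained = FALSE)
  message("Preview written to ", out_file)
}
```

Opening the generated HTML file in a browser gives a quick check of colors, sizes, and hover behavior without a round trip through Power BI.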

Step 3: Prepare the Visualization for Power BI

Now that we have finished coding in R, we must make certain changes in our supporting JSON files to prepare the visualization for use in Power BI.

Let’s start with the capabilities.json file. It includes most of the information you see in the Visualizations tab for a visual, such as our extension’s data sources and other settings. First, we need to update dataRoles and replace the existing value with new data roles for our dataset and num_records inputs:

# ...
  "dataRoles": [
    {
      "displayName": "dataset",
      "description": "Connection Details - From, To, # of Connections, Sentiment Color, To Node Type Color",
      "kind": "GroupingOrMeasure",
      "name": "dataset"
    },
    {
      "displayName": "num_records",
      "description": "number of records to keep",
      "kind": "Measure",
      "name": "num_records"
    }
  ],
# ...

In our capabilities.json file, let’s also update the dataViewMappings section. We’ll add conditions that our inputs must adhere to, as well as update the scriptResult to match our new data roles and their conditions. See the conditions section, along with the select section under scriptResult, for changes:

# ...
 "dataViewMappings": [
    {
       "conditions": [
        {
          "dataset": {
            "max": 20
          },
          "num_records": {
            "max": 1
          }
        }
      ],
      "scriptResult": {
        "dataInput": {
          "table": {
            "rows": {
              "select": [
                {
                  "for": {
                    "in": "dataset"
                  }
                },
                {
                  "for": {
                    "in": "num_records"
                  }
                }
              ],
              "dataReductionAlgorithm": {
                "top": {}
              }
            }
          }
        },
# ...

Let’s move on to our dependencies.json file. Here, we will add three additional packages under cranPackages so that Power BI can identify and install the required libraries:

{
  "name": "data.table",
  "displayName": "data.table",
  "url": "http://cran.r-project.org/web/packages/data.table/index.html"
},
{
  "name": "DiagrammeR",
  "displayName": "DiagrammeR",
  "url": "http://cran.r-project.org/web/packages/DiagrammeR/index.html"
},
{
  "name": "visNetwork",
  "displayName": "visNetwork",
  "url": "http://cran.r-project.org/web/packages/visNetwork/index.html"
},

Note: Power BI should automatically install these libraries, but if you encounter library errors, try running the following command in your R console:

install.packages(c("DiagrammeR", "htmlwidgets", "visNetwork", "data.table", "xml2"))
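
If you would rather install only what is missing, a sketch like the following checks each package first. The helper name missing_packages is hypothetical, and the installation step is limited to interactive sessions so that a scripted run only reports what it finds. Run it in the same R installation that Power BI Desktop is configured to use.

```r
# Return the subset of `pkgs` that is not installed (hypothetical helper).
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs,
                      function(p) nzchar(system.file(package = p)),
                      logical(1))
  pkgs[!installed]
}

required <- c("DiagrammeR", "htmlwidgets", "visNetwork", "data.table", "xml2")
todo <- missing_packages(required)

if (interactive() && length(todo) > 0) {
  # Install only the packages that are not already present.
  install.packages(todo)
} else {
  message("Missing packages: ",
          if (length(todo) > 0) paste(todo, collapse = ", ") else "none")
}
```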

Lastly, let’s add relevant information for our visual to the pbiviz.json file. I’d recommend updating the following fields:

The visual’s description field
The visual’s support URL
The visual’s GitHub URL
The author’s name
The author’s email

With our files updated, we must repackage the visualization from the command line:

pbiviz package

On successful execution of the code, a .pbiviz file should be created in the dist directory. The entire code covered in this tutorial can be viewed on GitHub.

Step 4: Import the Visualization Into Power BI

To import your new visualization in Power BI, open your Power BI report (either one for existing data or one created during our Prerequisite step with test data) and navigate to the Visualizations tab. Click the … [more options] button and select Import a visual from a file. Note: You may need to first select Edit in a browser in order for the Visualizations tab to be visible.

A pane appears with the title "Visualizations" and two ">" arrows to its right. Below, the text "Build visual" with two images below it: two yellow rectangles and a line on the left, and a paper and paintbrush on the right. The two yellow rectangles image is selected and below it has a panel of more than 30 various graph icons. The last icon is an ellipsis, which has the hover text "Get more visuals." Below the icons panel, the text "Values" with a line of text below that reads: "Add data fields here." Below that, the text "Drill through," followed by "Cross-report" with an "Off" radio button selected next to it.

Navigate to the dist directory of your visualization folder and select the .pbiviz file to seamlessly load your visual into Power BI.

Step 5: Create the Visualization in Power BI

The visualization that you imported is now available in the visualizations pane. Click on the visualization icon to add it to your report, and then add relevant columns to the dataset and num_records inputs:

You can add additional text, filters, and features to your visualization depending on your project requirements. I also recommend that you go through the detailed documentation for the three R libraries we used to further enhance your visualizations, since our example project cannot cover all use cases of the available functions.

Upgrading Your Next Social Network Analysis

Our final result is a testament to the power and efficiency of R when it comes to creating custom Power BI visualizations. Try out social network analysis using custom visuals in R on your next dataset, and make smarter decisions with comprehensive data insights.

The Toptal Engineering Blog extends its gratitude to Leandro Roser for reviewing the code samples presented in this article.

Understanding the basics

What is Power BI used for?
Power BI helps you create dashboards with interactive data visualizations that can be used to monitor real-time metrics, analyze data, and make business decisions.
Is Power BI difficult to learn?
Power BI is not difficult to learn, especially if you have experience with other data visualization tools. The UI is intuitive and there are plenty of online resources to help you get started. However, there is a learning curve involved in mastering all Power BI features and capabilities.
Why is social network analysis important?
Social network analysis can be used to understand the relationships between individuals within a group. This information can be used to perform targeted marketing and outreach efforts, study the spread of information, and understand the structure of the social network.
What are the basic steps in social network analysis?
First, choose a social network to analyze. Then, define what constitutes a connection between two individuals. Next, identify all individuals in the social network. Then, identify all of the connections between individuals. Finally, analyze the connections to find patterns or trends.
What is visualization in social network analysis?
Visualization in social network analysis is the process of mapping out relationships and patterns in data in order to better understand the underlying structure of a social system. This can be done using a variety of methods, including social network diagrams, node-link diagrams, and matrices.
Can you use R with Power BI?
Yes, you can create custom visuals in Power BI using R code. Microsoft’s pbiviz tool simplifies the process by providing the required infrastructure with just a few commands.

Bharat Garg

Verified Expert in Engineering

Delhi, India

Member since June 24, 2020
