Running Robyn in R

October 8, 2021
Michael Taylor

I recently had a thread go viral on Twitter with 170k impressions and 35k likes/comments. What got people so excited? Statistics. Specifically Facebook’s use of statistics to solve a tracking problem. Facebook’s Robyn is an open-source project from their marketing science team that automates Marketing Mix Modeling (MMM), and I was live tweeting an obscure webinar about it.

What is Marketing Mix Modeling?

Marketing Mix Modeling (MMM) is a technique from the 1960s that's still used to this day to attribute TV and offline spending by most of the Fortune 500. It’s being reinvented for the digital age by startups like Harry’s, ThirdLove and Monday.com, who are solving for the high cost, slow speed and human bias you get with traditional methods. So why are Facebook interested?

Why is Facebook using MMM?

Because it doesn’t rely on user-level data, MMM is a potential holy grail: completely future-proof against whatever happens with privacy legislation, ad blockers or Apple’s tracking policies. So while Google had already been publishing papers on MMM, it’s a big vote of confidence that Facebook is getting in the game. The fact that it’s open source increases trust.

What happened to click tracking?

There had been cracks forming in click-based attribution methods for a while. We knew last-click attribution had flaws, we successfully ignored the rise of ad blockers, but when startups grew up and expanded to offline channels and TV, there was no ‘click’ to track. So when iOS 14.5 made it easy to opt out of tracking, it was the beginning of the end for digital attribution.

How do I use MMM?

Most of the people interested in Robyn from that Twitter thread were overwhelmed. They came for a solution to a problem, not a statistics lesson. If you’re spending less than $100k per year on ads across 1 or 2 digital channels, you likely don’t need Robyn. You can build a marketing mix model in an afternoon with Excel / GSheets, as I detailed in my Econometrics guide. To keep learning about MMM, head to my simulator-based courses on Vexpower.
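To give a feel for what a spreadsheet-style marketing mix model does, here is a minimal sketch in R: a plain linear regression of sales on channel spend. Everything in it is an assumption for illustration: the data is simulated, and the channel names and "true" effects are made up. It also leaves out the adstock and saturation effects that a tool like Robyn models.

```r
# Simulated weekly data (made up for illustration, not real results).
set.seed(42)
weeks <- 104
facebook_spend <- runif(weeks, min = 1000, max = 5000)
tv_spend <- runif(weeks, min = 0, max = 10000)

# Assumed "true" relationship: baseline sales plus a return per dollar
# spent on each channel, plus random noise.
sales <- 20000 + 2.0 * facebook_spend + 0.5 * tv_spend +
  rnorm(weeks, mean = 0, sd = 2000)

# The model itself: ordinary least squares, the same calculation as
# LINEST in Excel / GSheets.
model <- lm(sales ~ facebook_spend + tv_spend)

# Each coefficient estimates the sales generated per dollar of spend.
print(round(coef(model), 2))
```

The estimated coefficients should land close to the values used to simulate the data, which is the whole idea: the regression recovers each channel's contribution without any user-level tracking.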

Should I use Robyn?

If you are a larger advertiser spending $100k+ per month, across 3 or more channels, especially offline channels, then you should consider Robyn. Even if you’re already using a marketing mix model built in-house or with a vendor, you can compare the results to your own. It’s free apart from the time your analysts spend on it, which will be negligible if you send them this guide!

How do I run Robyn?

I’ve been coding for 6 years (JavaScript & Python), and spent last year building vexpower.com, so I’m technical: yet it still took me over half a day to get it running on my Windows machine. So I decided to put this guide out there to help you get started. If you’re on a Mac the steps are the same but the instructions should be slightly easier.

*** Disclaimer: there’s all sorts of advanced computer stuff going on here that I don’t understand. I could be giving you bad advice that could lead to your computer being hacked. If you get stuck, Google around or ask a friendly IT person, then let me know so I can update this. ***

Install Anaconda

This isn’t strictly needed to run Robyn, and it is a pretty large download, but it’s useful for lots of things. I use it regularly for running Jupyter Notebooks, which are great for showing your work when completing data science tasks in Python. We’re going to use it to create a virtual environment for R to run Python in (don’t ask - we’ll get to this later).

> Download Anaconda

Install R

R is a programming language used primarily by statisticians and data scientists. It’s the coding language most marketing mix modeling experts use if they aren’t just using Excel. I have no idea how it works, and you don’t have to either. Don’t get confused by the ‘CRAN mirror’ wording: they just mean download the package from somewhere it’s hosted that’s near you.

> Download R

Install R-Studio

R-Studio is an IDE (Integrated Development Environment) that lets you write code in R and run it. You need it to run the Robyn script and you can actually get it from inside Anaconda... BUT that didn’t work for me, so I recommend you download it directly.

> Download R-Studio

Check R Version

Something small to check off the todo list: make sure you’re using the right version of R (and run your first R code!). In the console of R-Studio, type the word ‘version’ and hit enter. The output should include a version number of 4.0.0 or above, which tells you that you did the last two parts correctly.

```version```
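If you'd rather not read through the full output, a minimal sketch like the one below does the comparison for you. It only uses functions built into R; the 4.0.0 threshold is the one mentioned above, and the variable names are my own.

```r
# Minimum R version mentioned in this guide.
min_version <- "4.0.0"

# getRversion() returns the version of R you are currently running.
r_ok <- getRversion() >= min_version

message(
  "Running R ", as.character(getRversion()),
  if (r_ok) " (new enough)" else " (please upgrade)"
)
```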

Install Robyn

This part is actually really easy. In R-Studio just add the lines below in the script window, select the code you just copied and pasted, and press ‘Run’ (or Ctrl + Enter keys). This part used to be a lot harder but they packaged the code up in this library as part of their version 3 release.

```install.packages('remotes')
remotes::install_github("facebookexperimental/Robyn/R")```
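To confirm the install worked before moving on, you can run a quick check like this sketch. It only uses base R functions and assumes the package is registered under the name `Robyn`.

```r
# TRUE if the Robyn package can be found in your R library.
robyn_installed <- requireNamespace("Robyn", quietly = TRUE)

if (robyn_installed) {
  message("Robyn version: ", as.character(utils::packageVersion("Robyn")))
} else {
  message("Robyn not found; try running the install lines again.")
}
```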

Install Nevergrad

Ok this is where it gets tricky. This is what had me stuck for half a day (a fate you can hopefully avoid, especially if you have a Mac). Nevergrad is an optimization library that uses a derivative-free, evolutionary approach, which makes it useful for a wider range of problems. First complication: Nevergrad is a Python library (yay!) that Robyn needs to run in R (boo!).

Install Nevergrad Step 1: Install Reticulate

This part is easy. You install and load reticulate, which is an R library that lets you run Python libraries from your R code. As we did with the Robyn package, copy these lines into R-Studio and run them.

```install.packages('reticulate')
library(reticulate)```

Install Nevergrad Step 2: Conda Install

This is where we use Anaconda. If you want to use pip, there’s an alternative method listed in the Robyn quickstart docs that you can try (didn’t work for me). We run the below lines, again in R-Studio. The first line creates a virtual environment called ‘r-reticulate’, the second line uses that environment to install nevergrad, then the third line lets us use it going forward.

```conda_create("r-reticulate") # must run this line once
conda_install("r-reticulate", "nevergrad", pip=TRUE)
use_condaenv("r-reticulate")```
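A quick way to tell whether it worked is to ask reticulate if it can see the Nevergrad module. This is a minimal sketch, assuming you installed reticulate and created the ‘r-reticulate’ environment as above; any error is caught and reported rather than stopping your session.

```r
# Returns TRUE only if Python in the 'r-reticulate' environment can find nevergrad.
nevergrad_ok <- tryCatch({
  reticulate::use_condaenv("r-reticulate", required = TRUE)
  reticulate::py_module_available("nevergrad")
}, error = function(e) {
  # Typical causes: reticulate not installed, or the environment doesn't exist.
  message("Check failed: ", conditionMessage(e))
  FALSE
})

message("Nevergrad available: ", nevergrad_ok)
```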

If that worked the first time, consider yourself lucky! You can skip the next step. Otherwise fear not, I’ve been down this rabbit hole, and I can guide you through it.

Install Nevergrad Step 3/4/5/6/n: It’s Not Working!?

The first thing to try is what the Robyn team suggests – locate your Python file then run this line:

```use_python("~/this/is/your/path/to/python")```

Ok, but stupid question, how do you locate your Python file? Well you open the terminal for the environment you want to know this for, and then type the following command (on a Mac, use `which python` instead):

```where python```

On Windows, you can open a terminal by searching your start menu for ‘terminal’.

If you’re on Windows, remember to replace each slash in the path with two backslashes. So for example:

```use_python("~\\python")```

However if you’re using Anaconda, you might want to go with that one.

```use_python("~\\Hammer\\anaconda3\\python")```
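If doubling up backslashes by hand feels error-prone, one alternative is to let R convert the path for you, since R on Windows also accepts forward slashes. Here is a minimal sketch; the path is a hypothetical example (swap in whatever `where python` printed for you), and the `r"(...)"` raw string syntax needs R 4.0.0 or above.

```r
# Hypothetical path, pasted exactly as a Windows terminal would print it.
raw_path <- r"(C:\Users\example\anaconda3\python.exe)"

# Swap every backslash for a forward slash.
r_path <- gsub("\\", "/", raw_path, fixed = TRUE)
print(r_path)

# Then point reticulate at it (uncomment once the path is your own):
# reticulate::use_python(r_path)
```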

You’ll know it’s working when you can run the following lines without it failing:

```conda_install("r-reticulate", "nevergrad", pip=TRUE)
use_condaenv("r-reticulate")```

If that doesn’t work, or you don’t see the Anaconda one in your Terminal, don’t worry, you can open up a terminal for any environment you like from Anaconda Navigator, which is kind of nice. Choose your environment (I selected the ‘r-reticulate’ one we created from R-Studio) and click Launch for the ‘CMD.exe Prompt’ app.

Now you have a terminal open just for that environment, and you can type the same `where` command again to check where your version of Python is.

```where python```

You would enter that first one in R-Studio as:

```use_python("~\\Hammer\\anaconda3\\envs\\r-reticulate\\python")```

Oh and while you’re here, you might also just want to try installing Nevergrad directly. So for example you can do:

```pip install nevergrad```

Then you can skip the first and second lines in R-Studio and just try the following line before moving on to test the demo code.

```use_condaenv("r-reticulate")```

Why am I giving you so many options? Because I tried everything and honestly I’m not 100% sure what worked! Perhaps someone smarter than me can chime in and tell me which was the right one. In any event, this wasn’t enough, because I also had to fix the SSL issue. 

This is where things got a little scary. If you have a Mac, you almost definitely don’t have this problem (and if you do I can’t help you). Brave Windows users, read on. Linux users… I don’t know why you’re reading this guide, you already know more than me. 

Still with me? If you read this Robyn quickstart doc it sends you here (dead end) and to this Stack Overflow comment, which takes you to this guide, which leads you to… ok take a deep breath and look at this website.



I know, it feels like we’re getting hacked just looking at it. Well my solution requires you to actually download some software from this website. If you’re uncomfortable with this, now is the time to run crying to IT support. 

For those of you left, I want you to say the words “maximum effort” in the mirror and click the EXE link next to where it says Win64 OpenSSL v1.1.1L. This is what I clicked and I haven’t been hackedski so farski. Who can read this? I adore Vladimir Putin!

If you survived that part, double click on what you just downloaded to run the installer. When it asks you, tell it to install it outside the system directory. Remember where you installed it, and then follow this handy guide to add that to your user path. If that doesn’t work for you, then I don’t know what to tell you: Silicon Valley developers don’t care about Windows users.
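One way to sanity-check the path change is to ask R whether it can now find the OpenSSL program. This is a minimal sketch using only base R; it assumes the folder you added to your path is the one containing `openssl.exe`, and that you have restarted R-Studio since making the change, because environment variables are read when a program starts.

```r
# Sys.which() returns the full path to a program on your PATH, or "" if not found.
openssl_path <- Sys.which("openssl")

if (nzchar(openssl_path)) {
  message("OpenSSL found at: ", openssl_path)
} else {
  message("OpenSSL not found on your PATH; double check the folder you added.")
}
```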

Run The Demo

Hooray! You made it here either as an embattled but victorious Windows user, or a Mac user that got to skip the hard part. Either way, this is where the fun begins. The Robyn team made a nicely commented demo script for us with demo data so we can run it and see how it works. Just visit this link and copy and paste the whole contents into R-Studio.

> Demo Script


You’ll want to go through this line by line (that’s actually one of the nice things about R-Studio), so you can see what’s happening and what you might need to modify when you run it yourself. As you run each line the output will show up in the console, and as you define them, variables will appear in the environment window on the right.

I recommend you go through and read the comments (in green) before you run anything, and then read them again to follow the instructions as you run each line. Once you get to line ~280, step 3, you’re actually running the model! The bad news is that it takes about an hour or so to run (depending on your computer, and what you modified).

Once Robyn is finished, you get the results in a folder in your parent directory (regardless of what working directory you set) called something like `2021-10-06 23.23 init`. This folder contains a one-page summary of each chosen model saved as a PNG, as well as a number of other files detailing the models it ran and which ones it deemed best. Note that Robyn runs thousands of models and then narrows that down for you to about a hundred that fit the data best. From there you have to pick the one you think best represents your business (a topic for another time!).
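With around a hundred one-pagers to look through, it can help to list them from R rather than clicking around in your file explorer. Here is a minimal sketch using base R only; the folder name is the example from above and is an assumption, so replace it with the full path to the folder Robyn actually created for your run.

```r
# Assumed folder name; replace with the path to your own results folder.
results_dir <- "2021-10-06 23.23 init"

# Find every PNG in that folder (returns an empty list if the folder isn't there).
one_pagers <- list.files(
  results_dir,
  pattern = "\\.png$",
  full.names = TRUE,
  ignore.case = TRUE
)

message(length(one_pagers), " model one-pagers found")
print(head(one_pagers))
```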

Congratulations! If you made it this far, you’ve managed to get Robyn running. As you may have noticed (particularly if you followed me deep into Nevergrad step 3), I don’t think this software is particularly… accessible in its current form. The vast majority of marketers can’t be depended on to set up development environments or run R code (Python version please! 🙏).

However the Robyn team are making improvements and their commitment to open source is commendable: they could have easily kept this as a proprietary tool for their big clients and instead it’s out there for us to use for free. This was a smart strategic move, because who’s going to trust a black box modeling tool that says “spend more on Facebook”?

Because it’s open source, however challenging it is to use, I hope my guide inspires you to play with the tool, learn from it, maybe even contribute improvements to it. If we can bring marketing mix modeling to the modern age by automating it and removing human bias, then that’ll be a huge help to all marketers, who are just trying to prove the value of their campaigns.

If you want to learn a bit more about how to interpret the model, the best place is my simulator-based course on Vexpower.com: https://app.vexpower.com/sim/can-we-try-facebook-robyn/