Pull Assets To Local Storage With Python

Export assets from Content Builder to local storage using the REST API.

Jason Hanshaw
18 minute read

In my experience developing in Salesforce Marketing Cloud, one of the more frustrating aspects has been the lack of real version control for assets in the platform. Most solutions involve either complicated integrations or hacky workarounds to prevent unwanted edits, deletions or path changes. Thankfully, the Content Builder REST API provides us with everything we need to create this functionality ourselves.

I'll be going into the complete method of creating your own git-like wrapper for Content Builder assets in a future post but, for now, let's take a look at a piece of that functionality that has broad applications for how we manage our content.

Pulling To Local Storage

In this post, we'll be looking at a Python script that prompts the user for an asset type and a folder name, then downloads the matching Content Builder assets into that folder. Before we can look at the script, we'll need to configure our system.

Prerequisites

  • Python 3.8.1
  • Installed Package set up in your SFMC Account

If you don't currently have Python installed on your local system, here is a guide on how to get set up:

Python 3 Installation Guide

If you are unsure of how to create and install a package in your SFMC account, please refer to the following documentation for more information:

Create and Install Packages

Getting Started

Now that we have both Python and an SFMC package created and installed, we'll start configuring project-specific items. We need to set up our project folder, add libraries and set some environment variables that we can use to make requests to SFMC.

First, let's create a folder on our desktop called python-pull-assets. This will be our project folder and will house our files. After that, let's create a new file in that directory called .ENV. (python-decouple looks for a file named .env, so use the lowercase name on case-sensitive file systems such as Linux.) Here, we'll set some environment variables for our API calls so that we can authenticate and access the assets within Content Builder. You'll need to replace the values in this file with the information from the installed package you set up in SFMC.

/python-pull-assets/.ENV

clientID = YOUR CLIENT ID
clientSecret = YOUR CLIENT SECRET
authURL = YOUR AUTH URL
baseURL = YOUR BASE REST URL
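
Before moving on, it can help to confirm that the file contains all four keys. The following is a minimal sketch using only the standard library and is not part of the original script: `read_env_file` and `missing_keys` are hypothetical helper names, and the parser only handles the simple KEY = VALUE lines shown above.

```python
from pathlib import Path

# The four settings the download script reads from the file.
REQUIRED_KEYS = ("clientID", "clientSecret", "authURL", "baseURL")


def read_env_file(path):
    """Parse simple KEY = VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so URLs containing "=" stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Calling `missing_keys(read_env_file(".ENV"))` from the project folder should return an empty list once every value has been filled in.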

Now that we have our project directory and environment variables created, let's add some libraries that the script will rely on. First, we'll add the requests library. This will make it easier for us to form our REST calls to SFMC. To do this, let's open a command prompt and navigate to our project folder:

cd Desktop
cd python-pull-assets

Then, we'll add the requests library using pip:

pip install requests

Let's also add the decouple library. This will allow us to import the variables from our .ENV file into our script:

pip install python-decouple

That's all we need for our initial configuration! Now, let's take a look at the actual scripting necessary to pull Content Builder assets to local storage.

Adding Python Script

We'll break down the script into smaller blocks and analyze each one but, first, let's take a look at the full script (in case you want to skip this section and just dive in):

/python-pull-assets/downloadAssets.py

from decouple import config
import requests, json, os, io, shutil

# set req vars
clientID = config("clientID")
clientSecret = config("clientSecret")
authURL = config("authURL")
baseURL = config("baseURL")

# prepare the target directory
if 'assets' not in os.listdir(os.getcwd()):
    os.makedirs('assets')
print("\n/**** SFMC ASSET PULL ****/\n")
# get asset id for api call
assetID = int(input("\nENTER ASSET ID: "))
# get folder name in order to build path
folderName = input("\nENTER FOLDER NAME: ")
filePath = "assets/" + folderName + "/"
# exist_ok avoids an error when the folder is already there
os.makedirs(filePath, exist_ok=True)
#get access token for api call
def getToken():
    payload = {
        "clientSecret": clientSecret,
        "clientId":clientID
    }
    print("\nAuthenticating...\n")
    res = requests.post(authURL, data=payload)
    res.raise_for_status()
    print("Done.")
    return json.loads(res.text)["accessToken"]

accessToken = getToken()
headers = {
    'content-type': 'application/json',
    'Authorization': 'Bearer ' + accessToken
}
# content request payload
contentRequest = {
    "query": {
        "property": "assetType.id",
        "simpleOperator": "equals",
        "valueType": "int",
        "value": assetID
    },
    "page": {
        "pageSize": 500
    }
}
print("Getting assets...\n")
# get asset data from Content Builder rest api
res = requests.post(baseURL + 'asset/v1/content/assets/query', data=json.dumps(contentRequest, separators=(',', ':')), headers=headers)
res.raise_for_status()
print("Done\n")

assetsJSON = res.text
assetLoad = []
numOfAssetsFound = 0
print("Downloading Library\n")
assetsList = json.loads(assetsJSON)["items"]
for asset in assetsList:
    # append html extension if asset type id greater than 194
    fileIDExt = filePath + asset["name"].replace("/","_")
    if assetID > 194:
        fileID = fileIDExt + ".html"
    else:
        fileID = fileIDExt
    # if template-based, html or text email (assetID is an int, so compare to ints)
    if assetID in (207, 208, 209):
        # email markup is nested under views.html
        content = asset.get("views", {}).get("html", {}).get("content")
        if content is not None:
            # write asset to local storage
            with io.open(fileID, 'w', encoding="utf-8") as targetFile:
                targetFile.write(str(content))
    # if not in the criteria above but asset id is of type "block"
    elif assetID > 194:
        if "content" in asset:
            # write asset to local storage
            with io.open(fileID, 'w', encoding="utf-8") as targetFile:
                targetFile.write(str(asset["content"]))
    else:
        assetURL = str(asset["fileProperties"]["publishedURL"])
        # retrieve asset from the web and download to local storage
        response = requests.get(assetURL, stream=True)
        response.raise_for_status()
        with open(fileID, 'wb') as out_file:
            shutil.copyfileobj(response.raw, out_file)
        del response
    numOfAssetsFound += 1
print(str(numOfAssetsFound) + " assets found\n")
print("Data successfully pulled to local storage")
exit()

For the first portion of our script, we are importing the necessary libraries we'll need along with the environment variables for our SFMC API calls.

from decouple import config
import requests, json, os, io, shutil

# set req vars
clientID = config("clientID")
clientSecret = config("clientSecret")
authURL = config("authURL")
baseURL = config("baseURL")

Next, we'll check to see if an assets folder already exists and, if it doesn't, we'll go ahead and create that directory. Then, we'll prompt the user for the ID of the Asset Type they want to pull and the name of the folder in which to store these assets. We can then use this data to create both the file path variable and the necessary folder structure.

Note: You can find a list of Asset Types and their corresponding IDs here.

# prepare the target directory
if 'assets' not in os.listdir(os.getcwd()):
    os.makedirs('assets')
print("\n/**** SFMC ASSET PULL ****/\n")
# get asset id for api call
assetID = int(input("\nENTER ASSET ID: "))
# get folder name in order to build path
folderName = input("\nENTER FOLDER NAME: ")
filePath = "assets/" + folderName + "/"
# exist_ok avoids an error when the folder is already there
os.makedirs(filePath, exist_ok=True)
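
Because the folder name comes straight from user input, it may be worth validating it before building the path. The following is a minimal sketch rather than part of the original script: `prepare_target_dir` is a hypothetical helper, and the rule it enforces (a single directory name with no path separators) is an assumption about what you want to allow.

```python
from pathlib import Path


def prepare_target_dir(folder_name, base="assets"):
    """Create base/folder_name if needed and return it as a Path."""
    name = folder_name.strip()
    # Reject empty names and anything that would escape the base folder.
    if not name or name in (".", "..") or "/" in name or "\\" in name:
        raise ValueError("folder name must be a single directory name")
    target = Path(base) / name
    # exist_ok=True makes repeated runs with the same folder name safe.
    target.mkdir(parents=True, exist_ok=True)
    return target
```

The returned path could then stand in for the filePath string, for example `str(prepare_target_dir(folderName)) + "/"`.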

With our folder structure in place, and our user input captured, we can make an API call to SFMC to generate an access token for authenticating into the platform. In our example below, we are using the legacy endpoint/authentication to generate our access token. Newer packages using the v2/token endpoint will require a different payload and token retrieval path. For more information on this please refer to the documentation on Server-to-Server Integrations with Client Credentials Grant Type.

#get access token for api call
def getToken():
    payload = {
        "clientSecret": clientSecret,
        "clientId":clientID
    }
    print("\nAuthenticating...\n")
    res = requests.post(authURL, data=payload)
    res.raise_for_status()
    print("Done.")
    return json.loads(res.text)["accessToken"]

accessToken = getToken()
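
For packages that use the v2/token endpoint, the request body and the token field are named differently. The following is a minimal sketch of those differences, not a drop-in replacement: `build_v2_token_payload` and `read_v2_token` are hypothetical helper names, and the optional `account_id` argument is only needed when you want to target a specific business unit.

```python
def build_v2_token_payload(client_id, client_secret, account_id=None):
    """Build the JSON body for a client-credentials token request."""
    payload = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    # Only include account_id when the caller supplies one.
    if account_id is not None:
        payload["account_id"] = account_id
    return payload


def read_v2_token(response_body):
    """Pull the token out of a parsed v2/token response."""
    # The v2 response uses snake_case, unlike the legacy "accessToken".
    return response_body["access_token"]
```

With requests, this would look roughly like `res = requests.post(authURL, json=build_v2_token_payload(clientID, clientSecret))` followed by `read_v2_token(res.json())`, assuming authURL points at your tenant's v2/token address.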

Let's take the access token generated in our last snippet and make a request to the SFMC Content Builder API to return all assets that match the Asset Type ID that the user entered. With our contentRequest object declaration, we are specifying the advanced query logic that will filter the data we return from SFMC. This query can include nested AND/OR logic over the asset properties that appear in a successful response.
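
To illustrate the nested logic, here is a hedged sketch of a compound query that combines two conditions with AND. The `type_and_name_query` helper is hypothetical, and filtering on the name property with the "like" operator is an assumption you should check against the Content Builder query documentation for your account.

```python
import json


def type_and_name_query(type_id, name_fragment, page_size=50):
    """Build a query body matching an asset type AND a name fragment."""
    return {
        "query": {
            # Same condition the script uses on its own.
            "leftOperand": {
                "property": "assetType.id",
                "simpleOperator": "equals",
                "valueType": "int",
                "value": type_id,
            },
            "logicalOperator": "AND",
            # Assumed second condition: asset name contains a fragment.
            "rightOperand": {
                "property": "name",
                "simpleOperator": "like",
                "value": name_fragment,
            },
        },
        "page": {"page": 1, "pageSize": page_size},
    }


# Serialize the body the same way the script does before posting it.
body = json.dumps(type_and_name_query(208, "Welcome"), separators=(",", ":"))
```

Either operand can itself be another leftOperand/logicalOperator/rightOperand object, which is how deeper nesting is expressed.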

We can also set the page number and page size of the data that we wish to return. Pagination is not included in this example but, unless the number of assets for a given type is exceedingly large, you should be able to get by with setting the pageSize value to a large upper limit to return all results in one page.

headers = {
    'content-type': 'application/json',
    'Authorization': 'Bearer ' + accessToken
}
# content request payload
contentRequest = {
    "query": {
        "property": "assetType.id",
        "simpleOperator": "equals",
        "valueType": "int",
        "value": assetID
    },
    "page": {
        "pageSize": 500
    }
}
print("Getting assets...\n")
# get asset data from Content Builder rest api
res = requests.post(baseURL + 'asset/v1/content/assets/query', data=json.dumps(contentRequest, separators=(',', ':')), headers=headers)
res.raise_for_status()
print("Done\n")
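
If a single page is not enough, the request can be repeated with an increasing page number. The following is a minimal sketch of that loop under stated assumptions: `fetch_all_items` and its `fetch_page` callback are hypothetical names, and the stop condition assumes that a page shorter than the requested size is the last one.

```python
def fetch_all_items(fetch_page, page_size=50):
    """Collect items from every page of a query.

    fetch_page(page_number, page_size) must return the parsed JSON body
    of one query response, with its results under "items".
    """
    items = []
    page_number = 1
    while True:
        body = fetch_page(page_number, page_size)
        batch = body.get("items", [])
        items.extend(batch)
        # A short (or empty) page means there is nothing left to request.
        if len(batch) < page_size:
            break
        page_number += 1
    return items
```

With requests, `fetch_page` could set `contentRequest["page"] = {"page": page_number, "pageSize": page_size}`, post it to the query endpoint as above, and return `res.json()`.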

Now that we have taken our user's input, generated an access token and made our request to SFMC for the specified Asset type, it's time to parse the response from SFMC and save our assets to our local folder.

First, we'll take the response and find the nested level containing the information we want to retrieve using json.loads(assetsJSON)["items"]. Then, we'll iterate through the asset list and push the content from each file to local storage.

Since the structure of the data differs depending on the type of asset that you are pulling, we'll need to use some simple if/else statements to let our script know exactly where and how to pull the correct information from the JSON. In the snippet below, you can see that the nested data structure for IDs 207, 208 and 209 (template-based, html and text emails respectively) isn't the same as that of the standard "block" types in SFMC.

Different again is our method of parsing and saving non-HTML content (images, documents, etc.). For this, we'll need to grab the URL of the asset within Content Builder and make a separate request to retrieve that object and write its output to storage.

assetsJSON = res.text
assetLoad = []
numOfAssetsFound = 0
print("Downloading Library\n")
assetsList = json.loads(assetsJSON)["items"]
for asset in assetsList:
    # append html extension to file if asset type id greater than 194 (type "block")
    fileIDExt = filePath + asset["name"].replace("/","_")
    if assetID > 194:
        fileID = fileIDExt + ".html"
    else:
        fileID = fileIDExt
    # if template-based, html or text email (assetID is an int, so compare to ints)
    if assetID in (207, 208, 209):
        # email markup is nested under views.html
        content = asset.get("views", {}).get("html", {}).get("content")
        if content is not None:
            # write asset to local storage
            with io.open(fileID, 'w', encoding="utf-8") as targetFile:
                targetFile.write(str(content))
    # if not in the criteria above but asset id is of type "block"
    elif assetID > 194:
        if "content" in asset:
            # write asset to local storage
            with io.open(fileID, 'w', encoding="utf-8") as targetFile:
                targetFile.write(str(asset["content"]))
    else:
        assetURL = str(asset["fileProperties"]["publishedURL"])
        # retrieve asset from the web and download to local storage
        response = requests.get(assetURL, stream=True)
        response.raise_for_status()
        with open(fileID, 'wb') as out_file:
            shutil.copyfileobj(response.raw, out_file)
        del response
    numOfAssetsFound += 1
print(str(numOfAssetsFound) + " assets found\n")
print("Data successfully pulled to local storage")
exit()
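
To make the different data shapes easier to see, the branching can be pulled into a small function and tried against hand-written dictionaries. This is a minimal sketch, not the script's own code: `extract_markup` is a hypothetical helper, and the sample assets in the usage lines are invented stand-ins that only mirror the keys the script reads.

```python
EMAIL_TYPE_IDS = (207, 208, 209)


def extract_markup(asset, type_id):
    """Return the text to save for an email or block asset, or None."""
    if type_id in EMAIL_TYPE_IDS:
        # Emails nest their markup under views -> html -> content.
        return asset.get("views", {}).get("html", {}).get("content")
    if type_id > 194:
        # Blocks keep their markup in a top-level content field.
        return asset.get("content")
    # File-based assets have no inline markup; they are downloaded by URL.
    return None


# Invented sample assets, shaped like the fields the script accesses.
sample_email = {"name": "Welcome", "views": {"html": {"content": "<p>Hi</p>"}}}
sample_block = {"name": "Footer", "content": "<div>Footer</div>"}
email_markup = extract_markup(sample_email, 208)
block_markup = extract_markup(sample_block, 197)
```

Returning None for anything without inline markup lets the caller decide whether to skip the asset or fall back to the URL download.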

All that's left to do is save the full script inside of our python-pull-assets folder with a file name of downloadAssets.py. Once that is saved, we can go back to the command prompt and simply run the following:

py downloadAssets.py

You should be presented with a prompt for the Asset ID and folder name. After successful input and script execution, navigate to assets/YOUR_FOLDER_NAME in your project directory to view the assets that you've retrieved from Content Builder.

Conclusion

Utilizing this approach should allow you to maintain consistent backups of your Content Builder content as well as giving you the ability to construct a version-control system that resembles a more modern development workflow. With some simple modifications to the above script, you could also pull journey assets or expand this into a full-fledged application for managing and updating your content.

In addition to this blog post, you can find the code for this example in this GitHub repository.