CI/CD for Submitting URLs to Bing

Recently, I've been thinking about improving the SEO for this blog, and I came across the Bing URL submission API. This API lets website owners submit new pages of their site to Bing so they can be indexed quickly instead of waiting for the crawler to find them. In this post, we'll go over how to automate this process with GitLab CI.

Here's how the new CI stage works:

  • Diff the files changed in the commit the pipeline is running for
  • Pull out any files from the diff that live under the blog post directory
  • Loop over those posts and make a POST request to the Bing submission API for each one

The pipeline also only runs on the main branch, so it only submits blog posts that are ready.

So let's break down these steps. First off, we'll use the commands below to grab the diff:

git fetch
diff=$(git --no-pager diff --name-only HEAD^ HEAD)

The diff command compares the previous commit (HEAD^) with the current commit (HEAD). On a merge commit, HEAD^ is the first parent, so the diff covers everything the merge brought into main. The --name-only option prints only the names of the changed files, not the changes themselves. We also throw in --no-pager so that git writes straight to stdout instead of opening a pager.
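To see what that command returns, here is a small sketch that builds a throwaway repository with two commits and runs the same diff against it. The repository, file names, and commit messages are made up for the demo; only the git diff invocation comes from the pipeline.

```shell
# Throwaway repository with two commits (all names here are made up).
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email "demo@example.com"
git -C "$repo" config user.name "Demo"

echo "hello" > "$repo/README.md"
git -C "$repo" add .
git -C "$repo" commit -q -m "initial commit"

mkdir -p "$repo/content/blog"
echo "draft" > "$repo/content/blog/hello-world.md"
git -C "$repo" add .
git -C "$repo" commit -q -m "add a post"

# Same command as in the pipeline: names of files changed by the latest commit.
changed=$(git -C "$repo" --no-pager diff --name-only HEAD^ HEAD)
echo "$changed"   # prints content/blog/hello-world.md

rm -rf "$repo"
```

README.md doesn't show up because it wasn't touched by the second commit.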

Next up, we need to figure out if any of the changed files are blog posts:

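# grep exits non-zero when nothing matches; "|| true" keeps a shell running with "set -e" from failing the job
posts=$(echo "$diff" | grep -Po 'content/blog/\K[a-zA-Z\-]*' || true)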

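The above grep command looks for paths under the content/blog directory. The -P flag enables Perl-compatible regular expressions and -o prints only the matched text. The \K drops the content/blog/ prefix from the match, so what's left is the run of letters and - characters that follows, which is the post's slug without its file extension. So far that pattern matches all of my blog post files, but a file name containing digits or underscores would be cut off at the first such character.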
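Here is a quick way to try the pattern without a repository, using a hard-coded list of file names in place of real diff output. The file names are invented, and the second pattern (with 0-9 added to the character class) is my own variation to show the difference, not something the pipeline uses. This assumes a grep built with -P support, such as GNU grep.

```shell
# Stand-in for the output of git diff --name-only (file names are made up).
changed='content/blog/hello-world.md
README.md
content/blog/top-10-tips.md'

# Pattern from the pipeline: letters and "-" only.
strict=$(printf '%s\n' "$changed" | grep -Po 'content/blog/\K[a-zA-Z\-]*' || true)

# Variation that also accepts digits (an assumption, not part of the pipeline).
wide=$(printf '%s\n' "$changed" | grep -Po 'content/blog/\K[a-zA-Z0-9\-]*' || true)

echo "strict:"
echo "$strict"   # hello-world, then top- (cut off at the first digit)
echo "wide:"
echo "$wide"     # hello-world, then top-10-tips
```

README.md is ignored by both patterns because it isn't under content/blog.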

The rest of the pipeline deals with the Submission API request. For this, here's what we are doing:

for post in $posts; do
    cat << EOT > data.json
{
    "siteUrl":"https://caffeinatedcoder.dev",
    "urlList":[
    "https://caffeinatedcoder.dev/blog/${post}"
    ]
}
EOT

    curl -X POST -H "Content-Type: application/json" -d @data.json "https://bing.com/webmaster/api.svc/json/SubmitUrlbatch?apikey=${BING_API_KEY}"
done

This block has a few pieces. First, we loop over every item in the posts variable; if no posts were found, the loop body never runs and nothing is submitted. For every item, we write a JSON payload that the Bing submission API can parse to data.json, with the current post's slug injected into the URL. After that, we run a curl command that POSTs the file to the API to actually submit the new URL.
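Since urlList is an array, one possible variation is to collect every changed post into a single payload instead of sending one request per post. The sketch below only builds and prints the JSON. The build_payload helper is hypothetical, it assumes slugs contain nothing that needs JSON escaping (true for the letters-and-dashes pattern above), and I haven't checked how Bing treats larger batches.

```shell
site_url="https://caffeinatedcoder.dev"

# Hypothetical helper: print one payload covering every slug passed in.
build_payload() {
    urls=""
    for slug in "$@"; do
        # Put a comma before every entry except the first.
        urls="${urls:+$urls,}\"${site_url}/blog/${slug}\""
    done
    printf '{"siteUrl":"%s","urlList":[%s]}\n' "$site_url" "$urls"
}

payload=$(build_payload bing-url-submission hello-world)
echo "$payload"
```

In the pipeline, the payload could then be passed to the same curl command with -d "$payload" in place of -d @data.json. Adding curl's --fail flag would also make the job fail on an HTTP error response rather than pass silently.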

Putting it all together, this is what the GitLab CI stage looks like:

bing-url-submission:
  stage: bing-url-submission
  image: bitnami/git
  script:
  - |
    git fetch
    diff=$(git --no-pager diff --name-only HEAD^ HEAD)
    # grep exits non-zero when nothing matches; "|| true" keeps a shell running with "set -e" from failing the job
    posts=$(echo "$diff" | grep -Po 'content/blog/\K[a-zA-Z\-]*' || true)
    for post in $posts; do
        cat << EOT > data.json
    {
        "siteUrl":"https://caffeinatedcoder.dev",
        "urlList":[
        "https://caffeinatedcoder.dev/blog/${post}"
        ]
    }
    EOT

        curl -X POST -H "Content-Type: application/json" -d @data.json "https://bing.com/webmaster/api.svc/json/SubmitUrlbatch?apikey=${BING_API_KEY}"
    done
  only:
    - main

A few more things to unpack with the full stage YAML. We're using the Bitnami Git image, which also happens to include grep and curl, the other two Linux utilities we need for this new stage. BING_API_KEY is a GitLab CI/CD variable, defined on my GitLab namespace, that stores the Bing API key. And of course, the last couple of lines of the stage make the job run only on the main branch.
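Because the request depends on BING_API_KEY being present, a small guard at the top of the script can make a missing variable obvious. This is only a sketch: require_api_key is a hypothetical helper and the messages are placeholders.

```shell
# Hypothetical guard: succeed only when BING_API_KEY is set and non-empty.
require_api_key() {
    if [ -z "${BING_API_KEY:-}" ]; then
        echo "BING_API_KEY is not set; skipping Bing submission" >&2
        return 1
    fi
    return 0
}

# Bail out early instead of sending a request without a key.
if require_api_key; then
    echo "API key present, continuing"
else
    echo "no API key, nothing submitted"
fi
```

One case where this helps: a GitLab variable marked as protected is only exposed to pipelines on protected branches and tags, so the key can be empty if the job ever runs somewhere else.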

And that's it. With this, every time I merge a branch with new or updated posts into main, their URLs are automatically submitted to the Bing Submission API so Bing can pick them up right away.

I hope this was helpful to anyone looking to do something similar, and thanks for reading!