Byron Peebles Blog

Publishing to a staging site with Pelican

I finally settled on Pelican as the thing to write a new site/blog/articles with. After I had a home page looking okay, I wanted to test it non-locally, but not in the production location. And I wanted to be able to keep doing that afterwards, so this is what I ended up with.

I created a new configuration file named nextpublishconf.py and populated it with:

nextpublishconf.py
#!/usr/bin/env python

import os
import sys
sys.path.append(os.curdir)
from publishconf import *


# If your site is available via HTTPS, make sure SITEURL begins with https://
SITEURL = 'https://next.byronpeebles.com'

EXTRA_PATH_METADATA = EXTRA_PATH_METADATA or {}

if 'extra/robots.txt' in EXTRA_PATH_METADATA:
    del EXTRA_PATH_METADATA['extra/robots.txt']

EXTRA_PATH_METADATA['extra/robots-next.txt'] = {'path': 'robots.txt'}

This also lets us have custom robots.txt files, as we see later on.
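The override logic in that file can be seen in isolation; here the initial dictionary stands in for what pelicanconf.py defines (shown further down), and the rest mirrors the lines from nextpublishconf.py:

```python
# Stand-in for what pelicanconf.py defines and publishconf.py re-exports:
EXTRA_PATH_METADATA = {
    'extra/robots.txt': {'path': 'robots.txt'},
}

# What nextpublishconf.py then does: drop the production mapping and
# map the staging file onto the same output path (robots.txt) instead.
if 'extra/robots.txt' in EXTRA_PATH_METADATA:
    del EXTRA_PATH_METADATA['extra/robots.txt']

EXTRA_PATH_METADATA['extra/robots-next.txt'] = {'path': 'robots.txt'}

print(EXTRA_PATH_METADATA)
# {'extra/robots-next.txt': {'path': 'robots.txt'}}
```

Both source files get copied by STATIC_PATHS; this mapping just decides which one ends up published as robots.txt.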

Meanwhile, in tasks.py, we add the following (this uses the dictionary union operator added in Python 3.9; see PEP 584 for alternatives in earlier versions):

tasks.py
NEXT_CONFIG = CONFIG | {
    'settings_publish': 'nextpublishconf.py',
    'ssh_path': '/var/staging',  # Set this to the path you want for next/staging
    # 'ssh_host': 'staging.host.tld',  # use this if you have it on its own server
}

# <snip>

@task
def nextpublish(c):
    """Publish to "next" via rsync"""
    pelican_run('-s {settings_publish}'.format(**NEXT_CONFIG))
    c.run(
        'rsync --delete --exclude ".DS_Store" -pthrvz -c '
        '-e "ssh -p {ssh_port}" '
        '{} {ssh_user}@{ssh_host}:{ssh_path}'.format(
            NEXT_CONFIG['deploy_path'].rstrip('/') + '/',
            **NEXT_CONFIG))
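If you are on a Python older than 3.9, the `CONFIG | {...}` union can be replaced with dict unpacking or the `dict()` constructor; a minimal sketch (the CONFIG values here are illustrative, not Pelican's real defaults):

```python
# Illustrative stand-in for the CONFIG dict that Pelican's tasks.py defines.
CONFIG = {
    'settings_publish': 'publishconf.py',
    'ssh_host': 'example.com',
}
overrides = {'settings_publish': 'nextpublishconf.py'}

# Python 3.9+: PEP 584 dict union, as used above.
merged_union = CONFIG | overrides

# Pre-3.9 equivalents: dict unpacking (3.5+), or the dict() constructor
# (works when all keys are strings).
merged_unpack = {**CONFIG, **overrides}
merged_ctor = dict(CONFIG, **overrides)

print(merged_union['settings_publish'])  # nextpublishconf.py
```

All three produce the same merged dictionary; the right-hand side wins on key collisions.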

Back to robots.txt! I wanted to disallow crawlers on my staging/next site but allow them on the primary one, so I created two files:

content/extra/robots-next.txt
User-agent: *
Disallow: /
content/extra/robots.txt
User-agent: *
Allow: /
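If you want to double-check those rules behave as intended, the standard library's urllib.robotparser can parse them directly; this sketch hard-codes the same rules as the two files above:

```python
from urllib import robotparser

# Staging rules: block everything (mirrors content/extra/robots-next.txt).
staging = robotparser.RobotFileParser()
staging.parse(["User-agent: *", "Disallow: /"])

# Production rules: allow everything (mirrors content/extra/robots.txt).
production = robotparser.RobotFileParser()
production.parse(["User-agent: *", "Allow: /"])

print(staging.can_fetch("*", "https://next.byronpeebles.com/"))  # False
print(production.can_fetch("*", "https://byronpeebles.com/"))    # True
```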

and then controlled which one was uploaded by adding:

pelicanconf.py
# path-specific metadata
EXTRA_PATH_METADATA = {
    'extra/robots.txt': {'path': 'robots.txt'},
}

# static paths will be copied without parsing their contents
STATIC_PATHS = [
    'images',
    'extra/robots.txt',
    'extra/robots-next.txt',
]

and adding this to nextpublishconf.py:

nextpublishconf.py
if 'extra/robots.txt' in EXTRA_PATH_METADATA:
    del EXTRA_PATH_METADATA['extra/robots.txt']

EXTRA_PATH_METADATA['extra/robots-next.txt'] = {'path': 'robots.txt'}

Finally, to publish to staging/next, run:

$ invoke nextpublish

and to "production", as it were, run:

$ invoke publish