GoHugo Post Scheduler

7 min read • April 2, 2023

After building this website with Hugo (I moved to Astro a while ago), I wanted to have the ability to schedule blog posts. However, by default, Hugo doesn’t build pages with dates in the future. To solve this issue, I created a short Python script and paired it with GitHub Actions to create my own scheduler.

Possible Solutions

I built this website with Hugo (static site generator) and host it on Cloudflare Pages. Cloudflare builds/updates my website whenever I push new changes, but if I want a new blog post to appear tomorrow and I push the changes today, it won’t be displayed. I found two solutions:

  1. I can use GitHub Actions to re-build my website every day at a certain hour:
    • Wastes limited monthly builds if you store the website in a private repository.
    • If your post’s date is later than the scheduled build hour (e.g. builds run every day at 12:00, but the post is scheduled for 13:00), the post will be published later than you intended (i.e. the next day at 12:00).
  2. Create a script that will gather all dates and schedule the next build with cron:
    • Build/update only when needed.
    • The post will appear at the exact time you want.

Even though the second approach is more difficult, it fits my needs, so I’ll go with it.
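To make the trade-off concrete, here is a small sketch of the delay caused by the first option. The function name `first_daily_build_after` and the 12:00 build hour are only illustrative assumptions taken from the example above; they are not part of the final setup.

```python
from datetime import datetime, time, timedelta, timezone


def first_daily_build_after(post_date, build_time=time(12, 0)):
    """Return the first daily build that runs at or after post_date."""
    build = datetime.combine(post_date.date(), build_time, tzinfo=post_date.tzinfo)
    # The build on the post's own day has already run, so wait for the next one
    if build < post_date:
        build += timedelta(days=1)
    return build


# Post scheduled for 13:00, builds run every day at 12:00
post_date = datetime(2023, 4, 2, 13, 0, tzinfo=timezone.utc)
published = first_daily_build_after(post_date)
print(published - post_date)  # 23:00:00
```

With a fixed daily build the post waits 23 hours, which is the delay the second approach avoids.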

1. Setting Up GitHub Actions

Our Python script will modify the GitHub Actions workflow file. This requires the workflow scope, which the default GitHub Actions token (GITHUB_TOKEN) doesn’t have, so we have to create a new personal access token (PAT).

  1. Go to https://github.com/settings/tokens.
  2. Choose Tokens (classic).
    GitHub settings screenshot with "Tokens (classic)" option highlighted.
  3. Click “Generate new token” -> “Generate new token (classic)” and select workflow in scopes.
    GitHub token's settings
  4. Copy your token and go to your website’s repository (where you store your Hugo site).
  5. Go to Settings > Secrets > Actions.
    GitHub repository secrets settings.
  6. Add your token by clicking “New repository secret”. Name it UPDATE_WORKFLOW_TOKEN.
  7. Create a deploy hook (see the docs for Cloudflare Pages or Netlify) and add its URL as a secret. Name it however you want (e.g. CLOUDFLARE_PAGES_DEPLOY_HOOK).
  8. Disable automatic deployments (i.e. running a new build every time there’s a new commit on GitHub). We will call the deploy hook’s URL whenever we want to run a build, so automatic deployments are no longer needed.

Disabling automatic deployment on Cloudflare Pages.
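
A deploy hook is triggered by sending an HTTP POST request to its URL. As a minimal sketch of such a request in Python, assuming the URL is available in a DEPLOY_HOOK environment variable (the variable name and the function name are placeholders chosen for this example):

```python
import os
import urllib.request


def build_deploy_request(hook_url):
    """Prepare the POST request that asks the host to start a new build."""
    # An empty body is enough here; the secret URL itself identifies the project
    return urllib.request.Request(hook_url, data=b"", method="POST")


if __name__ == "__main__":
    hook_url = os.environ.get("DEPLOY_HOOK")  # assumed variable name
    if hook_url:
        with urllib.request.urlopen(build_deploy_request(hook_url), timeout=30) as resp:
            print(resp.status)
    else:
        print("DEPLOY_HOOK is not set, nothing to trigger")
```

Anyone who knows the URL can start a build, which is why it is stored as a repository secret rather than committed to the repository.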

2. Creating GitHub Actions’ Files

Now we have to create the GitHub Actions workflow that will run our Python script and the scheduled builds.

  1. Create the GitHub Actions workflow directory in your repository (.github/workflows/).
  2. Create 3 files (requirements.txt, schedule_build.py, schedule_build.yml).
    Screenshot of GitHub repository files.
  3. Add python-frontmatter and ruamel.yaml to requirements.txt.
  4. Copy and paste this code into schedule_build.yml.
.github/workflows/schedule_build.yml
name: Schedule Build
on:
  push:
    branches:
      - main
jobs:
  scheduler:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v3
        with:
          token: ${{ secrets.UPDATE_WORKFLOW_TOKEN }}
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: 3.x
      - name: Install packages needed
        run: python -m pip install -r .github/workflows/requirements.txt
      - name: Run python script
        run: python .github/workflows/schedule_build.py
      - name: Commit and push
        if: ${{ env.COMMIT_MSG }}
        run: |
          git diff
          git add .
          git config --global user.name "github-actions[bot]"
          git config --global user.email "github-actions[bot]@users.noreply.github.com"
          git commit -m "${{ env.COMMIT_MSG }}" -a
          git push
      - name: Trigger build webhook on Cloudflare Pages
        run: curl -s -X POST "${DEPLOY_HOOK}"
        env:
          DEPLOY_HOOK: ${{ secrets.CLOUDFLARE_PAGES_DEPLOY_HOOK }}

Remember to modify the CLOUDFLARE_PAGES_DEPLOY_HOOK (last line) to match the name of your secret.
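
The “Commit and push” step runs only when COMMIT_MSG is set. On GitHub Actions, a step can pass an environment variable to later steps by appending a NAME=value line to the file whose path is stored in GITHUB_ENV. The sketch below simulates this with a temporary file so that it can be run locally; the helper name `export_env` and the example message are made up for illustration.

```python
import os
import tempfile


def export_env(name, value, env_file=None):
    """Append NAME=value to the GITHUB_ENV file so that later steps can read it."""
    env_file = env_file or os.environ["GITHUB_ENV"]
    with open(env_file, "a") as f:
        # Each variable goes on its own line
        f.write(f"{name}={value}\n")


# Local stand-in for the file that GitHub Actions provides to a job
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    fake_github_env = tmp.name

export_env("COMMIT_MSG", "Build scheduled for: 2023/04/02, 13:01 [skip ci]", fake_github_env)

with open(fake_github_env) as f:
    exported = f.read()
print(exported)
os.remove(fake_github_env)
```

If nothing is written to that file, COMMIT_MSG stays empty and the commit step is skipped, while the deploy hook step still runs.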

3. Python Script

The last step is to create the Python script. Let’s assume that you have a blog directory in content where you store your blog posts:

content
└─blog
  ├─post-1.md
  ├─post-2.md
  └─some_dir
    ├─post-3.md
    └─image.png

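
The script walks this directory recursively with os.walk and keeps only the Markdown files. To see what that selects, here is a self-contained sketch that recreates the layout above in a temporary directory; the helper name `find_markdown` is an assumption made for this example and does not appear in the final script.

```python
import os
import tempfile


def find_markdown(root):
    """Return the paths of all .md files under root, relative to root."""
    found = []
    for path, dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".md"):
                rel = os.path.relpath(os.path.join(path, name), root)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    blog = os.path.join(tmp, "content", "blog")
    os.makedirs(os.path.join(blog, "some_dir"))
    # Recreate the example layout with empty files
    for rel in ("post-1.md", "post-2.md", "some_dir/post-3.md", "some_dir/image.png"):
        open(os.path.join(blog, *rel.split("/")), "w").close()
    posts = find_markdown(blog)

print(posts)  # ['post-1.md', 'post-2.md', 'some_dir/post-3.md']
```
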
  1. First, let’s import all the modules we need and create a few variables.

    .github/workflows/schedule_build.py
    import os
    from datetime import datetime, timedelta, timezone
    import frontmatter
    from ruamel.yaml import YAML
    from ruamel.yaml.scalarstring import DoubleQuotedScalarString as DQSS
    yaml = YAML()
    now = datetime.now(timezone.utc)
    future = []
  2. Then, we have to loop through the blog posts, get all the dates and add them to the list called future.

    .github/workflows/schedule_build.py
    import os
    from datetime import datetime, timedelta, timezone

    import frontmatter
    from ruamel.yaml import YAML
    from ruamel.yaml.scalarstring import DoubleQuotedScalarString as DQSS

    yaml = YAML()
    now = datetime.now(timezone.utc)
    future = []

    # Loop through content and find earliest, future scheduled post's date
    # `os.walk` will recursively check everything inside `content/blog`
    for path, dirs, files in os.walk(os.path.join("content", "blog")):
        for f in files:
            # Ignore files other than markdown
            if f.endswith(".md"):
                # Load frontmatter and change timezone to UTC
                post = frontmatter.load(os.path.join(path, f))
                date = post["date"].astimezone(timezone.utc)
                # Date is in the future and the post is not a draft (post is scheduled);
                # a missing `draft` key is treated as draft=False
                if not post.get("draft", False) and date > now:
                    future.append(date)

    If you store your blog posts differently, you might have to modify the first for loop.

  3. After that, we have to find the earliest date with min() and convert it to cron format. The conversion is done by this function:

    .github/workflows/schedule_build.py
    import os
    from datetime import datetime, timedelta, timezone

    import frontmatter
    from ruamel.yaml import YAML
    from ruamel.yaml.scalarstring import DoubleQuotedScalarString as DQSS

    yaml = YAML()
    now = datetime.now(timezone.utc)
    future = []


    def datetime_to_cron(dt):
        # Add one minute - cron doesn't work with seconds
        dt += timedelta(minutes=1)
        cron = f"{dt.minute} {dt.hour} {dt.day} {dt.month} *"
        tm = dt.strftime("%Y/%m/%d, %H:%M")
        return cron, tm


    # Loop through content and find earliest, future scheduled post's date
    for path, dirs, files in os.walk(os.path.join("content", "blog")):
        for f in files:
            # Ignore files other than markdown
            if f.endswith(".md"):
                # Load frontmatter and change timezone to UTC
                post = frontmatter.load(os.path.join(path, f))
                date = post["date"].astimezone(timezone.utc)
                # Date is in the future and the post is not a draft (post is scheduled);
                # a missing `draft` key is treated as draft=False
                if not post.get("draft", False) and date > now:
                    future.append(date)
  4. These are the main functions that make scheduling work. Now we just have to add code that will modify the GitHub Actions file to schedule the next build.

    .github/workflows/schedule_build.py
    import os
    from datetime import datetime, timedelta, timezone

    import frontmatter
    from ruamel.yaml import YAML
    from ruamel.yaml.scalarstring import DoubleQuotedScalarString as DQSS

    yaml = YAML()
    now = datetime.now(timezone.utc)
    future = []


    def datetime_to_cron(dt):
        # Add one minute - cron doesn't work with seconds
        dt += timedelta(minutes=1)
        cron = f"{dt.minute} {dt.hour} {dt.day} {dt.month} *"
        tm = dt.strftime("%Y/%m/%d, %H:%M")
        return cron, tm


    # Loop through content and find earliest, future scheduled post's date
    for path, dirs, files in os.walk(os.path.join("content", "blog")):
        for f in files:
            # Ignore files other than markdown
            if f.endswith(".md"):
                # Load frontmatter and change timezone to UTC
                post = frontmatter.load(os.path.join(path, f))
                date = post["date"].astimezone(timezone.utc)
                # Date is in the future and the post is not a draft (post is scheduled);
                # a missing `draft` key is treated as draft=False
                if not post.get("draft", False) and date > now:
                    future.append(date)

    # Load current GitHub action config
    with open(".github/workflows/schedule_build.yml", "r") as f:
        yaml_file = yaml.load(f)

    # If there are future posts - schedule next build
    if future:
        cron, tm = datetime_to_cron(min(future))
        commit_msg = f"Build scheduled for: {tm}"
        # Exit if cron schedule didn't change (avoid GitHub Action's push error)
        try:
            if yaml_file["on"]["schedule"][0]["cron"] == cron:
                exit()
        except (KeyError, IndexError, TypeError):
            # No schedule has been set yet
            pass
        # Add date to scheduler
        yaml_file["on"]["schedule"] = [{}]
        yaml_file["on"]["schedule"][0]["cron"] = DQSS(cron)
    else:
        # Try to remove old date; exit if there was nothing to remove
        if not yaml_file["on"].pop("schedule", None):
            exit()
        commit_msg = "No future schedules"

    # Creating custom commit message (one NAME=value line in the GITHUB_ENV file)
    with open(os.getenv("GITHUB_ENV"), "a") as f:
        f.write(f"COMMIT_MSG={commit_msg} [skip ci]\n")

    # Update GitHub Actions' file
    with open(".github/workflows/schedule_build.yml", "w") as f:
        yaml.dump(yaml_file, f)

    I added comments to explain each part of the code.
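
To check the date handling without touching any files, here is a small worked example using only the standard library. The three dates and the fixed `now` value are made up for illustration, and this `datetime_to_cron` returns only the cron string, unlike the version in the script.

```python
from datetime import datetime, timedelta, timezone


def datetime_to_cron(dt):
    # Same conversion as in the script: shift by one minute, ignore seconds
    dt += timedelta(minutes=1)
    return f"{dt.minute} {dt.hour} {dt.day} {dt.month} *"


# Hypothetical front matter dates with different UTC offsets
raw_dates = [
    "2023-04-02T15:00:00+02:00",
    "2023-04-02T09:30:00-05:00",
    "2023-12-31T23:59:30+00:00",
]
dates = [datetime.fromisoformat(d).astimezone(timezone.utc) for d in raw_dates]

# Fixed "now" so that the example is reproducible
now = datetime(2023, 4, 2, 13, 30, tzinfo=timezone.utc)
future = [d for d in dates if d > now]

next_cron = datetime_to_cron(min(future))
print(next_cron)                    # 31 14 2 4 *
print(datetime_to_cron(dates[2]))   # 0 0 1 1 *
```

The first post (13:00 UTC) is already in the past, so the next build is scheduled for the second one, at 14:31 UTC. The last line shows that adding a minute can roll over into the next day, month and year. A five-field cron expression has no year, so the schedule stays correct only because the script rewrites or removes it on every run.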

That’s it! Now you should be able to schedule your posts just by setting the date in the post’s frontmatter. If you have any questions or if something doesn’t work you can always let me know in the comments!

