When a visitor follows a link to a page that does not exist on your website, the webserver generates a 404 error response and usually shows a default error page. After migrating my website to Pelican, I wanted to inform visitors about the restructuring and help them find the content they are looking for. That's why I have set up a custom 404 error page for this website.
There are two steps needed:
- Configure Apache to serve a custom error page for the 404 Not Found error.
- Create a suitable error page (with Pelican).
Set up Apache
First we need to tell the webserver to serve a custom error page. For this, you
can either modify the main Apache configuration or, as I have done, put the
new setting into an .htaccess file in the /blog subdirectory, where my website
is stored.
Then you need just one line to tell Apache which page to send when a non-existent URL is requested. Make sure to use an absolute path for the error page (the path as it appears in the URL, not the filesystem path).
# Page to show on 404 error. You cannot specify relative paths here!
ErrorDocument 404 /blog/pages/error-404.html
Create the Error Page
I put a file named error-404.md into the content/pages directory of my Pelican
project and added some text, including some links, e.g. to my About page. The
result is this error page.
Then I realized that there is one issue. I want to add links to the error
page, and Pelican generates links as relative links. But the error page is
served for every non-existent URL! So if someone tries to access the URL
https://www.a-netz.de/blog/a/b/c/d/e, the error page is served under this URL.
If the error page contains a link to ../../pages/about.html, it will not work,
because the browser resolves the link against the requested URL and asks for
https://www.a-netz.de/blog/a/b/pages/about.html, which does not exist either.
So relative links on this page do not work.
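The effect can be reproduced with Python's standard library, which resolves relative references the same way a browser does. This is only a small sketch; the helper name `resolve` and the second request URL are made up for illustration.

```python
from urllib.parse import urljoin

# Relative link taken from the example in the text.
RELATIVE_LINK = "../../pages/about.html"


def resolve(requested_url: str) -> str:
    """Resolve the relative link against the URL the visitor requested."""
    return urljoin(requested_url, RELATIVE_LINK)


# The same error page, served under two different non-existent URLs.
print(resolve("https://www.a-netz.de/blog/a/b/c/d/e"))
# → https://www.a-netz.de/blog/a/b/pages/about.html
print(resolve("https://www.a-netz.de/blog/x/y"))
# → https://www.a-netz.de/pages/about.html
```

The target changes with the depth of the requested URL, so no single relative link can be correct for every request.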
Simple Workaround
The simple workaround would be to just put absolute links in the error page.
But I use the development server when writing content, so the pages are served
from http://localhost:8000, while on my blog they are served from
https://www.a-netz.de/blog. So absolute links do not work for both use cases.
Better Solution
I realized that Pelican already knows the root location of my website: it has
the SITEURL setting! Now how to put this automatically into the error-404.md
page? That must be possible with the jinja2content plugin.
But unfortunately, this plugin does not provide any context when processing the
page content with Jinja, so I cannot access the SITEURL value. One small
modification to the read(...) method later, the plugin knows about all Pelican
settings...
def read(self, source_path):
    prefix = self.settings.get('JINJA2CONTENT_PREFIX', '')
    with pelican_open(source_path) as text:
        content = prefix + text
        # Pass the settings dictionary as context to Jinja
        content = self.env.from_string(content).render(**self.settings)
    with NamedTemporaryFile(delete=False) as f:
        f.write(content.encode())
        f.close()
    content, metadata = super().read(f.name)
    os.unlink(f.name)
    return content, metadata
Now I can create links in the error page like this:
{{ SITEURL }}/pages/about.html
The link adjusts correctly when I access the page on the Pelican devserver, and also when I deploy the content to my main webserver.
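To see the substitution in isolation, here is a minimal sketch that calls Jinja directly instead of going through the plugin. It assumes the jinja2 package is installed; the helper `render_page` and the sample Markdown link line are hypothetical and only stand in for the real page content.

```python
from jinja2 import Environment

# One line of page content, as it might appear in error-404.md (made up).
PAGE_SOURCE = "[About]({{ SITEURL }}/pages/about.html)"


def render_page(source: str, settings: dict) -> str:
    """Render page content with the settings passed as Jinja context."""
    return Environment().from_string(source).render(**settings)


# The two site roots mentioned in the text.
dev = render_page(PAGE_SOURCE, {"SITEURL": "http://localhost:8000"})
prod = render_page(PAGE_SOURCE, {"SITEURL": "https://www.a-netz.de/blog"})

print(dev)   # → [About](http://localhost:8000/pages/about.html)
print(prod)  # → [About](https://www.a-netz.de/blog/pages/about.html)
```

Because the resulting link is absolute, it no longer depends on the URL under which the error page happens to be served.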
Note: I'm planning to contribute my changes to the jinja2content plugin back to
the Pelican project. If you're lucky, you can soon use an unmodified version of
the plugin for this.