Tech Blog Site

11-21-2020

EDIT: I flip back and forth on whether to make my website repo public or private. If the repo links don't work and you're interested, let me know and I'll turn these into gists or something.

It seems inevitable that every dev eventually starts and then abandons a blog. I've thought a bit about doing one as a sort of journal for myself. I know there are a lot of sites that provide that functionality, but I wanted to play around with reinventing that wheel, and wanted something I could tweak and fiddle with. I also wanted to be able to write version-controlled markdown, and then have that converted into html and pushed through a simple pipeline. Finally, I didn't want to have to learn or be dependent on a more mainstream content platform like Medium, Blogger, or Wordpress. Instead I wanted to somehow embed it into my website. I ended up with this mess of a hacky solution that I had a lot of fun building.

My website basically has a three step deploy process: npm run build, then npm run deploy-content, and then going to cloudfront and invalidating the cache (if I want to quickly check the results). I wanted something where, with one script, I could push files somewhere and have them dynamically pulled in without having to invalidate caches. I also wanted to convert the markdown to html so that I could preview locally almost exactly what the 'published' result would look like.

The Basic Setup

In between matches of Halo Friday night I started a simple node script to read all the markdown files in an input folder, use markdown-it to convert them to html, and then publish them to an output folder. Then I use the aws sdk to push the output to a bucket. (I figured I could use any number of hosting platforms, but pushing to the bucket was easy, cheap, and in line with hosting my site.)

Saturday was spent finishing the above script and working through the needed website changes. I had two main challenges: dynamically discovering what blog html files existed, and then pulling them into the site. I assumed I'd do a bucket list objects call, and then pull the files by accessing the public-read objects. Figuring out the list objects rest call was trickier than I thought it would be, so I stubbed the file names I knew I had and worked on the second challenge.

Given a list of filenames, and knowing the bucket, I fetched each of the html files from AWS, converted them to strings, and then set the inner html of my 'blog-entry' components to that html. While a hacky and possibly unsafe operation, these files come from the same bucket as the rest of my website, and angular does do some sanitization automatically these days, so I figured it was 'good enough'. (I also had to fiddle with cors in the bucket settings to grab the files locally.)
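
The pull side looks roughly like this. The names here (loadEntries, fetchText) are made up for illustration, and the text-fetching helper is injected so the sketch is testable; in the site it's just a fetch on the object URL followed by reading the body as text.

```javascript
// Given the published filenames and the bucket's base URL, fetch each
// html file and pair it with its name. Each file is public-read, so a
// plain GET on the object URL is enough.
async function loadEntries(fileNames, bucketUrl, fetchText) {
  const entries = [];
  for (const name of fileNames) {
    const html = await fetchText(`${bucketUrl}/${name}`);
    entries.push({ name, html });
  }
  // In the site, each entry's html string is then assigned to a
  // 'blog-entry' component's inner html.
  return entries;
}
```

In the browser the injected helper would just be `url => fetch(url).then(r => r.text())`.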

Listing Files

Next came trying to discover said files to pull. While I've done a lot of listing objects in buckets through the cli or sdk, I've never done the base api call. It took quite a bit of digging to find something that worked, and I also realized I had to update my bucket policy to allow that action. Annoyingly, the response only comes back as xml, so I had to parse the xml and navigate its nodes to get the keys (file names) I cared about.
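
For a sense of what that extraction looks like: the list objects response is a ListBucketResult document with each object's key in a Key element under a Contents node. I navigated the parsed xml's nodes; a regex over those Key elements (shown below, since it keeps the sketch dependency-free) pulls out the same information.

```javascript
// Pull the object keys (file names) out of an S3 ListObjects xml
// response. The real code walked the parsed DOM; a regex works here
// because the structure is simple and flat.
function extractKeys(xml) {
  const keys = [];
  const pattern = /<Key>([^<]+)<\/Key>/g;
  let match;
  while ((match = pattern.exec(xml)) !== null) {
    keys.push(match[1]);
  }
  return keys;
}
```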

This solution worked until I pushed to 'prod'. In production I got failures to load because of 'mixed content'. I hadn't noticed, but the list objects api call was http instead of https. Switching it to https returned content, but that endpoint didn't present a valid certificate itself, so despite my site having a cert, the request was still considered insecure. At this point I was pretty frustrated with what should have been a simple action, and so I decided to take a step back and think of other ways I could solve the problem.

Because I was using a pipeline to generate and deploy the dynamic content, it did have knowledge about those files. So I decided to have the node publisher script keep track of the output file names and create an additional text file containing a line-separated list of filenames. This meant I could hardcode the website to find that 'index' file without a list objects call, and then just read that file to know which files to pull. While still somewhat hard coded, this gives me a nicer separation of concerns and lets the blog own which entries to show (I could push multiple blog entries and only show some of them in the future). This also made the website code simpler, as it only parses a text file. (This feels more unixy too.)
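
The index convention is tiny on both ends, which is the appeal. A sketch, with made-up function names: the publisher joins the output names with newlines, and the site splits on newlines (trimming and dropping blanks to be safe).

```javascript
// Publisher side: serialize the output filenames, one per line.
// The resulting text file is pushed to the bucket alongside the html.
function writeIndex(fileNames) {
  return fileNames.join('\n');
}

// Website side: read the index file back into a list of filenames,
// ignoring any blank lines or stray whitespace.
function parseIndex(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}
```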

Other Additions

Later I realized my blog posts should be sorted by date. I considered using the file created/modified date, but that wouldn't work for backdating posts, etc. Once again I relied on a hacky solution that works because I'm the only one who needs to follow the convention. When publishing the files, I read the third line of each post, parse the date, and sort the filenames in my output file list by those dates. This means the output list is in the right order, and the site just pulls and displays the entries in the order it gets them from the list. Not robust, but convenient, and possible since it's easy to enforce a convention with just one dev.
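
The sorting step might look like this. It's a sketch with assumptions: the date format (MM-DD-YYYY, like the date at the top of this post) and the newest-first direction are my guesses at the convention, and the function names are made up.

```javascript
// By convention, the third line of every markdown post is its date,
// e.g. '11-21-2020' (assumed MM-DD-YYYY here).
function postDate(markdown) {
  const line = markdown.split('\n')[2] || '';
  const [month, day, year] = line.trim().split('-').map(Number);
  return new Date(year, month - 1, day);
}

// Sort posts by their parsed date (newest first, an assumption) and
// return just the filenames, ready to write into the index file.
function sortByDate(posts) {
  return posts
    .slice()
    .sort((a, b) => postDate(b.markdown) - postDate(a.markdown))
    .map((p) => p.name);
}
```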

Once I had the html in my site, I wanted to make a couple changes to it. I wanted a dynamically generated table of contents, and I wanted each entry's title to be an anchor tag that I could bookmark or share. I debated doing this as part of the publishing step (creating a json object with title, id, and content). In the end, though, I liked that the blog publishing was just responsible for the content and its order, without knowing what the website would do with it.

Instead I transformed the html in my main blog page component. Here I grabbed the header tag and turned it into an anchor tag with a link to itself (the anchor is actually added in the blog entry component). These ids/blog entry objects could then also be passed to my table of contents so that it could create the links to each section. Finally, I had to add a lifecycle hook to navigate to an anchor tag on page render, so that a shared/bookmarked link would scroll to the right post.
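
The header transform boils down to something like this sketch. The id scheme (lowercase, spaces to dashes) and the function name are illustrative assumptions, and it only handles the simple flat h1 that markdown-it emits for a '# Title' line.

```javascript
// Find the post's h1, derive an id from its text, and wrap the title
// in an anchor linking to that id, so the header becomes bookmarkable.
// The same id can be handed to the table of contents for its links.
function linkifyHeader(html) {
  return html.replace(/<h1>([^<]+)<\/h1>/, (_, title) => {
    const id = title.trim().toLowerCase().replace(/\s+/g, '-');
    return `<h1 id="${id}"><a href="#${id}">${title}</a></h1>`;
  });
}
```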

Conclusion

A note on testing: Due to my relative lack of skill (and patience) with front ends, I compound my frustrations by not writing many tests. This, and my general hacking together of front end solutions, means that my website is not robust and I often need to hunt down bugs and self-inflicted wastes of time. This is something I should do better at, but I find it hard to motivate myself when there are other, more exciting things I could spend my free time building.

I sunk a ton of work into creating a rickety solution when there are so many robust, polished tools out there, but I'm really happy with my extremely personalized 'blogging platform', and I really like having a tool that works exactly how I'd like it to, and that I can customize further with any features I think up. Now I wonder: will I continue to use it, or will it become yet another abandoned dev blog?

The week after I made this post, I found this comic that rings pretty true (slight language warning).