[Note: The below is a lightly-edited revision of an email message I sent to a contributor to the Traceable Heraldic Art collection who asked about the technology used to update the web site. It’s somewhat rambling and may not be of interest to most, but I figured it was worth putting it in the public record. — Mathghamhain]
In hindsight it would have been sensible to tackle the creation of the online Traceable Heraldic Art collection as a web database project, but for historical reasons that’s not at all how it’s architected.
Back in 2016 I was a new herald looking for projects to get involved with, and was recruited to help with an effort to produce an updated version of the Pennsic Traceable Art book, which was a 685-page PDF file created in 2005-07. Although it was available online, its primary use was to sit as printouts in a stack of jumbo three-ring binders that were kept in the art tent behind Pennsic Heralds Point and traced onto submission forms using a set of light tables.
Almost all of the art in the Pennsic Traceable collection had itself been drawn in that same tent during previous years, and for a decade the collection had done a decent job of filling its niche… but a lot of the scans were horribly pixelated, and when people digitally copy-n-pasted the art into new designs rather than using it as a template for manual tracing, the results were ugly, so for years there had been talk about refreshing the collection with cleaner line art, and replacing some of the items that had been returned as unregistrable or that were difficult to trace clearly.
As a new herald with some digital graphics experience, I figured I’d be able to find ways to contribute to the effort to produce a new edition of this book — but then after a couple of months, the person who had originally volunteered to lead the project dropped out, and I wound up in charge of the overall effort.
Because the original goal was to produce a new version of the Pennsic Traceables, the primary form of the content in my new version followed the same format — a printable document with one page per illustration, with each charge repeated at a bunch of different sizes on the same sheet, so that someone working in the art tent can grab the page for “an antelope salient” and stick it on their light table and trace whichever size fits in the appropriate spot on the submission form.
Some of the collection’s idiosyncrasies are a result of this same original use case — for example, on the printable versions, the charge name is on the bottom of the printable pages because in the Pennsic Heralds’ Art Tent, all of the pages wind up in three-ring binders with holes punched at the top edges, so when flipping through the pages it’s easier to see the names if they’re on the bottom edge.
(The old Pennsic collection didn’t include ordinaries and field divisions, which I had previously started drawing as part of a personal heraldic clip-art collection and soon folded in to the combined new collection; it doesn’t make sense to print those at multiple sizes, but they do need separate versions for device and badge forms, so when they were added I was forced to create a new page layout but still followed most of the same conventions.)
Four and a half years later, that’s still the master format — a giant printable document with one design per page, generally repeated a bunch of times at multiple sizes — although as the collection grew I eventually had to split it up into separate files per volume, because dealing with a 5,000 page, 600 MB file would bring my computer to a standstill.
I do all of the document creation in a Mac technical drawing tool called OmniGraffle, which is sort of like a cross between Visio and Illustrator. In addition to exporting the entire document as a giant PDF file, it also makes it easy to export each individual illustration as a separate file in various standard formats (SVG, PDF, PNG, JPEG, etc.), and early on I figured I would use that ability to make things easier for people who weren’t at Pennsic by posting those files on my web site.
While I was working on building the web site, I was also playing around with some command-line tools for handling PDFs and graphics files, and realized I could use those tools to extract the text from the individual pages of the giant PDF for reuse on the web. I put together a quick Perl script to invoke those command-line tools and generate one web page for each charge, along with download links for the various image files, then over time added more features to that, like grouping related charges together, building indexes of artists’ names, and so forth.
Years later, that script is now a 200 KB monster that takes between five minutes and an hour to run as it incrementally updates the web site to reflect the latest changes in the master documents. You can find that code, along with instructions about how to get it working, on the Web Site Build Code page.
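To make the text-extraction step concrete, here is a minimal sketch. It is not the actual build code: it is written in Python rather than Perl, and it assumes Poppler’s `pdftotext` as a stand-in for the command-line PDF tools mentioned above. The function names `split_pages` and `extract_pages` are hypothetical.

```python
import subprocess


def split_pages(text):
    """Split pdftotext output into one string per page.

    pdftotext ends each page with a form feed character, so the
    empty piece after the final form feed is dropped.
    """
    pages = text.split("\f")
    if pages and not pages[-1].strip():
        pages.pop()
    return [page.strip() for page in pages]


def extract_pages(pdf_path):
    """Run pdftotext on a PDF and return the text of each page.

    Assumes the pdftotext binary is installed and on the PATH.
    "-layout" keeps the physical layout; "-" sends the text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return split_pages(result.stdout)
```

Each element of the returned list could then feed the generation of one web page per charge, in the spirit of the script described above.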
So, my typical workflow for adding new charges looks like this:
- Convert a new image to a format OmniGraffle can use.
  - For folks who send me SVG files, this is easy: I just drag them in and make sure that the colors and line weights they’ve used are reasonable.
  - When folks send me PNG or JPEG files, I use a vectorizer called DragPotrace that can convert black-and-white images to vector formats, although sometimes I need to open their source files and futz with the contrast or clean up some line edges so that it converts cleanly.
  - When I’m importing art from old books or period rolls, I either flatten the scans to black and white, clean up the edges and feed them to the vectorizer, or else import the art to my drawing tool, move it to a locked layer, and trace over each shape with Bézier curves, click by click.
- Open the OmniGraffle document for whatever section of the site I’m adding it to (ordinaries, birds, tools, etc.), create a new blank page, paste in the new charge, apply the scaling factors to create smaller versions of it as needed, then fill in the title and description text.
- After I’ve added several pages to one of those documents, I export the entire document in multiple formats — as a single giant PDF file, as a folder of separate SVG files, and as a folder of separate PNG files.
- When I’ve exported one or more of those documents, I launch my command-line Perl script, which works through each section looking for changes. If there are new PDF files, it extracts all of the text from them; if there are new PNG files, it creates scaled thumbnail files and black-and-white versions; and so forth. It reads in the text for every page in the master PDF files and makes a list of every PNG and SVG file, then uses that to regenerate the entire web site as a directory full of flat HTML files. Finally, the script uses rsync and scp to synchronize any files that have changed from my desktop computer up to my rented Linux web server.
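The change detection and upload in that last step can be sketched roughly as follows. This is an illustration in Python, not the real Perl script: it assumes that “changed” means the source file is newer than its derived output, and the helper names, the `site` directory, and the `user@example.org` target are all hypothetical.

```python
import os


def needs_rebuild(source_path, output_path):
    """Return True if the output is missing or older than its source."""
    if not os.path.exists(output_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(output_path)


def stale_outputs(pairs):
    """Given (source, output) path pairs, return the pairs to regenerate."""
    return [(src, out) for src, out in pairs if needs_rebuild(src, out)]


def rsync_command(local_dir, remote_target):
    """Build an rsync command line for uploading the generated site.

    -a recurses and preserves file attributes; -z compresses in transit.
    The trailing slash on the source copies the directory's contents
    rather than the directory itself. rsync only transfers files that
    differ from what is already on the server.
    """
    return ["rsync", "-az", local_dir.rstrip("/") + "/", remote_target]
```

A build could call `stale_outputs` to decide which thumbnails to regenerate, then pass `rsync_command("site", "user@example.org:/var/www/site/")` to `subprocess.run` to push the result.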
The whole system feels quite jury-rigged, and as someone who builds highly dynamic database-backed web applications professionally, I’m somewhat horrified by the static-page-generation approach… but on the other hand, it works.
I’m limited in how much information I can attach to each item by the amount of space on each printable page for text. I try to be consistent about how that text is written, so that the code I’ve written can pick out the name of the source and artist, recognize when a charge’s default posture is listed, add items that mention being a Step from Core Practice to a relevant list, and so forth.
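As an illustration of how consistently written text can be mined for structured data, here is a minimal Python sketch. The convention it parses (one “Label: value.” statement per line, with the labels `Source`, `Artist`, and `Default posture`) is an assumption made for the example, not a description of the collection’s actual wording, and `parse_description` is a hypothetical name.

```python
import re

# Assumed convention: one "Label: value." statement per line.
FIELD_PATTERN = re.compile(
    r"^(Source|Artist|Default posture):\s*(.+?)\.?\s*$",
    re.MULTILINE,
)


def parse_description(text):
    """Pull structured fields out of a page's free-text description."""
    fields = {label.lower(): value for label, value in FIELD_PATTERN.findall(text)}
    return {
        "source": fields.get("source"),
        "artist": fields.get("artist"),
        "default_posture": fields.get("default posture"),
        # Flag any page whose text mentions a Step from Core Practice.
        "step_from_core_practice": "step from core practice" in text.lower(),
    }
```

Fields that are absent from the text come back as `None`, so a page with only a title still parses without error.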
The original sources of art I was working with in the first year were all modern drawings by SCA artists, so there wasn’t any system of tagging illustrations by geography or time period, but as I’ve added more art from period sources I’ve been trying to figure out how to incorporate that kind of data. I’m pretty careful about linking each image with a source, and have put some work into grouping the sources by geographic region and chronological sequence, but those associations are still incomplete.
(And I haven’t even thought about how to handle detailed tagging of individual items: e.g., this charge comes from a German family’s arms, but it was drawn in a Spanish armorial by a French artist… sheesh.)
I think that if I were starting over with what I know now, I would approach it differently: I’d store everything in a SQL database, including the SVG files and all of the charge metadata (title, artists, source, and more). Then I would write code to programmatically generate both the web pages and the other graphics files. For example, it’s pretty easy to use a command-line tool to render an SVG file into PNG format, and it’s also possible to generate a full PDF page containing SVG files and custom text; I figured out a technique for this in order to create the “Visual Catalog” file that incorporates all of the charges into a single 300-page flip book.
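A database-first version might start out something like the sketch below. It uses Python’s built-in `sqlite3` module purely for illustration; the single `charge` table, its column names, and the helper functions are all hypothetical, and a real design would need more tables (artists, sources, regions, and so on).

```python
import sqlite3
from html import escape

# Hypothetical minimal schema: one row per charge, SVG stored as text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS charge (
    id      INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    artist  TEXT,
    source  TEXT,
    svg     TEXT NOT NULL
);
"""


def open_collection(path=":memory:"):
    """Open (or create) the collection database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_charge(conn, title, artist, source, svg):
    """Insert one charge and return its new row id."""
    cur = conn.execute(
        "INSERT INTO charge (title, artist, source, svg) VALUES (?, ?, ?, ?)",
        (title, artist, source, svg),
    )
    conn.commit()
    return cur.lastrowid


def render_page(conn, charge_id):
    """Generate a flat HTML page for one charge from its database row."""
    row = conn.execute(
        "SELECT title, artist, source, svg FROM charge WHERE id = ?",
        (charge_id,),
    ).fetchone()
    if row is None:
        raise KeyError(charge_id)
    title, artist, source, svg = row
    # The SVG markup is embedded as-is, so it must come from a trusted source.
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head><body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f"{svg}\n"
        f"<p>Artist: {escape(artist or 'unknown')}. "
        f"Source: {escape(source or 'unknown')}.</p>\n"
        "</body></html>\n"
    )
```

With the metadata in a table, the web pages become just one output among several: the same rows could drive PNG rendering or PDF page generation.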
With a bit of extra effort, such a tool could support user authentication, so that rather than sending me SVG files by email and waiting for me to get them posted, you could sign in to the site, upload your art, and type in the title and source information to have the new images appear on the site immediately.
But that would require months of development effort and ongoing maintenance, and I’ve never quite managed to convince myself to prioritize that project, and so I keep doing things the old way.
I’ve been hoping that someone will eventually come along and build such a web site, then allow me to send them a dump of all of my content to be imported, at which point I can move on to focus on other projects, of which I have a dozen queued up waiting for my attention — but I haven’t yet stumbled on anyone who has both the necessary skills and the inclination to dedicate a big chunk of their life to such an effort.
If you might be that person, get in touch and let’s talk!
Here are some notes laying out some of the kinds of data managed by the current system with an eye towards a possible design for the schema of a future database implementation:
http://blog.heraldicart.org/2021/09/a-database-schema-for-the-traceable-art/