Webcomics R&D

Distribution

https://twitter.com/Iron_Spike/status/1495165593456160769

Nah, WP Crowdfunding was a one-time $400 fee, and when the project is over, I'm converting the funding page to a store page. It's actually really well-integrated into Wordpress!

And my online store is already high-traffic, has been for years, so the hosting plan won't change. https://twitter.com/PaulFerence/status/1495164119250579459


My most expensive ongoing charge, actually? My mailing list.

After you get over 50k addresses, the cost goes monthly and triple-digits.

Weep. 💸

@cabel · Feb 19, 2022 Our list is over 500k at this point and we’ve since switched to a self-install of “Sendy” and I absolutely love it. We pay very little to Amazon SES to deliver the emails and Sendy is a one-time purchase. Lots of admin, yes, but tuck this one into your back pocket for someday!

@Iron_Spike · Feb 19, 2022 OOOOOO. Thank you!

The organization saves development and testing costs by writing and deploying native JavaScript that targets only modern browsers. Through an approach inspired by BBC News’ “cutting the mustard”, the Foundation enables millions of people (1% of its 2 billion monthly users) to access Wikipedia through a JavaScript-free experience. This is the same experience that all page views start from, prior to the (optional) arrival of JavaScript code.

The Wikimedia Foundation’s development principles and browser support policy reflect this by emphasizing the importance of progressive enhancement.

Viewing Wikipedia through a web browser is the most common access method, but Wikipedia’s knowledge is consumed far beyond the canonical experience at Wikipedia.org. “Wikipedia content goes everywhere. It’s distributed offline through Kiwix and IPFS, rendered in native apps like Apple Dictionary, and even shared peer-to-peer through USB sticks,” said Timo. What these environments have in common is that they may not involve JavaScript as they require high security and high privacy. This is made possible at no extra cost due to APIs offering complete content HTML-first, with CSS and embedded media based on open formats only.

https://openjsf.org/blog/2023/10/05/wikimedia-case-study/

Digital preservation

A public website is a great first step towards preserving your work, since that lets the Wayback Machine and other web archivers automatically save copies. (Many of my favorite webcomics from growing up can now only be found there; 8EB, etc. Note that the autocrawler isn’t guaranteed to get everything, especially for less popular sites, so it’s a good thing these services also let you manually request archiving particular URLs. I wonder if you can script an archival request when publishing a page…) Letting others archive their own copies is an especially good idea in the face of unnoticed data corruption over time.
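
On that last wondering: as far as I can tell, you can. The Wayback Machine’s “Save Page Now” feature accepts a plain web request at https://web.archive.org/save/ followed by the URL you want archived. Here’s a minimal sketch in Python (assuming the requests library; the comic URL is a made-up placeholder) that you could run from whatever script publishes your pages:

import requests

def request_archive(page_url):
    # Ask the Wayback Machine's "Save Page Now" endpoint for a fresh capture.
    # The endpoint is just https://web.archive.org/save/<the URL you want saved>.
    response = requests.get("https://web.archive.org/save/" + page_url, timeout=60)
    response.raise_for_status()
    print("Requested archival of", page_url)

request_archive("https://example.com/comics/latest.html")  # placeholder URL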

Even the Library of Congress recognizes HTML for its digital preservation qualities

But I’m an artist, not a coder!

That’s how I felt for about a decade of my life.

If 12-year-olds could learn how to make Web pages through word-of-mouth back in the 90s…

…then you can learn how to make Web pages today. In my experience, learning how to make comics takes way more technique and patience. If you can make comics, then you absolutely can learn enough HTML+CSS to make your webcomics do things that interest you.

https://daringfireball.net/2003/07/independent_days

The web is where independents shine. Independent web sites tend to look better and are better produced. Their URLs are even more readable. This isn’t bluster about the future, this is a description of today. With a text editor and an Apache web server, you’re on equal footing with any web site in the world.

This can still be true today. Big, well-funded websites have drawbacks your site doesn’t have to:

Also, there’s lots of historical context behind front-end web developers (the kinds that use HTML & CSS) being thought of as artists.

Why your own website?

You can block AI from using your work on a website.

Hey! For anyone who has a website and doesn’t want their work used for generative models (a.k.a. “AI”), I did some research and found ways to block some of them.

For ChatGPT and Google, add the following to your site’s robots.txt:

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

GPTBot is used for the actual large language model training, and ChatGPT-User is, I guess, a sneaky workaround to let people using the chat interface ask ChatGPT to fetch and process URLs?

noai,noimageai pseudo-standard

For deviantArt and hopefully others, add the following to every HTML page’s <head>:

<meta name="robots" content="noai,noimageai">

This may or may not work for companies other than deviantArt in the future — as the Cohost admins point out:

It’s possible that dataset or model vendors will see platforms blanket-tagging pages with noai and noimageai, and decide that the original intent of the standard — to reflect an artist’s personal, conscious decision not to have their artwork scraped — has been compromised, and their scrapers will start disregarding the flags altogether. In this case, they’ll probably blame us for not playing fair; so be it. At least then they’ll have to admit how thin their commitment to respecting this standard was in the first place.

(You can also set it as an HTTP header on every URL, not just HTML ones, via X-Robots-Tag: noai,noimageai. Doing that is a little bit more involved, since setting it there requires access to your Web server software and syntax specific to it.)
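
For example, on an Apache server with mod_headers enabled, a sketch of what that could look like in an .htaccess file (other servers, like nginx, use their own syntax):

# .htaccess (Apache, assuming mod_headers is enabled)
# Sends the hint on every response, not just HTML pages.
Header set X-Robots-Tag "noai,noimageai"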

There’s also ai.txt, which I don’t know who actually respects, but it’s a small text file, so it doesn’t hurt to have it.

# ai.txt

User-Agent: *
Disallow: *.txt
Disallow: *.pdf
Disallow: *.html
Disallow: *.gif
Disallow: *.ico
Disallow: *.jpeg
Disallow: *.jpg
Disallow: *.png
Disallow: *.svg
Disallow: *.svgz
Disallow: *.webp
Disallow: *.m4a
Disallow: *.mp3
Disallow: *.opus
Disallow: *.mp4
Disallow: *.webm
Disallow: *.ogg
Disallow: *.js
Disallow: *.css
Disallow: *.php
Disallow: /
Disallow: *

What about Google (Bard) and Bing (AI)? [outdated, see above]

Unfortunately there seems to be no way to disallow Google and Bing from using your site for large model training (i.e. Google Bard and Bing AI) without also blocking them from accessing your website entirely. That’s probably intentional on their part — you could block GoogleBot and BingBot entirely if you’re okay with being unsearchable on the Web.
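
For the record, blocking them entirely would look like the other robots.txt entries above, just aimed at their main crawlers (and again, this drops you out of their search results too):

User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /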

I’ve seen recommendations to block CommonCrawl’s bot, but I doubt that does any good:


Nobody can shadowban you, autocensor your work, or abuse automation to falsely detect Terms of Service violations. You can draw nipples of any kind, and Tumblr can’t tell you no. Your website is immune to the enshittification every big company platform seems doomed to.

Your webhost may screw with you, kowtow to DMCA takedown requests, or even just plain go out of business — but web hosting is a commodity. You can take your website to a competitor. Or even host it yourself, if you’re willing to leave a computer (or phone!) on 24/7.

The tradeoff for this is you’ll probably have to start paying for your website if it gets sufficiently popular, whether that be from outgrowing free hosting plans, signing up for a CDN, or paying a guy to worry about it for you.

But a popular website is a good problem to have!

Several years ago, I was unhappy with the hosting service that my website was on. So I downloaded all the data from the entire website, uploaded a multi-gigabyte tarball to a new hosting service, changed my domain name so it pointed to the new service, then closed my account with the old service.

Erin Ptah is the author/artist of the comics Leif & Thorn and But I’m A Cat Person. Both are hosted on multi-database websites that integrate third-party applications, open-source components, seamless interactions with major platforms like Patreon and Twitter, and the artist’s own custom-written code. —https://www.comicsbeat.com/opinion-kickstarter-wont-explain-their-blockchain-protocol-so-i-will/

There can be technical hiccups with transferring website data like this, but notice the words I used. Technical hiccups. The kind that are inconvenient, but solvable.

Web publishing is weirder than normal publishing

One thing that can trouble authors is that the Web has its own style of “good publishing”.

Graphics editor programs let you control every pixel, promising identical reproduction when you publish. But the Web’s not like that. Its unprecedented ease of access and global reach come with inherent flexibility and eternally-partial support across almost every personal computing device there is:

A Dao of Web Design

Website platforms are still platforms

Instagram

Portability
None.

Can publish HTML comics
No.

Worthwhile site functionality
Almost none. You get one link in bio, and that's it.
No theming possible
[Does Instagram have RSS?]
Reading the comic chronologically from the beginning is not possible.

Shitty parts
Users without Instagram accounts have a really bad time
No control over how images are displayed
Aggressive image optimization can make your comic look bad, and you can’t do anything about it
Supporting probably the worst tech company in the world
Followers not actually likely to see your updates

Do you own your site?
No

Tumblr

Portability
Eh. There are export tools, at least.

Can publish HTML comics
Kinda — they let you make “custom HTML pages” in the theme editor. Those don’t work in their normal posting workflow though, so you’ll need to make a post linking to them to show up in people’s Dashboards/RSS/scheduled posts, etc.

Worthwhile site functionality
✅ Good RSS (choose from all posts or specific tags only)
Themes as custom as you can code
Seems to have notification support for updates once you “follow” a blog, but that requires a Tumblr account
🆗 You can mess with the theme to do things like PWAs
The right theme can set up /chrono posts to read posts in order, like most webcomic archives should

Shitty parts
Automated NSFW detection algorithms are still annoying in the best case
Tons of shitty JS you can’t remove
Image resizing/optimization can be aggressive, and very long images get cropped

Do you own your site?
No

Wix

Portability
[I don’t know yet]

Can publish HTML comics
[dunno yet]

Worthwhile site functionality
❌ No RSS as far as I can tell

Do you own your site?
No

wordpress.com

Portability
✅ Great — you can freely export all your site content and infrastructure and upload it to your own hosted WordPress install

Can publish HTML comics
I think so? Might require some creativity with their tag allowlist, though.

Worthwhile site functionality
Can choose from various themes, but I dunno if you can code your own
Good RSS (choose from all posts or comments)

Do you own your site?
No, but good portability

Self-hosted WordPress (a.k.a. wordpress.org)

Portability
🌟 Most popular option by far — hosts have tons of import/export tools

Can publish HTML comics
🌟 (not as easy as it could be, but a plugin could easily make it so)

Worthwhile site functionality
🌟 (for everything other than HTML comics, there’s a plugin for it, or content online showing you how to code it yourself)
Email subscription updates, good RSS, infinitely customizable RSS

Shitty parts
You have to stay on top of updates. Like your phone/app updates, but with bigger security risks if you fall behind.

Do you own your site?
Yes

Squarespace

Portability
Not sure yet, but if you turn on Developer Mode, nothing else out there uses its HTML templating system

Can publish HTML comics
[not sure]

Worthwhile site functionality
Sort of has RSS, but it’s annoying and you can’t get one for all site updates.

Do you own your site?
No

Static HTML hosts

Portability
Best of all possibilities.

Can publish HTML comics
Duh

Worthwhile site functionality
Things like comments and dynamic behavior are by default not possible. But if you’re not afraid of code, you can set it up with IndieWeb technologies and regenerating pages.

Do you own your site?
Yes

Why mailing lists?

https://buttondown.email

Why old-fashioned feeds?

a.k.a. Why RSS/Atom/h-feed/JSON Feed/etc.?

https://developers.google.com/search/docs/appearance/google-discover#feed-guidelines

RSS by platform

Some platforms make this easier than others:

Why bother?

If this is all so confusing, and most readers don’t use feed readers, then why bother?

RSS, also known as/confused with Feeds, Atom, and Google Reader, is how websites can let people subscribe to updates.

To be fair, the whole situation is confusing. The underlying technologies don’t have that much going on, but they seem complex and unfriendly to non-techie users because of historical reasons, bad acronyms, and developers’ inability to let things be.

RSS feed
A URL that promises that every time you view it, you receive a list of the website’s recent updates in a data format that isn’t for humans to read, but for other programs to process and then show humans the updates.
The feed can republish the full contents of the updates, or just provide a link to the full URL of the updates, or anywhere in between.

RSS reader, a.k.a. feed reader
A program designed to automatically check RSS feed URLs, process the results, then show the updates to humans.

Google Reader
A specific RSS reader popular back in the day, mostly because it combined “read website updates” with a social layer that recommended things from RSS feeds other people were reading. Google killed it despite its popularity because that’s how Google do.

Feed reader, a.k.a. reader app
Turns out most programs that call themselves “RSS readers” can subscribe to more than just RSS. Feedbin, for example, can also subscribe to email newsletters, social media accounts, podcasts, and YouTube channels. (Don’t tell anyone, but podcasts and YouTube channels publish their own RSS feeds, so Feedbin just knows how to look for those.)
They can be phone apps, desktop programs, websites themselves, or a mixture of any of those. Some, like Feedbin, also let you consume their subscribed updates in other reader apps if you want, so the distinctions are all pretty loose.

Different kinds of feeds

RSS, the standard feed file format, a.k.a. RSS 2.0
Different from the phrase “RSS feed” because that one has the genericized trademark problem.
A specific way to code website updates into an XML file. Because it came first, not all of the decisions it made were good ones.

Atom feed

A specific flavor of RSS feed. All Atom feeds are RSS, but not all RSS feeds are Atom.

Basically any RSS reader treats them the same to their users, but if you’re a programmer, going with Atom tends to be less buggy and easier to code: it’s still XML under the hood but has a better-defined processing model, more explicit escaping, and the weird warts of RSS 2.0 were filed off.
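
If you’re curious what one looks like under the hood, here’s a minimal hand-written Atom feed for a hypothetical comic at example.com (normally your site software generates this for you):

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>My Webcomic</title>
  <id>https://example.com/</id>
  <updated>2024-01-01T00:00:00Z</updated>
  <link href="https://example.com/"/>
  <entry>
    <title>Page 42</title>
    <id>https://example.com/comic/42</id>
    <updated>2024-01-01T00:00:00Z</updated>
    <link href="https://example.com/comic/42"/>
    <summary>A new page is up!</summary>
  </entry>
</feed>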

JSON Feed

Even though the difference between RSS and Atom already confuses people, one day a programmer thought the technologies they’re based on were kind of old and icky. So they made a new data format for feeds using a technology called JSON that is more appealing to modern programmer sensibilities, but accomplishes exactly the same thing.

To be more fair to the inventors, programming a correct RSS/Atom feed is harder than just passing some data to the JSON-creating function that every programming language has nowadays.
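
For comparison, a minimal JSON Feed for that same hypothetical comic would look something like this (same information, different wrapper):

{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "My Webcomic",
  "home_page_url": "https://example.com/",
  "items": [
    {
      "id": "https://example.com/comic/42",
      "url": "https://example.com/comic/42",
      "content_html": "<img src=\"https://example.com/comic/42.png\" alt=\"Page 42\">"
    }
  ]
}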

h-feed

The opposite of JSON Feed, sorta. One of the general issues with feeds is that you’re duplicating your website updates:

  1. First, you publish updates on your website — in HTML.
  2. Then, you publish those updates again as RSS-flavored XML, Atom-flavored XML, or JSON

But HTML is already a data format that computers are really good at parsing and showing to humans. Why not just sprinkle in some hints to your website’s HTML to let Reader software reuse your website to provide an update feed automatically? That’s h-feed: some class attributes added to a website’s existing HTML to let software check for updates.
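
As a sketch, a comic archive page marked up for h-feed could look like this; the class attributes are the microformats2 names that reader software checks for, and the URLs are placeholders:

<main class="h-feed">
  <h1 class="p-name">My Webcomic</h1>
  <article class="h-entry">
    <a class="u-url p-name" href="/comic/42">Page 42</a>
    <time class="dt-published" datetime="2024-01-01">January 1, 2024</time>
    <div class="e-content"><img src="/comic/42.png" alt="Page 42"></div>
  </article>
</main>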

As much as I like h-feed, separation between your site’s HTML and your feed updates can be nice:

  • Feed-only content
    • Reader programs don’t always have the features browsers do (embeds, CSS, JavaScript, etc.), so sometimes you want to provide a lower-fidelity version for Reader programs specifically. Like, if you embed an interactive data visualization into your blog post to support your point, RSS readers probably won’t support the <iframe> used to embed it, the JavaScript that parses the dataset, or the CSS+SVG used to actually visualize the data — so instead, you’d give Reader programs a preview image inside a link to the full thing (there’s a sketch of that swap after this list).
    • Or, what Adrian Roselli does with his RSS Club
  • Some RSS feeds don’t have websites attached.
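
Here’s the kind of embed-to-preview swap mentioned above, as a rough sketch with placeholder URLs; the first snippet lives in the post’s HTML, and the second is what you’d put in the feed entry instead:

<!-- On the website: the full interactive embed -->
<iframe src="https://example.com/viz/chart.html" title="Interactive chart"></iframe>

<!-- In the feed entry: a plain linked preview image -->
<a href="https://example.com/posts/my-post#chart">
  <img src="https://example.com/viz/chart-preview.png" alt="Preview of the interactive chart">
</a>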

Ads? Affiliates? "How can my site make money, or at least break even?"

Beware bringing Notification Syndrome onto your site

Notes from old Sparkbox presentation on analytics

  1. Acquisition: they came

  2. Activation: they tried

  3. Retention: they returned

  4. Referral: they recommended

  5. Revenue: $

  6. Happiness

  7. Engagement (time spent)

  8. Adoption

  9. Retention

  10. Task success (accomplishment)

See: search/browse
Think: save for later/discuss/reviews
Do: action

Designing with Data (note the first 3 steps are "free", which means too many organizations stop there even though those steps are insufficient)

  1. Frame the question: known unknowns, unknown unknowns, iterate, new sources
  2. Start w/analytics: segment, "behaviors", general trends
  3. Add social listening: see what social media says (potential bias: only talky social users)
  4. Consider a study: attitude and/or behavior
  5. A/B test alternatives; but remember big picture
  6. Measure w/meaning: what, why, who
  7. End w/more questions

Punk DIY websites