XHTML Club (xhtml.club)
21 points | 1 hour ago | 7 comments | HN
GavinAnderegg
18 minutes ago
In the early 2000s I was 100% sold on the idea of strict XHTML documents and the semantic web. I loved the idea that all web pages could be XML documents which easily provided their data for other sources. If you marked your document with an XHTML 1.0 Strict or XHTML 1.1 doctype, a web browser was supposed to show an error if the page contained an XML error. Problem was, it was a bit of a pain to get this right, so effectively no one cared about making compliant XHTML. It was a nice idea, but it didn't interact well with the real world.

Decades later, I'm still mildly annoyed when I see self-closing tags in HTML. They're no longer required and they remind me of the strict XHTML dream.

EDIT: I just checked, and my site (at least the index page) still validates! https://validator.nu/?showsource=yes&doc=https%3A%2F%2Fander...

hannob
22 minutes ago
I used to create a number of simple web pages in XHTML back in the days when we believed XHTML was the future. Recently, while going through and restructuring some of my old "online stuff", I learned that XHTML really isn't in a state that I'd want to use it any more:

* XHTML 1.0 and 1.1 are officially deprecated by the W3C.

* XHTML5 exists as a variant of HTML5. However, it's very clear that it's absolutely not a priority for the HTML5 working groups, and there's a statement that future features will not necessarily be supported by the XHTML5 variant.

* XHTML5 does not have a DTD, so one of the main advantages of XHTML - that you can validate its correctness with pure XML functionality - isn't there.

* If you do a 'view source' in Firefox on a completely valid XHTML 1.0/1.1 page, it'll redline the XML declaration like it's something wrong. Not sure if this is intended or possibly even a bug, but it certainly gives me a 'browser tells me this is not supposed to be there' feeling.

It pretty much seems to me XHTML has been abandoned by the web community. My personal conclusion has been that whenever I touch any of my old online things still written in XHTML, I'll convert them to HTML5.

al_borland
19 minutes ago
I was in college when XHTML was all the rage and everything we wrote had to pass validation. I still get uncomfortable adding breaks without closing them.
jraph
47 minutes ago
In the linked article:

> you should master the HTML programming¹ language

The footnote reads:

> 1. This is a common debate - but for simplicity sake I'm just calling it this.

It's not really a debate: HTML is a markup language [1], not a programming language. You annotate a document with its structure and its formatting. You are not really programming when you write HTML, since the markup is not procedural. (This is not gatekeeping: there's nothing wrong with this, and it doesn't make HTML a lesser language.)

To avoid the issue completely, you can phrase this as: "you should master HTML" and remove the footnote. Simple, clean, concise, clear. By the way, ML already means "Markup Language", so any "HTML .* language" phrasing can feel a bit off.

[1] https://en.wikipedia.org/wiki/Markup_language

falcor84
25 minutes ago
I think that it is a debate, and it depends on the role of HTML in your system.

If all you're doing is using HTML to "annotate a document with its structure and its formatting", then yes, I'll accept that it's not quite programming, but I've not seen this approach of starting with a plain non-html document and marking it up by hand done in probably over two decades. I do still occasionally see it done for marking up blog posts or documentation into markdown and then generating html from it, but even that's a minuscule part of what HTML is used for these days.

Your mileage may vary, but what I and people around me typically do is work on hundreds/thousands of loosely coupled small snippets of HTML used within e.g. React JSX, or Django/Jinja templates or htmx endpoints, in order to dynamically control data and state in a large program. In this sense, while the html itself doesn't have control flow, it is an integral part of control flow in the larger system, and it's extremely likely that I'll break something in the functionality if I carelessly change an element's type or attribute value. In this sense, I'm not putting on a different hat when I'm working on the html, but just working on a different part of the program.

jraph
20 minutes ago
> React JSX, or Django/Jinja templates

Those are not HTML. Neither is PHP, even when used as a templating language for HTML.

> htmx endpoints

I'm not really familiar with htmx, but I would say this is HTML augmented with some additional mechanisms. I don't know how I would describe this augmented HTML, but I'm not applying my "not programming" statement to htmx (I probably could, but I haven't given it enough thought to do so).

> In this sense, I'm not putting on a different hat when I'm working on the html, but just working on a different part of the program.

I agree with this actually. I wouldn't consider that writing HTML (or CSS) is really a separate activity when I'm building some web app.

throwaway150
12 minutes ago
> In this sense, while the html itself doesn't have control flow, it is an integral part of control flow in the larger system

That's correct but I don't see what it has got to do with the question of whether HTML is a programming language or not.

Strings do not have control flow, but strings are an integral part of larger programs that have control flow. So what? That doesn't make strings any closer to being programming languages.

radicalethics
26 minutes ago
What happens if I simply add an iterator mechanism to HTML (well, I guess we need variables too)? Is it no longer a markup language here (I won't add anything else):

<for i=0; i<1; i++> <html> </html> </for>

Better question, why don't we upgrade XML to do that?

jraph
23 minutes ago
That's not technically HTML anymore.

But if you disagree with this, or somehow work around this statement by replacing your for element with some "for-loop" custom element (it is valid HTML to add custom tags with dashes in their names), my stronger argument is at https://news.ycombinator.com/item?id=46743219#46743554

embedding-shape
38 minutes ago
I dunno, you're being pedantic :) Yes yes, the name clearly ends in "Markup Language", so yeah, with a very strict definition of programming languages, HTML is not one of them.

But if we use a broader definition, basically "a formal language that specifies behavior a machine must execute", then HTML is indeed a programming language.

HTML is not only about annotating documents or formatting; it can do things you expect from a "normal" programming language too. For example, you can do constraint validation:

    <input name="token" required pattern="[A-Z]{3}-\d{4}" title="Must match ABC-1234 (3 uppercase letters, hyphen, 4 digits)" placeholder="ABC-1234">

That's neither annotating a "document" nor just formatting. Another example is using <details> + <summary>: you have users mutating state that reveals different branches in the page, all just using HTML and nothing else.

In the end, I agree with you, HTML ultimately is a markup language, but it's deceiving, because it does more than just markup.
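
To make the constraint above concrete, here is a minimal sketch in Python of the check the `required` and `pattern` attributes describe. It assumes the usual behaviour that `pattern` must match the entire value; `token_is_valid` is a hypothetical helper name, not anything from HTML or a browser API.

```python
import re

# Pattern copied from the <input> example above. re.ASCII keeps \d limited
# to 0-9, and fullmatch() mirrors HTML matching the whole value.
TOKEN_PATTERN = re.compile(r"[A-Z]{3}-\d{4}", re.ASCII)


def token_is_valid(value: str) -> bool:
    """Hypothetical helper: approximate `required` plus `pattern`."""
    if value == "":
        return False  # `required` rejects an empty value
    return TOKEN_PATTERN.fullmatch(value) is not None
```

Either way the rule itself stays declarative: the markup states what is allowed, and something else (here the function, in a page the browser) does the checking.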

jraph
35 minutes ago
> I dunno, you're being pedantic :)

Maybe I am, though I'm usually not. But this is all that xhtml.club and this footnote are about, so we might as well be correct :-)

Constraint validation is still descriptive: it states what is allowed.

All details and summary are doing is conveying information on what's a summary and what's the complete story, and it has this hidden / shown behavior.

In any case, you will probably find something procedural / programming-like in HTML, but it's not the core idea of the language, and if you are explaining what HTML is to a newbie, I feel like you should focus on the essentials. Then we can discuss the corner cases among more experienced people.

In the end, all I'm saying is: you can just avoid issues and just say "HTML" without further qualifying it.

throwaway150
21 minutes ago
I'm not sure we can call your parent comment pedantic. They're just being correct. Is it pedantic to say that fish is not a fruit? It's just correct to do so.

If anything, it is the act of stretching the definition of "programming language" so much that it includes HTML as a programming language that we should call pedantic.

nathell
37 minutes ago
It’s ironic that the very site in question, despite claiming XHTML compliance, is served as text/html instead of application/xhtml+xml, so the browser will never parse it as XML.

To quote [0]:

> All those “Valid XHTML 1.0!” links on the web are really saying “Invalid HTML 4.01!”.

Although the article is 20 years old now, so these days it’s actually HTML5.

Edit: Checked the other member sites. Only two are served as application/xhtml+xml.

[0]: https://webkit.org/blog/68/understanding-html-xml-and-xhtml/
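
A rough sketch of the rule in Python, assuming the parser choice depends only on the Content-Type header and ignoring everything else a real browser considers. `uses_xml_parser` and `XML_TYPES` are hypothetical names made up for illustration.

```python
# Media types treated as XML in this simplified sketch.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def uses_xml_parser(content_type: str) -> bool:
    """Return True when the header value names one of the XML types above."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in XML_TYPES
```

Under this sketch, a page sent as `text/html` goes to the HTML parser no matter which doctype it declares.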

assimpleaspossi
22 minutes ago
>>these days it’s actually HTML5.

There is no HTML5. It's just a buzzword. https://html.spec.whatwg.org/dev/introduction.html#is-this-h...?

jraph
15 minutes ago
That's a stretch. Your link says

> Is this HTML5?

> In short: Yes.

See also [1].

That HTML5 was used in marketing doesn't make the technical term disappear. HTML5 is a bit more precise than HTML, it refers to the living standard that's currently in use, as opposed to HTML 4.01 and the previous versions of HTML.

[1] https://fr.wikipedia.org/wiki/HTML5

jraph
33 minutes ago
And this makes the XML prolog invalid, because it's invalid to have it in HTML.

Not having it is XHTML compliant though, so it could just be removed.

reconnecting
24 minutes ago
Does valid HTML 4.01 (1) made in 2025 count?

I don’t think it’s about luddites, as the website mentions. Many professions have tools that signal a person's extensive experience, and in web development, XHTML and older HTML standards are such tools.

1. https://www.tirreno.com

jraph
3 minutes ago
The XML part of XHTML is an important feature which HTML 4.01 doesn't have.

I know it is unfortunately not the case, but bragging that your HTML is valid should be equivalent to being proud that your Java code parses and compiles…

I would go to the lengths of stating that writing valid HTML should be a bare minimum, and then we can talk about whether to use the XML markup.

And I think few things are actually invalid HTML5.

But since browsers will happily take invalid HTML, it's nice to care about doing the right thing and ensuring your HTML is valid.

throwaway150
20 minutes ago
It does not? HTML 4.01 is not XML. So not XHTML. What's the confusion?
reconnecting
12 minutes ago
Both technologies are from the same period and share the same validation culture from the W3C.
kevincox
31 minutes ago
I would really like to use XHTML. It would make my HTML emitter much simpler (as I don't need special rules for elements that are self-closing, have special closing or escaping rules and whatever else) and more secure.

However no browsers have implemented streaming XHTML parsers. This means that the performance is notably worse for XHTML and if you rely on streaming responses (I currently do for a few pages like bulk imports) it won't work.
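
To illustrate the "much simpler" part, here is a minimal sketch in Python of an XML-style emitter, assuming the output is served as application/xhtml+xml. The `emit` function and the `(tag, attrs, children)` tuple shape are hypothetical, chosen only for this example.

```python
from xml.sax.saxutils import escape, quoteattr


def emit(node):
    """Serialize a str or a (tag, attrs, children) tuple as XML."""
    if isinstance(node, str):
        return escape(node)  # one escaping rule for all text
    tag, attrs, children = node
    attr_text = "".join(
        f" {name}={quoteattr(value)}" for name, value in attrs.items()
    )
    if not children:
        return f"<{tag}{attr_text}/>"  # any empty element may self-close
    inner = "".join(emit(child) for child in children)
    return f"<{tag}{attr_text}>{inner}</{tag}>"
```

There is no list of void elements and no special case for script or style content here. An HTML emitter needs both, which is where the extra rules come from.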

jraph
27 minutes ago
> no browsers have implemented streaming XHTML parsers

Dang, I hadn't considered this. That's something to add to the "simplest HTML omitting noisy tags like body and head vs going full XHTML" debate I have with myself.

One point for XHTML: I like that the parser catches errors; it often prevents subtle issues.
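
As a small illustration of that error catching, here is a sketch using Python's standard XML parser rather than a browser. It checks well-formedness only, not validity against any schema, and `well_formed_error` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def well_formed_error(markup: str):
    """Return the parser's error message, or None if the markup parses as XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None
```

Passing `"<p>Hello <b>world</p>"` returns an error message because the `b` element is never closed, while an HTML parser would quietly recover from the same input.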
