You can use newline characters in URLs
45 points
3 days ago
| 12 comments
| lemire.me
| HN
bmandale
3 hours ago
>Remove all ASCII tab or newline from input.

The title is referring to newlines inside HTML attributes, where they are removed and hence do not affect where the link points.

joshuahaglund
3 hours ago
Yeah "You can use newline or tab characters in the HREF attribute and the browser will throw a validation error, remove the offending character, try again, then succeed" would be a more accurate title.
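
For anyone who wants to see the step being discussed, here is a minimal Python sketch of the input preprocessing the URL Standard describes before parsing: trim leading and trailing C0 control characters or spaces, then remove every ASCII tab or newline. The function name `preprocess_url_input` and the boolean "validation error" flag are my own illustration, not an API from any browser or library, and the sketch does not parse the URL itself.

```python
# The three code points the URL Standard calls "ASCII tab or newline".
ASCII_TAB_OR_NEWLINE = "\t\n\r"

# C0 control characters (U+0000 to U+001F) plus the space character.
C0_CONTROL_OR_SPACE = "".join(chr(c) for c in range(0x21))


def preprocess_url_input(raw: str) -> tuple[str, bool]:
    """Return (cleaned input, whether a validation error would be flagged).

    Hypothetical helper: it only mimics the cleanup step, not URL parsing.
    """
    # Trim C0 control characters and spaces from both ends.
    trimmed = raw.strip(C0_CONTROL_OR_SPACE)
    # Drop tabs, line feeds, and carriage returns anywhere in the string.
    cleaned = trimmed.translate({ord(ch): None for ch in ASCII_TAB_OR_NEWLINE})
    # Any change means the input was not strictly valid, yet parsing continues.
    return cleaned, cleaned != raw


if __name__ == "__main__":
    print(preprocess_url_input("https://exa\nmple.com/pa\tth"))
```

As far as I can tell, the validation error is informational: the characters are dropped and parsing simply continues on the cleaned string.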
locknitpicker
53 minutes ago
> the title is referring to inside html attributes, where they will be removed hence not affect where the link points.

I thought so too, until I read the URL definition in RFC 1738:

   In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may need to be added to break long URLs across lines.  The whitespace should be ignored when extracting the URL.

   No whitespace should be introduced after a hyphen ("-") character. Because some typesetters and printers may (erroneously) introduce a hyphen at the end of line when breaking a line, the interpreter of a URL containing a line break immediately after a hyphen should ignore all unencoded whitespace around the line break, and should be aware that the hyphen may or may not actually be part of the URL.
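
A rough Python sketch of what the quoted passage asks of an interpreter, under my own assumptions: whitespace inside a wrapped URL is dropped, and when a line break directly follows a hyphen, both readings (hyphen kept, hyphen dropped) are returned because the text says the hyphen "may or may not" belong to the URL. The name `candidate_urls` and the regular expressions are hypothetical; the RFC does not prescribe an algorithm.

```python
import re

# A hyphen followed by optional spaces/tabs, a line break, and any indentation.
_HYPHEN_BREAK = re.compile(r"-[ \t]*[\r\n]+\s*")
# Any run of whitespace (spaces, tabs, line breaks, and so on).
_WHITESPACE = re.compile(r"\s+")


def candidate_urls(wrapped: str) -> list[str]:
    """Return the possible URLs hidden in a line-wrapped string.

    Hypothetical helper: the first candidate keeps every hyphen, the second
    treats a hyphen at a line end as something the typesetter added.
    """
    keep_hyphens = _WHITESPACE.sub("", wrapped)
    drop_hyphens = _WHITESPACE.sub("", _HYPHEN_BREAK.sub("", wrapped))
    if keep_hyphens == drop_hyphens:
        return [keep_hyphens]
    return [keep_hyphens, drop_hyphens]


if __name__ == "__main__":
    print(candidate_urls("http://example.com/long-\n    path"))
```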
pants2
3 hours ago
You can put pickle juice in your cereal too
nine_k
3 hours ago
When you write a regexp to detect liquids in your cereal, you have to account for the pickles, that is, newlines and tabs.
dotancohen
1 hour ago
Don't forget about the pickled cabbage (vertical tabs) and pickled pig's foot (null bytes).
dylan604
3 hours ago
I was thinking something similar. Just another example of how just because you can doesn't mean you should.
layman51
3 hours ago
After I read this, I started to look at the Wikipedia article on Base64 and eventually got to the article for the data URI scheme. That's where I found a sentence that seems to be a little bit at odds with the blog post. The Wikipedia article mentions that "whitespace characters are not permitted in data URIs".

But then I suppose it goes back to the main thrust of the blog post, because it says that in the context of HTML 4 and 5, linefeeds within an attribute value are ignored. So possibly there are some other contexts where whitespace might not be ignored.

TZubiri
2 hours ago
They are not, but you can encode them. If you percent-encode whitespace characters, you have included whitespace in a URL.

One of the requirements of URLs is that they need to be transmissible over paper or aural media, so arbitrary octets and the unused portion of ASCII are not legal either.
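
To make the "you can encode them" part concrete, here is a small Python sketch using the standard library's `urllib.parse.quote` and `unquote`. The wrapper name `encode_segment` is just for illustration; passing `safe=""` is my choice so that every reserved character, including `/`, gets escaped.

```python
from urllib.parse import quote, unquote


def encode_segment(segment: str) -> str:
    """Percent-encode one path segment, whitespace included.

    Hypothetical helper around urllib.parse.quote; safe="" means no
    character outside the unreserved set is left unescaped.
    """
    return quote(segment, safe="")


if __name__ == "__main__":
    encoded = encode_segment(" \n \n")
    print(encoded)                 # the escaped form contains only printable ASCII
    print(repr(unquote(encoded)))  # decoding restores the raw whitespace
```

The encoded form is plain printable ASCII, so it survives being written down or read aloud, while the decoded value still contains the raw space and newline characters.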

yndoendo
2 hours ago
Don't forget about pigeon packets. https://www.rfc-editor.org/rfc/rfc2549
sheept
2 hours ago
Somewhat relatedly, GitHub Pages does support using URL-encoded newline characters %0A to reference file names with newlines,[0] but GitHub itself will omit the file from the web UI's tree view.

[0]: https://sheeptester.github.io/hello-world/test/%20%0A%20%0A/...

behnamoh
3 hours ago
The title is misleading. I agree with @bmandale's comment.
renewiltord
3 hours ago
I don't even put space characters in my filenames. May MyDocu~1 live on forever.
galaxyLogic
43 minutes ago
I try to use "_" instead of whitespace in filenames. It means there is no need to URI-encode them, ever. If you have a space, you don't know whether it's a tab or a space. Or maybe two spaces. Also, when you tell somebody what the file name is, you don't pronounce spaces.
est
3 hours ago
On a side note, you can use many surprising non-standard HTTP verbs, but many CDNs like Cloudflare filter them.
jprjr_
2 hours ago
I stopped reading Daniel Lemire a while back.

He had a blog post that seemed just weird and out of left field. Like it was clearly a response to something but what? What was the motivation for it?

When asked he said y'know. He just thinks about stuff and writes and that's what he does.

Turns out the blog post was a post he also made on social media. And said post was a response to something. And I guess he thought it was pretty good writing and should go on his blog, too.

Nothing wrong with that on its own, but I feel like most people would preface a post like that with "I saw this thing." And when directly asked like... He just straight up lied?

That whole thing just rubbed me the wrong way.

For full context https://lemire.me/blog/2025/10/17/research-results-are-cultu...

In the comments I turned into kind of a dick. I was pretty upset about being lied to.

Anyways between that and articles like this that are honestly useless and kinda misleading - I'm not really the biggest fan.

jprjr_
2 hours ago
Looking back I'm still perplexed about why he never just linked to the original thing he was responding to.

I mean listen I understand - I'm not owed anything. If he wants to take posts from elsewhere and share them to his blog with all context and background removed that's his business. And he doesn't have to respond to any comments he doesn't want to.

But if he gets a question he doesn't want to answer... He could just not answer it. Just leave my comment hanging. Hell - he could delete it even. I'd be perplexed but would probably shrug it off.

The whole lying thing is what bothers me. I'd rather somebody just not respond than try to feed me bullshit.

etothet
2 hours ago
“Hey you got new lines in my URLs!”

“You got URLs in my new lines!”

bubblewand
3 hours ago
Vertical tabs in file names is where it’s at.
TZubiri
3 hours ago
Cool thanks I 100% will not, if only because newlines are header separators in HTTP.
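
A toy Python illustration of that concern, under stated assumptions: the request is built by naive string formatting (real HTTP clients generally validate or escape the path first), and `build_request_lines` is a made-up helper, not part of any library. It shows how a raw CRLF in the path turns one request line into two, while the percent-encoded form stays on a single line.

```python
def build_request_lines(path: str, host: str = "example.com") -> list[str]:
    """Naively format an HTTP/1.1 request head and split it into lines.

    Hypothetical helper for illustration only: no validation is done,
    which is exactly the mistake being demonstrated.
    """
    raw = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    # The head ends at the first empty line; its lines are CRLF-terminated.
    head = raw.split("\r\n\r\n", 1)[0]
    return head.split("\r\n")


if __name__ == "__main__":
    # A raw CRLF in the path splits the request line in two.
    print(build_request_lines("/a\r\nX-Injected: 1"))
    # The percent-encoded form stays on one line.
    print(build_request_lines("/a%0D%0AX-Injected:%201"))
```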
vivzkestrel
3 hours ago
- https://lemire.me/blog/ I am not able to see a quick list of all the posts on your blog, I tried all the pages

- https://lemire.me/posts

- https://lemire.me/archive

- https://lemire.me/archives

- Every one of them gives me a 404. Can you kindly add a page on your blog where I can just see the titles of all the articles quickly?

- Most blogs posted on HN are not user friendly in this regard. Sometimes the reader wants a quick glimpse of everything on one page so that they can quickly pick the interesting stuff.
