CSS-Tricks has covered how to break text that overflows its container before, but not as much as you might think. Back in 2012, Chris penned “Handling Long Words and URLs (Forcing Breaks, Hyphenation, Ellipsis, etc)” and it is still one of only a few posts on the topic, including his 2018 follow-up “Where Lines Break is Complicated. Here’s all the Related CSS and HTML.”
Chris’s tried-and-true technique works well when you want to leverage automated word breaks and hyphenation rules that are baked into the browser:
.dont-break-out {
  /* These are technically the same, but use both */
  overflow-wrap: break-word;
  word-wrap: break-word;
  word-break: break-word;
  /* Adds a hyphen where the word breaks, if supported (No Blink) */
  hyphens: auto;
}
But what if you can’t? What if your style guide requires you to break URLs in certain places? These classic sledgehammers are too imprecise for that level of control. We need a different way to tell the browser exactly where to make a break.
Why we need to care about line breaks in URLs
One reason is design. A URL that overflows its container is just plain gross to look at.
Then there’s copywriting standards. The Chicago Manual of Style, for example, specifies when to break URLs in print. Then again, Chicago gives us a pass for electronic documents… sorta:
It is generally unnecessary to specify breaks for URLs in electronic publications formats with reflowable text, and authors should avoid forcing them to break in their manuscripts.
Chicago 17th ed., 14.18
But what if, like Rachel Andrew (2015) encourages us, you’re designing for print, not just screens? Suddenly, “generally unnecessary” becomes “absolutely imperative.” Whether you’re publishing a book, or you want to create a PDF version of a research paper you wrote in HTML, or you’re designing an online CV, or you have a reference list at the end of your blog post, or you simply care how URLs look in your project—you’d want a way to manage line breaks with a greater degree of control.
OK, so we’ve established why considering line breaks in URLs is a thing, and that there are use cases where they’re actually super important. But that leads us to another key question…
Where are line breaks supposed to go, then?
We want URLs to be readable. We also don’t want them to be ugly, at least no uglier than necessary. Continuing with Chicago’s advice, we should break long URLs based on punctuation, to help signal to the reader that the URL continues on the next line. That would include any of the following places:
- After a colon or a double slash (//)
- Before a single slash (/), a tilde (~), a period, a comma, a hyphen, an underline (aka an underscore, _), a question mark, a number sign, or a percent symbol
- Before or after an equals sign or an ampersand (&)
At the same time, we don’t want to inject new punctuation, like when we might reach for hyphens: auto; rules in CSS to break up long words. Soft or “shy” hyphens are great for breaking words, but bad news for URLs. It’s not as big a deal on screens, since soft hyphens don’t interfere with copy-and-paste, for example. But a user could still mistake a soft hyphen as part of the URL—hyphens are often in URLs, after all. So we definitely don’t want hyphens in print that aren’t actually part of the URL. Reading long URLs is already hard enough without breaking words inside them.
We still can break particularly long words and strings within URLs. Just not with hyphens. For the most part, Chicago leaves word breaks inside URLs to discretion. Our primary goal is to break URLs before and after the appropriate punctuation marks.
How do you control line breaks?
Fortunately, there’s an (under-appreciated) HTML element for this express purpose: the <wbr> element, which represents a line break opportunity. It’s a way to tell the browser, Please break the line here if you need to, not just any-old place.
We can take a gnarly URL, like the one Chris first shared in his 2012 post:
http://www.amazon.com/s/ref=sr_nr_i_o?rh=k%3Ashark+vacuum%2Ci%3Agarden&keywords=shark+vacuum&ie=UTF8&qid=1327784979
And sprinkle in some <wbr> tags, “Chicago style”:
http:<wbr>//<wbr>www<wbr>.<wbr>amazon<wbr>.com<wbr>/<wbr>s/<wbr>ref<wbr>=<wbr>sr<wbr>_<wbr>nr<wbr>_<wbr>i<wbr>_o<wbr>?rh<wbr>=<wbr>k<wbr>%3Ashark<wbr>+vacuum<wbr>%2Ci<wbr>%3Agarden<wbr>&<wbr>keywords<wbr>=<wbr>shark+vacuum<wbr>&ie<wbr>=<wbr>UTF8<wbr>&<wbr>qid<wbr>=<wbr>1327784979
Even if you’re the most masochistic typesetter ever born, you’d probably mark up a URL like that exactly zero times before you’d start wondering if there’s a way to automate those line break opportunities.
Yes, yes there is. Cue JavaScript and some aptly placed regular expressions:
/**
 * Insert line break opportunities into a URL
 */
function formatUrl(url) {
  // Split the URL into an array to distinguish double slashes from single slashes
  var doubleSlash = url.split('//')
  // Format the strings on either side of double slashes separately
  var formatted = doubleSlash.map(str =>
    // Insert a word break opportunity after a colon
    str.replace(/(?<after>:)/giu, '$1<wbr>')
      // Before a single slash, tilde, period, comma, hyphen, underline, question mark, number sign, or percent symbol
      .replace(/(?<before>[/~.,\-_?#%])/giu, '<wbr>$1')
      // Before and after an equals sign or ampersand
      .replace(/(?<beforeAndAfter>[=&])/giu, '<wbr>$1<wbr>')
    // Reconnect the strings with word break opportunities after double slashes
  ).join('//<wbr>')
  return formatted
}
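To see what these replacements produce, here is a minimal sketch that applies the same three rules to a short, made-up URL. The name insertBreaks and the example.com address are assumptions for illustration only. The patterns are written without named capture groups, using $& for the matched text, which should also work in engines that lack named-group support.

```javascript
// Hypothetical variant of the function above: same rules, no named capture groups.
function insertBreaks(url) {
  return url
    // Handle the strings on either side of a double slash separately
    .split('//')
    .map(function (part) {
      return part
        // After a colon
        .replace(/:/g, '$&<wbr>')
        // Before a slash, tilde, period, comma, hyphen, underscore,
        // question mark, number sign, or percent symbol
        .replace(/[\/~.,\-_?#%]/g, '<wbr>$&')
        // Before and after an equals sign or ampersand
        .replace(/[=&]/g, '<wbr>$&<wbr>');
    })
    // After a double slash
    .join('//<wbr>');
}

console.log(insertBreaks('https://example.com/a-b?x=1&y=2'));
// → https:<wbr>//<wbr>example<wbr>.com<wbr>/a<wbr>-b<wbr>?x<wbr>=<wbr>1<wbr>&<wbr>y<wbr>=<wbr>2
```

A string with none of the listed punctuation comes back unchanged, so the function is safe to run over text that may or may not be a URL.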
Try it out
Go ahead and open the following demo in a new window, then try resizing the browser to see how the long URLs break.
This does exactly what we want:
- The URLs break at appropriate spots.
- There is no additional punctuation that could be confused as part of the URL.
- The <wbr> tags are auto-generated to relieve us from inserting them manually in the markup.
This JavaScript solution works even better if you’re leveraging a static site generator. That way, you don’t have to run a script on the client just to format URLs. I’ve got a working example on my personal site built with Eleventy.
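For the static site generator route, one way to wire this up in Eleventy is a custom filter. What follows is a sketch under assumptions rather than a description of any particular site: the file name .eleventy.js, the filter name formatUrl, and the helper name configure are all illustrative.

```javascript
// .eleventy.js (hypothetical project config)

// Same three rules as the function above, repeated so this file stands alone.
function formatUrl(url) {
  return String(url)
    .split('//')
    .map(function (part) {
      return part
        .replace(/:/g, '$&<wbr>')
        .replace(/[\/~.,\-_?#%]/g, '<wbr>$&')
        .replace(/[=&]/g, '<wbr>$&<wbr>');
    })
    .join('//<wbr>');
}

// Eleventy calls the exported function with its config object at build time.
function configure(eleventyConfig) {
  // addFilter makes the function available to templates under the given name.
  eleventyConfig.addFilter('formatUrl', formatUrl);
}

// Guarded so the snippet also loads outside a CommonJS context.
if (typeof module !== 'undefined') {
  module.exports = configure;
}
```

In a Nunjucks template, the filter could then be applied as {{ reference.url | formatUrl | safe }}, where reference.url is a placeholder for your own data and the safe filter keeps the <wbr> tags from being escaped.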
If you really want to break long words inside URLs too, then I’d recommend inserting those few <wbr> tags by hand. The Chicago Manual of Style has a whole section on word division (7.36–47, login required).
Browser support
The <wbr> element has been seen in the wild since 2001. It was finally standardized with HTML5, so it works in nearly every browser at this point. Strangely enough, <wbr> worked in Internet Explorer (IE) 6 and 7, but was dropped in IE 8 onward. Support has always existed in Edge, so it’s just a matter of dealing with IE or other legacy browsers. Some popular HTML-to-PDF programs, like Prince, also need a boost to handle <wbr>.
One more possible solution
There’s one more trick to optimize line break opportunities. We can use a pseudo-element to insert a zero width space, which is how the <wbr> element is meant to behave in UTF-8 encoded pages anyhow. That’ll at least push support back to IE 9, and perhaps more importantly, work with Prince.
/**
 * IE 8–11 and Prince don’t recognize the `wbr` element,
 * but a pseudo-element can achieve the same effect with IE 9+ and Prince.
 */
wbr:before {
  /* Unicode zero width space */
  content: "\200B";
  white-space: normal;
}
Striving for print-quality HTML, CSS, and JavaScript is hardly new, but it is undergoing a bit of a renaissance. Even if you don’t design for print or follow Chicago style, it’s still a worthwhile goal to write your HTML and CSS with URLs and line breaks in mind.
References
- Andrew, Rachel. 2015. “Designing for Print with CSS.” Smashing Magazine, January 7. https://www.smashingmagazine.com/2015/01/designing-for-print-with-css/.
- ———. 2018. “A Guide to the State of Print Styles in 2018.” Smashing Magazine, May 1. https://www.smashingmagazine.com/2018/05/print-stylesheets-in-2018/.
- Coyier, Chris. 2012. “Handling Long Words and URLs (Forcing Breaks, Hyphenation, Ellipsis, etc).” CSS-Tricks, January 30, 2012. Last modified July 25, 2018. https://css-tricks.com/snippets/css/prevent-long-urls-from-breaking-out-of-container/.
- ———. 2018. “Where Lines Break is Complicated. Here’s all the Related CSS and HTML.” CSS-Tricks, May 9, 2018. Last modified April 24, 2020. https://css-tricks.com/where-lines-break-is-complicated-heres-all-the-related-css-and-html/.
- The Chicago Manual of Style. 2017. 17th ed. Chicago: University of Chicago Press. https://www.chicagomanualofstyle.org/.
- Prince. n.d. “Convert HTML to PDF with CSS.” Accessed February 25, 2021. https://www.princexml.com/.
- The Unicode Consortium. 2009. “Special Areas and Format Characters.” In The Unicode Standard. Version 5.2.0. http://unicode.org/versions/Unicode5.2.0/ch16.pdf.
- WHATWG. n.d. “HTML: Living Standard.” Accessed February 25, 2021. https://html.spec.whatwg.org/multipage/.
I’m pretty sure this is terrible for accessibility. Screen readers end up reading the entire URL. Usability and accessibility points to making the actual text the link … not the URL.
I believe this article is meant for when you must put links in the page content, not to encourage it.
I’m all for accessibility. I have two initial thoughts here.
First, this use case for conforming to the Chicago Manual of Style is specifically a matter of print styles. The code in question could be moved to a @media print query for CSS or an environment variable for server-side JS. Depending on a project’s accessibility requirements, adding these line break opportunities for screens could create the kind of conflict you raise, in which case Chicago’s recommendation to forgo line breaks in URLs for electronic documents still has merit.
Second, the <wbr> is a semantic element, and this technique is part of the HTML Living Standard. If a certain accessibility client doesn’t recognize <wbr> or reads them aloud, that feels like a problem for that client specifically at least as much as it may also be a problem of accessibility from a developer’s perspective. The solution seems to be to improve accessibility software itself, at least in addition to weighing when, how, and whether to implement this technique for either print or screen for a given project. My gut reaction is that writing semantic HTML that conforms to the specification shouldn’t be inaccessible. Which is to say, the issue you raise doesn’t fall squarely on front-end developers’ shoulders. That’s not to dismiss the concern (it’s valid), just to contextualize it a bit.
As it is, this technique doesn’t generate any false positives with pa11y or axe. However, if there is a concern about implementing this technique, it might be good to flag it as a known issue, similar to what the folks at the A11Y Project do.
Just use URL shorteners ;) or print the URL as [1] with a list on the next or a separate page
Those are both good solutions for some projects, just like Chris’s method at the top of the post. But they don’t necessarily work for formal writing or print styles, like the use cases in this post. Different projects have different needs.
Why wasn’t CSS line-clamp covered as a suggestion?
Wow, the <wbr> element is INCREDIBLY useful for the niche industry I work in: domain names. We often display domain names as part of our marketing or even search results. Normally we wrap the SLD and TLD in <span>s in order to better control the line breaking: <span>css-tricks</span><span>.com</span>
Thank you! Why do you return inside the forEach callbacks? This could confuse new developers.
Best regards,
Sladi
This doesn’t seem to work for email addresses. Ideally, I’d like to remove the “mailto:” portion from the printed address for a cleaner look. I’d also like to add a possible break after the colon and after the @. Is this possible?
To PREVENT breaks within URLs, use <nobr>https://mysite/mypage.html</nobr>
Don’t be afraid of the bedwetters who fret that it’s unofficial, non-standard, deprecated, or whatever. It is needed, it works, and it is universally supported.
The regex does not work on IE 11 as it does not support Named Capture Group