The Rise of the Em-Dash in Hacker News Comments
Posted by sobradob 2 days ago
Comments
Comment by Iuz 2 days ago
Comment by MikeTheGreat 2 days ago
Like, I could see some people noticing that the book they're reading has dashes that are a bit longer than normal, but what made you think "That must be its own thing, separate from a normal dash" as opposed to something like "In this font the dashes are very long"?
Comment by pinusc 1 day ago
Also, it's called em dash because it's as long as the letter m (as a rule of thumb), so it's usually an easy visual comparison. Finally, a typeface with hyphens as long as em dashes would be terrible and quite noticeably wrong!
Comment by nagaiaida 1 day ago
Comment by Iuz 2 days ago
Comment by bb88 2 days ago
Comment by lamasery 1 day ago
Comment by illiac786 1 day ago
Comment by ranger_danger 2 days ago
Comment by palmotea 1 day ago
The em-dash is indicative of AI usage when it shows up in contexts where it doesn't belong, such as informal contexts like forum comments and emails (though "smart" substitutions do complicate the picture a bit).
It'd only be funny if they argued it indicated AI usage in contexts where it does belong, like formal writing.
Comment by JKCalhoun 1 day ago
Comment by lamasery 1 day ago
Comment by palmotea 1 day ago
I think you have to make a distinction: there's using a dash as you describe and using the actual em-dash character. Without a smartquotes-type autocorrect feature (which admittedly is common in certain apps/platforms like Outlook and Word), an actual em-dash is awkward to type. I'd expect someone using it informally to just use a regular dash (-) or two (--).
I think you're automatically in a pretty formal writing context if you care if you use an em-dash character or not.
Which brings up an interesting idea: would Microsoft turn off its smartquotes-type autocorrect, because now it makes you look like a dumb AI-user? Probably, if they cared about their users. But I doubt they will because they're so into hyping AI that "Microslop" is a thing.
Comment by lamasery 1 day ago
Comment by ranger_danger 1 day ago
I have known people that personally used em-dashes in all the wrong places way before AI... entire emails would just be paragraphs-long run-on sentences filled with dashes.
Comment by yencabulator 1 day ago
Comment by BeetleB 2 days ago
It went from 19.3 to 32.5. It did not even double. Which means that if you see a comment with an em-dash, it's more likely to be human than LLM.
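To make that arithmetic explicit, here is a minimal sketch using only the two figures quoted above. It assumes (and this is an assumption, not something the chart establishes) that human usage stayed flat at the earlier level and that all of the growth comes from LLM-written comments. The function name `human_share` is made up for illustration.

```python
def human_share(before: float, after: float) -> float:
    """Share of today's em-dash comments explained by the old baseline.

    Assumption: human usage stayed at `before`, and every bit of growth
    from `before` to `after` is LLM-written. Real data is messier.
    """
    if before < 0 or after <= 0 or before > after:
        raise ValueError("expected 0 <= before <= after and after > 0")
    return before / after


before, after = 19.3, 32.5  # the two values quoted in the comment above

growth = (after - before) / before
share = human_share(before, after)

print(f"relative growth: {growth:.1%}")   # relative growth: 68.4%
print(f"human share:     {share:.1%}")    # human share:     59.4%
```

Under that assumption, the human share stays above 50% for as long as the figure has not doubled, which is the point being made here.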
Comment by meisel 2 days ago
Comment by tmoertel 2 days ago
For example, take a look at just about any stock chart (try https://www.google.com/finance/beta/quote/GOOG:NASDAQ?hl=en). There's actual money on the line, but no baseline. Why do you think that is?
Comment by wtallis 2 days ago
Comment by throwway120385 2 days ago
Comment by BeetleB 2 days ago
Visually, this is vastly exaggerating the variation. Actual usage did not even double.
Comment by tmoertel 2 days ago
No, it is literally showing the exact variation of interest. If you think it's exaggerating the variation, you are not reading the chart. You are glancing at the chart, ignoring what it actually says in multiple ways, and imagining it has a baseline of zero, when it clearly does not.
Read the chart. What does it actually say?
Comment by BeetleB 2 days ago
That's true of every instance where a chart is criticized for playing around with the axes scale. Imagine the stock price of a company varied between 50.1 and 50.2 over a week. And I presented it as a chart with the min being 50.09 and max being 50.21, and drew all the variation over a large vertical space. And then tried to imply that the stock was volatile. What would be the problem?
Let me ask you this. What is the point of this chart (or any similar chart)? Simply presenting a table with all the values would have conveyed all the information - wouldn't you agree?
Comment by tmoertel 1 day ago
> That's true of every instance where a chart is criticized for playing around with the axes scale.
Indeed. The criticism, however, is only apt when the chart's intended audience is likely to have a hard time understanding what that chart is trying to communicate. If you're publishing a bar chart in USA Today and its y-axis doesn't start at zero, yeah, that's a problem.
But the OP's chart that started this whole thread? It's fine. First, the intended audience is HN readers, who can be assumed to be numerically literate. Second, it's a line chart whose y-axis labels make clear what the range of variation is. Third, the data points, themselves, are labeled with their values. Finally, the thrust of the chart, that em-dash usage in HN posts has markedly increased since the widespread adoption of LLMs, is itself also explicitly called out and labeled: "+79% from pre-AI baseline."
If you try to tell me that the author of that chart is trying to mislead HN readers about the growth of em-dash use on HN, I'm going to have a hard time taking your claim seriously.
> Imagine the stock price of a company varied between 50.1 and 50.2 over a week. And I presented it as a chart with the min being 50.09 and max being 50.21, and drew all the variation over a large vertical space.
I have an easy time imagining your chart because that's how stock charts are plotted. That's what the financial community expects. That's how it's done: The y axis is bracketed by the low and high values over the period being charted, perhaps after rounding to the nearest nice value. For example, today's chart for the Russell 2000 Index shows a gain of just 0.30%, similar to the tiny relative volatility in your example. The chart's y axis ranges from 2,695 to 2,715 (https://share.google/oKPQxlmZFsgSVoNOS). It does not start at zero.
If it did start at zero, it would be unsuited for its intended purpose. How would you observe the day's variation on what appeared to be a flat horizontal line at the top of a chart whose y axis ranged from 0 to 3000?
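A rough sketch of that comparison, for anyone who wants numbers: it brackets a range by rounding the low and high outward to a "nice" step, then measures how much of the chart's height the movement occupies on that axis versus a zero-based one. The day's low and high (2697.3 and 2713.8) and the step of 5 are hypothetical values picked so the bracket matches the 2,695 to 2,715 axis mentioned above; the function names are illustrative, not any charting library's API.

```python
import math


def nice_bracket(low: float, high: float, step: int) -> tuple[int, int]:
    """Round `low` down and `high` up to the nearest multiple of `step`."""
    return math.floor(low / step) * step, math.ceil(high / step) * step


def fraction_of_height(low: float, high: float,
                       axis_min: float, axis_max: float) -> float:
    """Fraction of the y axis's height that the low-to-high move occupies."""
    return (high - low) / (axis_max - axis_min)


# Hypothetical intraday low/high, chosen to reproduce the quoted axis range.
day_low, day_high = 2697.3, 2713.8

axis_min, axis_max = nice_bracket(day_low, day_high, step=5)
bracketed = fraction_of_height(day_low, day_high, axis_min, axis_max)
zero_based = fraction_of_height(day_low, day_high, 0, 3000)

print(f"bracketed axis: {axis_min} to {axis_max}")
print(f"share of height, bracketed axis:  {bracketed:.1%}")
print(f"share of height, zero-based axis: {zero_based:.2%}")
```

With these made-up inputs the move fills most of the bracketed chart but well under 1% of a 0 to 3000 chart, which is why the zero-based version would look like a flat line.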
Why do you think the financial world does stock charts the way it does stock charts? Do you think financial analysts don't know how to communicate the day’s movement of a stock to each other?
> And then tried to imply that the stock was volatile. What would be the problem?
The problem would be that your audience, if they were accustomed to reading stock charts, would think you didn't know what you're talking about. Your chart would refute your claims, and anybody accustomed to reading stock charts would know it.
> Let me ask you this. What is the point of this chart (or any similar chart)? Simply presenting a table with all the values would have conveyed all the information - wouldn't you agree?
The point of this chart, like any good chart, is to present the intended information to the intended audience faster and more conveniently than the alternatives. (Do you have any problem with that claim?) And, in this case, I'd say the OP's chart met that standard. Likewise, I'd argue that the typical stock chart, which is bracketed by the stock's low and high values, meets that standard as well.
In both of those examples, you could also communicate the same information in a table, but a table wouldn't be as fast or convenient as a chart, given the expected audiences.
Comment by BeetleB 1 day ago
I am saying precisely that. A significant number of HN users have a strong (and IMO irrational) anti-LLM bias. And these people pollute the discussion forums accusing people of using LLMs to write the content/comments.
It's not a stretch to believe that those folks will look at the chart uncritically. Everyone - even the smartest of folks - has blind spots (this was quite apparent when I worked with top professors in their fields while in academia). And blind spots often correlate with their biases.
Comment by tmoertel 1 day ago
Well, then, do you believe that the following evidence supports or undermines your hypothesis that the author is trying to mislead HN readers about em-dash use?
1. The author explicitly labeled each data point with its numeric value so that even if readers ignored the y-axis labels they could not misread the points.
2. The author explicitly labeled the pre- to post-AI growth as +79% so that even if readers ignored the y-axis labels and the data-point labels they could not misread the growth.
(The fact that you posed an example about a stock chart earlier but then completely ignored my response that refuted your argument about it suggests that you are not likely to be swayed by evidence and reason, but I'm giving it this one last try.)
Comment by mcphage 2 days ago
Honestly, I hate that about stock charts. They adjust the axes and scales so that the graph itself provides no information. Did it go up 1 point? 200 points? 5%? 50%? You can’t tell, because the graph is just a scale free squiggle.
Comment by tmoertel 1 day ago
If a stock chart "provides no information," why do you believe the financial world uses them to communicate the movement of a stock over a period of interest? Do you believe that financial analysts do not understand how to communicate stock movements to each other?
> Did it go up 1 point? 200 points? 5%? 50%? You can’t tell, because the graph is just a scale free squiggle.
I don't know what stock charts you've been looking at, but all the common ones list the stock's price on the y axis, making it easy to answer all of the questions you pose. For example, consider what a Bloomberg Terminal gives you [1] or what you get from Google Finance [2] or from Yahoo! Finance [3].
Take a look at the y axes. See those labels? What do you think they mean?
[1] See page 15 of https://data.bloomberglp.com/professional/sites/10/Getting-S...
Comment by cosmotic 2 days ago
Comment by ortusdux 2 days ago
Comment by Sarkie 2 days ago
Now she's been accused of using AI for her pieces.
Oh well.
Comment by throw0101a 2 days ago
Read the observation that AI was (presumably) trained on the 'best' (or at least 'quality') writing, and so if good writers tended to use em-dashes, it should not be surprising that AI generates text with it.
But if your personal style includes using them, you should continue to do so. Why should you dial down your own voice just because someone else may be mimicking it?
Comment by bb88 2 days ago
You can do small succinct sentences, but style-wise it sucks for longer passages.
Comment by umanwizard 2 days ago
Comment by razingeden 2 days ago
This gets corrected to an emdash.
I get annoyed and put the double dash back in.
Sometimes swearing a little or grumbling “HEY. I typed what I typed” at it helps a little.
I don’t even know how many times in 20-30+ years I’ve checked some box in system or program preferences begging it to knock that off.
This is the real reason I already loathe and avoid the emdash (nitpicking over a personal stylistic preference I won’t relent on even if I’m wrong) but I can’t be the only one this happens to.
Getting piled on and called “AI” really doesn’t ease my distaste for it, but .. do people.. not write enough to understand that it brute forces its way into human copy as well?
and yes. phone posting on HN. will insert them. to my dismay.
The other one that ticks me off endlessly but I’ve finally said to hell with it and just let it go?
Turning " into “.
(Writer. Not a very good one and I’m not here to steer anyone to that drivel. But at least I’m a human one.)
Comment by bitwize 2 days ago
Comment by Terr_ 2 days ago
Comment by Gormo 2 days ago
Although I still prefer the traditional ASCII double-dash -- easier to type, and less potential for character encoding issues. Also, LLMs don't seem to use it at all.
Comment by Gormo 2 days ago
Comment by operatingthetan 2 days ago
Comment by ButlerianJihad 2 days ago
Comment by interstice 2 days ago
Comment by turtleyacht 2 days ago
Comment by lamasery 1 day ago
(Incidentally, I love that backlash against LLM writing has more people developing as much of an allergy to emojis and content-marketing- and personal-branding-style writing as I've long had)
Comment by JKCalhoun 1 day ago
Comment by Freedom2 2 days ago
Comment by interstice 1 day ago
Comment by tayo42 2 days ago
Comment by JKCalhoun 1 day ago
Comment by BoredPositron 2 days ago
Comment by throw0101a 2 days ago
Depending on the text area you are typing into, if you type two hyphens/minuses right after each other (no spaces), Apple systems often translate them to an em-dash (kind of mimicking (La)TeX).
(If you don't want the em-dash, hit <cmd-z> with macOS to undo that auto-conversion.)
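As a toy illustration of that kind of substitution (this is a sketch of the general idea, not Apple's actual implementation, which isn't described here), the following replaces a run of exactly two hyphens with an em dash and can reverse it again. The function names are made up, and the em dash is written as the escape `\u2014` so the code itself stays plain ASCII.

```python
import re

EM_DASH = "\u2014"  # the em dash character, written as an escape

# Exactly two hyphens: not preceded and not followed by another hyphen.
_DOUBLE_HYPHEN = re.compile(r"(?<!-)--(?!-)")


def smart_dashes(text: str) -> str:
    """Replace each run of exactly two hyphens with a single em dash."""
    return _DOUBLE_HYPHEN.sub(EM_DASH, text)


def undo_smart_dashes(text: str) -> str:
    """Turn every em dash back into a double hyphen (a crude 'undo')."""
    return text.replace(EM_DASH, "--")


typed = "wait--what"
converted = smart_dashes(typed)

print(converted == "wait" + EM_DASH + "what")   # True
print(undo_smart_dashes(converted) == typed)    # True
print(smart_dashes("a - b"))                    # a - b  (single hyphen untouched)
print(smart_dashes("a---b"))                    # a---b  (three hyphens untouched)
```

Note that the crude undo also rewrites em dashes the writer typed on purpose, which is roughly the same ambiguity readers face when trying to guess where an em dash came from.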
Comment by bitwize 2 days ago
Comment by sobradob 2 days ago
Comment by jordand 2 days ago
Comment by add-sub-mul-div 2 days ago
Comment by xxxxxxxx 2 days ago
Comment by flowerthoughts 1 day ago
Comment by iamnothere 1 day ago
Comment by flowerthoughts 20 hours ago
Comment by zvr 1 day ago
Comment by JKCalhoun 1 day ago
Comment by number6 1 day ago
Comment by JKCalhoun 1 day ago
Comment by northisup 2 days ago
Comment by juped 2 days ago
Comment by spudlyo 2 days ago
Comment by kstrauser 2 days ago
Comment by myhf 2 days ago
https://en.wikipedia.org/wiki/Caedite_eos._Novit_enim_Dominu....
Comment by lamasery 1 day ago
Until that relatively recent shift, it was named the Morse Dash—you'd think because of the "long" glyph when rendering Morse Code, but no, it was named for the 17th century English Catholic martyr Henry Morse, for reasons lost to time.
Comment by lz400 2 days ago
Comment by dragonwriter 2 days ago
Maybe if you are looking at it in a monospaced environment like the HN edit window; rendered in a proportional font, hyphens, en-dashes, and em-dashes are quite distinct from each other.
> It's no surprise humans barely use them. Then why did it get picked up so much by AIs?
It got picked up by AIs because their training corpus includes plenty of professionally published work, not just informal, off-the-cuff communication, and professionally published work uses typographic dashes (em-dashes, en-dashes, and even 2-em- and 3-em-dashes) extensively. (3-em less so in newer works, it having, e.g., dropped out of the recommendations of the Chicago Manual of Style as of 2024.)
Comment by dr_dshiv 2 days ago
Comment by marssaxman 2 days ago
Comment by lamasery 1 day ago
Within months I was convinced that every default English keyboard I'd ever seen except the Mac one is strictly worse. It bothers me now how hard it is to get a consistent Mac-style keymap on Linux. This is one thing others should for-sure just rip off entirely. It's so much better.
Comment by mananaysiempre 2 days ago
Comment by marssaxman 2 days ago
I've never heard of a "level 3 shift key"; I'll have to look that up.
Comment by BeetleB 2 days ago
Comment by hyperhello 2 days ago
Comment by wwalexander 2 days ago
Comment by UqWBcuFx6NV4r 2 days ago
Comment by yojo 2 days ago
Comment by dragonwriter 2 days ago
Comment by crazygringo 2 days ago
If it's all comments, including flagged/dead/downvoted/etc., then it's not reflective of the actual filtering HN does.
But if it's weighting comments by their likelihood of being read -- e.g. mostly top comments on popular stories -- then I'd be a lot more curious.
I'm not surprised AI spam has increased substantially. But I'd be surprised if it's affected the comments most people actually read to anywhere close to the degree shown in this graph.
Comment by sobradob 2 days ago
Comment by crazygringo 2 days ago
Comment by Rekindle8090 2 days ago
Comment by mcphage 2 days ago
Comment by jcims 2 days ago
key insight - https://trends.google.com/explore?q=key%2520insight&date=all...
etc.
Comment by ChrisArchitect 1 day ago
Show HN: Hacker News em dash user leaderboard pre-ChatGPT
Comment by kkfx 1 day ago
Comment by lapcat 2 days ago
Comment by qup 2 days ago
Is HN more botted, or less? And are banned accounts excluded?
Comment by negura 2 days ago
Comment by andrewclunn 2 days ago
Comment by derbOac 2 days ago