Half million 'Words with Spaces' missing from dictionaries (linguabase.org)
I could, however, be convinced these could be documented/defined in a separate document, especially from the perspective you are coming from (word games).
"Words" don't have "spaces."
Phrases are made of words separated by spaces.
"Boiling Water" is not a word.
"Water" is a word. A noun, the subject.
"Boiling" is a word. An adjective, in this case. Which modifies the subject.
I don't know if you're trying to be clever, but you're not.
Just because the phrase is used colloquially to describe a specific group of restaurants doesn't change the fact the phrase "fast food" is comprised of two words, one being a noun; the other an adjective.
Fast greasy food. Fast disgusting food.
Fast food is not a word. It is a phrase.
Words don't have spaces.
In your native tongue you take these for granted, but in a second language you have to learn that the sum is more (or different) than the parts.
> Got a word | Didn’t
> frozen water → ice | boiling water
Freezing water doesn’t have a word. Boiled water does have a word.
ice - water - steam
This is also an interesting case because “vapor” without a qualifier also refers to a suspension of solid or liquid particles in gas (of which “steam” is a particular example).
I suspect the answer isn't binary, but it's interesting to think about.
This "sixth sense" phenomenon seems to pop up a lot. Crosswords are a great example. The sense some people are getting for detecting LLM output might be another.
Entschädigungsleistungen - compensation benefits
Wiederbeschaffungskosten - replacement value
Kraftfahrzeughaftpflichtversicherung - motor vehicle liability insurance
Donaudampfschifffahrtsgesellschaftskapitän - Danube steamboat captain
Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz - beef labeling regulation law
It's pretty obvious "boiling water" shouldn't be in the dictionary to begin with but "boiling" and "water" should be.
Unless I'm just not understanding it
It's just part of how language works that when there isn't a single word carrying the meaning you want, you put multiple words together and they can mean the thing together.
Even though there isn't a specific word for that, I wouldn't say "It's just part of how language works that when there isn't a single word carrying the meaning you want, you put multiple words together and they can mean the thing together" actually is one big word with spaces in it.
It's a bunch of words together that carry a more specific meaning when put together in that order.
Water: transparent, odorless, tasteless liquid
Boiling: having reached the boiling point
Boiling Water: transparent, odorless, tasteless liquid which has reached the boiling point
If Boiling Water had some other completely different meaning that has nothing to do with the individual words then sure, maybe, otherwise this is completely redundant and opinionated.
Personally, I don't agree that "boiling water" is a word (with a space) - I would refer to it as a phrase if it had specific meaning, but it just seems like an ordinary pairing of adjective and noun. Also, if a word can contain a space, then what is the meaning of "word", as there doesn't seem to be an easy way to distinguish between a "compound word" and a common phrase? Is "barking dog" a pair of words, a compound word or a phrase? (It's a pair of words in my mind)
> “If you speak slowly, you pause very briefly after each word. Thatʼs why we leave a space in those places when we write. Like this: How. Many. Years. Old. Are. You?” He wrote on his paper as he spoke, leaving a space every time he paused: Anyom a ou kuma a me?
> “But you speak slowly because youʼre a foreigner. Iʼm Tiv, so I donʼt pause when I speak. Shouldnʼt my writing be the same?”
"Eachother" feels as natural as "somebody", "nobody", "anybody" to me
So the issue is just that this is figurative language, and you have to know that a kickoff is the beginning of certain sports, for example. It's more of a cultural issue than something a dictionary needs to fix.
If we are talking engineering, the term steam generally implies water vapor that is at or above the saturation temperature.
In everyday usage people are usually drawing a distinction between visible and invisible water vapor, the visibility usually being caused by the presence of liquid droplets, with "steam" being essentially "fog", but hotter.
And there's effectively no other gas in the steam, because dissolved air in the boiler's feedwater (particularly oxygen and carbon dioxide) has to be removed to prevent corrosion. To that end, water going into the boiler is first run through a deaerator, to remove any air that dissolved in the water as it came through the condenser.
Well, that's true, I haven't, BUT still I went back and forth writing and deleting and rewriting and eventually deleting a whole digression about the special case of the jargon of steam power and how it uses “wet steam” (or “saturated steam”) for “steam” in the general use sense and “dry steam” for “water vapor” and “superheated steam” for dry steam created by heating wet steam away from contact with water, before deciding that was way too much, but, yeah, that's all true. (And, in details about the actual processes used, a lot more than I knew or would have gone into even if I had and had decided to keep the digression.)
I guess we'll have to disagree then, because "boiling water" is "water that's boiling" to me. It's not a different state of matter to "water", that would be "steam". It being a hazard doesn't mean it's a singular concept, same as "wet floor"
Adding two words together creates a new and different concept. The permutations necessary to represent every concept ever formed by combining two or more different words would be endless.
Some of them on the list, like black hole, do make sense. That's a very distinct thing. It's not a hole in the conventional sense and it's not really black. Boiling water, though, is water. And it's boiling.
Norwegian is almost as compound-happy as German, and we could've filled many volumes with compounds. But what generally happens for one of the compounds to enter the dictionary is that the compound needs to have a meaning that is non-obvious from the individual parts, at least to some people, and typically that the compound has a non-obvious meaning if interpreted as two separate words.
E.g. "akterutseilt" is an example. "Akterut" means behind, aft. "Seilt" means sailed. "Behind sailed" helps as a way to remember it, but it's not obvious whether it's strictly a sailing term, or means that you've been left behind or have left someone else behind.
In this case if you say someone has been akterutseilt, it means they've been metaphorically left behind, often by their own failure to keep up.
Those kinds of compounds deserve dictionary entries whether they are actually written in two words or one, because they function as a single unit however it is written.
I think black hole is a perfect example in English. And in fact, this is a compound that is written in two words in Norwegian as well, but is in Norwegian dictionaries despite that[1] as "svart hull".
May I introduce you to the German language?
We have "gesundheitszeugnis" (health certificate) and "bärenstark" (strong as a bear), and of course "[der] Donaudampfschifffahrtsgesellschaftskapitän" ([the] Danube Steamship Navigation Company Captain) and "[Das] Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz" ([the] cattle marking and beef labeling supervision duties delegation law).
"Hackernewsleser" would be a word I just made up but every German can understand. A reader of Hackernews. Obviously this makes a dictionary tricky. And it has been a big problem for spell corrections in early MS Word Software.
“Honey, I’ve overheated the fondue! The problem is I can’t describe the liquid because English completely lacks any word that might be apposite in this situation other than the newly-minted ‘boiling water’.”
“It’s a problem. Maybe you could call it ‘boiling water that happens to be quite cheesy’. It’s not great, but it’s the best we can do.”
> Traditional dictionaries skip almost all such phrases, because they contain spaces.
Yes, because they're phrases, not words. I don't even understand what's surprising about this. Sure, the entire article talks about how dictionaries contain _some_ phrases; but it's clear it's not many of them. Dictionaries are for words, not phrases.
- Don't put your hand in water that's boiling,
- Add the pasta to water that's boiling,
- That saucepan is full of water that's boiling.
If "boiling water" were a distinct word, all of these sentences would change meaning compared to their idiomatic counterparts.
The word "boiling water" is not currently found in the dictionary because the meaning has not been considered widespread or significant enough to justify inclusion. The article is pondering what line exactly defines widespread or significant.
Are there idiomatic expressions for warm/cold/dirty water, which mean something other than a literal adjective describing the temperature or condition of water?
Depending on the context you got sewage, slush, runoff, murk, waste etc.
"Boiling water should be performed in a metal pot".
> It’s a hazard, a cooking stage, a state of matter
All of these are ancillary and depend on context, but in every one of these downstream cases the same underlying process is happening: the water is boiling.
Not necessarily. It might refer to heating water to bring it to a boil.
Q. What are you doing over there?
A. Oh, just boiling water.
As far as I'm aware, there is no separate word for freezing water -- i.e. water that is very cold and will, if it continues to get colder (and has something to crystallise around), turn into ice.
So the symmetry seems complete: ice -> freezing water -> water -> boiling water -> steam.
Frozen water represents a state change and that different state commonly gets its own word: ice/water/steam equates to solid/liquid/gas
Boiling/freezing water represents the state of the liquid, not the transition. It's descriptive. Water boils away into steam, or freezes into ice.
Should we consider luke-warm water also singular? What about body-temperature water? cool water? It makes sense not to treat adjectives/descriptive words combined with the subject as singular because the definition already exists in the root of the words (meaning of adjective word + meaning of subject word). Blue clay is another example, why would that be a singular?
It really only makes sense to me in the rare cases where the combined words represent something different from, or not obvious from, the combined meanings of the two words (e.g. to 'give up').
We have a lot of words for "frozen water" because it takes a lot of forms. As far as I know "boiling water" is only one thing so we've never needed additional words to distinguish it.
Ice cream is a shortened pronunciation.
The chef was out the back, boiling water.
The chef was out the back. Boiling water had spilled everywhere.
The seas had turned to boiling water.
I dunno, could be down to interpretation.
Which is why "state of matter" is, itself, often in the dictionary, possibly to the dismay of the Team Single Word in this comment section.
And some of the entries on this list are wrong. "Good night" exists in OED as "goodnight" [1] because there are multiple ways it's used. One is the clausal phrase "I hope you have a good night", which can be modified by changing the adjective, e.g. "great night" or "terrible night". "Goodnight" the bedtime ritual can't be modified the same way, so OED chooses to write it as a compound word without spaces.
Also the slider examples are abysmal. "I love you", "Go home" and "How are you" are not words by any stretch of imagination. For someone who makes word games, I don't see a particularly deep love of words here.
Edit: Obligatory reference to Borges's Tlön: https://en.wikipedia.org/wiki/Tl%C3%B6n,_Uqbar,_Orbis_Tertiu...
The problem with introducing phrase/sentences into a word game (let's take Scrabble) is that you'd spend half the night with your friends arguing over what is and is not acceptable with the only litmus test being its... corpus frequency?
You really have to love the human messiness of language!
I would hope that none of those examples were taking up space in a dictionary.
On languages other than English: in general, different languages do word division very differently. At least in German and Dutch, many of those phrasal verbs are separable, meaning that they are one word in the infinitive but are multiple words in the present tense. So for example, where in English you would say "I log in to the website", in Dutch it would be "Ik log in op de website". "Log in" is two words in both cases, but in Dutch it's the separated form of the single-word separable verb inloggen ("I must log in now" = "Ik moet nu inloggen"). The verb is indeed separable in that the two words often don't end up next to each other: "I log in quickly" = "Ik log snel in".
Dutch, like German, has lots of compounds. But there are also agglutinative languages, which have even more complex compound words, perhaps comprising a whole sentence in another language. Eg (from Wikipedia) Turkish "evlerinizdenmiş" = "(he/she/it) was (apparently/said to be) from your houses" or Plains Cree "paehtāwāēwesew" = "he is heard by higher powers"; and these aren't corner cases, that's how the language works.
Collocation dictionaries are lists of collocations. The reason they're absent from single-word dictionaries is that there are about 25x more collocations than single words.
Presumably if the word thesaurus was actually "synonym dictionary" it would likewise be absent.
Well I can't even.
(1) Who counted those? Whence those numbers?
(2) The examples are normal two-word phrases with one word modifying the other, often categorised as an adjective. The examples are counter-examples to the very claim made in that article.
(3) Using Clause to brainstorm s.t. is a weird thing to say...
(4) I would say the use of 'lexicalized' is wrong or at least uncommon. It usually refers to specialised semantics of something that could be interpreted generically, too. Like 'sleeping bag'. Or indeed 'cold feet'. Lexicalisation may involve deleting spaces, like 'hotdog'. And I am pretty sure lexicalised phrasal words are usually intensionally listed in dictionaries. And so 'ice' is not lexicalised 'frozen water', but it is not overtly a phrase but is a separate atomic word.
=> I don't get the point.
'N, culinary. A paste made of ground up nuts, sometimes with additional oils and other ingredients. E.g. "peanut butter", "almond butter".'
"Amusement park", same. Falls very much under the "place of recreation" definition of "park".
"Black hole" is maybe a bit different, because it's a scientific term - and certainly in a science dictionary would be included as a two-word item - but, for consistency, in a regular dictionary should be handled identically to the above, with a note on the word "hole".
While including noun phrases as singular entities in a word game is entirely appropriate, I don't think the OP has formed a rigorous definition of the concept that they are trying to describe. I agree with the other comment which suggests that they need some instruction / practice using a dictionary.
We act as if some languages have "compound words" that can encompass entire sentences (subject & object attaching to the verb as prefixes or suffixes) while others don't form compounds, and most are somewhere in between. But these are all statements about lexicographic conventions and say nothing about the languages. In reality all languages are muddles sprawling across a multidimensional continuum, and they abso-frigging-lutely don't sit neatly in such pigeonholes.
We could have a similar debate about whether common suffixes and prefixes should be regarded as individual words.
Much like "planets" don't really exist as a separate natural object, words don't really exist in natural languages. They are artificial concepts, and therefore we will always have edge cases.
I would argue that it is still a useful discussion, as it sheds light on the nature of language (or of celestial bodies), even if the definitions defy the same rigour as mathematical concepts.
The article is questioning why some words don't meet the bar for inclusion in the dictionary. The word "boiling water" is one such word that it sees as being on the fence. The comments here demonstrate exactly why it is on the fence, but it remains unclear exactly what would be necessary for it to tip towards inclusion.
Better yet, you can take advantage of English's adjective ordering to demonstrate this point. Would I describe the water I'm currently boiling for the purpose of cooking as "cooking boiling water" or "boiling cooking water"? Since purpose tends to be the last adjective we use, any native speaker would choose the latter.
I think maybe the word the author is looking for is 'phrase'
The confusion might be that this seems to be a spectrum rather than a binary phenomenon.
We have single words at one extreme, ordinary sentences at the other, and in the middle we have idiomatic assemblies of words that span a range of substitutability.
"Hot dog" and "Saturday night" are arguably great examples, because they exist at the opposite extremes of the spectrum. Saturday night can retain some of the original meaning following substitution, whereas hot dog almost deserves a hyphen.
You can argue that there's a connotative association with the phrase. Sure. Just like "beach weather", or "blizzard conditions". But that doesn't make "Saturday night" special in any way.
I suspect the entire list was produced by an AI entity which had not been prompted to avoid giving offense. I predict a range of (tedious) opinions about whether a prohibition on that particular word is an appropriate inclusion in a system prompt.
That's also not a term I've - thankfully! - ever heard, so I've no idea if it's hallucinated. This is not an invitation, HN, to define or explain it to me.
My favorite adjective he's coördinated is "burntwing", used to describe moths spiraling downwards after passing through candleflames. If I had crafted such a descriptive contraction, my former styling would've been "burnt-wing", had I even been capable of generating such concise imagery [1].
McCarthy's stylings have helped me to reduce hyphenations in my own writings — reducing their usage mainly to contractedwords which might be all-too-confusing without them.
[0] pg104 has ten words whose definitions I do not know, yet through context they work to advance the storyline of character racists (book is set in 1950s).
[1] decades ago, during college burnout, I was searching for the essence of "burntwing" — reduced to writing a professor about "feeling like a burning airplane in tailspin." My trajectory back then was definitely burntwing.
Wiktionary doesn't need to make that distinction between MWEs and Idioms and tends to conflate MWEs and Idioms as there is no separate "Wikidiom". Arguably, that multi-book confusion runs deep on the internet because Urban Dictionary should probably be fully titled the Urban Dictionary of Idioms and Slang.
It's not just page limits but also categorical limits and classic lexicographers would build multiple books/volumes, not just settle on one "dictionary". Classic scholars would often have a "reference shelf" with multiple dictionaries, books of idioms, thesauri, and more. The CD-ROM and then the internet has kind of tunnel visioned that this entire shelf can be merely "one app".
English: cream of mushroom soup
Spanish: sopa cremosa de champiñones
German: Champignoncremesuppe
It has some compound words. But including too many of them would quickly get out of hand
Seems like they would have just as much of a problem since the issue is delineating when a "phrase" becomes a "word"
Is there a distinction between words that get enumerated and compound nouns that do not?
It does seem, though, that German speakers might be more comfortable with the fuzziness that apparently exists at the edges of what the word "word" means.
Peter Norvig - The Unreasonable Effectiveness of Data
https://www.youtube.com/watch?v=yvDCzhbjYWs&t=1477s
Not to mention Tobias Fünke’s analyst + therapist web site, analrapist.com.
To me, any discussion of this topic that doesn't mention collocations signals an amateurish approach.
I also disagree with the premise that "this was not possible before LLM." That's nonsense. Linguists created many dictionaries of collocations for different languages, so that work is precisely what they did!
(Before any LLM zealots attack me, yes, it is now possible to have a more exhaustive list of collocations thanks to LLMs. This doesn't contradict my point.)
Examples of collocation dictionaries:
[1] I wonder how many here have ever been told something like "Prithee, husband, bring back a dozen canned goods from the market, for in the meanwhile I shall do my household chores".
"I used to smoke marijuana. But I’ll tell you something: I would only smoke it in the late evening. Oh, occasionally the early evening, but usually the late evening -- or the mid evening. Just the early evening, mid evening and late evening. Occasionally, early afternoon, early midafternoon, or perhaps the late-midafternoon. Oh, sometimes the early-mid-late-early morning... But never at dusk." -Steve Martin
Fiskemat Fiskmat
The latter means food made from fish, the former means food for fish. Standard varieties of Norwegian only use the former to mean both, to the annoyance of many old fishermen.
This maybe illustrates why the author's examples such as boiling water aren't so weird. Yes, in English it means water that's boiling, but you have to know that. It could for instance have meant water for boiling, like "cooling water" means water for cooling say in a nuclear reactor, not water which is in the process of getting cool.
Rooivleisslaghuisinspekteur =
Rooi = red
Vleis = meat
Slag = butcher
Huis = house
Inspekteur = inspector
"Inspector who controls the quality of red meat in butcheries"
In the Slavic languages, do they have a different way to describe boiling or freezing milk, or any other liquid?
I guess Saturday night could have some extra details explaining the context around our standard work week. But even that is a stretch.
Yeah, I agree! Fuck ICE!
----
As blacksheep of an intellectual family (lawyers, politicians, engineers), I've spent the majority of my employment around fellow bluecollars.
Despite my education (left medschool, decades ago) it scares my family when I speak in the colloquial jargon of my electrician co-workers. If I don't codeswitch back into the grammatically correct language of our upbringing, my brothers value what I have to say less ("what you said sounds dumb even though I understood you better").
Isn't the whole purpose of language to communicate the realities of World? As their brother, I think they mostly write to obfuscate intentions... I prefer the honesty of pure dumb.
Another contemporary writer who worked with new words in a very creative way was Gene Wolfe in The Book of the New Sun. Some were inventions using Greek, French, or Latin roots. Others were forgotten terms which he resurrected. Someone compiled a dictionary, Lexicon Urthus, which discusses the origins of certain terms and their placement within the series.
Absolutely. Similarly, I read the Tao Te Ching 4x annually, by reading the same single passage both before and after bed, daily. Both Laotzi's and McCarthy's density of construction is just soooooo human condition.
[Suttree book world] Harrington just found the eyeball in the junkyard vehicle — in a single paragraph humanity just oozes, including his toying with viscosity and shock, and re-toying again. Washes hands. The drunk boss having previously joked "yeah the driver only scraped his shinbones."
I am hooked. McCarthy's books jumped to the top of my bookqueue after reading a HN article a few months ago about his library/collections being catalogued, posthumously.
----
I've just read Dave Wallace's three major novels (Jest & King & Broom, ~2000 pages) and McCarthy is absolutely the better author, not requiring hundreds of footnotes to say less with more esoteric bullshit [0]. DFW just seems like a bully to me ("wow I'm so smart"—DFW, probably), and honestly his samizdat is about 800 pages too long (myself a former bored addict prodigy with poor family comms) [1].
Mostly I read DFW because he's my judge-brother's favorite fiction author — it felt like a challenging obligation/chore, much like our personal relationship. With both, I've felt mostly emptiness. For powerful shortform pieces, both are quite capable of emotional stirring (This is Water).
I laugh when I see this book on others' shelves, because they probably haven't read it and it isn't really worth the time to read [all of] it. A few simple questions of the "reader" verify this. My own bullying is that "I have" [snooty], however much I wish for all that reading time back. Bullies making bullies =D
----
By page 100 of Suttree you are hooked. By page 100 of Jest you are bored [2]. I've yet to read more than six pages of McCarthy in one day. For Wallace my eyes would constantly glaze over dozens of pages and just think: what happened here?! why did author include [all of] this!?
Although I am tired after reading either author for twenty minutes, McCarthy's writing doesn't feel like the author is just wasting my time.
----
My McCarthy readlist is structured so: Suttree (current); Blood Meridian; The Road — is this advisable?
----
[0] DFW's footnotes == even more of his esoteric bullshit
[1] If you do read Infinite Jest, absolutely use a study guide(s) (specifically Aaron Swartz's incredible breakdown... which can reduce the book to just a few hundred pages). If you've ever suffered an addiction (whether yourself or crazybestfriends's), you probably don't need to suffer through any longform DFWallace.
[2] I understand this is part of DFW's "style" : the frenzied passages of speed addicts, thirty pages into killing a dog (e.g.) when three pages would have done better, more respectful of reader's time (addict or not).
Some cases are basically impossible: with "crash blossoms" you don't stand any chance without knowing why we call them that.
Some are middling difficult, "Home Secretary" requires that you know every meaning for the two words and then you happen to pick the correct obscure meaning, a "Secretary" could be in charge, and "Home" could mean the entire country as distinct from everywhere else.
But "Hospital bills" doesn't seem even marginally difficult
But "ginger ale" seems straightforward to me. It's an ale, flavored with ginger. Not even idiomatic, just descriptive. Root beer. Grape soda. Orange chicken.
A team of people will compile a bill for all of those services. The bill will be presented to the insurance company whose card I showed Friday morning. It will likely be less than a million dollars, but it could easily be more than a hundred thousand dollars. That's the right order of magnitude to consider: a good percentage of a house, maybe a very large nice house.
The insurance company will claim that some of these charges are too much. The hospital knows this, and there are three mechanisms by which they justify their prices. First, although the two Tums antacids have a street value of eighteen cents when you buy them over the counter in quantity fifty, the hospital buys them in blister-packs so as to avoid cross-contamination until they reach the patient. Second, it is customary to pretend that only the services which a patient actually used can be charged for, so the in-house plumber, the gas plumbers, the cryogenic fluids specialists, the oxidizing gases technicians, the potable water testers, and the electricians among a cast of thousands all need to be paid for.
And third, there's emergency care for the uninsured.
The US is cruel, but not stupid. No, I lie, it is frequently both cruel and stupid, always to people already disadvantaged in some other way. As a matter of law, a hospital can't turn away or discharge a person who is likely to die without treatment, even if they can't pay. But the government doesn't provide money to pay for that.
Finally, most hospitals or hospital systems in the US are run by for-profit private companies. I won't mention organized crime in the same sentence, but one can reasonably presume that the two are interchangeable in terms of law-abidingness and willingness to trade down ethics for an increase in profits.
So, having created the bill and sent it to an insurance company, they will argue back and forth and finally some portion of the money will eventually be transferred and everyone will be more or less happy, right?
No. Because in the US, the standard for healthcare insurance is to avoid the moral hazard of people attempting to get too much healthcare by having the insurance company bill the patient.
Remember the bill that started out as the same scale as a house? 10% "coinsurance" is often considered generous. 20% is pretty normal. Some specific services will be called out with specific fees, and others may be "disallowed" -- and sent through entirely to the patient.
That's on top of the monthly payments that have already happened.
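To put rough numbers on that, here is a minimal sketch of the arithmetic. The $200,000 bill and the $5,000 disallowed charge are made-up figures for illustration, `patient_share` is a hypothetical helper, and the sketch ignores deductibles and out-of-pocket maximums, which a real plan may apply.

```python
def patient_share(bill, coinsurance_pct, disallowed=0):
    """Return the dollars passed on to the patient.

    bill: total billed amount in whole dollars (hypothetical figure)
    coinsurance_pct: patient's percentage of the covered charges, e.g. 20
    disallowed: part of the bill the insurer rejects and passes through in full
    """
    covered = bill - disallowed
    # Patient pays their percentage of covered charges plus all disallowed ones.
    return covered * coinsurance_pct // 100 + disallowed


if __name__ == "__main__":
    bill = 200_000  # assumed bill, "a good percentage of a house"
    print(patient_share(bill, 10))                    # "generous" 10% coinsurance
    print(patient_share(bill, 20))                    # "pretty normal" 20%
    print(patient_share(bill, 20, disallowed=5_000))  # with some charges disallowed
```

Even at the "generous" rate, the patient's share of a house-scale bill is car-scale.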
But I work for a tech company with an unusually enlightened attitude, so I expect that my family's fiscal impact from this bout of medical intervention will be limited to the parking fees that my wife paid when she came to visit me.
It's privilege, but I'd rather that the system be reformed so that everybody got it.
https://www.merriam-webster.com/dictionary/hot%20water
> warm water - n. an ocean or sea not in the arctic or antarctic regions
https://www.merriam-webster.com/dictionary/warm%20water
> cold water - n. depreciation of something as being ill-advised, unwarranted, or worthless. e.g. threw cold water on our hopes
https://www.merriam-webster.com/dictionary/cold%20water
Seems that what makes sense to be in dictionaries is already there.
Basically as it gets colder water exchanges energy with the environment and gets colder.
But once it reaches freezing temperature, it can no longer get colder and all the energy is used for the formation of crystals.
> But once it reaches freezing temperature, it can no longer get colder and all the energy is used for the formation of crystals
Water at freezing temperature can get much colder without freezing. https://en.wikipedia.org/wiki/Supercooling:
“Water normally freezes at 273.15 K (0.0 °C; 32 °F), but it can be "supercooled" at standard pressure down to its crystal homogeneous nucleation at almost 224.8 K (−48.3 °C; −55.0 °F).”
But the semantic point still stands. Boiling water is still water -- in the specific sense of H2O in its liquid state -- while ice is not. The complaint that frozen water has a single-word synonym while boiling water does not is making a false equivalence.
Simply put, "boiling water" is a word whenever someone uses it as a word. It is reasonable to say that it isn't commonly used as a word, but that's kind of the point of the article: Asking when a word becomes worthy of inclusion in the dictionary. The very similar "hot water" is a word that is found in the dictionary. Of course, it is a word used frequently, so the inclusion isn't surprising.
But it remains unclear where the line is between worthy of inclusion and not worthy of inclusion. The article is asking where that line is.
- A multi-word phrase is a phrase, not a word
- A lexeme is a basic unit of meaning in a language, like a word (and its forms [1]) or phrase.
- Every place I was able to find described a lexeme as a "word _OR_ phrase", making it clear those two are different things.
- Dictionaries, in general, focus on words. Many do include phrases also. This point is less definitive; and just my understanding from looking at dictionaries and how they describe themselves. That being said, every source I can find that discussed something close to the topic seems to support this.
[1] A word with all its forms, in that "walk", "walked", and "walks" are all a single lexeme (with each form being a distinct word) OR a phrase
Side note: I'm not looking to "correct" anyone; just pointing out what information I'm able to find on the topic. I'm open to being corrected, but that correction would need to include reasonable sources.
If you've read Suttree you could do either one next.
If you were coming new to McCarthy, I would start with Blood Meridian, as there's nothing else like it (The Road invites comparisons with other post-Apocalyptic fiction).
There are some attestations to it from 1732 onwards: https://archive.org/search?tab=fulltext&query=%22iced+cream%...
The attestations for ice cream (or often ice-cream, as these open compound words used to often be hyphenated -- the loss of that hyphen eventually leading to articles like this one) are much, much more and much messier, not least because someone tagged every edition of The Gentleman's Magazine as being published in 1731 -- the Internet Archive is a fantastic resource but I wish they'd allow crowd sourcing corrections for metadata. Excuse the m-dashes.
You may be right that it was mostly called ice cream at first and eventually at last. To be honest I took the Wiktionary etymology at its word.
In talking about the validity of the suggested compound word "boiling water", an example of exactly what the article is talking about arises: when exactly does a sequence of individual words (state, of, matter) become more than the sum of its parts?
A further question raised by your comment is does the existence of a compound word with a specific meaning then rule out use of the same words in a less specific manner? Perhaps for maximum clarity of expression, it's confusing, but is it wrong? It's an interesting point because if you didn't know the special meaning of the compound word "state of matter" then there is a word out there that is, completely unknown to you, invalidating your writing which would otherwise be correct both syntactically and semantically.
The general consensus among the HN crowd here seems to be quite vehement that "boiling water" has not reached the point where it "deserves" a dictionary entry. But there are words in many dictionaries like "cherry blossom" that I would say are little more deserving.
I wonder if the connotative association is exactly what we are trying to capture here though, and if those other phrases also fit in at the "separate words but slightly special" end of the spectrum.
There is meaning being communicated in all of those phrases that would be obvious to most or all people who are embedded in the language and culture where they are used, and which transcends the definitions of the individual words themselves.
It seems that there are several axes here -- how explicit is meaning, how atomic, how literal, how substitutable are the individual words -- and all vary continuously.
That might all seem needlessly pedantic for the question of "should it warrant a dictionary entry", but if you are trying to extract all information encoded in a verbal exchange, they might be useful concepts.
Or how about "Sunday morning"? It's evocative for sure. But very differently for different groups.
Or "island breeze". Stirs up images and feelings. But the definition is literal and the connotations are somewhat personal.
I'd argue that none of these phrases belong in a dictionary. Possibly explicitly because the "missing" meanings are the associative connotations, but those vary for different people, so what's the canonical definition?
It likely could apply to other liquids in the same mixed state, but would be assumed to refer to water (or solutions or colloidal mixtures primarily consisting of water) in common speech.
Water is extremely common, and has anomalously high heats of crystallization and vaporization, so it is the most common example of a mixed phase system and the only one most people encounter in everyday life.
qanik -- snow falling
aput -- snow on the ground
pukak -- crystalline powder snow (like salt)
aniuk -- snow used to make water
maujaq -- deep soft snow you sink into
piqsirpoq (verb) -- drifting snow / blowing snow
Central Alaskan Yup’ik
qanuk -- falling snow
aput -- snow on the ground
nevluk -- wet snow
aniu -- snow for drinking water
Of course is like an abbreviation of something like ‘in the natural course of things’. Which has become more like just ‘yes’ over time. In the usage of ‘yes’ it’s easier to argue it could be one word.
English spelling does NOT line up with pronounced English when it comes to what we in Swedish call "särskrivning", which is a word that roughly means "separateness writing".
Nevertheless, dictionaries (conventional ones, at least) concern themselves with written rather than spoken English, so I think my point stands. :-)
How do you suppose we determine acceptability now?
Absolutely not all - there's a near unbounded set of possible compounds.
In Norwegian, we in fact have a compound for the incorrect separation of compounds: "orddelingsfeil" (word separation error). Actually, we have two - technically it's "særskrivingsfeil" (separate writing error), but "orddelingsfeil" is more common... We take this seriously.
The problem is that while some are definitely wrong, others change meaning.
E.g. "en norsk lærer" means "a Norwegian teacher" but "en norsklærer" means "a teacher of the subject Norwegian". There's an infinite set of possible -lærer compounds: If you create a new subject then a teacher of that subject is a <subject>lærer. Obviously they can't all go in the dictionary.
Some other examples:
"Røyk fritt" means "smoke freely" while "røykfritt" means "smokefree". "Steke ovn", means "to fry an oven", while "stekeovn" means "oven". These two belong in the dictionary because they are so common and that though technically you can use "ovn" and "fri"/"fritt" to form a near infinite number of other common forms as well, in practice the number of common forms that use them is quite limited.
The key part is that most compounds in languages like German or Norwegian will only have one valid way of writing them. Add spaces, and you usually end up with something ungrammatical or with an entirely different meaning.
Whereas in English whether or not a word can be written with a space, with a hyphen, or combined much more often changes over time, and can differ in different places at different times, as the <separate words> -> <hyphenated> -> <compound> pipeline in English is slow and arbitrary and not necessarily reflecting a change in meaning.
Meanwhile, a Bahnhof would be a "Yard/square of lanes" if one didn't get taught that it's "train station". Although I suppose anyone learning German will quickly learn that "Bahn" is something to do with trains. Unless it's Autobahn. Or Schwimmbahn.
Paraphrasing a similar remark I think I pulled from "sed & awk" [1]: a reference can teach you the rules, but it doesn’t show you how to really use them. It's the difference between reading the rules of a sport and actually playing the game.
Tangent: I’m beginning to question how broad the line is between a “rule breaker” and an acute student of tradition at odds with a sort of institutionalized inertia. Maybe this “Words with Spaces” guy is on to something.
> Isn't the whole purpose of language to communicate the realities of World? As their brother, I think they mostly write to obfuscate intentions... I prefer the honesty of pure dumb.
This may speak to the significance of the court jesters of the past. And perhaps the rise of virtue signaling today?
[1]: https://www.oreilly.com/library/view/sed-awk/1565922255/
Copspeak for technically you're correct, but we're still going to fuck with you.
[0] A cop actually said this to me (I asked him whether he was violating a 3rd-party's Fourth Amendment by questioning); handcuffed, he tucked a Miranda Card into my buttondown's shirtpocket, tapping condescendingly about my questioning his authority. And what a ride it was.
Maybe you don't have "hospital bills". I don't have "landscaping bills", but I know exactly what they are.
As you've pointed out, the word "bills" clarifies what it is. I don't see why every combination needs to be in a dictionary. The list would be incredibly long, e.g. "phone bills" or "power bills", etc.
(In the printed versions, you might need to go to the Universalwörterbuch or so to find the English entry, it might not be in the normal "Die deutsche Rechtschreibung"; I have not checked.)
Since 2004 the official guidelines for the German-speaking countries (Germany, Austria, Switzerland, Belgium, South Tyrol, Liechtenstein, Romania, Hungary - see this founding document with the list: https://www.rechtschreibrat.com/DOX/wiener_erklaerung.pdf) are covered by the Rechtschreibrat (https://www.rechtschreibrat.com/).
The official German dictionary is here: https://grammis.ids-mannheim.de/rechtschreibung/6774
Also, from what I can tell using the site, it does not serve as a full dictionary. Rather, it lists the general rules of German orthography (as decided by the Rechtschreibrat) and has some limited tables of special words.
Just the name gives me flashbacks to German lessons in high school.
Ginger ale is, in fact, not an ale; it's a soft drink. It is distantly related to ginger beer, and some variants of ginger beer are alcoholic like ales, but ginger ale was conceived as a soft drink and continues as one today.
Mass-market ginger ales and root beers are not made that way today, of course.
Dictionaries are also language specific. We don't necessarily expect a 1:1 mapping of words between languages. I have personally always wondered if this subtly shapes thoughts in different languages as well.
I.e. AFAICT, all compound words that defy literal interpretation are idioms. And it's that simple.
The argument then becomes that idioms should be in the dictionary. Some of them are of course, but idioms and slang are a) fast-moving, and b) often dismissed by the sorts of people who edit dictionaries.
At the same time, I am having intuitive issues seeing "hot dog" as an idiom, vs just an ordinary noun. It certainly seems to follow noun rules, and fit into speech as one.
I don't know for sure that it's NOT an idiom though. I could just be wrong here, and have intuition in need of calibration.
While these are not separate states of matter, they ARE special thermodynamic systems, with the particular property that they tend to remain exactly at the phase transition temperature while heat is added or removed from the system.
This is a somewhat esoteric technical distinction, but it has practical everyday consequences. It's why boiling food works so consistently as a universal cooking option.
You don't need to control the temperature of boiling water: it is an exact temperature that depends only on ambient pressure. As a consequence, recipes work by specifying only time, sometimes with a single adjustment for people at higher altitudes.
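To put rough numbers on that pressure dependence, here is a minimal Python sketch. It is not from the thread: it assumes the Antoine equation with a commonly tabulated coefficient set for water (pressure in mmHg, temperature in °C, roughly valid between 1 °C and 100 °C), and the function name and the 600 mmHg "mountain kitchen" pressure are made up for illustration.

```python
import math

# Antoine coefficients for water (assumed, commonly tabulated values):
# log10(P_mmHg) = A - B / (C + T_celsius), roughly valid for 1-100 °C.
A, B, C = 8.07131, 1730.63, 233.426


def boiling_point_c(pressure_mmhg: float) -> float:
    """Temperature (°C) at which water's vapor pressure equals the ambient pressure."""
    if pressure_mmhg <= 0:
        raise ValueError("pressure must be positive")
    # Solve the Antoine equation for T at the given pressure.
    return B / (A - math.log10(pressure_mmhg)) - C


if __name__ == "__main__":
    # 760 mmHg is standard sea-level pressure; 600 mmHg is a hypothetical
    # lower-pressure kitchen at altitude.
    for label, pressure in [("sea level", 760.0), ("mountain kitchen", 600.0)]:
        print(f"{label}: {boiling_point_c(pressure):.1f} °C")
```

The pot size and heat source never appear in the calculation; only the ambient pressure does, which is the point being made about why time-only recipes work.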
This is remarkable given the wide variety of containers and heat sources used, and it is used practically by virtually every cooking tradition, even if its reason for working is not common knowledge.
It shouldn't be surprising it'd acquire a single word as a unified concept.
edit: In those other languages, is it like how we use "ice", where water is the default but it could mean any frozen liquid?
I would agree that "boiling milk" and "boiling oil" are very unlikely to get separate words, unless one of them happens to be an extremely common thing that people encounter a lot and that has special practical implications.
Milk might be a special case, in that it essentially is just water with some other stuff dissolved. It is to water as salt water is to water... but more so.
My guess would be that the single word might get pressed into service like "ice" does, but I think we'd have to find languages that include this word and survey native speakers. It could vary.
Nearly everyone encounters boiling water in everyday life, but do most people ever see other liquids boiling, even once, and especially during the historical periods that shaped our current languages? If not we might be getting into something like technical language, where daily life lines up poorly and terms and jargon get formalized.
So English arguably has three unique words for the three common states of H2O.
Asking for the "reasons" behind a certain word existing is sort of like asking why the human body looks the way it does. Sure, scientists may have good theories why it was evolutionarily advantageous to have five fingers and no tail, but in the end the only answer that's for certain is, "because it evolved that way". So the answer is, "we" have a word for boiling water because people found it useful to have such a word.
When those technical distinctions are important we use specific technical terms for them (of which there are a few different ones for the phase transition - depending on discipline).
The cooking term is "rolling boil" which is a nice two word combo with a specific meaning.
I don't know how it is in other languages, but in English "boiled water" and "boiling water" refer to different things: boiled water may be steam or water that has undergone some boiling, e.g. for sanitation, whereas "boiling water" refers strictly to water that is in the process of boiling.
I can see why some languages may have a separate word for one of these concepts to avoid some of the ambiguity.
I'm not a fan of extending the language with new words unless they are compound (with or without spaces) but extending the dictionaries with more and better descriptions is a no-brainer, there's a lot missing from them.
It depends on the tea, but some cannot be well made with a metal pot of water that's taken a few minutes to get from the kettle to the table.
The general rule of thumb is that black tea (i.e. fermented tea leaves) should be brewed at 100°C, green tea (non-fermented tea leaves) should be brewed around 80°C to avoid it being bitter and white tea (young, non-fermented tea leaves) is best at around 70°C.
Boiled water does have the extra connotation that it is presumed to be mostly sterile, which, while not hard to derive from the fact it has been boiling, is not immediately clear. After all the past tense does not tell us how recently it was boiled.
For that reason I'd argue that if one of boiling water and boiled water should be in the dictionary, it should be boiled water. Of the two, it is the term that potentially carries extra information.
If you're asking for isvann at a restaurant, you'd expect to get water with ice, not just very cold water.
But if you're talking about having gone bathing in isvann one spring, it specifically means in water that - whether or not there is actually ice in it - is cold enough that it might have recently melted.
(I'm a native speaker, but had to look up the precise nuance there to be sure I wasn't just making stuff up)
> Lexicographers used a substitutability test: if you can swap synonyms freely, it’s not a lexical unit. “Cold feet” (meaning fear) can’t become “frigid feet”—so it gets an entry. But the test cuts both ways. You can say “boiling water” but not “seething water” or “raging water.” The phrase resists substitution too.
These aren't failures of substitution, because "raging" isn't a synonym in this case, whereas "frigid" would be a reasonable one.
I wonder if the author is perhaps confusing it with the idiom "hot water", which is in there (https://en.wiktionary.org/wiki/hot_water) and would fail the substitution test.
There are a few things for which English simply doesn't have anything to substitute, and those are harder to assess. "Boiling" is one, but so is "blood" in "blood pressure": replacing it with another liquid gives basically the same kind of meaning (e.g. water pressure, oil pressure), but as far as I can tell there's literally no synonym for blood.
In those cases I try to use a stand-in from another language to see if the substitution works. For example, "sangre" in Spanish gives "sangre pressure", which doesn't seem to affect its meaning much, so I'd argue for its exclusion.
Conversely, "red tape" cannot be "roja tape", and a "caliente dog" is one trapped in a car, not a food.
Every word in a thesaurus belongs in a dictionary.
It's a term referring to a small set of types of sausages served in a specific small set of ways. In some places, a hot dog can be used as a synonym for the predominant type of sausage most common in hot dogs in that place, but the term is still more commonly referring to the assembly of a wiener or frankfurter wrapped in a bread of some sort.
I had that disagreement in an alpine resort once. A seller was vending some sort of sausage stuffed in a bread. I was hungry, so I walked up to them with money in hand and said "A hot dog please" while pointing at the only thing they were selling. The lady was mortified by my utterance, and was not willing to accept the money until I agreed with her that it is a bratwurst and not a hot dog. :D The disagreement felt a bit academic, but given that she was holding the hot dogs hostage and money does not taste that good, she won the argument.
So it was an idiom, now it's canon.
Another good one might be "hot dish", which has an idiomatic meaning in the midwestern US, and is slowly spreading. Not sure if it's made it to the dictionary yet. (which dictionary becomes an important question -- I'd expect to see it in M-W before, say, OED)
But, yeah, some places "hot dog" also carries a connotation of potentially using lower quality sausages, so I can also totally see a bratwurst vendor taking offense...