By analogy with words you already know. In more detail, you guess mainly by recognizing morphemes, while taking into account the three main spelling systems that exist within English and the common phonetic pressures that alter pronunciations. Educated guessing cannot be reduced to rules, of course, but I can give you a feel for it with a scattering of examples.
Morphemes
Morphemes are the smallest elements of a word that have a meaning. For example, -ing and -tion are morphemes, as in going and information. If you know a lot of morphemes, you will be good at guessing pronunciation, because morphemes are usually spelled in only one or two ways, and their pronunciation in one word usually works by analogy with their pronunciation in other words.
For example, if you've seen -tion carrying that meaning in a number of words, you intelligently guess that it's pronounced the same way in a word you've never seen before, such as, say, transubstantiation. By analogy with relation, generation, acceleration, and many similar words, even transmission, which has the other spelling of the morpheme and is preceded by a different vowel, you can reasonably guess that conflagration has the stress on the second a. And you can intelligently guess that informational has the stress on -mā-.
This doesn't give you the correct pronunciation in every single case—equation is slightly different—but this is fundamentally how English spelling and pronunciation work: by analogy with related words, where having the same morpheme is the most important kind of relation.
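To make "guessing by analogy" concrete, here is a minimal sketch in Python. It assumes a tiny, hand-made lexicon (KNOWN_STRESS, hypothetical) that records which syllable is stressed, counted from the end of the word, and it treats "the known word with the longest shared ending" as the closest analogy. Both the lexicon and the function names are illustrative assumptions; real readers draw on far richer analogies than a shared spelling suffix.

```python
from typing import Optional, Tuple

# Hypothetical mini-lexicon: word -> position of the stressed syllable,
# counted from the end of the word (2 = second-to-last syllable).
KNOWN_STRESS = {
    "relation": 2,
    "generation": 2,
    "transmission": 2,
    "going": 2,
    "visible": 3,
}


def shared_ending(a: str, b: str) -> int:
    """Return the length of the longest common spelling suffix of a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def guess_stress(word: str, lexicon=KNOWN_STRESS) -> Optional[Tuple[str, int]]:
    """Guess stress by analogy with the known word sharing the longest ending.

    Returns (analogous_word, stressed_syllable_from_end), or None when no
    known word shares any ending with the input.
    """
    best = max(lexicon, key=lambda known: shared_ending(word, known))
    if shared_ending(word, best) == 0:
        return None  # no precedent to lean on
    return best, lexicon[best]


print(guess_stress("conflagration"))
```

With this lexicon, conflagration is matched to generation (they share the ending "ration"), so the guess is stress on the second-to-last syllable. A word like equation would get the same stress guess, which is right, but the sketch knows nothing about its slightly different consonant sound; that is the kind of precedent you only learn by meeting the word.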
Etymological spelling
English actually has three main spelling systems, corresponding to its three main root languages: Anglo-Saxon, Latin, and Greek.
Anglo-Saxon
In an Anglo-Saxon word, a vowel needs to be followed either by two consonants or by a single consonant at the end of the word in order to be short. For example, compare batted with bated, and chipped with griped. You know that mob is /mŏb/, not /mōb/, because of this pattern. In Anglo-Saxon words, it's as if every vowel wants to be long, but gets cut short by a terminal consonant. In Anglo-Saxon words, ch is usually pronounced /tʃ/, the "native" pronunciation.
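The Anglo-Saxon pattern above can be sketched as a rough heuristic in Python. This is only an illustration under simplifying assumptions: it looks at the first vowel letter only, treats a, e, i, o, u as the vowels, and applies solely to words you already believe follow the Anglo-Saxon spelling style. The function name is hypothetical.

```python
VOWELS = set("aeiou")


def first_vowel_is_short(word: str) -> bool:
    """Guess whether the first vowel of an Anglo-Saxon-style word is short.

    Short if the vowel is followed by two consonants, or by a single
    consonant that ends the word; otherwise guessed to be long.
    """
    word = word.lower()
    # Locate the first vowel letter; a word without one gets no guess of "short".
    index = next((k for k, ch in enumerate(word) if ch in VOWELS), None)
    if index is None:
        return False
    rest = word[index + 1:]
    # A single consonant at the end of the word cuts the vowel short.
    if len(rest) == 1 and rest not in VOWELS:
        return True
    # Two consonants in a row after the vowel also keep it short.
    return len(rest) >= 2 and rest[0] not in VOWELS and rest[1] not in VOWELS


for w in ["batted", "bated", "chipped", "griped", "mob"]:
    print(w, "short" if first_vowel_is_short(w) else "long")
```

The heuristic gets the examples in this section right, but it is a precedent-summarizer, not a law: applied to a Latin-derived word it will often guess wrong.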
Latin
Not so in words from Latin. Risible is /ˈrĭz.ə.bl/, not /ˈrīz.ə.bl/, and there's really no way to guess that, except perhaps by a rather superficial analogy with divisible and visible. Words from Latin usually follow pretty regular spelling, but in many cases you can't tell whether a morpheme's vowel will be long or short from its spelling—though often you can make a good guess by analogy with other words.
The soft pronunciations of c and g before i and e began in Latin, and were only introduced into English with the Norman invasion. So, Latin-derived words are normally pronounced that way, and most of the exceptions come from Anglo-Saxon, such as get, girl, and together. Anglo-Saxon words like bagged and bagging use the doubled g to indicate that the preceding vowel is short, so the g stays hard. Exaggerate and suggest come from Latin, so you know not to expect their soft g to make a good precedent for Anglo-Saxon words.
Greek
Words from Greek have a distinct spelling system, where English letters correspond to Greek letters: ch stands for the Greek letter χ; ph stands for the Greek letter φ; k stands for the Greek letter κ; th stands for θ; etc. The Greek sounds of these letters are approximated by English phonemes, as in chrysanthemum, metaphor, photon, system, synchronize, autarky, etc. Notice that ch in Greek-derived words is pronounced /k/. Greek consonant clusters that don't occur in English get reduced, as in psychology, bdellium, and pneumonia (but not sphere: we allowed /sf-/ to start a syllable in order to absorb that word). Eu- in Greek words is a morpheme meaning "good", and is pronounced /ju-/, as in eulogy and euphemism. Leonhard Euler is German, so you learn not to make an analogy between his name and Greek words.
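As a small illustration of the Greek spelling system, the sketch below collects pronunciation hints for a word you suspect is Greek-derived. The two tables are assumptions built from the examples in this section (ch as /k/, ph as /f/, and the reduced initial clusters ps-, pn-, bd-); the function name is hypothetical, and the sketch makes no attempt to decide whether a word really is Greek.

```python
# Initial clusters that English reduces, as in psychology, pneumonia, bdellium.
INITIAL_REDUCTIONS = {"ps": "s", "pn": "n", "bd": "d"}

# Digraphs standing for single Greek letters, with their usual English sound.
DIGRAPH_SOUNDS = {"ch": "k", "ph": "f"}


def greek_spelling_hints(word: str) -> list:
    """Return (spelling, sound) hints for a word assumed to be Greek-derived."""
    word = word.lower()
    hints = []
    start = word[:2]
    if start in INITIAL_REDUCTIONS:
        hints.append((start, INITIAL_REDUCTIONS[start]))
    for digraph, sound in DIGRAPH_SOUNDS.items():
        if digraph in word:
            hints.append((digraph, sound))
    return hints


for w in ["psychology", "metaphor", "pneumonia", "sphere"]:
    print(w, greek_spelling_hints(w))
```

Note that sphere only gets the ph hint: its initial /sf-/ is kept, so no reduction applies. Run on an Anglo-Saxon word like church, the same table would wrongly suggest /k/, which is exactly why recognizing the word's origin comes first.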
The main difficulties with Greek-derived words involve the letters g and y, especially when they occur together. In Greek-derived words, g stands for the Greek letter γ, which we would normally represent in speech with a hard g as in gynecology and gigabyte, but phonetic pressures and misunderstandings have usually set precedents that make it soft, as in gymnasium and gyroscope. It goes silent when it's part of a reduced, un-English consonant cluster, as in gneiss. In the middle of Greek-derived words, y stands for υ, but at the end, it stands for -εια, which means the same as Anglo-Saxon -ness. The Greek vowel υ doesn't correspond well to any English vowel. Usually it's pronounced as a short /ĭ/, but English phonetic pressures sometimes shape it into a long /ī/, as in hypothesis, psychology, and papyrus. The -gy ending is a morpheme, pronounced /-dʒē/ (unstressed), found in energy, synergy, biology and even analogy, the main concept that explains how English works!
Can you really guess the origin of an unknown word?
You might wonder, "How am I supposed to guess which language a word came from?" It's surprisingly easy. Anglo-Saxon, Latin, and Greek words tend to have a very different "feel" to them, and the spellings of the words tend to follow patterns that are fairly easy to recognize. Noticing the different styles of words from each main root language is an important part of mastering English, including its huge vocabulary of synonyms. There are, of course, a few words that picked up spellings by "false etymology", such as ache and ptarmigan, but happily, it doesn't matter, because if you guess these words' pronunciations by the usual analogies, you'll be right. These spellings were established by people trying to follow the same analogies.
By the way, roughly 20–30 years ago, many schools stopped teaching children about classical roots and how they're spelled. Consequently, a lot of natives today don't know much about it consciously. So, with this clue, you can learn to do better than a lot of natives! Some kids are still being taught, though, as evidenced by the questions asked in this contest of guessing the spelling of a word (which is harder than guessing the pronunciation from the spelling). Here's an excellent article by an expert on dyslexia, about the true complexity of English spelling and what's involved in learning it.
Conflicting precedents
Precedents can conflict. For example, you can make a good educated guess about impious by noticing its three morphemes: im-, meaning negation; pi-, meaning reverence or dutifulness, by analogy with pious and piety; and -ous, making an adjective in a Latin style. So, a pretty good guess is to pronounce it /ĭm.ˈpī.əs/. However, impious can also be understood as following the precedent of the Latin word impius. The phonetic pressures in Latin put the stress on the first syllable, and we usually feel the same pressures in English words that come from Latin. In fact, the English word impious, pronounced /ˈĭm.pē.əs/, was borrowed directly from Latin in the 16th century. That pronunciation has been echoed ever since. The -ous spelling in impious is a slightly stretched analogy with Latin borrowings that follow a different pattern. Consequently, both pronunciations are in use today, /ˈĭm.pē.əs/ being more standard. (I prefer /ĭm.ˈpī.əs/, since it communicates the morphemes more clearly. Someone who's never heard it before will easily understand the meaning. /ˈĭmp.ē.əs/ misleadingly echoes imp.)
Pressures
I knew a native speaker who mispronounced fruition as /ˈfru.ʃən/. He was going by analogy with fruit. But /ˈfru.ʃən/ is a somewhat uneducated guess. The correct pronunciation, /fru.ˈĭʃ.ən/, better reflects both the etymology and the pressure from -tion to put stress on the preceding vowel. The spelling of fruit reflects the fact that it can break up when used as a morpheme in a larger word. And we see the same pattern established in circuit/circuitous and other -uit- words.
There is another factor at work here, too, which helps a lot for guessing pronunciations. Experience teaches you that common pressures will likely have removed the second syllable from /ˈfru.ĭt/, but these pressures don't occur in fruition. In English, there's a pressure to merge those last two syllables at the end of a word, as in biscuit, pursuit and circuit. Explaining phonetic pressures is a complex subject, but it's like everything else in English: it works by analogies and precedents, and you get a feel for it from experience. What's relevant here is that the unstressed short vowel has a hard time holding its own against the preceding long, stressed vowel without help from another consonant; the terminal -t offers only very weak support.
The word intuit is the only word of this pattern that has kept that unstressed syllable. I think people continue to pronounce it because the words intuitive and intuition are much more common, so when people use the unusual verb intuit, they feel some pressure to echo the corresponding syllable in the familiar words in order to be understood. I have heard people attempt /ĭn.ˈtut/, independently inventing the word during conversation, and then draw back from it because it sounds weird and doesn't clearly indicate the connection with intuition. There is also pressure to avoid echoing the slightly silly verb "toot".
Since pronunciation is shaped by interaction between many pressures, if you pay attention to those pressures, you are in a good position to guess the pronunciations of unknown words.
Analogies, not rules
Some people harp on the fact that English spelling doesn't follow rules, as if this were proof that English is "illogical" or completely without order. Really, though, the English language works like English common law: by precedents, not rules. When deciding new cases, or when inventing new words, people try to be consistent with previous cases or with the spellings of words already in the language. Old cases and familiar words can't pre-decide every aspect of a new case, of course. Law and language accumulate over the centuries, one case and one word at a time, embodying subtlety and hopefully wisdom beyond what could be captured by any fully articulated system of rules.
As you get to know the language better, you can often guess pronunciation from spelling pretty accurately, especially with "big" words. When I was in the 2nd grade (age 7), I was given a test where I had to guess the pronunciations of a long list of words, up to "college level". I got nearly all of them right, despite not knowing, at that age, anything about etymological spelling—and despite not having encountered most of the words before. I could usually see some sort of analogy with a word I already knew, and I made an educated guess. That's how it works.
The moral of all this is that you should not memorize rules. You should not even memorize the few rules that happen to be true. Memorizing rules runs against the spirit of the language. What you should do is make analogies from specific words to specific words that you already know how to pronounce. Occasionally you'll guess wrong, and that will teach you a new precedent, which you will apply in new situations. Eventually, gradually, you learn to perceive English spellings with acuity. (You can probably guess that it's /ə.ˈkju.ĭ.tē/, despite the spelling's ambiguity, because of its congruity with annuity and gratuity.)
The joys of a non-phonemic language – laureapresa – 2015-06-12T08:08:52.867
Yes, 'joy' for the natives, 'nightmares' for a sincere student like me :P @writingthesis – Maulik V – 2015-06-12T08:09:35.123
Well I am not a native, but non-phonemic languages amuse me :) – laureapresa – 2015-06-12T08:13:31.537
I just go with the flow. As a young lad I was met with hoots of laughter as I talked about a "mouse tache" (pron: /tayche/), a word I was reading out loud from a book. Naturally, I realised a split second too late that it was "moustache" (/mə-stăsh/). Don't even get me started on reading Lenin (Le-NIN and not LEN-in)... – JMB – 2015-06-12T08:17:47.147
Even for native speakers, the relation between spelling and pronunciation is sometimes baffling, so don't worry :) I once saw an internet discussion, after the introduction of the Segway transporter, where it appeared that many native speakers had never made the link between the name of that thing and the (obscure?) word segue. Many professed to knowing the word segue and even using it in writing, but thought of that word as sounding something like seeg. I have also heard native speakers talk about an "ehpytohm" when they meant an epitome. – oerkelens – 2015-06-12T09:06:18.247
Some children were taught phonics; there are indeed spelling rules that cover many situations. Some linguists who should know better like to claim that there is no relationship between English spelling and English pronunciation. – Brian Hitchcock – 2015-06-12T10:59:46.670
Take a look at some adult literacy programs. These are for native speakers who failed to learn to read as children. You will find that they all use phonics to some degree, to teach the sounds of common letter combinations. Some of these illiterate adults went to schools that practiced the absurd theory that every word should be learned "by sight" (i.e., by rote memorization). There are too many words in the language to do it that way! – Brian Hitchcock – 2015-06-12T11:06:27.677
@JMB - My favourite personal example (as a native English speaker): I used to think there were two rivers in London - the Thames (pronounced with a th like in "this", and ames like in "fames") and the Tems (which I never realised I'd never seen written down). I was probably about 10 years old before I found out! – AndyT – 2015-06-12T11:06:28.843
@MaulikV - not a full answer so I'll make this a comment. One useful rule is that if there is a double consonant in the middle of a word, then the vowel before is a "short" sound; if there is a single consonant then the vowel before is a "long" sound. Compare "hopping" to "hoping". At the end of a word, the letter e lengthens the vowel sound preceding it. Compare "man" with "mane". – AndyT – 2015-06-12T11:09:45.453
Most native speakers pick up these rules instinctively and don't actually consciously know them. One of the ways to test for dyslexia is to give a native speaker a made-up word and see if they pronounce it "right" - dyslexics (often) don't instinctively learn the rules and have to be taught them. I'm not dyslexic, and the only reason I consciously know the double consonant rule is that my dyslexic friend told me after he'd been taught it! – AndyT – 2015-06-12T11:11:43.570
One final thing to say - "rules" of English pronunciation only ever work about two thirds of the time. This is because our words come from a mix of historic languages, each with their own (different) pronunciation rules. – AndyT – 2015-06-12T11:13:54.050
Btw your last rule isn't infallible. "iamb" is pronounced either with or without a final "b". – Steve Jessop – 2015-06-12T13:32:00.067
@MaulikV Even well educated native speakers mispronounce unfamiliar words all of the time. Epitome, hearth, facade, and soooo many more. If they are lucky, a friend or coworker will politely and discreetly inform them of the mispronunciation. From a nonnative speaker, we would expect more mispronunciations of these words, and again, hopefully you would be discreetly informed of the mispronunciation. It's a wretched system of doing things, but we're too lazy to look unfamiliar words up in the dictionary for their pronunciations. – Jason Patterson – 2015-06-12T13:50:55.077
It takes a lifetime. You start by getting it right 50% of the time. Then 90% of the time. Then 95%, 99%, 99.5%, 99.9%, etc. You never get to 100%! – CJ Dennis – 2015-06-12T13:52:20.453
For years and years I used the verb misle (pronounced my-zul) - meaning to intentionally give someone false information. (I don't like that guy - he is always misling people. He's a misler. He tried to misle me!) I used it in speech (it is a fun word to say) and I used it in formal writing, and no one ever asked me what it meant. I never looked it up, but I saw it in books all the time - usually in past tense. Then, when I was in my twenties, I realized how similar the past tense of misle is to the past tense of mislead: Misled. I had been misling/misleading myself. – Adam – 2015-06-12T17:49:45.413
English isn't easy. We've taken many parts of various languages and thrown them together into a big jumbly mess. Often, pronunciation comes from looking for patterns of spelling to see which way to go. You see 'ois', French; 'ph', probably Greek/Latin; 'ei', probably a Latin root word. Charade is a word, though, that pretty much everyone misses the first time. It doesn't have a French look so it is very easy to 'ch' the beginning. What a fun time. – Michael Dorgan – 2015-06-12T17:53:06.853
@Steve Jessop: Indeed, I think that last 'mb' rule is just plain wrong. Certainly I pronounce the 'b' in all those words. It's not strongly voiced, as it would be if it were at the start of the word, but 'bom' does not sound exactly the same as 'bomb'. – jamesqf – 2015-06-12T18:10:10.033
bourgeois (pronounced "booj wah") is a fun one. No native speaker would guess that correctly. – BlueRaja - Danny Pflughoeft – 2015-06-12T20:25:07.110
Lol - "berg-juice" is what comes to mind at first glance before I recognized the word :) – Michael Dorgan – 2015-06-12T20:56:27.147
@BlueRaja - Danny Pflughoeft: But would a native speaker of French? And just be grateful that you don't have to deal with Gaelic spelling :-) – jamesqf – 2015-06-12T20:57:04.933
Please don't closevote this question. It's one of the best ever posted on ELL! The other question asks, essentially, "how do you know the pronunciation from reading?" This one asks "how do you guess?" That's a very interesting and insightful way to frame the question, and it's led to much more informative answers. The other question mostly got answers in terms of abstract rules; this one's gotten several in terms of analogies with already-known words. – Ben Kovitz – 2015-06-12T21:53:15.530
@oerkelens I read segue as seeg; when the Segway came out I learned better, but I still feel "seeg between topics" is more correct, even though I know it isn't. @Adam hey, I used to read misled as mizzled and didn't connect it to the spoken sound miss-led for a long time. Short answer: native speakers guess and get it wrong all the time, but we have a much larger background knowledge of "words we've heard before" to work with. – TessellatingHeckler – 2015-06-12T21:54:12.170
@BenKovitz I agree that both questions got many good answers, but I still feel that they're both answers to the same question, not two different questions. How can we best keep them "linked", then, so that interested people know to read both? – Dan Getz – 2015-06-13T03:26:12.620
@DanGetz I think the surest way to keep this question from getting deleted is to keep it open. Please allow me another go at persuading you that the two questions are different: Notice that the most accurate answer to the other question is probably this, which simply refutes the premise of the question. But that answer would not make sense in response to this question. – Ben Kovitz – 2015-06-13T04:57:38.363
@BenKovitz rereading some more, I agree that the answers here are better answers for this question than the ones there. – Dan Getz – 2015-06-13T05:09:22.010
For what it's worth I always read "albeit" as "all-byte", the same way "height" is pronounced "hyte"... even though it's "all-be-it". Same thing with "freight", which I always imagine as being pronounced as "fright". And if I had to say it in speech (which I don't normally), I would probably say it wrong... even though I know that's not the correct pronunciation. So no, don't expect to get everything right, and don't expect that everyone does either... – user541686 – 2015-06-13T09:48:35.197
Don't forget the simply illogical choir. Unless you hear it being said while a person is reading the word aloud, I'd say nobody would know how this word is pronounced. – Mari-Lou A – 2015-06-13T21:00:27.487
No mention of "silent" letters as in debt, interesting, chocolate, and eczema – Mari-Lou A – 2015-06-13T21:08:41.633
@Mari-LouA interesting and chocolate only have silent letters for some speakers. AmE speakers often pronounce every vowel in those words. And I don't think eczema has any silent letters, but the second e is very lightly pronounced. – Aaron Brown – 2015-06-14T09:06:01.817
While the English language is strange, it can be understood through thorough thought, though. – Mutantoe – 2015-06-14T09:23:01.497
@BlueRaja-DannyPflughoeft I'd have upvoted it 1000 times for that comment of yours. ELL did not allow me! :) I'll never forget that word now. Thanks a TON. – Maulik V – 2015-06-15T05:04:56.940
The main advantage native speakers have is that if enough of them get it wrong in the same way, that makes them right. – RemcoGerlich – 2015-06-15T08:04:27.260
Even President Obama mispronounces stuff. See https://www.youtube.com/watch?v=bNr66HHhMjs – Ivo Beckers – 2015-06-15T09:28:46.233
@IvoBeckers I cannot believe this. This is so so so so very tricky! I'm feeling so proud posting this question here that is getting me priceless tips and information. Thank you, all native speakers. love... – Maulik V – 2015-06-15T09:37:09.873
A recent favourite of mine is gauge. What the heck is that u doing there? – gerrit – 2015-06-15T09:59:11.133
@gerrit The biggest blow to my medical mind was 'asthma' where there's no 'th' pronounced. I mean, I thought I should tear off my medical degree. And trust me, even pulmonologists and super specialists here in India STILL don't know the pronunciation of this very famous disease. Dengue, Rendezvous, lingerie, penis are a few more to add complexity. – Maulik V – 2015-06-15T10:13:07.940
@MaulikV Actually, the OED lists /ˈæsθmə/ as the first (most standard) pronunciation of asthma. The way to understand what's happening is that speakers feel a phonetic pressure to simplify the /sθm/ because this is not a normal English consonant cluster (as happens in many words from Greek). Some speakers bow to this pressure more than others. – Ben Kovitz – 2015-06-15T22:03:10.717
@MaulikV Rendezvous and lingerie are borrowed directly from French, and have French pronunciation (actually pretty Anglicized). Penis comes from Latin and has the usual difficulty with Latin words: the spelling doesn't indicate vowel length. The information is just not there in the spelling, even for someone who knows Latin. For the /ē/, there aren't even common patterns you can exploit to make an educated guess. (The unstressed second syllable is easy, though.) Dengue has a bizarre etymology, starting in Swahili and going through Spanish. – Ben Kovitz – 2015-06-15T22:25:05.720
@BlueRaja-DannyPflughoeft Actually, in AmE, the r in bourgeois is usually pronounced. It's actually somewhat predictable: the /rʒw/ sequence of consonants is unnatural in English, so speakers feel a pressure to simplify it. Writing tends to be more stable, so after a few hundred years, the resulting altered pronunciations are not that big a surprise. – Ben Kovitz – 2015-06-15T22:35:40.903
@BenKovitz: I have never once heard the r pronounced in 'bourgeois' in American English, and the audio files in the link you posted corroborate that. Apparently it is commonly pronounced in British English, though. – BlueRaja - Danny Pflughoeft – 2015-06-15T22:47:33.243
@BlueRaja-DannyPflughoeft Hey, that's my friend Dvortygirl's pronunciation! Most sources seem to say that the r is gone from the BrE pronunciation. Here's what the OED lists: Brit. /ˈbʊəʒwɑː/ , /bʊəˈʒwɑː/ , /ˈbɔːʒwɑː/ , /bɔːˈʒwɑː/ , U.S. /bʊrˈʒwɑ/ , /ˈbʊrʒwɑ/. The pressure to drop the r is stronger in non-rhotic accents, of course. – Ben Kovitz – 2015-06-15T23:27:02.477