## Whence the "everything is linear" phenomenon, and what can we do about it?

59

$$\color{red}{(a+b)^2 = a^2+b^2}$$ $$\color{red}{\sqrt{x^4+y^4} = x^2+y^2}$$ $$\color{red}{e^{t^2+C} = e^{t^2}+e^C}$$

I've observed this phenomenon -- wherein, implicitly, students say, "Everything is linear! Just pass the operation through!" -- in courses at all levels. High school students. Undergraduates in calculus. It's all over the place.

• Where does it come from? Is there a human tendency to view things linearly? More specifically, is there a reason that this occurs quite frequently with students of mathematics? Do we unknowingly "teach" this behavior, only to have to "unteach" it later?
• What can we do about it? Is there any evidence that this can be "unlearned" at a later age, and how can this be done effectively? Do we need to just accept that it is part of human learning to view the world linearly, and instead focus on ways of addressing it as it comes up? How can this be done?

So, I'm hoping that this thread, on matheducators.se, can broaden the types of answers seen there. I really want to understand why and how this behavior develops (is it innate, is it a consequence of the way we teach, is it both?) and what we can do to address it effectively (both in the short term, like that other thread, but also in the long term). I'm sincerely hopeful that there is research out there about this phenomenon and that it can be shared here.

Addendum: Please do not duplicate answers/suggestions from that math.se thread. I am looking for significant evidence of effectiveness of techniques, and do not want this thread to become a big list of suggestions.

@ChrisA -- I apologize for any perceived naivete, but that seems barbaric. Mathematics is a mental game. Associating things like nonlinearity with WHACK! does not seem like an effective way to help them in any way. – Robin Goodfellow 2015-01-01T17:51:24.747

@Robin - worry not, I'm not a serial abuser of students!! An example would be, we're in a tutorial session, the student is a little disengaged and says something daft like "3x3=6", when (s)he knows perfectly well that it isn't. A gentle tap with the learning tool is a humorous and quick way of getting them to recognise that (s)he wasn't actually thinking. Mostly I only need to pick it up and wave it gently... The learning tool is never applied in cases of actual non-understanding. – ChrisA 2015-01-02T18:51:42.503

I taught my Calculus I students: "if you think you have an identity like $\sqrt{x + y} = \sqrt x + \sqrt y$, then try plugging in specific numbers. If they don't come out equal, it's not really an identity." On the exam, one student actually wrote: "I want to simplify $\sqrt{x^2 + y^2}$. Try $x = 2$ and $y = 3$: $\sqrt{4 + 9} = 3.605$, but $\sqrt4 + \sqrt9 = 5$." On the next line, he continued: "$\sqrt{x^2 + y^2} = x + y$". (I pass otherwise silently over the issue of believing that $\sqrt{4 + 9} = 3.605$.) – LSpice 2015-04-02T15:47:45.397

Behavioral studies tell me that negative reinforcement always works better... so what about telling them "you are right, but in mod-2 arithmetic: $a^2 + b^2 \equiv (a+b)^2 \pmod 2$"? – wilsonw 2014-03-29T07:10:39.897

Related post on math.SE: Pedagogy: How to cure students of the “law of universal linearity”?

Martin 2015-05-18T07:44:14.480

I often see students do $\sqrt{x+y}=\sqrt{x}+\sqrt{y}$. I point out to them that if this were true, the Pythagorean theorem would simplify from $a^2+b^2=c^2$ to $a+b=c$. This seems to work for that particular mistake, but I don't think it fixes the underlying problem, which is probably some combination of lack of technical skill and lack of discipline in reasoning according to well-defined principles. – Ben Crowell 2014-03-29T22:09:51.350

Linearity/distributivity, when first introduced, may appear to be a property of brackets, rather than a property of multiplication, because the multiplication sign is suppressed. This doesn't cover all cases obviously, but this is an explanation I've gotten for attempts at $(x+y)^{1/2}$ or $\sin(a+b)$ from various students. "Can't you just do that when there are brackets?" Nope, and this should be strongly emphasized early on. – Robert Mastragostino 2014-03-30T07:22:20.600

@RobinGoodfellow While it might not be good to associate nonlinearity with WHACK!, I think it would help a lot of students if they associated "I'll just say random stuff and hope it's true" with WHACK! – DRF 2017-03-22T13:27:37.033

My suspicion of where this comes from is that most students are intellectually lazy and when they are not sure what to do, they guess the simplest possible rule. In some sense, this rule of thumb is not too bad -- often the simplest possible rule is the correct rule. Linearity only appears because it is generally the simplest possible rule available. Alas, students often lack the intellectual energy to make the effort to check the validity of their invented rule before proceeding on with it, which leads them astray in the examples cited. – Michael Joyce 2017-06-28T20:46:55.350

I had a colleague who was so happy with a technique he showed me... he drew little hearts over the parentheses to remind students about distribution. Then I asked, "Do they know what operations that's appropriate for?" and he was at a loss for a response. One problem is that our institution-mandated final exam only tests on things that do distribute, and most instructors only teach those specific manipulations on the final. – Daniel R. Collins 2017-07-18T04:34:54.610

I think a better word choice is "everything commutes". – TheBluegrassMathematician 2014-04-28T18:09:45.057


Thank god I don't actually teach in a school... fortunately I have a little more freedom with my private students (with full permission of the parents) to apply a device I call the Learning Tool. It's a little pencil that I whack my students over the head with if they do things like this. Seriously, it works. I usually have to just put the learning tool on the table and they focus their thoughts instantly. – ChrisA 2014-05-08T17:04:16.850

40

The problem you describe is well-known in mathematics education research. I cite the paper of De Bock, D., Van Dooren, W., Janssens, D., & Verschaffel, L. (2002). Improper use of linear reasoning: An in-depth study of the nature and the irresistibility of secondary school students’ errors. Educational Studies in Mathematics, 50(3), 311–334. and give some of the references mentioned:

As the concept of linearity itself, the misuse of linearity has many faces: it has been found at different age levels and in a variety of mathematical domains (see, e.g., De Bock et al., 1999). In elementary arithmetic, the phenomenon of improper proportional reasoning is often related to a ‘lack of sense-making’ in the mathematics classroom (Gagatsis, 1998; Greer, 1993; Nesher, 1996; Verschaffel et al., 1994, 2000; Wyndhamn and Säljö, 1997). When confronted with so-called ‘pseudoproportionality problems’ (such as, e.g. “It takes 15 minutes to dry 1 shirt outside on a clothesline. How long will it take to dry 3 shirts outside?”), many students give answers based on direct proportionality (i.e., tripling the drying time because the number of shirts is tripled). [...]

In secondary education, ‘linearity errors’ are often reported in the fields of algebra and (pre)calculus. Students tend to overgeneralise what has been experienced as ‘true’ for linear functions to non-linear functions (e.g. “the square root of a sum is the sum of the square roots” or “the logarithm of a multiple is the multiple of the logarithm”). This type of systematic errors has been discussed and illustrated by Berté (1987, 1993), Gagatsis and Kyriakides (2000) and Matz (1982). According to Matz (1992), these linearity errors result from students’ overgeneralisation of the distributive law. The immense number of occasions wherein students add and use the distributive law in arithmetic and early algebra is very likely to reinforce students’ acceptance of linearity.

Berté, A. (Réd.): 1987, Enseignement des mathématiques utilisant la ‘réalité’, Tome 1, IREM, Bordeaux.

Berté, A.: 1993, Mathématique dynamique, Nathan, Paris.

De Bock, D., Verschaffel, L. and Janssens, D.: 1999, ‘Some reflections on the illusion of linearity’, Proceedings of the 3rd European Summer University on History and Epistemology in Mathematical Education, Vol. 1, Leuven/Louvain-la-Neuve, Belgium, pp. 153–167.

Gagatsis, A. and Kyriakides, L.: 2000, ‘Teachers’ attitudes towards their pupils’ mathematical errors’, Educational Research and Evaluation 6(1), 24–58.

Matz, M.: 1982, ‘Towards a process model for high school algebra errors’, in D. Sleeman and J.S. Brown (eds.), Intelligent Tutoring Systems, Academic Press, London, pp. 25–50.

Benoît Kloeckner 2015-05-02T12:37:10.293

Thank you, this is what I was looking for! I will check out these articles. – Brendan W. Sullivan 2014-03-28T18:01:52.143

If you have problems accessing some papers, let me know. I am not sure if I can get them. At least, I have the main paper. – Anschewski 2014-03-28T18:16:04.900

"... overgeneralisation of the distributive law." This is it in a nutshell, and also a hidden explanation for why. There is no such thing as "the distributive law." There's a "distributive property of multiplication over addition" (and potentially one of exponentiation over multiplication, or some such), but that whole "of something over something else" part is important. – Isaac 2014-03-29T02:34:16.900

Do you have an electronic copy of Matz (1982)? I have seen it referenced as discussing "overgeneralised linearity" in a paper of Zazkis (pdf) but found nothing in my various searches.

Benjamin Dickman 2015-06-08T17:10:58.117

Here's a fun game to play. Give your students the example of what to simplify before you teach them how to simplify it. What I notice is quite often the type of simplifications described here are what students give as answers before I teach them, and given that I often see these mistakes after teaching them, it suggests my past approach was wrong.

Derek Muller suggests that in science education we should give counters to common misconceptions as well as "the facts." See www.youtube.com/watch?v=eVtCO84MDj8‎. I feel fairly certain that this approach might be successful here as well. – David Wees 2014-03-31T13:45:57.533

I'm sorry, I don't have the Matz (1982) paper... – Anschewski 2015-10-28T15:59:06.417

35

I don't view these common mistakes as 'universal linearity' assumptions. The mistake that $(a+b)^2=a^2+b^2$ is just a visually appealing statement. It is mistaken to be correct because it looks nice. Our brains tend to like things that look nice. Similarly, $\sqrt{a+b}=\sqrt a+\sqrt b$ is visually appealing and it resembles the correct formula $\sqrt {ab}=\sqrt a\cdot \sqrt b$. This is a fallacy of the kind "I don't really understand why all these algebraic rules for square root are true, I never really took the time to see the proof and I never really stopped for a second to think about the equalities I was given. Instead I just vaguely tried to remember them, totally devoid of content. Thus, I kinda remember that square root behaves like that, kinda, I hope, and so I'll just compute that way. Moreover, I have so little ability to compute even the simplest things in my head, and I don't feel like pulling out my trusty pocket calculator, that I am completely incapable of detecting the falsehood of this silly claim."

There are plenty of such fallacies, linear-looking or not. The cause, I believe, is simply a fundamental lack of understanding of the formulas coupled with a deep lack of ability to compute very simple things mentally. The way to deal with that is to include true/false questions of this type where the students need to prove or provide a counterexample, and disallow use of calculators.

Indeed, I agree with your assessment. The simple truth is that these errors are not rooted in logical misapplication of rules. Rather, they are rooted in apathy mixed with poor background in arithmetic. For some, it's just apathy. I speak from my own experience both as a teacher and as a student. – The Chef 2014-03-28T00:30:45.400

The first example, $(a + b)^2 = a^2 + b^2$ in the US is the result of being taught the "distributive method", which is a shortcut method that isn't always true. I fell for the same logic myself, but I can assure you, it wasn't because I thought everything should be or was intended to be linear. – person27 2014-03-28T06:59:11.387

Nice answer! Actually the first incorrect formula you write, $(a+b)^2=a^2+b^2$, could be seen as a wrong version of the corresponding, correct version for multiplication, $(ab)^2=a^2b^2$. – mickep 2016-09-26T17:02:26.863

It's worth pointing out that $\sqrt{ab}\neq\sqrt{a}\sqrt{b}$ in general, and this IMO is exactly the sloppy thinking that leads to $(a+b)^2=a^2+b^2$. The fact that no one has mentioned that seems scary. – DRF 2017-03-22T13:46:53.817

Well, the way in which products of square roots need not be square roots of products is qualitatively different, and subtler. I think to first order it is ok to think that they are the same, etc... – paul garrett 2017-06-27T21:40:53.377

@DRF technically the only counter-examples I can perceive are ones where it returns the non-principal square root. Is that what you are assuming? – The Great Duck 2018-01-14T07:47:57.077

@TheGreatDuck Not really, my issue is that $\sqrt{ab}$ may exist while $\sqrt{a}\sqrt{b}$ might not, assuming you are working over any field which is not algebraically closed. – DRF 2018-01-15T14:16:25.987

@DRF the individual factors might not exist but the entire expression would be in the set assuming the left side exists along with closure under negation. – The Great Duck 2018-01-15T16:21:43.427

@TheGreatDuck What do you mean by "in the set"? In $\mathbb{Z}_5$, $\sqrt{2}\sqrt{3}$ does not exist, whereas $\sqrt{1}=1$. This means $\sqrt{2}\sqrt{3}\neq \sqrt{2\times 3}$. – DRF 2018-01-16T08:31:52.173

@DRF Using modular arithmetic to form a counterexample is a bit of a cop out seeing as how many algebraic properties break under sufficient conditions. The distributive property can also be violated under sufficient conditions. I was referring to the set of real numbers and how if $a$ and $b$ are negative then separating the radical is valid in that the final product is a real number. It is only the intermediary calculations that are not real and so it is valid as long as you accept that the square root is multivalued or set valued (whichever you prefer). – The Great Duck 2018-01-16T15:38:15.837

@TheGreatDuck it is in no way a cop out. It's exactly the same problem in $\mathbb{Z}_5$. You get the same exact solution if you adjoin the appropriate roots as you get by adjoining the appropriate roots for the reals. The product $\sqrt{2}\sqrt{3}$ in $\mathbb{Z}_5[X]/(x^2-2)$ evaluates to $x\times 2x=4$, which is a square root (though not the one I had in mind). In other words, there is absolutely no difference between a finite field and $\mathbb{R}$ in this sense: in both cases you need to adjoin roots to make sense of the expression. – DRF 2018-01-16T21:26:11.240

@DRF that's a ring not a field. – The Great Duck 2018-01-16T21:52:28.783

@TheGreatDuck Really? I would have sworn $x^2-2$ was irreducible over $\mathbb{Z}_5$. But I might be wrong, I'm sure. Either way, that wasn't really the point; even if I did get the wrong construction, there surely is a field extension of $\mathbb{Z}_5$ where both $2$ and $3$ are squares. – DRF 2018-01-16T22:03:47.510
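The arithmetic in this exchange is small enough to check by brute force. The following is only a sketch: the pair representation `(u, v)` for $u + vx$ and the helper `mul` are ad hoc choices made for illustration, not anything used by the commenters.

```python
P = 5

# Squares in Z_5: neither 2 nor 3 appears, so x^2 - 2 has no root mod 5
# (and a quadratic without roots is irreducible over a field).
squares_mod_5 = sorted({(n * n) % P for n in range(P)})


def mul(p, q):
    """Multiply u1 + v1*x and u2 + v2*x in Z_5[x]/(x^2 - 2), using x*x = 2."""
    (u1, v1), (u2, v2) = p, q
    return ((u1 * u2 + 2 * v1 * v2) % P, (u1 * v2 + v1 * u2) % P)


root2 = (0, 1)  # x, since x*x = 2
root3 = (0, 2)  # 2x, since (2x)*(2x) = 4*2 = 8 = 3 mod 5

product = mul(root2, root3)            # the "sqrt(2) * sqrt(3)" of the comment
product_squared = mul(product, product)

print(squares_mod_5, product, product_squared)
```

Under these assumptions the product is the constant $4$, whose square is $1 = 2\times 3 \bmod 5$, matching the computation in the comment above.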

@DRF I'm just saying that $Z_5$ is a ring. – The Great Duck 2018-01-16T22:09:18.653

DRF 2018-01-16T22:18:14.610

Excellent answer. Saved me a lot of typing. I see this (and other similar) a lot with my students, and it's always a lack of understanding of how numbers work, coupled with a desire to just get the question done. – ChrisA 2014-07-31T17:35:02.330

20

This became too big to be a comment.

Layman's opinion.

Where does it come from?
It comes from the fact that universal linearity is useful to move forward in calculations even if it's wrong. Psychologically this is very attractive. The other option is being stuck. Moving forward has the added incentive that it can be right, that maybe the student can get some points. Another reason why this might happen is that typographically linearity is aesthetically pleasing, so it should be right...

What can we do about it?
A solution is to increase critical thinking overall. This can be done by not slacking off on detail, rigor and logic. Students should be taught to justify every step by going back to the definitions or previously proved properties. But this school of thought isn't used outside of proof-based courses, and this is what happens.

This hits the nail on the head. The alternatives are often "give up" or "try what looks right". What's described as "universal linearity" looks right to a person without the ability to think critically. – Jonathan 2014-03-28T01:37:35.863

This, I think, is an extremely essential point. Often students don't know what else to do and have been taught that doing something is better than doing nothing, since in the US at least (and possibly other places) there is nothing to lose by having nonsense on the exam. In my country, when I got these types of answers during an oral exam, that was the end of the exam. Students soon learn it's better to think for a while or say "I don't know" instead of saying whatever randomness comes first. – DRF 2017-03-22T13:44:07.953

16

You can encourage your students to check their algebra with random numerical examples.

E.g.: Try setting $a=x=t=2$ and $b=y=C=3$. Then the equations above give $$25=13$$ $$9.85=13$$ $$1096.6=74.7$$ which are vividly wrong.

If one pair of numbers comes out wrong, your algebra was definitely wrong. If a couple pairs of numbers come out right, your algebra was probably right. It's a great heuristic.
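The heuristic is easy to automate. What follows is a minimal sketch, assuming Python; the helper name `looks_like_identity`, the sampling range and the tolerance are illustrative choices rather than part of the suggestion above.

```python
import math
import random


def looks_like_identity(lhs, rhs, n_vars, trials=20, seed=0):
    """Return False as soon as one random sample makes lhs and rhs differ.

    A True result is only evidence, not a proof, that lhs == rhs in general.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        # Sample positive reals so that square roots stay defined.
        args = [rng.uniform(0.5, 3.0) for _ in range(n_vars)]
        if not math.isclose(lhs(*args), rhs(*args), rel_tol=1e-9):
            return False
    return True


# The three "identities" from the question, plus one correct expansion.
square_of_sum = looks_like_identity(
    lambda a, b: (a + b) ** 2, lambda a, b: a**2 + b**2, 2)
root_of_sum = looks_like_identity(
    lambda x, y: math.sqrt(x**4 + y**4), lambda x, y: x**2 + y**2, 2)
exp_of_sum = looks_like_identity(
    lambda t, c: math.exp(t**2 + c),
    lambda t, c: math.exp(t**2) + math.exp(c), 2)
correct_expansion = looks_like_identity(
    lambda a, b: (a + b) ** 2, lambda a, b: a**2 + 2 * a * b + b**2, 2)

print(square_of_sum, root_of_sum, exp_of_sum, correct_expansion)
```

The three false identities are rejected and the correct expansion passes. Note that a passing check only says no counterexample was found among the samples.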

I do this quite frequently. And this is almost identical to the "Give counterexamples" suggestion in this answer: http://math.stackexchange.com/a/630653/37705 I am looking for more significant evidence about whether this works.

Brendan W. Sullivan 2014-03-28T00:08:49.977

@Brendansullivan07: I find this essential when I do complicated algebraic manipulations; it works for me. – None – 2014-03-28T00:10:02.023

@brendansullivan07: This tweaks that "give counterexamples" suggestion in two ways: 1) students can try random examples themselves, and 2) it helps to reduce everything to decimals: $\frac{1}{2}=2$ may be vivid to you but for most people $0.5=2.0$ is more striking. – None – 2014-03-28T00:13:42.630

"Random numerical testing" can also be further dignified by calling it "the Monte Carlo method". – paul garrett 2014-03-28T14:14:51.587

While this sounds like it would be effective, my experience teaching algebra to college students suggests that this method is actually completely useless to anyone who does not already understand what is going on. Although I believe you've posted this with good intentions, I believe you do not have any directly relevant experience, so I've downvoted the answer. Nothing personal; have a good day! – Chris Cunningham 2014-03-28T21:08:30.210

@ChrisCunningham: I have used this when discussing spreadsheets in non-academic contexts, and when teaching high school students in a summer program, and found it helpful in both settings. – None – 2014-03-29T03:47:32.517

Interesting -- I wonder what causes the difference. Maybe it works with small groups who are independently interested in understanding what is going on? Sorry for the strongly-worded comments, by the way, but I have strong feelings on this particular issue! – Chris Cunningham 2014-03-29T14:53:25.040

@MichaelE2: In my (admittedly limited) experience, teaching students to routinely perform these checks is a very effective way to help them unlearn wrongly generalized algebraic manipulation rules, and can even provide a kind of immunity against such overgeneralization in the future. It's not, however, a quick fix -- it takes many hours of active teaching and practice to instill such a routine, and many more for the students to slowly figure out which of the rules they thought they knew are actually wrong, and why. It's even harder if you're trying to teach something new at the same time. – Ilmari Karonen 2014-03-30T13:30:41.517

14

As a student myself, I'd say that, while I'm not representative of all students, some of it is the intimidating or dismissive way some lecturers or teachers might treat questions relevant to simple things like these, or not teach off the exam specification. This leads to a subconscious or even conscious bias against asking these sort of questions which we're unsure of.

For example, if I asked a question such as why you can separate roots with multiplication signs, I might be told to just "accept it and learn it" or if I asked if you could do something incorrect, such as one of the examples you gave, I'm likely to receive a response like "Of course not! Don't be silly!"

I think these two factors especially lead people to just try to memorise as much of the method, but not logic or reasoning, behind the mathematics, and as a result end up getting it wrong, improvising, or guessing.

Personally, I think it's a lot due to teachers not concerning themselves with a student's future, only getting the student past the current exam hurdle, which results in generations of students who are assumed to know something relatively simple, but since the teachers deemed it unnecessary to explain since they don't need the explanation for the exam at hand, never learnt it.

Edit: Glancing back at this, it occurs to me I should mention that I would probably blame government policy more than teachers, who no doubt are under lots of unhelpful pressure and stress which may make it much more difficult to teach "well" - this is just what I have observed from my teachers from an objective standpoint.

11

In addition to other good answers and comments... I think it should be noted that "linear mathematics" at higher levels is the part of mathematics that we (collectively) understand relatively well, while "non-linear mathematics" is often intractable... except to the extent we can usefully approximate it linearly.

Both the symbol patterns and the mathematical assertion(s) given by the simplest symbol patterns are appealing and very handy ... if correct.

Paraphrasing, and as in some of the other answers, if the choice is between "being stuck" and "making progress", often an admittedly dubious assumption of linearity, or some other mildly outrageous optimistic assumption, is necessary to avoid getting stuck. That is, methodologically, linearity assumptions (and other such) are entirely reasonable... at least as a transitional device.

And, after all, "differentiability" of a function is in many regards the assertion that the function can be locally approximated by a linear function. The Newton-Raphson method shows how iteration of a nearly-linear device achieves excellent effects.
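As an illustration of that last remark, here is a small sketch of Newton-Raphson applied to $f(x) = x^2 - c$, i.e. to computing a square root. The function name `newton_sqrt`, the starting point and the step count are arbitrary choices for the example.

```python
def newton_sqrt(c, x0=1.0, steps=6):
    """Approximate sqrt(c) by Newton-Raphson applied to f(x) = x**2 - c."""
    x = x0
    for _ in range(steps):
        # Replace f by its tangent line at x and solve that linear equation:
        # 0 = f(x) + f'(x) * (x_new - x)  =>  x_new = x - f(x) / f'(x)
        x = x - (x * x - c) / (2 * x)
    return x


approx = newton_sqrt(2.0)
print(approx)
```

Each step solves a purely linear problem, yet a handful of iterations already agrees with $\sqrt{2}$ to roughly machine precision.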

11

I guess you already have plenty to read for the "Where does it come from?" part of your question. So I will just briefly introduce my favorite strategy for the "What can we do about it?" part, when there is a case to work on:

Encourage your students to find conditions under which the linearity assumption does work!

For example: for which values of $a$ and $b$ is $(a+b)^2=a^2+b^2$? Quite often, this turns into a nice challenge, and a rewarding one in the direction you need.
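For this particular example the answer is short, assuming $a$ and $b$ are real numbers (the assumption matters, since in arithmetic mod 2 the equation holds for every $a$ and $b$):

$$(a+b)^2 = a^2+b^2 \iff a^2+2ab+b^2 = a^2+b^2 \iff 2ab = 0 \iff a = 0 \ \text{ or } \ b = 0.$$

So the "linear" rule is valid exactly when one of the two terms vanishes, which is a concrete way to see how special that case is.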

8

Just to steer in a different direction. I think it's the way we learn mathematics (and I personally don't think this is a good thing). From a very young age, we learn to assume that everything is linear. In elementary school, we get problems like:

If John paints one house in five hours and Mary paints one house in three, how long does it take them to paint one house together?

Some kids will say: "Well, I don't know. Maybe they have to spend some time to divide the work between them, and maybe they spend a lot of time bringing their stuff in. You can't really divide this work unless they both bring half their stuff, but then they would have to share their equipment and they may have to wait for each other...". But no, according to the teacher and the textbook they are wrong, and the answer should be $\frac{15}{8}$ hours. This reinforces the idea that you should just make the assumption of linearity when the problem is complicated.
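For reference, the textbook answer comes from adding the two work rates, which is exactly where the linearity assumption (rates add, with no coordination overhead) enters:

$$\frac{1}{5}+\frac{1}{3}=\frac{8}{15}\ \text{houses per hour} \quad\Longrightarrow\quad t=\frac{1}{8/15}=\frac{15}{8}\ \text{hours}\approx 1.9\ \text{hours}.$$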

In this case probably the only way to do an actual computation is to assume linearity. I think teachers' desperation to show that mathematics is useful has driven them to these kinds of examples. But it doesn't stop in primary school; linearity is almost always assumed. I think I even saw questions in mathematics and physics courses where you had to assume linearity to solve the question.

(I think the easiest examples that I can think of have to do with pressure and the speed of water streaming out.)

Now linearity is not a bad thing (even if the thing we are trying to compute isn't exactly linear, we can still use it as an approximation), but the hidden assumption of it is a bad thing. Students will get confused and assume things are linear when they aren't. The sentence 'a manager is someone who expects that two women can deliver a baby in 5 months' comes to mind now (but actually, you can substitute most students for the manager, and all non-linear things for 'delivering the baby').

I had always been one of those "some kids"; page 30 of this and this remind me of how I typically want to object upon being pushed such an ill-formed question.

Vandermonde 2015-01-30T00:42:34.643

This is an interesting perspective that I hadn't considered before. Somehow I feel this is such a subtle and deliberate form of linearity as a simplifying assumption that it's hard for me to connect it to the phenomenon in the question. The latter type of linearity seems more likely to conclude that John and Mary together paint the house in 8 hours! – Erick Wong 2015-05-16T17:24:28.937

The manager joke is iconic! – Engelsmann 2017-09-16T14:11:36.440

8

I've read all the existing answers long ago but still feel that none have gotten to the heart of the issue. We obtain mathematical results through a process of reasoning. That reasoning must be logical and enough to convince anyone that our results are correct given our initial assumptions. That is the actual purpose of a proof. It does not matter what form the reasoning takes, whether using only words or only mathematical symbols or only a diagram. The requirement is simply to convince the other person. If we cannot do so, then our reasoning is insufficient or incorrect.

This proper attitude must start right from the basics. For example $\frac{1+2}{1+3} \ne \frac{\not{1}+2}{\not{1}+3}$. Explaining to the student that one cannot do that is almost useless. Instead, the student should be asked: "Why do you cancel?" and then "Why does cancelling keep the value the same?".
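A worked check of the arithmetic behind those two questions, for concreteness:

$$\frac{1+2}{1+3} = \frac{3}{4} \ne \frac{2}{3}, \qquad\text{whereas}\qquad \frac{1\times 2}{1\times 3} = \frac{1}{1}\times\frac{2}{3} = \frac{2}{3}.$$

Cancelling keeps the value the same only when the cancelled number is a common factor, because then it amounts to removing a factor equal to $1$.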

The problem is that if this is not done from the beginning of arithmetic, it simply causes students to create for themselves a deep quagmire of guesswork in order to heuristically write down things which they believe will get them their grades. If you have seen students who try to mimic their teachers' phrasing but clearly without understanding of the meaning, or students who care only about how to get the answer and not why the method is correct, you know what I mean.

As a result, very few students have a full grasp of even the fundamentals, namely the field of rationals. What I mean by this is that few are able to state all the field axioms correctly and prove results like the uniqueness of inverses (when they exist) and that $0 \times x = 0$ and that $-x \times -y = x \times y$. (Out of these, fewer still can give any explanation as to the rationale for the axioms, but that is another topic.)

It is obvious that with a proper foundation as I briefly described above, no student would ever write $(a+b)^2 = a^2+b^2$. Why? Because they know that "$x^2$" is defined as "$x \times x$" and "$()$" are used to denote what to do first, so $(a+b)^2 = (a+b) \times (a+b)$. Moreover, they also would know the distributivity field axiom that gives first $(a+b) \times (a+b) = a \times (a+b) + b \times (a+b)$ and then after 2 more applications the full expansion, using the commutativity and associativity axioms. Likewise none of the other mistakes that you mentioned would occur.
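Written out in full, with each step labelled by the rule it uses, the expansion described above reads as follows (here $2$ is shorthand for $1+1$):

$$\begin{aligned}
(a+b)^2 &= (a+b)\times(a+b) && \text{definition of } x^2\\
&= a\times(a+b) + b\times(a+b) && \text{distributivity}\\
&= (a\times a + a\times b) + (b\times a + b\times b) && \text{distributivity, twice}\\
&= a^2 + (a\times b + a\times b) + b^2 && \text{commutativity, associativity}\\
&= a^2 + 2ab + b^2.
\end{aligned}$$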

Furthermore, if students cannot handle the field axioms correctly, one might as well throw the induction axiom out of the window. The way it is taught in most textbooks and curricula is seriously lacking, precisely because it is not based on sufficiently formal reasoning. A simple example that most students who were brought up with textbook induction fail to solve is:

Given a function $f:\mathbb{Z}\to\mathbb{R}$ such that $f(0) = 0$ and $f(1) = 1$ and $f(x+1) + 6 f(x-1) = 5 f(x)$ for any $x \in \mathbb{Z}$, prove that $f(x) = 3^x - 2^x$ for any $x \in \mathbb{Z}$.

It is not hard at all, but only those who understand the logical structure of induction would be able to give a correct proof. In case anyone is wondering what I mean by textbook induction, two examples that I would consider seriously lacking are:

Finally, proper reasoning naturally requires sufficient precision, because one cannot reason logically about statements whose meaning is undefined or unclear. Vagueness in mathematics is one great recipe for confusion. This must start with the teacher. A teacher who is sloppy with mathematical statements or steps in reasoning is simply telling the students that it is alright to be sloppy and by extension it is alright if they do not know what they are doing as long as they get the answer!

One terrible example of sloppiness in most high-school curricula is solving differential equations by "separating variables". Try giving the following to any student:

Solve for $y$ as a function of a real variable $x$ given that the differential equation $\frac{dy}{dx} = 2\sqrt{y}$ holds.

You know what answer to expect, and I hope you know the correct answer. Even Wolfram Alpha gets it wrong. Now for students who give the wrong answer, tell them that it is wrong but do not tell them the correct answer, and ask if they can identify the mistake and fix it. Most will fail to identify the mistake, and fixing the mistake will require the foundation in logic that most students do not have.

Here are the solution sketches for the problems I've given above. I strongly encourage one to thoroughly check one's own work to verify whether each step follows completely logically from the preceding deductions, and merely look at these solutions to confirm.

Problem

Given a function $f:\mathbb{Z}\to\mathbb{R}$ such that $f(0) = 0$ and $f(1) = 1$ and $f(x+1) + 6 f(x-1) = 5 f(x)$ for any $x \in \mathbb{Z}$, prove that $f(x) = 3^x - 2^x$ for any $x \in \mathbb{Z}$.

Hints

Induction only allows you to derive something about the natural numbers. The desired theorem is about integers. Also, if you cannot prove the implication needed for the induction, a key technique that often works is to strengthen the induction hypothesis to include enough information so that you can prove the implication step. Of course that also means that the implication you need to prove has changed!

Solution sketch

First notice that the theorem to be proven is that $f(x) = 3^x - 2^x$ for all integers $x$, and so induction in one direction is not enough! Also, notice that it is impossible to prove that $f(x) = 3^x - 2^x$ implies $f(x+1) = 3^{x+1} - 2^{x+1}$, and hence the induction hypothesis must contain information about at least two 'data points' for $f$. The easiest one would be to let $P(x)$ be "$f(x) = 3^x - 2^x$ and $f(x-1) = 3^{x-1} - 2^{x-1}$". Then one must prove $P(x+1)$, which expands to "$f(x+1) = 3^{x+1} - 2^{x+1}$ and $f(x) = 3^x - 2^x$". I would not accept if the student does not fully prove $P(x+1)$. This would handle the natural numbers, and a similar induction would handle the negative integers. It is of course possible to combine both inductions into one, which should be explored, although in general it is good to keep a proof as modular as possible.
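One way the two key computations can go, assuming $P(x)$ as defined above, is sketched here. For the upward step, the recurrence gives

$$\begin{aligned}
f(x+1) &= 5f(x) - 6f(x-1) = 5\,(3^x - 2^x) - 6\,(3^{x-1} - 2^{x-1})\\
&= 5\cdot 3^x - 2\cdot 3^x - 5\cdot 2^x + 3\cdot 2^x = 3^{x+1} - 2^{x+1}.
\end{aligned}$$

For the downward step, if $f(x) = 3^x - 2^x$ and $f(x+1) = 3^{x+1} - 2^{x+1}$ are known, the same recurrence solved for $f(x-1)$ gives

$$6f(x-1) = 5f(x) - f(x+1) = 5\,(3^x - 2^x) - (3^{x+1} - 2^{x+1}) = 2\cdot 3^x - 3\cdot 2^x = 6\,(3^{x-1} - 2^{x-1}).$$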

Problem

Solve for $y$ as a function of a real variable $x$ given that the differential equation $\frac{dy}{dx} = 2\sqrt{y}$ holds.

Hint

The answer is not $y = (x+a)^2$, which you would get by the method of separating variables. What went wrong? Note that the error would still be there if you used the theorem that allows change of variables in an integral. Look carefully at each deduction step. One step cannot be justified based on any axiom. Think basic arithmetic. After you get that, you need to consider cases and use the completeness axiom for reals to extend the open intervals on which the standard solution works.

Solution sketch

The field axioms give a multiplicative inverse only for nonzero elements, so dividing by $\sqrt{y}$ is unjustified wherever $y = 0$. Now how to solve the problem? Split into cases. Note that you need to work on intervals since having isolated points where $y$ is nonzero is useless. First prove that for any point $x$ where $y \ne 0$, there is an open interval around $x$ on which $y \ne 0$. Then we can use the completeness axiom for reals to extend the interval in both directions as far as $y \ne 0$. Now we can use any method to solve for $y$ on that interval. Note that the method of separating variables is formally invalid, so we should use the change of variables substitution. But the prerequisite for that is that $\frac{dy}{dx}$ is continuous, so we need to prove that! Well, $y$ is differentiable and hence continuous, so $\frac{dy}{dx} = 2\sqrt{y}$ is continuous. So we get the solution on the extended interval, and it shows that $y$ becomes zero in exactly one direction in this example. Hence after some checking you will get either $y = 0$ or $y = \cases{ 0 & \text{if } x \le a \\ (x-a)^2 & \text{if } x > a }$ for some real $a$.
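To see numerically why the full parabola fails while the piecewise function works, one can compare a finite-difference estimate of $\frac{dy}{dx}$ with $2\sqrt{y}$ at sample points. This is only a rough Python sketch, not a proof; the helper names, the choice $a = 1$, the step size `h` and the sample grid are all assumptions made for illustration.

```python
import math


def piecewise(a):
    """y = 0 for x <= a and y = (x - a)^2 for x > a."""
    return lambda x: 0.0 if x <= a else (x - a) ** 2


def full_parabola(a):
    """y = (x - a)^2 for all x (what naive separation of variables suggests)."""
    return lambda x: (x - a) ** 2


def max_residual(y, xs, h=1e-6):
    """Largest |y'(x) - 2*sqrt(y(x))| over xs, with y' from a central difference."""
    return max(
        abs((y(x + h) - y(x - h)) / (2 * h) - 2 * math.sqrt(y(x))) for x in xs
    )


xs = [k / 10 for k in range(-30, 31)]  # sample points in [-3, 3]
good = max_residual(piecewise(1.0), xs)      # only finite-difference error remains
bad = max_residual(full_parabola(1.0), xs)   # large: y' < 0 but 2*sqrt(y) > 0 for x < a
print(good, bad)
```

The residual of the full parabola is large precisely on $x < a$, where its gradient is negative while $2\sqrt{y}$ is positive.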

Alternative subproof

In fact, the substitution theorem can be completely avoided as follows. On any interval $I$ where $y \ne 0$, we have $y'^2 = 4y$, where "${}'$" denotes the derivative with respect to $x$. Thus $(y'^2)' = (4y)'$, which gives $2y'y'' = 4y'$, and hence $y'' = 2$ since $y' = 2\sqrt{y} \ne 0$. Thus $y' = 2x+c$ on $I$ for some real $c$, and hence $y = x^2+cx+d$ on $I$ for some real $d$. Note that most of the above steps are not reversible and hence we need to check all the solutions we finally obtain with the original differential equation. We would get $c^2 = 4d$. After simple manipulation we obtain the same result for $y$ on $I$ as in the other solution. The other parts of the solution still need to be there.
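The check that yields $c^2 = 4d$ can be spelled out as follows. This is a sketch using only the quantities defined above; the name $a = -c/2$ is introduced here to match the notation of the other solution.

$$\begin{aligned}
y'^2 &= (2x+c)^2 = 4x^2 + 4cx + c^2, \\
4y &= 4x^2 + 4cx + 4d, \\
y'^2 = 4y \text{ on } I &\iff c^2 = 4d, \\
y &= x^2 + cx + \tfrac{c^2}{4} = \left(x + \tfrac{c}{2}\right)^2 = (x-a)^2 \quad \text{with } a = -\tfrac{c}{2}.
\end{aligned}$$

Finally, the original equation requires $y' = 2(x-a) = 2\sqrt{y} > 0$ on $I$, so $I$ must lie in the region $x > a$.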

2I think you're right about the first part: students develop a veritable morass of heuristics and "rules" that somehow "make sense" to them because they get the "right answer" but they cannot explain them at all!Brendan W. Sullivan 2015-05-01T18:37:58.790

3I don't understand your differential equation example, though. In what sense does Wolfram "get it wrong"? What mistake do you observe students making? I think you should explain it here as opposed to being coy with what you had in mind.Brendan W. Sullivan 2015-05-01T18:38:46.827

@brendansullivan07: Okay I'll put the solution in, but seriously I expect all teachers to be able to obtain and prove the correct solution.user21820 2015-05-02T04:06:30.813

3Of course. But I think answers on this site should be self-contained and immediately helpful. There's nothing to be gained by being coy/deceptive about it!Brendan W. Sullivan 2015-05-02T04:10:00.607

1@brendansullivan07: No that's not my purpose. My purpose is to not give an easy way to get the solution, to neither students nor teachers. I want people to spend enough effort to get the answer on their own, before I give any answers. But as I said, give me a while to put the answers in.user21820 2015-05-02T04:20:37.473

1@brendansullivan07: Ok I've added them. If you still do not understand them, let me know again. There is a reason why Wolfram Alpha gives the wrong answer, which is that the algorithm it uses caters only to a smaller more well-behaved class of functions, in which case there is a unique solution. Do you know what that class is?user21820 2015-05-02T05:15:42.800

@BenoîtKloeckner: I'm not sure what you are talking about. For any $x$ where $y \ne 0$, the open interval around $x$ (obtained from continuity) where $y \ne 0$ can be extended on each side, either to infinity, or to a point given by the completeness axiom, where $y = 0$ at that endpoint by continuity. It turns out that the solution on each such interval must be strictly increasing and hence there can be only one such interval. To the left of that interval $y$ must hence be always zero.user21820 2015-05-16T13:57:55.860

@brendansullivan07: I edited my answer, because I originally played with $(\frac{dy}{dx})^2 = 4y$ which had the original solutions I gave but I later decided to change the question to use the positive square-root so that $y'$ can be easily seen to be differentiable when $y \ne 0$, to make it easier to solve, but I forgot to check the solutions. The original question can be solved in the same way, but one must first prove at least continuity of $y'$, which is not so easy.user21820 2015-05-16T14:09:21.390

@user21820: sorry, I was careless. Comment deleted.Benoît Kloeckner 2015-05-16T18:37:04.567

1@BenoîtKloeckner: No problem. I am personally very prone to careless mistakes in mathematics too, so I try my best to make everything logically clear so that my students can easily point out any mistakes because they would be glaring logical gaps. =)user21820 2015-05-17T06:20:25.757

@user21820: in order to make a more constructive comment out of my mistake, let me stress that the slightly different example $y'(x)=\sqrt{|y(x)|}$ has the advantage of having a right-hand side which is defined for all values of $y(x)$, and it makes it an even more compelling example to show the importance of the regularity assumptions in uniqueness theorems (which was not your point here, granted).Benoît Kloeckner 2015-05-17T08:43:18.140

Aren't there more solutions? It seems that $y = (x-b)^2, x<b ; y = (x-a)^2, x>a$, and otherwise $y=0$ is also a solution to $\frac{dy}{dx} = 2\sqrt{y}$Chris Cunningham 2015-09-09T15:09:56.843

2@ChrisCunningham: Guess what... I made that mistake in my first version of my answer! For that function when $x < b$ the gradient is negative, but $2 \sqrt{y}$ is positive, so it is not a solution. However, it is a solution to the differential equation I stated 5 comments up. =)user21820 2015-09-10T09:11:02.567

Ha, yes, excellent :)Chris Cunningham 2015-09-16T16:10:47.500

1@user4571: Did you see the answer? There are not just $4$ solutions, but infinitely many, and not of the kind you think! Besides, it is totally not rigorous to make claims of the form "you can only switch paths at $0$". That is one of the problems. Write out a formal proof and you will see that you cannot avoid using the completeness axiom for reals to extend a solution on an open interval to show that there is only one such interval. The other problem is that the substitution theorem needs continuity, but the "separating variables method" does not bother to prove the required condition.user21820 2016-04-13T07:15:28.237

Sorry for being imprecise. What I really wanted to ask was, what distinction are you making between "separating variables" and "change of variables substitution"? As I understand it, these are the same thing -- separating variables means (to me) exactly moving all the $y$ terms to one side and integrating using change of variables.user4571 2016-04-13T18:48:21.057

1@user4571: As long as you check the conditions required by the change of variables theorem, that's alright. I'm objecting to the way the majority of students have been and are still being taught separating variables, which is nothing more than just callously moving around the $d$'s and prepending $\int$'s with no understanding whatsoever. But you also hand-waved regarding the bifurcations, which is independent of the errors in the high-school-taught method of separating variables.user21820 2016-04-14T02:54:31.173

@user21820 Ok, but what you do here is nothing but carefully separating variables! To say you are not separating variables is quite confusing, in my opinion. I agree that mindless symbol pushing is awful and that students should understand how the change of variables formula justifies the separation method. (Regarding the bifurcations, I was not fully rigorous, but I was not trying to be.)user4571 2016-04-14T03:29:03.760

(And here I admit "not fully rigorous" is a euphemism for "not rigorous at all.")user4571 2016-04-14T03:35:32.670

@user4571: Great! We don't disagree then, except that I personally would never use the name "separating variables" because it's highly misleading and suggests the incorrect interpretation, in the same way that "cross-multiplying" is not a meaningful name. You can be quite sure that many students and teachers don't think of any more than literally pulling the $dx$ and $dy$ apart when 'applying separating variables'.user21820 2016-04-14T04:58:37.670

6

I am deeply convinced that these fallacies come from the way we teach maths.

There is some research indicating this for the over-reliance on linearity (see e.g. page 51 of the last EMS newsletter, as I commented on another answer, which gives more references).

But it seems to me that this is a particular case of a more general phenomenon: maths is thought of by many students as black magic. It is about learning and reciting formulas, using precise methods to solve precise exercises, always in the same way. Beware if you misspell a formula, as you might summon an efreet by accident (i.e. the teacher will be mad at you)! In other words, students are lacking the relation between symbols, rules, formulas and theorems on the one hand, and their meaning on the other hand. Without such a connection, there is simply no way to tell a correct formula from an incorrect one; even checking may be out of the question, as replacing letters with specific values can only be done with a certain understanding of the role of variables, as opposed to the mere ability to reproduce formal manipulations of symbols.

I do not know if the following hypothesis has been tested rigorously, but I also think that this is linked very closely to the way we evaluate our students: the more we ask them to solve standardized exercises, the less sense and reasoning they put into their maths.

6

What can we do about it?