I've read all the existing answers long ago but still feel that none have gotten to the heart of the issue. We obtain mathematical results through a process of reasoning. **That reasoning must be logical and enough to convince anyone that our results are correct given our initial assumptions.** That is the actual purpose of a proof. It does not matter what form the reasoning takes, whether using only words or only mathematical symbols or only a diagram. The requirement is simply to convince the other person. **If we cannot do so, then our reasoning is insufficient or incorrect.**

This proper attitude must start right from the basics. For example $\frac{1+2}{1+3} \ne \frac{\not{1}+2}{\not{1}+3}$. **Explaining to the student that one cannot do that is almost useless**. Instead, the student should be asked: **"Why do you cancel?"** and then **"Why does cancelling keep the value the same?"**.
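
As an illustration of what an answer to the second question might look like (a sketch, with $a$, $b$, $c$ standing for arbitrary rationals such that $b \ne 0$ and $c \ne 0$), cancelling a common **factor** keeps the value the same because it amounts to multiplying by $1$, and no such step is available for a common **term**:

$$\frac{a \times c}{b \times c} = \frac{a}{b} \times \frac{c}{c} = \frac{a}{b} \times 1 = \frac{a}{b}, \qquad \text{whereas} \qquad \frac{1+2}{1+3} = \frac{3}{4} \ne \frac{2}{3}.$$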

The problem is that if this is not done from the beginning of arithmetic, it simply causes students to create for themselves a deep quagmire of guesswork in order to heuristically **write down things which they believe will get them their grades**. If you have seen students who try to mimic their teachers' phrasing but clearly without understanding of the meaning, or students who care only about how to get the answer and not why the method is correct, you know what I mean.

As a result, very few students have a full grasp of even the fundamentals, namely the field of rationals. What I mean by this is that few are able to state all the field axioms correctly and prove results like the uniqueness of inverses (when they exist) and that $0 \times x = 0$ and that $(-x) \times (-y) = x \times y$. (Out of these, fewer still can give any explanation as to the rationale for the axioms, but that is another topic.)
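
As a sketch of what such a proof can look like, here is one way to derive $0 \times x = 0$. It assumes the usual field axioms (additive identity, additive inverses, associativity of addition, distributivity); the exact list and the side on which distributivity is stated vary between textbooks, and commutativity of multiplication may be needed to switch sides.

$$\begin{aligned}
0 \times x &= (0 + 0) \times x && \text{(additive identity)} \\
&= 0 \times x + 0 \times x && \text{(distributivity)} \\[4pt]
0 &= 0 \times x + (-(0 \times x)) && \text{(additive inverse)} \\
&= (0 \times x + 0 \times x) + (-(0 \times x)) && \text{(substituting the result above)} \\
&= 0 \times x + (0 \times x + (-(0 \times x))) && \text{(associativity)} \\
&= 0 \times x + 0 && \text{(additive inverse)} \\
&= 0 \times x && \text{(additive identity)}
\end{aligned}$$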

It is obvious that with a proper foundation as I briefly described above, no student would ever write $(a+b)^2 = a^2+b^2$. Why? Because they know that "$x^2$" is **defined** as "$x \times x$" and "$()$" are used to **denote** what to do first, so $(a+b)^2 = (a+b) \times (a+b)$. Moreover, they also would know the distributivity field axiom that gives first $(a+b) \times (a+b) = a \times (a+b) + b \times (a+b)$ and then after 2 more applications the full expansion, using the commutativity and associativity axioms. Likewise none of the other mistakes that you mentioned would occur.
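
Written out in full, the expansion described above might look as follows. This is a sketch in which $2$ is taken to be defined as $1+1$, and each line names the axioms it relies on.

$$\begin{aligned}
(a+b)^2 &= (a+b) \times (a+b) && \text{(definition of } x^2\text{)} \\
&= a \times (a+b) + b \times (a+b) && \text{(distributivity)} \\
&= (a \times a + a \times b) + (b \times a + b \times b) && \text{(distributivity, twice)} \\
&= a^2 + (a \times b + a \times b) + b^2 && \text{(commutativity, associativity)} \\
&= a^2 + 2 \times (a \times b) + b^2 && \text{(distributivity, with } 2 = 1+1\text{)}
\end{aligned}$$

In particular, $(a+b)^2 = a^2+b^2$ can hold only when $2 \times (a \times b) = 0$.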

Furthermore, if students cannot handle the field axioms correctly, one might as well throw the induction axiom out of the window. The way it is taught in most textbooks and curricula is seriously lacking, precisely because it is not based on sufficiently formal reasoning. A simple example that most students who were brought up with textbook induction fail to solve is:

Given a function $f:\mathbb{Z}\to\mathbb{R}$ such that $f(0) = 0$ and $f(1) = 1$ and $f(x+1) + 6 f(x-1) = 5 f(x)$ for any $x \in \mathbb{Z}$, prove that $f(x) = 3^x - 2^x$ for any $x \in \mathbb{Z}$.

It is not hard at all, but only those who understand the logical structure of induction would be able to give a correct proof. In case anyone is wondering what I mean by textbook induction, two examples that I would consider seriously lacking are:

Finally, proper reasoning naturally requires sufficient precision, because one cannot reason logically about statements whose meaning is undefined or unclear. **Vagueness in mathematics is one great recipe for confusion.** This must start with the teacher. **A teacher who is sloppy with mathematical statements or steps in reasoning is simply telling the students that it is alright to be sloppy and by extension it is alright if they do not know what they are doing as long as they get the answer!**

One terrible example of sloppiness in most high-school curricula is solving differential equations by "separating variables". Try giving the following to any student:

Solve for $y$ as a function of a real variable $x$ given that the differential equation $\frac{dy}{dx} = 2\sqrt{y}$ holds.

You know what answer to expect, and I hope you know the correct answer. Even Wolfram Alpha gets it wrong. Now for students who give the wrong answer, tell them that it is wrong but do not tell them the correct answer, and ask if they can identify the mistake and fix it. Most will fail to identify the mistake, and fixing the mistake will require the foundation in logic that most students do not have.

Here are the solution sketches for the problems I've given above. I strongly encourage you to check your own work thoroughly, verifying that each step follows logically from the preceding deductions, and to look at these solutions merely to confirm.

**Problem**

Given a function $f:\mathbb{Z}\to\mathbb{R}$ such that $f(0) = 0$ and $f(1) = 1$ and $f(x+1) + 6 f(x-1) = 5 f(x)$ for any $x \in \mathbb{Z}$, prove that $f(x) = 3^x - 2^x$ for any $x \in \mathbb{Z}$.
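
Before attempting a proof, one can sanity-check the claim numerically. The following is a minimal sketch, not a proof: it computes $f$ on a finite range using only the two given values and the recurrence (rearranged to step forwards and backwards), and compares against $3^x - 2^x$ with exact rational arithmetic. The function names and the range bound `10` are arbitrary choices for illustration.

```python
from fractions import Fraction


def closed_form(x: int) -> Fraction:
    """Claimed formula 3^x - 2^x, computed exactly (also for negative x)."""
    return Fraction(3) ** x - Fraction(2) ** x


def values_from_recurrence(n: int) -> dict:
    """Compute f(-n), ..., f(n) from f(0) = 0, f(1) = 1 and the recurrence only."""
    f = {0: Fraction(0), 1: Fraction(1)}
    # Forward direction: f(x+1) = 5 f(x) - 6 f(x-1).
    for x in range(1, n):
        f[x + 1] = 5 * f[x] - 6 * f[x - 1]
    # Backward direction: f(x-1) = (5 f(x) - f(x+1)) / 6.
    for x in range(0, -n, -1):
        f[x - 1] = (5 * f[x] - f[x + 1]) / 6
    return f


values = values_from_recurrence(10)
# Collect every x in the range where the recurrence and the formula disagree.
mismatches = [x for x, v in values.items() if v != closed_form(x)]
print(mismatches)
```

An empty list here only says the formula agrees on the tested range; the statement for all integers still needs the two inductions.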

**Hints**

Induction only allows you to derive something about the natural numbers. The desired theorem is about integers. Also, if you cannot prove the implication needed for the induction, a key technique that often works is to strengthen the induction hypothesis to include enough information so that you can prove the implication step. Of course that also means that the implication you need to prove has changed!

**Solution sketch**

First notice that the theorem to be proven is that $f(x) = 3^x - 2^x$ for all **integers** $x$, and so induction in one direction is not enough! Also, notice that it is impossible to prove that $f(x) = 3^x - 2^x$ implies $f(x+1) = 3^{x+1} - 2^{x+1}$, and hence the induction hypothesis must contain information about at least two 'data points' for $f$. The easiest choice would be to let $P(x)$ be "$f(x) = 3^x - 2^x$ **and** $f(x-1) = 3^{x-1} - 2^{x-1}$". Then one must prove $P(x+1)$, which expands to "$f(x+1) = 3^{x+1} - 2^{x+1}$ **and** $f(x) = 3^x - 2^x$". I would not accept a proof in which the student does not fully prove $P(x+1)$. This would handle the natural numbers, and a similar induction would handle the negative integers. It is of course possible to combine both inductions into one, which should be explored, although in general it is good to keep a proof as modular as possible.
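
For concreteness, the algebra behind the two induction steps can be sketched as follows, assuming $P(x)$ as defined above. In the forward step, the half "$f(x) = 3^x - 2^x$" of $P(x+1)$ is already part of $P(x)$, so only $f(x+1)$ needs computing:

$$\begin{aligned}
f(x+1) &= 5 f(x) - 6 f(x-1) \\
&= 5\,(3^x - 2^x) - 6\,(3^{x-1} - 2^{x-1}) \\
&= (15 - 6) \times 3^{x-1} - (10 - 6) \times 2^{x-1} \\
&= 9 \times 3^{x-1} - 4 \times 2^{x-1} = 3^{x+1} - 2^{x+1}.
\end{aligned}$$

In the backward step, the recurrence with $x$ replaced by $x-1$ reads $f(x) + 6 f(x-2) = 5 f(x-1)$, which gives:

$$\begin{aligned}
f(x-2) &= \frac{5 f(x-1) - f(x)}{6} \\
&= \frac{5\,(3^{x-1} - 2^{x-1}) - (3^x - 2^x)}{6} \\
&= \frac{2 \times 3^{x-1} - 3 \times 2^{x-1}}{6} = 3^{x-2} - 2^{x-2}.
\end{aligned}$$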

**Problem**

Solve for $y$ as a function of a real variable $x$ given that the differential equation $\frac{dy}{dx} = 2\sqrt{y}$ holds.

**Hint**

The answer is not $y = (x+a)^2$, which you would get by the method of separating variables. What went wrong? Note that the error would still be there if you used the theorem that allows change of variables in an integral. Look carefully at each deduction step. One step cannot be justified based on any axiom. Think basic arithmetic. After you get that, you need to consider cases and use the completeness axiom for reals to extend the open intervals on which the standard solution works.

**Solution sketch**

The field axioms only give you a multiplicative inverse of a number when that number is not zero, so the division by $\sqrt{y}$ in the method of separating variables is unjustified wherever $y = 0$. Now how to solve the problem? Split into cases. Note that you need to work on intervals since having isolated points where $y$ is nonzero is useless. First prove that for any point $x$ where $y(x) \ne 0$, there is an open interval around $x$ on which $y \ne 0$. Then we can use the completeness axiom for reals to extend the interval in both directions as far as $y \ne 0$. Now we can use any method to solve for $y$ on that interval. Note that the method of separating variables is formally invalid, so we should use the change of variables substitution. But the prerequisite for that is that $\frac{dy}{dx}$ is continuous, so we need to prove that! Well, $y$ is differentiable and hence continuous, so $2\sqrt{y}$ is continuous. So we get the solution on the extended interval, and it shows that $y$ becomes zero in exactly one direction in this example. Hence after some checking you will get either $y = 0$ or $y = \begin{cases} 0 & \text{if } x \le a \\ (x-a)^2 & \text{if } x > a \end{cases}$ for some real $a$.
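
The difference between the piecewise answer and the full parabola can also be seen numerically. The sketch below is an illustration only and not a substitute for the argument above: it estimates $\frac{dy}{dx}$ by a central difference and compares it with $2\sqrt{y}$ at a few sample points. The helper names, the choice $a = 1$, the step size `h` and the tolerance are all assumptions made for this example.

```python
import math


def y_piecewise(x: float, a: float) -> float:
    """Candidate solution: 0 for x <= a, (x - a)^2 for x > a."""
    return 0.0 if x <= a else (x - a) ** 2


def y_parabola(x: float, a: float) -> float:
    """Full parabola (x - a)^2, the shape that separating variables suggests."""
    return (x - a) ** 2


def residual(y, x: float, a: float, h: float = 1e-6) -> float:
    """Central-difference estimate of y'(x) minus 2*sqrt(y(x))."""
    dydx = (y(x + h, a) - y(x - h, a)) / (2 * h)
    return dydx - 2 * math.sqrt(y(x, a))


a = 1.0
xs = [-2.0, 0.0, 1.0, 1.5, 3.0]  # sample points on both sides of a
piecewise_ok = all(abs(residual(y_piecewise, x, a)) < 1e-4 for x in xs)
parabola_ok = all(abs(residual(y_parabola, x, a)) < 1e-4 for x in xs)
print(piecewise_ok, parabola_ok)
```

The parabola fails at the sample points to the left of $a$, where its derivative is negative while $2\sqrt{y}$ cannot be.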

**Alternative subproof**

In fact, the substitution theorem can be completely avoided as follows. On any interval $I$ where $y \ne 0$, we have $y'^2 = 4y$, where "${}'$" denotes the derivative with respect to $x$. Thus $(y'^2)' = (4y)'$, which gives $2y'y'' = 4y'$, and hence $y'' = 2$ since $y' = 2\sqrt{y} \ne 0$. Thus $y' = 2x+c$ on $I$ for some real $c$, and hence $y = x^2+cx+d$ on $I$ for some real $d$. Note that most of the above steps are not reversible and hence we need to check all the solutions we finally obtain with the original differential equation. We would get $c^2 = 4d$. After simple manipulation we obtain the same result for $y$ on $I$ as in the other solution. The other parts of the solution still need to be there.
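
The check against the original equation can be sketched as follows. Substituting $y = x^2+cx+d$ and $y' = 2x+c$ into $y'^2 = 4y$ on $I$ gives

$$\begin{aligned}
(2x+c)^2 = 4\,(x^2+cx+d) &\iff 4x^2 + 4cx + c^2 = 4x^2 + 4cx + 4d \\
&\iff c^2 = 4d,
\end{aligned}$$

so $y = x^2 + cx + \frac{c^2}{4} = \left(x + \frac{c}{2}\right)^2$ on $I$. Moreover $y' = 2\sqrt{y} > 0$ on $I$ forces $2x + c > 0$, that is, $x > -\frac{c}{2}$. Writing $a = -\frac{c}{2}$ gives $y = (x-a)^2$ with $x > a$ on $I$.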

@ChrisA -- I apologize for any perceived naivete, but that seems barbaric. Mathematics is a mental game. Associating things like nonlinearity with *WHACK!* does not seem like an effective way to help them in any way. – Robin Goodfellow – 2015-01-01T17:51:24.747

@Robin - worry not, I'm not a serial abuser of students!! An example would be, we're in a tutorial session, the student is a little disengaged and says something daft like "3x3=6", when (s)he knows perfectly well that it isn't. A gentle tap with the learning tool is a humorous and quick way of getting them to recognise that (s)he wasn't actually thinking. Mostly I only need to pick it up and wave it gently... The learning tool is never applied in cases of actual non-understanding. – ChrisA – 2015-01-02T18:51:42.503

I taught my Calculus I students: "if you think you have an identity like $\sqrt{x + y} = \sqrt x + \sqrt y$, then try plugging in specific numbers. If they don't come out equal, it's not really an identity." On the exam, one student *actually* wrote: "I want to simplify $\sqrt{x^2 + y^2}$. Try $x = 2$ and $y = 3$: $\sqrt{4 + 9} = 3.605$, but $\sqrt4 + \sqrt9 = 5$." On the next line, he continued: "$\sqrt{x^2 + y^2} = x + y$". (I pass otherwise silently over the issue of believing that $\sqrt{4 + 9} = 3.605$.) – LSpice – 2015-04-02T15:47:45.397

Behavioral studies tell me that negative reinforcement always works better... so what about telling them that "you are right, but in mod-2 arithmetic: $a^2 + b^2 = (a+b)^2 \pmod 2$"? – wilsonw – 2014-03-29T07:10:39.897

Related post on math.SE: Pedagogy: How to cure students of the “law of universal linearity”? – Martin – 2015-05-18T07:44:14.480

I often see students do $\sqrt{x+y}=\sqrt{x}+\sqrt{y}$. I point out to them that if this were true, the Pythagorean theorem would simplify from $a^2+b^2=c^2$ to $a+b=c$. This seems to work for that particular mistake, but I don't think it fixes the underlying problem, which is probably some combination of lack of technical skill and lack of discipline in reasoning according to well-defined principles. – Ben Crowell – 2014-03-29T22:09:51.350

Linearity/distributivity, when first introduced, may appear to be a property of *brackets*, rather than a property of multiplication, because the multiplication sign is suppressed. This doesn't cover all cases obviously, but this is an explanation I've gotten for attempts at $(x+y)^{1/2}$ or $\sin(a+b)$ from various students. "Can't you just do that when there are brackets?" Nope, and this should be strongly emphasized early on. – Robert Mastragostino – 2014-03-30T07:22:20.600

@RobinGoodfellow While it might not be good to associate nonlinearity with WHACK! I think it would help a lot of students if they associated "I'll just say random stuff and hope it's true" with WHACK! – DRF – 2017-03-22T13:27:37.033

My suspicion of where this comes from is that most students are intellectually lazy and when they are not sure what to do, they guess the simplest possible rule. In some sense, this rule of thumb is not too bad -- often the simplest possible rule *is* the correct rule. Linearity only appears because it is generally the simplest possible rule available. Alas, students often lack the intellectual energy to make the effort to check the validity of their invented rule before proceeding on with it, which leads them astray in the examples cited. – Michael Joyce – 2017-06-28T20:46:55.350

I had a colleague who was so happy with a technique he showed me... he drew little hearts over the parentheses to remind students about distribution. Then I asked, "Do they know what *operations* that's appropriate for?" and he was at a loss for a response. One problem is that our institution-mandated final exam only tests on things that do distribute, and most instructors only teach those specific manipulations on the final. – Daniel R. Collins – 2017-07-18T04:34:54.610

I think a better word choice is "everything commutes". – TheBluegrassMathematician – 2014-04-28T18:09:45.057

The "everything has a linear/proportional effect" is much, much more widespread. – vonbrand – 2014-04-29T01:28:51.640

Thank god I don't actually teach in a school... fortunately I have a little more freedom with my private students (with full permission of the parents) to apply a device I call the Learning Tool. It's a little pencil that I whack my students over the head with if they do things like this. Seriously, it works. I usually have to just put the learning tool on the table and they focus their thoughts instantly. – ChrisA – 2014-05-08T17:04:16.850