Category Archives: pure mathematics

The infinite hotel paradox: due to Jeff Dekofsky

This hardcore stuff about “infinity”, quite nicely explained, was pointed out to me by my ISC XII student, Mr. Utkarsh Malhotra! :-)


Genius of Srinivasa Ramanujan

  1. In December 1914, Ramanujan was asked by his friend P.C. Mahalanobis to solve a puzzle that appeared in Strand magazine as “Puzzles at a Village Inn”. The puzzle stated that n houses on one side of the street are numbered sequentially starting from 1. The sum of the house numbers on the left of a particular house having the number m equals that of the houses on the right of this particular house. It is given that n lies between 50 and 500, and one has to determine the values of m and n. Ramanujan immediately rattled out a continued fraction generating all possible values of m without any restriction on the values of n. List the first five values of m and n.
  2. Ramanujan had posed the following problem in a journal: \sqrt{1+2\sqrt{1+3\sqrt{1+\ldots}}}=x, find x. Without receiving an answer from the readers, after three months he gave the answer as 3. This he could say because he had an earlier general result stating that 1+x=\sqrt{1+x\sqrt{1+(x+1)\sqrt{1+(x+2)\sqrt{1+\ldots}}}} is true for all x. Prove this result; then x=2 will give the answer to Ramanujan’s problem.
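For readers who want to check their answer to the first puzzle by machine, here is a minimal brute-force sketch in Python. It is not Ramanujan's continued-fraction method. It only uses the observation that 1+2+\ldots+(m-1)=(m+1)+\ldots+n is equivalent to m^{2}=n(n+1)/2. The function name house_pairs is my own, chosen for illustration.

```python
from math import isqrt

def house_pairs(count):
    """Return the first `count` pairs (m, n) with 1+...+(m-1) == (m+1)+...+n."""
    pairs = []
    n = 1
    while len(pairs) < count:
        total = n * (n + 1) // 2      # 1 + 2 + ... + n
        m = isqrt(total)              # candidate house number
        if m * m == total:            # condition rewritten as m^2 = n(n+1)/2
            pairs.append((m, n))
        n += 1
    return pairs

print(house_pairs(5))
```

Only one of the printed pairs has n between 50 and 500, which is the answer the magazine puzzle asked for.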

Try try until you succeed!!

Nalin Pithwa.

Limits that arise frequently

We continue our presentation of basic stuff from Calculus and Analytic Geometry, G B Thomas and Finney, Ninth Edition. My express purpose in presenting these few proofs is to emphasize that Calculus is not just a recipe of calculation techniques. Or, going a bit further, math is not just about calculation. I have a feeling that such thinking, nurtured at a young age (while preparing for IITJEE Math, for example), makes one razor sharp.

We verify a few famous limits.

Formula 1:

If |x|<1, \lim_{n \rightarrow \infty}x^{n}=0

We need to show that to each \epsilon >0 there corresponds an integer N so large that |x^{n}|<\epsilon for all n greater than N. Since \epsilon^{1/n}\rightarrow 1 while |x|<1, there exists an integer N for which \epsilon^{1/N}>|x|. In other words,

|x^{N}|=|x|^{N}<\epsilon. Call this (I).

This is the integer we seek because, if |x|<1, then

|x^{n}| \leq |x^{N}| for all n>N. Call this (II).

Combining I and II produces |x^{n}|<\epsilon for all n>N, concluding the proof.
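To make the role of N concrete, here is a small Python sketch that searches for the smallest N with |x|^{N}<\epsilon. The values x=0.9 and \epsilon=10^{-3} are arbitrary choices of mine, and the function name smallest_N is hypothetical.

```python
def smallest_N(x, eps):
    """Smallest positive integer N with |x|**N < eps, assuming |x| < 1 and eps > 0."""
    N = 1
    while abs(x) ** N >= eps:
        N += 1
    return N

x, eps = 0.9, 1e-3
N = smallest_N(x, eps)
print(N, abs(x) ** N)

# Every later power stays below eps, as step (II) of the proof says.
print(all(abs(x) ** n < eps for n in range(N, N + 100)))
```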

Formula 2:

For any number x, \lim_{n \rightarrow \infty}(1+\frac{x}{n})^{n}=e^{x}.

Let a_{n}=(1+\frac{x}{n})^{n}. Then, \ln {a_{n}}=\ln{(1+\frac{x}{n})^{n}}=n\ln{(1+\frac{x}{n})}\rightarrow x,

as we can see by the following application of l’Hopital’s rule, in which we differentiate with respect to n:

\lim_{n \rightarrow \infty}n\ln{(1+\frac{x}{n})}=\lim_{n \rightarrow \infty}\frac{\ln{(1+x/n)}}{1/n}, which in turn equals

\lim_{n \rightarrow \infty}\frac{(\frac{1}{1+x/n}).(-\frac{x}{n^{2}})}{-1/n^{2}}=\lim_{n \rightarrow \infty}\frac{x}{1+x/n}=x.

Now, let us apply the following theorem with f(x)=e^{x} to the above:

(a theorem for calculating limits of sequences) the continuous function theorem for sequences:

Let \{a_{n}\} be a sequence of real numbers. If a_{n} \rightarrow L and if f is a function that is continuous at L and defined at all a_{n}, then f(a_{n}) \rightarrow f(L).

So, in this particular proof, we get the following:

(1+\frac{x}{n})^{n}=a_{n}=e^{\ln{a_{n}}}\rightarrow e^{x}.
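A quick numerical sanity check of Formula 2 can be sketched in Python as below. The choice x=2 and the sample values of n are mine. The second printed column shows n\ln(1+x/n) approaching x, which is the step handled by l’Hopital’s rule above.

```python
import math

def a(n, x):
    """The n-th term (1 + x/n)^n of the sequence in Formula 2."""
    return (1 + x / n) ** n

x = 2.0
for n in (10, 1000, 100000):
    print(n, n * math.log(1 + x / n), a(n, x))
print("e^x =", math.exp(x))
```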

Formula 3:

For any number x, \lim_{n \rightarrow \infty}\frac{x^{n}}{n!}=0

Since -\frac{|x|^{n}}{n!} \leq \frac{x^{n}}{n!} \leq \frac{|x|^{n}}{n!},

all we need to show is that \frac{|x|^{n}}{n!} \rightarrow 0. We can then apply the Sandwich Theorem for Sequences (Let \{a_{n}\}, \{b_{n}\} and \{c_{n}\} be sequences of real numbers. If a_{n}\leq b_{n}\leq c_{n} holds for all n beyond some index N, and if \lim_{n\rightarrow \infty}a_{n}=\lim_{n\rightarrow \infty}c_{n}=L, then \lim_{n\rightarrow \infty}b_{n}=L also) to conclude that \frac{x^{n}}{n!} \rightarrow 0.

The first step in showing that |x|^{n}/n! \rightarrow 0 is to choose an integer M>|x|, so that (|x|/M)<1. Now, let us use the rule (Formula 1, mentioned above) to conclude that (|x|/M)^{n}\rightarrow 0. We then restrict our attention to values of n>M. For these values of n, we can write:

\frac{|x|^{n}}{n!}=\frac{|x|^{n}}{1.2 \ldots M.(M+1)(M+2)\ldots n}, where there are (n-M) factors in the expression (M+1)(M+2)\ldots n, and

the RHS in the above expression is \leq \frac{|x|^{n}}{M!M^{n-M}}=\frac{|x|^{n}M^{M}}{M!M^{n}}=\frac{M^{M}}{M!}(\frac{|x|}{M})^{n}. Thus,

0\leq \frac{|x|^{n}}{n!}\leq \frac{M^{M}}{M!}(\frac{|x|}{M})^{n}. Now, the constant \frac{M^{M}}{M!} does not change as n increases. Thus, the Sandwich theorem tells us that \frac{|x|^{n}}{n!} \rightarrow 0 because (\frac{|x|}{M})^{n}\rightarrow 0.
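The bound used in this proof can be checked numerically. The Python sketch below compares \frac{|x|^{n}}{n!} with \frac{M^{M}}{M!}(\frac{|x|}{M})^{n} for the sample value x=5.5 (my choice), taking M to be the smallest integer larger than |x|. The helper name term_and_bound is hypothetical.

```python
from math import factorial, floor

def term_and_bound(x, n):
    """Return (|x|^n / n!, (M^M / M!) * (|x|/M)^n) with M the smallest integer > |x|."""
    M = floor(abs(x)) + 1
    term = abs(x) ** n / factorial(n)
    bound = (M ** M / factorial(M)) * (abs(x) / M) ** n
    return term, bound

for n in (10, 20, 40):
    print(n, *term_and_bound(5.5, n))
```

Both columns shrink as n grows, with the term always below the bound once n>M.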

That’s all, folks !!


Nalin Pithwa.

Cauchy’s Mean Value Theorem and the Stronger Form of l’Hopital’s Rule

Reference: Thomas, Finney, 9th edition, Calculus and Analytic Geometry.

Continuing our previous discussion of “theoretical” calculus or “rigorous” calculus, I am reproducing below the proof of the finite limit case of the stronger form of l’Hopital’s Rule :

L’Hopital’s Rule (Stronger Form):

Suppose that

f(x_{0})=g(x_{0})=0

and that the functions f and g are both differentiable on an open interval (a,b) that contains the point x_{0}. Suppose also that g^{'} \neq 0 at every point in (a,b) except possibly at x_{0}. Then,

\lim_{x \rightarrow x_{0}}\frac{f(x)}{g(x)}=\lim_{x \rightarrow x_{0}}\frac{f^{'}(x)}{g^{'}(x)} ….call this equation I,

provided the limit on the right exists.

The proof of the stronger form of l’Hopital’s Rule is based on Cauchy’s Mean Value Theorem, a mean value theorem that involves two functions instead of one. We prove Cauchy’s theorem first and then show how it leads to l’Hopital’s Rule.

Cauchy’s Mean Value Theorem:

Suppose that the functions f and g are continuous on [a,b] and differentiable throughout (a,b) and suppose also that g^{'} \neq 0 throughout (a,b). Then there exists a number c in (a,b) at which

\frac{f^{'}(c)}{g^{'}(c)} = \frac{f(b)-f(a)}{g(b)-g(a)}…call this II.

The ordinary Mean Value Theorem is the case where g(x)=x.

Proof of Cauchy’s Mean Value Theorem:

We apply the Mean Value Theorem twice. First we use it to show that g(a) \neq g(b). For if g(b) did equal g(a), then the Mean Value Theorem would give:

g^{'}(c)=\frac{g(b)-g(a)}{b-a}=0 for some c between a and b. This cannot happen because g^{'}(x) \neq 0 in (a,b).

We next apply the Mean Value Theorem to the function:

F(x) = f(x)-f(a)-\frac{f(b)-f(a)}{g(b)-g(a)}[g(x)-g(a)].

This function is continuous and differentiable where f and g are, and F(b) = F(a)=0. Therefore, there is a number c between a and b for which F^{'}(c)=0. In terms of f and g, this says:

F^{'}(c) = f^{'}(c)-\frac{f(b)-f(a)}{g(b)-g(a)}[g^{'}(c)]=0, or

\frac{f^{'}(c)}{g^{'}(c)}=\frac{f(b)-f(a)}{g(b)-g(a)}, which is II above. QED.
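As an illustration of equation II, the following Python sketch locates the point c numerically for one pair of functions. The functions f(x)=x^{3}, g(x)=x^{2} and the interval [1,2] are examples of my own choosing, and the bisection helper is a generic root finder, not part of the proof. For this example, c can also be found by hand: 3c/2=7/3 gives c=14/9.

```python
def bisect(h, lo, hi, tol=1e-12):
    """Root of h on [lo, hi] by bisection; assumes h(lo) and h(hi) differ in sign."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Hypothetical example functions on [a, b] = [1, 2]
a, b = 1.0, 2.0
f = lambda x: x ** 3
g = lambda x: x ** 2
df = lambda x: 3 * x ** 2    # f'
dg = lambda x: 2 * x         # g', nonzero throughout (1, 2)

ratio = (f(b) - f(a)) / (g(b) - g(a))    # right-hand side of II
dF = lambda x: df(x) - ratio * dg(x)     # F'(x) for the auxiliary function F
c = bisect(dF, a, b)
print(c, df(c) / dg(c), ratio)
```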

Proof of the Stronger Form of l’Hopital’s Rule:

We first prove I for the case x \rightarrow x_{0}^{+}. The method needs no change to apply to x \rightarrow x_{0}^{-}, and the combination of those two cases establishes the result.

Suppose that x lies to the right of x_{0}. Then, g^{'}(x) \neq 0 and we can apply Cauchy’s Mean Value Theorem to the closed interval from x_{0} to x. This produces a number c between x_{0} and x such that \frac{f^{'}(c)}{g^{'}(c)}=\frac{f(x)-f(x_{0})}{g(x)-g(x_{0})}.

But, f(x_{0})=g(x_{0})=0 so that \frac{f^{'}(c)}{g^{'}(c)}=\frac{f(x)}{g(x)}.

As x approaches x_{0}, c approaches x_{0} because it lies between x and x_{0}. Therefore, \lim_{x \rightarrow x_{0}^{+}}\frac{f(x)}{g(x)}=\lim_{x \rightarrow x_{0}^{+}}\frac{f^{'}(c)}{g^{'}(c)}=\lim_{x \rightarrow x_{0}^{+}}\frac{f^{'}(x)}{g^{'}(x)}.

This establishes l’Hopital’s Rule for the case where x approaches x_{0} from above. The case where x approaches x_{0} from below is proved by applying Cauchy’s Mean Value Theorem to the closed interval [x,x_{0}], where x< x_{0}. QED.
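A numerical illustration of equation I, sketched in Python for one example of my own choosing: f(x)=e^{x}-1 and g(x)=\sin{x} at x_{0}=0, where f(0)=g(0)=0 and g^{'}(x)=\cos{x} is nonzero near 0. Both printed ratios approach the same value as x approaches 0 from either side.

```python
import math

f = lambda x: math.exp(x) - 1.0   # f(0) = 0
g = lambda x: math.sin(x)         # g(0) = 0
df = lambda x: math.exp(x)        # f'
dg = lambda x: math.cos(x)        # g', nonzero near 0

for x in (0.1, 0.01, 0.001, -0.001):
    print(x, f(x) / g(x), df(x) / dg(x))
```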

The Sandwich Theorem or Squeeze Play Theorem

It helps to think about the core concepts of Calculus from a young age, if you want to develop your expertise or talents further in math, pure or applied, engineering or mathematical sciences. At a tangible level, it helps you attack more or many questions of the IIT JEE Advanced Mathematics. Let us see if you like the following proof, or can absorb/digest it:

Reference: Calculus and Analytic Geometry by Thomas and Finney, 9th edition.

The Sandwich Theorem:

Suppose that g(x) \leq f(x) \leq h(x) for all x in some open interval containing c, except possibly at x=c itself. Suppose also that \lim_{x \rightarrow c}g(x)= \lim_{x \rightarrow c}h(x)=L. Then, \lim_{x \rightarrow c}f(x)=L.

Proof for Right Hand Limits:

Suppose \lim_{x \rightarrow c^{+}}g(x)=\lim_{x \rightarrow c^{+}}h(x)=L. Then, for any \epsilon >0, there exists a \delta >0 such that for all x, the inequality c<x<c+\delta implies L-\epsilon<g(x)<L+\epsilon and L-\epsilon<h(x)<L+\epsilon ….call this (I)

These inequalities combine with the inequality g(x) \leq f(x) \leq h(x) to give

L-\epsilon <g(x) \leq f(x) \leq h(x)<L+\epsilon

L-\epsilon <f(x)<L+\epsilon

-\epsilon <f(x)-L<\epsilon….call this (II)

Therefore, for all x, the inequality c<x<c+\delta implies |f(x)-L|<\epsilon. …call this (III)

Proof for Left-Hand Limits:

Suppose \lim_{x \rightarrow c^{-}} g(x)=\lim_{x \rightarrow c^{-}}h(x)=L. Then, for any \epsilon >0 there exists a \delta >0 such that for all x, the inequality c-\delta <x<c implies L-\epsilon<g(x)<L+\epsilon and L-\epsilon<h(x)<L+\epsilon …call this (IV).

We conclude as before that for all x, c-\delta <x<c implies |f(x)-L|<\epsilon.

Proof for Two-Sided Limits:

If \lim_{x \rightarrow c}g(x) = \lim_{x \rightarrow c}h(x)=L, then g(x) and h(x) both approach L as x \rightarrow c^{+} and as x \rightarrow c^{-} so \lim_{x \rightarrow c^{+}}f(x)=L and \lim_{x \rightarrow c^{-}}f(x)=L. Hence, \lim_{x \rightarrow c}f(x)=L. QED.
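A standard instance of the theorem is f(x)=x^{2}\sin{(1/x)} squeezed between g(x)=-x^{2} and h(x)=x^{2} at c=0, with L=0. This example is mine, not from the text above. The Python sketch below checks the inequality at a few sample points and shows all three values shrinking towards 0.

```python
import math

g = lambda x: -x * x
f = lambda x: x * x * math.sin(1.0 / x)   # undefined at x = 0, which the theorem allows
h = lambda x: x * x

for x in (0.1, -0.01, 0.001, -0.0001):
    print(x, g(x), f(x), h(x), g(x) <= f(x) <= h(x))
```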

Let me know your feedback on such stuff,

Nalin Pithwa

Mobius and his band

There are some pieces of mathematical folklore that you really should be reminded about, even though they are “well-known” — just in case. An excellent example is the Mobius band.

August Ferdinand Mobius was a German mathematician, born 1790, died 1868. He worked in several areas of mathematics, including geometry, complex analysis and number theory. He is famous for his curious surface, the Mobius band. You can make a Mobius band by taking a strip of paper, say 2 cm wide and 20 cm long, bending it around until the ends meet, then twisting one end through 180 degrees, and finally, gluing the ends together. For comparison, make a cylinder in the same way, omitting the twist.

The Mobius band is famous for one surprising feature: it has only one side. If an ant crawls around a cylindrical band, it can cover only half the surface — one side of the band. But, if an ant crawls around on the Mobius band, it can cover the entire surface. The Mobius band has only one side.

You can check these statements by painting the band. You can paint the cylinder so that one side is red and the other is blue, and the two sides are completely distinct, even though they are separated by only the thickness of the paper. But, if you start to paint the Mobius band red, and keep going until you run out of band to paint, the whole thing ends up red.

In retrospect, this is not such a surprise, because the 180 degrees twist connects each side of the original paper strip to the other. If you don’t twist before gluing, the two sides stay separate. But, until Mobius (and a few others) thought this one up, mathematicians didn’t appreciate that there are two distinct kinds of surface: those with two sides and those with one side only. This turned out to be important in topology. And, it showed how careful you have to be about making “obvious” assumptions.

There are lots of Mobius band recreations. Below are three of them:

  • If you cut the cylindrical band along the middle with scissors, it falls apart into two cylindrical bands. What happens if you try this with a Mobius band?
  • Repeat, but this time, make the cut about one-third of the way across the width of the band. Now, what happens to the cylinder and to the band?
  • Make a band like a Mobius band but with a 360 degrees twist. How many sides does it have? What happens if you cut it along the middle?

The Mobius band is also known as a Mobius strip, but this can lead to misunderstandings, as in a limerick written by the science fiction author Cyril Kornbluth:

A burleycue dancer, a pip

Named Virginia, could peel in a zip,

But she read science fiction

and died of constriction

Attempting a Mobius strip.

A more politically correct Mobius limerick, which gives away one of the answers, is:

A mathematician confided,

That a Mobius strip  is one-sided,

You’ll get quite a laugh

if you cut it in half,

For it stays in one piece when divided.

Ref: Professor Stewart’s Cabinet of Mathematical Curiosities.

Note: There are lots of interesting properties of the Mobius strip, which you can explore. There is a lot of recreational and pure mathematics literature on it. Kindly Google it. Perhaps, if you explore well, you might discover your hidden talents for one of the richest areas of mathematics: topology. Topology is a foundation for Differential Geometry, which was used by Albert Einstein for his general theory of relativity. Of course, there are other applications too…:-)

— Nalin Pithwa.








Number theory has numerous uses

One of the fun ways to get started in mathematics at an early age is via number theory. It does not require deep, esoteric knowledge of concepts of mathematics to get started, but as you explore and experiment, you will learn a lot, and you will also have a ball writing programs in basic number theory. One of the best references I have come across is “A Friendly Introduction to Number Theory” by Dr. Joseph Silverman. It is available on Amazon India.

Well, number theory is not just pure math; as we all know, it is the very core of cryptography and security in a world transforming itself to a totally digital commerce amongst other rapid changes. Witness, for example, the current intense debate about opening up an iPhone (Apple vs. FBI) and some time back, there was the problem with AES Encrypted Blackberry messaging services in India.

Number theory is also used in Digital Signal Processing, the way to filter out unwanted “noise” from an information signal or “communications signal.” Digital Signal Processing is at the heart of modem technology without which we would not be able to have any real computer networks.

There was a time when G H Hardy could claim that number theory is the purest of all sciences, as it is untouched by human desire. Not any more !!!

Can you imagine a world without numbers ?? That reminds me of a famous quote: “God created the natural numbers, all the rest is man-made.” (Kronecker).

More later,

Nalin Pithwa

Careers in Mathematics

Most people believe that the only career possible with a degree in Mathematics is that of a teacher, a lecturer or a professor. Thanks to the co-founder(s) of Google, whose database search engine is based on the Perron-Frobenius Theorem, this notion is changing.

In particular, you might want to have a detailed look at the website of Australian mathematics/mathematicians —–

I will cull more such stuff and post in this blog later…


Nalin Pithwa


We have seen how the concept of continuity is naturally associated with attempts to model gradual changes. For example, consider the function f: \Re \rightarrow \Re given by f(x)=ax+b, where change in f(x) is proportional to the change in x. This simple looking function is often used to model many practical problems. One such case is given below:

Suppose 30 men working for 7 hours a day can complete a piece of work in 16 days. In how many days can 28 men working for 6 hours a day complete the work? It must be evident to most of the readers that the answer is \frac{16 \times 7 \times 30}{28 \times 6}=20 days.

(While solving this we have tacitly assumed that the amount of work done is proportional to the number of men working, to the number of hours each man works per day, and also to the number of days each man works. Similarly, Charles’s law for ideal gases states that, pressure remaining constant, the increase in volume of a mass of gas is proportional to the increase in temperature of the gas.)
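The proportionality assumption in the men-and-days example amounts to saying that the job needs a fixed number of man-hours. A minimal Python sketch of that arithmetic follows. The function name days_needed is hypothetical, and Fraction is used only to keep the division exact.

```python
from fractions import Fraction

def days_needed(men, hours_per_day, total_man_hours):
    """Days required if work done is proportional to men x hours x days."""
    return Fraction(total_man_hours, men * hours_per_day)

total = 30 * 7 * 16                 # man-hours in the whole job
print(days_needed(28, 6, total))    # prints 20
```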

But, there are exceptions to this as well. Galileo discovered that the distance covered by a body, falling from rest, is proportional to the square of the time for which it has fallen, and the velocity is proportional to the square root of the distance through which it has fallen. Similarly, Kepler’s law tells us that the square of the period of the planet going round the sun is proportional to the cube of the mean distance from the sun.

These and many other problems involve functions that are not linear. If for example we plot the graph of the distance covered by a particle versus time, it is a straight line only when the motion is uniform. But, we are seldom lucky to encounter only uniform motion. (Besides, uniform motion would be so monotonous. Perhaps, there would be no life at all if all motions were uniform. Imagine a situation in which each body is in uniform motion. A body at rest would be eternally at rest and those once in motion would never stop.) So the simple method of proportionality becomes quite inadequate to tackle such non-linear problems. The genius of Newton lay in looking at those problems which are next best to linear, the ones that are nearly linear.

We know that the graph of a linear function is a straight line. What Newton suggested was to look at functions, small portions of whose graphs look almost like a straight line (see Fig 1).

In Fig 1, the graph certainly is not a straight line. But a small portion of it looks like a straight line. To formalize this idea, we need the concept of differentiability.


Let I be an open interval and f: I \rightarrow \Re be a function. We say that f is locally linear or differentiable at x_{0} \in I if there is a constant m such that

\lim_{x \rightarrow x_{0}}\frac{f(x)-f(x_{0})-m(x-x_{0})}{x-x_{0}}=0

or equivalently, for x in a punctured interval around x_{0},

f(x)-f(x_{0})=m(x-x_{0})+(x-x_{0})r(x_{0},x)

where r(x_{0},x) \rightarrow 0 as x \rightarrow x_{0}.

What this means is that for small enough x-x_{0}, \frac{f(x)-f(x_{0})}{x-x_{0}} is nearly a constant or, equivalently, f(x)-f(x_{0}) is nearly proportional to the increment x-x_{0}. This is what is called the principle of proportional parts, used very often in calculations with tables when the number we are looking up is not found there.

Thus, if a function f is differentiable at x_{0}, then \lim_{x \rightarrow x_{0}}\frac{f(x)-f(x_{0})}{x-x_{0}}

exists and is called the derivative of f at x_{0} and denoted by f^{'}(x_{0}). So we write

\lim_{x \rightarrow x_{0}}\frac{f(x)-f(x_{0})}{x-x_{0}}=f^{'}(x_{0}).

We need to look at functions which are not differentiable at some point, to fix our ideas. For example, consider the function f: \Re \rightarrow \Re defined by f(x)=|x|.

This function, though continuous at every point, is not differentiable at x=0. In fact, \lim_{x \rightarrow 0^{+}}\frac{|x|}{x}=1 while \lim_{x \rightarrow 0^{-}}\frac{|x|}{x}=-1. What all this means is that if one looks at the graph of f(x)=|x|, it has a sharp corner at the origin.

No matter how small a part of the graph containing the point (0,0) is taken, it never looks like a line segment. The reader can test for the non-differentiability of f(x)=|\sin{x}| at x=n\pi.
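The sharp corner of f(x)=|x| can be seen numerically by computing one-sided difference quotients at 0, as in this small Python sketch. The helper name difference_quotient and the step sizes are my own choices. The quotients from the right and from the left settle on two different values, so no single slope m can work.

```python
def difference_quotient(f, x0, h):
    """Slope of the chord from (x0, f(x0)) to (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h

steps = (0.1, 0.01, 0.001)
right = [difference_quotient(abs, 0.0, h) for h in steps]    # h > 0
left = [difference_quotient(abs, 0.0, -h) for h in steps]    # h < 0
print("from the right:", right)
print("from the left: ", left)
```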

This leads us to the notion of the direction of the graph at a point: Suppose f: I \rightarrow \Re is a function differentiable at x_{0} \in I, and let P and Q be the points (x_{0},f(x_{0})) and (x, f(x)) respectively in the graph of f. (see Fig 2).

The chord PQ has the slope \frac{f(x)-f(x_{0})}{x-x_{0}}. As x comes close to x_{0}, the chord tends to the tangent to the curve at (x_{0}, f(x_{0})). So, \lim_{x \rightarrow x_{0}} \frac{f(x)-f(x_{0})}{x-x_{0}} really represents the slope of the tangent at (x_{0},f(x_{0})) (see Fig 3).

Similarly, if x(t) is the position of a moving point in a straight line at time t, then \frac{x(t)-x(t_{0})}{t-t_{0}} is its average velocity in the interval of time [t_{0},t]. Its limit as t goes to t_{0}, if it exists, will be its instantaneous velocity at the instant of time t_{0}. We have

x^{'}(t_{0})=\lim_{t \rightarrow t_{0}}\frac{x(t)-x(t_{0})}{t-t_{0}} is the instantaneous velocity at t_{0}.

If the limit of \frac{f(x)-f(x_{0})}{x-x_{0}} does not exist as x tends to x_{0}, the curve (x, f(x)) cannot have a tangent at (x_{0},f(x_{0})), as we saw in the case of f(x)=|x| at (0,0); the graph abruptly changes its direction. If we look at the motion of a particle which is moving with uniform velocity till time t_{0} and is abruptly brought to rest at that instant, then its graph would look as in Fig 4a.

This is also what we think happens when a perfectly elastic ball impinges on another ball of the same mass at rest, or  when a perfectly elastic ball moving at a constant speed impinges on a hard surface (see fig 4b). We see that there is a sharp turn in the space time graph of such a motion at time t=t_{0}. Recalling the interpretation of

x^{'}(t_{0})=\lim_{t \rightarrow t_{0}} \frac{x(t)-x(t_{0})}{t-t_{0}} as its instantaneous velocity at t=t_{0}, we see that in the situation described above, instantaneous velocity at t=t_{0} is not a meaningful concept.

We have already seen that continuous functions need not be differentiable at some points of their domain. Actually, there are continuous functions which are not differentiable anywhere. On the other hand, as the following result shows, every differentiable function is continuous.


Theorem:

If a function is differentiable at x_{0}, then it is continuous there.

Proof:

If f is differentiable at x_{0}, then let \lim_{x \rightarrow x_{0}} \frac{f(x)-f(x_{0})}{x-x_{0}}=l. Setting

r(x,x_{0})=\frac{f(x)-f(x_{0})}{x-x_{0}}-l, we see that \lim_{x \rightarrow x_{0}}r(x, x_{0})=0. Thus, we have

f(x)-f(x_{0})=(x-x_{0})l + (x-x_{0})r(x,x_{0})

Now, \lim_{x \rightarrow x_{0}} (f(x)-f(x_{0}))=\lim_{x \rightarrow x_{0}}(x-x_{0})l + \lim_{x \rightarrow x_{0}} (x-x_{0})r(x, x_{0})=0

This shows that f is continuous at x_{0}.


Continuity of f at x_{0} tells us that f(x)-f(x_{0}) tends to zero as x - x_{0} tends to zero. But, in the case of differentiability, f(x)-f(x_{0}) tends to zero at least as fast as x-x_{0}. The portion l(x-x_{0}) goes to zero no doubt, but the remainder |f(x)-f(x_{0})-l(x-x_{0})| goes to zero at a rate faster than that of |x-x_{0}|. This is how differentiation was conceived by Newton and Leibniz. They introduced a concept called an infinitesimal. Their idea was that when x-x_{0} is an infinitesimal, then so is f(x)-f(x_{0}), which is of the same order of infinitesimal as x-x_{0}. The idea of infinitesimals served them well but had a little problem in its definition: infinitesimals, as they were introduced, seemed to run against the Archimedean property. The definition of infinitesimals can be made rigorous, but we do not go into it here. However, we can still usefully deal with concepts and notation like:

(a) f(x)=\mathcal{O}(g(x)) as x \rightarrow x_{0} if there exists a K such that |f(x)| \leq K|g(x)| for x sufficiently near x_{0}.

(b) f(x)=o(g(x)) as x \rightarrow x_{0} if \lim_{x \rightarrow x_{0}}\frac{f(x)}{g(x)}=0.

Informally, f(x)=o(g(x)) means f(x) is of smaller order than g(x) as x \rightarrow x_{0}. In this notation, f is differentiable at x_{0} if there is an l such that

f(x)-f(x_{0})-l(x-x_{0})=o(x-x_{0}) as x \rightarrow x_{0}.
We shall return to this point again. Let us first give examples of derivatives of some functions.


(The proofs are left as exercises.)

(a) f(x)=x^{n}, f^{'}(x_{0})=\lim_{x \rightarrow x_{0}}\frac{x^{n}-{x_{0}}^{n}}{x-x_{0}}=n{x_{0}}^{n-1}, n a positive integer.

(b) f(x)=x^{n} (x \neq 0, where n is a negative integer), f^{'}(x)=nx^{n-1}

(c) f(x)=e^{x}, f^{'}(x)=e^{x}

(d) f(x)=a^{x}, f^{'}(x)=a^{x}\log_{e}{a}
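While the proofs are exercises, the four formulas can at least be sanity-checked numerically. The Python sketch below compares a central difference quotient with each stated derivative at the sample point x_{0}=1.5, with a=2 in (d). The exponents 5 and -2, the step h and the helper name central_difference are all my own choices for illustration.

```python
import math

def central_difference(f, x0, h=1e-6):
    """Symmetric difference quotient approximating f'(x0)."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

x0, a = 1.5, 2.0
cases = {
    "x^5": (lambda x: x ** 5, 5 * x0 ** 4),              # (a) with n = 5
    "x^-2": (lambda x: x ** -2, -2 * x0 ** -3),           # (b) with n = -2
    "e^x": (math.exp, math.exp(x0)),                      # (c)
    "a^x": (lambda x: a ** x, a ** x0 * math.log(a)),     # (d) with a = 2
}
for name, (func, exact) in cases.items():
    print(name, central_difference(func, x0), exact)
```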

Boundedness of a Continuous Function

Suppose f:I \rightarrow \Re is a continuous function (where I is an interval). Now, for every x_{0} \in I and \varepsilon>0, we have a \delta >0 such that f(x_{0})-\varepsilon< f(x)<f(x_{0})+\varepsilon for x_{0}-\delta<x<x_{0}+\delta. This tells us that f is bounded in the interval (x_{0}-\delta, x_{0}+\delta). Does it mean that the function is bounded in its entire domain? What we have shown is that given an x \in I, there is an interval I_{x} and two real numbers m_{x} and M_{x} such that

m_{x}<f(\xi)<M_{x} for all \xi \in I_{x}.

Surely \bigcup_{x \in I}I_{x} \supset I. But, if we could choose finitely many intervals out of the collection \{ I_{x}\}_{x \in I}, say, I_{x_{1}}, I_{x_{2}}, \ldots, I_{x_{n}} such that I_{x_{1}} \bigcup I_{x_{2}} \bigcup \ldots \bigcup I_{x_{n}} \supset I, then we would get m < f(\xi) < M for all \xi \in I, where M=\max \{ M_{x_{1}}, \ldots, M_{x_{n}}\} and m=\min \{ m_{x_{1}}, m_{x_{2}}, \ldots, m_{x_{n}}\}. That we can indeed make such a choice is a property of a closed bounded interval I in \Re and is given by the following theorem, the proof of which is given below:

Theorem (Heine-Borel):

Let a, b \in \Re and let \mathcal{I} be a family of open intervals covering [a,b], that is, for all x \in [a,b], there exists I \in \mathcal{I} such that x \in I. Then, we can find finitely many open intervals I_{1}, I_{2}, \ldots, I_{n} \in \mathcal{I} such that I_{1} \bigcup I_{2} \bigcup I_{3}\bigcup \ldots \bigcup I_{n} \supset [a,b].


Proof:

Suppose our contention is false. Let us take the intervals [a,c] and [c,b], where c=\frac{a+b}{2}. If the claim is false for [a,b], then it must be false for at least one of the intervals [a,c] or [c,b]. Otherwise, we could find I_{1},I_{2}, \ldots, I_{m} \in \mathcal{I} and J_{1}, J_{2}, \ldots, J_{n} \in \mathcal{I} such that I_{1} \bigcup I_{2} \bigcup \ldots \bigcup I_{m} \supset [a,c] and J_{1} \bigcup J_{2} \bigcup \ldots \bigcup J_{n} \supset [c,b], and then \{I_{1}, \ldots, I_{m}, J_{1}, \ldots, J_{n}\} would be a finite family of intervals for which I_{1} \bigcup I_{2} \bigcup \ldots \bigcup I_{m} \bigcup J_{1} \bigcup \ldots \bigcup J_{n} \supset [a,b].

So let us assume that the claim of the theorem is false for at least one of the intervals [a,c] or [c,b]. Call it [a_{1},b_{1}]. Again let c_{1}=\frac{a_{1}+b_{1}}{2}. Now since the claim of the theorem is false for [a_{1},b_{1}], it should be false for at least one of [a_{1},c_{1}] or [c_{1},b_{1}] by the above argument. Call it [a_{2},b_{2}]. We have a \leq a_{1} \leq a_{2} < b_{2} \leq b_{1} \leq b. We can continue this process to get a sequence of intervals [a_{1},b_{1}] \supset [a_{2},b_{2}] \supset [a_{3},b_{3}] \supset \ldots \supset [a_{n},b_{n}] \supset \ldots for which the assertion is false. Observe further that b_{n}-a_{n}=\frac{b-a}{2^{n}} and that we have a \leq a_{1} \leq a_{2} \leq \ldots \leq a_{n} < b_{n} \leq b_{n-1} \leq \ldots \leq b_{1} \leq b.

This gives us a monotonically increasing sequence (a_{n})_{n=1}^{\infty} which is bounded above and a monotonically decreasing sequence (b_{n})_{n=1}^{\infty} bounded below. So (a_{n})_{n=1}^{\infty} and (b_{n})_{n=1}^{\infty} must converge to, say, \alpha and \beta respectively. Then, \alpha=\beta because \beta - \alpha= \lim{(b_{n}-a_{n})}=\lim{\frac{(b-a)}{2^{n}}}=0. Since \mathcal{I} covers [a,b], \alpha must belong to J for some J \in \mathcal{I}. Since J is an open interval and \lim_{n \rightarrow \infty}{a_{n}}=\alpha, there exists an n_{1} such that a_{n} \in J for all n > n_{1}. Similarly, since \lim_{n \rightarrow \infty}{b_{n}}=\alpha, there exists an n_{2} such that b_{n} \in J for all n > n_{2}. Now let n_{0}=\max \{ n_{1},n_{2}\}. Therefore, we conclude that [a_{n},b_{n}] \subset J for all n > n_{0}. But this violates our hypothesis that we cannot choose finitely many members of \mathcal{I} whose union will contain [a_{n},b_{n}] for any n. QED.


Corollary:

A continuous function on a closed interval is bounded.

The proof of the corollary is already given just before the Heine-Borel theorem. So, if we have a continuous function f:[a,b] \rightarrow \Re and M=\sup{\{f(x): a \leq x \leq b \}} and m=\inf{\{ f(x) : a \leq x \leq b\}}, the above corollary says -\infty < m \leq M < \infty. Next, we ask the natural question: do there exist two points x_{0},y_{0} \in [a,b] such that f(x_{0})=M and f(y_{0})=m? In other words, does a continuous function on a closed interval attain its bounds? The answer is yes.


Suppose f:[a,b] \rightarrow \Re is continuous, and M=\sup{ \{f(x): a \leq x \leq b \}} and m=\inf{ \{ f(x): a \leq x \leq b\}}. Then, there are two points x_{0},y_{0} \in [a,b] such that f(x_{0})=M and f(y_{0})=m.

Note: these points x_{0} and y_{0} need not be unique.

Proof by contradiction:

Suppose there is no point x \in [a,b] such that f(x)=M. Then we would have f(x)<M, or M-f(x)>0, for all x \in [a,b]. Let us define g:[a,b] \rightarrow \Re by g(x)=\frac{1}{M-f(x)}

Since M-f(x) vanishes nowhere, g is also a continuous function. So, by the corollary above, it ought to be bounded above and below. Let 0 <g(x)<M_{1} for all x \in [a,b]. On the other hand, by the property of a supremum, there exists an x \in [a,b] such that f(x)+\frac{1}{2M_{1}}>M, which implies that M-f(x)<\frac{1}{2M_{1}}, or g(x)=\frac{1}{M-f(x)}>2M_{1}, which is a contradiction. Therefore, f(x) must attain the value M at some point x_{0} \in [a,b]. The proof of the other part is very similar. QED.

The above theorem together with the corollary says that on a closed interval, a continuous function is bounded and attains its bounds. This, again by the intermediate value theorem, means that the function must attain all the values between its supremum and infimum. Thus, the image of a closed interval under a continuous map is a closed interval. However, if f is a continuous map on an open interval, then the function need not be bounded.


Let f: (0,1) \rightarrow \Re be defined by f(x)=1/x. This is surely continuous but the limit,

\lim_{x \rightarrow 0}f(x)=\infty, which means that given any M >0, we can always find x such that f(x)>M, viz., choose 0<x<\frac{1}{M}.

If f is a continuous function, then given \varepsilon >0, for each x_{0} fixed, we can find \delta >0 such that

|f(x)-f(x_{0})|<\varepsilon whenever |x-x_{0}|<\delta

Here \delta depends upon x_{0}.

Can we choose \delta_{0}>0 such that it works for all x_{0}? The answer in general is no.


Let f: \Re \rightarrow \Re be defined by f(x)=x^{2}. If we fix any \theta >0, then for x>0, f(x+\theta)-f(x)=2\theta x + \theta^{2} \geq 2\theta x, and hence as x becomes large, the difference between f(x+\theta) and f(x) also becomes large for every fixed \theta>0. So for, say, \varepsilon=1, we cannot choose \delta>0 such that |f(x+\theta)-f(x)|<\varepsilon for all 0<\theta<\delta and all x. We thus have the following definition:


Let f: D \rightarrow \Re be a continuous function where D=\Re or [a,b] or (a,b). Then, f is said to be uniformly continuous if for all \varepsilon>0, there exists a \delta>0 such that

|f(x)-f(y)|<\varepsilon for all x, y \in D with |x-y|<\delta
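The failure of uniform continuity for f(x)=x^{2} on \Re, described just before the definition, can be made explicit: for any proposed \delta, the choice \theta=\delta/2 and x=1/\delta gives f(x+\theta)-f(x)=1+\delta^{2}/4 \geq 1. The Python sketch below verifies this with exact rational arithmetic. The function name witness and the sample values of \delta are mine.

```python
from fractions import Fraction

def witness(delta):
    """For delta > 0, return (x, theta, f(x + theta) - f(x)) for f(x) = x^2,
    with 0 < theta < delta and the difference at least 1."""
    theta = delta / 2
    x = 1 / delta
    return x, theta, (x + theta) ** 2 - x ** 2

for delta in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    x, theta, diff = witness(delta)
    print(delta, x, theta, diff)
```

However small \delta is made, the difference never drops below \varepsilon=1, so no single \delta works for every x.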

We have seen above that a continuous function need not be uniformly continuous. When D=[a,b], however, every continuous function is uniformly continuous as the next result shows.


Let f:[a,b] \rightarrow \Re be continuous. Then, f is uniformly continuous.


Fix \varepsilon > 0. The continuity of f implies that for every x \in [a,b], we can choose \delta_{x}>0 such that

|f(x)-f(y)|<\frac{\varepsilon}{2} whenever |y-x|<\delta_{x} and y \in [a,b]

Now, let I_{x}=(x-\frac{1}{2}\delta_{x}, x +\frac{1}{2}\delta_{x})

Then, clearly \{I_{x}: x \in [a,b] \} covers [a,b] as x \in I_{x}. By the Heine Borel theorem, we can get finitely many intervals out of this family, I_{x_{1}}, I_{x_{2}}, …, I_{x_{m}} such that

I_{x_{1}} \bigcup I_{x_{2}} \bigcup \ldots \bigcup I_{x_{m}} \supset [a,b].

Let \delta = \min \{\frac{1}{2}\delta_{x_{1}}, \frac{1}{2}\delta_{x_{2}}, \ldots, \frac{1}{2}\delta_{x_{m}} \}

Then, \delta>0 (note that minimum of finitely many positive numbers is always positive). Next we claim that if x, y \in [a,b], |x-y|<\delta then |f(x)-f(y)|<\varepsilon

Since x \in [a,b] \subseteq I_{x_{1}} \bigcup \ldots \bigcup I_{x_{m}}, we can find k \leq m such that x \in I_{x_{k}}, that is, |x-x_{k}|<\frac{1}{2}\delta_{x_{k}}. Now, |y-x_{k}| \leq |x-y|+|x-x_{k}| < \delta +\frac{1}{2}\delta_{x_{k}} \leq \delta_{x_{k}}.

Hence, |f(y)-f(x_{k})| < \frac{\varepsilon}{2} and |f(x)-f(x_{k})| < \frac{\varepsilon}{2} and therefore, |f(y)-f(x)|<\varepsilon. QED.
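In contrast with the situation on all of \Re, on a closed interval one \delta does serve every point. The Python sketch below illustrates this for f(x)=x^{2} on [0,1], where \delta=\varepsilon/2 works because |x^{2}-y^{2}|=|x+y||x-y| \leq 2|x-y|. This explicit \delta is my own shortcut for this particular function and is not the \delta produced by the Heine-Borel argument above. The check is done on a finite grid only, so it is an illustration, not a proof.

```python
def f(x):
    return x * x

eps = 0.1
delta = eps / 2    # one delta for the whole interval [0, 1]

steps = 200
grid = [i / steps for i in range(steps + 1)]    # sample points in [0, 1]

# Largest |f(x) - f(y)| over all sampled pairs that are closer than delta
worst = max(abs(f(x) - f(y)) for x in grid for y in grid if abs(x - y) < delta)
print(worst, worst < eps)
```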

More later,

Nalin Pithwa