Monthly Archives: May 2015

Basic Set Theory: Functions

Among the kinds of relations we have been talking about, there are some which are very useful in mathematics. A relation f \subset A \times B is called a function from A to B if

i) domain of f=A

ii) (a,b) \in f and (a,b^{'}) \in f \Longrightarrow b=b^{'}.

Thus a function from A to B is a relation whose domain is the whole of A and which is not one-to-many. The set B is called the codomain or range of the function f.

If f \subset A \times B is a function from A to B, then we write f: A \rightarrow B and say that f is a function from A to B. We also say that f maps A into B and often a function is called a map or mapping. If

(a,b) \in f, we write


We call b the value of the function f at a or image of a under f. Observe that there can be no ambiguity in writing f(a) because it is impossible that (a,b) \in f and (a,b^{'}) \in f and b \neq b^{'}. So by definition of a function f: A \rightarrow B, we have for every a \in A a unique f(a) \in B. Thus, a function would be completely determined if we knew f(a) \in B for every a \in A. That is why sometimes a function is defined as a “rule” which associates to every element a \in A a unique element f(a) of B. We are obliged to put the word rule within quotation marks for the simple reason that the rule is often elusive. In fact, the aim in many situations is to “discover” the rule.


a) Suppose we are observing the position of a particle moving in a straight line. We know that at every instant of time the particle has a unique position on the line. If we agree that every point of the line corresponds to a real number with some convenient point on it decided to represent the real number zero, then the position of the particle at the time t is represented by the real number x(t). We observe that our with our ordinary notion of time and position, we do not expect a particle to occupy two different positions at the same time (nothing prevents, though,the particle from occupying the same position at two different times. Indeed, if the particle is at rest for a certain length of time, then it would have the same position for the entire length of time.) Suppose every time is represented by a real number, zero being a certain time deemed as the present time, then all future times are represented by positive real numbers and past times by negative real numbers. Here, a real number is used as an intuitive object, say, as points on a line, postponing a mathematical discussion of real numbers later. Let \Re be the set of real numbers, then the position of the particle moving in a straight line is, in fact, a function

x: \Re \rightarrow \Re

which associates to every time t, the position x(t).

b) Look at a railway time table. Below a particular train there are times recorded in a column. It records the times of arrival and departure of a train at certain stations. We may look upon this as a function whose domain consists of disjoint intervals of time and range consists of certain cities where the train stops. Thus, for every time in the mentioned interval we have a city where the train is at that time. The table does not say anything about the train’s position at a time after its departure and before its arrival at the next. This does not disqualify it to be a function if we take the domain to be the intervals of time depicting the times of arrival and departure at certain stations.

c) Consider the distance of a particle, falling freely under gravity, measured from the point from where the particle started its fall. if we record the time since it began to fall, we know that the distance x(t) at a time is given by the formula


(assuming, of course that the gravity does not vary and there is no air resistance).

Here we have in fact a “rule” which tells us where the particle would be at any time. This certainly represents a function. However, we are often not so lucky to have a neat formula like this for other functions.

d) The incoming news on a particular day that tells us the temperature of the four metros recorded at 5.30 am on that day. Here, we have a definite temperature for every such metro at 5.30am of that day. So it is in fact a function whose domain consists of the set of four metros and range the real numbers quantifying temperature.

e) Look at your school time table. What does it record on a particular day? it records the subjects to be taught at different times of the day. A particular student has a particular subject being taught during a particular period,as a student is not expected to be in two different classes at the same time. So, for each student, the time table is a function whose domain is the set of periods and the range the set of subjects.

f) When we want to mail a letter enclosed in an envelope, we usually go to the post master with it to tell us the denomination of the stamp to be affixed. The post master weighs the letter and tells us the postage according to the weight. Here, we have a definite postage for a definite weight, we don’t have different rates for the same weight (for same kind of mail like registered or ordinary). The rate chart, with the post master, for a particular kind of mail, truly, is a function whose domain is the set of positive real numbers representing the weights of the mail and the range again consists of positive real numbers representing the postage. We write the chart as a function f such that


meaning p is the postage to mail a letter of weight w.

g) We know that the solubility of a salt varies with temperature. That is to say that a given salt has a definite solubility at a definite temperature. So the relation which associates to every temperature the definite solubility of a salt is in fact a function.

h) Consider the population of a community of biological species at different times. If N(t) is the population of the community at time t, then surely

N: \Re \rightarrow \Re

represents a function.

i) If we measure the voltage of an AC power supply over a period of time, we observe that it is different at different times. At every instant of time, it has a voltage Sometimes, it is positive, sometimes it is negative and sometimes zero. In fact, for ordinary domestic supply in India, the voltage varies between 330 volts and -330 volts. But this change from 330 volts to -330 volts occurs within an interval of 1/100th of a second. (The ordinary DC voltmeters of your lab won’t be able to record this quick change in voltage). In fact, if V(t) is the volrage at time t, we have

V(t)=E_{0}\sin{(\omega t + \phi)}

where \omega=2\pi/50 and E_{0}=330 and \phi is a constant known as the phase.

j) Let S be a sphere resting on a horizontal plane H. Let n be the north pole and s be the southpole, where the sphere is touching the plane. Now let us join any other point p \in S to n and extend it to meet the plane at some point q. Now to every point p \in S - \{ n \}, we have a unique point q on H. This defines a function

f: S - \{ n \} \rightarrow H. This function is called the stereographic projection of the sphere on the plane.

k) The trivial map or function id: A \rightarrow A which sends every element of the set A to itself is called the identity map.

Look around, see and think…are there any functions that come to your notice? Let me know…

More later,

Nalin Pithwa

Basic Set Theory: Relations

Among the various kinds of relations, we are familiar with, father, mother, son, brother, sister, husband, wife, less than, equal to, parallel to, etc., are a few examples. We would like to fix our ideas as to what exactly is meant by a relation. Let us see how we use the terms in common parlance. We say Dasaratha is the father of Rama. Similarly, Babar is the father of Humayun and Janaka is the father of Sita. The above statements like Dasaratha is the father or Rama are but examples fatherhood. But, what do we exactly mean by fatherhood? We may attempt to define fatherhood as the collection of pairs like Dasaratha and Rama, Babar and Humayun, Janaka and Sita, etc. But in our usage “Dasaratha is the father of Rama” we cannot interchange the positions of Dasaratha and Rama. Indeed the statement “Rama is the father of Dasaratha” does not convey the same meaning as the statement “Dasaratha is the father of Rama”. Again, in the statement “Dasaratha is the father of Rama”, if we drop  the word Rama, then too it won’t mean anything. For the statement about fatherhood, what is important is a pair in a definite order: (Dasartha, Rama). Thus, a certain collection of ordered pairs stands for a definite relation, as in this case the ordered pairs

(Dasaratha, Rama), (Babar, Humayun), (Janaka, Sita)

are examples of relation of fatherhood. So fatherhood would stand for a certain collection of ordered pairs of human beings. Similarly, the relation of brotherhood would stand for a certain other collection of ordered pairs. So would sisterhood, etc. In each case, they are different collections of ordered pairs. Now we proceed to define the term relation formally. For that we need the notion of a cartesian product.

Cartesian Product.

For sets A and B, the set of all ordered pairs formed by the first element from A and the second element from B, is called the Cartesian product of A and B. It is denoted by A \times B. We have

A \times B= \{ (a,b): a \in A \hspace{0.1in} \textup{and} \hspace{0.1in} b \in B\}


By a relation or correspondence between the elements of a set A and those of a set B, we mean a subset f of the Cartesian product A \times B. Thus, f is a relation between set A and set B if

f \subset A \times B

If (a,b) \in f, then we say that a is related to b through f. We write this as afb. Thus, if f is the set of all human beings and M is the set of all male human beings, then fatherhood is a subset of M \times H, whereas brotherhood is another subset of M \times H. Motherhood is a subset of W \times H, where

W = \{ w: w is a woman\}. So if f \subset M \times H is the fatherhood relation, (a,b) \in f means a is the father of b. Similarly, if g \in M \times H is the brotherhood relation, then (a,b) \in G means a is a brother of b. (Observe that a is a brother of b does not imply b is a brother of a since b could be a sister of a. )


For sets A and B if f \subset A \times B, then \{ x: (x,u) \in f \hspace{0.1in} \textup{for some} \hspace{0.1in} u \in B\} is called the domain of the relation f, and \{ y: (x,y) \in f \hspace{0.1in textup{for some} \hspace{0.1in} x \in A}\} is called the range of the relation f.

It is quite obvious that the domain of f, above, is a subset of A and the range. a subset of B.

In the above example of fatherhood, its domain cannot contain a woman. Nor did it contain all men. Indeed many men are not fathers.

Like father, mother, brother, sister we also have the relation less than, equal to, greater than, similar to, congruent to, perpendicular to, etc.

For the set A=\{ 1,2,3,4\}, let f \subset A \times A be defined by

f=\{ (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)\}.

We see that the relation f here is the relation less than in the set A. Here the domain of f is \{ 1,2,3\} whereas the range is \{ 2,3,4\}.

In case, f \subset A \times A, we call f a relation in A. For A, the set of all lines parallel to is a subset of

A \times A. Similarly, for the set T of all triangles, similarity and congruence are subsets of T \times T. We enumerate a few relations.

1) A relation f in A{ that is, f \subset A \times A is called a symmetric relation if afb \Longleftrightarrow bfa.

2) f \subset A \times A is called a reflexive relation if afa for all a \in A.

3) f \subset A \times A is called an antisymmetric relation if afb \hspace{0.1in} \textup{and} \hspace{0.1in} bfa \Longrightarrow a=b

4) A relation f \subset A \times A is called a transitive relation if afb, bfc \Longrightarrow afc.


a) Similarity of triangles is reflexive, symmetric and transitive.

b) Fatherhood is neither reflexive, nor symmetric, nor transitive.

c) “Ancestor of” is transitive but is neither reflexive nor symmetric.

d) The relationship “a is the spouse of b” is symmetric but neither reflexive nor symmetric.

e) “Less than” is transitive but not reflexive.

f) “Less than or equal to” is reflexive, antisymmetric and transitive.

g) If X is a class of sets, let R \subset X \times X be defined by ARB if A \subset B. We see that R is a reflexive, antisymmetric and transitive relation.


The way we have defined relation should strictly be called binary relation as it consists of ordered pairs. But, we have other kinds of relations too. Like collinearity of points, concurrence of lines, etc. They are not binary relations. But we don’t go into them here.

Equivalence relation.

A relation \sim on A is called an equivalence relaiton if it is reflexive, symmetric and transitive, that is,

i) a \sim a for all a \in A.

ii) a \sim b \Longleftrightarrow b \sim a

iii) a \sim b \hspace{0.1in} \textup{and} \hspace{0.1in} b \sim c \Longrightarrow a \sim c.


a) Equality is certainly an equivalence relation.

b) Similarity between triangles is an equivalence relation.

c) Congruence of triangles is an equivalence relation.

d) Parallelism between lines is an equivalence relation if we agree that a line is parallel to itself.

e) Define a relation \equiv (\mod 5) on Z by m \equiv n (\mod 5) if m - n is divisible by 5. Note that this is an equivalence relation.

More later,

Nalin Pithwa



The need for a Universal Set: Basic Set Theory

For any two sets A and B, we define A \setminus B=\{ x: x \in A \hspace{0.1in} \textup{and} \hspace{0.1in} x \notin B \}.

A\setminus B is the set of all those elements of A which are not in B. If E stands for the set of those persons who can speak English and H for the set of persons who can speak Hindi, then E\setminus H would stand for the set of all those persons who can speak English but cannot speak Hindi. Similarly, H\setminus E would stand for the set of all  those persons who can speak Hindi but not English. The sets A\setminus B and

B\setminus A are called the set of differences. In this context, there is an interesting concept called the symmetric difference of two sets. For sets A and B, the set (A\setminus B) \bigcup (B\setminus A)= A \triangle B.

For the sets E and H described above, E\triangle H stands for the set of all those persons who can either speak English or speak Hindi but not both. This is the case of exclusive or in digital logic or mathematical logic.


In an examination of three subjects physics, chemistry and mathematics, let P stand for the set of those candidates who passed in physics, C for the set of those candidates who passed in chemistry and M for the set of those who passed in mathematics.

Express the following sets by using the symbols of union, intersection and difference.

(a) The set of all those candidates who  passed in mathematics only.

(b) The set of all those who passed in mathematics and physics but not in chemistry.

(c) The set of those candidates who failed in one subject only.

(d) The set of those candidates who passed in either physics or chemistry but not in mathematics (we or not exclusive or here)

(e) The set of those candidates who passed in all the subjects.

(f) The set of all those candidates who passed at least in two subjects.

In this exercise, if we want to represent the set of those candidates who failed at least in one subject, what do we do? An obvious problem in expressing this as a set if that we have not specified the set of all the candidates who  took the examination. Similarly, if we are talking of the set of non-Bengalis we must be clear in which context we are talking? Are we talking of all the people of the world or only about Indian citizens? But when we say ‘not all birds can fly” we are talking of birds only. We are not talking of the other animals who cannot fly nor are we talking of insects which we can fly. Similarly, suppose somebody said “none other than Gandhi could do this feat” we are obviously talking in terms of all human beings. We can find many such examples where it is important to mention the context in which we are talking. In other words, we specify a set in terms of subsets of a set about which we are talking. This specified set is sometimes called the “the universe of discourse” or the universal set, U. Though we are using the article “the” we are talking of “the universal set” in a particular context. 

If is our universal set and A \subset U, then the set   U \setminus A=\{ x: x \in U \hspace{0.1in} \textup{and} \hspace{0.1in} x \notin A \} is called the complement of A. It is denoted by A^{'}. (Strictly speaking, we should call

U \setminus A as the complement of A with respect to U). In case, U is our universe of discourse, we can legitimately claim U^{'} = \phi.

More later,

Nalin Pithwa

Quick Review of Trigonometric Optimization Methods

Let us review together the four general methods we can use for triangular optimization.

I) Trigonometric Method: 

The essence of this method is the observation that the cosine of an angle is at most one and that it equals 1 only when the angle is zero. This fact is applied to the difference between two of the angles A, B and C, holding the third angle fixed, to show that unless those two angles are equal, the objective functionn can be increased (or decreased as the case may be). Consequently, at an optimal solution, these two angles must be equal. If the objective function is symmetric (as is the case, in almost all problems of triangular optimization), then every two of A, B and C must be equal to each other and hence, the triangle ABC must be equilateral.

This method is elementary and easy to apply. Even when the objective function is only partially symmetric, that is, symmetric in two but not in all the three variables, it can be applied to those two variables, holding the third variable fixed. Suppose, for example, that we want to maximize f(A,B,C)=\cos{A}+2\cos{B}+\cos{C}. This is symmetric in A and C. So, by the same reasoning as for maximizing \cos{A}+\cos{B}+\cos{C}, which at an optimal solution we must have A=C. Then, B=\pi-2A, which makes f effectively a function of just one variable, viz., \cos{A}+2\cos(\pi-2A)+\cos{A}, which equals 2\cos{A}-4\cos^{2}{A}+2. This can be maximized as a quadratic in \cos{A} either by completing the square or using calculus. The maximum occurs when A=\cos^{-1}{1/4}. Thus, the maximum value of \cos{A}+2\cos{B}+\cos{C} for a triangle ABC is 1/4. The method, of course, fails if the function is not even partially symmetric. This is not surprising. Basically, in a triangular optimization problem, we are dealing with a function f(A,B,C) of three variables. Because of the constraint A+B+C=\pi, any one of the variables can be expressed in terms of the other two. This effectively makes f a function of two variables. Optimization of functions of several variables requires advanced methods. It is only when f satisfies some other conditions such as partial symmetry that we can hope to reduce the number of variables further so that elementary methods can be applied.

II) Algebraic Method: 

The essence of this method is to reduce the optimization problem to some inequality using suitable trigonometric formulae or identities. The inequality is then established using some standard inequality such as the AM-GM-HM inequality, or Jensen’s inequality, or sometimes, by doing some more basic work. The fundamental ideas are very simple, viz., (a) the square of any real number is non-negative and is zero only when that number is zero, and (b) the sum of two or more non-negative numbers is non-negative and vanishes if and only if each of the term is zero. When this method works, it works elegantly. But it is not always easy to come up with the right algebraic manipulations. Sometimes, certain simplifying substitutions have to be used. Still, it is an elementary method and deserves to be tried.

III) Jensen’s inequality:

This is a relatively advanced method. It is directly applicable when the objective function is, or can be recast, in a certain form, viz., h(A)+h(B)+h(C), where h is a function of one variable whose second derivative maintains the same sign over a suitable interval. But, even when h fails to do so, the method can sometimes be applied with a suitable conversion of the problem.

IV) Lagrange’s Multipliers:

This is a highly advanced method based on the calculus of functions of several variables. It is applicable to all types of objective functions, not just those that are symmetric or partially symmetric. When applied to triangular optimization problems with symmetric objective functions, the optimal solution is either degenerate or an equilateral triangle.

Naturally, for a particular given problem, some of these methods may work better than others. The method of Lagrange’s multipliers is the surest but the most mechanical of all the four. The algebraic method is artistic and sometimes gives the answer very fast. Jensen’s inequality also works fast once you are able to cast the objective function in a certain form. Such a recasting may involve some ingenuity sometimes. The trouble is that both these methods work only in the case of an  optimization problem where the objective function is symmetric. And, in such cases, the method of Lagrange’s multipliers makes a mincemeat of the problem. From an examination point of view, this is a boon if a question about triangular optimization is asked in a “fill in the blanks” or “multiple choice” form, where you don’t have to show any reasoning. if the objective function is symmetric, then the optimal solution is either degenerate or an equilateral triangle. But, degenerate triangles are often excluded from the very definition of a triangle because of the requirement that the three vertices of a triangle must be distinct and non-collinear and, in any case, such absurdities are unlikely to be asked in an examination! So, it is a safe bet to simply assume that the optimal solution is an equilateral triangle and proceed with further work (namely, calculating the value of the objective function for an equilateral triangle). This saves you a lot of time.

More later,

Nalin Pithwa