Note: the LaTeX is not yet rendered. Still working on it.

Students taking a multivariable calculus course may find, to their frustration, that textbooks and other resources treat the factor $\sqrt{1+g_x^2+g_y^2}$ in the differential surface element $dS$ purely algorithmically, without explaining conceptually where it comes from. This article aims to address that need.

Consider some smooth surface given by $z=g(x,y)$. If we define a function,

$$ f: \mathbb{R}^3 \to{} \mathbb{R}, \quad (x,y,z) \mapsto{} z-g(x,y)$$ or examine its additive inverse, $-f=g(x,y)-z$, our original surface is the level set of $f$ given by $f(x,y,z) = 0$. It can be shown that $\nabla{f} = \left\langle -g_x, -g_y, 1 \right\rangle$ is orthogonal to the level sets of $f$. Thus $\nabla{f}$ is a normal vector to the surface $z=g(x,y)$. The magnitude of this normal, $|\nabla{f}| = \sqrt{1+g_x^2+g_y^2}$, appears in the scalar and directed surface element expressions used in surface integrals of scalar and vector fields, but is often left uninterpreted. A geometric interpretation of this magnitude is discussed below.
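
As a concrete check, take the paraboloid $g(x,y) = x^2 + y^2$ (an illustrative choice, not part of the discussion above). Then

$$f(x,y,z) = z - x^2 - y^2, \qquad \nabla{f} = \left\langle -2x, -2y, 1 \right\rangle, \qquad |\nabla{f}| = \sqrt{1+4x^2+4y^2}.$$

At the vertex $(0,0,0)$ the surface is horizontal and $|\nabla{f}| = 1$. At the point $(1,1,2)$ the surface is steeper and $|\nabla{f}| = \sqrt{1+4+4} = 3$.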

Let there be a smooth $f:\mathbb{R}^n \rightarrow{} \mathbb{R}$, a level set,

$$S = \left\{(x_1, x_2, \dots{}, x_n) \mid f(x_1, x_2, \dots{}, x_n)=c\right\},$$ and a point $P = (p_1, p_2, \dots{}, p_n) \in{} S$. Curves that pass through $P$ and belong to $S$ can be parametrized as $r(t) = (x_1(t), x_2(t), \dots{}, x_n(t))$. Since $r(t)\in{}S$, $f(r(t)) = c$. Taking the total derivative with respect to $t$ yields

$$\sum_{i=1}^n{\frac{\partial{f}}{\partial{x_i}}\frac{dx_i}{dt}} = 0. $$ This can be interpreted as the dot product of the tangent vector to the curve, $r'(t) = (x_1'(t), x_2'(t), \dots{}, x_n'(t))$, and the gradient, $\nabla{f}$. Since the dot product is zero, the vectors are orthogonal. Viewing $S$ and $\nabla{f}$ as lying in $\mathbb{R}^n$, the level set is $(n-1)$-dimensional wherever $\nabla{f} \neq 0$, so a minimum of $n-1$ curves passing through $P$ in linearly independent directions is enough to constrain $\nabla{f}$ to be normal to the level set.

%In the language of linear algebra, each curve represents a tangent vector of the tangent vector space, and to have a basis, it is required to have as many linearly independent basis vectors as the dimension of the space.

Consider the simplest case, $n = 2$. Level sets, $f(x,y) = c$, are 1-dimensional curves. In $\mathbb{R}^2$, the only curve through a point that stays in the level set is the level set itself. Thus at every point of the curve where $\nabla{f} \neq 0$, the normal direction is fixed by this single tangent and is given by $\nabla{f}$.

%insert image here

Consider the next case, $n=3$. Level sets, $f(x,y,z) = c$, are 2-dimensional surfaces. One can find infinitely many curves on the level set passing through a point, but two curves with non-parallel tangent vectors are all that is needed to constrain $\nabla{f}$ to be normal to the surface.
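
The $n=3$ case can be checked numerically. The sketch below assumes a specific level set, the unit sphere $x^2+y^2+z^2=1$, and two specific curves through a point on it (a meridian and a circle of latitude). These choices, and all the function names, are illustrative assumptions. It estimates each tangent vector by a central difference and confirms that its dot product with $\nabla{f}$ is zero up to numerical error.

```python
import math

def f(p):
    """Example function whose level set f = 1 is the unit sphere (an assumed example)."""
    x, y, z = p
    return x * x + y * y + z * z

def grad_f(p):
    """Analytic gradient of f."""
    x, y, z = p
    return (2.0 * x, 2.0 * y, 2.0 * z)

def sphere_point(theta, phi):
    """Point on the unit sphere in spherical coordinates."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def tangent(curve, t, h=1e-6):
    """Central-difference estimate of the tangent vector curve'(t)."""
    a, b = curve(t - h), curve(t + h)
    return tuple((bi - ai) / (2.0 * h) for ai, bi in zip(a, b))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

theta0, phi0 = 1.0, 0.5
P = sphere_point(theta0, phi0)

# Two curves on the level set passing through P in different directions.
meridian = lambda t: sphere_point(t, phi0)    # vary theta, hold phi fixed
parallel = lambda t: sphere_point(theta0, t)  # vary phi, hold theta fixed

dots = [dot(grad_f(P), tangent(meridian, theta0)),
        dot(grad_f(P), tangent(parallel, phi0))]

for name, d in zip(("meridian", "parallel"), dots):
    print(f"grad f . tangent ({name}) = {d:.2e}")
```

Both dot products come out at the level of finite-difference noise, which is consistent with $\nabla{f}$ being orthogonal to two independent tangent directions and hence to the surface.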

%insert image here

Choosing either $\nabla{f}$ or $\nabla{(-f)}$ is a choice of orientation. This is clear when considering a sphere as a level set, where one normal is directed radially inwards and the opposite normal radially outwards. The only difference for flux integrals is a sign change.

%insert image here of sphere and orientation

An addendum: One might be further fascinated with this idea of a minimum number of curves. Each equivalence class of curves on a manifold passing through a point defines a tangent vector, where two curves belong to the same class if they move through the point with the same speed and direction. The set of all classes is thus the tangent space, which has vector space structure. A minimum number of curves corresponds to a basis for this space.

%insert image here of tangent space and basis

Consider the smooth surface introduced earlier, $z=g(x,y)$, with an alternative expression: a parametrization of a vector $\vec{r}$ by $u$ and $v$. For each $u$ and $v$, $\vec{r}(u,v) = \left\langle x(u,v), y(u,v), z(u,v)\right\rangle$ gives a point of the surface. However, since we are asserting that our surface is expressible as $z = g(x,y)$, we can write $\vec{r}$ as $\vec{r}(x,y) = \left\langle x, y, g(x,y)\right\rangle$. By taking the partial derivatives of the vector function $\vec{r}(x,y)$, we get tangent and bitangent vectors to the surface: $\vec{r_x} = \left\langle 1, 0, g_x\right\rangle$ and $\vec{r_y} = \left\langle 0, 1, g_y\right\rangle$.

%insert image here showing r_x and r_y

$\vec{r_x}$ and $\vec{r_y}$ can be visualized as attached to some point of the surface, $(x_0, y_0, g(x_0,y_0))$, and also as lying in the tangent plane at that point, spanning a parallelogram that sits over a unit square in the $xy$-plane.

%insert image here showing spanning of plane

The normal vector is simply $\vec{r_x} \times{} \vec{r_y} = \left\langle -g_x, -g_y, 1 \right\rangle$, which is exactly $\nabla{f}$ from before. The direction of this vector is orthogonal to both tangent and bitangent, while the magnitude of a cross product in general is the area of the parallelogram spanned by the two vectors. Thus the magnitude of the normal vector is the area of the parallelogram lying in the tangent plane over a unit square.
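
The claim that the cross product of the tangent and bitangent has magnitude $\sqrt{1+g_x^2+g_y^2}$ can be checked with a few lines of code. The sketch below assumes the example surface $g(x,y) = x^2+y^2$ and an arbitrary sample point. Both are illustrative choices, as are the helper names.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in u))

# Assumed example surface g(x, y) = x**2 + y**2, so g_x = 2x and g_y = 2y.
def g_x(x, y):
    return 2.0 * x

def g_y(x, y):
    return 2.0 * y

x0, y0 = 0.5, -1.0                    # arbitrary sample point
r_x = (1.0, 0.0, g_x(x0, y0))         # tangent vector   <1, 0, g_x>
r_y = (0.0, 1.0, g_y(x0, y0))         # bitangent vector <0, 1, g_y>

normal = cross(r_x, r_y)              # should equal <-g_x, -g_y, 1>
area = norm(normal)                   # area of the spanned parallelogram
expected = math.sqrt(1.0 + g_x(x0, y0) ** 2 + g_y(x0, y0) ** 2)

print("r_x x r_y =", normal)
print("|r_x x r_y| =", area, " sqrt(1 + g_x^2 + g_y^2) =", expected)
```

At this sample point $g_x = 1$ and $g_y = -2$, so the normal is $\left\langle -1, 2, 1 \right\rangle$ and the parallelogram has area $\sqrt{6}$.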

For surface integrals of scalar fields in $\mathbb{R}^3$, the scalar surface element, $dS$, is

$$dS = |\vec{r_x} \times{} \vec{r_y}|\,dA = \sqrt{1+g_x^2+g_y^2}\,dx\,dy $$ For surface integrals of vector fields the directed surface element, $\vec{dS}$, is $$\vec{dS} = \vec{r_x} \times{} \vec{r_y}\,dA = \left\langle-g_x, -g_y, 1 \right\rangle\,dx\,dy$$

%Maybe put something about r(u,v) with some arbitrary parametrization?
%Add something about \nabla{f}?

As is, $dS$ can be interpreted as scaling up an infinitesimal area $dA = dx\,dy$ by the area of a \emph{unit parallelogram} of the tangent plane to account for the local increase in surface area due to the tilt of the surface. This also accounts for increases in quantities that vary linearly with area, whether it be a density or a flux. The magnitude of the normal, identified with the area of the parallelogram, is the local scale factor of the linear approximation to the surface, just as slope is the local scale factor of the linear approximation to a function $f:\mathbb{R} \to \mathbb{R}$.
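
One way to test this interpretation is to compute a surface area twice: once by summing $\sqrt{1+g_x^2+g_y^2}\,dx\,dy$ over a grid, and once by directly adding up the areas of small triangles laid on the surface. The sketch below does this for an assumed example, $g(x,y)=x^2+y^2$ over the unit square. The grid size and function names are likewise illustrative, and both sums are only approximations.

```python
import math

def g(x, y):
    """Assumed example surface z = g(x, y)."""
    return x * x + y * y

def g_x(x, y):
    return 2.0 * x

def g_y(x, y):
    return 2.0 * y

def area_from_dS(n):
    """Midpoint-rule sum of sqrt(1 + g_x^2 + g_y^2) dx dy over [0,1] x [0,1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            total += math.sqrt(1.0 + g_x(x, y) ** 2 + g_y(x, y) ** 2) * h * h
    return total

def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c in R^3 (half a cross product)."""
    u = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    v = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def area_from_mesh(n):
    """Sum of triangle areas for a mesh whose vertices lie on the surface."""
    h = 1.0 / n

    def p(i, j):
        x, y = i * h, j * h
        return (x, y, g(x, y))

    total = 0.0
    for i in range(n):
        for j in range(n):
            a, b, c, d = p(i, j), p(i + 1, j), p(i + 1, j + 1), p(i, j + 1)
            total += triangle_area(a, b, c) + triangle_area(a, c, d)
    return total

n = 100
area_dS = area_from_dS(n)
area_mesh = area_from_mesh(n)
print("integral of dS :", area_dS)
print("mesh area      :", area_mesh)
```

The two numbers agree closely and both exceed $1$, the area of the flat unit square underneath, which is the local increase in area that the factor $\sqrt{1+g_x^2+g_y^2}$ accounts for.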

By distributing $dx$ and $dy$ into the cross product expression, which is permitted because the cross product is bilinear, we get $\vec{dS} = \vec{r_x}\,dx \times{} \vec{r_y}\,dy$. One can then interpret this expression as scaling down the sides of the unit parallelogram by the differential lengths, with the same effect on the magnitude of $\vec{dS}$ as discussed with $dS$.

%show image here of shrinking and scaling

Should the tangent plane be parallel to $z = 0$, we note that $dS = dA$, because the normal points vertically, which forces $g_x$ and $g_y$ to be $0$. Our parallelogram is then a unit square; conversely, any tilting of the plane yields a parallelogram of greater area.
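
The effect of tilt can be made explicit with a simple plane. As an illustrative case, take $g(x,y) = x\tan\alpha$, a plane inclined at an angle $\alpha$ to the $xy$-plane with $0 \le \alpha < \pi/2$. Then $g_x = \tan\alpha$ and $g_y = 0$, so

$$dS = \sqrt{1+\tan^2\alpha}\,dA = \sec\alpha\,dA = \frac{dA}{\cos\alpha}.$$

Read the other way, $dA = \cos\alpha\,dS$: the area in the $xy$-plane is the shadow of the tilted area. At $\alpha = 0$ the factor is $1$, and it grows without bound as the plane approaches vertical.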

The previous concept can be extended to arbitrary parametrizations of a surface. Additionally, the normal vector keeps its role for higher-dimensional differential elements: it represents the linear approximation of the object in question. Formalisms that handle this more cleanly are the exterior algebra, differential forms, and Hodge duality.

Since the cross product is only defined in 3 dimensions, and we do not generally refer to higher-dimensional objects as surfaces (although the term survives in relative usage, as in hypersurface), extending this idea requires an alternative product that generalizes to higher dimensions (such a product is known as the exterior product). For example, a 3-dimensional manifold with arbitrary curvature, parametrized by three variables, would be approximated not by tangent planes but by differential volume elements: tangent parallelepipeds that are the images of unit cubes in $xyz$-space, giving a local, linear representation of the deformation of a 3-dimensional space.
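
One way to see how the area factor survives without a cross product is the Gram determinant, which uses only dot products of the tangent vectors. This identity is brought in here as a supplementary illustration. For the graph $z = g(x,y)$,

$$|\vec{r_x} \times{} \vec{r_y}|^2 = \det\begin{pmatrix} \vec{r_x}\cdot\vec{r_x} & \vec{r_x}\cdot\vec{r_y} \\ \vec{r_y}\cdot\vec{r_x} & \vec{r_y}\cdot\vec{r_y} \end{pmatrix} = (1+g_x^2)(1+g_y^2) - g_x^2 g_y^2 = 1 + g_x^2 + g_y^2.$$

The determinant on the right makes sense for any number of tangent vectors in any dimension, and its square root gives the volume of the parallelepiped they span.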