Geometric Transformations

Vertex Transformation – A Spacial Odyssey

After going through the novice “initiation” part of this tutorial series, you now have a better grasp of the fundamental aspects of real-time 3D rendering in modern OpenGL. There’s still a lot of ground to cover, and at this point it’s time to lay a solid foundation in basic matrix operations and in how linear transformations are used to place objects in the scene.

For some this will be known as the “dreaded math chapter.” Math enthusiasts may be disappointed that the presented material does not go into greater detail on homogeneous coordinate spaces. There will be a “further reading” section in the last part of this tutorial set covering a collection of examples and exercises aimed at improving the understanding of geometric transformations.

Concepts such as 2D and 3D coordinate systems, matrix/vector representations and operations are the mathematical foundations that you need to know in order to survive computer graphics.

When using low-level APIs such as OpenGL, a frame of reference to measure and locate against is always required in order to draw objects. In other words, a coordinate system must be specified along with a way in which this is mapped into physical screen pixels.

Deep Thought – Hitchhiker’s Guide To The Galaxy

In the previous chapter we briefly touched on the 2D Cartesian coordinate system and on coordinate clipping – a method of telling OpenGL how to translate specific coordinate pairs into screen coordinates by specifying the region of Cartesian space that occupies the window. Since the clipping area (width, height) will seldom match the window area in pixels exactly, the notion of a viewport came about. The viewport is responsible for mapping from the logical Cartesian system to physical screen pixel coordinates; essentially, it describes where the clipping area sits within the window.

By introducing the depth component we move from a 2D Cartesian coordinate system to a 3D one, which brings us to our current topic: how do we go from a 3D coordinate system (our object’s state) to a 2D system (screen space) using the OpenGL API? The short answer: geometric transformations and matrix manipulation.
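The mapping from a logical clipping region to window pixels can be pictured as a simple linear interpolation. The snippet below is only a sketch of that idea, not OpenGL's own implementation: the names `ClipRegion`, `Viewport`, `Pixel`, and `toPixel` are hypothetical, and it assumes pixel coordinates grow upward from the lower-left corner of the window.

```cpp
#include <cassert>

// Hypothetical types for illustration only; not part of the OpenGL API.
struct ClipRegion { float left, right, bottom, top; };  // logical Cartesian bounds
struct Viewport   { float x, y, width, height; };       // pixel rectangle in the window
struct Pixel      { float x, y; };

// Linearly map a logical point (px, py) inside the clipping region
// to a position inside the viewport rectangle.
Pixel toPixel(const ClipRegion& c, const Viewport& v, float px, float py) {
    assert(c.right != c.left && c.top != c.bottom);   // region must have non-zero size
    float tx = (px - c.left)   / (c.right - c.left);  // 0..1 across the region
    float ty = (py - c.bottom) / (c.top - c.bottom);  // 0..1 up the region
    return { v.x + tx * v.width, v.y + ty * v.height };
}
```

With a region spanning -100 to 100 on both axes and an 800 × 600 viewport anchored at (0, 0), the logical origin lands in the middle of the viewport, at pixel (400, 300).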

Geometrical transformations & matrix basics

From a mathematical point of view, a matrix is a rectangular array of values separated into rows and columns, and is treated as a single entity. Matrices can be multiplied by vectors, scalars, and other matrices. This is important because when a vector (describing the location of an object – a vertex in a mesh) is multiplied by a matrix (describing a number of transforms such as translation, rotation, scaling, shearing, etc.), the resulting vector will contain the new location, orientation, and size information from the matrix transform.
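To make the idea of “multiplying a vector by a matrix” concrete, here is a minimal 2D sketch. The `Vec2` and `Mat2` types and the `mul` functions are made up for this illustration (they are not from OpenGL or any particular math library), and only transforms that fit in a 2 × 2 matrix, such as scaling and rotation, are covered.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// 2x2 matrix | a b |
//            | c d |
struct Mat2 { float a, b, c, d; };

// Matrix-vector product: each output component is a row of M dotted with v.
Vec2 mul(const Mat2& m, Vec2 v) {
    return { m.a * v.x + m.b * v.y,
             m.c * v.x + m.d * v.y };
}

// Matrix-matrix product A * B; applied to a vector, B acts first, then A.
Mat2 mul(const Mat2& A, const Mat2& B) {
    return { A.a * B.a + A.b * B.c,  A.a * B.b + A.b * B.d,
             A.c * B.a + A.d * B.c,  A.c * B.b + A.d * B.d };
}
```

Multiplying two matrices first and then applying the result to a vertex gives the same point as applying the two transforms one after the other, which is why several transforms can be folded into a single matrix.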

These transformations can be specified within three stages: modeling, viewing, and projecting. Before going into further detail on each of the pipeline stages, it is important to look at how transformations are represented first in 2D, then expressed with homogeneous coordinates, and finally set in 3D form.

Matrix representation

Internally, OpenGL represents a 4 × 4 matrix as 16 contiguous floating-point values. This may seem counterintuitive at first glance, considering that many mathematical libraries store such a matrix as a two-dimensional array.


Figure 1. Column-major vs. row-major representation. Image courtesy of Mabulous.

OpenGL treats the 16 values as a matrix in column-major order, which means the translation components occupy the 13th, 14th, and 15th elements (indices 12, 13, and 14 when counting from zero). A two-dimensional 4 × 4 array in C/C++, by contrast, is laid out in memory in row-major order. Mathematically, the two layouts are the transpose of one another.
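The difference between the two layouts comes down to how a (row, column) pair is turned into a position in the flat array of 16 values. The sketch below shows both index formulas; the helper names (`columnMajorIndex`, `rowMajorIndex`, `makeTranslation`, `transpose`) are hypothetical and exist only for this example. A C/C++ array declared as `float m[4][4]` places `m[r][c]` at offset `r * 4 + c`, which is the row-major formula.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Position of element (row, col) in a flat array of 16 values.
inline std::size_t columnMajorIndex(std::size_t row, std::size_t col) {
    assert(row < 4 && col < 4);
    return col * 4 + row;   // columns are stored one after another
}
inline std::size_t rowMajorIndex(std::size_t row, std::size_t col) {
    assert(row < 4 && col < 4);
    return row * 4 + col;   // rows are stored one after another
}

// Build a translation matrix in column-major order.
std::array<float, 16> makeTranslation(float tx, float ty, float tz) {
    std::array<float, 16> m{};            // all zeros
    m[0] = m[5] = m[10] = m[15] = 1.0f;   // identity diagonal
    m[columnMajorIndex(0, 3)] = tx;       // index 12
    m[columnMajorIndex(1, 3)] = ty;       // index 13
    m[columnMajorIndex(2, 3)] = tz;       // index 14
    return m;
}

// Transposing converts between the two layouts.
std::array<float, 16> transpose(const std::array<float, 16>& m) {
    std::array<float, 16> t{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            t[rowMajorIndex(r, c)] = m[columnMajorIndex(r, c)];
    return t;
}
```

After transposing, the same translation values sit at indices 3, 7, and 11, which is where a row-major library would expect to find them.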

2D Transformation
2D transformations take place in a two-dimensional space (a plane and its coordinate system). A coordinate system is a way of expressing the location of points through an ordered pair of numbers. Any number of coordinate systems can be defined on a given plane, which means that the same point can have different coordinates depending on the coordinate system used.

Throughout this tutorial the Cartesian coordinate system will be used; one of its notable properties is that the two axes defining the system are perpendicular to each other and intersect at the origin, denoted O. Other coordinate systems, such as polar coordinates, do not rely on a pair of perpendicular axes.

Points in the (x, y) plane can be translated to different positions by adding translation amounts to the points’ coordinates. Let dx and dy be the translation amounts (parallel to the X axis and Y axis respectively) by which every point P(x, y) is moved to a new point P’(x’, y’), so that x’ = x + dx and y’ = y + dy. This can be represented as follows:


Figure 2. a) The two coordinates of point P modified by distance d. b) Column vectors equivalent to the coordinates of point P, P’, and translation T satisfying the expression P’ = T + P.

Any object can be translated by applying the above expression (Fig. 2) to every point defining the object. Even though each line in the object is made up of an infinite number of points, the translation process in OpenGL only affects the endpoints of each line. The new line is drawn between the translated endpoints (Fig. 3).

Figure 3. Example of translation by (3, -4) applied to a square. a) Original position. b) Position after translation.
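In code, translating a shape amounts to adding (dx, dy) to each of its vertices. The sketch below uses a hypothetical `Point2` type and helper names chosen for this example; the unit square used in the usage note is an assumed stand-in, since the exact coordinates of the square in Figure 3 are not given in the text.

```cpp
#include <cassert>
#include <vector>

struct Point2 { float x, y; };

// P' = P + T: add the translation amounts to each coordinate.
Point2 translate(Point2 p, float dx, float dy) {
    return { p.x + dx, p.y + dy };
}

// Translate a shape by moving only its vertices (the line endpoints);
// the edges are then redrawn between the moved vertices.
std::vector<Point2> translateAll(const std::vector<Point2>& vertices,
                                 float dx, float dy) {
    std::vector<Point2> moved;
    moved.reserve(vertices.size());
    for (const Point2& v : vertices)
        moved.push_back(translate(v, dx, dy));
    return moved;
}
```

Translating a unit square with corners (0, 0), (1, 0), (1, 1), (0, 1) by (3, -4) moves the corners to (3, -4), (4, -4), (4, -3), (3, -3); the shape and size are unchanged.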

In a similar manner, points can be scaled by a factor sx along the X axis and by a factor sy along the Y axis through multiplication, so that x’ = sx · x and y’ = sy · y:


Figure 4. a) The two coordinates of P scaled by factor s. b) Scaling matrix applied to initial point P. c) Generalised form of the scaling expression. Formula based on slides available at CS Brandeis Edu.

Figure 5. Scaling square by 1/2 in x, and 1/4 in y. a) Original size of square. b) Size after scaling was applied.

Scaling is about the origin: if a scale factor s > 1, the object becomes both larger and further away from the origin (magnification); if 0 < s < 1, it becomes smaller and closer to the origin (minification). Proportions are preserved if sx = sy (uniform scale); non-uniform scaling, where the two scale factors differ, is not uncommon.
In addition to this, the scale matrix has an inverse whenever both scale factors are non-zero, because its determinant, sx · sy, is then non-zero. The inverse simply scales by 1/sx and 1/sy.
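A small sketch of scaling and its inverse follows. The `Scale2` type stores only the two diagonal entries of the 2 × 2 scaling matrix; it and the function names are assumptions made for this example. The inverse exists whenever the determinant sx · sy is non-zero, which the code checks before dividing.

```cpp
#include <cassert>

struct Point2 { float x, y; };

// 2x2 scaling matrix | sx  0 | stored as its two diagonal entries.
//                    |  0 sy |
struct Scale2 { float sx, sy; };

// P' = S * P: multiply each coordinate by its scale factor.
Point2 apply(Scale2 s, Point2 p) {
    return { s.sx * p.x, s.sy * p.y };
}

// Determinant of the scaling matrix.
float determinant(Scale2 s) { return s.sx * s.sy; }

// Inverse scaling: divide instead of multiply.
Scale2 inverse(Scale2 s) {
    assert(determinant(s) != 0.0f);  // a zero factor flattens the shape and cannot be undone
    return { 1.0f / s.sx, 1.0f / s.sy };
}
```

Scaling the point (4, 8) by 1/2 in x and 1/4 in y gives (2, 2); applying the inverse, which scales by 2 and 4, returns the original point.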

Points defining an object can also be rotated through an angle θ about the origin. Mathematically, this rotation is illustrated as follows:

Figure 6. a) Rotation about the origin written as separate expressions for x’ and y’. b) Same expression rendered with matrices. c) Generalised form defining point P’ with the new orientation.

As with scaling, the rotation is about the origin; rotating about an arbitrary point is also possible. It is worth noting that positive angle values are measured counterclockwise from X toward Y. For negative angle values, the identities cos(-θ) = cos θ and sin(-θ) = -sin θ can be used to modify the above equation.

Figure 7. Rotation of square by 45°. a) Original orientation. b) Orientation after rotation.
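The rotation expressions translate directly into code. The sketch below assumes angles in radians and uses hypothetical names (`Point2`, `rotateAboutOrigin`, `rotateAbout`). Rotation about an arbitrary point is handled the usual way: shift the pivot to the origin, rotate, then shift back.

```cpp
#include <cassert>
#include <cmath>

struct Point2 { float x, y; };

// Rotate p counterclockwise about the origin by angleRadians:
//   x' = x cos(theta) - y sin(theta)
//   y' = x sin(theta) + y cos(theta)
Point2 rotateAboutOrigin(Point2 p, float angleRadians) {
    float c = std::cos(angleRadians);
    float s = std::sin(angleRadians);
    return { p.x * c - p.y * s, p.x * s + p.y * c };
}

// Rotate about an arbitrary pivot: move the pivot to the origin,
// rotate, then move back.
Point2 rotateAbout(Point2 p, Point2 pivot, float angleRadians) {
    Point2 r = rotateAboutOrigin({ p.x - pivot.x, p.y - pivot.y }, angleRadians);
    return { r.x + pivot.x, r.y + pivot.y };
}
```

Rotating the point (1, 0) by 45° gives a point at roughly (0.707, 0.707), still at distance 1 from the origin. Passing a negative angle rotates clockwise.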

Because the rotation is about the origin, the distances from the origin to P and P’, labeled r in the figure, are equal. From basic trigonometry it follows that:

Figure 8. a) Expression defining the coordinates of the initial point P. b) Expression defining P’, which is P after rotation. c) Derived rotation equation. The distances from the origin are equal (r) because the rotation is about the origin of the system.
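Written out, the derivation goes as follows. The symbol φ, used here for the angle between the X axis and the segment OP, is an assumption, since the notation of the original figure is not reproduced in the text; θ is the rotation angle and r the common distance from the origin. The middle step uses the angle-sum identities for cosine and sine.

```latex
\begin{aligned}
x  &= r\cos\varphi, \qquad y = r\sin\varphi \\
x' &= r\cos(\theta + \varphi)
    = r\cos\varphi\cos\theta - r\sin\varphi\sin\theta
    = x\cos\theta - y\sin\theta \\
y' &= r\sin(\theta + \varphi)
    = r\cos\varphi\sin\theta + r\sin\varphi\cos\theta
    = x\sin\theta + y\cos\theta
\end{aligned}
```

Substituting x and y for r cos φ and r sin φ removes both r and φ from the result, so the rotated point depends only on the original coordinates and the angle θ.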

Other notable transformations
There are two other notable types of transformations: reflection and shearing. They will be discussed in the following tutorials; for now it’s important to note that reflection through the origin defines P’ as the negative of P, that is (-x, -y), and that shear preserves parallel lines while shifting points along a single axis by an amount proportional to their coordinate on the other axis.
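As a preview, both transformations can be sketched in a few lines. The function names and the shear factor `k` are hypothetical and chosen for this example; reflection across the X axis is included alongside reflection through the origin for comparison.

```cpp
#include <cassert>

struct Point2 { float x, y; };

// Reflection through the origin: P' = (-x, -y).
Point2 reflectThroughOrigin(Point2 p) { return { -p.x, -p.y }; }

// Reflection across the X axis flips only the y coordinate.
Point2 reflectAcrossX(Point2 p) { return { p.x, -p.y }; }

// Shear along X: each point slides horizontally by an amount
// proportional to its y coordinate; y itself is unchanged.
Point2 shearX(Point2 p, float k) { return { p.x + k * p.y, p.y }; }
```

With a shear factor of 0.5, the point (1, 2) moves to (2, 2), while a point on the X axis such as (1, 0) stays where it is. This is what turns a square into a parallelogram with its base fixed.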

To be continued…
The next tutorial (Part II) will go into further detail on homogeneous coordinates and affine transformations, and will strive to explain why it is important to treat all of the transformation expressions in a generalised and consistent manner.

Software engineer (computer graphics) & CS PhD Padawan. Interested in game engine dev, GLSL, computer vision, GPGPU, V.E./V.R, algo challenges, & Raspberry Pi dev. When I don't shift bits, I enjoy playing SC2, driving to new places, drawing, and playing guitar.