May 16, 2023. To learn more about the basic fields of mathematics relevant to machine learning, check out this Medium post by Wale Akinfaderin, The Mathematics of Machine Learning
Akinfaderin originally published the post on his LinkedIn account in 2016. Although it's an "ancient" article by today's standards, where anything 5 weeks old can feel 5 years old, Akinfaderin's article is still relevant today.
Akinfaderin keenly noted the nascent interest in machine learning, implicitly recognizing the field's potential for growth. By now, in May of 2023, it is evident that interest in machine learning keeps increasing daily and is far from peaking. Machine learning is a growing field, and it is here to stay. It is up to you to jump in and stay current on the playing field, or to retire to the sidelines.
Momentum for everything AI is here. The AI industry is in its infancy. It will continue growing in the years and decades to come. Like biological intelligence, AI has no limits or boundaries. Bookmark these words. As long as there is human life on the planet, there will be a vibrant market for AI. Just like computers and the internet, AI is now here forever.
Back in 2016, Akinfaderin noted that many AI enthusiasts lacked the basic mathematical background to understand the computational science behind machine learning. That was true then, it is true today, and it will be true tomorrow. The more we learn, the more we need to continue learning.
Every day humanity will have more exposure to AI and machine learning. The interest in the field will continue growing. More people will want to know how AI and machine learning work. Kids in elementary school today will be majoring in computer science 10 to 15 years from now. High school kids will do so within the next 4 years. Many college kids will pursue postgraduate studies in machine learning.
It is important to recognize the main fields or branches of mathematics involved in machine learning. The four basic fields of mathematics relevant to machine learning are: linear algebra; statistics; multivariable calculus; and algorithms.
LINEAR ALGEBRA
Let's take a quick look at basic concepts in linear algebra. Below we will simply scratch the surface on the topic. There is plenty more to study and learn. We will continue learning because it's fun and gives us something to do. Learning beats the alternative.
Linear algebra is a branch of mathematics that deals with the representations of linear equations using vectors and matrices.
Mathematics is the study of quantities and shapes. There is an infinite, endless set of things to study in life. When you are studying amounts and forms (quantities and shapes), you are studying mathematics. There are multiple branches of mathematics. More are being created. The continuum goes to infinity. So long as there is movement, there is transformation. Good luck trying to make things stop. Stillness is an illusion.
Vectors are lists of numbers. Plotted geometrically, vectors are represented as arrows with a magnitude and a direction. Think of an arrow on a graph with a starting point, a length (magnitude), and a direction. In short, vectors are lists of numbers represented geometrically by arrows.
Matrices are rectangular arrays of numbers arranged in rows and columns. Since a vector is just a list of numbers, whether written as a row or a column, a matrix can be viewed as a stack of row or column vectors. Multiplying a vector by a matrix applies a linear transformation to that vector.
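As a minimal sketch of these ideas, assuming Python with NumPy (my own illustration, not part of Akinfaderin's article), a vector can be stored as a one-dimensional array, a matrix as a two-dimensional array, and multiplying the matrix by the vector applies a linear transformation:

```python
import numpy as np

# A vector is just a list of numbers: here, an arrow from the origin to (3, 4).
v = np.array([3.0, 4.0])

# A matrix is a rectangular array of numbers arranged in rows and columns.
# This one rotates vectors in the plane by 90 degrees counterclockwise.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Multiplying the matrix by the vector applies the linear transformation.
w = A @ v          # array([-4., 3.])

# This particular transformation preserves the vector's length (magnitude)
# but changes its direction.
print(np.linalg.norm(v), np.linalg.norm(w))   # 5.0 5.0
```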
Linear algebra is the addition, subtraction, and multiplication of linear equations represented by vectors and matrices.
A linear equation is an algebraic equation where each variable has an exponent of 1, and whose graph is always a straight line.
Linear equations are called linear because they form lines when graphed. To identify a linear equation, just look at the exponents of its variables. If any variable has an exponent other than 1 (i.e., it is raised to a power other than one), the equation is not linear.
Common linear algebra operations on vectors are addition, subtraction, and multiplication by a scalar (a real number). Multiplication of two vectors is accomplished by either the dot product or the cross product. [Check out Mathinsight for more about vectors.]
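To make these operations concrete, here is a small sketch using Python with NumPy (again my own illustration, not from the original post). It shows addition, subtraction, multiplication by a scalar, the dot product, and the cross product of two vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Addition and subtraction work component by component.
print(a + b)           # [5. 7. 9.]
print(a - b)           # [-3. -3. -3.]

# Multiplication by a scalar (a real number) stretches or shrinks the vector.
print(2.5 * a)         # [2.5 5.  7.5]

# The dot product collapses two vectors into a single number (a scalar).
print(np.dot(a, b))    # 32.0

# The cross product of two 3-D vectors is another 3-D vector,
# perpendicular to both of them.
print(np.cross(a, b))  # [-3.  6. -3.]
```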
Linear algebra operations relevant to machine learning include all of the following (short code sketches for several of them appear after the list):
Principal Component Analysis (PCA). This is a mathematical method aimed at reducing the number of variables in a data set without significantly reducing the information it contains. Naturally, PCA is very helpful in managing big data sets for machine learning by reducing complexity without sacrificing utility.
Singular Value Decomposition (SVD). Singular value decomposition is the simplification or factorization of a matrix into simpler matrices to better understand data components. Since machine learning is all about processing big data sets to find patterns buried in the data, SVD simplification can help improve the efficiency of the process.
Matrix decompositions, including eigendecomposition, Lower-Upper (LU) decomposition, and QR decomposition. LU decomposition factors a square matrix into the product of a lower triangular matrix and an upper triangular matrix. QR decomposition factors a matrix with independent columns as the product of a matrix Q with orthonormal columns (a matrix whose columns are orthonormal vectors) and an upper triangular matrix R. These decompositions help in performing matrix operations and in gathering facts about matrices that can facilitate the machine learning process.
Orthogonalization and Orthonormalization. Orthogonalization is the process of making vectors orthogonal. Orthogonal vectors are vectors at right angles to each other (perpendicular, with a 90° angle between them), so their "dot product" is zero. The dot product (scalar product or inner product) converts a pair of vectors into a single number. It is obtained by multiplying the Euclidean magnitudes of the vectors by the cosine of the angle between them; for two unit vectors, the dot product is simply the cosine of the angle between them. The dot product is also used to define the "length" or magnitude of a vector: the length is the square root of the sum of the squares of the vector's components, which is the square root of the vector's dot product with itself. The length is a scalar (by contrast, the cross product of two vectors is itself a vector in three-dimensional space). A set of vectors is orthonormal when all of the vectors are "normal" (i.e., have a length of one) and each pair of vectors in the set is orthogonal. Orthonormalization is "normalizing" vectors by changing them from non-unit vectors to "unit" vectors, that is, vectors with a length or magnitude of one.
Matrix Operations (addition, subtraction, and multiplication of matrices).
Symmetric Matrix Covariance. A symmetric matrix is a square matrix that is equal to its own transpose; it remains unchanged when transposed. In machine learning many functions are symmetrical, making the corresponding matrices symmetric as well. Symmetric matrices are useful in working with "covariance", the joint variability of two random variables in a data set; the covariance matrix of a data set is itself symmetric. In machine learning it is useful to measure the covariance of data points because it can lead to the identification of patterns in data sets.
Matrix Projection, which is the mapping of vectors onto subspaces. In statistics, the projection matrix (also called the influence or hat matrix) maps the vector of dependent variable values (observed responses) to the vector of predicted (fitted) values.
Eigenvalues & Eigenvectors. Eigenvectors are nonzero vectors that do not change direction when a linear transformation is applied; they are only scaled along their own characteristic direction, and the scaling factor is the eigenvalue. Eigen is a German word meaning own or characteristic.
Vector Spaces and Norms, which refers to using a function (called a "norm") that assigns to each vector its length within a vector space (also called a "linear space"), that is, a set of vectors that can be added together and multiplied by numbers called "scalars".
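To tie several of the items above together (PCA, the symmetric covariance matrix, and eigendecomposition), here is a minimal sketch in Python with NumPy. The data set is made up purely for illustration; it is not from Akinfaderin's article:

```python
import numpy as np

rng = np.random.default_rng(0)
# A made-up data set: 100 observations of 3 variables.
X = rng.normal(size=(100, 3))

# Center each variable, then form the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)        # 3x3 and symmetric (equal to its transpose)

# Eigendecomposition of the symmetric covariance matrix.
# eigh is designed for symmetric matrices and returns ascending eigenvalues.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Keep the 2 eigenvectors with the largest eigenvalues (the principal components).
top2 = eigenvectors[:, np.argsort(eigenvalues)[::-1][:2]]

# Project the data onto those components: 3 variables reduced to 2.
X_reduced = Xc @ top2
print(X_reduced.shape)                # (100, 2)
```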
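Singular value decomposition can be sketched in the same spirit. This is a hedged example assuming NumPy, with an arbitrary small matrix:

```python
import numpy as np

M = np.array([[ 3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# Factor M into three simpler matrices: M = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Multiplying the factors back together recovers the original matrix
# (up to rounding error).
M_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(M, M_rebuilt))      # True

# Dropping the smallest singular values gives a simpler, lower-rank
# approximation of the data -- the idea behind using SVD to reduce complexity.
```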
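LU decomposition, QR decomposition, and the idea of orthonormal columns can also be checked in a few lines. This sketch assumes NumPy and SciPy, neither of which is mentioned in the post:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

# LU decomposition: factor A into a permutation P, a lower triangular L,
# and an upper triangular U, so that A = P @ L @ U.
P, L, U = lu(A)
print(np.allclose(A, P @ L @ U))        # True

# QR decomposition: factor A into Q (orthonormal columns) and an
# upper triangular R, so that A = Q @ R.
Q, R = np.linalg.qr(A)
print(np.allclose(A, Q @ R))            # True

# Orthonormal columns: each column has length 1 and the columns are mutually
# orthogonal, so their dot products form the identity matrix.
print(np.allclose(Q.T @ Q, np.eye(2)))  # True
```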
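The projection (hat) matrix from statistics can be built with a few basic matrix operations (multiplication, transpose, inverse). This is a hedged sketch with made-up regression data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up regression data: 20 observations, an intercept plus 2 predictors.
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])
y = rng.normal(size=20)

# Hat matrix H = X (X^T X)^{-1} X^T maps the observed responses y
# to the fitted values y_hat: it projects y onto the column space of X.
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = H @ y

# The fitted values match those from ordinary least squares.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(y_hat, X @ beta))   # True
```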
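Finally, the claim that eigenvectors keep their direction, and the idea that a norm measures a vector's length, can both be verified directly. A small sketch assuming NumPy:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
v = eigenvectors[:, 0]           # an eigenvector of A
lam = eigenvalues[0]             # its eigenvalue

# Applying A does not change the eigenvector's direction;
# it only scales the vector by the eigenvalue.
print(np.allclose(A @ v, lam * v))    # True

# A norm assigns a length to a vector; the default here is the Euclidean
# length, the square root of the sum of the squared components.
u = np.array([3.0, 4.0])
print(np.linalg.norm(u))              # 5.0
```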