What is a sigmoid function?
The sigmoid function is a mathematical function that takes any real value and maps it to a value between 0 and 1, producing a curve shaped like the letter “S”.
The sigmoid function is also known as a logistic function.
Y = 1 / (1 + e^(-z))
As z approaches positive infinity, the predicted value of Y approaches 1; as z approaches negative infinity, the predicted value of Y approaches 0.
If the output of the sigmoid function is greater than 0.5, you classify the example as class 1, the positive class; if it is less than 0.5, you classify it as class 0, the negative class.
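To make this concrete, here is a minimal NumPy sketch; the function and variable names are our own, not from the text:

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Raw scores: large positive z maps near 1, large negative z near 0
z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
probs = sigmoid(z)
print(probs)   # [~0.00005  0.26894  0.5  0.73106  ~0.99995]

# Threshold at 0.5 to obtain class labels
labels = (probs > 0.5).astype(int)
print(labels)  # [0 0 0 1 1]
```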
The sigmoid function commonly performs the role of an activation function in machine learning, where it is used to add non-linearity to a model.
In essence, the activation function determines which values a neuron passes on as output. Many activation functions are used in machine learning and deep learning; this article covers seven sigmoid-type functions.
What are the types of sigmoid functions?
There are several types of sigmoid functions available. Here are seven of the most common.
1. Logistic Sigmoid Function
The logistic sigmoid function is normally referred to as the sigmoid function in the world of machine learning. The logistic sigmoid function can take any real-valued input and outputs a value between zero and one. This is how the logistic sigmoid function is mathematically defined:
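σ(x) = 1 / (1 + e^(-x))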
2. Hyperbolic Tangent Function
The hyperbolic tangent function is another commonly used sigmoid function. This function maps any real-valued input to the range between -1 and 1. Here is the mathematical definition of the hyperbolic tangent function:
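tanh(x) = (e^x − e^(-x)) / (e^x + e^(-x))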
3. Arctangent Function
This is yet another type of sigmoid function. The arctangent function is essentially the inverse of the tangent function. This function maps any real-valued input to the range −π/2 to π/2. This is the mathematical definition of the arctangent function:
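f(x) = arctan(x)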
4. Gudermannian Function
This type of sigmoid function is related to the hyperbolic tangent function (tanh) in the same way that the arctangent function is related to the tangent function.
This function is generally applied in signal processing, mathematical physics & communication theory. This is the mathematical definition of the Gudermannian function:
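gd(x) = 2 · arctan(tanh(x / 2))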
5. Error Function
The error function (Gauss error function) is used in probability theory and statistics for describing the probability distribution of a Gaussian random variable.
This function delivers better performance in pattern recognition & classification algorithms. This is the mathematical definition of the error function:
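erf(x) = (2 / √π) ∫₀ˣ e^(−t²) dt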
6. Smoothstep Function
This function is most commonly used in computer graphics, animation, and a couple of other areas of computer science.
This function helps to create smooth transitions between colours, textures, & other visual elements in computer graphics. This is the mathematical definition of the Smoothstep function:
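smoothstep(x) = 3x² − 2x³ for 0 ≤ x ≤ 1, with smoothstep(x) = 0 for x < 0 and smoothstep(x) = 1 for x > 1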
7. Generalised Logistic Function
This type of sigmoid function helps to model the growth of population, spread of diseases, and other applications in biology, ecology, and economics.
A practical example of this function is in marketing, to model the growth of sales over a certain period, or in economics, to chart the adoption of new technologies. This is the mathematical definition of the generalised logistic function:
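One common parameterisation (the Richards curve) is Y(t) = A + (K − A) / (1 + Q · e^(−B·t))^(1/ν), where A is the lower asymptote, K is the upper asymptote, B is the growth rate, Q sets the initial value, and ν > 0 controls where the fastest growth occurs.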
What are the applications of the sigmoid function?
- Artificial neural networks - as an activation function for neurons
- Logistic regression - for modelling the probability of a binary outcome (see the sketch after this list)
- Image processing - to adjust the intensity values for enhancing the contrast between the dark & light regions
- Economics & Finance - for representing the rate of new technology adoption by consumers
- Biological systems - to represent the rate of change of system activation over time
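As an illustration of the logistic regression use case above, here is a small sketch; the weights and bias are made-up values standing in for learned parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters standing in for a trained logistic regression model
weights = np.array([0.8, -0.4])
bias = 0.1

def predict_proba(x):
    """Probability of the positive class, modelled as sigmoid(w·x + b)."""
    return sigmoid(np.dot(weights, x) + bias)

x = np.array([2.0, 1.0])
print(predict_proba(x))  # ~0.786: estimated probability that x is class 1
```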
What is the history of the sigmoid function?
1798 - An Essay on the Principle of Population was published by the English cleric and economist Thomas Robert Malthus. He asserted that population increases in a geometric progression (doubling every 25 years) while the food supply increases only arithmetically, and claimed that this mismatch would cause widespread famine.
1830 - Pierre François Verhulst, a Belgian mathematician, wanted to account for the fact that a population's growth is ultimately self-limiting: it does not increase exponentially forever. To model the slowing of growth that occurs when a population begins to exhaust its resources, Verhulst chose the logistic function as a natural adjustment to the simple exponential model.
1943 - Warren McCulloch and Walter Pitts developed an artificial neural network model using a hard cutoff as an activation function. In this model, a neuron generates an output of 1 or 0 depending on whether its input is above or below a threshold.
1972 - The biologists Hugh Wilson and Jack Cowan at the University of Chicago were trying to model biological neurons computationally and ended up publishing the Wilson–Cowan model, in which a neuron sends a signal to another neuron if it receives a signal greater than an activation potential. Wilson and Cowan employed the logistic sigmoid function to model the activation of a neuron as a function of a stimulus.
1998 - Yann LeCun selected the hyperbolic tangent as an activation function in his groundbreaking convolutional neural network LeNet, which was the first CNN to have the ability to recognize handwritten digits to a practical level of accuracy.
In recent years, ANNs have shifted away from sigmoid functions towards the ReLU function: the variants of the sigmoid function are computationally expensive to calculate, while ReLU supplies the nonlinearity needed to exploit the depth of the network and is very fast to compute.
What is sigmoid in deep learning?
The sigmoid neuron is a basic building block of deep neural networks. Sigmoid neurons are similar to perceptrons, but slightly modified so that the output of a sigmoid neuron is far smoother than the step-function output of a perceptron.
In a sigmoid neuron, a small change in the input causes only a small change in the output, unlike the abrupt jump in the output of a perceptron.
The inputs to a sigmoid neuron can be real numbers, unlike the boolean inputs of the McCulloch–Pitts neuron, and the output is also a real number between 0 and 1. In a sigmoid neuron, you are modelling the relationship between X and Y in terms of a probability. Even though the output lies between 0 and 1, you can still use the sigmoid function for binary classification tasks by choosing a threshold.
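To make the contrast with the perceptron concrete, here is a minimal sketch; the weights, bias, and inputs are illustrative values of our own choosing:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, b):
    """Step activation: the output jumps from 0 to 1 at the threshold."""
    return 1 if np.dot(w, x) + b > 0 else 0

def sigmoid_neuron(x, w, b):
    """Smooth activation: the output changes gradually with the input."""
    return sigmoid(np.dot(w, x) + b)

w, b = np.array([1.0, 1.0]), -1.0

# A small change in the input barely moves the sigmoid neuron's output,
# while the perceptron's output flips from 0 to 1.
for x in (np.array([0.49, 0.49]), np.array([0.51, 0.51])):
    print(perceptron(x, w, b), round(sigmoid_neuron(x, w, b), 3))
# prints: 0 0.495
#         1 0.505
```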