Trace a few input numbers through a fixed calculation to get one guess, such as estimating whether a plant needs water. You’ll use a forward pass for this one-way prediction step, before any learning happens.
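As a minimal sketch of that one-way step, the snippet below pushes two inputs through a fixed weighted sum. The input meanings, the numbers, and the "score above zero means water" rule are all assumptions made up for illustration.

```python
def forward_pass(inputs, weights, bias):
    """Return one score: each input times its weight, summed, plus the bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Hypothetical inputs: soil dryness (0 to 1) and days since the last watering.
inputs = [0.8, 3.0]
# Fixed numbers chosen by hand; nothing has been learned yet.
weights = [2.0, 0.5]
bias = -2.0

score = forward_pass(inputs, weights, bias)
needs_water = score > 0  # assumed decision rule for this example
print(score, needs_water)
```

Nothing in this calculation changes the weights or the bias; it only turns inputs into a guess.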
Use what you learned in the previous lesson to solve real-world problems.
Treat each weight as how strongly an input pushes the guess, and the bias as a baseline push that is always present. You’ll reason through how changing one of these numbers can raise or lower a prediction.
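To make the push of each number concrete, here is a small sketch with one input. The function name `predict` and all values are hypothetical.

```python
def predict(x, weight, bias):
    """One input, one weight, one bias."""
    return x * weight + bias

x = 2.0
base = predict(x, weight=1.0, bias=0.5)      # 2.5
stronger = predict(x, weight=1.5, bias=0.5)  # 3.5: a larger weight lets the input push harder
shifted = predict(x, weight=1.0, bias=-0.5)  # 1.5: a lower bias drops the baseline

# The bias is present even when the input is zero.
baseline_only = predict(0.0, weight=1.0, bias=0.5)  # 0.5
```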
Check what you understood with a short quiz.
Recognize that a network usually begins with small random weights, not useful knowledge. You’ll see why random starts help different neurons learn different roles as training changes the numbers.
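The sketch below contrasts an identical start with a small random start for two neurons. The helper names, the scale of 0.1, and the fixed seed are assumptions for illustration, not a recommended initialization scheme.

```python
import random

rng = random.Random(0)  # fixed seed so the example is repeatable

def init_weights(n_neurons, n_inputs, scale=0.1):
    """Draw small random weights around zero, one row per neuron."""
    return [[rng.gauss(0.0, scale) for _ in range(n_inputs)]
            for _ in range(n_neurons)]

def neuron_scores(inputs, weight_rows):
    """Weighted sum for each neuron (no bias, no activation)."""
    return [sum(x * w for x, w in zip(inputs, row)) for row in weight_rows]

inputs = [1.0, 2.0]
same_start = [[0.0, 0.0], [0.0, 0.0]]
random_start = init_weights(n_neurons=2, n_inputs=2)

# Identical weights give identical scores, so both neurons would receive
# the same updates and keep mirroring each other.
print(neuron_scores(inputs, same_start))
# Different weights give different scores, so the neurons can drift apart.
print(neuron_scores(inputs, random_start))
```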
Pass a neuron’s weighted score through an activation such as ReLU so the network can do more than one straight-line rule. You’ll compare a raw score with the value the neuron actually sends onward.
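A minimal sketch of that comparison, using made-up raw scores:

```python
def relu(score):
    """Pass positive scores through unchanged; replace negative ones with zero."""
    return max(0.0, score)

raw_scores = [-1.5, 0.0, 2.0]
sent_onward = [relu(s) for s in raw_scores]

for raw, sent in zip(raw_scores, sent_onward):
    print(f"raw score {raw:+.1f} -> value sent onward {sent:.1f}")
```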
Follow how several neurons feed their outputs into another layer, creating hidden layers between the input and final answer. You’ll see how simple pieces can combine into a richer pattern detector.
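One way to picture the stacking is the sketch below: two hidden neurons feed one output neuron. The weights are hand-picked, and applying ReLU to the output neuron as well is a simplifying assumption.

```python
def relu(value):
    return max(0.0, value)

def neuron(inputs, weights, bias):
    """Weighted sum plus bias, passed through ReLU."""
    return relu(sum(x * w for x, w in zip(inputs, weights)) + bias)

def layer(inputs, weight_rows, biases):
    """Run every neuron in the layer on the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

inputs = [1.0, 2.0]

# Hidden layer: two neurons, each looking at both inputs.
hidden = layer(inputs, weight_rows=[[1.0, -1.0], [0.5, 0.5]], biases=[0.0, 0.0])

# Output layer: one neuron that only sees the hidden layer's outputs.
output = layer(hidden, weight_rows=[[1.0, 2.0]], biases=[0.5])

print(hidden, output)
```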
Map the final layer to the kind of answer the task needs: a number, a yes/no probability with sigmoid, or class probabilities with softmax. You’ll practice reading the output without treating every prediction as the same type of value.
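The sketch below shows the two probability-style outputs side by side. The raw scores are invented; subtracting the largest score inside `softmax` is a common numerical-stability trick and does not change the result.

```python
import math

def sigmoid(score):
    """Squash one raw score into a yes/no probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))

def softmax(scores):
    """Turn several raw scores into class probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

yes_probability = sigmoid(0.0)              # a score of 0 means "undecided"
class_probabilities = softmax([2.0, 1.0, 0.1])

print(yes_probability)
print(class_probabilities)
```

A plain number output would skip both functions and use the raw score directly.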
Calculate a single penalty from a guess and the true answer, using mean squared error for number guesses or cross-entropy for class guesses. You’ll see why loss is more useful for training than just saying “right” or “wrong.”
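As a rough illustration, the sketch below scores one number guess and two class guesses. The function names and values are hypothetical.

```python
import math

def squared_error(guess, target):
    """Penalty for one number guess; averaging this over examples gives MSE."""
    return (guess - target) ** 2

def cross_entropy(probabilities, true_index):
    """Penalty for one class guess: low when the true class got high probability."""
    return -math.log(probabilities[true_index])

number_loss = squared_error(guess=3.0, target=5.0)

# Both class guesses are "right" (class 0 has the highest probability),
# but the more confident one earns a smaller loss.
confident_loss = cross_entropy([0.7, 0.2, 0.1], true_index=0)
hesitant_loss = cross_entropy([0.4, 0.3, 0.3], true_index=0)

print(number_loss, confident_loss, hesitant_loss)
```

The two class guesses show why loss carries more information than right or wrong: it still separates predictions that a simple correctness check would treat as equal.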
Average loss across several examples so the model does not overreact to one lucky or strange case. You’ll connect mini-batches to the idea of learning from a small sample of the training set at a time.
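A small sketch of both ideas, assuming squared error as the per-example loss and a hypothetical `mini_batches` helper:

```python
def mean_squared_error(guesses, targets):
    """Average the squared error over all examples in the batch."""
    return sum((g - t) ** 2 for g, t in zip(guesses, targets)) / len(guesses)

def mini_batches(examples, batch_size):
    """Yield consecutive slices of the training set, one small sample at a time."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]

# One strange example (guess 3.0, truth 6.0) is softened by the two good ones.
batch_loss = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 6.0])

batches = list(mini_batches([0, 1, 2, 3, 4], batch_size=2))
print(batch_loss, batches)
```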
Interpret a gradient as a signal that says, for one weight, whether increasing it would make the loss go up or down, and how steeply. You’ll reason about a knob change before actually updating the model.
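One way to read that signal without any calculus is to nudge the weight slightly in both directions and watch the loss. The sketch below does this for a made-up one-weight model; the loss function and values are assumptions.

```python
def loss(weight, x=2.0, target=6.0):
    """Squared error of a one-weight model whose guess is weight * x."""
    return (weight * x - target) ** 2

def numerical_gradient(f, weight, nudge=1e-6):
    """Estimate the slope of f at weight from two tiny nudges."""
    return (f(weight + nudge) - f(weight - nudge)) / (2 * nudge)

gradient = numerical_gradient(loss, 1.0)

# A negative gradient means raising the weight would lower the loss.
direction = "raise the weight" if gradient < 0 else "lower the weight"
print(gradient, direction)
```

No weight is changed here; the gradient only describes what a change would do.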
Follow the loss signal from the output layer back toward earlier layers with backpropagation. You’ll see how each layer gets credit or blame for its part in the final mistake.
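The sketch below walks that signal backward through the smallest possible network: one input, one hidden neuron with ReLU, and one output, each with a single weight and no bias. All names and numbers are assumptions chosen to keep the chain rule visible.

```python
x, target = 1.5, 2.0

def forward_loss(w1, w2):
    """Forward pass: input -> hidden (ReLU) -> output -> squared error."""
    hidden = max(0.0, w1 * x)
    guess = w2 * hidden
    return (guess - target) ** 2

def backward(w1, w2):
    """Pass the loss signal from the output back to each weight."""
    hidden_raw = w1 * x
    hidden = max(0.0, hidden_raw)
    guess = w2 * hidden

    d_guess = 2 * (guess - target)        # blame at the output
    grad_w2 = d_guess * hidden            # output weight's share
    d_hidden = d_guess * w2               # blame passed back to the hidden neuron
    d_hidden_raw = d_hidden * (1.0 if hidden_raw > 0 else 0.0)  # ReLU gate
    grad_w1 = d_hidden_raw * x            # hidden weight's share
    return grad_w1, grad_w2

grad_w1, grad_w2 = backward(w1=0.8, w2=0.5)
print(grad_w1, grad_w2)
```

The earlier weight's gradient is built from the later layer's signal, which is why the calculation runs from the output toward the input.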
Update weights by moving a small amount against the gradient, controlled by the learning rate. You’ll compare plain stochastic gradient descent (SGD) with Adam, which adapts step sizes using recent gradient history.
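The sketch below applies one update of each kind to a single weight. The Adam version follows the commonly published update rule with bias correction; the function names, the `state` dictionary, and the learning rate of 0.1 are assumptions for illustration rather than any library's API.

```python
import math

def sgd_step(weight, grad, lr):
    """Plain SGD: step against the gradient, scaled by the learning rate."""
    return weight - lr * grad

def adam_step(weight, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: scale the step using running averages of recent gradients."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad          # average gradient
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad   # average squared gradient
    m_hat = state["m"] / (1 - beta1 ** state["t"])                # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return weight - lr * m_hat / (math.sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
w_sgd = sgd_step(1.0, grad=-16.0, lr=0.1)
w_adam = adam_step(1.0, grad=-16.0, state=state, lr=0.1)
print(w_sgd, w_adam)
```

With the same large gradient, the SGD step grows with the gradient's size, while Adam's first step is close to the learning rate itself.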
Classify weights and biases as learned parameters, while choices like learning rate, layer count, batch size, and epochs are hyperparameters you set. You’ll keep “the model learned it” separate from “the trainer chose it.”
Use iteration for one mini-batch update and epoch for one full pass through the training data. You’ll reason why many repeated updates usually improve predictions more safely than one giant change.
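To see the two counters in action, the sketch below trains a one-weight model on four made-up examples where the true rule is y = 3x. The `train` function, the data, and the settings are all assumptions.

```python
def train(xs, ys, batch_size, epochs, lr):
    """Mini-batch gradient descent on a model whose guess is weight * x."""
    weight = 0.0
    iterations = 0
    for epoch in range(epochs):                      # one epoch = one full pass
        for start in range(0, len(xs), batch_size):  # one iteration = one mini-batch update
            batch_x = xs[start:start + batch_size]
            batch_y = ys[start:start + batch_size]
            grad = sum(2 * (weight * x - y) * x
                       for x, y in zip(batch_x, batch_y)) / len(batch_x)
            weight -= lr * grad
            iterations += 1
    return weight, iterations

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
weight, iterations = train(xs, ys, batch_size=2, epochs=50, lr=0.01)
print(weight, iterations)
```

Four examples with a batch size of 2 give 2 iterations per epoch, so 50 epochs produce 100 small updates.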
Compare loss on training examples with loss on held-out validation examples. You’ll recognize underfitting when both stay poor and overfitting when training improves but new-example performance gets worse.
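The comparison can be sketched as a rough rule of thumb over two loss histories. The `diagnose` function, its thresholds, and the loss curves below are invented for illustration; real diagnosis usually involves looking at the curves rather than applying fixed cutoffs.

```python
def diagnose(train_losses, val_losses, poor_threshold=1.0, tolerance=0.05):
    """Label a training run from its per-epoch training and validation losses."""
    final_train, final_val = train_losses[-1], val_losses[-1]

    # Both losses stay poor: the model has not captured the pattern.
    if final_train > poor_threshold and final_val > poor_threshold:
        return "underfitting"

    # Training keeps improving while validation has climbed back up from its best.
    train_improved = final_train < train_losses[0]
    val_worsened = final_val > min(val_losses) + tolerance
    if train_improved and val_worsened:
        return "overfitting"

    return "no clear problem"

print(diagnose([2.0, 1.8, 1.7], [2.1, 1.9, 1.8]))
print(diagnose([1.0, 0.5, 0.2, 0.1], [1.1, 0.7, 0.8, 1.0]))
print(diagnose([1.0, 0.5, 0.3], [1.1, 0.6, 0.4]))
```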
Review this chapter with practice based on your mistakes.