Seeing is not enough; a robot must remember. Object tracking is the bridge between static detection and dynamic understanding of the world.
1State Estimation and Kalman Filters
Tracking is essentially a State Estimation problem. We want to know the object's position and velocity at any given time. However, sensors are noisy. The Kalman Filter solves this by maintaining a 'Belief' about the object's state and updating it with every new measurement. It works in two steps: Predict (where should it be?) and Update (where did the sensor see it?). This recursive process allows for incredibly smooth and accurate tracking even when the sensor data is intermittent.
2Data Association and DeepSORT
When tracking multiple objects, the hardest challenge is Data Association: which new detection belongs to which existing track? Modern systems like DeepSORT use both geometric cues (where is the box?) and appearance cues (what does the object look like?) to make this decision. By creating a 'Feature Embedding' of the object's appearance, the system can re-identify a person even after they have been completely occluded for several seconds, which is critical for robots operating in crowded public spaces.
