One of the jobs a #securityrobot needs to do (and do very well) is “see” what happens around it. The robot needs to understand what is happening so it can let humans know when something out of the ordinary is taking place. Vision is one of the most important features of Knightscope Autonomous Data Machines (ADMs), and it is also one of the top innovations in the security space in recent years. Knightscope has led its implementation on fully autonomous security robots that now patrol indoors and outdoors nationwide.
How does a security robot see?
The answer is simple to state but not at all easy to solve: Artificial Intelligence (AI). AI is defined as the ability of a system to perform tasks that normally require human intelligence. With AI, we try to simulate how the human brain works: nodes (neurons) and synapses that connect those neurons to each other. The system is built from layer upon layer of nodes (a neural network, one form of Machine Learning), and because we use a large number of layers, the approach is classified as “Deep Learning”.
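To make that concrete, here is a toy sketch in Python (using NumPy) of what “layer upon layer of nodes” means. This is not Knightscope’s actual model, just an illustration of the structure: each layer is a set of nodes, and the weight matrices play the role of the synapses connecting them.

```python
import numpy as np

def relu(x):
    # Each node "fires" only when its weighted input is positive,
    # loosely mimicking a biological neuron.
    return np.maximum(0, x)

def layer(inputs, weights, biases):
    # The weight matrix plays the role of the synapses connecting
    # one layer of nodes to the next.
    return relu(inputs @ weights + biases)

rng = np.random.default_rng(0)

# A toy "deep" network: 4 inputs -> 8 nodes -> 8 nodes -> 1 output.
# Real perception networks have millions of weights; the structure
# is the same, just repeated many more times.
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
w3, b3 = rng.normal(size=(8, 1)), np.zeros(1)

x = rng.normal(size=(1, 4))          # one input example
h = layer(layer(x, w1, b1), w2, b2)  # hidden layers: "layer upon layer"
score = h @ w3 + b3                  # final output node
print(score)
```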
There are many examples of Machine Learning in use today: IoT sensor monitoring, GPS-enabled navigation, operational intelligence, listening systems, and more. For this blog, we will focus on visual perception.
So this is how it works: let’s say we want to determine if there is a person at a commercial property at night, when no one is supposed to be there between 10pm and 5am. The algorithm first needs to be able to tell what a person looks like. It learns by being shown examples of what needs to be learned, both positive (a person is present) and negative (no person). We pre-label images of people and feed them to the system so it can recognize (or, more accurately, “classify”) objects in our video and images.
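As an illustration, a labeled dataset for this task can be as simple as pairing each image with a positive or negative tag. The file paths and examples below are hypothetical, purely to show the idea:

```python
from dataclasses import dataclass

@dataclass
class LabeledImage:
    path: str   # where the frame is stored
    label: int  # 1 = person present (positive), 0 = no person (negative)

# Hypothetical positive and negative training examples.
dataset = [
    LabeledImage("frames/night_patrol_0001.jpg", 1),  # person walking through the lot
    LabeledImage("frames/night_patrol_0002.jpg", 0),  # empty parking lot
    LabeledImage("frames/night_patrol_0003.jpg", 0),  # shadow from a light pole, not a person
]

positives = [ex for ex in dataset if ex.label == 1]
negatives = [ex for ex in dataset if ex.label == 0]
print(f"{len(positives)} positive, {len(negatives)} negative examples")
```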
As you might have already imagined, these algorithms need lots of examples, captured under our own specific conditions. Here at Knightscope we have labeled well over 1 million images to get our algorithms to perform as well as they do today and to detect all of the different types of objects we need to detect. That is a lot of images! At one point we even built an in-house mobile app to help speed up the labeling process.
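For detecting objects in a frame, each labeled image typically records not just what is present but where. A labeling tool might output something like the following record; the field names and values here are hypothetical, just to show the shape of the data:

```python
import json

# One hypothetical annotation record: which file, what object class,
# and where in the frame the object appears (a bounding box in pixels).
annotation = {
    "image": "frames/night_patrol_0001.jpg",
    "objects": [
        {"class": "person", "bbox": {"x": 412, "y": 188, "w": 64, "h": 170}},
    ],
}
print(json.dumps(annotation, indent=2))
```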
We train the algorithm over and over until it reaches an accuracy acceptable for deployment at a client’s site. Sounds simple, right? But this type of work raises plenty of issues, especially since the camera is moving: blurred frames, dark shadows, distance, weather conditions, time of day, and so on. All of that needs to be eliminated or accounted for so that the user sees only crisp images of the person detected. Once we are live, detecting a person in an image happens at real-time speed.
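One common way to account for conditions like blur and darkness (though not necessarily how Knightscope does it) is data augmentation: generating harder variants of each training image so the model learns to handle them too. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(7)

def darken(img, factor=0.4):
    # Simulate a poorly lit scene by scaling pixel intensity down.
    return img * factor

def blur(img, k=3):
    # Cheap box blur: average each pixel with its horizontal neighbors,
    # roughly simulating motion blur from a moving camera.
    kernel = np.ones(k) / k
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, img
    )

# A stand-in 32x32 grayscale "frame" with values in [0, 1].
frame = rng.random((32, 32))

# Each training image is expanded into several harder variants so the
# model also learns from dark, blurry, or noisy views of the same object.
augmented = [
    frame,
    darken(frame),
    blur(frame),
    frame + rng.normal(0, 0.05, frame.shape),  # sensor noise
]
print(len(augmented), "training variants from one frame")
```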
Then the person roaming around an unauthorized area after hours is caught!
“It is 2:10am and you are trespassing. The authorities have been notified.” Humans working with machines… imagine that!