Deep Vision's core capability lies in the broad domain of sensor exploitation and machine perception. Utilising Deep Vision's technology, a machine equipped with one or more sensors is able not only to extract objects of interest from within the data but also, ultimately, to represent the essence of those objects in a meaningful, reproducible, and irreducible form. Through a series of foundational philosophical insights and mathematical discoveries, Deep Vision has developed the Language of Form.
In this language, objects found within the sensor data are described by a string, or sequence, of rich qualitative meaning. These strings are invariant under all motion transformations which may act on the objects they represent. For instance, every unique object is always described by the same unique linguistic expression, regardless of its size, its position in the sensor's view, and its orientation in three-dimensional space. This is the rotation, translation, and scale invariance often sought after but never achieved by traditional image processing methods.
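To make the invariance property concrete, here is a minimal sketch in Python. It is not Deep Vision's method and says nothing about how the Language of Form is built. It uses a generic, well-known descriptor (sorted pairwise distances divided by the largest distance) purely to show what it means for a description to stay the same when an object is rotated, scaled, and translated. The function names `invariant_signature` and `transform`, and the sample points, are hypothetical. This simple descriptor does not attempt the uniqueness property described above.

```python
import math
from itertools import combinations


def invariant_signature(points, digits=6):
    """Sorted pairwise distances, normalised by the largest distance.

    Rotation and translation leave pairwise distances unchanged;
    dividing by the largest distance removes uniform scale.
    """
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    largest = dists[-1]
    # Rounding absorbs floating-point noise introduced by the transform.
    return tuple(round(d / largest, digits) for d in dists)


def transform(points, angle, scale, shift):
    """Rotate about the z-axis, scale uniformly, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    moved = []
    for x, y, z in points:
        rx, ry, rz = c * x - s * y, s * x + c * y, z
        moved.append((scale * rx + shift[0],
                      scale * ry + shift[1],
                      scale * rz + shift[2]))
    return moved


# A hypothetical object given as four points in 3D (illustrative data only).
shape = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
moved = transform(shape, angle=0.7, scale=2.5, shift=(10.0, -4.0, 1.5))

# The two descriptions agree even though every coordinate has changed.
same = invariant_signature(shape) == invariant_signature(moved)
```

The comparison at the end is the point of the sketch: the raw coordinates of `shape` and `moved` differ, yet their signatures match.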
The Language of Form rapidly transcribes machine-meaningless sensor data into machine-meaningful descriptive expressions of sensor content (approximately 150 frames per second for 512 x 512 frames). These expressions can be used to identify all particular things of interest and to relate their movements through time and space. The mathematical theory leading up to the Language of Form, and proving its existence, is built entirely upon foundational principles of mathematics and is very complex and rich. In computational terms, however, the implementation of the theory is trivial.
A direct corollary of the mathematics is the extended capability for Deep Vision to extract the exact physical quantities of, and relationships between, the objects themselves. At the primary level, these physical parameters include the complete three-dimensional position of an object relative to the sensor. This position can be provided in real-world units, and has an accuracy and range limited only by the hardware on which it depends (e.g. camera and lens, sensor apparatus); an upper limit on the error is currently estimated at +/- 2 cm. The system is entirely passive, as it depends on the data alone, and can be used in any situation, independent of any prior knowledge of the environment.
Deep Vision's technology employs no conventional image processing methods (e.g. neural networks, curve fitting, template matching, colour segmentation, filtering, etc.). Instead, Deep Vision's technology interprets the sensor data in such a way as to produce a meaningful description of its salient content. Simply put, the technology reads the data, rather than searching it for specific items which may, or may not, be present. This is a fresh approach in a field that has remained stagnant for over 40 years.
Technical Facts
- Translates sensor data into a semantically rich, formal (symbolic) representation
- Extremely fast - 60 frames per second conservatively
- Easily customisable for a specific application
- Embedded into a custom, closed module for easy interoperability with existing systems
Deep Vision's technology is available in a closed, custom-designed module. Typically, the custom solution is integrated directly into the client's system, providing it with high-end sensor exploitation and machine perception capabilities.



