Semantic abstraction vs. end-to-end:
(“Semi-classical” and “mid-to-mid” defined here.)
Is it possible to combine these approaches?
The semantic abstraction pipeline uses only pixels that correspond to known categories. These include relevant categories (vehicles, pedestrians, cyclists, animals, roadways, bus stops, street lights, signs, stop lights, trees, etc.) and irrelevant categories (sky, clouds, airplanes, helicopters, some vegetation, rooflines, etc.).
Pixels that fall outside known categories may still correspond to an unknown but relevant category. An end-to-end pipeline, by contrast, is fed all pixels. In theory, this lets it learn implicit/tacit representations of relevant phenomena in the environment that human engineers would never think to label.
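To make the contrast concrete, here is a minimal sketch of how the two pipelines might be fed from the same segmented camera frame. All names, category lists, and the `split_pixels` helper are hypothetical illustrations, not taken from any real stack:

```python
import numpy as np

# Hypothetical category sets; a real system would have far more.
KNOWN_RELEVANT = {"vehicle", "pedestrian", "cyclist", "sign"}
KNOWN_IRRELEVANT = {"sky", "cloud", "roofline"}
KNOWN = KNOWN_RELEVANT | KNOWN_IRRELEVANT

def split_pixels(image, labels):
    """image: HxWx3 array; labels: HxW array of per-pixel category names."""
    known_mask = np.isin(labels, list(KNOWN))
    relevant_mask = np.isin(labels, list(KNOWN_RELEVANT))
    # Semantic abstraction pipeline: only pixels with known, relevant labels;
    # everything else is zeroed out.
    semantic_input = np.where(relevant_mask[..., None], image, 0)
    # End-to-end pipeline: every pixel, including unlabeled phenomena.
    end_to_end_input = image
    # Pixels outside all known categories: candidates for unknown-but-relevant
    # phenomena that only the end-to-end pipeline can pick up.
    unknown_mask = ~known_mask
    return semantic_input, end_to_end_input, unknown_mask
```

The point of the sketch is the asymmetry: the semantic pipeline discards anything it cannot name, while the end-to-end pipeline keeps the `unknown_mask` pixels and could, in principle, learn from them.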
To use both a semantic abstraction pipeline and an end-to-end pipeline, I think there would need to be some system for adjudicating disagreements between the two pipelines. Lex Fridman and colleagues at MIT experimented with a system that adjudicated between Hardware 1 Autopilot (a semi-classical pipeline, as far as I know) and an end-to-end network: