Automated surveillance from a fixed camera works extremely well (VSAM). Recognition of certain object classes, like face or pedestrian detection, works extremely well (Viola & Jones). Stereo depth extraction works extremely well, with some systems using amazing & cheap custom ASICs (Tyzx). 2D object instance recognition works extremely well (SIFT), and combined with stereo imagery it can power SLAM. Object tracking works extremely well (Collins). Traversable road classification, car detection, road boundary & lane detection, and stereo/SFM will power the first semi-autonomous cars. All that tech works really well today -- the remaining task is system integration.
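To make the SIFT point concrete: the core of 2D instance recognition is matching local descriptors between a model image and a scene with Lowe's ratio test. Here's a minimal numpy sketch of that matching step (the function name and the 0.8 ratio threshold are illustrative, though 0.8 is close to the value Lowe suggested):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Lowe-style ratio-test matching between two sets of SIFT-like
    descriptors (one row per feature vector). A match is kept only
    when the nearest neighbor is clearly better than the second
    nearest, which rejects ambiguous correspondences."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches

# Two toy 2D "descriptors" matched against three candidates:
a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[1.0, 0.05], [0.0, 1.0], [10.0, 10.0]])
print(match_descriptors(a, b))  # [(0, 0), (1, 1)]
```

The surviving matches would then feed a geometric verification step (e.g. RANSAC on a homography) to confirm the object instance.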
Also, vision isn't just visible light cameras. ASC is making a flash lidar that would make you drool. The "Swiss Ranger" is also good. Both can provide snap-shot volumetric 3D data combined with color and texture information, with no moving parts. They'll scale much better than scanning lidars and even multi-laser systems like Velodyne.
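What these snapshot sensors hand you is essentially a range image, and turning that into a point cloud is one back-projection per pixel. A minimal sketch, assuming a standard pinhole model (the intrinsics fx, fy, cx, cy here are illustrative placeholders, not values for any specific sensor):

```python
import numpy as np

def range_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into an N x 3 point cloud
    using a pinhole camera model -- the kind of snapshot volumetric
    data a flash lidar or ToF camera delivers in a single frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# A flat 2x2 depth image one meter away, with toy intrinsics:
pts = range_to_points(np.ones((2, 2)), fx=1.0, fy=1.0, cx=0.0, cy=0.0)
print(pts.shape)  # (4, 3)
```

Because each range pixel lines up with a color pixel, the same indexing attaches color and texture to every 3D point for free.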
Add 3D processing like Spin Images to the mix, and integrated vision systems will get very, very powerful, very soon.
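A spin image (Johnson & Hebert's representation) describes the cloud around an oriented point (p, n) by histogramming each neighbor's radial distance from the normal line (alpha) against its signed height along the normal (beta). A minimal numpy sketch, with illustrative bin sizes:

```python
import numpy as np

def spin_image(points, p, n, bin_size=0.1, bins=8):
    """Accumulate a spin image for the oriented point (p, n).
    Each neighbor x maps to cylindrical coordinates around the
    surface normal: beta = height along n, alpha = distance from
    the normal line. Rotation about n is integrated out, which is
    what makes the descriptor pose-invariant."""
    d = points - p
    beta = d @ n
    alpha = np.sqrt(np.maximum(np.sum(d * d, axis=1) - beta**2, 0.0))
    img = np.zeros((bins, bins))
    # Shift beta so the histogram covers both sides of the tangent plane.
    i = np.floor(beta / bin_size + bins / 2).astype(int)
    j = np.floor(alpha / bin_size).astype(int)
    ok = (i >= 0) & (i < bins) & (j >= 0) & (j < bins)
    np.add.at(img, (i[ok], j[ok]), 1)
    return img

# Three neighbors around a point whose normal is +z:
cloud = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.0, 0.1]])
img = spin_image(cloud, p=np.zeros(3), n=np.array([0.0, 0.0, 1.0]))
print(int(img.sum()))  # 3
```

Matching spin images between a scene cloud and a model library is then an image-correlation problem, which is exactly why it combines so naturally with the 2D recognition machinery above.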
I like AnyBot's approach, but I think it is incorrect to assume "AI" won't be good enough and tele-operation will be king. If I were a pure-software robotics startup, I would focus on vision & navigation software, large/networked system integration/configuration/scalability utilities (like I'm sure you're building), and behavior systems.
For a text, I would recommend reading the winning papers from the major vision/machine learning/robotics conferences, and the papers they reference. Grad students at CMU's Robotics Institute use Forsyth & Ponce's "Computer Vision: A Modern Approach".
I would also recommend Mitchell's "Machine Learning", Strang's "Linear Algebra and Its Applications", and throw in Thrun's new "Probabilistic Robotics" for fun.