Is there any way of doing computer vision in the cloud? The idea is that people log in to a website, the webcam is activated, and the video data is sent to the server over the internet. The server processes the data and sends the results back to the user in real time, or at 10 frames per second at least.
Is this doable? What kind of skills do we need on the network side? I know video streaming is one component. Also, how can we set up the server? Would a distributed system help, considering the very large amount of computation in a limited time?
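To make the question concrete, here is a minimal sketch of the loop I have in mind, in Python with OpenCV and requests. The server URL is a placeholder, and the server is assumed to return a processed JPEG:

```python
# Client-side sketch: grab webcam frames, POST them to a hypothetical
# processing endpoint, and display whatever comes back.
import cv2
import numpy as np
import requests

SERVER_URL = "http://example.com/process"  # placeholder endpoint

cap = cv2.VideoCapture(0)                  # open the default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # JPEG-compress the frame to keep the upload small
    _, jpeg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 80])
    resp = requests.post(SERVER_URL, data=jpeg.tobytes(),
                         headers={"Content-Type": "image/jpeg"})
    # Assumption: the server returns a processed JPEG of the same frame
    processed = cv2.imdecode(np.frombuffer(resp.content, np.uint8),
                             cv2.IMREAD_COLOR)
    cv2.imshow("processed", processed)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```

Note that at 10 fps the entire round trip (encode, upload, process, download, decode) has a budget of 100 ms per frame, so per-request HTTP overhead may already be too expensive and a persistent connection (e.g. a WebSocket) would probably be needed in practice.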
The different scale-space detection levels can run in parallel, and the database you compare your images against can be distributed across a number of servers.
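As a rough illustration of the per-level parallelism, here is a toy sketch that computes a Laplacian-of-Gaussian-style blob response at several scales in separate processes. This is illustrative only; a real SIFT/SURF pipeline shares work between adjacent levels, and the input filename and 0.8 threshold are arbitrary:

```python
# Run scale-space detection levels in parallel across processes.
from concurrent.futures import ProcessPoolExecutor

import cv2
import numpy as np

def detect_at_scale(args):
    gray, sigma = args
    # Smooth at this scale, then take the Laplacian as a blob response
    blurred = cv2.GaussianBlur(gray, (0, 0), sigma)
    response = cv2.Laplacian(blurred, cv2.CV_32F)
    # Keep only strong responses at this level (arbitrary cutoff)
    ys, xs = np.where(np.abs(response) > np.abs(response).max() * 0.8)
    return sigma, list(zip(xs.tolist(), ys.tolist()))

if __name__ == "__main__":
    gray = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file
    sigmas = [1.6 * (2 ** (i / 3)) for i in range(9)]     # ~3 octaves
    with ProcessPoolExecutor() as pool:
        tasks = [(gray, s) for s in sigmas]
        for sigma, points in pool.map(detect_at_scale, tasks):
            print(f"sigma={sigma:.2f}: {len(points)} candidate keypoints")
```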
As I understand it, you want to create a kind of augmented reality. I cannot give a clear yes or no on whether it can be done with current mobile CPU power and bandwidth.
I would start by implementing very rudimentary feature detection on the client side, then sending still pictures to the server (high resolution is key). The server can process the image with its much larger computing power, check the objects against the database, and send back the result.
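A hedged sketch of what that server side could look like, assuming Python/Flask with OpenCV's ORB features; the endpoint name and the precomputed descriptor database are assumptions, not an existing API:

```python
# Server-side sketch: receive a still picture, extract ORB descriptors,
# and match them against a hypothetical pre-built descriptor database.
import cv2
import numpy as np
from flask import Flask, request, jsonify

app = Flask(__name__)
orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

# Hypothetical database: label -> precomputed ORB descriptors,
# e.g. {"landmark_a": np.load("landmark_a.npy"), ...}
DATABASE = {}

@app.route("/process", methods=["POST"])
def process():
    img = cv2.imdecode(np.frombuffer(request.data, np.uint8),
                       cv2.IMREAD_GRAYSCALE)
    _, descriptors = orb.detectAndCompute(img, None)
    if descriptors is None:
        return jsonify({"label": None, "matches": 0})
    # Pick the database entry with the most descriptor matches
    best_label, best_count = None, 0
    for label, db_desc in DATABASE.items():
        matches = matcher.match(descriptors, db_desc)
        if len(matches) > best_count:
            best_label, best_count = label, len(matches)
    return jsonify({"label": best_label, "matches": best_count})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```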
The client can then combine its very basic feature detection with the server's response and in this way produce a real-time "labelled" video. The server only has to be called when the client detects that new image data is available (e.g. the user turns the phone in a different direction).
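One cheap way to sketch that trigger is a frame-difference check on the client, uploading a still only when the view has changed enough since the last upload. The 0.15 threshold is an arbitrary guess, and upload_still() is a hypothetical hook into the upload code:

```python
# Trigger sketch: upload a still only when the scene has changed.
import cv2
import numpy as np

def scene_changed(prev_gray, curr_gray, threshold=0.15):
    # Mean absolute pixel difference, normalized to [0, 1]
    diff = cv2.absdiff(prev_gray, curr_gray)
    return diff.mean() / 255.0 > threshold

cap = cv2.VideoCapture(0)
last_sent = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if last_sent is None or scene_changed(last_sent, gray):
        last_sent = gray
        # upload_still(frame)  # hypothetical call into the client code
        print("new view detected, would upload a still here")
    cv2.imshow("preview", frame)
    if cv2.waitKey(30) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```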