

Columbus is an iPhone application that should allow its user to remotely control Colby's robot fleet. Part of the larger Robotic Avatars project, Columbus aims to be a groundbreaking application that allows at least partial immersion of the user in the remote environment by providing video and haptic feedback from the location and by letting the user control the robot with natural gestures. Columbus is part of an Independent Study during JanPlan 2010 under Dr. Bruce Maxwell's supervision. This wiki entry documents the progress of the project.
Make sure to check Various (useful) notes for some random but useful hints I came across while working on Columbus.

Stages of the project

Drawing Board (05/01/10 - 11/01/10)


At this stage, documentation of gestures, feedback loops, and possible problems and solutions will be created to serve as a foundation and guidelines for the project. This is by no means a list of features that will make it into the final application; rather, it records the hopes and plans for the whole thing.
During the Drawing Board, IPC should be completely debugged and made to run on iPhone and Mac OS X.


IPC was compiled to work on iPhone through external ARM libraries.
Other than that, it seems that a fairly comfortable and usable UI design was created, one that should be easy to use and comfortable to program. The dictionary of gestures and actions seems extensive enough to serve as a whitepaper for programming something more complicated.

Radio Controller (11/01/10 - 28/01/10)

At this stage, a first application will be created to take in user actions and prepare a stream of data that could usefully be sent to the robot and/or represented in space. Namely, at the end of the Radio Controller stage, there should exist an application that lets the user perform a full set of movements and have the entire list of commands for the robot available. The application would act similarly to the controller of an RC car or a gamepad.


An application that controls turning by moving the phone like a steering wheel (and velocity by using an on-screen throttle) was successfully created.
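The steering-wheel idea can be sketched roughly as follows. This is only an illustration in Python, not the code of the actual app: the axis convention (level wheel reads ax = 0, ay = -1), the sign of the turn, and the constants MAX_TURN_RATE, MAX_SPEED and DEAD_ZONE_DEG are all assumptions made for the example.

```python
import math

# Hypothetical limits and tuning values, chosen only for illustration.
MAX_TURN_RATE = 1.0   # rad/s at a full 90 degree wheel turn
MAX_SPEED = 0.5       # m/s at full throttle
DEAD_ZONE_DEG = 5.0   # small tilts are ignored so a resting hand drives straight


def steering_angle_deg(ax, ay):
    """Wheel angle in degrees from the gravity vector seen by the accelerometer.

    Assumed convention: with the wheel level the sensor reads ax=0, ay=-1 (in g),
    and turning the phone a quarter turn gives ax=1, ay=0 (angle = +90).
    """
    return math.degrees(math.atan2(ax, -ay))


def drive_command(ax, ay, throttle):
    """Map accelerometer readings and a 0..1 throttle to (speed, turn_rate)."""
    angle = steering_angle_deg(ax, ay)
    if abs(angle) < DEAD_ZONE_DEG:
        angle = 0.0
    # Clamp to a quarter turn either way, then scale linearly.
    angle = max(-90.0, min(90.0, angle))
    turn = MAX_TURN_RATE * angle / 90.0
    speed = MAX_SPEED * max(0.0, min(1.0, throttle))
    return speed, turn
```

Using only the direction of gravity (rather than integrating acceleration) keeps the control closer to an RC-car wheel than to a 1:1 mapping, which matches the design described in the gesture dictionary.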

Video Feed (21/01/10 - 28/01/10)

At this stage, a second application will be created to receive the video feed from the robot. This stage comes after the radio controller, since it will require complete cooperation of IPC and GCM on both sides, as well as (most probably) numerous performance optimizations on the iPhone side to properly handle large amounts of incoming data at very fast rates. At the end of this stage there should exist an application that would ideally be able to stream video directly from the robot.


An application displaying the incoming images was created. Later it was put in a loop using NSTimer to periodically download the image and display the result as a stream (thus not exactly following the SVM guidelines, but it works fairly well).
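The timer-driven polling can be sketched like this. It is a Python stand-in for the NSTimer loop, not the app's code; the fetch and show callbacks, the interval and the duplicate check are all hypothetical.

```python
import time


def poll_frames(fetch, show, interval_s, ticks, sleep=time.sleep):
    """Call fetch() once per tick and show each new frame.

    fetch() returns the latest image bytes, or None if the download failed.
    A frame identical to the previous one is skipped, so a slow robot-side
    camera does not cause redundant redraws. Returns the number of frames shown.
    """
    shown = 0
    last = None
    for _ in range(ticks):
        frame = fetch()
        if frame is not None and frame != last:
            show(frame)
            last = frame
            shown += 1
        sleep(interval_s)
    return shown
```

With an interval of 0.1 s this caps the stream at about 10 frames per second, and each tick pays the full cost of one download, which is the main trade-off of polling compared with a pushed video stream.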

Proof-Of-Concept (28/01/10 - ongoing)

At the final stage, the video feed and radio controller should be merged into one application that should, ideally, allow control of the robot with some haptic and visual feedback. Ideally, this means a fullscreen video feed managed by hand, with an eye-candy interface that, with some touch-ups later in the year, could be submitted to the Apple Design Awards competition.


A full, working application was created. It is not yet fully calibrated, but there is a working user interface (both tactile and on-screen) that works really well. The app might not be too fancy, but again, it is just a stepping stone for something further.

Problems in question

Delay in interaction


As the transmission will occur over a TCP/IP network, possibly over very long distances, there might be a substantial delay between the user's action and the visual feedback on the screen.

Possible solution

Humans are incredibly good at accommodating delays, which in IT is best shown by computer games, where the delay between the user's action and the on-screen response sometimes reaches up to a second. However, the bigger problem is that a delay in actions will cause a delay in responding to on-screen threats to the robot. Therefore, an array of sensors has to be employed to ensure that we stop in the right places (i.e. before a wall or a person). Perhaps velocity space could be a good solution. It would also be wise to add a dead-man switch (for any action, a finger needs to be pressed on the screen to send data), so that if the user disconnects we are notified of it immediately and can handle it before we drive into a wall.
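A dead-man switch of this kind could look roughly like the sketch below. It is a minimal Python illustration under stated assumptions: the class name, the 0.3 s timeout and the gate helper are hypothetical, and timestamps are passed in explicitly so the logic is easy to check.

```python
class DeadManSwitch:
    """Commands pass through only while a recent touch is on record."""

    def __init__(self, timeout_s=0.3):
        self.timeout_s = timeout_s   # hypothetical value, to be tuned
        self._last_touch = None

    def touch(self, now):
        """Record that the finger is (still) on the screen at time `now`."""
        self._last_touch = now

    def release(self):
        """Finger lifted: stop immediately, without waiting for the timeout."""
        self._last_touch = None

    def is_live(self, now):
        if self._last_touch is None:
            return False
        return (now - self._last_touch) <= self.timeout_s


def gate(switch, speed, turn, now):
    """Return the requested command if the switch is live, else a full stop."""
    if switch.is_live(now):
        return speed, turn
    return 0.0, 0.0
```

The same gate could also run on the robot side, with touch() called whenever a command packet arrives; that way a dropped connection stops the robot even though the phone can no longer tell it to.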


The delays have not been solved yet. There is a slight delay in interaction, which one can accommodate fairly well when driving the robot in person, but not necessarily over a TCP/IP connection (delay in video, delay in gestures).

Dictionary of gestures


There is no common dictionary of gestures for such applications, so we need to devise one from the ground up, covering every gesture that might be useful. Creating such a dictionary also shapes the further programming and the possible actions.


There should be four types of gestures:

  • Angular Movement gestures - a set of gestures responsible for rotating the robot.
  • Linear Movement gestures - a set of gestures responsible for moving the robot forward.
  • Camera Control gestures - a set of gestures responsible for controlling the camera on the robot.
  • Misc gestures - a set of gestures that can represent additional features or actions.

These would be broken down as follows:

  • Angular Movement gestures
    Will definitely rely on the accelerometers built into the iPhone, as we can calculate an accurate enough angle of rotation from the angle of rotation of the iPhone (but not necessarily in space - in this case I presume rotation could be similar to the controls on an RC car - we just add some velocity, and it's larger if we move more). It's also one of the biggest advantages of the platform (i.e. how easy it is to tap into such data).
    • Rotate to the left - rotating will cause the robot itself to rotate left (i.e. add the rotational velocity to the velocity already in effect)
    • Rotate to the right - rotating will cause the robot itself to rotate right
    • Following a rotation with a rotation in the opposite direction - will decrease the rotational velocity by the appropriate value or bring the robot to a standstill; if the counter-rotation exceeds the original value, the robot will rotate in the other direction
  • Linear Movement gestures
    I am not certain of the accelerometers' accuracy regarding translation of movement into real-life units (at least, without prior calibration). Therefore, while the idea that walking forward would register enough motion to make the robot move one meter forward is appealing (and might be useful if we tap into the GPS subsystem on the phone), we might need to use a solution similar to cars with automatic transmission:
    • Default state - robot is at a standstill
    • Holding a gas button - robot goes forward
    • Releasing the button - robot stops
    • Reverse button - robot drives backwards (however, this should work only in override, as we cannot see anything behind us with the current camera solution - also, the laser does not work in the backward direction), possibly by rotating the robot by 180 degrees automatically
  • Camera Control
    Since these are very subtle controls and we will use tilt up and down only (tilt to the sides will be done using the native rotation gesture, to minimize the gestural confusion of the user), we might again rely on the iPhone, in a similar pattern to left and right:
    • Move up (not necessarily tilt) - look up with camera
    • Move down - look down with camera
  • Misc gestures
    Misc gestures will be used for certain features. These will be:
    • Screen tap - used as a dead-man switch and as reassurance that the action is intentional (i.e. an accidental iPhone move or the person turning around will not cause the robot to go crazy).
    • Shake-shake - go wandering around. Same gesture brings it back to normal mode.
    • iPhone rotation - robot rolls... maybe. (smile)


      Only the movement of the robot was implemented at this point (i.e. left and right).
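Part of the dictionary above can be written down as a small state update, which also makes the "add to the velocity already in effect" rule concrete. This is a hedged Python sketch: the gesture names, the RobotState fields, the sign convention (positive turn = left) and FORWARD_SPEED are all made up for the illustration, and camera and reverse gestures are left out.

```python
from dataclasses import dataclass

FORWARD_SPEED = 0.3  # hypothetical forward speed while the gas button is held


@dataclass
class RobotState:
    turn: float = 0.0        # rotational velocity, positive = left (assumed)
    speed: float = 0.0       # forward velocity
    wandering: bool = False  # toggled by the shake gesture


def apply_gesture(state, gesture, amount=0.0):
    """Update the commanded state for one recognized gesture."""
    if gesture == "rotate_left":
        state.turn += amount   # adds to the rotation already in effect
    elif gesture == "rotate_right":
        state.turn -= amount   # may cancel or reverse an earlier left turn
    elif gesture == "gas_down":
        state.speed = FORWARD_SPEED
    elif gesture == "gas_up":
        state.speed = 0.0
    elif gesture == "shake":
        state.wandering = not state.wandering
    else:
        raise ValueError("unknown gesture: " + gesture)
    return state
```

For example, a left rotation of 0.5 followed by a right rotation of 0.75 leaves the robot turning right at 0.25, which is the "exceeds the original value" case from the list.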

Feedback loops


There is no dictionary of feedback loops for the user, yet such feedback is part of comfortable use.

Possible solution

Smart feedback loops

  • Vibration - as vibration and lack of feedback are associated with the feeling that something is going wrong, this seems like a very natural feedback loop for the velocity remaining zero although the user has input something (e.g. velocity space says no to movement)
  • Markup - a little glyph at the top of the screen might indicate the direction of movement (arrows for direction, x for straight), which would let the user know for sure whether he or she is turning the robot.
  • Visual - movement of the phone should be more of a Wiimote sort of control than a 1:1 sort of control - while the angles are not accurate, the movement is precise enough to assure the user that his or her commands are followed.


    The visual solution was created; vibration could be easily added.
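The vibration and markup rules above boil down to a small decision function. The sketch below is illustrative Python only; the function name, the threshold eps, the ASCII glyphs and the sign convention (positive turn = left) are assumptions.

```python
def feedback(commanded_speed, actual_speed, turn, eps=1e-3):
    """Decide what to signal to the user for one control cycle.

    Returns (vibrate, glyph):
      vibrate - True when the user asked for movement but the robot is
                not moving (e.g. velocity space vetoed the command)
      glyph   - "<" or ">" while turning, "x" when going straight
    """
    vibrate = abs(commanded_speed) > eps and abs(actual_speed) <= eps
    if turn > eps:
        glyph = "<"   # turning left (assumed sign convention)
    elif turn < -eps:
        glyph = ">"   # turning right
    else:
        glyph = "x"   # straight
    return vibrate, glyph
```

Note that the vibration rule needs the robot's actual velocity to be reported back to the phone, so it depends on a return channel from the robot.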

Touchscreen and UI


There is no precedent for such an app, so there is no example to follow (but this allows for creativity in the GUI, and I like it (smile)).

Possible solution

Follow the design on the side (it's a 1:1 representation of the iPhone screen).
Due to the mismatch in aspect ratios (i.e. video feeds will be delivered in 4:3 format, while the iPhone displays 3:2 - 480x320 pixels), either black side stripes should be left or the video should be cropped appropriately (i.e. top and bottom). Since tilting the camera up and down will be enough to look at the unseen parts of the video, and in most cases what's on the ceiling or on the floor is of no interest to us, it should be safe to just crop the video to the screen size (fill most of the space).
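The crop arithmetic can be worked out with a short helper. This is a Python sketch; the function name is hypothetical, and the 640x480 source size used in the example is an assumption (the text only says the feed is 4:3).

```python
def fill_crop(src_w, src_h, dst_w=480, dst_h=320):
    """Scale a frame so it fills the screen, then center-crop the overflow.

    Returns (scale, crop_x, crop_y), where crop_x and crop_y are the number
    of scaled pixels cut from each side horizontally and vertically.
    """
    # The larger of the two ratios makes the frame cover the whole screen.
    scale = max(dst_w / src_w, dst_h / src_h)
    scaled_w = src_w * scale
    scaled_h = src_h * scale
    crop_x = (scaled_w - dst_w) / 2
    crop_y = (scaled_h - dst_h) / 2
    return scale, crop_x, crop_y
```

For a 640x480 frame this gives a scale of 0.75, i.e. a 480x360 image with 20 pixels cropped from the top and 20 from the bottom. The letterbox alternative (using min instead of max) would shrink the frame to roughly 427x320 and leave a black stripe of about 27 pixels on each side.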
In case update info is necessary, it should be kept at the bottom. It might provide velocity information and GPS location (possibly with nice visual effects). The transparent zones will be visible only on the connection/tutorial screens of the app. The throttle should work by moving a finger on it. Using a dead-man zone removes the need for a braking gesture and allows for the most natural withdrawal action if something goes wrong. I can see an arrow at the top being used to indicate the direction of the movement, if we are still analyzing it off axis.
Additional features are:

  • Disconnecting when locking the screen - many iPhone users just lock their screen without exiting the app. This would cause the robot to go crazy if the phone is put in a pocket. Thus, the app should disconnect when the screen is locked. This might not be necessary with the dead-man switch.

IPC/GCM incompatibility with device


GCM and IPC may not work well with the device. The iPhone tends to be very wonky when it comes to custom C/C++ and basic network libraries, especially when they are to be plugged into a completely Obj-C/Cocoa application (Apple prefers everything to go in Obj-C wrappers).

Possible solution

Follow the pattern from the Bernard Project and run a separate handler app for receiving and passing on the messages from the robot. For video, a separate video streaming application could be set up on the netbook to take advantage of the hardware H.264 streaming support in the iPhone. This, however, would come at the cost of any computer vision on the robot's side.

Older solution

Mac OS X has rather fine support for the native IPC library, and apps tap into central easily. However, compiling the IPC library on iPhone is much more problematic - tracing through the source, it turned out that including IPC.h indeed makes GCC compile central for its own needs, and it might run some of its parts in separate threads (depending on what the compiler will allow). Furthermore, while I consistently set up all the requested variables (which included a little hack for supporting environment variables on iPhone - the platform doesn't support them natively), I managed only to obtain the prompt on the right - the network connection subsystem on iPhone is not happy with C's native calls and crashes during the connection setup.
Given this situation, I will probably consider a separate Mac or Linux app for marshaling the commands out of the iPhone and packaging them as IPC commands (or even tapping into a control loop on the robot directly). I am, however, becoming worried about the video support. Possibly, a handler app for Mac or Linux could take GCM input, explode it into JPG or a video stream and send it to the iPhone (to further save on the little processing power available).
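The marshaling step of such a handler app might look like the sketch below. It is Python and purely illustrative: the 9-byte wire format, the message type and the function names are invented for this example and are not part of IPC or GCM; the hand-off to IPC is represented only by a publish callback.

```python
import struct

# Hypothetical wire format between the iPhone and the handler app:
# 1-byte message type, then speed and turn as little-endian 32-bit floats.
CMD_FORMAT = "<Bff"
CMD_SIZE = struct.calcsize(CMD_FORMAT)  # 9 bytes, no padding with "<"
MSG_DRIVE = 1


def encode_drive(speed, turn):
    """Phone side: pack one drive command into bytes."""
    return struct.pack(CMD_FORMAT, MSG_DRIVE, speed, turn)


def decode(packet):
    """Handler side: validate and unpack one drive command."""
    if len(packet) != CMD_SIZE:
        raise ValueError("unexpected packet size: %d" % len(packet))
    msg_type, speed, turn = struct.unpack(CMD_FORMAT, packet)
    if msg_type != MSG_DRIVE:
        raise ValueError("unknown message type: %d" % msg_type)
    return speed, turn


def relay(packet, publish):
    """Decode a packet from the phone and pass it on.

    publish(speed, turn) stands in for whatever repackages the command
    as an IPC message on the Mac or Linux side.
    """
    speed, turn = decode(packet)
    publish(speed, turn)
```

Keeping the phone-side format this small means the iPhone needs nothing beyond a plain socket, and all IPC-specific code stays on the machine where the library is known to work.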

So-far solution

Read more on Getting IPC to work on iPhone.


The project was a success, in that everything planned was created. I attach my to-do list that was created in mid-January. The time estimates are a bit off, since I was changing them based on the remaining work in each chunk - in total, the whole project was estimated at a solid week of 8-hours-per-day work, which I think suits the concept of JanPlan (smile)

That's all, folks!
