Saturday, March 17, 2007

Surf the web and read documents without using a mouse

One of my friends asked me, “How would it be if you could surf the net with your eyes?” I just told him, “Really GREEEEE888888.” Then I asked him how this idea came to his mind. He answered that he had read an article on the net discussing such technology. When I came back home, I searched the net about it. I want to share what I found with you.

Manu Kumar, a student at Stanford, has developed a new Gaze-enhanced User Interface Design project, which enables users to navigate the web using eye-tracking technology, without the help of a mouse.

Previously such interfaces were designed for disabled persons only, but this one has been designed for everyone. This is done by a technique called EyePoint together with standard eye-tracking software.

The computer’s display has built-in infrared lights that shine into the user’s eyes. A camera senses the reflection, and the computer uses that information to pinpoint where the user is looking. There is no need to move the mouse to the target; by looking at it, you have already pointed.

I wanted to know the new technology in detail, so I searched the net, and I am putting down whatever I found there with brief descriptions.


EyePoint is a technique for pointing and selection using a combination of eye gaze and keyboard triggers. It uses a two-step progressive fine-tuning action, which makes it possible to compensate for the accuracy limitations of current state-of-the-art eye gaze trackers. While research in gaze-based pointing has traditionally focused on disabled users, EyePoint makes gaze-based pointing effective and simple enough for even able-bodied users.


The keyboard and mouse have been the traditional central input devices for computers. They do whatever we instruct them to do. Now, with EyePoint, you can also use your eyes as an input device. Gaze input was initially developed for disabled users who are unable to use a keyboard and pointing devices normally. However, with rapidly improving accuracy and falling prices, it will become practical for everyone. In this post I have focused on using eye gaze for pointing and selection and, in simple terms, on how it works.

EyePoint: Pointing and Selection

EyePoint provides a practical gaze-based solution for everyday pointing and selection using a combination of gaze and keyboard. EyePoint works by using a two-step, progressive refinement process that is fluidly stitched together in a look-press-look-release cycle.

To use EyePoint, the user simply looks at the target on the screen and presses a hotkey for the desired action: single click, double click, right click, mouse over, or start click-and-drag. EyePoint displays a magnified view of the region the user was looking at. The user looks at the target again in the magnified view and releases the hotkey. This results in the appropriate action being performed on the target.

To abort an action, the user can look anywhere outside the zoomed region and release the hotkey, or press the Esc key on the keyboard.

The region around the user’s initial gaze point is presented in the magnified view with a grid of orange dots overlaid. These orange dots are called focus points and may aid in focusing the user’s gaze at a point within the target. Focusing at a point reduces the jitter and improves the accuracy of the system.

Single click, double click and right click actions are performed as soon as the user releases the key. Click and drag is a two-step interaction. The user first selects the starting point for the click and drag with one hotkey and then the destination with another hotkey.

Technical Details:

The eye tracker constantly tracks the user’s eye movements. A modified version of Salvucci’s Dispersion Threshold Identification fixation detection algorithm is used, along with its own smoothing algorithm, to help filter the gaze data. When the user presses and holds one of the action-specific hotkeys on the keyboard, the system uses the key press as a trigger to perform a screen capture in a confidence interval around the user’s current eye gaze. The default settings use a confidence interval of 120 pixels square (60 pixels in all four directions from the estimated gaze point). The system then applies a magnification factor (default 4x) to the captured region of the screen. The resulting image is shown to the user at a location centered at the previously estimated gaze point but offset to remain within screen boundaries.
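
To get a feel for the filtering step, the plain textbook form of dispersion-threshold fixation detection can be sketched in a few lines of Python. This is only an illustration and not the EyePoint code: the modifications and the smoothing mentioned above are not described in what I read, and the 30-pixel dispersion limit, the 100 ms minimum duration and all the function names are my own assumptions.

```python
# Samples are (time_ms, x, y) tuples; the thresholds are assumed values.

def dispersion(window):
    """Spread of a window of samples: (max x - min x) + (max y - min y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion=30, min_duration_ms=100):
    """Return fixations as (start_ms, end_ms, centroid_x, centroid_y)."""
    fixations = []
    start = 0
    n = len(samples)
    while start < n:
        # Grow the initial window until it spans the minimum duration.
        end = start
        while end < n and samples[end][0] - samples[start][0] < min_duration_ms:
            end += 1
        if end >= n:
            break  # not enough samples left to form another fixation
        if dispersion(samples[start:end + 1]) <= max_dispersion:
            # Keep adding samples while the points stay close together.
            while end + 1 < n and dispersion(samples[start:end + 2]) <= max_dispersion:
                end += 1
            window = samples[start:end + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            start = end + 1
        else:
            start += 1  # drop the first sample and try again
    return fixations
```

The centroid of the most recent fixation is a much steadier estimate of where the user is looking than any single raw sample, which is why some filter of this kind sits in front of the pointing logic.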

The user then looks at the desired target in the magnified view and releases the hotkey. The user’s eye gaze is recorded when the hotkey is released. Since the view has been magnified, the resulting eye gaze is more accurate by a factor equal to the magnification. A transform is applied to determine the location of the desired target in screen coordinates. The cursor is then moved to this location and the action corresponding to the hotkey (single click, double click, right click, etc.) is executed. EyePoint therefore uses a secondary gaze point in the magnified view to refine the location of the target.
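
The capture, magnify and map-back steps can be sketched as below. This is my own minimal reconstruction from the description, not the actual implementation: the defaults (60 pixels in each direction, 4x zoom) come from the text, while the names `Rect`, `capture_region`, `magnified_view`, `view_to_screen` and `refine_target`, and the 1280x1024 screen in the usage note, are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: float
    top: float
    width: float
    height: float

    def contains(self, x, y):
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def clamp(value, low, high):
    return max(low, min(value, high))


def capture_region(gaze_x, gaze_y, half_size=60):
    """Square region around the first gaze estimate (120 x 120 by default).
    A real screen capture would also have to crop this at the screen edges."""
    return Rect(gaze_x - half_size, gaze_y - half_size, 2 * half_size, 2 * half_size)


def magnified_view(gaze_x, gaze_y, screen_w, screen_h, half_size=60, zoom=4):
    """Zoomed view centered on the gaze estimate, shifted to stay on screen."""
    size = 2 * half_size * zoom
    left = clamp(gaze_x - size / 2, 0, screen_w - size)
    top = clamp(gaze_y - size / 2, 0, screen_h - size)
    return Rect(left, top, size, size)


def view_to_screen(x, y, region, view, zoom=4):
    """Map a point inside the magnified view back to screen coordinates."""
    return (region.left + (x - view.left) / zoom,
            region.top + (y - view.top) / zoom)


def refine_target(first_gaze, second_gaze, screen_w, screen_h, half_size=60, zoom=4):
    """Return the refined target, or None if the user looked outside the view (abort)."""
    region = capture_region(first_gaze[0], first_gaze[1], half_size)
    view = magnified_view(first_gaze[0], first_gaze[1], screen_w, screen_h, half_size, zoom)
    if not view.contains(second_gaze[0], second_gaze[1]):
        return None
    return view_to_screen(second_gaze[0], second_gaze[1], region, view, zoom)
```

For example, on a 1280x1024 screen with a first gaze estimate of (640, 512), a second gaze at (800, 272) inside the magnified view maps back to (680, 452) on the real screen. Because of the division by the zoom factor, a 4-pixel tracking error in the magnified view becomes only a 1-pixel error on screen, which is where the accuracy gain comes from.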


A quantitative evaluation of EyePoint shows that its performance is similar to that of the mouse, though with slightly higher error rates. Users strongly preferred the experience of using gaze-based pointing over the mouse, even though they had years of experience with the mouse.

EyeExpose: Application Switching

EyeExpose combines a full-screen, two-dimensional thumbnail view of the open applications with gaze-based selection for application switching.

The figures above show how EyeExpose works. To switch to a different application, the user presses and holds down a hotkey. EyeExpose responds by showing a scaled-down view of all the applications that are currently open on the desktop. The user simply looks at the desired target application and releases the hotkey.

The use of eye gaze instead of the mouse for pointing is a natural choice. The size of the tiled windows in Expose is usually large enough for eye-tracking accuracy to not be an issue. Whether the user relies on eye gaze or the mouse for selecting the target, the visual search task to find the desired application in the tiled view is a prerequisite step. By using eye gaze with an explicit action (the release of the hotkey) we can leverage the user’s natural visual search to point to the desired selection.
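
The selection step itself is little more than a hit test when the hotkey is released. Here is a small sketch under my own assumptions: the thumbnails are laid out in a simple grid (the real tiling may well differ), and `tile_layout` and `window_at_gaze` are hypothetical names.

```python
import math


def tile_layout(n_windows, screen_w, screen_h):
    """Lay out n thumbnails in a near-square grid; returns (left, top, width, height) tuples."""
    if n_windows <= 0:
        return []
    cols = math.ceil(math.sqrt(n_windows))
    rows = math.ceil(n_windows / cols)
    tile_w = screen_w / cols
    tile_h = screen_h / rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n_windows)]


def window_at_gaze(tiles, gaze_x, gaze_y):
    """Index of the thumbnail under the gaze when the hotkey is released, or None."""
    for index, (left, top, width, height) in enumerate(tiles):
        if left <= gaze_x < left + width and top <= gaze_y < top + height:
            return index
    return None
```

With twelve windows on a 1280x1024 screen, each tile in this layout is 320 pixels wide and about 341 pixels tall, far larger than the 120-pixel confidence square EyePoint uses, so no magnification step is needed here.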


According to the authors, a quantitative evaluation showed that EyeExpose was significantly faster than using Alt-Tab when switching between twelve open applications. Error rates in application switching were minimal, with one error occurring in every twenty or more trials. In a qualitative evaluation, where subjects ranked four different application switching techniques (Alt-Tab, Task bar, Expose w/mouse and EyeExpose), EyeExpose was the subjects’ choice for speed, for ease of use, and as the technique they would prefer to use if they had all four approaches available. Subjects felt that EyeExpose was more natural and faster than the other approaches.

EyeScroll: Reading mode and Scrolling

EyeScroll allows computer users to automatically and adaptively scroll through content on their screen. When scrolling starts or stops, and how fast it goes, are controlled by the user’s eye gaze and the speed at which the user is reading. EyeScroll provides multiple modes for scrolling: specifically, a reading mode and off-screen dwell-based targets.

Reading Mode:

The EyeScroll reading mode allows users to read a web page or a long document without having to constantly scroll the page manually. The reading mode is toggled by using the Scroll Lock key, a key which has otherwise been relegated to having no function on the keyboard. Once the reading mode is enabled by pressing the Scroll Lock key, the system tracks the user’s gaze. When the user’s gaze location falls below a system-defined threshold (i.e. the user looks at the bottom part of the screen), EyeScroll starts to slowly scroll the page. The rate of the scrolling is determined by the speed at which the user is reading and is usually slow enough to allow the user to continue reading even as the text scrolls up. As the user’s gaze slowly drifts up on the screen and passes an upper threshold, the scrolling is paused. This allows the reader to continue reading naturally, without being afraid that the text will run off the screen.
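
The start and stop behaviour is a two-threshold switch, which can be sketched like this. The class name `ReadingMode` and the threshold positions (70% and 30% of the screen height) are my assumptions; the description only says the thresholds are system-defined.

```python
class ReadingMode:
    """Start scrolling when the gaze drops below a lower line on the screen,
    pause when it rises above an upper line, and keep the current state in between."""

    def __init__(self, screen_h, start_fraction=0.7, stop_fraction=0.3):
        # Screen y grows downward, so "below" means a larger y value.
        self.start_y = screen_h * start_fraction
        self.stop_y = screen_h * stop_fraction
        self.enabled = False
        self.scrolling = False

    def toggle(self):
        """Call this when the Scroll Lock key is pressed."""
        self.enabled = not self.enabled
        self.scrolling = False

    def update(self, gaze_y):
        """Feed the latest gaze y position; returns True while the page should scroll."""
        if not self.enabled:
            return False
        if gaze_y > self.start_y:
            self.scrolling = True
        elif gaze_y < self.stop_y:
            self.scrolling = False
        return self.scrolling
```

Using two separate thresholds instead of one means the page does not keep starting and stopping while the gaze hovers around a single line.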

The reading speed is estimated by measuring the amount of time t it takes the user to complete one horizontal sweep from left to right and back. The delta in the number of vertical pixels the user’s gaze moved (accounting for any existing scrolling rate) defines the distance d in number of vertical pixels. The reading speed can then be estimated as d/t.
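
As a worked example of the d/t estimate, here is a small sketch. How exactly the existing scrolling rate is accounted for is not spelled out in what I read, so adding back the distance the page scrolled during the sweep is my assumption, as is the function name `reading_speed`.

```python
def reading_speed(gaze_y_start, gaze_y_end, sweep_seconds, scroll_rate=0.0):
    """Estimate reading speed in vertical pixels per second for one line sweep.

    gaze_y_start / gaze_y_end: gaze height on screen at the start and end of the sweep.
    scroll_rate: how fast the page was already scrolling up, in pixels per second.
    """
    if sweep_seconds <= 0:
        raise ValueError("sweep_seconds must be positive")
    # Distance moved through the document = movement on screen
    # plus the distance the page itself scrolled during the sweep.
    d = (gaze_y_end - gaze_y_start) + scroll_rate * sweep_seconds
    return d / sweep_seconds
```

If a reader takes 4 seconds per line and each line is 20 pixels tall, the estimate is 20 / 4 = 5 pixels per second. If the page is already scrolling at 5 pixels per second and the gaze stays at the same height on screen, the estimate is still 5 pixels per second, so the scrolling keeps pace with the reader.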

In order to facilitate scanning, the scrolling speed can also take into account the location of the user’s gaze on the screen. If the user’s gaze is closer to the lower edge of the screen, the system can speed up scrolling, and if the gaze is in the center or upper region of the screen, the system can slow down scrolling accordingly. Since the scrolling speed and the points at which scrolling starts and stops are both functions of the user’s gaze, the system adapts to the reading style and patterns of each individual user.
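
One simple way to picture this adjustment is to scale the estimated reading speed by where the gaze sits on the screen. The linear scaling between 0.5x at the top and 1.5x at the bottom is purely my assumption for illustration, and `scroll_speed` is a hypothetical name.

```python
def scroll_speed(base_speed, gaze_y, screen_h, slow=0.5, fast=1.5):
    """Scale the base scrolling speed by the gaze height on the screen.

    base_speed: estimated reading speed in pixels per second.
    gaze_y: 0 at the top edge of the screen, screen_h at the bottom edge.
    """
    # Normalise the gaze height to the range 0 (top) .. 1 (bottom).
    position = min(max(gaze_y / screen_h, 0.0), 1.0)
    return base_speed * (slow + (fast - slow) * position)
```

With a base speed of 10 pixels per second on a 1000-pixel-tall screen, this gives 5 pixels per second when the gaze is at the top, 10 in the middle and 15 at the bottom edge.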

By design the page is always scrolled in only one direction. This is because, while reading top to bottom is a fairly natural activity that can be detected based on gaze patterns, attempting to reverse the scrolling direction triggers too many false positives. Therefore, the designers chose to combine the reading mode with explicit dwell-based activation for scrolling up in the document.

Off-Screen dwell-based targets:

The eye tracker’s field of view is sufficient to detect when the user looks at targets placed off the screen. A dwell duration of 400-450 ms is used to trigger activation of these targets, which are mapped to the Page Up, Page Down, Home and End keys on the keyboard.

The off-screen dwell-based targets for document navigation complement the automated reading mode described above. Since the reading mode only provides scrolling in one direction, a user who wants to scroll up or navigate to the top of the document can do so by using the off-screen targets.
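
Dwell activation for the off-screen targets can be sketched as a small timer that fires once the gaze has rested on the same target long enough. The 400 ms default comes from the range quoted above; the class name `DwellDetector`, the target names and the rule that a target fires only once per visit are my assumptions.

```python
class DwellDetector:
    """Fire a target once the gaze has stayed on it for dwell_ms milliseconds."""

    def __init__(self, dwell_ms=400):
        self.dwell_ms = dwell_ms
        self.current = None      # target currently under the gaze, or None
        self.entered_ms = None   # time at which the gaze entered that target
        self.fired = False       # whether this visit has already triggered

    def update(self, target, now_ms):
        """target is e.g. "page_up", "page_down", "home", "end", or None.
        Returns the target name at the moment it activates, otherwise None."""
        if target != self.current:
            # The gaze moved to a different target (or away): restart the timer.
            self.current = target
            self.entered_ms = now_ms
            self.fired = False
            return None
        if (target is not None and not self.fired
                and now_ms - self.entered_ms >= self.dwell_ms):
            self.fired = True
            return target
        return None
```

A glance that merely passes over a target never reaches the dwell time, so only a deliberate look of about 0.4 seconds sends the key.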


Well, when I want to type the password for my PayPal or email account or something else, and someone is behind me, I feel uncomfortable. Sometimes I cannot even ask them to go somewhere else for a while. You can understand. But this technology has the solution. You can enter keywords or passwords for your account with your eyes, and no one would be able to know them.


Pilot studies showed that subjects found the EyeScroll reading mode to be natural and easy to use. Subjects particularly liked that the scrolling speed adapted to their reading speed. Additionally, they felt in control of the scrolling and did not feel that the text was running away at any point.

But after all, this technology is rather expensive and currently costs $25,000.

1 comment:

Aaron said...

Thank you for this fantastic article and demonstration videos to boot! I love technology that could go mainstream and would also be accessible to more people than the mouse is.