Keynote Speakers

Title: Body Part Recognition for Kinect

Date/Time: July 14 (Thu), 9:00 - 10:10

Room: JHC.02

Abstract: The Microsoft Kinect for Xbox 360 sensor provides a stream of 3D depth images that is analysed by software to give a moving interpretation of the human skeleton, in real time. Before Kinect, equipment for motion-capture was already available commercially but required instrumentation of the moving human subject, in the form of retro-reflective markers, placed on all body joints. For user interfaces, however, it is imperative that motion capture be markerless. Machine learning techniques were applied to build a capability to analyse depth images independently, classifying pixels in each depth image as belonging to one of 31 body parts. The classifier is trained and tested using a very large database of pre-classified images, covering varied poses and body types. It is engineered so efficiently that it uses only a fraction of the total available computing capacity. Initially launched for use in gaming, Kinect represents a more general advance in technology for natural user interface between man and machine.

Andrew Blake is a Deputy Managing Director at Microsoft Research Cambridge, where he is responsible for academic interaction and business operations. He also jointly leads the Machine Learning and Perception Group (MLP), where his main research focus is computer vision.

Andrew graduated from Trinity College, Cambridge in 1977 with a BA in Mathematics and Electrical Sciences. After a year as a Kennedy Scholar at MIT and two years in the defence electronics industry, he studied for a doctorate at the University of Edinburgh, which was awarded in 1983.

He ran the Visual Dynamics Research Group as faculty in the Department of Engineering Science at the University of Oxford from 1987 to 1999. He became a Professor in 1996, and was a Royal Society Senior Research Fellow from 1998 to 1999. Andrew then joined Microsoft Research Cambridge in 1999. He also holds an honorary Professorship at the University of Cambridge.

He was elected Fellow of the Royal Academy of Engineering in 1998 and Fellow of the Royal Society in 2005. In 2006 the Royal Academy of Engineering awarded him its Silver Medal. In 2007 he was elected a Fellow of the IEEE and received the IET Mountbatten Medal.

Andrew has published several books, including “Visual Reconstruction” with A. Zisserman (MIT Press), “Active Vision” with Alan Yuille (MIT Press) and “Active Contours” with Michael Isard (Springer-Verlag). His research interests are in building machine intelligence into image processing software, with applications to motion capture, image editing, remote collaboration and medical imaging.


Title: Towards Exaflop Supercomputers

Date/Time: July 12 (Tue), 9:00 - 10:10

Room: JHC.02

Abstract: Having recently surpassed the Petascale barrier, supercomputer designers and users are now facing the next challenge: a thousandfold performance increase that, if the improvement rate of the last decades continues, will be reached around 2018. Power is the main constraint and there are many hardware challenges, but software is probably the biggest one. Worldwide cooperative initiatives are being started to carry out research towards this objective. The Barcelona Supercomputing Center is involved in such initiatives and carries out the MareIncognito research project, which aims to develop some of the technologies that we consider will be of key relevance on the way to Exascale. The talk will briefly discuss relevant issues, foreseen architectures and software approaches that will have to be developed in order to successfully install and operate such machines.

Mateo Valero is a professor in the Computer Architecture Department at UPC, in Barcelona. His research interests focus on high performance architectures. He has published approximately 500 papers, has served in the organization of more than 300 international conferences and has given more than 300 invited talks. He is the director of the Barcelona Supercomputing Center, the National Centre of Supercomputing in Spain.

Dr. Valero has been honoured with several awards, among them the Eckert-Mauchly Award, by the IEEE and the ACM; the IEEE Harry Goode Award; two Spanish National awards, the "Julio Rey Pastor" to recognize research on IT technologies and the “Leonardo Torres Quevedo” to recognize research in Engineering, by the Spanish Ministry of Science and Technology, presented by the King of Spain; and the “King Jaime I” in research by the Generalitat Valenciana, presented by the Queen of Spain. He has been named Honorary Doctor by the University of Chalmers, by the University of Belgrade, by the Universities of Las Palmas de Gran Canaria and Zaragoza in Spain and by the University of Veracruz in Mexico. He was also selected for the "Hall of Fame" as one of the 25 most influential European researchers in IT during the period 1983-2008.

In December 1994, Professor Valero became a founding member of the Royal Spanish Academy of Engineering. In 2005 he was elected Corresponding Academic of the Spanish Royal Academy of Science and in 2006, member of the Royal Spanish Academy of Doctors and member of the “Academia Europaea”, the “Academy of Europe”. He is a Fellow of the IEEE, Fellow of the ACM and an Intel Distinguished Research Fellow. In 1998 he won a “Favourite Son” Award from his home town, Alfamén (Zaragoza), and in 2006 Alfamén named its Public College after him.


Title: Fast, Automatic, Photo-Realistic, 3D Modeling of Building Interiors

Date/Time: July 13 (Wed), 9:00 - 10:10

Room: JHC.02

Abstract: Automated 3D modeling of building interiors is useful in applications such as virtual reality and entertainment. In this talk, we develop an architecture and associated algorithms for fast, automatic, photo-realistic 3D modeling of building interiors. The central challenge of such a problem is to localize the acquisition device in GPS-denied environments while it is in motion, rather than collecting the data in a stop-and-go fashion. In the past, such acquisition devices have been placed on robots with wheels or human-operated pushcarts, which would limit their use to planar environments. Our goal is to address the more difficult problem of localization and 3D modeling in more complex non-planar environments such as staircases, caves, or uneven surfaces. We propose a human-operated backpack system made of a suite of sensors such as laser scanners, cameras, and orientation measurement units (OMUs), which are used both to localize the backpack and to build the 3D geometry and texture of building interiors. We describe a number of localization algorithms based on merging laser, camera and OMU sensor information, and compare their performance using a high-end IMU sensor, which serves as the ground truth. Once the backpack is localized, a 3D point cloud can be generated and 3D meshing algorithms are applied to generate texture-mapped 3D models. We show examples of resulting models for multiple floors of the electrical engineering building at U.C. Berkeley. Applications to image-based rendering of 3D environments and mobile augmented reality will also be discussed.

Avideh Zakhor joined the faculty at UC Berkeley in 1988, where she is currently the Qualcomm Professor of Electrical Engineering and Computer Sciences. Her areas of interest include theories and applications of signal, image and video processing, 3D computer vision, and multimedia networking. She has won a number of best paper awards, including those from the IEEE Signal Processing Society in 1997 and 2009, the IEEE Circuits and Systems Society in 1997 and 1999, the International Conference on Image Processing in 1999, the Packet Video Workshop in 2002, and the IEEE Workshop on Multimodal Sentient Computing in 2007. She holds 6 U.S. patents, and is the co-author of three books with her students.

Prof. Zakhor received the B.S. degree from California Institute of Technology, Pasadena, and the S.M. and Ph.D. degrees from Massachusetts Institute of Technology, Cambridge, all in electrical engineering, in 1983, 1985, and 1987 respectively. She was a General Motors scholar from 1982 to 1983, was a Hertz fellow from 1984 to 1988, and received the Presidential Young Investigators (PYI) award and the Office of Naval Research (ONR) young investigator award in 1992. She was elected an IEEE Fellow in 2001 and received the Okawa Prize in 2004.

She co-founded OPC Technology in 1996, which was acquired by Mentor Graphics (Nasdaq: MENT) in 1998; Truvideo in 2000; and UrbanScan Inc. in 2005, which was acquired by Google in 2007.

Plenary Speakers

Title: Sparse and Redundant Representations: Theory and Applications

Date/Time: July 12 (Tue), 16:00 - 17:00

Room: JHC.02

Abstract: In this talk we describe some of the recent advances in sparse and redundant representations of signals. Specific applications are described ranging from the matrix completion problem, to video retrieval and compressive sensing. A Bayesian framework is utilized in the formulation and solution of such problems. Specific examples are shown from image and video processing and comparisons are made with the state-of-the-art algorithms. Open problems and future research directions are discussed.

Aggelos K. Katsaggelos received the Diploma degree in electrical and mechanical engineering from the Aristotelian University of Thessaloniki, Greece, in 1979, and the M.S. and Ph.D. degrees in Electrical Engineering from the Georgia Institute of Technology, in 1981 and 1985, respectively.

In 1985, he joined the Department of Electrical Engineering and Computer Science at Northwestern University, where he is currently a Professor. He was the holder of the Ameritech Chair of Information Technology (1997–2003). He is also the Director of the Motorola Center for Seamless Communications, a member of the Academic Staff of NorthShore University Health System, and an affiliated faculty member in the Department of Linguistics, and he has an appointment with the Argonne National Laboratory.

He has published extensively in the areas of multimedia processing and communications and he is the holder of 16 international patents. He is the co-author of Rate-Distortion Based Video Compression (Kluwer, 1997), Super-Resolution for Images and Video (Claypool, 2007) and Joint Source-Channel Video Transmission (Claypool, 2007).

Among his many professional activities, Prof. Katsaggelos was Editor-in-Chief of the IEEE Signal Processing Magazine (1997–2002), a BOG Member of the IEEE Signal Processing Society (1999–2001), and a member of the Publication Board of the IEEE Proceedings (2003–2007). He is a Fellow of the IEEE (1998) and SPIE (2009) and the recipient of the IEEE Third Millennium Medal (2000), the IEEE Signal Processing Society Meritorious Service Award (2001), an IEEE Signal Processing Society Best Paper Award (2001), an IEEE ICME Paper Award (2006), an IEEE ICIP Paper Award (2007) and an ISPA Paper Award (2009). He was a Distinguished Lecturer of the IEEE Signal Processing Society (2007–2008).


Title: Mobile Visual Search

Date/Time: July 13 (Wed), 16:00 - 17:00

Room: JHC.02

Abstract: Handheld mobile devices, such as camera phones or PDAs, are expected to become ubiquitous platforms for visual search and mobile augmented reality applications. For mobile image matching, a visual database is typically stored at a server in the network. Hence, for a visual comparison, information must be either uploaded from the mobile to the server, or downloaded from the server to the mobile. With relatively slow wireless links, the response time of the system critically depends on how much information must be transferred in both directions. We review recent advances in mobile matching, using a "bag-of-visual-words" approach with robust feature descriptors, and show that dramatic speed-ups and power savings are possible by considering recognition, compression, and retrieval jointly. We will use real-time implementations for different example applications, such as recognition of landmarks, media covers or printed documents, to show the benefit from image processing on the phone, the server, or both.

Download the Slides (MS PPTX, 18MB)

Bernd Girod is Professor of Electrical Engineering in the Information Systems Laboratory of Stanford University, California. He also holds a courtesy appointment with the Stanford Department of Computer Science and serves as Director of the Stanford Center for Image Systems Engineering (SCIEN). His current research interests include image and video coding, networked media systems, and image-based retrieval.

He received his M.S. degree in Electrical Engineering from the Georgia Institute of Technology in 1980 and his doctoral degree from the University of Hannover, Germany, in 1987. He joined the Massachusetts Institute of Technology, Cambridge, MA, USA, and was an Assistant Professor of Media Technology at the Media Laboratory there until 1990. From 1990 to 1993, he was Professor of Computer Graphics and Technical Director of the Academy of Media Arts in Cologne, Germany, jointly appointed with the Computer Science Section of Cologne University. From 1993 until 1999, he held the Chair of Electrical Engineering / Telecommunications at the University of Erlangen-Nuremberg, Germany, and was the Head of the Telecommunications Institute I and director of the Telecommunications Laboratory. He served as the Chairman of the Electrical Engineering Department from 1995 to 1997.

As an entrepreneur, Professor Girod has worked successfully with several start-up ventures as founder, investor, director, or advisor. Most notably, he was a co-founder and Chief Scientist of Vivo Software, Inc., Waltham, MA (1993-98); after Vivo's acquisition, he was Chief Scientist of RealNetworks, Inc. (Nasdaq: RNWK) from 1998 to 2002. He served on the Board of Directors for 8x8, Inc., Santa Clara, CA (Nasdaq: EGHT), 1996-2004, and for GeoVantage, Inc., Swampscott, MA, 2000-2005. In 2007, he co-founded Dyyno, Inc., Palo Alto, CA. From 2004 to 2007, he also served as Chairman of the Steering Committee of the new Deutsche Telekom Laboratories at the Technical University of Berlin.

Professor Girod has authored or co-authored one major textbook (printed in 3 languages), four monographs, and over 400 book chapters, journal articles and conference papers, and is a named inventor of over 20 US patents. He was a member of the IEEE Image and Multidimensional Signal Processing Technical Committee from 1989 to 1997 and has served on the Editorial Boards for several journals in his field, among them as founding Associate Editor for the IEEE Transactions on Image Processing and Area Editor for Speech, Image, Video & Signal Processing of the IEEE Transactions on Communications. He has served on numerous conference committees, e.g., as Tutorial Chair of ICASSP-97 in Munich and again for ICIP-2000 in Vancouver, as General Chair of the 1998 IEEE Image and Multidimensional Signal Processing Workshop in Alpbach, Austria, as General Chair of the Visual Communication and Image Processing Conference (VCIP) in San Jose, CA, in 2001, as General Chair of Vision, Modeling, and Visualization (VMV) at Stanford, CA, in 2004, and as General Co-Chair of ICIP-2008 in San Diego.

Professor Girod was elected Fellow of the IEEE in 1998 'for his contributions to the theory and practice of video communications.' He was named 'Distinguished Lecturer' for the year 2002 by the IEEE Signal Processing Society. He received the 2002 EURASIP Best Paper Award (with J. Eggers) and the 2004 EURASIP Technical Achievement Award, the IEEE Multimedia Communication Best Paper Award in 2007, and the EURASIP Image Communication Best Paper Award 2008. He was elected a member of the German National Academy of Sciences (Leopoldina) in 2007 and a Fellow of EURASIP in 2008.


Title: Telepresence: Transcending Space and Time

Date/Time: July 13 (Wed), 17:00 - 18:00

Room: JHC.02

Abstract: From Star Trek and Star Wars to The Matrix and Avatar, Hollywood has reflected man's dream of Telepresence. Today Telepresence is embodied in the marketplace by solutions such as HP Halo and Cisco Telepresence, dedicated conference rooms sporting built-in furniture and life-sized high-definition video, costing hundreds of thousands of dollars per room. In the future, Telepresence systems will be more diverse, enabling connections between not only meeting rooms but also offices, hotel rooms, vehicles, and even large unstructured spaces such as conference halls and stadiums. Mixed virtual and physical reality as well as ubiquitous computing - including robotics - will play key roles, because these systems will not only need to immerse the participants in a common world, but will also need to empower the participants in ways that are better than being physically present. In this talk, I will take you on a tour of various component technologies as well as experiences that are being developed in Microsoft Research for the future of Telepresence.

On January 1, 2010, Anoop Gupta moved back to Microsoft Research (MSR) as a Distinguished Scientist. He works on strategic cross-cutting scenarios that leverage MSR strengths and have potential for large business impact. He reports to Rick Rashid, senior vice president and global head of Microsoft Research.

Prior to re-joining MSR, from 2007-2009 Gupta served as corporate vice president of technology policy and strategy for Microsoft. In this role, he guided Microsoft's engagement with governments and institutions around the world regarding Microsoft's vision of upcoming technology innovations and the combination of policies and regulations that might maximize their benefits for citizens. In this capacity, Gupta reported to and worked closely with Craig Mundie, Microsoft's chief research and strategy officer.

During 2007-2009 Gupta also served as the corporate vice president of the Unlimited Potential Group and Education Products Group. He led the company's efforts in new business models and technologies to help close the digital divide and help bring benefits of access to technology and economic opportunity to people at the base of the economic pyramid. He was also responsible for leading Microsoft's education efforts across the company and bringing to market education solutions in both developed and emerging markets.

Prior to that, from 2003-2007 Gupta served as corporate vice president and founding leader of Microsoft's Unified Communications Group, leading the company's client-server-service efforts to provide business communications solutions (e-mail, IM, VoIP-telephony, unified messaging, audio/video/web conferencing). His team was responsible for Microsoft Exchange Server, Microsoft Office Communications Server, Microsoft Office Live Meeting, Exchange Hosted Services, Microsoft Office Communicator and RoundTable, and other related communications products and services.

Before leading the Unified Communications Group, from 2001-2003 Gupta was technology assistant to Bill Gates, Microsoft's chairman. In that role, Gupta contributed to a host of Microsoft product initiatives. In particular, he helped define the company's strategy for real-time collaboration, which led to the formation of the Unified Communications business.

Gupta became Bill Gates' technology assistant after working for four years at Microsoft Research (1997-2001), where he led the Collaboration and Multimedia Group. His team was responsible for development and transfer of many key technologies to product groups, and publication of numerous research papers in top-tier conferences and journals.

Before joining Microsoft in 1997, Gupta was a professor of computer science and electrical engineering at Stanford University for 11 years. His research at Stanford spanned computer architecture, operating systems, programming languages, simulation and performance debugging tools, and parallel applications. He also co-led, with John Hennessy, the development of hardware and software for the Stanford DASH multiprocessor, a highly concurrent shared-memory parallel computer that had a large impact on the industry. At Stanford, Gupta also led the Virtual Classroom project, which explored compression and networking issues related to transmission of audio-video over the Internet and its applications in education. In 1995, Gupta used the seeds of the technology developed in that project to form VXtreme Inc., a provider of technologies for streaming audio-visual content over the Web, which Microsoft acquired in 1997.

Gupta has published more than 100 papers in major conferences and journals, including several that have won awards. He has contributed to more than 50 patents. With David Culler and Jaswinder Pal Singh, he co-authored the book "Parallel Computer Architecture: A Hardware-Software Approach" in 1998. He received the National Science Foundation (NSF) Presidential Young Investigator Award in 1990, and he held the Robert N. Noyce Faculty Scholar Chair at Stanford for 1993 and 1994. Before joining Stanford in 1987, Gupta was on the research faculty at Carnegie Mellon University, where he received his Ph.D. in computer science in 1986. He holds a bachelor's degree in electrical engineering from the Indian Institute of Technology, Delhi, where he graduated receiving the President's Gold Medal in 1980.


Title: Future Challenges in Computer Graphics

Date/Time: July 14 (Thu), 16:00 - 17:00

Room: JHC.02

Abstract: In this talk I will give an overview of some current research trends and explore the challenges in several sub-fields of the scientific discipline of computer graphics: interactive and photo-realistic rendering, visualization and visual analytics. Five challenges are discussed that play a role in each of these areas: scalability, semantics, fusion, interaction, and acquisition. Of course, not all of these issues are disjoint; however, the chosen structure allows for an easy-to-follow overview of concrete future challenges.

Download the Slides (PDF, 2MB)

Werner Purgathofer is a full professor of computer science and head of the Institute of Computer Graphics and Algorithms at the Technische Universität Wien (Vienna University of Technology), responsible for research and education in computer graphics.

He has been working in this field since 1980, when he started his academic life as a research assistant, during which he received a Ph.D. in 1984 from TU Vienna. In 1981 and 1986 he received "best paper awards" at the Eurographics annual conferences. He has written a German-language textbook on computer graphics and co-edited several other books. Purgathofer organised the very successful 1991 and 2006 Eurographics annual conferences in Vienna. His current activities concentrate on rendering algorithms, color, and visualization. He is the president and scientific director of the VRVis research center for virtual reality and visualisation in Vienna, which he founded in 2000. In 2001 he became vice-chair of the Commission for Scientific Visualisation of the Austrian Academy of Sciences. Purgathofer has been a member of Eurographics since 1981 and is also a member of the European Academy of Sciences, ACM, IEEE Computer Society, CGS, GI and OCG. Serving on the Executive Committee of Eurographics from 1988 until 2006, he was Publications Board Chairman from 1993 to 2000, and he became a fellow of the association in 1997. Together with Werner Hansmann, Francois Sillion and Dieter Fellner, he is currently co-editor of the Eurographics Proceedings Series. In 2006 he received the Distinguished Career Award from Eurographics.


Title: Virtual Vision: Computer Vision in Virtual Reality

Date/Time: July 12 (Tue), 17:00 - 18:00

Room: JHC.02

Abstract: Realistic virtual worlds can serve as software laboratories within which vision researchers may efficiently develop and evaluate sophisticated, active machine perception systems. Known as "Virtual Vision", this unorthodox philosophy, positioned at the intersection of the fields of computer vision and computer graphics, enables virtual reality to subserve computer vision research. In the context of this new paradigm, this talk will focus on the rapid development and evaluation of distributed smart-camera sensor networks and intelligent surveillance systems that can persistently monitor humans in large-scale urban environments. The visually realistic virtual environments exploited in this work are populated by autonomous virtual humans, which are the product of a comprehensive, artificial life approach to multi-human simulation.

Download the Slides (PDF, 2MB)

Demetri Terzopoulos (PhD '84 MIT) is the Chancellor's Professor of Computer Science at the University of California, Los Angeles. He is a Guggenheim Fellow, a Fellow of the ACM, IEEE and Royal Society of Canada, and a Member of the European Academy of Sciences. Among his many honors are an Academy Award for Technical Achievement from the Academy of Motion Picture Arts and Sciences for his pioneering work on physics-based computer animation, and the inaugural Computer Vision Significant Researcher Award from the IEEE for his pioneering and sustained research on deformable models and their applications. One of the most highly cited authors in engineering and computer science according to ISI and other indexes, his publications include more than 300 research papers and several volumes, primarily in computer graphics, computer vision, medical imaging, computer-aided design, and artificial intelligence/life. He has given over 400 invited talks internationally on these topics, among them about 100 distinguished, keynote, and plenary addresses. Before joining UCLA in 2005, Dr. Terzopoulos held the Lucy and Henry Moses Endowed Professorship in Science at New York University and was Professor of Computer Science and Mathematics at NYU's Courant Institute of Mathematical Sciences. Previously, he was Professor of Computer Science and Professor of Electrical and Computer Engineering at the University of Toronto, where he continues to hold status-only faculty appointments.


Title: Behavioral Informatics: Challenges and Opportunities for Multimedia Signal Processing

Date/Time: July 14 (Thu), 17:00 - 18:00

Room: JHC.02

Abstract: Human behavior is exceedingly complex. Its expression and experience are inherently multimodal, and are characterized by individual and contextual heterogeneity. The confluence of sensing, communication and computing is, however, allowing access to data, in diverse forms and modalities, that is enabling us to understand and model human behavior in ways that were unimaginable even a few years ago. No domain exemplifies these opportunities more than that related to human health and wellbeing. Consider for example the domain of Autism, where crucial diagnostic information comes from manually analyzed audiovisual data of verbal and nonverbal behavior. Behavioral signal processing advances can not only enable new possibilities for gathering data in a variety of settings, from laboratories and clinics to free-living conditions, but also offer computational models to advance evidence-driven theory and practice.

This talk will describe our ongoing efforts on multimodal Behavioral Signal Processing—technology and algorithms for quantitatively and objectively understanding typical, atypical and distressed human behavior—with a specific focus on communicative, affective and social behavior. Using examples drawn from different domains, the talk will also illustrate Behavioral Informatics applications of these processing techniques that contribute to quantifying higher-level, often subjectively described, human behavior in a domain-sensitive fashion. In particular, we will draw on examples from our work on health domains such as Autism, Addiction, Family Studies, and Obesity to illustrate the challenges and opportunities for multimedia behavioral signal processing.

Download the Slides (PDF, 13MB)

Shrikanth (Shri) Narayanan is the Andrew J. Viterbi Professor of Engineering at the University of Southern California (USC), and holds appointments as Professor of Electrical Engineering, Computer Science, Linguistics and Psychology and as the founding director of the Ming Hsieh Institute. Prior to USC he was with AT&T Bell Labs and AT&T Research from 1995 to 2000. At USC he directs the Signal Analysis and Interpretation Laboratory (SAIL). His research focuses on human-centered information processing and communication technologies with a special emphasis on behavioral signal processing and informatics.

Shri Narayanan is a Fellow of IEEE, the Acoustical Society of America, and the American Association for the Advancement of Science (AAAS) and a member of Tau-Beta-Pi, Phi Kappa Phi and Eta-Kappa-Nu. He is also an Editor for the Computer Speech and Language Journal and an Associate Editor for the IEEE Transactions on Multimedia, IEEE Transactions on Affective Computing, and the Journal of the Acoustical Society of America. He was previously an Associate Editor of the IEEE Transactions on Speech and Audio Processing (2000-04) and the IEEE Signal Processing Magazine (2005-2008).

Shri Narayanan is a recipient of a number of honors, including Best Paper awards from the IEEE Signal Processing Society in 2005 (with Alex Potamianos) and in 2009 (with Chul Min Lee) and selection as an IEEE Signal Processing Society Distinguished Lecturer for 2010-11. He has published over 400 papers and has eight granted U.S. patents.