Tamir Hassan

I am a researcher at Hewlett-Packard Laboratories in the field of Automated Publishing, working on delivering documents for screen (desktop and mobile devices) and print. I am located in Vienna, Austria.

This page contains a summary of my research activities and links to some of my open-source contributions in the field of document engineering.

You can email me at: web (at) tamirhassan.com
My page at HP Labs


I am interested in several topics related to document engineering, such as automatic layout, document authoring, document analysis, information extraction and digital typography. Previously, I worked on semi-flexible layouts at the Zukunftskolleg, University of Konstanz, and on the Decapod project at IUPR, TU Kaiserslautern. Before that, I was at PRIP and DBAI, TU Wien. In spring 2010, I worked for three months with Prof. Roger D. Hersch at EPF Lausanne on parametric representations of fonts.

I wrote my doctoral thesis in the Database and Artificial Intelligence Group at TU Wien under the supervision of Prof. Georg Gottlob. I have worked on methods for wrapping, or supervised semi-automatic data extraction, from PDF files. Because PDF documents are not structured in the same way as HTML, my work involves a number of techniques from the field of document analysis and understanding. I have recently worked on page segmentation, converting PDF to HTML (as an input filter to the Lixto Visual Wrapper) and table recognition in PDF files.

In 2009, I worked on a novel approach for wrapping documents using visual extraction patterns. This approach represents the document as an attributed relational graph and uses error-tolerant graph matching techniques to locate the desired wrapping instances. A prototype of this system was presented at CeBIT at the stand of the Austrian Computer Society, and a trial version is now available for download. For more information, please see the page on GraphWrap.

My first degree is an M.Eng. (Hons) in computer science, obtained from the University of Warwick in 2004.


Here is a selection of recent publications which I have authored or co-authored:

A list of my publications on DBLP is available here.

More about me

In addition to my current field of research, I am also interested in a number of other areas in applied computer science. I have written about two of these areas below:

I have long had a love for typography and am fascinated by the multi-disciplinary aspect of computer science, particularly its application to the arts, as well as human-oriented issues such as HCI. More generally, I love work that requires great attention to detail.

In my free time, I enjoy photography (architecture in particular) and sing in a choir.

Previous work

My PDF-to-HTML converter, created as part of my studies at the University of Warwick, can be found here.