VisionDocs: 1st Workshop on Computer Vision Systems for Document Analysis and Recognition

Program

Workshop date: 4 March 2025, afternoon (1-5pm)
Workshop Location: AZ Ballroom Salon 3-4

1:00pm Opening Remarks

1:10pm Oral Session

Mixed-Precision is All You Need for Efficient Document Image Classification
Tushar Shinde (IIT Madras Zanzibar); Shivam Bhardwaj (IIT Madras Zanzibar)
RAPTOR: Refined Approach for Product Table Object Recognition
Eliott THOMAS (L3i); Mickael Coustaty (L3i laboratory); Aurélie JOSEPH (Yooz); Gaspar DELOIN (Yooz); Elodie CAREL (Yooz); Vincent Poulain d'Andecy (Yooz); Jean-Marc Ogier (University of La Rochelle)

1:40pm First Keynote

Teaching machine to write and read
Silvia Cascianelli and Rita Cucchiara, University of Modena and Reggio Emilia, Italy

2:05pm Oral Session

Low-Rank Adaptation vs. Fine-Tuning for Handwritten Text Recognition
Lukas Hüttner (Friedrich-Alexander-Universität Erlangen-Nürnberg); Martin Mayr (Friedrich-Alexander-Universität Erlangen-Nürnberg); Thomas Gorges (Friedrich-Alexander-Universität Erlangen-Nürnberg); Fei Wu (Friedrich-Alexander-Universität Erlangen-Nürnberg); Mathias Seuret (Friedrich-Alexander-Universität Erlangen-Nürnberg); Andreas Maier (Friedrich-Alexander-Universität Erlangen-Nürnberg); Vincent Christlein (Friedrich-Alexander-Universität Erlangen-Nürnberg)
Offline Signature Verification in the Banking Domain
Valentina Arrigoni (Unicredit)
DocSum: Domain-Adaptive Pre-training for Document Abstractive Summarization
Phan Phuong Mai Chau (University of Science and Technology of Hanoi); Souhail Bakkali (L3i-lab, La Rochelle Université); Antoine Doucet (L3i-lab, La Rochelle Université)

3:00pm Coffee Break and Poster Session

3:45pm Oral Session

Multi-Modal Large Language Model driven Augmented Reality Situated Visualization: the Case of Wine Recognition
Vincenzo Armandi (University of Bologna); Andrea Loretti (University of Bologna); Lorenzo Stacchio (University of Macerata); Pasquale Cascarano (University of Bologna); Gustavo Marfia (University of Bologna)
Improving the Identification of Layers in 3D Images of Ancient Papyrus using Artificial Neural Networks
Nicolas Klenert (Zuse Institute Berlin); Finn Schwoerer (Zuse Institute Berlin); Noushin Hajarolasvadi (Zuse Institute Berlin); Siloé Bournez (Zuse Institute Berlin); Tobias Arlt (Helmholtz-Zentrum Berlin); Heinz-Eberhard Mahnke (Helmholtz-Zentrum Berlin); Verena Lepper (Ägyptisches Museum und Papyrussammlung); Daniel Baum (Zuse Institute Berlin)

4:15pm Second Keynote

Interdisciplinary Collaborations: Advancing Document Analysis Together
Vincent Christlein, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany

4:40pm Oral Session

Label Errors in the Tobacco3482 Dataset
Gordon Lim (University of Michigan); Stefan Larson (Vanderbilt University); Kevin Leach (Vanderbilt University)
Short paper
DocEdit Redefined: In-Context Learning for Multimodal Document Editing
Muhammad Waseem (Shiv Nadar University Chennai); Sanket Biswas (Computer Vision Centre); Josep Llados (Computer Vision Centre)
Short paper
A Comparative Analysis of OCR Models on Diverse Datasets: Insights from Memes and Hiertext Dataset
Iknoor Singh (University of Sheffield); Miguel Colom (ENS Paris-Saclay); Kalina Bontcheva (University of Sheffield)

5:00pm Closing Remarks and Best Paper Award

Zoom Link for Oral Presentation

Zoom: link

Virtual platform

The Zoom link for remote participants can be found on the WACV virtual platform.

Instructions for Oral Presentation

Each accepted paper will be allocated 15 minutes for an oral presentation (10 minutes for the presentation and 5 minutes for questions). We kindly ask you to send us the PDF of your presentation by February 18th to ensure smooth organization during the workshop, as some presentations will also be delivered remotely.

During the 45-minute afternoon break, participants are welcome to display their posters in an optional poster session. Boards and pins will be available in the room for hanging posters.