
Paul Gay
Computer vision, multimedia indexing and graph analysis
Google Scholar Profile
Welcome, visitor! Here is the latest news:
- Here we are, 2023! Most of the year will be spent on the environmental impact of data centers (Coca4AI). Also, the discussions are going very well on the social computing side with the Tree laboratory. Join me to discuss this topic at my next seminar (link below).
- 30/01/2023: Social Computing seminar: working with embeddings in 2022 to understand social aspects of the environmental transition.
- PSfMO: a probabilistic method to impose priors on object reconstruction (code).
- VGfM: a geometry-aware graph neural network designed to detect object relationships in 3D scenes.
- AIPowerMeter: a simple tool to record the power consumption of your Python programs (Intel CPU and Nvidia GPU).
- SfMO in Python: I implemented a Python version of the CVPR Structure from Motion with Objects paper, mostly as material for my computer vision classes.
- There is a GitHub repository for the code of our ACCV paper. Unfortunately, it is not simple to use, as you have to press more than one button.
- LfD method, i.e. 3D localization with perspective cameras, developed mainly by Cosimo Rubino and detailed in his 2018 PAMI paper. I added two improved versions: LfDc, which uses additional linear constraints and gives more accurate results (see our ACCV 2018 paper), and PLfDc, which uses both the constraints and the prior developed for PSfMO.
- PSfMO: detailed in our ICCV 2017 paper.
The story behind my resume
Current affiliation: I am a research engineer in the GreenAI Uppa team, where I study the environmental impacts of AI and environmental applications of AI. I am involved in several projects, including early exit for embedded algorithms, fish counting in videos and multimedia indexing for crisis management. Note: this year, I am officially part of the LISN laboratory, where I am involved in a carbon footprint study of the lab-ia data center.
Studies: 2006-2011 I graduated in computer science from INSA Rouen (France) in 2011 and obtained at the same time a Master of Science from the University of Rouen. There I discovered machine learning and got involved in the research projects of the LITIS laboratory. To complete this master, I had the chance to find an internship position with Ronan Fablet, in Peru, on the classification of underwater animals in acoustic data.
Phd: 2011-2014 After this warm-up, I started a PhD under the supervision of Sylvain Meignier (LIUM, France) and Jean-Marc Odobez (IDIAP Research Institute, Switzerland). My research there focused on unsupervised audio-visual person identification in broadcast data. The major part of this work consisted of improving speaker diarization and face clustering systems. I used probabilistic graphical models to integrate the different modalities in a global framework. This work took place in the context of the REPERE evaluation campaign and the European project EUMSSI.
Avignon teaching period: 2014-2015 During the school year 2014-2015, I was a teaching assistant (ATER) at Avignon University and a member of the LIA lab.
Post-doc in Italy: 2016-2018 I was a post-doctoral fellow at IIT/PAVIS (Genova, Italy) with Alessio Del Bue, where I worked on merging multiple-view geometry and machine learning. We built models that understand a 3D scene by localizing the objects, estimating their occupancy and recognizing the relations between them.
Late 2018: Before getting back into research, I did some travelling, among other things to practice my watercolors. You can see some samples here.
R&D Engineer 2019-2020: Two entertaining years at the LumenAI start-up, where I worked on document indexing with large industrial companies and on graph analysis problems such as community detection.
Research highlights
The PAVIS/VGM group has strong expertise in geometry and 3D reconstruction. When I joined them in 2016, their recent efforts aimed to merge this expertise in geometry with machine learning techniques, to produce representations that model both the structure (e.g. the 3D shape) and the semantics (e.g. object labels) of the scene. My two main contributions are:
Publications (click on titles for abstracts, download links and code)
2016-2018: 3D visual scene understanding
Visual Graphs from Motion (VGfM): Scene understanding with object geometry reasoning
2018 ACCV (Asian Conference on Computer Vision)
Probabilistic Structure from Motion with Objects
2017 ICCV (International Conference on Computer Vision)
Factorization based Structure from Motion with Object Priors
2017 CVIU (Computer Vision and Image Understanding)
2011-2014: audio-visual people indexing in broadcast news
PhD thesis (in French): Audiovisual segmentation and identification of persons in broadcast news.
CRF-Based Context Modeling for Person Identification in Broadcast Videos
2015 Frontiers in ICT (journal)
Comparison of Two Methods for Unsupervised Person Identification in TV Shows
2014 CBMI (IEEE Workshop on Content-Based Multimedia Indexing)
A conditional random field approach in broadcast news using overlaid texts
2014 ICIP (IEEE International Conference on Image Processing)
A conditional random field approach for audio-visual people diarization
2014 ICASSP (International Conference on Acoustics, Speech and Signal Processing)
An open-source state-of-the-art toolbox for broadcast news diarization
2013 INTERSPEECH (Conference of the International Speech Communication Association)
Fusing matching and biometric similarity measures for face diarization in video
2013 ICMR (ACM International Conference on Multimedia Retrieval)