Paul Gay
Environmental impact of AI, multimedia indexing applied to social sciences
Google Scholar profile | Institutional page
I am a research engineer working on the environmental impacts of AI and on environmental applications of AI. I am involved in several projects, including NLP applications for social sciences, fish counting in videos, and multimedia indexing for crisis management. This year, I am splitting my time between teaching the environmental impact of AI at the CY-tech institute and research on media analysis applied to environmental controversies at Tree (human sciences department of the University of Pau).
Welcome, visitor! Here is the latest news:
- 06/12/2024: IAPau6 is coming! Join us to discuss how AI is shaping business and society!
- 26/6/2024: Meet us at the ICT4S conference, where I'll present our work on the Coca4AI project, in which we measure the environmental cost of data centers.
- Check our AIPowerMeter software: Yet another toolbox to measure the energy consumption of your AI.
- A short video giving a flavour of our social computing work in collaboration with the Tree lab (in French).
- PSfMO: a probabilistic method to impose priors on object reconstruction (code).
- VGfM: a geometry-aware graph neural network designed to detect object relationships in 3D scenes.
- AIPowerMeter: a simple tool to record the power consumption of your Python programs (Intel CPU and Nvidia GPU).
- SfMO in Python: I implemented a Python version of the CVPR Structure from Motion with Objects paper, mostly as material for my computer vision classes.
- There is a GitHub repository for the code of our ACCV paper. Unfortunately, it is not simple to use, as you have to press more than one button.
- LfD method, i.e. 3D localization with perspective cameras, developed mainly by Cosimo Rubino and detailed in his 2018 PAMI paper. I added two improved versions: LfDc, which uses additional linear constraints and gives more accurate results (see our ACCV 2018 paper), and PLfDc, which uses both the constraints and the prior developed for PSfMO.
- PSfMO: detailed in our ICCV 2017 paper.
- slides: background in image processing
- slides: Geometry and structure from motion
The story behind my resume
R&D Engineer 2019-2020: Two entertaining years at the LumenAI start-up, where I worked on document indexing with large industrial companies and on graph analysis problems such as community detection.
Late 2018: Before getting back into research, I spent some time travelling, among other things to practice my watercolors. You can see some samples there.
Post-doc in Italy: 2016-2018 I was a post-doctoral fellow at IIT/PAVIS (Genova, Italy) with Alessio Del Bue, where I worked on merging multiple-view geometry and machine learning. I created PSfMO to include a probabilistic prior in 3D object reconstruction. I also built the first Visual Graphs from Motion, a model which understands a 3D scene by localizing the objects, estimating their occupancy and recognizing the relations between them.
Avignon teaching period: 2014-2015 During this school year, I was a teaching assistant (ATER) at Avignon University and a member of the LIA lab.
PhD: 2011-2014 After this warm-up, I started a PhD under the supervision of Sylvain Meignier (LIUM, France) and Jean-Marc Odobez (IDIAP Research Institute, Switzerland). My research there focused on unsupervised audio-visual person identification in broadcast data. Most of this work consisted of improving speaker diarization and face clustering systems. I used probabilistic graphical models to integrate the different modalities into a global framework. This work took place in the context of the REPERE evaluation campaign and the European project EUMSSI.
Studies: 2006-2011 I graduated in computer science from INSA Rouen (France) in 2011 and obtained at the same time a Master of Science from the University of Rouen. There I discovered machine learning and got involved in the research projects of the LITIS laboratory. To complete this master's degree, I had the chance to find an internship position with Ronan Fablet, in Peru, on the classification of underwater animals in acoustic data.
Research highlights in Vision (2016-2018)
The PAVIS/VGM group has strong expertise in geometry and 3D reconstruction. When I joined them in 2016, their recent efforts aimed to merge this expertise in geometry with machine learning techniques to produce representations which can model both the structure (e.g. the 3D shape) and the semantics (e.g. object labels) of the scene. My two main contributions are:
Selected Publications (click on titles for abstracts, download links and code)
2023-present: GreenAI and AI4Green (social computing)
Coca4ai: checking energy behaviors on AI data centers
2024 ICT4S (ICT for Sustainability)
Active Learning with few shot learning for crisis management
2023 CBMI (Content Based Multimedia Indexing)
2019-2022: Working on a start-up ...
2016-2018: 3D visual scene understanding
Visual Graphs from Motion (VGfM): Scene understanding with object geometry reasoning
2018 ACCV (Asian Conference on Computer Vision)
Probabilistic Structure from Motion with Objects
2017 ICCV (International Conference on Computer Vision)
Factorization based Structure from Motion with Object Priors
2017 CVIU (Computer Vision and Image Understanding)
2011-2014: Audio-visual people indexing in broadcast news
PhD thesis (in French): Audiovisual segmentation and identification of persons in broadcast news.
Comparison of Two Methods for Unsupervised Person Identification in TV Shows
2014 CBMI (IEEE Workshop on Content-Based Multimedia Indexing)
A conditional random field approach in broadcast news using overlaid texts
2014 ICIP (IEEE International Conference on Image Processing)
A conditional random field approach for audio-visual people diarization
2014 ICASSP (International Conference on Acoustics, Speech and Signal Processing)
An open-source state-of-the-art toolbox for broadcast news diarization
2013 INTERSPEECH (Conference of the International Speech Communication Association)
Fusing matching and biometric similarity measures for face diarization in video
2013 ICMR (ACM International Conference on Multimedia Retrieval)
Software
Teaching Activity
A small subset of my Cytech 2020 classes: Deep learning for engineers (in French)