Past seminars

You can find below the list of past seminars. Videos of some of the past seminars are available online, or via the link below.

Thursday May 12th, 2022, Gerard Ben Arous (New York University)

**Title:** *Effective dynamics and critical scaling for Stochastic Gradient Descent in high dimensions*

**Abstract:** SGD is a workhorse of high-dimensional statistics and machine learning, but understanding its behavior in high dimensions is not yet a simple task. We study here the limiting 'effective' dynamics of some summary statistics for SGD in high dimensions, and find interesting and new regimes, i.e. not the expected one given by the population gradient flow. We find that a new corrector term is needed and that the phase portrait of these dynamics is substantially different from what would be predicted using the classical approach, including for simple tasks. (Joint work with Reza Gheissari (UC Berkeley) and Aukosh Jagannath (Waterloo).)

Friday April 8th, 2022, Florence d'Alché-Buc (Telecom ParisTech)

**Title:** *Learning to predict complex outputs: a kernel view*

**Abstract:** Motivated by prediction tasks such as molecule identification or functional regression, we propose to leverage the notion of kernel to take into account the nature of output variables, whether they be discrete structures or functions. This approach boils down to encoding output data as vectors of the reproducing kernel Hilbert space associated with the so-called output kernel. We present vector-valued kernel machines to implement it and discuss different learning problems linked with the chosen loss function. Eventually, large-scale approaches can be developed using low-rank approximations of the outputs. We illustrate the framework on graph prediction and infinite task learning.

Thurs. Dec. 9th, 2021, Brice Ménard (Johns Hopkins University)

**Title:** *Data science and science with data*

**Abstract:** The young field of Machine learning has changed the ways we interact with data and neural networks have made us appreciate the potential of working with millions of parameters. Interestingly, the vast majority of scientific discoveries today are not based on these new techniques. I will discuss the contrast between these two regimes and I will show how an intermediate approach, i.e. neural network inspired but mathematically defined statistics (scattering and phase harmonic transforms), can provide the long-awaited tools in scientific research. I will illustrate these points using astrophysics as an example.

Thurs. Oct. 21st, 2021, Eric Vanden-Eijnden (NYU)
[video]

**Title:** *Machine learning and applied mathematics*

**Abstract:** The recent success of machine learning suggests that neural networks may be capable of approximating high-dimensional functions with controllably small errors. As a result, they could outperform standard function interpolation methods that have been the workhorses of scientific computing but do not scale well with dimension. In support of this prospect, here I will review what is known about the trainability and accuracy of shallow neural networks, which offer the simplest instance of nonlinear learning in functional spaces that are fundamentally different from classic approximation spaces. The dynamics of training in these spaces can be analyzed using tools from optimal transport and statistical mechanics, which reveal when and how shallow neural networks can overcome the curse of dimensionality. I will also discuss how scientific computing problems in high dimension once thought intractable can be revisited through the lens of these results, focusing on applications related to (i) solving Fokker-Planck equations associated with high-dimensional systems displaying metastability and (ii) sampling Boltzmann-Gibbs distributions using generative models to assist MCMC methods.

Thurs. April 29th, 2021, Giuseppe Carleo (EPFL)
[Slides]

**Title:** *Learning Solutions to the Schrödinger equation with Neural-Network Quantum States*

**Abstract:** The theoretical description of several complex quantum phenomena fundamentally relies on many-particle wave functions and our ability to represent and manipulate them. Variational methods in quantum mechanics aim at compact descriptions of many-body wave functions in terms of parameterised ansatz states, and are at present undergoing exciting, transformative developments informed by ideas developed in machine learning. In this presentation I will discuss variational representations of quantum states based on artificial neural networks [1] and their use in approximately solving the Schrödinger equation. I will further highlight the general representation properties of such states, the crucial role of physical symmetries, as well as the connection with other known representations based on tensor networks [2]. Finally, I will discuss how some classic ideas in machine learning, such as the Natural Gradient, are being used and re-purposed in quantum computing applications [3].

[1] Carleo and Troyer, Science 355, 602 (2017)

[2] Sharir, Shashua, and Carleo, arXiv:2103.10293 (2021)

[3] Stokes, Izaac, Killoran, and Carleo, Quantum 4, 269 (2020)

Thurs. March 25th, 2021, Caroline Uhler (MIT)

**Title:** *Causality and Autoencoders in the Light of Drug Repurposing for COVID-19*

**Abstract:** Massive data collection holds the promise of a better understanding of complex phenomena and ultimately, of better decisions. An exciting opportunity in this regard stems from the growing availability of perturbation / intervention data (genomics, advertisement, education, etc.). In order to obtain mechanistic insights from such data, a major challenge is the integration of different data modalities (video, audio, interventional, observational, etc.). Using genomics as an example, I will first discuss our recent work on coupling autoencoders to integrate and translate between data of very different modalities such as sequencing and imaging. I will then present a framework for integrating observational and interventional data for causal structure discovery and characterize the causal relationships that are identifiable from such data. We then provide a theoretical analysis of autoencoders linking overparameterization to memorization. In particular, I will characterize the implicit bias of overparameterized autoencoders and show that such networks trained using standard optimization methods implement associative memory. We end by demonstrating how these ideas can be applied for drug repurposing in the current COVID-19 crisis.

Thurs. Feb. 18th, 2021, Josh McDermott (MIT)
[Slides]

**Title:** *New Models of Human Hearing via Machine Learning*

**Abstract:** Humans derive an enormous amount of information about the world from sound. This talk will describe our recent efforts to leverage contemporary machine learning to build neural network models of our auditory abilities and their instantiation in the brain. Such models have enabled a qualitative step forward in our ability to account for real-world auditory behavior and illuminate function within auditory cortex. But they also exhibit substantial discrepancies with human perceptual systems that we are currently trying to understand and eliminate.

Jan. 14th, 2021, Eero Simoncelli (New York University)
[video]

**Title:** *Sampling and Solving Linear Inverse Problems Using the Prior Implicit in a Denoiser*

**Abstract:** Prior probability models are a central component of many image processing problems, but density estimation is a notoriously difficult problem for high dimensional signals such as photographic images. Deep neural networks have provided impressive solutions for problems such as denoising, which implicitly rely on a prior probability model of natural images. I’ll describe our progress in understanding and using this implicit prior. We rely on a little-known statistical result due to Miyasawa (1961), who showed that the least-squares solution for removing additive Gaussian noise can be written directly in terms of the gradient of the log of the noisy signal density. We use this fact to develop a stochastic coarse-to-fine gradient ascent procedure for drawing high-probability samples from the implicit prior embedded within a CNN trained to perform blind (i.e., unknown noise level) least-squares denoising. A generalization of this algorithm to constrained sampling provides a method for using the implicit prior to solve any linear inverse problem, with no additional training. We demonstrate this general form of transfer learning in multiple applications, using the same algorithm to produce high-quality solutions for deblurring, super-resolution, inpainting, and compressive sensing. Joint work with Zahra Kadkhodaie, Sreyas Mohan, and Carlos Fernandez-Granda

Nov. 12th, 2020, Alice Guionnet (ENS Lyon)

**Title:** *Rare events in Random Matrices and Applications*

**Abstract:** In this talk, I will discuss recent developments in the theory of large deviations in random matrix theory and their applications in statistical learning.

Oct. 8th, 2020, Andrew Saxe (Oxford)
[video]

**Title:** *The Neural Race Reduction: Dynamics of learning in ReLU networks*

**Abstract:** What is the relationship between task geometry, network architecture, and emergent learning dynamics in nonlinear deep networks? I will describe the neural race reduction, which describes gradient descent learning dynamics in ReLU networks in the feature learning regime for a subset of nonlinear tasks. The reduction reveals a bias in gradient descent dynamics toward exploiting shared structure and abstraction where possible. I will then turn to an fMRI experiment testing predicted representational geometry in a nonlinear context-dependent task. These results provide a new window into learning dynamics in nonlinear neural networks.

Feb. 27th, 2020, Michael Biehl (University of Groningen)
[video]

**Title:** *Prototype-Based Classifiers and Their Application in the Life Science*

**Abstract:** This talk briefly reviews important aspects of prototype-based systems in the context of supervised learning. A key issue is the choice of an appropriate distance or similarity measure for the task at hand. The powerful framework of relevance learning will be discussed, in which parameterized distance measures are adapted together with the prototypes in the same data-driven training process. Example applications in the bio-medical domain are presented in order to illustrate the concept: (I) the classification of adrenocortical tumors using steroid metabolomics data, (II) the early diagnosis of rheumatoid arthritis based on cytokine expression and (III) the detection and discrimination of neurodegenerative diseases in 3D brain images.

Feb. 6th, 2020, Martin Weigt (Sorbonne Université)
[video]

**Title:** *From generative models of protein sequences to evolution-guided protein design*

**Abstract:** Thanks to the sequencing revolution in biology, protein sequence databases have been growing exponentially over the last years. Data-driven computational approaches are becoming more and more popular in exploring this increasing data richness. In my talk, I will show that global statistical modeling approaches, like (Restricted) Boltzmann Machines, are able to accurately capture the natural variability of amino-acid sequences across entire families of evolutionarily related but distantly diverged proteins. We show that these models are biologically interpretable; they allow one to extract information about the three-dimensional protein structure and about protein-protein interactions from sequence data, and they unveil distributed sequence motifs. These models can be seen as highly performant generative models: they capture the natural sequence variability far beyond fitted quantities, and they allow one to design novel, fully functional proteins by simple MCMC sampling approaches.

*Bio:* Martin Weigt is Professor for Computational Biology at Sorbonne Université, Paris, where he heads the research team 'Statistical Genomics and Biological Physics' within the Laboratory of Computational and Quantitative Biology (LCQB). Combining his original scientific background in theoretical and statistical physics with the exploding data richness in genomics and biology, he is particularly interested in the development of data-driven modeling approaches for biological sequences, their evolution and de novo design.

Nov. 26th, 2019, Yue M. Lu (John A. Paulson School of Engineering and Applied Sciences, Harvard University)

**Title:** *Exploiting the Blessings of Dimensionality in Big Data*

**Abstract:** The massive datasets being compiled by our society present new challenges and opportunities to the field of signal and information processing. The increasing dimensionality of modern datasets offers many benefits. In particular, the very high-dimensional settings allow one to develop and use powerful asymptotic methods in probability theory and statistical physics to obtain precise characterizations that would otherwise be intractable in moderate dimensions. In this talk, I will present recent work where such blessings of dimensionality are exploited. In particular, I will show (1) the exact characterization of a widely-used spectral method for nonconvex statistical estimation; (2) the fundamental limits of solving the phase retrieval problem via linear programming; and (3) how to use scaling and mean-field limits to analyze nonconvex optimization algorithms for high-dimensional inference and learning. In these problems, asymptotic methods not only clarify some of the fascinating phenomena that emerge with high-dimensional data, they also lead to optimal designs that significantly outperform heuristic choices commonly used in practice.

June 11th, 2019, Jean-Remi King (ENS)
[video]

**Title:** *From brains to algorithms: parsing neuroimaging data to infer the computational architecture of human cognition.*

**Abstract:** While machine learning is an autonomous research field, a number of historical (e.g. artificial neural networks) as well as more recent computational strategies (e.g. attentional gating) have been influenced by cognitive and neuroscientific findings. To what extent can cognitive neuroscience continue to guide and intersect with the development of machine learning? To highlight potential directions to this major issue, I will present three studies that investigate the computational organization of brain processing. For each of them, I will show that we can use non-invasive neuroimaging techniques with high temporal precision to parse the computational stages of visual processing in the healthy human brain. Our results show that the raw visual input that bombards our retina is progressively transformed into meaningful representations by a hierarchical algorithm, distributed both over time and space. Finally, I will briefly show how these methods can now be applied to understand language processing in humans, and thus help us tackle the modern challenges of machine learning.

May 9th, 2019, Matthieu Husson (Observatoire de Paris)

**Title:** *Artificial Intelligence and data sets from the history of astronomy: new opportunities?*

**Abstract:** The recent development of Digital Humanities and the requirement to publish research data alongside research results are transforming the availability of historical sources, along with the means to analyse, edit, and relate them. In this context, DISHAS relies on a network of international projects in Chinese, Sanskrit, Arabic, Latin and Hebrew sources in the history of astral sciences and aims at providing tools to edit and analyse the different types of sources usually treated in the field, namely, prose and versified texts, iconography and technical/geometrical diagrams, and astronomical tables. This leads to the progressive constitution of precisely described datasets that are a promising field of experimentation for data sciences in general and artificial intelligence in particular. In this presentation we want to introduce our datasets, their characteristics and historical meaning. We will discuss different lines of research that could converge toward data sciences topics, with a specific focus on the understanding of historical actors' computations from the analysis of the numerical tables they produced.

March 25th, 2019, Béatrice Prunel and Gregory Chatonsky (ENS)

**Title:** *Art and artificial imagination*

**Abstract:** Contemporary media are fascinated by the applications of neural networks in creation. They regularly highlight moments when artificial artistic productions have "deceived" humans and "replaced" artists. All this seems to confirm that AI would have conquered up to the last ramparts of humanity: interiority and creativity. The dialogue between an art historian, invested in the digital humanities, and an artist who is himself familiar with deep learning, invites us to change our perspective. A historical and materialistic approach makes it possible to better distinguish what is new in the apparent emergence of AI in the arts and to better grasp the implicit conception of art that develops there: the change of purpose of a new technique, which generates surprising results, is also a way of thwarting the assumptions of the contemporary economic system. It suggests criticism of it as much as it opens up new possibilities.

October 17th, 2018, Emmanuel Dupoux (EHESS)
[video]

**Title:** *Towards developmental AI*

**Abstract:** Even though current machine learning techniques yield systems that achieve parity with humans on several high level tasks, the learning algorithms themselves are orders of magnitude less data efficient than those used by humans, as evidenced by the speed and resilience with which infants learn language and common sense. I review some of our recent attempts to reverse engineer such abilities in the area of unsupervised or weakly supervised learning of speech representations, the segmentation of speech terms, and the learning of the laws of intuitive physics by observation of videos. I argue that a triple effort in data collection, algorithm development and fine grained human/machine comparisons is needed to uncover these developmental algorithms.

**Bio:** E. Dupoux is full professor at the Ecole des Hautes Etudes en Sciences Sociales (EHESS), and directs the Cognitive Machine Learning team at the Ecole Normale Supérieure (ENS) in Paris and INRIA (www.syntheticlearner.com). His education includes a PhD in Cognitive Science (EHESS), an MA in Computer Science (Orsay University) and a BA in Applied Mathematics (Pierre & Marie Curie University, ENS). His research mixes developmental science, cognitive neuroscience, and machine learning, with a focus on the reverse engineering of infant language and cognitive development using unsupervised or weakly supervised learning. He is the recipient of an Advanced ERC grant, the organizer of the Zero Resource Speech Challenge (2015, 2017), the Intuitive Physics Benchmark (2017) and led in 2017 a Jelinek Summer Workshop at CMU on multimodal speech learning. He has authored 150 articles in various peer reviewed outlets.

June 12th, 2018, Joan Bruna (New York University)
[video]

**Title:** *Learning Graph Inverse Problems with Neural Networks*

**Abstract:** Inverse Problems on graphs encompass many areas of physics, algorithms and statistics, and are a confluence of powerful methods, ranging from computational harmonic analysis and high-dimensional statistics to statistical physics. As with inverse problems in signal processing, learning has emerged as an intriguing alternative to regularization and other computationally tractable relaxations, opening up new questions in which high-dimensional optimization, neural networks and data play a prominent role. In this talk, I will argue that several tasks that are ‘geometrically stable’ can be well approximated with Graph Neural Networks, a natural extension of Convolutional Neural Networks on graphs. I will present recent work on supervised community detection, quadratic assignment, neutrino detection and beyond, showing the flexibility of GNNs to extend classic algorithms such as Belief Propagation.

**Bio:** Joan Bruna is an Assistant Professor of Computer Science, Data Science and Mathematics (affiliated) at the Courant Institute of Mathematical Sciences, New York University, and at the Center for Data Science. His research interests touch several areas of Machine Learning, Signal Processing and High-Dimensional Statistics. In particular, in the past few years he has been working on Convolutional Neural Networks, studying some of their theoretical properties, extensions to more general geometries, and applications to physical sciences and statistics. Before that, he worked at FAIR (Facebook AI Research) in New York. Prior to that, he was a postdoctoral researcher at Courant Institute, NYU. He completed his PhD in 2013 at Ecole Polytechnique, France. He is the recipient of an Alfred P. Sloan Fellowship (2018), and he has organized multiple tutorials and workshops on geometric deep learning, including NIPS and CVPR in 2017.

May 15th, 2018, Balázs Kégl (Université Paris Saclay)
[video]

**Title:** *Machine learning in scientific workflows*

**Abstract:** I will describe our contributions to scientific ML workflow building and optimization, which we have carried out within the Paris-Saclay Center for Data Science. I will start by mapping out the different use cases of machine learning in sciences (data collection, inference, simulation, hypothesis generation). Then I will detail some of the particular challenges of ML/science collaborations and the solutions we built to solve these challenges. I will briefly describe the open code submission RAMP tool that we built for collaborative prototyping, detail some of the workflows (e.g., the Higgs boson discovery pipeline, El Nino forecasting, detecting Mars craters on satellite images), and present results on rapidly optimizing machine learning solutions.

**Bio:** Balázs Kégl received the Ph.D. degree in computer science from Concordia University, Montreal, in 1999. From January to December 2000 he was a Postdoctoral Fellow at the Department of Mathematics and Statistics at Queen's University, Kingston, Canada, receiving an NSERC Postdoctoral Fellowship. He was in the Department of Computer Science and Operations Research at the University of Montreal, as an Assistant Professor from 2001 to 2006. Since 2006 he has been a research scientist in the Linear Accelerator Laboratory of the CNRS. He has published more than a hundred papers on unsupervised and supervised learning (principal curves, intrinsic dimensionality estimation, boosting), large-scale Bayesian inference and optimization, and on various applications ranging from music and image processing to systems biology and experimental physics. At his current position he has been the head of the AppStat team working on machine learning and statistical inference problems motivated by applications in high-energy particle and astroparticle physics. Since 2014, he has been the head of the Center for Data Science of the University of Paris Saclay. In 2016 he co-created RAMP (www.ramp.studio).

March 13th, 2018, Elizabeth Purdom (Berkeley)
[video]

**Title:** *Statistical challenges in analyzing high-dimensional experiments in molecular biology*

**Abstract:** Molecular biology experiments frequently take tens of thousands of measurements on a cell in order to obtain a full snapshot of the activity of the cell. Analysis of these experiments requires the integration of statistical techniques with the unique biological aspects of these experiments. In this talk I will give an overview of the data challenges faced in these settings, as well as highlight how the solutions to these problems compare to those used in the wider data science community. I will illustrate these points with examples from my research in developing methods for the analysis of the measurements of mRNA abundance of cells.

Feb. 6th, 2018, Maureen Clerc (INRIA)
[video]

**Title:** *Brain-computer interfaces: two concurrent learning problems*

**Abstract:** Brain-Computer Interfaces (BCI) are systems which provide real-time interaction through brain activity, bypassing traditional interfaces such as keyboard or mouse. A target application of BCI is to restore mobility or autonomy to severely disabled patients. In BCI, new modes of perception and interaction come into play, which users must learn, just as infants learn to explore their sensorimotor system. Feedback is central in this learning. From the point of view of the system, features must be extracted from the brain activity, and translated into commands. Feature extraction and classification are important components of a BCI. Adaptive learning strategies are needed because of the high variability of the brain signals. Moreover, additional markers may also be extracted to modulate the system's behavior. It is for instance possible to monitor the brain's reaction to the BCI outcome. In this talk I will present some of the current machine learning methods which are used in BCI, and the adaptation of BCI to users' needs.

Nov. 14th, 2017, Rémi Monasson (ENS)
[video]

**Title:** *Searching for interaction networks in proteins: from statistical physics to machine learning, and back*

**Abstract:** Over the last century, statistical physics has been extremely successful at predicting the collective behaviour of many physical systems from detailed knowledge about their microscopic components. However, complex systems, whose properties result from the delicate interplay of many strong and heterogeneous interactions, are notoriously difficult to tackle with first-principles approaches. It is therefore tempting to use data to infer adequate microscopic models. I will present some efforts made along this direction for proteins, based on the well-known Potts model of statistical mechanics, with an emphasis on computational and theoretical aspects. I will then show how machine learning, whose unsupervised models encompass the Potts model, can be an inspiring source of new questions for statistical mechanics.

Oct. 3rd, 2017, Jean-Luc Starck (CEA)
[video]

**Title:** *Cosmostatistics: Tackling Big Data from the Sky*

**Abstract:** Since the dawn of time, humans have been wondering about their place in the Universe. Over the past century, advances in modern physics, technology and engineering, along with the unique possibilities offered by space missions, have opened new windows to explore the cosmos. All-sky surveys, with observations across the entire electromagnetic spectrum, are the best strategy to fully understand and model the Universe in detail. Major upcoming research facilities, such as the Large Synoptic Survey Telescope (LSST), the Square Kilometer Array (SKA) and the Euclid space telescope, will provide key elements to addressing this challenge, by producing high quality data of petabyte volumes. These surveys prove to be a major 'big data' challenge, which requires the development of innovative statistical methods essential both for the data analysis and their physical interpretation. I will present some highlights of this methodology and more specifically show how novel techniques of sparsity and compressed sensing open new perspectives in analysing cosmological data. These enable us to answer fundamental questions about the nature of our Universe with impressive accuracy.

June 27th, 2017, Guillermo Sapiro (Duke University)

**Title:** *Learning to Succeed while Teaching to Fail: Privacy in Closed Machine Learning Systems*

**Abstract:** Security, privacy, and fairness have become critical in the era of data science and machine learning. More and more we see that achieving universally secure, private, and fair systems is practically impossible. We have seen for example how generative adversarial networks can be used to learn about the expected private training data; how the exploitation of additional data can reveal private information in the original one; and how seemingly unrelated features can teach us about each other. Confronted with this challenge, in this work we open a new line of research, where security, privacy, and fairness are considered in a closed environment. The goal is to ensure that a given entity, trusted to infer certain information with our data, is blocked from inferring protected information from it. For example, a hospital might be allowed to produce a diagnosis on the patient (the positive task), without being able to infer the irrelevant gender of the subject (negative task). Similarly, a company can guarantee they internally are not using the provided data for any undesired task, an important goal that is not contradicting the virtually impossible challenge of blocking everybody from the undesired task. We design a system that learns to perform the positive task while simultaneously being trained to fail at the negative one, and illustrate this with challenging cases where the positive task is actually harder than the negative one. The particular framework and examples open the door to security, privacy, and fairness in very important closed scenarios. Joint work with Jure Sokolic and Miguel Rodrigues.

Tuesday, May 16th, 2017, Sophie Deneve (ENS)
[video]

**Title:** *The brain as an optimal efficient adaptive learner*

**Abstract:** Understanding how neural networks can learn to predict and represent time-varying variables is a fundamental challenge in neuroscience. A key complication is the error credit assignment problem: how to determine the local contribution of each synapse to the network’s global output error. Previous work on solving this problem in spiking networks has either been restricted to linear systems (Boerlin, Machens, Deneve 2013; Bourdoukan, Deneve 2015), or to non-local learning rules (FORCE learning; Sussillo, Abbott 2009; Thalmeier et al. 2016). Here we show how to learn arbitrary non-linear dynamical systems with local learning rules. Our approach uses tools from adaptive control theory, and applies them to a spiking network with nonlinear dendrites. The spiking network receives its own tracking error through feedback and learns to approximate a nonlinear dynamical system using a purely local learning rule. The local credit assignment problem is solved because each neuron effectively contains partial information of the error made by the entire network. This error is captured by the tightly balanced voltage of each neuron. Here, a balanced network effectively acts as a predictive auto-encoder that learns to cancel its own error and feedback. The resulting network is extremely efficient in terms of the number of spikes fired, and it is highly robust to noise and neural elimination. It produces asynchronous, irregular spiking activity matching the Poisson-like neural variability observed experimentally. Our framework has several important implications. It suggests that a global learning problem, like learning to implement complex nonlinear dynamics from examples, can be solved with local rules, as long as output errors are fed back as driving input signals. Our approach inherits the analytical tools of control theory such as convergence and stability theorems that can now be applied to learning in spiking networks.

March 7th, 2017, Francis Bach (INRIA)
[video]

**Title:** *Beyond stochastic gradient descent for large-scale machine learning*

**Abstract:** Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ('large n') and each of these is large ('large p'). In this setting, online algorithms such as stochastic gradient descent, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. In this talk, I will show how the smoothness of loss functions may be used to design novel algorithms with improved behavior, both in theory and practice: in the ideal infinite-data setting, an efficient novel Newton-based stochastic approximation algorithm leads to a convergence rate of O(1/n) without strong convexity assumptions, while in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate for strongly convex problems, with an iteration cost similar to stochastic gradient descent. (Joint work with Nicolas Le Roux, Eric Moulines and Mark Schmidt.)

Feb. 21st, 2017, Yann Ollivier (CNRS and Paris-Sud)

**Title:** *Artificial intelligence and inductive reasoning: from information theory to artificial neural networks*

**Abstract:** Problems of inductive reasoning or extrapolation, such as "guessing the next term in a series of numbers" or, more generally, "understanding the hidden structure in observations", are fundamental if we ever want to build an artificial intelligence. These problems can sometimes seem not to be mathematically well defined. Yet there is a rigorous mathematical theory of inductive reasoning and extrapolation, based on information theory. This theory is very elegant, but difficult to apply.

In practice today, neural networks give the best results on a whole range of concrete induction and learning problems (vision, speech recognition, recently the game of Go or self-driving cars...). I will review some of the underlying mathematical principles and their link with information theory.

*Short bio:* Yann Ollivier is a researcher in computer science and mathematics at the CNRS, LRI, Université Paris-Saclay. After starting his career in pure mathematics, with topics ranging from probability to group theory, he decided to move to artificial intelligence, with an emphasis on artificial neural network training, deep learning, and their links with information theory. In 2011 he was awarded the bronze medal of the CNRS for his work.

Jan. 10th, 2017, Bertrand Thirion (INRIA and Neurospin)

**Title:** *A big data approach towards functional brain mapping*

**Abstract:** Functional neuroimaging offers a unique view on brain functional organization, which is broadly characterized by two features: the segregation of brain territories into functionally specialized regions, and the integration of these regions into networks of coherent activity. Functional Magnetic Resonance Imaging yields a spatially resolved, yet noisy view of this organization. It also yields useful measurements of brain integrity to compare populations and characterize brain diseases. To extract information from these data, a popular strategy is to rely on supervised classification settings, where signal patterns are used to predict the experimental task performed by the subject during a given experiment, which is a proxy for the cognitive or mental state of this subject. In this talk we will describe how the reliance on large data corpora changes the picture: it boosts the generalizability of the results and provides meaningful priors to analyze novel datasets. We will discuss the challenges posed by these analytic approaches, with an emphasis on computational aspects, and how non-labelled data can be further used to improve the model learned from brain activity data.

Dec. 15th, 2016, Special Inauguration Conference (-)

**Title:** *Inauguration of the Chair CFM-ENS*

**Abstract:** The ENS and CFM organize the inauguration conference of the Chair 'Modèles et Sciences des Données'.

Nov. 8th, 2016, Cristopher Moore (Santa Fe Institute)
[video]

**Title:** *What physics can tell us about inference?*

**Abstract:** There is a deep analogy between statistical inference and statistical physics; I will give a friendly introduction to both of these fields. I will then discuss phase transitions in two problems of interest to a broad range of data sciences: community detection in social and biological networks, and clustering of sparse high-dimensional data. In both cases, if our data becomes too sparse or too noisy, it suddenly becomes impossible to find the underlying pattern, or even tell if there is one. Physics helps us both locate these phase transitions and design optimal algorithms that succeed all the way up to this point. Along the way, I will visit ideas from computational complexity, random graphs, random matrices, and spin glass theory.

Oct. 11th, 2016, Jean-Philippe Vert (Mines ParisTech, Institut Curie and ENS)
[video]

**Title:** *Can Big Data cure Cancer?*

**Abstract:** As the cost and throughput of genomic technologies reach a point where DNA sequencing is close to becoming a routine exam in the clinic, there is a lot of hope that treatments of diseases like cancer can dramatically improve through a digital revolution in medicine, where smart algorithms analyze "big medical data" to help doctors make the best decisions for each patient or to suggest new directions for drug development. While artificial intelligence and machine learning-based algorithms have indeed had a great impact on many data-rich fields, their application to genomic data raises numerous computational and mathematical challenges that I will illustrate with a few examples of patient stratification or drug response prediction from genomic data.