I am on a mission to help the world adopt advanced analytics and make optimal data-driven decisions. This poses two interesting challenges. First, advanced analytics needs to be made more accessible and easier to use for non-analysts, providing clear, actionable results. Second, the world needs help to start thinking, forming hunches, testing them, and gaining insights.
My efforts go to the development of new algorithms and interactive workflows to solve problems using data and science.
I am managing director at Evolved Analytics Europe and chief data scientist and partner at Evolved Analytics LLC. Passionate about science and technology, I also teach graduate courses and give lectures on data science and data-driven problem solving.
I have a Ph.D. in data-driven problem solving from Tilburg University (Netherlands), a Professional Doctorate in Engineering from Eindhoven University of Technology (Netherlands), and an M.Sc. in intelligent systems from Lomonosov Moscow State University (Russia).
I love solving hard problems, working with great people, looking for simplicity in complex things, and making the complicated look simple.
I like people who have passion for what they do. I see challenges as opportunities, try to make the world better, and enjoy making my dreams come true.
My research is in the area of predictive modeling and data science with a focus on data-driven problem solving. It can be split into the following categories (not mutually exclusive): (1) uncovering relationships in given data, (2) selecting data features that matter, (3) data acquisition, curation, modeling, and interpretation, (4) data balancing, (5) symbolic regression for system identification, (6) system modeling and optimization. Lately, I've been trying to reinvent myself by diving into the field of interactive visualization.
The list of notable scientific publications can be found on Google Scholar.
For the most recent information about the projects and data-driven solutions, please visit evolved-analytics.be. Below I mention a few published examples, which facilitated the generation of new technology.
Collaborators: Sean Stijven, Ruben Van den Bossche, Kurt Vanmechelen, Jan Broeckhove (University of Antwerp, Antwerp, Belgium) and Mark Kotanchek (Evolved Analytics LLC) (2013). See our book chapter.
Collaborators: Lander Willem, Sean Stijven, Niel Hens, Jan Broeckhove, and Phillip Beutels, University of Antwerp, Antwerp, Belgium (September 2010 - April 2014). See our article.
Collaborators: Oliver Flasch, Martina Friese, Thomas Bartz-Beielstein, Olaf Mersmann, Boris Naujoks, Jörg Stork, Martin Zaefferer (Cologne University of Applied Sciences, Germany), 2013
Collaborators: Nicholas Staelens, Dirk Deschrijver, Tom Dhaene (Ghent University & iMinds, Belgium)
Collaborators: Tobias Friedrich (Friedrich-Schiller-Universität Jena, Germany), Frank Neumann & Markus Wagner (University of Adelaide, Australia), 2013
Collaborators: Una-May O’Reilly & Kalyan Veeramachaneni (Massachusetts Institute of Technology, USA), 2009-2010
In 2008-2012 I supervised several very interesting graduate projects. In all but the first one in the selected list below I was the promotor.
Master student: Chinnappa Subramoniam Narendhran
Co-supervisors: Olaf Mersmann, Thomas Bartz-Beielstein (Cologne University of Applied Sciences, Germany)
Master thesis of Wouter Minnebo and Sean Stijven (University of Antwerp, Belgium)
Bachelor thesis of Joachim van der Herten (University of Antwerp, Belgium)
Bachelor thesis of Pieter Kerstens (University of Antwerp, Belgium)
Co-promotor and Problem owner: Ignace Van de Woestyne (IÉSEG School of Management, University College Brussels, Belgium)
Bachelor thesis of Sean Stijven (University of Antwerp, Belgium)
Bachelor thesis of Wouter Minnebo (University of Antwerp, Belgium)
I have several (beautiful) graduate courses prepared and tested. For inquiries and booking, please send me an email. For tutorials and talks, please check the website of our company.
This one-day training is on feature selection and feature importance in data-driven modeling for hard regression problems. Variable selection is a process of identifying influential variables (features, attributes) in a real or simulated system, that are discriminative and necessary to describe the system's performance characteristics.
Focusing the research (and modeling) on relevant variables reduces the dimensionality of the original problem (by making the problem tractable), shortens time to market (by facilitating insights), improves generalization (by generating robust knowledge), and heavily cuts down the costs for development and deployment of data-driven solutions.
Our aim is to provide a critical and objective analysis of the feature selection problem for regression, with the complicating factors of noisy, imbalanced data, correlated and coupled variables, and possibly many redundant variables. This is a hands-on course. All methods will be illustrated on toy and real-world examples. If you have an interesting, challenging feature selection problem that could be used as an example during the course, please contact email@example.com (at least one week before the course).
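To make the idea of ranking variables by relevance concrete, here is a minimal sketch that scores each input by the absolute value of its linear correlation with the response. It is a deliberately naive filter and not one of the methods taught in the course: it will miss nonlinear effects and can be misled by exactly the correlated and coupled variables mentioned above. The data set, the variable names (x0, x1, x2), and the helper `pearson` are hypothetical and exist only for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy data: the response depends on x0 and x2 only; x1 is redundant.
rng = random.Random(0)
n = 500
columns = {
    name: [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for name in ("x0", "x1", "x2")
}
response = [
    2.0 * a - c + rng.gauss(0.0, 0.1)
    for a, c in zip(columns["x0"], columns["x2"])
]

# Score every variable and sort from most to least relevant.
scores = {name: abs(pearson(values, response)) for name, values in columns.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

On real data such a ranking is only a first look; robust variable selection has to cope with noise, imbalance, and interactions between variables.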
Symbolic Regression is a field of supervised learning by evolutionary algorithms, aimed at modeling given numeric input-response data. Unlike classical regression, which assumes a certain model structure and optimizes the parameters, symbolic regression searches for appropriate model structure and coefficients. Symbolic regression models are defined in a space of all possible explicit expressions of the response variable, given analytically, as functions of some of the input variables, constants and operators from a given set.
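The difference from classical regression can be illustrated with a toy sketch. Instead of fixing a model structure and fitting its parameters, the code below searches over expression structures built from a small set of operators, constants, and the input variable, and prefers accurate expressions with few nodes. Note the assumptions: the search here is a plain exhaustive enumeration of small expression trees, swapped in for the evolutionary search used in practice so that the example stays short and deterministic, and the data, operator set, and function names are all hypothetical.

```python
# Hypothetical toy data generated by the hidden relationship y = x*x + x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x * x + x for x in xs]

# The building blocks of the search space (an assumption for this sketch).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0]


def expressions(depth):
    """Yield expression trees of at most the given depth (duplicates possible)."""
    if depth == 0:
        yield from TERMINALS
        return
    yield from expressions(depth - 1)
    for op in OPS:
        for left in expressions(depth - 1):
            for right in expressions(depth - 1):
                yield (op, left, right)


def evaluate(expr, x):
    """Evaluate an expression tree at the input value x."""
    if expr == "x":
        return x
    if isinstance(expr, float):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))


def size(expr):
    """Number of nodes in the tree, used as a simple complexity measure."""
    if isinstance(expr, tuple):
        return 1 + size(expr[1]) + size(expr[2])
    return 1


def mse(expr):
    """Mean squared error of the expression on the toy data."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def to_string(expr):
    """Render an expression tree in infix notation."""
    if isinstance(expr, tuple):
        return f"({to_string(expr[1])} {expr[0]} {to_string(expr[2])})"
    return str(expr)


# Prefer the lowest error first, then the smallest expression.
best = min(expressions(2), key=lambda e: (mse(e), size(e)))
print(to_string(best), mse(best), size(best))
```

Sorting by error first and size second is a crude stand-in for the accuracy versus complexity trade-off; a realistic search would keep a whole set of competing models rather than a single winner.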
This one-day introductory course in Symbolic Regression presents symbolic regression as a powerful methodology for industrial data analysis and data-driven modeling, and covers the state-of-the-art strategies for efficient generation of plausible regression models, which are designed to optimize competing trade-offs of high accuracy, low complexity, improved generalization capabilities, and trustworthiness.
The following topics of symbolic regression (also applicable to other iterative search methods) will be covered:
The main objective of the course is to give participants a comprehensive overview of the variety of optimization methods, both classical and theoretically tractable methods of local (and some cases of global) unconstrained and constrained optimization with one objective, and "modern" heuristics for global optimization like stochastic search methods, including simulated annealing, and evolutionary algorithms, which are applicable to optimization of one or more objectives.
Participants are expected to be able to compare various optimization methods with each other and know their strong and weak points, give advice on appropriate optimization methods for a given problem, find solutions for given problems using their own or existing implementations, and elaborate on the quality of the obtained solutions.
This course is an up-to-date, easily accessible introduction to various optimization problems and techniques. It treats basic techniques for local and global optimization, single-objective and multi-objective optimization, and techniques for solving discrete optimization problems (such as stochastic iterative search methods). The course content can be extended to dynamic optimization if there is sufficient interest among participants.
Requirements: Participants should possess good spatial awareness and a good knowledge of basic concepts of calculus and differential geometry (fluency in differentiation, integration, limits, sequences, and vectors, as well as manipulations with multi-dimensional surfaces will be assumed). A strong interest in computational intelligence and programming is encouraged. If need be, basic knowledge of calculus can be given in an additional half-day training.
At the end of the course participants are expected to grasp the essential approaches for working effectively with vectors and matrices, know how to effectively solve several types of systems of equations, understand the taxonomy of the studied algorithms and the situations in which these algorithms are effective and ineffective, and elaborate on the conditioning of the offered problems and on the stability of the algorithms available to solve these problems.
The course covers the following topics of applied linear algebra:
Requirements: The course assumes a good knowledge of basic calculus and linear algebra.
I am redesigning the website for our companies. The current version is aimed at analysts and contains many jewels for learning about data-driven modeling. The must-see sources are tutorials, DataModeler illustrations, testimonials, and DataModeler update notes.
To ensure a smooth transition, the new website is located at the .be domain. Please help us find broken links or inconsistencies, if any. Your suggestions are very welcome! I will also greatly appreciate testimonials and endorsements of our work and software. Thank you!