Internship Projects

Internship Title

Evolving switch-neuron networks

Description of Internship Switch neurons (Vassiliades & Christodoulou, 2016) were proposed as a mechanism that enables flexible and adaptive behavior in an artificial neural network (ANN) by gating the flow of information. These models have been shown to produce optimal exploration strategies in association tasks and T-maze problems. However, their ANN architectures were manually designed, and it remains an open problem how they can be produced automatically. To this end, this project will investigate neuroevolution and, specifically, how indirect encodings can be used to automatically design switch-neuron architectures that match the performance of the manually designed ones.
Required Skills Good programming skills (C++ or Python, multi-processing, ability to adapt existing code)
Internship Objectives The objectives of this internship are to:
• implement switch neurons in existing code for neuroevolution (NEAT, HyperNEAT, EvoNeuro)
• implement the binary association tasks (and if time permits, the T-maze tasks) introduced by Vassiliades & Christodoulou (2016) to benchmark the ANN architectures
• perform evolutionary experiments to investigate whether ANN architectures emerge that have comparable performance to the manually designed ones
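To illustrate the gating idea, the toy sketch below forwards exactly one of a neuron's inputs at a time and advances the switch when a modulatory signal arrives. This is only an illustrative simplification; the actual switch-neuron model of Vassiliades & Christodoulou (2016) is more elaborate (e.g., in how modulatory activity is integrated), and the class below is a hypothetical stand-in, not code from the paper.

```python
class SwitchNeuron:
    """Illustrative sketch of switch-neuron gating (NOT the published model).

    The neuron forwards exactly one of its inputs at a time; a
    sufficiently strong modulatory signal moves the switch to the
    next input, cycling through them round-robin.
    """

    def __init__(self, n_inputs):
        self.n_inputs = n_inputs
        self.active = 0  # index of the input currently gated through

    def forward(self, inputs, modulatory=0.0):
        # Simplifying assumption: a modulatory signal >= 1.0 advances
        # the switch; the published model integrates modulatory input.
        if modulatory >= 1.0:
            self.active = (self.active + 1) % self.n_inputs
        return inputs[self.active]
```

In an association task, such a unit can systematically cycle through candidate input-output mappings whenever a reward-related modulatory signal indicates that the current mapping is wrong, which is the kind of structured exploration the evolved architectures would need to reproduce.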
Internship Title

Visual programming of reinforcement learning agents

Description of Internship Reinforcement learning research has seen an explosion of ideas in recent years, especially when combined with deep neural networks. These ideas are often presented in research papers as figures that show the architecture of the agent. However, the source code is often not available, making it difficult, time-consuming and error-prone to translate the figures into code. We would like to be able to take a figure from a paper, program its individual components (e.g., lines showing the gradient flow) so that the figure can easily be translated into code, and seamlessly make visual modifications in order to rapidly experiment with various ideas.
Required Skills Experience in creating GUIs (e.g., in HTML5 and/or Javascript), Programming skills in Python, Basic experience with deep learning frameworks (Tensorflow and/or PyTorch), Basic knowledge of reinforcement learning
Internship Objectives The objective of this internship is to create a graphical user interface (GUI) as a toolbox that makes it easy for researchers to design agent architectures for reinforcement learning settings. The toolbox will initially contain drag-and-drop primitives such as the current environmental observation, the action produced by the agent, the next environmental observation and the extrinsic reward. These primitives will be used to create visual components that implement Python code of the agent (e.g., Q-learning). Important features of the toolbox will be a high degree of customization and the ability to extract snippets of agent code (in Tensorflow or PyTorch) that can be used with OpenAI Gym environments. The toolbox is envisioned to aid researchers in creating a variety of architectures (such as actor-critic, model-based, episodic, hierarchical, deep Q-networks, etc.) and easily visualizing them.
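As an indication of the kind of snippet the toolbox could emit, the visual primitives (current observation s, action a, extrinsic reward r, next observation s_next) map naturally onto the parameters of an update rule. The tabular Q-learning function below is a hypothetical, minimal example of such generated code; the function name and signature are illustrative assumptions, not part of any existing toolbox.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

    The arguments s, a, r, s_next correspond to the toolbox's
    drag-and-drop primitives (observation, action, reward,
    next observation).
    """
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q


# Q-table mapping state -> action -> value, zero-initialized on access.
Q = defaultdict(lambda: defaultdict(float))
```

A deep Q-network variant of the same diagram would replace the table with a Tensorflow or PyTorch module, which is exactly the kind of substitution the visual components are meant to make painless.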
Internship Title

Ensemble negative correlation temporal-difference learning

Description of Internship Ensembles of learning machines have both theoretical and empirical advantages over single learners. An ideal ensemble is one that has accurate members which at the same time are diverse, i.e., they disagree as much as possible. The fundamental issue of ensemble learning is how to control generalization error through a trade-off between bias, variance and covariance (Ueda & Nakano, 1996), and some work has focused on achieving this by reducing the correlations between ensemble members. Negative correlation learning (Liu & Yao, 1999) is one such technique and has been shown to have direct control over this trade-off (Brown et al., 2005). However, it has so far been used only in supervised learning settings. This project will investigate how to extend it to the reinforcement learning (RL) setting, in particular for gradient temporal-difference learning algorithms (Maei, 2011). A comparison will be performed with existing ensemble RL algorithms (Wiering & Van Hasselt, 2008; Hans & Udluft, 2010; Faußer & Schwenker, 2015), as well as with a single, monolithic learner, in terms of performance and sample-efficiency in prediction and/or control problems.
Required Skills Machine learning, Programming skills
Internship Objectives The objective of this internship is to derive new ensemble RL algorithms using the principle of negative correlation learning in combination with gradient temporal-difference learning algorithms. The new algorithms will be implemented and compared against existing ones. A first step in this direction is provided by Vassiliades (2015, Appendix B).
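To make the intended combination concrete, the sketch below applies the negative-correlation penalty of Liu & Yao (1999) to a semi-gradient TD(0) update for an ensemble of linear value functions: each member moves toward the TD target while the penalty term lam * (V_i - V_mean) pushes it away from the ensemble mean, encouraging diversity. This is an illustrative starting point under simplifying assumptions, not a published algorithm; the gradient-TD algorithms of Maei (2011) that the project targets are more involved.

```python
def ncl_td_update(W, phi, r, phi_next, alpha=0.1, gamma=0.99, lam=0.5):
    """One semi-gradient TD(0) step for an ensemble of linear value
    functions V_i(s) = W[i] . phi(s), with a negative-correlation
    penalty (a hypothetical sketch, not an existing algorithm).

    W: list of weight vectors, one per ensemble member.
    phi, phi_next: feature vectors of the current and next state.
    lam: strength of the negative-correlation (diversity) penalty.
    """
    dot = lambda w, x: sum(wi * xi for wi, xi in zip(w, x))
    v = [dot(w, phi) for w in W]                      # per-member V_i(s)
    v_bar = sum(v) / len(v)                           # ensemble mean
    v_next = sum(dot(w, phi_next) for w in W) / len(W)
    for i, w in enumerate(W):
        delta = r + gamma * v_next - v[i]             # per-member TD error
        # NCL term: push member i away from the ensemble mean.
        grad = delta + lam * (v[i] - v_bar)
        W[i] = [wi + alpha * grad * xi for wi, xi in zip(w, phi)]
    return W
```

With lam = 0 the members reduce to independent TD(0) learners sharing a bootstrapped ensemble target; the project would study how the choice of lam trades off member accuracy against ensemble diversity, mirroring the bias-variance-covariance analysis of Brown et al. (2005).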
Internship Title

Learning structured exploration strategies through learned neuromodulated plasticity rules

Description of Internship Can structured exploration behavior be encoded in an artificial neural network (ANN)? This question underlies the subject of this project, where we will investigate it by embedding neuromodulated plasticity rules (Vassiliades, 2015, p. 199; Miconi et al., 2018, 2019) inside the architecture of an ANN (augmented ANN). More specifically, this project will proceed in two phases. First, we will generate traces of optimal/structured exploration behavior in non-stationary environments where the reward function changes (e.g., T-mazes). Second, we will use an imitation-learning methodology, where the augmented ANN will attempt to reproduce similar traces. As this augmented ANN is a special type of recurrent ANN, we will compare it with a long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) network, which is a state-of-the-art model for sequence modeling.
Required Skills Good programming skills, Machine learning, Knowledge of neural computation
Internship Objectives The objectives of this internship are to:
• implement differentiable neuromodulated plasticity
• implement the non-stationary T-maze tasks
• generate human traces of structured exploration behavior in T-mazes
• use backpropagation to learn the adaptive behavior
• compare with LSTM networks
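A simplified, dependency-free sketch of one neuromodulated plastic step is shown below, loosely following Miconi et al. (2018): each connection carries a fixed weight W plus a Hebbian trace scaled by a learned coefficient alpha, and the trace is updated only in proportion to a modulatory signal. This is a toy illustration under simplifying assumptions; in practice all of W, alpha and eta would be trained by backpropagation in an autodiff framework, and the published formulation keeps the Hebbian trace bounded.

```python
import math


def plastic_step(x, W, alpha, hebb, eta, mod):
    """One forward step of a layer with neuromodulated Hebbian
    plasticity (simplified sketch, not the published formulation).

    y[i]       = tanh( sum_j (W[i][j] + alpha[i][j] * hebb[i][j]) * x[j] )
    hebb[i][j] += mod * eta * x[j] * y[i]   # modulated Hebbian trace

    x: input vector; W: fixed weights; alpha: plasticity coefficients;
    hebb: Hebbian traces (mutated in place); eta: plasticity rate;
    mod: modulatory signal (0 disables plasticity for this step).
    """
    n_out, n_in = len(W), len(x)
    y = [math.tanh(sum((W[i][j] + alpha[i][j] * hebb[i][j]) * x[j]
                       for j in range(n_in)))
         for i in range(n_out)]
    for i in range(n_out):
        for j in range(n_in):
            hebb[i][j] += mod * eta * x[j] * y[i]
    return y, hebb
```

Because the update is differentiable in W, alpha and eta, the whole augmented network can be trained end-to-end to imitate the recorded exploration traces, which is what the backpropagation objective above refers to.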

Applicants may address any technical or research-related inquiries to the Research Office at

