Google Summer of Code (GSoC)
Google Summer of Code (GSoC) is a global program focused on bringing new developers into open source software development. Since 2011, the INCF network has served as a mentoring organization that pairs GSoC candidates with developers from its community to work on 3-month programming projects. GSoC contributors are paid a stipend by Google. Between 2016 and 2020, when the program was still focused on students, INCF paired 87 students with 107 mentors.
GSoC 2022
Important links:
INCF Ideas List for 2022
Google Summer of Code FAQ
Google Summer of Code 2022 timeline
Recommendations for GSoC contributors
Google Summer of Code guides for mentors and students
Read our blog about GSoC 2022
GSoC 2021 - Accepted project list
Student: Dinesh Sathia Raj
Mentors: Vineet Gandhi, Suresh Krishna, Tiago Falk, Reza Farivar
Eye tracking has many applications, from driver safety to improved accessibility for people with disabilities. Expensive and bulky hardware solutions for eye tracking exist, but this project aims to bring robust and accurate eye tracking to everyone. Recent research and the pervasiveness of handheld devices with powerful cameras have now made it easy to have high-quality eye tracking right in our pockets!
The aim of this project is to allow researchers and developers all over the world to use our open-sourced eye tracker (with additional features) in new and varied use cases. From simple systems, where the phone is mounted on a stand in a vehicle to track multiple parameters of the driver (drowsiness, gaze, attentiveness), to more complex applications like emotion analysis and even lie detection, the sky's the limit.
Student: Nitik Jain
Mentors: Alex Dewar, Thomas Nowotny, James Knight
BoB Robotics is an open-source robotics framework in C++, for interfacing hardware with various robot platforms and software tools for running simulation and visualizing data.
The aims of this project are: i. to work on the problem of sensor fusion using BoB Robotics and Gazebo, and ii. to implement a suitable fusion algorithm for efficient path integration and tune the algorithm to incorporate inputs from visual sensors for visual homing.
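Path integration amounts to accumulating self-motion estimates into a running position, so that the robot always knows the vector pointing home. The following is a minimal Python sketch of that idea only; it is not BoB Robotics code, and the function names and the (heading, distance) step format are hypothetical. Fusing visual cues to correct accumulated drift is not shown.

```python
import math

def integrate_path(steps):
    """Accumulate (heading_radians, distance) steps into an (x, y) position."""
    x = y = 0.0
    for heading, distance in steps:
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    return x, y

def home_vector(position):
    """Return (bearing_radians, distance) pointing from position back to the origin."""
    x, y = position
    return math.atan2(-y, -x), math.hypot(x, y)

# Walk 3 units east, then 4 units north (hypothetical odometry readings).
position = integrate_path([(0.0, 3.0), (math.pi / 2, 4.0)])
bearing, distance = home_vector(position)
```

In a real system each step would carry sensor noise, which is why a fusion algorithm is needed to keep the home vector from drifting.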
Student: Pranav Mahajan
Mentors: Daniele Marinazzo, Fernando Rosas
The functioning of complex systems (e.g. the brain, among many others) depends on the interaction between different units; crucially, the resulting dynamics is different from the sum of the dynamics of the parts. In order to deepen our understanding of these systems, we need to make sense of these interdependencies. Several tools and frameworks have been developed to look at different statistical dependencies among multivariate datasets. Among these, information theory offers a powerful and versatile framework; notably, it allows detecting high-order interactions that determine the joint informational role of a group of variables.
The goal of this project is to collect and refine existing metrics of high-order interactions currently implemented in Matlab or Java, and integrate them in a unified Python-based toolbox. The main deliverable of the project is a toolbox whose inputs are the measurements of different variables plus some parameters, and whose outputs are the measures of higher-order interdependencies. Ideally, the toolbox will interface with visualisation and processing platforms for neuroimaging data, such as MNE and fMRIPrep, and may also be packaged as a Docker container.
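One example of a high-order interaction metric is the O-information, which is positive when a group of variables is dominated by redundancy and negative when it is dominated by synergy. Whether and how the toolbox implements it is not stated here; the sketch below is only an illustration using a plug-in entropy estimator for discrete data, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

def entropy(samples):
    """Plug-in Shannon entropy (bits) of a list of hashable observations."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def o_information(rows):
    """O-information of discrete data; rows is a list of equal-length tuples.

    Computed as (n - 2) * H(X) + sum_i [H(X_i) - H(X without X_i)].
    """
    n = len(rows[0])
    omega = (n - 2) * entropy(rows)
    for i in range(n):
        single = [row[i] for row in rows]
        rest = [row[:i] + row[i + 1:] for row in rows]
        omega += entropy(single) - entropy(rest)
    return omega

# Two toy systems of three binary variables, enumerated exhaustively.
xor_rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]  # synergistic
copy_rows = [(a, a, a) for a in (0, 1)]                           # redundant
```

For real measurements the plug-in estimator is biased for small samples, so a practical toolbox would likely offer other estimators as well.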
Student: Ante Kapetanović
Mentor: Marcel Stimberg
Unlike traditional inverse identification tools that rely on gradient and gradient-free methods, simulation-based inference has been established as a powerful alternative approach that offers a twofold improvement over such methods. Firstly, it does not result in only a single set of optimal parameters; rather, simulation-based inference acts as if the actual statistical inference is performed and provides an estimate of the full posterior distribution over parameters. Secondly, it exploits prior system knowledge sparsely, using only the most important features to identify mechanistic models that are consistent with the measured data.
The aim of the project is to support simulation-based inference in the brian2modelfitting toolbox by linking it to sbi, a PyTorch-powered library for simulation-based inference whose development is coordinated at the Macke lab.
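To illustrate what "an estimate of the full posterior distribution" means, here is a minimal sketch of rejection sampling (approximate Bayesian computation), one of the simplest forms of simulation-based inference. It does not use sbi or brian2modelfitting and is not how those libraries work internally; the toy simulator, the flat prior, and all names are assumptions made for illustration.

```python
import random
import statistics

def simulate(mu, n, rng):
    """Toy simulator: n noisy observations around the parameter mu."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]

def rejection_abc(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated mean lies close to the observed mean."""
    target = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a candidate from a flat prior
        simulated = simulate(mu, len(observed), rng)
        # The sample mean is the only summary feature compared here.
        if abs(statistics.fmean(simulated) - target) < tolerance:
            accepted.append(mu)
    return accepted

rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # pretend these are measured data
posterior_samples = rejection_abc(observed, 5000, 0.2, rng)
```

The accepted draws form a cloud of plausible parameter values rather than a single optimum, which is the property the paragraph above describes.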
Student: Viet Hoang
Mentor: Christine Rogers
LORIS is an open source data platform that stores data important for numerous ongoing neuroscience projects. These data include brain scans, genetic data, psychological tests, and much more. Used in 22 countries around the globe, LORIS currently hosts many well-known datasets such as the UK Biobank and the BigBrain 3D atlas.
The objective of this project is to further the development of LORIS by maintaining the codebase, implementing automated tests, and supporting the LORIS team with various tasks so that the organization remains on track with their roadmap.
Student: Yorguin Mantilla Ramos
Mentors: Aswin Narayanan, Oren Civier, Steffen Bollmann, Tom Johnstone
BIDS is a standard for neuroimaging datasets that helps with data sharing and reusability; as a result, it has been widely adopted by the community. Although originally developed for MRI, it has gained popularity within the EEG and MEG realms. Converting raw data to the BIDS standard is not difficult but requires a lot of time if done by hand. Currently, software such as mne-bids and Biscuit is available to assist and facilitate this conversion, but there is still no automated way to produce a valid BIDS dataset for the EEG and MEG use cases. What is mostly missing is the metadata.
The objective of this project is to implement a Python package that infers the missing information from files accompanying the EEG and MEG files and/or a single BIDS-conversion example provided by the user. The idea of having human-readable configuration files will also be explored, since this would help the sharing of common conversion parameters within similar communities. If successfully implemented, this would enable batch conversion of EEG and MEG data to BIDS.
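As a rough illustration of inferring metadata from accompanying file names, the sketch below maps raw file names onto BIDS-style paths of the form `sub-<label>/eeg/sub-<label>_task-<label>_eeg.<ext>`. The raw naming scheme, the pattern, and the function name are hypothetical and are not taken from the project's package.

```python
import re
from pathlib import PurePosixPath

# Hypothetical lab naming scheme: s<subject>_<task>.<ext>, e.g. "s07_rest.edf".
RAW_PATTERN = re.compile(r"^s(?P<sub>\d+)_(?P<task>[A-Za-z0-9]+)\.(?P<ext>\w+)$")

def to_bids_path(raw_name, datatype="eeg"):
    """Map a raw file name to a BIDS-style relative path, or None if it does not match."""
    match = RAW_PATTERN.match(raw_name)
    if match is None:
        return None
    sub = f"sub-{match['sub']}"
    filename = f"{sub}_task-{match['task']}_{datatype}.{match['ext']}"
    return PurePosixPath(sub, datatype, filename)

raw_files = ["s07_rest.edf", "s08_oddball.edf", "notes.txt"]
converted = {name: to_bids_path(name) for name in raw_files}
```

A pattern like this could live in a human-readable configuration file so that groups with the same naming habits can share it.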
Student: Mainak Deb
Mentor: Bradley Alicea
DevoLearn was developed during Google Summer of Code 2020 and is now a Python library released on PyPI. DevoLearn contains pre-trained deep learning models for the segmentation and analysis of microscopy images. It is specialized for the analysis of 2-D slices of C. elegans embryogenesis; however, it can also be useful in the analysis of embryogenesis in other species.
The aims of this project are: i. to upgrade the existing models in the library, ii. to add more useful functionality to DevoLearn, iii. to improve usability, and iv. to add interactive online demos.
Student: Aditya Wagh
Mentor: Dimiter Prodanov
Maxima is a system for the manipulation of symbolic and numerical expressions with a history of more than 40 years. A widely used user interface for Maxima is wxMaxima. Texinfo is a documentation system that uses a single source file to produce both online information and printed output; it is primarily designed for writing software manuals. Maxima also supports scientists without a strictly technical background, since using it does not require programming skills. The idea behind this project is to convert wxMaxima worksheets to Texinfo.
The objectives of this project are: i. to build a wxMaxima worksheet to Texinfo converter that is entirely accessible from a running Maxima instance (it will be a loadable Maxima add-on), ii. to write tests for the converter to ensure that it gives output in the proper format and is usable, and iii. to develop documentation as well as tutorials for both end users and developers.
Student: David Romero Bascones
Mentor: Bramsh Q Chandio
Understanding the inner wiring of the human brain is one of the most long-pursued goals of neuroscientists. In this journey, diffusion MRI (dMRI) and tractography algorithms offer the essential possibility of reconstructing the white matter bundles that interconnect brain regions. Building on this, it is of high interest to obtain tractography bundle atlases that describe the average connectome in different populations. Constructing these atlases, however, requires specific technical expertise and cannot be done in a simple way by using currently available neuroimaging software.
This project aims to fill this gap by integrating a tractography bundle atlas creation workflow into DIPY, a reference library for dMRI processing in Python. The developed solution offers scientists a quick and easy way of obtaining population-specific tractography atlases in an automated fashion. The algorithm takes segmented bundles as input and relies on streamline-based registration to align the bundles and construct an atlas. The whole atlas creation process is controlled by the user via a command line interface. The resulting atlases can be exported and used for visualization or further analyses.
Student: Kinshuk Kasyap
Mentors: Anibal Solon, Hebbianloop
The Interplanetary File System (IPFS) protocol is an emerging web standard that offers a unified open-source service for peer-to-peer sharing of datasets. Datasets stored on IPFS can be anonymized and permissioned, available as read-only to those with sufficient privileges, by linking the individual records to the Decentralized IDs (DIDs) of research participants or data curators, which can be revoked at any time by the owner of the dataset.
The goals of this project are to: i. build a pipeline for decentralized (IPFS) storage of BIDS-compliant neuroimaging data with support for version control and authentication through signatures associated with Decentralized IDs (DIDs) and ii. develop documentation, walk-throughs, and supplemental guidance to enable smooth integration with front-end and UX development.
Student: Diptanshu Mittal
Mentor: Ben Fulcher
Time-series analysis is a broad, interdisciplinary field, and the number of features for analyzing time-series datasets is ever-increasing. This has led to the creation of an online Django platform, CompEngine-Features, for comparing new time-series analysis features with an existing set of over 7000 time-series analysis features in the hctsa package. However, the platform lacks support for incorporating new features contributed by users, client-side rendering of web pages, network visualizations, line plots for the Empirical1000 dataset, and async views in Django, all of which are crucial for the adoption of the platform and thus its ability to have a significant impact in driving progress in time-series analysis.
The aim of this project is to continue developing a Django online platform, CompEngine-Features, for comparing the performance of time-series analysis methods on real time-series data, including a wide range of neural dynamics.
Student: Aditya R Rudra
Mentors: Roberto Toro, Katja Heuer, Anibal Solon
The idea of the project is to extend the coverage of tests in the BrainBox project and to implement Continuous Integration and Continuous Deployment using CircleCI. Currently there are unit, integration and e2e tests written for the BrainBox project, but the coverage is not extensive and amounts to only 16% of the code. The aim of this project is to extend the coverage of unit and end-to-end tests, integrate them into the continuous integration of the platform, and implement continuous deployment using CircleCI.
Student: Piyumal Demotte
Mentors: Dimiter Prodanov, Sumit Kumar Vohra
ImageJ is extensively used in major areas of the biological and material sciences. The previously developed Active Segmentation platform, a plugin for ImageJ, incorporates Weka toolbox-based statistical machine learning algorithms as well as deep learning techniques for trainable image segmentation. The end goal of the Active Segmentation platform for ImageJ is to provide researchers with an extensible toolbox that enables them to select custom filters and machine learning algorithms for their research. The existing implementation of the platform offers users only a limited way to load the ground truth for learning (at the moment only in the region of interest (ROI) format). This reduces the usability of the tool and forces users to convert the ground truth to the specific format used within the application.
The objective of this project is to incorporate several ground-truth formats, such as an image-based format in which each pixel uniquely belongs to a particular class, and a partial ground-truth format in which, instead of the whole image, several partial boxes in an image or stack are labeled.
Student: Nga Tran
Mentors: Cengiz Gunay, Anca Doloc-Mih
AnalySim is a data sharing platform similar to GitHub, but specialized for scientific projects. It seeks to simplify the analysis and visualization of datasets. AnalySim is designed to promote collaborations and to improve existing datasets through features like forking or cloning, which allow users to start new projects and collaborations or to join existing teams or projects.
The objectives of this project are: i. to add project forking to AnalySim, which will help collaboration between researchers (one can fork someone else’s project and improve upon it by adding more data or improving analysis), ii. to add a new design template, and iii. to improve the existing documentation.
Student: Anoushka Ramesh
Mentors: Lotty Coupat, Kirstie Whitaker
AutSPACEs is a citizen science platform that aims to understand how sensory processing differences affect autistic people all around the world. It captures an autistic user's experiences with sensory processing challenges and generates a qualitative dataset. This data serves two main purposes:
- Contribute to making the world more inclusive of autistic people by urging policy-makers to make changes based on evidence and recommendations from a large number of lived experiences.
- Educate non-autistic people on how they can be better allies for autistic individuals and destigmatize autism in our society.
This project aims to build and implement a working prototype of the website in collaboration with the autistic community. The potential approach to this project would be the following: i. solve issues from milestones 2 and 3 which aim to increase the accessibility of the website and make it into a minimum viable product, ii. test and review PRs to ensure that they meet their purpose, iii. make appropriate documentation for the code, iv. send the final working prototype to the autistic community for feedback and make additional issues based on it.
Student: Psyogi Soma
Mentors: Cengiz Gunay, Padraig Gleeson
The Calabrese Lab 8-cell Leech Tutorial described by Hill et al. (2001) has been a staple for teaching computational neuroscience at Emory University for many years, and it has also been used in various summer courses. This tutorial is not only a fully constructed 8-cell circuit that can generate heart rhythms, but also a great teaching tool thanks to the visual interface, where a student can turn on synaptic connections or change maximal ionic conductances. However, the original tutorial was developed using the now obsolete Genesis simulator. Unfortunately, running the Genesis simulator nowadays requires a complicated software setup that prevents many non-technical students from accessing the tutorial. We had previously started porting the tutorial to a more modern format that can be executed through a web browser (see 8-cell Leech Heartbeat Network Model Tutorial), increasing the accessibility of this classic tutorial for teaching and research purposes.
The objective of this project is to complete the previously started port and make the tutorial available. We selected the NEURON simulator for running the model, with NeuroML and Python as the description languages.
Student: Steph Prince
Mentors: Ankur Sinha, Padraig Gleeson
Neuroscientists have begun to publicly share more and more datasets; however, there are still barriers to making these datasets easily reusable by the community. One of these barriers has been the accessibility of shared data: it takes extensive time and effort to understand different data formats and determine which datasets are best suited for the scientific questions being asked.
The goal of this project is to convert publicly available datasets to the standardized Neurodata Without Borders (NWB) format so that they can be better interpreted and reused by other researchers. The data will then be made available for interactive analysis and visualization through the NWB explorer on the Open Source Brain repository. By using a standardized data format, researchers can more quickly work with new data and develop analysis methods to apply to a wide variety of datasets. The ability to easily explore and visualize data will also allow researchers to quickly assess the contents of the data and whether they can reuse it. Thus, the results of this project will contribute to an important resource for scientists.
Student: Shiven Tripathi
Mentor: Sarah Marzen
To guide behaviour, it has been proposed that neurons eventually learn to predict future states of sensory inputs. The project mentors have worked in this direction to obtain metrics of how accurate these predictions are when the neuron uses synaptic learning rules. The main contribution of this project would be to publish highly optimised library code that can serve as an evaluation benchmark for predictive accuracy. We also believe that neurons can generate efficient encodings of these predictions. Through this project, estimates of the predictive information would also be obtained from neural models.
Goals:
- Benchmark existing codes and optimise for integration
- Mutual Information Estimation between neural response and future stimulus
- Estimation of predictive accuracy
- Using Statistical Models
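The mutual information goal listed above can be sketched with a simple plug-in estimator for discrete sequences. This is only an illustration and not the project's library code; the function names, the lag convention, and the toy data are assumptions.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in mutual information (bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

def predictive_information(response, stimulus, lag=1):
    """MI between the response at time t and the stimulus at time t + lag (lag >= 1)."""
    return mutual_information(response[:-lag], stimulus[lag:])

# Toy data: a periodic binary stimulus and a response that anticipates it by one step.
stimulus = [0, 1, 1, 0] * 25
response = stimulus[1:] + stimulus[:1]
info = predictive_information(response, stimulus, lag=1)
```

A response that perfectly anticipates a binary stimulus carries close to one bit about the next stimulus value, while a constant response carries none.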
Student: Svea Marie Meyer
Mentors: Markus Löning, Martina Vilas
Time series data is ubiquitous in many applications. Examples include sensor readings from industrial processes, spectroscopy wavelength data from chemical samples, or bed-side monitor medical data from patients. Developing advanced time series analysis capabilities for researchers and practitioners is one of the major challenges of contemporary machine learning.
sktime is a new Python toolbox for machine learning with time series and, to the best of our knowledge, the first unified toolbox for time series. The goal of this project is to provide for time series what scikit-learn provides for tabular data. This involves extending scikit-learn to the different time series learning tasks, such as time series classification, clustering, forecasting and anomaly detection.
Student: Evgeniia Karunus
Mentor: Rick Gerkin
Student: Ishan Vatsaraj
Mentor: Lia Domide
Student: Harsh Khilawala
Mentor: Lungsi Ngwua
Student: Sahil Walke
Mentor: Stewart Heitmann