Deep Learning in Neuroimaging

Our brain is constantly working to make sense of the world around us and to find patterns in it; even when we are asleep, the brain is storing patterns. Making sense of the brain itself, however, has remained an intricate pursuit.

Christof Koch, a well-known neuroscientist, famously called the human brain the “most complex object in our observable universe” [1]. Aristotle, on the other hand, thought it was the heart that gave rise to consciousness and that the brain functioned as a cooling system both practically and philosophically [2]. Theories of the brain have evolved since then, generally shaped by knowledge gathered over centuries. Historically, to analyze the brain, we had to either extract the brain from deceased people or perform invasive surgery. Progress over the past decades has led to inventions that allow us to study the brain without invasive surgery. Examples of such non-invasive techniques include macroscopic imaging techniques such as functional magnetic resonance imaging (fMRI) and approaches with a high temporal resolution such as electroencephalography (EEG). Advances in treatments, such as closed-loop electrical stimulation systems, have enabled the treatment of disorders like epilepsy and, more recently, depression [3, 4]. Existing neuroimaging approaches can produce a considerable amount of data about a very complex organ that we still do not fully understand, which has led to an interest in non-linear modeling approaches and algorithms equipped to learn meaningful features.

This article provides an informal introduction to unique aspects of neuroimaging data and how we can leverage these aspects with deep learning algorithms. Specifically, this overview will first explain some common neuroimaging modalities more in-depth and then discuss applications of deep learning in conjunction with some of the unique characteristics of neuroimaging data. These unique characteristics tie into a broader movement in deep learning, namely that data understanding should be a goal in itself to maximize the impact of applied deep learning.

Although some people state that we know little about the brain, it is worth putting this in perspective. For example, relative to how large we think the universe is, our observable universe seems small [5]. Its relative size does not mean that our observable universe is small by any ordinary measure. It means that, in proportion to the estimated wealth of information in the universe, the information we have gathered so far seems insignificant. The same is true for the brain; hundreds of years of research have gone into understanding it, which has led to a vast body of knowledge. However, to reach a clear and verifiable understanding of our brain, we must gather and make sense of orders of magnitude more data.

Key techniques for recording the brain

This article will focus on two prevalent neuroimaging techniques, magnetic resonance imaging (MRI) and electroencephalography (EEG). MRI will be covered by discussing three of its widely used modalities. We focus on EEG and MRI because of their widespread use, my expertise with them, and their public availability [6]. Many neuroimaging modalities complement each other, whether in the scale at which they record the brain, their resolution, cost, or applicability to a certain field or disorder. The scales of neuroimaging techniques we discuss here range from recording electrical signals in pre-specified brain regions (EEG) to recording the whole brain at a resolution of roughly 2.4M neurons (an MRI scan with a resolution of 3x3x3 mm) [7].

Magnetic resonance imaging (MRI)

As the name suggests, magnetic resonance imaging (MRI) utilizes the magnetic properties of atoms, specifically hydrogen and other naturally occurring elements, to record the brain. Structural MRI (sMRI) is the main modality obtained during an MRI scan session and displays the contrast between different tissues in the brain. An MRI scanner essentially consists of a large magnet that aligns the protons in your body with its magnetic field. Perturbing these protons and measuring how they react to those perturbations allows us to distinguish between different tissues. T1 is one method of measuring what happens after the perturbations. For example, tissues that contain fat will show up as light voxels in a T1 sMRI scan, whereas tissues or areas that contain water, and thus more hydrogen atoms, will be dark. Figure 1 shows an example of a T1 sMRI scan. For a more detailed explanation of the inner workings of an MRI scanner, see the Supplementary Information section at the end of this article.

Figure 1: Traveling through the brain using slices captured in a structural MRI scan. Left: from the right side of the brain to the left. Middle: a view from the back of the brain to the front. Right: A view from the bottom to the top of the brain. Source:

Diffusion tensor imaging (MRI)

There are two other commonly used MRI modalities. The first is diffusion MRI (dMRI), which creates contrast based on the diffusion direction of water molecules in the brain. This technique is commonly used to map brain structures such as the white matter tracts. White matter tracts are insulated and can therefore transmit electrical signals efficiently over long distances. This allows communication between brain regions with high bandwidth and speed. Brain regions are made up of gray matter and form the ‘outside’ of the brain. Gray matter is often regarded as the brain’s substrate for computation and memory. The magnitude and direction of the diffusion of water molecules measured with dMRI give us an idea of the structure and directionality of white matter tracts. Mapping the white matter tracts in the brain is called white matter tractography. Figures 2 and 3 illustrate white matter tractography.

Figure 2: A visual representation of white matter tractography using dMRI, where the top of the image is the top of the brain and we are looking at it from the left side. Source:
Figure 3: A visual representation of the structure and directionality of white matter tracts with dMRI. We are looking at the right hemisphere (side) of the brain from the left.

Functional MRI

The third commonly used MRI-based modality is functional MRI (fMRI), which indirectly records the activity in the brain. This method does not use the properties of hydrogen to create a contrast but uses the blood-oxygen-level-dependent (BOLD) signal. The BOLD signal measures the change in blood flow and blood oxygenation in the brain over time. An increase in energy use in a brain region increases the transportation of oxygen to that brain region. Functional MRI thus indirectly records neural activity, either while a person is performing a task inside the scanner or while the person is performing no task at all, which shows the neural activity of a brain at rest (resting-state). The signal is spatio-temporal (4D) and high-dimensional, with more than roughly 10M voxels per patient.

Figure 4: A 9.4T structural MRI scan (right), compared to a ‘normal’ 3T structural MRI (left), for the same subject. The higher the Tesla of the MRI scanner, the stronger the magnetic field and the higher the resolution of the MRI scan. Source:

Electroencephalography (EEG)

Instead of using a magnet, like an MRI scanner, electroencephalography (EEG) records brain signals using electrodes. These electrodes are attached to the patient's scalp, either by gluing them to the skin or by having the patient wear an EEG cap, as pictured in Figure 5. The electrodes record differences in electrical activity over time. With EEG, it is possible to record the electrical activity in the brain more directly than with fMRI, but it captures signals through the skull and skin of the patient. These obstructions cause some signal loss, and it is harder to record brain signals coming from deeper in the brain because signals from neurons closer to the electrodes interfere with those from deeper neurons. It is therefore important to place the electrodes at the correct anatomical locations. EEG setups use predefined electrode arrangements and caps to make this easier.

Methodological considerations for EEG

The spatial resolution of EEG recordings depends on the number of electrodes; each output channel corresponds to the voltage difference between two neighboring electrodes. The EEG signals are then analyzed using their frequency spectrum. For example, alpha waves (7.5-13 Hz) present themselves when someone is relaxing and are absent when someone is alert or performing a task [8]. Frequency spectra extracted from EEG signals are also helpful for studying sleep patterns, and the effect of sleep on the brain, in animals and humans. These types of studies last for about an hour, but because EEG caps can be worn for longer periods, mobile EEG caps can record a person’s brain throughout the day. Finding the exact location in the brain from which a signal originates is still a challenge and is called the inverse problem.
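To make the idea of analyzing EEG through its frequency spectrum a bit more concrete, here is a minimal sketch that computes the power in the alpha band for a synthetic signal. Everything about the signal (sampling rate, amplitudes, the 10 Hz and 20 Hz components) is made up for illustration, and the 13-30 Hz "beta" band is my own assumption rather than something defined in this article. It uses a naive discrete Fourier transform from the standard library to stay self-contained; real analyses would typically use a dedicated signal-processing library and a more robust spectral estimator.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """Return (frequencies, power) for the non-negative DFT bins of a real signal."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        # Naive O(n^2) discrete Fourier transform; fine for a short toy signal.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_power(freqs, power, low, high):
    """Sum the spectral power of all bins with low <= frequency < high."""
    return sum(p for f, p in zip(freqs, power) if low <= f < high)


# Hypothetical single-channel recording: 2 s sampled at 128 Hz, with a strong
# 10 Hz (alpha) rhythm and a weaker 20 Hz rhythm.
fs = 128
signal = [
    1.0 * math.sin(2 * math.pi * 10 * t / fs) + 0.3 * math.sin(2 * math.pi * 20 * t / fs)
    for t in range(2 * fs)
]

freqs, power = power_spectrum(signal, fs)
alpha = band_power(freqs, power, 7.5, 13.0)   # alpha band as given in the text
beta = band_power(freqs, power, 13.0, 30.0)   # assumed beta band, for comparison
print(f"alpha power: {alpha:.2f}, beta power: {beta:.2f}")
```

For this toy signal the alpha band clearly dominates, which is the kind of pattern one would look for in a relaxed subject.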

Figure 5: An example of an EEG cap. Source: 

Differences between EEG and MRI

An essential difference between EEG and MRI is that EEG can only record some parts of the brain (like the cortex) with a high temporal resolution, whereas MRI can record higher-level structural and functional information of the whole brain with a low temporal resolution. EEG data is also understandable for clinicians without preprocessing and is used in practice, whereas fMRI scans are harder to understand without preprocessing. The other two MRI modalities, especially sMRI, are often used in a clinical setting as well, although they are not temporal signals like EEG. EEG’s higher temporal resolution, however, comes at the cost of a lower spatial resolution. The number of EEG channels typically ranges from a few dozen to 256 and is thus much lower than the number of voxels in an MRI volume (~100k-2M). Another advantage of EEG over fMRI is that it is cheaper and easier to obtain. There is thus a trade-off between EEG and fMRI; both modalities complement each other with respect to spatial and temporal resolution. This is more generally true in the field of neuroimaging, where many modalities complement each other.

Simultaneous EEG and fMRI recordings

The trade-off between EEG and fMRI has accelerated the development of techniques that allow researchers to record both EEG and fMRI simultaneously [9]. Recording EEG and fMRI concurrently is hard because electrodes that are suitable for EEG recordings distort the fMRI signal, and ferromagnetic electrodes are attracted to the magnet in the MRI scanner. The materials that circumvent these issues are not yet optimal for EEG recordings. Concurrent EEG and fMRI recordings allow researchers to solve the inverse problem for EEG recordings more accurately by constraining the solution space with the fMRI signal [10]. The EEG recordings can also help determine the underlying neuronal activity that fMRI records [11]. Furthermore, EEG recordings can predict the BOLD signal in fMRI scans, which may help improve the temporal resolution of fMRI recordings. The relation between EEG and fMRI recordings is highly non-linear, and deep learning models are thus well suited to modeling these relationships [12]. Frameworks like fastMRI [13] can also enable faster fMRI and EEG recordings.

What do we aim to find with EEG and fMRI?

The brain is often seen as a highly complex dynamical system, and its dynamics can be described at multiple scales of abstraction [14]. When we use both EEG and fMRI data, we mainly want to infer the underlying dynamics of the brain [12]. The microscale dynamics are molecular, chemical, genetic, and electrophysiological interactions between or within neurons [14, 15]. The mesoscale dynamics are interactions between different types of neurons across brain regions [15], and the macroscale dynamics describe interactions between larger brain regions or systems. These interactions are commonly studied through what is called a connectome, a description of the connectivity between neurons or brain regions. The connectome helps us understand how neural activity ‘flows’ through the brain and how these dynamics may differ for people with brain disorders or while performing certain tasks. Modalities that record structural aspects of the brain rather than activity, such as sMRI and dMRI, can be aligned with modalities that do record activity, to condition the dynamics of that signal on the structure of the brain [16, 17]. The features from multiple modalities can also be fused into a single representation that describes a subject more accurately than any single modality could [16].
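One simple way to picture a macroscale connectome is as a matrix of pairwise correlations between the activity time series of brain regions. The sketch below is only an illustration under that assumption: the region names and time series are invented, and Pearson correlation is just one of several possible connectivity measures, not necessarily the one used in the cited work.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def functional_connectivity(region_series):
    """Return a symmetric region-by-region correlation matrix as nested dicts."""
    names = list(region_series)
    return {
        a: {b: pearson(region_series[a], region_series[b]) for b in names}
        for a in names
    }


# Hypothetical, made-up time series for three brain regions (arbitrary units).
region_series = {
    "region_a": [0.1, 0.4, 0.9, 0.5, 0.2, 0.6, 1.0, 0.4],
    "region_b": [0.2, 0.5, 1.0, 0.6, 0.3, 0.7, 1.1, 0.5],  # region_a shifted by +0.1
    "region_c": [0.9, 0.6, 0.1, 0.5, 0.8, 0.4, 0.0, 0.6],  # mirror image of region_a
}

connectome = functional_connectivity(region_series)
for name, row in connectome.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

In this toy example, region_a and region_b move together while region_c moves in the opposite direction, so the matrix contains both strongly positive and strongly negative connections.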

Deep learning as a more flexible tool to perform multimodal representation learning

More recent work in multimodal representation learning focuses on extracting deep features from minimally processed data, such as MRI volumes, and using them in a downstream linear classification task [18]. A visual representation of the method is shown in Figure 6.

Figure 6: A self-supervised method for multimodal neuroimaging representation learning [18]. It uses a Deep InfoMax [19] objective function to maximize the similarity between activation maps within the encoder and between the latent features.

This is useful mainly because it allows researchers to work with lower-dimensional data and potentially reduces the required number of labeled data points. Furthermore, given the nature of the signal that is captured, we know there are latent factors that constitute the observed signal, namely the activity of neuronal circuits for fMRI (indirectly) and EEG, and molecular densities for sMRI. Finding latent factors in the observed data allows for more interpretable predictions as well. Most work on multimodal representation learning, however, takes pre-computed features from neuroimaging modalities and concatenates them. These concatenated multimodal features are then used in a deep neural network for a classification task. The trend of working with minimally processed data is exciting and can lead to more robust and complex representations that capture the variance of the data more accurately, but it requires researchers to handle the data and outcomes with care. The importance of understanding the implications of what a model has learned cannot be overstated in a biomedical field like neuroimaging, especially because the underlying processes that constitute neuroimaging data are often not fully known. Furthermore, data representations in neuroimaging have often been designed with decades of knowledge and should not simply be cast aside.
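The concatenation approach mentioned above can be sketched in a few lines. This is a toy illustration, not the pipeline of any cited paper: the feature values are invented, the function names are hypothetical, and the per-feature standardization step is my own assumption about a sensible preprocessing choice before features from different modalities are joined.

```python
import statistics


def zscore(values):
    """Standardize one feature across subjects to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def standardize_columns(matrix):
    """Z-score every column of a subjects-by-features matrix."""
    columns = [zscore(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


def concatenate_modalities(*modalities):
    """Early fusion: join each subject's per-modality feature vectors."""
    standardized = [standardize_columns(m) for m in modalities]
    return [sum(rows, []) for rows in zip(*standardized)]


# Hypothetical pre-computed features for three subjects.
smri_features = [[2.1, 0.5], [2.4, 0.7], [1.8, 0.6]]                       # 2 features
fmri_features = [[0.10, 0.30, 0.20], [0.40, 0.10, 0.50], [0.25, 0.20, 0.35]]  # 3 features

fused = concatenate_modalities(smri_features, fmri_features)
print(len(fused), "subjects with", len(fused[0]), "fused features each")
```

The fused vectors would then be passed to a classifier. Note that this kind of fusion treats the modalities as independent feature sets, which is exactly the limitation that learning from minimally processed data tries to address.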

Unique and leverageable aspects of neuroimaging data

Among others [20], one critical challenge with deep learning in the field of neuroimaging is the limited number of samples; many neuroimaging datasets range from roughly 300 to 1300 subjects. These datasets are often used for binary classification, but the input data exists in an extremely high-dimensional space, especially for MRI. For example, a structural MRI scan obtained at a resolution of 1 mm is represented by a volume of around 4-6M voxels per subject. Although fMRI is often recorded at a lower resolution, it records MRI volumes over time, which means that a single subject can be represented by roughly 15-50M voxels, depending on the acquisition time. The imbalance between the dimensionality of the data and the number of subjects is a recipe for overfitting. A conventional way to tackle this problem is to increase the size of the dataset. For MRI data, UK Biobank (>50k subjects with a goal of 100k) [21] was recently released, and for EEG, the Temple University Hospital repository contains a large number of recordings (~30k recordings) [22]. One important challenge in neuroimaging is that there is a great deal of variation between (MRI) scanners at different sites, due to both the settings of the scanners and their manufacturers. Another challenge is that the use of biomedical data has to adhere to strict data privacy laws. There are, however, some unique properties of neuroimaging data that, in combination with deep learning, can be leveraged to learn meaningful representations and constrain the solution space.
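A quick back-of-the-envelope calculation shows how lopsided the ratio between input dimensions and subjects can get. The grid shapes and the number of time points below are hypothetical values that I picked to fall within the ranges quoted above; actual acquisitions vary.

```python
from math import prod

# Hypothetical acquisition grids, chosen to fall in the ranges quoted above.
smri_shape = (160, 192, 160)      # 1 mm structural volume: x, y, z
fmri_shape = (64, 64, 40, 150)    # lower-resolution 4D volume: x, y, z, time points
n_subjects = 1000                 # within the 300-1300 range of many datasets

smri_voxels = prod(smri_shape)
fmri_voxels = prod(fmri_shape)

print(f"sMRI voxels per subject: {smri_voxels:,}")
print(f"fMRI voxels per subject: {fmri_voxels:,}")
print(f"sMRI input dimensions per subject in the dataset: {smri_voxels / n_subjects:,.0f}")
```

Under these assumptions there are several thousand input dimensions for every subject in the dataset, even for the structural scan alone, which is why overfitting is such a concern.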

The importance of repeated measurements

Neuroimaging data is not only acquired in breadth, across multiple modalities; more and more research is also acquiring neuroimaging data longitudinally, meaning that the same participant is recorded repeatedly over time. One interesting extreme is the single-subject acquisition of multiple MRI modalities over 18 months [23]. This study shows that the dynamics in a middle-aged adult’s brain still change over a period of 1.5 years. Longitudinal studies are often done to study the development of the brain in children and adolescents. Some large and impactful longitudinal neurodevelopmental MRI datasets are the Generation R Study 1 and 2, and the ABCD study. These datasets gather a variety of biomedical measurements but also capture neuroimaging data through childhood and adolescence. For EEG, longitudinal recordings are done in more rapid succession, because meso-scale brain dynamics likely change over shorter intervals than the structural and macro-scale brain dynamics that are captured by MRI modalities. Another example of a longitudinal study is the Alzheimer’s Disease Neuroimaging Initiative (ADNI), which records the progression of Alzheimer's disease (AD) over time [24]. Longitudinal studies are important for AD because it first manifests itself as mild cognitive impairment (MCI), although not every subject with MCI will develop AD. It may be relatively easy to predict whether someone has AD, but it is much more complex and important to predict whether someone will get MCI and/or whether their MCI will develop into AD, so clinicians can intervene early on.

How can we leverage longitudinal data?

The variability of the structure or dynamics in the brain over time can be used to represent a subject’s brain more accurately using deep learning, given that regions in the brain have different (non-linear) growth curves. Previous brain scans may inform future scans and vice versa because there is limited variability in the structure and dynamics of the brain over time. Since the development of the brain over time is not a linear process, deep learning methods can use longitudinal data and/or multimodal data to constrain representations of a subject’s brain. A recent example of this is a model that can predict the progression of AD using multiple modalities and longitudinal data [25]; Figure 7 illustrates an adapted view of that method. A more accurate model of growth curves allows us to understand individual differences in neurodevelopment, which will lead to more accurate individualized predictions and treatment.

Figure 7: An example of how longitudinal data can be used to get a better understanding of the progression of Alzheimer’s disease (AD). The figure is based on but does not exactly depict previous work [25], which does not longitudinally model sMRI volumes, but rather uses a single sMRI volume, a longitudinal questionnaire, and biomarker data.

Individual representation of subjects

Individualized representations are probably one of the most important current goals in the neuroimaging community, and one for which deep learning appears to be well suited [26]. Being able to simulate and represent an individual brain leads to a better understanding of individual variability. For example, our understanding of brain disorders has so far largely been obtained through group analyses, where differences are compared between a group of controls and a group of people with a disorder. Potential issues with only using group differences are shown in Figure 8.

Figure 8: This figure shows a comparison between p-values for correlations and classification accuracies. In subfigure A, we have significant differences between groups (p < 0.05), but it is hard to make individual predictions (60% accuracy). Subfigure B shows highly uncorrelated groups where individual predictions are much better. The final subfigure shows two groups that have significant differences and good individual prediction accuracies. The point is that group differences have often been used to study mental disorders, but that mental disorders are very individual-dependent in how they may appear quantitatively. Group comparisons lack the power to understand these individual differences and this is where deep learning could help [27].

These types of studies are important, but if we want to move towards improved treatment for a specific individual or a better understanding of individual dynamics in the brain, an accurate overview or representation of that individual’s data is essential [27]. Deep learning’s ability to learn highly complex and non-linear patterns from neuroimaging data can lead to more accurate individualized representations [26] because of the higher degrees of freedom that the algorithms can model. Predictions do not need to be learned end-to-end by a deep learning algorithm. They can also arise from learned latent representations that experts in the field may use to come to a more accurate diagnosis or treatment plan. A new wave of clinicians, trained to use representations obtained using machine learning, may move away from categorical classifications of disorders and towards subject-wise predictions on a spectrum [28]. Note that due to the noise and small sample sizes in neuroimaging data, it is imperative to be careful when making inferences about individuals.
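The gap between a significant group difference and a useful individual prediction, as illustrated in Figure 8, can be reproduced with a small calculation. The sketch below rests on strong simplifying assumptions of my own: a single Gaussian feature with unit variance in both groups, equally sized groups, and a z-test approximation of the group comparison. The effect size of 0.5 and the 100 subjects per group are hypothetical numbers.

```python
from math import sqrt
from statistics import NormalDist

standard_normal = NormalDist()


def group_p_value(effect_size, n_per_group):
    """Two-sided p-value of a z-test for the expected difference between two groups."""
    z = effect_size * sqrt(n_per_group / 2)
    return 2 * (1 - standard_normal.cdf(z))


def best_individual_accuracy(effect_size):
    """Accuracy of the optimal threshold classifier for two equal-variance Gaussians."""
    return standard_normal.cdf(effect_size / 2)


# Hypothetical study: group means differ by half a standard deviation.
effect_size = 0.5
n_per_group = 100

p_value = group_p_value(effect_size, n_per_group)
accuracy = best_individual_accuracy(effect_size)
print(f"expected p-value: {p_value:.4f}, best individual accuracy: {accuracy:.1%}")
```

Under these assumptions the group difference is comfortably significant, yet even the best possible single-feature classifier is right only about 60% of the time, which mirrors subfigure A.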

Improving deep learning for neuroimaging with data-centric AI

The need for individualized representations is also driven by the co-occurrence of multiple mental disorders in the same person [29] and the spectra associated with many mental disorders. Symptoms related to autism spectrum disorder in men, for example, are vastly different from those in women [30], which may translate into different activity patterns related to autism spectrum disorder between men and women. Even within a group of men or women with autism spectrum disorder, symptoms and brain activity may differ. Further, people with schizophrenia may also suffer from substance abuse, anxiety and depressive symptoms, and PTSD, to name a few [31]. The co-occurrence of mental disorders also means that each person should be treated on an individual basis; someone labeled as having a mental disorder in a dataset may have different activity dynamics from others with the same label. Not only that, but neuroimaging datasets may have noisy, potentially outdated, or simplistic labels [32], especially given that the definition of mental disorders changes over time. Due to the plethora of differences between individuals’ brains, it may not be entirely possible, or even desirable, to fully classify people. The broader goal is to understand the brain, aid clinicians, and provide accurate and individualized treatment plans. This speaks to a broader trend in the deep learning community as well: data understanding is crucial. In neuroimaging this goes both ways: it is important to understand the data you are working with and, possibly more importantly, to understand the additional information the model provides about that set of data. Andrew Ng coined the term data-centric AI [33], and I think this philosophy can easily be extended to take into account the complexities of biomedical data, especially neuroimaging data.

What would data-centric AI in neuroimaging look like?

Data-centric AI in neuroimaging should focus on leveraging the rich data that we often already have about subjects to create a complete and useful representation of individuals, both to understand dynamics during tasks and with respect to mental disorders. As an example of the latter, this could mean that people with and without diagnosed mental disorders are represented on a spectrum instead of by a single (binary) prediction. On top of that, deep learning practitioners in the field of neuroimaging should understand deeply what each data point represents. In the case of natural images, we often know what we want our neural network to focus on, e.g. a dog in a classification task. In neuroimaging the focus is unknown because that is exactly what we are trying to figure out; for example, the dynamics in the brain that are related to a certain mental disorder or task. It is therefore important to take into account the underlying process that generates the data and each modality’s shortcomings. The researcher is then able to fuse the strengths of each modality into a representation that is more useful to experts in the field. In the case of mental disorders, these experts could then formulate a more accurate and personal treatment plan. In general, though, representing the dynamics of the brain in a more interpretable way could help a wide range of researchers, such as psychologists and neuroscientists, find new research directions or understand certain high-level functions in the brain.

The role of neuroimaging-centric deep learning

An additional improvement in terms of data understanding is the use of neuroimaging-specific architectures. For example, the use of convolutional neural networks (CNNs) in the field of neuroimaging is well grounded, given that CNNs have achieved impressive results on natural image datasets, partly because they are based on inductive biases related to natural images. It is not clear, however, whether CNNs are optimal for neuroimaging data. The translational invariance property of CNNs is likely not strictly necessary when it comes to predicting mental illnesses because brains are often registered to a common space such that the location of each voxel in each scan represents the same or a similar location. This is unlike natural images, where a dog, for example, could appear anywhere in the image. The advent of research in vision transformer architectures [34] and architectures like MLPMixer [35] that either implicitly or explicitly try to build in translational invariance opens up a field of neuroimaging-specific architectures, where we do not build in potentially unnecessary inductive biases like translational invariance. Instead, we could look at sharing parameters along dimensions that are more natural for neuroimaging data. An additional way to build in a neuroimaging-specific inductive bias is to restrict the receptive field of a neural network such that downstream representations can only be created based on certain anatomical regions in the brain; this could also increase the interpretability of a method. Since this space is largely unexplored, there is a wealth and variety of inductive biases we can explore as a field.

Opening up the field of neuroimaging-specific architectures may also lead to other data representations, such as point clouds [36]. A point cloud may be beneficial because the cortex is a large sheet of neurons that is folded in a specific way. Thus, voxels may not be able to accurately represent distances between neurons. Extracting the cortical surface and representing it as a point cloud with accurate distance measures between the points, however, may be a more natural way to analyze MRI data. The standard tool used to do this is FreeSurfer [37], although there is a lot of ongoing research. Potentially moving away from convolutional layers may allow us to preserve invariances in the brain that 3D convolutions cannot, such as rotational invariance [36]. Furthermore, representations such as graphs and neural ODEs are likely more natural fits to brain organization and activity dynamics in the brain [38]. There is work in this direction [39], but it is still a largely untapped subfield.
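The argument about weight sharing and registered brains can be illustrated with a one-dimensional toy example. This is only a sketch with made-up numbers: the kernel, the input patterns, and the `region_mask` are all hypothetical. Strictly speaking, a convolution on its own is translation-equivariant (the output shifts along with the input); invariance follows once the output is pooled, for example with a global maximum.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation: the same kernel weights are reused at every position."""
    width = len(kernel)
    return [
        sum(w * signal[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal) - width + 1)
    ]


def position_specific(signal, weights):
    """One independent weight per input position, so location carries meaning."""
    return [w * x for w, x in zip(weights, signal)]


kernel = [1, 2, 1]
pattern = [0, 0, 3, 1, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 3, 1, 0, 0]  # the same pattern moved two positions to the right

# Shared weights: the response moves with the input, and its maximum is unchanged.
print(conv1d(pattern, kernel), max(conv1d(pattern, kernel)))
print(conv1d(shifted, kernel), max(conv1d(shifted, kernel)))

# Hypothetical mask that only 'listens' to positions 2 and 3, loosely analogous
# to restricting a model to one anatomical region in a registered scan.
region_mask = [0, 0, 1, 1, 0, 0, 0, 0]
print(sum(position_specific(pattern, region_mask)))
print(sum(position_specific(shifted, region_mask)))
```

With shared weights the shifted pattern produces the same peak response, which is useful when an object can appear anywhere. With position-specific weights the shifted pattern is ignored, which is arguably the behavior we want when every position corresponds to a fixed anatomical location.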

Figure 9: An example of a possibly more natural representation of the brain, where the opacity of the connections between brain regions (pink, yellow, and copper) indicates the strength of the connection. In case we look at structures, it can be beneficial to represent the structures as point clouds [36].

Nonetheless, methods that are currently used for the analysis of neural recordings may not be sufficient and could be supplemented by methods that allow us to understand a microprocessor [40]. This is not to say that the brain is a microprocessor, but microprocessor analysis techniques may lead to new insights about neuroscientific data. Indeed, current neuroimaging methods applied to the output of a microprocessor are not sufficient to understand how that microprocessor processes information [40].

Finally, the direct prediction of phenotypes, including some mental disorders (e.g. depression), is a controversial topic in the field, especially because deep learning classifiers may not capture the relevant non-linear patterns due to issues like small sample sizes and noise [41]. Focusing less on direct predictions and more on the development of representations that help us understand the brain more broadly is likely a more fruitful direction. This requires taking into account both neuroimaging-specific representations and neuroscientific knowledge when designing new models.


Acquiring raw data is becoming easier with the advent of new technologies that allow us to visualize different types of brain signals at multiple levels of abstraction. The increase in data and the number of modalities have led to a need for more complex algorithms to make sense of it. The parallel advent of deep learning creates an interesting opportunity to build end-to-end learning algorithms to help us understand neuroimaging data and the functioning of the brain more deeply. One of the first validation studies on the potential of deep learning in the field of neuroimaging was published in 2014 [42]. Today, there are still many hurdles when it comes to utilizing deep learning in the field of neuroimaging. There are also many potential ways to overcome these hurdles and build models suited to the unique challenges of neuroimaging.

We have come a long way since the days of antiquity, and have acquired a great deal of knowledge about the brain and its functions. We also know that there is still an enormous amount of information yet to uncover. Deep learning can help us to do this structurally, but it is important to leverage neuroimaging’s unique features as well. These unique features and previous knowledge should be synthesized into the creation and interpretation of deep learning models. Interpretations and data understanding are essential. Furthermore, previous research in fields like data processing should not be dismissed as something that a neural network can learn. Deep learning in neuroimaging has some important limitations, such as sample size, that may make it hard to learn fine variations. Furthermore, regions in a volume of the brain that a neural network ‘pays attention to’ need to be robust and backed by neuroscientific explanations of why those areas may affect a certain target variable. Both these points emphasize the importance of data understanding when applying deep learning to neuroimaging.

Data understanding and interpretation is an essential facet of deep learning applications to neuroimaging data and can lead to novel algorithms. It ties in with most of the topics discussed in this overview and links to a more general movement, championed by Andrew Ng, called data-centric AI. Specifically for neuroimaging, it is relevant to know what a model has learned to interpret its results. It is necessary to consider these things during the adaptation of state-of-the-art deep learning research to neuroimaging data. Progress in out-of-distribution generalization, self-supervised learning, and data efficiency on natural images has been remarkably promising. These results, however, may not directly translate to other domains, such as neuroimaging. One way to improve these results may be to gather more data and combine datasets, which can prove difficult, due to factors like scanner differences. Results may also be improved with the exploration of neuroimaging-specific models. All of these approaches require a good understanding of the data, both before applying the algorithms and when interpreting their results. In the end, the main goal should be to improve our understanding of the brain and enable people with brain-related disorders to live lives according to their wants and needs.


Thank you, Kiran Vaidhya and Tariq Daouda, for the insightful comments, the smooth editorial process, and helpful questions to move this piece forward. Furthermore, thank you, Vince Calhoun, Noah Lewis, and Eloïse Geenjaar, for proofreading early versions of this article and helping me make it more accessible to a broader public.

Supplementary Information

sMRI
An MRI scanner is made up of a large magnet, three gradient coils, and radiofrequency (RF) coils. The large magnet creates a magnetic field M0, the gradient coils create gradients in the magnetic field along the x, y, and z directions, and the RF coils create the signal we use as the contrast in an sMRI scan. The large magnet causes the axes of the protons of the hydrogen atoms in the brain to align in the direction of M0 while they spin around their axes. To make sure we can map the signal back to locations in a 3D space, the x, y, and z gradient coils cause frequency variations in the spin of the protons; these frequency variations are unique for each location in a 3D grid. Then, to obtain a signal that can be used to distinguish between tissues, the RF coils transmit a short pulse that ‘pushes’ the axes of the protons orthogonal to the main magnetic field M0. After the pulse ends, the axes of the protons gradually realign with the main magnetic field in the scanner. The realignment causes the protons to release energy in the form of radio waves. These radio waves are used, together with the time it takes protons to realign with the main magnetic field, to reconstruct a 3D sMRI scan. The realignment times are called relaxation times, and different ways of measuring them (e.g. the axis along which we measure realignment) lead to different contrasts in a brain scan. One commonly used relaxation time definition is T1. Under this definition, fat protons relax much faster than water protons, which causes fat to be bright in the sMRI scan and water to be dark. Fat relaxes faster because hydrogen atoms in fat are contained in large carbon chains, which damps the spinning of the protons. Water, however, is made up of two hydrogen atoms and one oxygen atom. The oxygen atom does not damp the resonance of the proton spins as much, so it takes longer for them to realign with the main magnetic field.
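
To make the link between relaxation time and brightness concrete, here is a minimal sketch that assumes the textbook exponential model for T1 recovery after the protons have been tipped orthogonal to the main field. The model, the function name `longitudinal_recovery`, and the T1 values are illustrative assumptions rather than numbers from this article; real T1 values depend on the tissue and the field strength.

```python
import numpy as np


def longitudinal_recovery(t_ms, t1_ms):
    """Fraction of the equilibrium magnetization that has realigned with
    the main field t_ms milliseconds after the RF pulse, under the simple
    exponential recovery model 1 - exp(-t / T1)."""
    return 1.0 - np.exp(-np.asarray(t_ms, dtype=float) / t1_ms)


# Order-of-magnitude T1 values, chosen only for illustration (assumptions).
T1_FAT_MS = 300.0
T1_WATER_MS = 3000.0

# Hypothetical moment at which the signal is read out.
readout_ms = 500.0

fat_signal = longitudinal_recovery(readout_ms, T1_FAT_MS)
water_signal = longitudinal_recovery(readout_ms, T1_WATER_MS)

# Fat has recovered far more of its magnetization than water at readout,
# which is why it shows up brighter in a T1-weighted image.
print(f"fat: {fat_signal:.2f}, water: {water_signal:.2f}")
```

At the chosen readout time, most of the fat magnetization has already realigned while water has barely started, which matches the bright-fat and dark-water contrast described above.
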
Each of these RF pulses is applied to a slice of the brain until the whole brain is imaged. To learn more about the physics of MRI scanners, I would advise you to read [42, 43]. The signal that the scanner produces is not a 3D volume of the brain, but rather a K-space, which encodes the information about the brain in terms of spatial frequencies. To obtain a 3D volume of the brain that we can interpret, we perform an inverse Fourier transform of the K-space. An example image of a K-space is shown in Figure 10 and more on this subject can be found online [44].
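
The relationship between K-space and the image can be sketched in a few lines of NumPy. This is a simplified 2D illustration under stated assumptions: the "slice" is a synthetic rectangle, and the scanner measurement is simulated as the centered 2D Fourier transform of that slice. A real scanner pipeline involves multiple coils, noise, and further corrections.

```python
import numpy as np

# Hypothetical 2D "slice": a bright rectangle on a dark background.
image = np.zeros((64, 64))
image[20:44, 24:40] = 1.0

# Simulated measurement (assumption): K-space is the 2D Fourier transform
# of the slice, shifted so that low spatial frequencies sit in the center.
kspace = np.fft.fftshift(np.fft.fft2(image))

# Reconstruction: undo the shift, then apply the inverse Fourier transform.
reconstruction = np.fft.ifft2(np.fft.ifftshift(kspace))

# The result is complex-valued; the magnitude is what we view as an image.
recovered = np.abs(reconstruction)

print("K-space is complex-valued:", np.iscomplexobj(kspace))
print("Slice recovered:", np.allclose(recovered, image))
```

With fully sampled K-space the inverse transform recovers the slice up to floating-point error, so reconstruction is only hard when measurements are missing or noisy.
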
fMRI
To transport oxygen to a site that is using or has used energy, oxygen can be bound to hemoglobin sites in a blood cell, where a hemoglobin site that carries oxygen is called oxyhemoglobin (Hb), and sites that do not are called deoxyhemoglobin (dHb). The latter is paramagnetic, so it locally disturbs the magnetic field of the MRI scanner in a way that can be measured.
K-space
The resolution of an MRI scan is mostly determined by the strength of the main magnet. This strength is expressed in Tesla (T), and 3T scanners are the most common. 7T scanners are starting to become more common, however, and researchers have recently also started experimenting with 9.4T magnets on humans, see Figure 4. The higher the field strength of the scanner, the more dangerous it becomes to have any ferromagnetic materials in its vicinity. Stories of large metal equipment, like chairs or oxygen tanks, being pulled into an MRI scanner with tremendous force often do the rounds at MRI centers. Bigger magnets lead to higher costs, partly because it can take longer to acquire MRI scans, which leads to lower throughput. This is already a problem for MRI scanners with smaller magnets. Further, the longer a person is inside an MRI scanner, the more likely they are to start moving and cause artifacts in the scan. One recent approach to tackle some of these problems has been introduced by Facebook AI Research (FAIR) in collaboration with New York University (NYU), namely FastMRI [13]. The idea is to undersample the K-space of the brain you want to image and use a deep learning algorithm to reconstruct the MRI scan from the incomplete measurements. Undersampling is much faster and it can also open up the use of even smaller magnets for mobile MRI scanners in rural areas [45]. This is one of the earliest steps in the pipeline of neuroimaging where deep learning algorithms are starting to make a difference in how we look at neuroimaging data and the process of reconstructing it.
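
The sketch below illustrates what undersampling means in practice. It is not the FastMRI method: it only simulates skipping K-space columns on a synthetic slice and reconstructs with a plain zero-filled inverse Fourier transform, which is the kind of degraded input a learned reconstruction model would then try to improve. The mask design, the helper name `undersampling_mask`, and its parameters are assumptions made for illustration.

```python
import numpy as np


def undersampling_mask(n_cols, rng, center_fraction=0.1, keep_prob=0.25):
    """Boolean mask over K-space columns: always keep a central band of
    low frequencies and keep each remaining column with probability
    keep_prob. (Hypothetical helper for this illustration.)"""
    mask = rng.random(n_cols) < keep_prob
    n_center = max(1, int(round(center_fraction * n_cols)))
    start = (n_cols - n_center) // 2
    mask[start:start + n_center] = True
    return mask


rng = np.random.default_rng(0)

# Synthetic slice and its simulated, fully sampled, centered K-space.
image = np.zeros((64, 64))
image[20:44, 24:40] = 1.0
kspace = np.fft.fftshift(np.fft.fft2(image))

# Skip the columns that the mask drops, as if they were never acquired.
mask = undersampling_mask(kspace.shape[1], rng)
undersampled = kspace * mask[np.newaxis, :]

# Zero-filled baseline: treat missing measurements as zeros and invert.
zero_filled = np.abs(np.fft.ifft2(np.fft.ifftshift(undersampled)))

acceleration = mask.size / mask.sum()
error = np.linalg.norm(zero_filled - image) / np.linalg.norm(image)
print(f"acceleration: {acceleration:.1f}x, relative error: {error:.2f}")
```

Fewer acquired columns means a shorter scan, at the cost of aliasing artifacts in the zero-filled image. A learned model is meant to close that gap.
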
Federated learning
This also means that any training signal should not contain any subject-identifying information. I liked this video on privacy-preserving AI that describes the problems and ways to attempt to solve them [46]. Although these constraints complicate federated learning, there are a few privacy-preserving federated learning initiatives for neuroimaging that tackle this problem, such as COINSTAC [47].


For attribution in academic contexts or books, please cite this work as

Eloy Geenjaar, "Deep Learning in Neuroimaging", The Gradient, 2022.

BibTeX citation:

 author = {Eloy Geenjaar},
 title = {Deep Learning in Neuroimaging},
 journal = {The Gradient},
 year = {2022},
 howpublished = {\url{}},


  2. Gross, C. G. (1995). Aristotle on the brain. The Neuroscientist, 1(4), 245-250.
  3. Theodore, W. H., & Fisher, R. S. (2004). Brain stimulation for epilepsy. The Lancet Neurology, 3(2), 111-118.
  4. Fisher, R. S., & Velasco, A. L. (2014). Electrical brain stimulation for epilepsy. Nature Reviews Neurology, 10(5), 261-270.
  6. Horien, C., Noble, S., Greene, A. S., Lee, K., Barron, D. S., Gao, S., ... & Scheinost, D. (2021). A hitchhiker’s guide to working with large, open-source neuroimaging datasets. Nature human behaviour, 5(2), 185-193.
  9. Huster, R. J., Debener, S., Eichele, T., & Herrmann, C. S. (2012). Methods for simultaneous EEG-fMRI: an introductory review. Journal of Neuroscience, 32(18), 6053-6060.
  10. Ritter, P., & Villringer, A. (2006). Simultaneous EEG–fMRI. Neuroscience & Biobehavioral Reviews, 30(6), 823-838.
  11. Logothetis, N. K., Pauls, J., Augath, M., Trinath, T., & Oeltermann, A. (2001). Neurophysiological investigation of the basis of the fMRI signal. Nature, 412(6843), 150-157.
  12. Philiastides, M. G., Tu, T., & Sajda, P. (2021). Inferring Macroscale Brain Dynamics via Fusion of Simultaneous EEG-fMRI. Annual Review of Neuroscience, 44.
  13. Zhang, Z., Romero, A., Muckley, M. J., Vincent, P., Yang, L., & Drozdzal, M. (2019). Reducing uncertainty in undersampled MRI reconstruction with active acquisition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2049-2058).
  14. van den Heuvel, M. P., Scholtens, L. H., & Kahn, R. S. (2019). Multiscale neuroscience of psychiatric disorders. Biological psychiatry, 86(7), 512-522.
  15. Zeng, H. (2018). Mesoscale connectomics. Current opinion in neurobiology, 50, 154-162.
  16. Baltrušaitis, T., Ahuja, C., & Morency, L. P. (2018). Multimodal machine learning: A survey and taxonomy. IEEE transactions on pattern analysis and machine intelligence, 41(2), 423-443.
  17. Plis, S. M., Amin, M. F., Chekroud, A., Hjelm, D., Damaraju, E., Lee, H. J., ... & Calhoun, V. D. (2018). Reading the (functional) writing on the (structural) wall: Multimodal fusion of brain structure and function via a deep neural network based translation approach reveals novel impairments in schizophrenia. NeuroImage, 181, 734-747.
  18. Fedorov, A., Sylvain, T., Geenjaar, E., Luck, M., Wu, L., DeRamus, T. P., ... & Plis, S. M. (2021, August). Self-Supervised Multimodal Domino: in Search of Biomarkers for Alzheimer’s Disease. In 2021 IEEE 9th International Conference on Healthcare Informatics (ICHI) (pp. 23-30). IEEE.
  19. Hjelm, R. D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Bachman, P., Trischler, A., & Bengio, Y. (2018). Learning deep representations by mutual information estimation and maximization. arXiv preprint arXiv:1808.06670.
  21. Sudlow, C., Gallacher, J., Allen, N., Beral, V., Burton, P., Danesh, J., ... & Collins, R. (2015). UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS medicine, 12(3), e1001779.
  23. Poldrack, R. A., Laumann, T. O., Koyejo, O., Gregory, B., Hover, A., Chen, M. Y., ... & Mumford, J. A. (2015). Long-term neural and physiological phenotyping of a single human. Nature communications, 6(1), 1-15.
  24. Weiner, M. W., Aisen, P. S., Jack Jr, C. R., Jagust, W. J., Trojanowski, J. Q., Shaw, L., ... & Alzheimer's Disease Neuroimaging Initiative. (2010). The Alzheimer's disease neuroimaging initiative: progress report and future plans. Alzheimer's & Dementia, 6(3), 202-211.
  25. Lee, G., Nho, K., Kang, B., Sohn, K. A., & Kim, D. (2019). Predicting Alzheimer’s disease progression using multi-modal deep learning approach. Scientific reports, 9(1), 1-12.
  26. Arbabshirani, M. R., Plis, S., Sui, J., & Calhoun, V. D. (2017). Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls. Neuroimage, 145, 137-165.
  27. Sui, J., Jiang, R., Bustillo, J., & Calhoun, V. (2020). Neuroimaging-based individualized prediction of cognition and behavior for mental disorders and health: methods and promises. Biological psychiatry, 88(11), 818-828.
  28. Bzdok, D., & Meyer-Lindenberg, A. (2018). Machine learning for precision psychiatry: opportunities and challenges. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 3(3), 223-230.
  29. Roca, M., Gili, M., Garcia-Garcia, M., Salva, J., Vives, M., Campayo, J. G., & Comas, A. (2009). Prevalence and comorbidity of common mental disorders in primary care. Journal of affective disorders, 119(1-3), 52-58.
  30. Werling, D. M., & Geschwind, D. H. (2013). Sex differences in autism spectrum disorders. Current opinion in neurology, 26(2), 146.
  31. Buckley, P. F., Miller, B. J., Lehrer, D. S., & Castle, D. J. (2009). Psychiatric comorbidities and schizophrenia. Schizophrenia bulletin, 35(2), 383-402.
  32. Rokham, H., Pearlson, G., Abrol, A., Falakshahi, H., Plis, S., & Calhoun, V. D. (2020). Addressing inaccurate nosology in mental health: A multilabel data cleansing approach for detecting label noise from structural magnetic resonance imaging data in mood and psychosis disorders. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 5(8), 819-832.
  34. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
  35. Tolstikhin, I., Houlsby, N., Kolesnikov, A., Beyer, L., Zhai, X., Unterthiner, T., ... & Dosovitskiy, A. (2021). MLP-Mixer: An all-MLP architecture for vision. arXiv preprint arXiv:2105.01601.
  36. Yang, L., & Chakraborty, R. (2020, April). A GMM based algorithm to generate point-cloud and its application to neuroimaging. In 2020 IEEE 17th International Symposium on Biomedical Imaging Workshops (ISBI Workshops) (pp. 1-4). IEEE.
  37. Bassett, D. S., & Sporns, O. (2017). Network neuroscience. Nature neuroscience, 20(3), 353-364.
  38. Fischl, B. (2012). FreeSurfer. Neuroimage, 62(2), 774-781.
  39. Bessadok, A., Mahjoub, M. A., & Rekik, I. (2021). Graph neural networks in network neuroscience. arXiv preprint arXiv:2106.03535.
  40. Jonas, E., & Kording, K. P. (2017). Could a neuroscientist understand a microprocessor?. PLoS computational biology, 13(1), e1005268.
  42. Plis, S. M., Sarwate, A. D., Wood, D., Dieringer, C., Landis, D., Reed, C., ... & Calhoun, V. D. (2016). COINSTAC: a privacy enabled model and prototype for leveraging and processing decentralized brain imaging data. Frontiers in neuroscience, 10, 365.
  43. Pooley, R. A. (2005). Fundamental physics of MR imaging. Radiographics, 25(4), 1087-1099.
  46. Sarracanie, M., LaPierre, C. D., Salameh, N., Waddington, D. E., Witzel, T., & Rosen, M. S. (2015). Low-cost high-performance MRI. Scientific reports, 5(1), 1-9.
  48. Plis, S. M., Sarwate, A. D., Wood, D., Dieringer, C., Landis, D., Reed, C., ... & Calhoun, V. D. (2016). COINSTAC: a privacy enabled model and prototype for leveraging and processing decentralized brain imaging data. Frontiers in neuroscience, 10, 365.