ahof1704 committed on
Commit
eddbc75
1 Parent(s): 8282a22

Update README.md

Files changed (1)
  1. README.md +44 -56
README.md CHANGED
@@ -16,8 +16,6 @@ The pretrained model of Brain Language Model (BrainLM) aims to achieve a general
16
 
17
  We introduce the Brain Language Model (BrainLM), a foundation model for brain activity dynamics trained on 6,700 hours of fMRI recordings. Utilizing self-supervised masked-prediction training, BrainLM demonstrates proficiency in both fine-tuning and zero-shot inference tasks. Fine-tuning allows for the prediction of clinical variables and future brain states. In zero-shot inference, the model identifies functional networks and generates interpretable latent representations of neural activity. Furthermore, we introduce a novel prompting technique, allowing BrainLM to function as an in silico simulator of brain activity responses to perturbations. BrainLM offers a novel framework for the analysis and understanding of large-scale brain activity data, serving as a “lens” through which new data can be more effectively interpreted.
18
 
19
-
20
-
21
  - **Developed by:** [van Dijk Lab](https://www.vandijklab.org/) at Yale University
22
  - **Shared by [optional]:** [More Information Needed]
23
  - **Model type:** [More Information Needed]
@@ -35,92 +33,82 @@ We introduce the Brain Language Model (BrainLM), a foundation model for brain ac
35
 
36
  ## Uses
37
 
38
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
-
40
- ### Direct Use
41
-
42
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
-
44
- [More Information Needed]
45
-
46
- ### Downstream Use [optional]
47
-
48
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
 
50
- [More Information Needed]
 
 
 
51
 
52
  ### Out-of-Scope Use
53
 
54
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
-
56
- [More Information Needed]
57
 
58
  ## Bias, Risks, and Limitations
59
 
60
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
-
62
- [More Information Needed]
 
63
 
64
  ### Recommendations
65
 
66
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
-
68
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
 
70
  ## How to Get Started with the Model
71
 
72
  Use the code below to get started with the model.
73
 
74
- [More Information Needed]
75
 
76
  ## Training Details
77
 
78
- ### Training Data
79
-
80
- <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
-
82
- [More Information Needed]
83
-
84
- ### Training Procedure
85
-
86
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
-
88
- #### Preprocessing [optional]
89
-
90
- [More Information Needed]
91
 
 
 
 
92
 
93
- #### Training Hyperparameters
 
 
 
 
94
 
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 
 
96
 
97
- #### Speeds, Sizes, Times [optional]
 
98
 
99
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 
 
 
100
 
101
- [More Information Needed]
102
-
103
- ## Evaluation
104
-
105
- <!-- This section describes the evaluation protocols and provides the results. -->
106
-
107
- ### Testing Data, Factors & Metrics
108
-
109
- #### Testing Data
110
 
111
- <!-- This should link to a Data Card if possible. -->
112
 
113
- [More Information Needed]
114
 
115
- #### Factors
 
 
 
 
116
 
117
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
 
119
- [More Information Needed]
120
 
121
  #### Metrics
122
 
123
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 
 
 
124
 
125
  [More Information Needed]
126
 
 
33
 
34
  ## Uses
35
 
36
+ BrainLM is a versatile foundation model for fMRI analysis. It can be used for:
 
 
 
 
 
 
 
 
 
 
37
 
38
+ - Decoding cognitive variables and mental health biomarkers from brain activity patterns
39
+ - Predicting future brain states by learning spatiotemporal fMRI dynamics
40
+ - Discovering intrinsic functional networks in the brain without supervision
41
+ - Perturbation analysis to simulate the effect of interventions on brain activity
42
 
43
  ### Out-of-Scope Use
44
 
45
+ Currently, this model has been trained and tested only on fMRI data. There are no guarantees regarding its performance on different modalities of brain recordings.
 
 
46
 
47
  ## Bias, Risks, and Limitations
48
 
49
+ - The model was trained only on healthy adults, so it may not generalize to other populations
50
+ - The fMRI data has limited spatiotemporal resolution, and BOLD signals are an indirect measure of neural activity
51
+ - The model has only been evaluated on reconstruction and simple regression/classification tasks so far
52
+ - Attention weights provide one method of interpretation but have known limitations
53
 
54
  ### Recommendations
55
 
56
+ - Downstream applications of the model should undergo careful testing and validation before clinical deployment.
57
+ - As with any AI system, model predictions should be carefully reviewed by domain experts before informing decision-making.
 
58
 
59
  ## How to Get Started with the Model
60
 
61
  Use the code below to get started with the model.
62
 
 
63
 
64
  ## Training Details
65
 
66
+ ### Data
 
 
 
 
 
 
 
 
 
 
 
 
67
 
68
+ Data stats:
69
+ - UK Biobank (UKB): 76,296 recordings (~6,450 hours)
70
+ - Human Connectome Project (HCP): 1,002 recordings (~250 hours)
71
 
72
+ Preprocessing Steps:
73
+ - Motion Correction
74
+ - Normalization
75
+ - Temporal Filtering
76
+ - ICA Denoising
77
 
78
+ Feature Extraction:
79
+ - Brain Parcellation: The AAL-424 atlas is used to divide the brain into 424 regions.
80
+ - Temporal Resolution: ~1.4 Hz (one volume every 0.735 s for UKB and 0.72 s for HCP).
81
+ - Dimensionality: 424-dimensional time series per scan.
82
 
83
+ Data Scaling:
84
+ - Robust scaling was applied: for each parcel, the median was subtracted and the result was divided by the interquartile range, with both statistics computed across subjects.
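The robust scaling step can be sketched as follows. This is a minimal illustration, not the BrainLM preprocessing code: the function name `robust_scale_parcel` and the toy values are hypothetical, and it assumes the median and interquartile range are computed over all values pooled for one parcel.

```python
import statistics


def robust_scale_parcel(values):
    """Scale one parcel's values: subtract the median, divide by the IQR."""
    median = statistics.median(values)
    # Quartile cut points, using linear interpolation between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; the parcel values are constant")
    return [(v - median) / iqr for v in values]


# Toy values for a single parcel, pooled across subjects (made up for illustration).
parcel_values = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = robust_scale_parcel(parcel_values)
print(scaled)
```

Unlike mean/standard-deviation scaling, the median and interquartile range are barely affected by a few extreme values, which is the usual reason for choosing this form of normalization.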
85
 
86
+ Data split:
87
+ - Training data: 80% of the UKB dataset
88
+ - Validation data: 10% of the UKB dataset
89
+ - Test data: 10% of the UKB dataset and HCP dataset
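An 80/10/10 split like the one listed above could be produced along these lines. This is a sketch under assumptions: the helper `split_recordings`, the seed, and the choice to split by recording index (rather than by subject) are illustrative and not taken from the BrainLM code.

```python
import random


def split_recordings(recording_ids, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle recording IDs and cut them into train/validation/test lists."""
    ids = list(recording_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# 76,296 is the number of UKB recordings stated in the data stats.
train_ids, val_ids, test_ids = split_recordings(range(76296))
print(len(train_ids), len(val_ids), len(test_ids))
```

If several recordings come from the same participant, splitting by subject ID instead of recording ID would avoid leakage between partitions.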
90
 
91
+ ### Training Procedure
 
 
 
 
 
 
 
 
92
 
93
+ BrainLM was pretrained on fMRI recordings from the UK Biobank and HCP datasets. Recordings were parcellated, embedded, masked, and reconstructed via a Transformer autoencoder. The model was evaluated on held-out test partitions of both datasets.
94
 
95
+ Objective: Mean squared error loss between original and predicted parcels
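For reference, the mean squared error objective can be written in a few lines. The sketch below uses plain Python lists and made-up numbers. Whether the loss is averaged over all parcel values or only over the masked ones is not stated here, so the function simply averages over whatever values it is given.

```python
def mse(original, predicted):
    """Mean squared error between two equally long sequences of values."""
    if len(original) != len(predicted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(original, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Toy parcel values (illustrative only).
original = [0.5, -1.0, 2.0, 0.0]
predicted = [0.0, -1.0, 1.0, 1.0]
loss = mse(original, predicted)
print(loss)
```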
96
 
97
+ Pretraining:
98
+ - 100 epochs
99
+ - Batch size 512
100
+ - Adam optimizer
101
+ - Masking ratios: 20%, 75% and 90%
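To make the masking ratios concrete, the sketch below draws a random mask over a sequence of input tokens at each listed ratio. The function `sample_mask`, the token count of 200, and the uniform random sampling are assumptions for illustration. The actual tokenization and masking strategy used by BrainLM are not described here.

```python
import random


def sample_mask(num_tokens, mask_ratio, seed=None):
    """Return a list of booleans; True marks a token hidden from the model."""
    rng = random.Random(seed)
    num_masked = round(num_tokens * mask_ratio)
    masked_positions = set(rng.sample(range(num_tokens), num_masked))
    return [i in masked_positions for i in range(num_tokens)]


# Count how many of 200 hypothetical tokens are masked at each ratio.
masked_counts = {}
for ratio in (0.20, 0.75, 0.90):
    mask = sample_mask(num_tokens=200, mask_ratio=ratio, seed=0)
    masked_counts[ratio] = sum(mask)
print(masked_counts)
```

Higher ratios leave the model less visible context, which makes the reconstruction task harder.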
102
 
103
+ Downstream training: Fine-tuning on future state prediction and on regression/classification of clinical variables
104
 
 
105
 
106
  #### Metrics
107
 
108
+ In this work, we use the following metrics to evaluate the model's performance:
109
+ - Reconstruction error (MSE between predicted and original parcel timeseries)
110
+ - Clinical variable regression error (e.g., age, neuroticism scores)
111
+ - Functional network classification accuracy
112
 
113
  [More Information Needed]
114