The provided text is a detailed methodological excerpt from a study involving the MULTI Consortium and several large biobank and cohort datasets, including UK Biobank (UKBB), FinnGen, Psychiatric Genomics Consortium (PGC), TriNetX, the Baltimore Longitudinal Study of Ageing (BLSA), and the Multi-Ethnic Study of Atherosclerosis (MESA). Below is a concise summary of the key points, organized by topic:
MULTI Consortium Overview
- Purpose: Integrate multi-organ and multi-omics data (imaging, genetics, metabolomics, proteomics) to model human ageing and disease across the lifespan.
- Data Sources: Builds on existing consortia and studies.
- Ethics: Approved by Columbia University IRB (AAAV6751).
UK Biobank (UKBB)
- Participants: ~500,000 UK individuals (2006-2010).
- Sleep Data: Self-reported sleep duration via touchscreen (field ID 1160), including naps.
- Data Collection Details: Basic quality control excluded unrealistic values; average sleep duration over a 4-week period was considered.
- Imaging Data: Multi-organ MRI-derived Image-Derived Phenotypes (IDPs) across brain, heart, liver, pancreas, spleen, adipose, kidney, and eye OCT.
- Biomarkers: Plasma proteomics (UKB Pharma Proteomics Project) and metabolomics (Nightingale Health).
- Ageing Clocks: Developed 23 multi-organ biological age gaps (BAGs): 7 MRI-based BAGs (MRIBAGs), 11 proteomics BAGs (ProtBAGs), and 5 metabolomics BAGs (MetBAGs).
- Analyses: Used Generalized Additive Models (GAMs) to model sleep duration relationships with BAGs and other phenotypes, adjusting for multiple covariates and minimizing overfitting via penalized regression splines.
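To make the penalized regression spline idea concrete, here is a minimal sketch of a single smooth term fitted to synthetic data. It is not the study's GAM pipeline: the truncated-linear basis, the knot placement, the penalty weight `lam`, and the simulated U-shaped relationship centred at 7 hours are all assumptions chosen for illustration, and covariate adjustment is omitted.

```python
import numpy as np

def truncated_linear_basis(x, knots):
    """Design matrix: intercept, linear term, and one hinge (x - knot)_+ per knot."""
    x = np.asarray(x, dtype=float)
    hinge = np.clip(x[:, None] - np.asarray(knots)[None, :], 0.0, None)
    return np.column_stack([np.ones_like(x), x, hinge])

def fit_penalized_spline(x, y, knots, lam=1.0):
    """Least squares with a ridge penalty on the hinge coefficients only.

    Larger lam shrinks the slope changes at the knots toward zero, so the
    fitted curve approaches a straight line; this is what limits overfitting.
    """
    B = truncated_linear_basis(x, knots)
    penalty = np.diag([0.0, 0.0] + [1.0] * len(knots))  # intercept and slope unpenalized
    return np.linalg.solve(B.T @ B + lam * penalty, B.T @ y)

def predict(x, knots, coef):
    return truncated_linear_basis(x, knots) @ coef

# Synthetic data (assumption): age gap is lowest near 7 hours of sleep.
rng = np.random.default_rng(0)
sleep = rng.uniform(4.0, 11.0, size=500)
bag = 0.3 * (sleep - 7.0) ** 2 + rng.normal(0.0, 0.5, size=500)

knots = np.linspace(5.0, 10.0, 6)
coef = fit_penalized_spline(sleep, bag, knots, lam=1.0)

grid = np.linspace(4.0, 11.0, 141)
fitted = predict(grid, knots, coef)
sleep_at_minimum = grid[np.argmin(fitted)]
print(f"Fitted curve is lowest near {sleep_at_minimum:.2f} hours of sleep")
```

In practice a GAM library would choose the smoothing penalty automatically and handle multiple covariates; the sketch only shows how a penalty on the spline coefficients yields a flexible but constrained curve.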
Key Methodological Approaches
- Ageing Clocks: Trained and validated using nested cross-validation on healthy control populations, with the data split into training, validation, test, and hold-out test datasets.
- Multi-Omics Integration: Proteins and metabolites were quality controlled, normalized, and annotated with the Human Protein Atlas.
- Modeling Sleep Effects: GAMs allowed flexible nonlinear patterns including U-shaped relationships; tested main effects, sex differences, and sex-sleep interactions.
- Effect Size Outcomes: Associations quantified while excluding outliers and controlling for confounders.
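The nested cross-validation step for an ageing clock can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: closed-form ridge regression stands in for the algorithms actually used, the features are synthetic, the separate hold-out test set is omitted, and the age gap is computed as predicted age minus chronological age, which is the usual convention but is not spelled out in the text summarized here.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalized intercept (via centering)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def ridge_predict(X, w, b):
    return X @ w + b

def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split into k roughly equal folds."""
    return np.array_split(rng.permutation(n), k)

def nested_cv_predictions(X, y, lams, outer_k=5, inner_k=3, seed=0):
    """Out-of-fold predictions; lam is tuned only on each outer-training set."""
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = np.empty(n)
    for test_idx in kfold_indices(n, outer_k, rng):
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        inner_folds = kfold_indices(len(train_idx), inner_k, rng)
        inner_mae = []
        for lam in lams:
            errs = []
            for val_pos in inner_folds:
                val_idx = train_idx[val_pos]
                fit_idx = np.setdiff1d(train_idx, val_idx)
                w, b = ridge_fit(X[fit_idx], y[fit_idx], lam)
                errs.append(np.mean(np.abs(ridge_predict(X[val_idx], w, b) - y[val_idx])))
            inner_mae.append(np.mean(errs))
        best_lam = lams[int(np.argmin(inner_mae))]
        # Refit on the full outer-training set, then predict the untouched outer-test fold.
        w, b = ridge_fit(X[train_idx], y[train_idx], best_lam)
        preds[test_idx] = ridge_predict(X[test_idx], w, b)
    return preds

# Synthetic "healthy control" data (assumption): 10 noisy features that track age.
rng = np.random.default_rng(1)
n, p = 300, 10
age = rng.uniform(45.0, 80.0, size=n)
loadings = np.linspace(0.5, 1.5, p)
X = np.outer((age - age.mean()) / 10.0, loadings) + rng.normal(size=(n, p))

predicted_age = nested_cv_predictions(X, age, lams=[0.1, 1.0, 10.0, 100.0])
bag = predicted_age - age  # biological age gap per participant
mae = np.mean(np.abs(bag))
print(f"Out-of-fold mean absolute error: {mae:.2f} years")
```

The key design point is that hyperparameters are selected inside the inner loop, so each outer-test fold never influences the model that predicts it.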
FinnGen
- Dataset: >500,000 Finnish biobank samples.
- GWAS Data: Summary statistics for 521 disease endpoints included after harmonization.
- Method: No individual-level data used; analyses were based on REGENIE-generated summary statistics adjusted for age, sex, PCs, and batch covariates.
Psychiatric Genomics Consortium (PGC)
- Focus: Genetics of psychiatric disorders.
- Data: GWAS summary statistics for 6 brain diseases included (e.g., schizophrenia, bipolar disorder, major depression).
- Usage: Summary data only; quality controlled and harmonized.
TriNetX
- Data Type: Real-world clinical data on >90 million patients from >70 healthcare organizations.
- Purpose: Assess associations of sleep traits (insomnia, hypersomnia) with systemic diseases identified in UKBB.
Baltimore Longitudinal Study of Ageing (BLSA)
- Goal: Track physiological and cognitive ageing.
- Data: Brain MRI, plus self-reported and actigraphy-based (wearable) sleep duration measures (n=385).
- Replication: Used to replicate the U-shaped sleep-brain ageing associations observed in UKBB.
Multi-Ethnic Study of Atherosclerosis (MESA)
- Participants: >6,000 diverse US adults.
- Data Used: 573 participants with brain MRI and self-reported sleep duration.
- Purpose: Replicate the U-shaped association between sleep duration and brain ageing observed in UKBB.
Summary of Major Analytical Techniques
- GAMs: Flexible nonlinear models for exploring sleep-organ ageing associations. Used to identify U-shaped or other nonlinear trends.
- Machine Learning for Ageing Clocks: Nested cross-validation, hyperparameter tuning, held-out test sets, algorithms including LASSO, SVR, elastic net, neural nets.
- Covariate Controls: Age, sex, anthropometrics, blood pressure, assessment center, disease presence, organ-specific confounders.
Important URLs and References
- UKBB ethics and data gateway: https://www.ukbiobank.ac.uk
- Human Protein Atlas: https://www.proteinatlas.org/
- FinnGen GWAS repository: https://www.finngen.fi/en/access_results
- PGC GWAS data: https://pgc.unc.edu/for-researchers/download-results/
- TriNetX platform: https://trinetx.com

