Neural Networks and Nucleotides: AI in Genomic Manufacturing

Plant breeding is pivotal in ensuring stable food for the growing global population. To meet increasing food demands efficiently, plant breeding must achieve high rates of genetic gain. Genomic selection is a powerful tool, leveraging genome-wide DNA variation and phenotypic data to predict the performance of unobserved individuals. Empirical studies have demonstrated GSâ€™s superiority over conventional methods, enhancing selection gains and reducing breeding cycles across various crops. Furthermore, deep learning techniques, a subset of artificial intelligence, are increasingly explored in genomic prediction, showing promise in improving prediction accuracy, particularly with the expanding volume of genetic data. This intersection of genomics and DL holds the potential for revolutionizing various fields, including precision medicine and agriculture.

Deep Learning Architectures: A Genomic Perspective:

Recent advancements in genomic deep learning architectures have enabled more efficient and accurate biological data processing. CNNs excel in capturing genomic motifs, while RNNs handle sequential data like DNA sequences. Autoencoders, including Variational Autoencoders (VAEs), are valuable for feature extraction and dimensionality reduction. Emerging architectures, like hybrid models combining CNNs and RNNs, tackle specific genomic tasks effectively. Transformer-based LLMs, such as GPT, overcome the limitations of CNNs and RNNs by efficiently processing long sequences and capturing global dependencies. However, the high cost of training and serving LLMs remains challenging, especially for genomics tasks with extensive data requirements and privacy concerns.

Genomic Applications:

Deep learning is a powerful tool in various genomic applications, including gene expression characterization, regulatory genomics, functional genomics, and structural genomics. In gene expression characterization, deep learning models like denoising autoencoders and variational autoencoders have been employed to extract features from gene expression data, leading to an understanding of biological processes and better performance in tasks such as clustering and prediction. Moreover, deep learning methods have shown promise in predicting gene expression levels from DNA sequences, incorporating epigenetic data for enhanced accuracy, and even utilizing generative models to explore hypothetical gene expression profiles under different perturbations.

In regulatory genomics, deep learning techniques have been applied to identify regulatory motifs such as promoters, enhancers, and splice sites, with CNNs being particularly effective in capturing sequence features. Subcellular localization prediction of proteins has also benefited from deep learning, with models like CNNs and RNNs achieving high accuracy by effectively learning from biological sequence data. Additionally, deep learning methods in structural genomics have shown promise in protein structure classification and homology detection, leveraging techniques such as LSTM networks and CNNs to extract features from amino acid sequences and accurately classify protein folds. Overall, deep learning revolutionizes genomic research by providing powerful tools for analyzing complex biological data and uncovering novel insights into genetic mechanisms.

Materials and methods:

The study employed two datasets from the 1000 Genomes project, consisting of 10,000 and 65,535 single-nucleotide polymorphisms (SNPs) on specific chromosomal regions. They trained generative models, including Wasserstein GAN with gradient penalty (WGAN-GP), Restricted Boltzmann Machines (RBM), and Variational Autoencoders (VAE) to generate artificial genomic sequences. WGAN-GP and VAE were implemented with convolutional layers, while RBM utilized out-of-equilibrium learning. The evaluation included assessing the modelsâ€™ ability to mimic real data via PCA and calculating the nearest neighbor adversarial accuracy (AATS) to measure overfitting and underfitting. Privacy leakage was quantified using a privacy score computed from AATS values of test and training datasets.

Generating large-scale genomic data:

The study trained WGAN and CRBM models on 1000 genome data containing 65,535 SNPs to generate artificial genomic sequences. While the VAE model couldnâ€™t be trained effectively, WGAN and CRBM generated sequences that well captured real population structure and allele frequencies. However, WGAN-generated sequences had more fixed alleles with low frequencies than CRBM. LD decay analysis showed that both models had lower LD than real genomes. CRBM outperformed WGAN in 3-point correlation analysis but showed anomalies in AATS values, potentially indicating sequences outside the real data space. Further analysis revealed higher frequencies of chains of true data points compared to synthetic ones.

Conclusion:

Deep learning shows promise in genomic research for its ability to capture nonlinear patterns and integrate diverse data sources without explicit feature engineering. However, its superiority over conventional models in predictive power has yet to be definitive. While generative neural networks can efficiently simulate large-scale genomic data, challenges like computational complexity and model optimization persist. Privacy concerns also necessitate further investigation. Despite these hurdles, advancements in model training and privacy safeguards could lead to artificial genome banks, expanding access to genomic data. Deep learning holds the potential to revolutionize genomics but requires careful navigation of challenges to achieve meaningful breakthroughs in predictive accuracy and interoperability.

Sources:

https://www.mdpi.com/1422-0067/24/21/15858

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011584

https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-07319-x

The post Neural Networks and Nucleotides: AI in Genomic Manufacturing appeared first on MarkTechPost.

Source: Read MoreÂ

IBM’s next generation Granite models are now available

The Human Element: Using Research And Psychology To Elevate Data Storytelling

Google to offer free version of Gemini Code Assist

MongoDB acquires Voyage AI for its embedding and reranking models

AI-generated content in games is here to stay — the bigger issue is the outright deception and what the future may look like

Razer and Minecraft just announced a limited-edition collection, and I’m surprised it took so long

Panos Panay’s Amazon AI move: A bold bet or another Surface Duo?

OpenAI expands ‘Deep Reseach’ to those paying $20 a month or more, a day after Microsoft made OpenAI’s ‘Think Deeper’ free for all Copilot users with no usage caps

Rethink State💡 Why You Should Model Your Frontend Around Events

Rethink State💡 Why You Should Model Your Frontend Around Events

What To Expect When Migrating Your Site To A New Platform

Kotlin Multiplatform vs. React Native vs. Flutter: Building Your First App

AI-generated content in games is here to stay — the bigger issue is the outright deception and what the future may look like

AI-generated content in games is here to stay — the bigger issue is the outright deception and what the future may look like

Razer and Minecraft just announced a limited-edition collection, and I’m surprised it took so long

Panos Panay’s Amazon AI move: A bold bet or another Surface Duo?

Neural Networks and Nucleotides: AI in Genomic Manufacturing

ANDI Accessibility Testing Tool Tutorial

How Data Analytics in Insurance is Driving Smarter Decisions

North Korean hackers masquerade as remote IT workers and venture capitalists to steal crypto and secrets

Symposium highlights scale of mental health crisis and novel methods of diagnosis and treatment

API with NestJS #181. Prepared statements in PostgreSQL with Drizzle ORM

Mastering SFMC Automation Studio: Activities, Capabilities, and Journey Integration

New tech, new problems: Why application development needs a big-picture view

Kyutaiâ€™s AI voice assistant beats OpenAI to public release

Transformative Impact of Artificial Intelligence AI on Medicine: From Imaging to Distributed Healthcare Systems

8 Best Online Collaboration Software for Teams in 2024

Neural Networks and Nucleotides: AI in Genomic Manufacturing

Related Posts