Genomics use case of machine learning techniques

Genomics is the branch of biology that studies the structure and function of genomes. This lecture focused mainly on the Big Data challenges in genomics and on the application of Artificial Intelligence and Machine Learning to those challenges.

The amount of data handled in genomics is projected to reach a zettabyte by 2020, which outpaces the computing power expected to be available by then. A single human genome contains around 3.2 billion base pairs, so merely storing this data is a big challenge, let alone sequencing and analyzing it. This gives a sense of the depth of the Big Data challenges in genomics and of why efficient Artificial Intelligence and Machine Learning techniques are needed, which makes the area an interesting research direction.

Artificial Intelligence and Machine Learning are used in various aspects of genomics, such as sequencing, mapping, and analyzing the structure of DNA and RNA. Identifying genetic variants is one interesting use case. For diseases with a genetic component, such as cancer, correctly identifying the variant that drives the disease helps in developing treatments, such as cancer vaccines, that target it.

Another interesting use case is Next Generation Sequencing (NGS), a technique that can sequence an entire human genome. This lets researchers study genetics at a level of detail never achieved before; the technology is also referred to as deep sequencing. The challenge in this method is to analyze the very long genome sequence to find the protein-coding regions in the DNA. Techniques from Artificial Intelligence can be used to solve this problem efficiently.

Among the achievements associated with Big Data analytics in this field, the cost of sequencing a human genome fell from about $100 million to about $1,000 in the 15 years from 2000 to 2015.
Thus there is a lot of scope for research on Big Data analytics applied to problems in genomics.
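
As a rough illustration of the search problem mentioned above (finding protein-coding stretches in a long DNA string), here is a minimal Python sketch of a naive open reading frame (ORF) scan. It is a simple rule-based baseline, not a machine learning method and not something presented in the lecture. The function name find_orfs, the min_codons parameter, and the restriction to the forward strand are assumptions made purely for illustration.

```python
# Naive ORF scan: a rule-based baseline, not an ML technique.
# Assumption: only the forward strand is scanned; real tools also
# check the reverse complement and use statistical or learned models.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each ATG...stop stretch.

    min_codons counts every codon in the stretch, including the
    start codon and the stop codon. Positions are 0-based and
    end is exclusive.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):            # three forward reading frames
        start = None                  # index of the currently open ATG
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i             # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3           # include the stop codon
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None          # close it and keep scanning
    return orfs


print(find_orfs("CCATGAAATAGGG"))  # [(2, 11, 'ATGAAATAG')]
```

A scan like this is linear in the length of the input for each frame, but it only applies a fixed rule. The learning-based approaches the text refers to would instead score candidate regions using patterns learned from annotated genomes.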