Agricultural Genomics and Bioinformatics Group

Last modified: 21. October 2022

Detailed introduction of the group

Our group is the only one at MATE that covers bioinformatics. We work mainly on basic and applied research in bioinformatics. In addition to our main research areas, we also provide bioinformatics services, such as running the university’s genomic server in Sopron and conducting bioinformatic analyses for other MATE groups.
The current members of the group are: 1 scientific advisor, 2 research fellows, 1 research associate and 2 PhD students.

Short introduction of the research projects

  • Farm animal genomics
    • supervisor: Dr Endre Barta (MATE, GBI)
    • participating researcher: Dr Viktor Stéger (MATE, GBI)
  • Studying regulatory polymorphisms (OTKA K-132814 grant)
    • supervisor: Dr Endre Barta (MATE, GBI)


Description of the current research projects

Topic 1.

Genomics of farm animals

We were the first in Hungary to sequence mammalian genomes and to assemble one of them de novo. In those projects we analyzed the genomes of the three Mangalica pig varieties and their specific genetic markers; this work was published in BMC Genomics. We also assembled the genome of the red deer, an iconic game species in Hungary, and published it in Molecular Genetics and Genomics (MGG). The importance of that work is reflected in the use of that genome as a reference in other research projects.

A collaborative research project, ‘Increasing the productivity of domestic rabbit breeding stocks using genomic methods’ (original Hungarian title: ‘Házinyúl tenyészetek termelőképességének növelése genomikai módszerekkel’), was initiated by S&K-LAP Nyúltenyésztő Kft., the University of Veterinary Medicine (UVM, Budapest), and the Hungarian University of Agriculture and Life Sciences (MATE; previously the ABC of NARIC) to increase the productivity of domestic rabbit farms. The research is supported by a three-year grant from the Hungarian National Research, Development and Innovation Office. The project aims to develop rabbit breeding stocks whose progeny can be raised on antibiotic-free feed, thus providing healthier and more natural meat and meat products for the national and international markets. To reach this goal, the participating researchers use modern genomic methods to develop non-GMO breeding stocks that are disease-resistant while retaining excellent traits and qualities. In the medium term, we plan to strengthen the current scientific cooperation between the agricultural business, university and research institute partners. We see a number of opportunities for this in areas such as genomics, diagnostics, biobank development, and agro-biotechnology research, development and innovation, with the aim of increasing the international competitiveness of Hungarian agro-businesses.

In this project, our group performs bioinformatic and genomic analyses in collaboration with the University of Debrecen, using MATE and University of Debrecen servers. So far, we have analyzed nearly 300 full genomes and data from 50 RNA-Seq samples; metagenome samples are also being analyzed. Based on the primary data, we have completed the first genome-wide association studies (GWAS), in which we identified several genomic regions that may be involved in ERE resistance. The functional annotation of these regions is also underway.
The primary domestic rabbit GWAS and genomic analyses were done using the publicly available OryCun2 reference rabbit genome. It has turned out, however, that this genome assembly is far from perfect, so we have started the de novo assembly of the genome of the XXL stock, which is used in our experiments and is important for breeding. For this, we first used the Chromium 10x technology, which did not give satisfactory results; in the second round, the Pacific Biosciences Sequel II long-read technology was therefore used for genome sequencing.

In this project, we have sequenced nearly 300 rabbit genomes and downloaded almost 100 publicly available ones. The resulting large body of genome data is suitable not only for population genomic analyses but also for marker development, which can make breeders’ work more efficient.

To analyze this amount of genome data, we have developed a bioinformatic pipeline that both handles the large data volume and helps to bridge the long geographical distance between the available servers, which we had to use in the absence of a server cluster. Data analyses were aided by the Genomic Group of the FIK Big Data area of the University of Debrecen.
Using the pipeline, we have obtained VCF files containing genome variations. Merging these gives a single file larger than 100 GB, which is suitable for further analyses, such as surveying population structure by Admixture and Principal Component Analysis (PCA). For example, evolutionary processes were monitored by searching for homozygous regions and analyzing selective pressure. As a result of our work, we have revealed the purity, admixture and level of inbreeding of the different lines used for meat production. We have also developed a database of genomic variations, which can be used for the rapid digital genotyping of all sequenced animals for certain important markers.
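The search for homozygous regions mentioned above can be illustrated with a minimal Python sketch. It is not the group's pipeline: the function names, the `min_sites` threshold and the input layout (one animal's chromosome, position and genotype string, as they might be extracted from a variant file) are assumptions made for illustration. Dedicated tools additionally take physical run length, marker density and genotyping errors into account.

```python
from typing import Iterable, List, Tuple


def is_homozygous(gt: str) -> bool:
    """True for a called diploid genotype with identical alleles, e.g. '0/0' or '1|1'."""
    alleles = gt.replace("|", "/").split("/")
    return len(alleles) == 2 and "." not in alleles and alleles[0] == alleles[1]


def runs_of_homozygosity(
    records: Iterable[Tuple[str, int, str]], min_sites: int = 3
) -> List[Tuple[str, int, int, int]]:
    """Find runs of consecutive homozygous calls for one animal.

    records: (chromosome, position, genotype) tuples, sorted by chromosome
    and position. Heterozygous or missing calls end the current run.
    Returns (chromosome, start, end, number_of_sites) for every run that
    contains at least min_sites sites.
    """
    runs: List[Tuple[str, int, int, int]] = []
    current = None  # [chrom, start, end, n_sites] of the open run, if any

    def close() -> None:
        nonlocal current
        if current is not None and current[3] >= min_sites:
            runs.append((current[0], current[1], current[2], current[3]))
        current = None

    for chrom, pos, gt in records:
        if current is not None and current[0] != chrom:
            close()  # a run never spans two chromosomes
        if is_homozygous(gt):
            if current is None:
                current = [chrom, pos, pos, 1]
            else:
                current[2] = pos
                current[3] += 1
        else:
            close()
    close()
    return runs
```

Long runs shared by many animals of a line would be one indication of inbreeding or of selection acting on that region.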


Topic 2.

Studying regulatory polymorphisms

In farm animal breeding, certain economically important traits are to be enhanced. At the DNA level, qualitative and/or quantitative differences in these traits are manifested as sequence variations called polymorphisms. Put simply, these variations either change the amino acid sequence of proteins or lie in the binding sites of transcription factors. The latter are called regulatory polymorphisms. Another major research area of our group is the exploration of regulatory polymorphisms in the genomes of farm animals that may be of great importance in breeding.

Comparing the entire genome sequences of two individuals can reveal millions of variations. It is therefore important to determine which of them can be considered regulatory polymorphisms. Since no functional genomics data on transcription factor binding sites are available for farm animals, we first analyzed human and mouse ChIP-Seq data, in collaboration with the University of Debrecen. Based on this analysis, we are mapping putative transcription factor binding sites in these two genomes. By comparing the full genome sequences of different species, the genome coordinates of the putative transcription factor binding sites can be projected from one species to another.
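Transferring binding-site coordinates between genomes relies on alignment blocks that pair a stretch of one genome with a stretch of the other. The following Python sketch shows the idea under strong simplifying assumptions: blocks are ungapped, on the same strand and given as hypothetical `(source_start, source_end, target_start)` tuples, and a site is transferred only if it lies entirely within one block. It is an illustration, not the method used in the project; production tools working from whole-genome alignments also handle gaps, strand changes and split intervals.

```python
from typing import List, Optional, Tuple

# One ungapped alignment block on the same strand:
# (source_start, source_end, target_start), source interval half-open.
Block = Tuple[int, int, int]


def project_interval(
    start: int, end: int, blocks: List[Block]
) -> Optional[Tuple[int, int]]:
    """Project the half-open interval [start, end) from source to target.

    Returns the target interval if a single block contains the whole
    source interval, otherwise None (the site cannot be placed reliably).
    """
    for src_start, src_end, dst_start in blocks:
        if src_start <= start and end <= src_end:
            offset = dst_start - src_start
            return start + offset, end + offset
    return None


# Hypothetical blocks and a 20 bp binding site inside the first block.
example_blocks = [(1000, 2000, 5000), (3000, 3500, 9000)]
example_site = project_interval(1200, 1220, example_blocks)
```

Sites that straddle a block boundary are deliberately dropped here, since a binding site broken by an alignment gap is a weak candidate for conservation.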

In the first phase of our work, we analyzed ChIP-Seq data obtained from 3702 human experiments and built a catalogue of the genomic binding sites of 262 human transcription factors. We have published our results as the ChIPSummitDB database.
Using the publicly available rabbit genome sequence, we are now transferring the positions of the human transcription factor binding sites onto the rabbit genome. With the more than 300 full rabbit genomes available to us, we have the opportunity to investigate how conserved the putative rabbit binding sites are. From the VKE project we have RNA-Seq data from the intestinal wall and muscle of more than 50 rabbits. Next, we are going to analyze whether differences in gene expression levels, determined by RNA-Seq, correlate with the presence of regulatory polymorphisms in the putative transcription factor binding sites.
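One simple way to look at such a relationship is to correlate, across animals, the number of alternative alleles carried at a candidate polymorphism (0, 1 or 2) with the expression level of the nearby gene. The Python sketch below computes a Pearson correlation for that purpose. It is an illustrative assumption about how the planned analysis might begin, with invented numbers, and not a description of the group's method; a real analysis would also model covariates and correct for multiple testing.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented example: alternative-allele counts of six animals at one site
# and the expression level of the neighbouring gene in the same animals.
dosage = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
r = pearson(dosage, expression)
```

A coefficient close to 1 or -1 would flag the site for closer inspection, whereas a value near 0 would suggest no linear relationship in the sampled animals.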

 

Main publications of the group:

Frank K, Bana NÁ, Bleier N, Sugár L, Nagy J, Wilhelm J, Kálmán Z, Barta E, Orosz L, Horn P, Stéger V*. Mining the red deer genome (CerEla1.0) to develop X-and Y-chromosome-linked STR markers. PLoS One. 2020;15(11):e0242506. doi: 10.1371/journal.pone.0242506. eCollection 2020. PubMed PMID: 33226998; PubMed Central PMCID: PMC7986210.

Czipa E, Schiller M, Nagy T, Kontra L, Steiner L, Koller J, Pálné-Szén O, Barta E*. ChIPSummitDB: a ChIP-seq-based database of human transcription factor binding sites and the topological arrangements of the proteins bound to them. Database (Oxford). 2020 Jan 1;2020. doi: 10.1093/database/baz141.

Bana NÁ, Nyiri A, Nagy J, Frank K, Nagy T, Stéger V, Schiller M, Lakatos P, Sugár L, Horn P, Barta E, Orosz L*. The red deer Cervus elaphus genome CerEla1.0: sequencing, annotating, genes, and chromosomes. Mol Genet Genomics. 2018 Jun;293(3):665-684. doi: 10.1007/s00438-017-1412-3. Epub 2018 Jan 2. PubMed PMID: 29294181.

Taller D, Bálint J, Gyula P, Nagy T, Barta E, Baksa I, Szittya G, Taller J, Havelda Z*. Expansion of Capsicum annuum fruit is linked to dynamic tissue-specific differential expression of miRNA and siRNA profiles. PLoS One. 2018;13(7):e0200207. doi: 10.1371/journal.pone.0200207. eCollection 2018. PubMed PMID: 30044813; PubMed Central PMCID: PMC6059424.

Frank K, Molnár J, Barta E, Marincs F*. The full mitochondrial genomes of Mangalica pig breeds and their possible origin. Mitochondrial DNA B Resour. 2017 Oct 17;2(2):730-734. doi: 10.1080/23802359.2017.1390415. PubMed PMID: 33473962; PubMed Central PMCID: PMC7800509.

Molnár J, Nagy T, Stéger V, Tóth G, Marincs F, Barta E*. Genome sequencing and analysis of Mangalica, a fatty local pig of Hungary. BMC Genomics. 2014 Sep 5;15:761. doi: 10.1186/1471-2164-15-761. PubMed PMID: 25193519; PubMed Central PMCID: PMC4162939.

Marincs F*, Molnár J, Tóth G, Stéger V, Barta E. Introgression and isolation contributed to the development of Hungarian Mangalica pigs from a particular European ancient bloodline. Genet Sel Evol. 2013 Jul 1;45:22. doi: 10.1186/1297-9686-45-22. PubMed PMID: 23815680; PubMed Central PMCID: PMC3704957.

Barta E*, Sebestyén E, Pálfy TB, Tóth G, Ortutay CP, Patthy L. DoOP: Databases of Orthologous Promoters, collections of clusters of orthologous upstream sequences from chordates and plants. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D86-90. doi: 10.1093/nar/gki097. PubMed PMID: 15608291; PubMed Central PMCID: PMC540051.

Barta E*, Kaján L, Pongor S. IS: a web-site for intron statistics. Bioinformatics. 2003 Mar 1;19(4):543. doi: 10.1093/bioinformatics/btg019. PubMed PMID: 12611812.

Barta E*, Pintar A, Pongor S. Repeats with variations: accelerated evolution of the Pin2 family of proteinase inhibitors. Trends Genet. 2002 Dec;18(12):600-3. doi: 10.1016/s0168-9525(02)02771-3. PubMed PMID: 12446136.