Publications
2024
Lopez-Fernandez, A.; Gómez-Vela, F.; Saz-Navarro, Dulcenombre M.; Delgado, F. M.; Rodríguez-Baena, D. Optimized Python library for reconstruction of ensemble-based gene co-expression networks using multi-GPU. Journal article. In: The Journal of Supercomputing, 2024, ISSN: 1573-0484.
Tags: Big Data, Bioinformatics, Data Mining, Gene co-expression network, GPU, High-Performance Computing
Abstract: Gene co-expression networks are valuable tools for discovering biologically relevant information within gene expression data. However, analysing large datasets presents challenges due to the identification of nonlinear gene–gene associations and the need to process an ever-growing number of gene pairs and their potential network connections. These challenges mean that some experiments are discarded because the techniques do not support these intense workloads. This paper presents pyEnGNet, a Python library that can generate gene co-expression networks in High-performance computing environments. To do this, pyEnGNet harnesses CPU and multi-GPU parallel computing resources, efficiently handling large datasets. These implementations have optimised memory management and processing, delivering timely results. We have used synthetic datasets to prove the runtime and intensive workload improvements. In addition, pyEnGNet was used in a real-life study of patients after allogeneic stem cell transplantation with invasive aspergillosis and was able to detect biological perspectives in the study.

Lopez-Fernandez, A.; Gómez-Vela, F.; González-Domínguez, J.; Bidare-Divakarachari, P. bioScience: A new python science library for high-performance computing bioinformatics analytics. Journal article. In: SoftwareX, vol. 26, pp. 101666, 2024, ISSN: 2352-7110.
Tags: Bioinformatics, Data analysis, Data Mining, Data science, High-Performance Computing
Abstract: BioScience is an advanced Python library designed to satisfy the growing data analysis needs in the field of bioinformatics by leveraging High-Performance Computing (HPC). This library encompasses a vast multitude of functionalities, from loading specialized gene expression datasets (microarrays, RNA-Seq, etc.) to preprocessing techniques and data mining algorithms suitable for this type of datasets. BioScience is distinguished by its capacity to manage large amounts of biological data, providing users with efficient and scalable tools for the analysis of genomic and transcriptomic data through the use of parallel architectures for clusters composed of CPUs and GPUs.