Browsing by Subject "APRENDIZAJE PROFUNDO"

Now showing items 1-3 of 3

A study of checkpointing in large scale training of deep neural networks

Rojas, Elvis; Kahira, Albert Njoroge; Meneses, Esteban; Bautista-Gomez, Leonardo; Badia, Rosa M (arXiv.org, 2021-03-29)

Deep learning (DL) applications are increasingly being deployed on HPC systems to leverage the massive parallelism and computing power of those systems. While significant effort has been put to facilitate distributed ...

Exploring the effects of silent data corruption in distributed deep learning training

Rojas, Elvis; Pérez, Diego; Meneses, Esteban (Institute of Electrical and Electronics Engineers (IEEE), 2022-11-02)

The profound impact of recent developments in artificial intelligence is unquestionable. The applications of deep learning models are everywhere, from advanced natural language processing to highly accurate prediction of ...

Understanding soft error sensitivity of deep learning models and frameworks through checkpoint alteration

Rojas, Elvis; Pérez, Diego; Calhoun, Jon; Bautista-Gomez, Leonardo; Jones, Terry; Meneses, Esteban (Institute of Electrical and Electronics Engineers (IEEE), 2021-10-13)

The convergence of artificial intelligence, high-performance computing (HPC), and data science brings unique opportunities for markedly advancing discoveries that leverage synergies across scientific domains. Recently, deep ...