Commun. Comput. Phys., 16 (2014), pp. 599-611.
Published online: 2014-12
The implicit 2D3V particle-in-cell (PIC) code developed to study the interaction of ultrashort pulse lasers with matter [G. M. Petrov and J. Davis, Comput. Phys. Commun. 179, 868 (2008); Phys. Plasmas 18, 073102 (2011)] has been parallelized using MPI (Message Passing Interface). The parallelization strategy is optimized for a small number of computer cores, up to about 64. Details of the algorithm implementation are given, with emphasis on code optimization by overlapping computations with communications. Performance evaluation for 1D domain decomposition was carried out on a small Linux cluster with 64 computer cores for two typical regimes of PIC operation: "particle dominated", in which the bulk of the computation time is spent pushing particles, and "field dominated", in which computing the fields is prevalent. For fewer than 32 cores, the MPI implementation offers a significant numerical speed-up: in the "particle dominated" regime it is close to the theoretical maximum, while in the "field dominated" regime it reaches about 75-80% of the maximum. Beyond 32 cores, performance degrades as a consequence of the adopted 1D domain decomposition. The parallelization will allow future implementation of atomic physics and extension to three dimensions.
ISSN: 1991-7120
DOI: https://doi.org/10.4208/cicp.070813.280214a
URL: http://global-sci.org/intro/article_detail/cicp/7055.html
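The overlap of computation with communication that the abstract credits for the near-ideal speed-up can be illustrated with a minimal MPI sketch in C. This is not the authors' code: the slab sizes NX and NY, the column-major array layout, and the message tags are assumptions made for illustration. The sketch shows the general pattern for a 1D-decomposed field array: post non-blocking ghost-column exchanges, do the interior work that needs no ghost data while the messages are in flight, then finish the boundary columns after MPI_Waitall.

```c
/* Minimal sketch (not the published code): 1D domain decomposition with
 * computation/communication overlap. NX, NY and the layout are assumed. */
#include <mpi.h>
#include <stdlib.h>

#define NX 512   /* local slab width, interior columns (assumed) */
#define NY 1024  /* column height (assumed)                      */

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* 1D decomposition: neighbours along x; MPI_PROC_NULL at the edges. */
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Local slab with one ghost column on each side: (NX+2) x NY doubles.
     * Column i occupies f[i*NY .. i*NY + NY - 1]. */
    double *f = calloc((size_t)(NX + 2) * NY, sizeof *f);
    if (!f) MPI_Abort(MPI_COMM_WORLD, 1);

    MPI_Request req[4];
    /* Post the non-blocking halo exchange of the boundary columns. */
    MPI_Irecv(&f[0],             NY, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
    MPI_Irecv(&f[(NX + 1) * NY], NY, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &req[1]);
    MPI_Isend(&f[1 * NY],        NY, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &req[2]);
    MPI_Isend(&f[NX * NY],       NY, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[3]);

    /* Overlap: update interior columns (2 .. NX-1), which need no ghost
     * data, e.g. push particles that stay well inside the slab. */
    /* ... interior work here ... */

    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);

    /* Only now touch the two boundary columns that read the ghost cells. */
    /* ... boundary work here ... */

    free(f);
    MPI_Finalize();
    return 0;
}
```

With a 1D decomposition each rank owns a slab of columns, so the halo is a fixed NY-sized message regardless of core count; as more cores make the slabs thinner, the interior work available to hide the exchange shrinks while the message cost does not, which is consistent with the performance degradation the abstract reports beyond 32 cores.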