How much is the schedule

The basic data from the computational experiments on reorganizing the tiered-parallel form (YAPF, after the Russian abbreviation) of information graphs of algorithms (IGA) were given in the previous publication. This publication presents the final results of the research on building schedules for executing parallel programs, judged by two measures: the computational complexity of the transformation itself and the quality of the resulting schedules. This work is the outcome of a well-defined cycle of research in this area.





As mentioned earlier, the computational complexity (BT) is measured here in units of operator moves from tier to tier during the reorganization of the YAPF. This is close to the classical approach of estimating the complexity of sorting numeric arrays by counting element permutations; its drawback is that it ignores the cost of the procedures that choose which elements to move.
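The metric described above can be illustrated with a minimal Python sketch. The `TieredForm` class and its `move` method are hypothetical names introduced only for this illustration; they are not part of SPF@home. The sketch counts one unit per operator move and, like the metric itself, charges nothing for deciding which operator to move. It also does not check data dependencies between operators. A bubble sort exchange counter is included as the classical analogue.

```python
from dataclasses import dataclass


@dataclass
class TieredForm:
    """Tiers of operator ids; tier 0 runs first (hypothetical helper)."""
    tiers: list[list[int]]
    moves: int = 0  # operator moves performed so far

    def move(self, op: int, src: int, dst: int) -> None:
        """Move operator `op` from tier `src` to tier `dst`, counting one unit.

        Dependencies between operators are NOT checked in this sketch.
        """
        self.tiers[src].remove(op)
        self.tiers[dst].append(op)
        self.moves += 1


def bubble_sort_swaps(values: list[int]) -> int:
    """Classical analogue: number of exchanges made by bubble sort."""
    data = list(values)
    swaps = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swaps += 1
    return swaps


# Relieve the widest tier by moving two operators to later tiers.
form = TieredForm(tiers=[[1, 2, 3, 4], [5], [6]])
form.move(4, 0, 1)
form.move(3, 0, 2)
print(form.tiers, form.moves)
print(bubble_sort_swaps([3, 1, 2]))
```

In both counters the work spent on selecting the element to move (scanning tiers, comparing values) is invisible, which is exactly the limitation noted above.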





Because in the accepted model the YAPF actually determines the order in which the operators of a parallel program are executed (the operators run in groups, tier by tier), for brevity we will sometimes use the abbreviation “YAPF” as a synonym for the plan (schedule) for executing a parallel program. For obvious reasons, the studies were carried out on relatively small data sets, on the assumption that the results remain valid for larger ones. The studies described in this publication are meant to demonstrate what the available tools can do for the tasks at hand. If desired, an arbitrary algorithm can be investigated by describing and debugging it in the Data-Flow module and then importing it, as an information graph, into the SPF@home module for further processing.
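To make the link between a tiered form and a schedule concrete, here is a small Python sketch. It assumes the information graph is given as a list of directed edges `(producer, consumer)` and is acyclic; the function name `build_tiers` and the sample graph are illustrative assumptions, not SPF@home code. Each operator is placed on the earliest tier its predecessors allow, and the resulting list of tiers is read as a schedule, one tier per step.

```python
from collections import defaultdict


def build_tiers(edges: list[tuple[int, int]]) -> list[list[int]]:
    """Place every operator on the earliest tier its predecessors allow.

    Assumes the graph is acyclic and small enough for plain recursion.
    """
    preds: dict[int, set[int]] = defaultdict(set)
    nodes: set[int] = set()
    for src, dst in edges:
        preds[dst].add(src)
        nodes.update((src, dst))

    tier_of: dict[int, int] = {}

    def tier(node: int) -> int:
        # Operators without predecessors land on tier 0.
        if node not in tier_of:
            tier_of[node] = 1 + max((tier(p) for p in preds[node]), default=-1)
        return tier_of[node]

    tiers: dict[int, list[int]] = defaultdict(list)
    for node in sorted(nodes):
        tiers[tier(node)].append(node)
    return [tiers[i] for i in range(len(tiers))]


# Hypothetical graph: operators 1 and 2 feed 3; operators 3 and 4 feed 5.
schedule = build_tiers([(1, 3), (2, 3), (3, 5), (4, 5)])
for step, operators in enumerate(schedule):
    print(f"step {step}: run in parallel {operators}")
```

Operator 4 has no predecessors, so it lands on the first tier even though its result is needed only at the last step; reorganizing a tiered form exploits exactly this kind of slack.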





We continue to treat maximum code density (in effect, the maximum load on the individual computing units of the parallel system) as the main goal of the YAPF transformations. Incidentally, this is exactly what the well-known sarcastic remark about the excessive number of NOP instructions in the “bundles” of the very long instruction word of VLIW machines refers to: even in sections of completely sequential code, the empty slots of the long word must formally be filled with some operation, a “dummy”.
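A rough way to put numbers on code density is sketched below in Python. It assumes a machine with `width` parallel slots per step and a schedule given as a list of tiers; density is taken as the share of slots holding a real operator, and every empty slot is counted as one NOP. The function name `schedule_stats` and both sample schedules are assumptions made for this illustration.

```python
def schedule_stats(tiers: list[list[int]], width: int) -> tuple[float, int]:
    """Return (code density, number of NOP slots) for a tiered schedule.

    `width` is the assumed number of parallel slots available per step.
    """
    if not tiers or width < 1:
        raise ValueError("need at least one tier and a positive width")
    if any(len(tier) > width for tier in tiers):
        raise ValueError("every tier must fit into `width` slots")
    slots = width * len(tiers)                # total slots in the schedule
    used = sum(len(tier) for tier in tiers)   # slots holding real operators
    return used / slots, slots - used


# The same five operators scheduled in two ways.
wide_density, wide_nops = schedule_stats([[1, 2, 4], [3], [5]], width=3)
even_density, even_nops = schedule_stats([[1, 2], [4, 3], [5]], width=2)
print(f"width 3: density {wide_density:.2f}, NOPs {wide_nops}")
print(f"width 2: density {even_density:.2f}, NOPs {even_nops}")
```

Both schedules take three steps, but the second one needs a narrower machine and leaves fewer slots to be padded with dummy operations.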





, (   ), Lua . ( ).










    . ,   , .2 SPF@home (http://vbakanov.ru/spf@home/content/install_spf.exe). – , {k,l} ( ) ik,jk il,jl, i,j – ( , ; ).










( ) (, ) –   “1-01_bulldozer” vs “1-02_bulldozer”, - “WidthByWidtn” vs “Dichotomy”. , …





1.






– “1-01_bulldozer” “1-02_bulldozer”.





. 1-3; ( ):





  • a), b) ) – , (CV ),  ( ) ;





  • (), () - () – , “1-01_bulldozer”   “1-02_bulldozer” c.





Figure 1. … of order 2, 3, 5, 7, 10 (…)
Figure 2. … of order 5, 10, 15, 20 (…)
Figure 3. … (…) of order 2, 3, 4, 5, 7, 10 (…) (…)

. 1-3 , . ., . 1a) 1,7 ( “1-01_bulldozer”) 3 ( “1-02_bulldozer”) 10- .





(. 1b) 0,3 ( ) “1-02_bulldozer” , , .





(. 1c) “1-02_bulldozer” ( 3,7 10) “1-01_bulldozer”.










  “1-02_bulldozer” (. 2).





() 10 (. 3) . (. 3a), CV (. 3b), “1-02_bulldozer” (. 3c).










2.

VLIW- ( “”, “” ). .





  W ( W=W0 W=1, W0 – , ). – “Dichotomy” “WidthByWidtn”:





  • “Dichotomy”. – c W c    . W, ,   W. , “” ( ).





  • “WidthByWidtn”. N>W   , :





  ,  .





. 4,5  -     () ; “WidthByWidtn” “Dichotomy” . ,   “” .





Figure 4. …; orders 5 and 10; panels a) and b)
Figure 5. …; … (…) orders 5 and 10; panels a) and b)

. 4 5, ( , ,  !). , .





“ -” “WidthByWidtn” , “Dichotomy”; . “WidthByWidtn” , N./W. , N. – , W. – .





Figure 6. Number of operator moves between tiers (a) and coefficient of variation CV (b) as the YAPF width is reduced, for … of order 10 (the abscissa is the YAPF width after reorganization)
Figure 7. Number of operator moves between tiers (a) and coefficient of variation CV (b) as the YAPF width is reduced, for the algorithm that solves a 10th-order system of linear algebraic equations by the direct (non-iterative) Gaussian method (the abscissa is the YAPF width after reorganization)
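For readers unfamiliar with the coefficient of variation CV named in the caption above, the following Python sketch shows one common way to compute it for tier widths: the standard deviation of the widths divided by their mean. Using the population standard deviation (`statistics.pstdev`) is an assumption here; the exact variant used in the experiments is not stated in the surviving text. The function name `width_cv` is likewise illustrative.

```python
import statistics


def width_cv(tiers: list[list[int]]) -> float:
    """Coefficient of variation of tier widths: stdev / mean.

    Assumption: population standard deviation. 0.0 means all tiers
    have the same width, i.e. a perfectly even load.
    """
    widths = [len(tier) for tier in tiers]
    return statistics.pstdev(widths) / statistics.mean(widths)


uneven = width_cv([[1, 2, 4], [3], [5]])    # widths 3, 1, 1
evened = width_cv([[1, 2], [4, 3], [5]])    # widths 2, 2, 1
print(f"uneven: {uneven:.3f}, evened: {evened:.3f}")
```

A lower CV indicates tiers of more similar width, which is the kind of uniformity the reorganization aims for.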

, . 6 7, ( , – ).   . 6 7,   “WidthByWidtn” ( 3-4 ) ( ) “Dichotomy” ( ). , () “WidthByWidtn” “Dichotomy” ( ).





















Publications in this series:





  • (https://habr.com/ru/post/530078/, 26.11.2020)





  • (https://habr.com/ru/post/534722/, 24.12.2020)





  • (https://habr.com/ru/post/535926/, 03.01.2021)





  • The dynamics of the stream calculator (https://habr.com/ru/post/540122/, 01.02.2021)





  • Concurrency and code density (https://habr.com/ru/post/545498/, 05.03.2021)





  • How much is the schedule (https://habr.com/ru/post/551688/, 10.04.2021), the current publication







