Speed ​​up loading large volumes in PostgreSQL using COPY from STDIN binary

Foreword





I am learning PostgreSQL at home and really enjoy processing large amounts of data. I am writing in C / C ++ on the Qt framework. Unfortunately the Qt post driver does not support the functionality required for fast loading. Therefore, I wrote my library in C ++ for this, and now I want to share with you this wonderful method of adding and the library itself.









Hello $ username!





Today we will talk about fast data loading into PostgreSQL DBMS (hereinafter referred to as `postik`). We will do this through the COPY mechanism with the transfer of data over the network in a binary format.









First of all, consider the advantages of this method of adding:





  • Very fast adding speed





    Everything is due to the fact that we remove the need for data processing (various transformations) to a minimum, the post only has to check whether we are using the correct format.





  • We do not lose data, in contrast to the text format.





    For example, how can this happen with a double number. We will not need to figure out how many digits before and after the decimal point in this method. The data is transmitted `as is`.









, . , .. . , β€” β€œlibpq-fe.h”. - /++.





:





[ ]





{ }





{ }





{ }





. . .





{ }





[ ]













:





  • COPY-





'P','G','C','O','P','Y','\n','\377','\r','\n','\0'
      
      







'\0','\0','\0','\0'
      
      



, OID – 16- 1









'\0','\0','\0','\0'
      
      



, . , .









  • ( 13.1 ), .













:









int16_t , , . , , , .





  • :





    1)





    , ,





    2)









, :





.





int64_t , . ( ). , - . , COPY TO



, .













0xff, 0xff
      
      



. , β€” , ( -1).





  • , β€”





string conninfo = 
  "host=127.0.0.1 port=5432 dbname=postgres user=postgres password=postgres connect_timeout=10";
PGconn *conn = PQconnectdb(conninfo.c_str());
// conninfo -     ( connect_timeout    )

      
      



  • COPY-





pg_result *res = PQexec(conn, cquery);
      
      



cquery COPY . COPY testtable5 ( col1, col2, col3, col4 ) FROM STDIN (format binary);















PQputCopyData(conn, buf, currentSize);
      
      



, buf – , currentSize β€” .









, . . 2-128 .





  • ,





PQputCopyEnd(conn, NULL);
      
      







!





.





? , int16_t tmp = 2;



: 0x02, 0x00



0x00, 0x02



. . SPARC . , SPARC-, ( )





.

Qt:





db.open();
QSqlQuery query(db);
query.prepare("insert into testtable5 ( col1, col2, col3, col4 ) values (?,?,?,?);");
for(int i=0; i<20000000; i++)
{
    query.addBindValue("column1");
    query.addBindValue(double(12983712987.4383453947384734853872837));
    query.addBindValue(int(12345678));
    query.addBindValue(float(123.4567));
    query.exec();
}
      
      







( ) 10.000 - Y, - - X.









COPY INSERT .





- INSERT- .





- INSERT- .





- COPY- .





- COPY- .





INSERT .





- INSERT- .





- INSERT- .





COPY .





- COPY- .





- COPY- .





.





:





10.000 ( ), :





12.620 INSERT





12.050 INSERT





150 COPY





120 COPY





... COPY. , .





β€” . , , , .





GitHub





COPY









P.S.: , .





.








All Articles