The fastest way to transfer data from PostgreSQL to MS SQL

Once I needed to regularly fetch relatively large amounts of data into MS SQL from PostgreSQL. Suddenly it turned out that the most obvious way, through Linked Server to native ODBC to PostgreSQL, is very slow.






History of the issue

At the prototyping stage, everything was fine. Simply because only a few thousand records were typed. As soon as we moved on to development, the suspicion immediately arose that something was wrong with the performance:





SET STATISTICS TIME ON
DECLARE
  @sql_str nvarchar(max)

DROP TABLE IF EXISTS #t
CREATE TABLE #t (
  N int,
  T datetime
)

SELECT @sql_str='
  SELECT N, T
  FROM generate_series(1,1000,1) N
  CROSS JOIN generate_series($$2020-01-01$$::timestamp,
    $$2020-12-31$$::timestamp, $$1 day$$::interval) T'
INSERT #t (N, T)
EXEC (@sql_str) AT LINKED_SERVER_TO_POSTGRES
      
      



366 :





SQL Server Execution Times:
   CPU time = 8187 ms,  elapsed time = 14793 ms.
      
      



, - ODBC. MS bcp Linux. bcp , PostgreSQL :





SET STATISTICS TIME ON
DECLARE
  @sql_str        nvarchar(max),
  @proxy_account  sysname='proxy_account',
  @proxy_password sysname='111111'

DROP TABLE IF EXISTS ##t
CREATE TABLE ##t (
  N int,
  T datetime
)
SELECT @sql_str='
  COPY (
    SELECT N, T
    FROM generate_series(1,1000,1) N
    CROSS JOIN generate_series($$2020-01-01$$::timestamp,
      $$2020-12-31$$::timestamp, $$1 day$$::interval) T )
  TO PROGRAM $pgm$ tmp_file=$'+'(mktemp /tmp/pgsql_bcp_to_mssql.XXXXXXXXX); '
    +'cat > $tmp_file; /opt/mssql-tools/bin/bcp ''##t'' '
    +'in $tmp_file -S '+REPLACE(@@SERVERNAME,'','\')
    +' -U '+@proxy_account+' -P '''
    +@proxy_password+''' -c -b 10000000 -a 65535; '
    +'rm $tmp_file $pgm$ NULL $nil$$nil$;'
EXEC (@sql_str) AT LINKED_SERVER_TO_POSTGRES
      
      



, :





SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 881 ms.
      
      



, . , bcp Linux Kerberos. .





, bcp . . .





, SQL , . . .





, , . SQL.





:





DECLARE
  @sql_str        nvarchar(max),
  @proxy_account  sysname='proxy_account',
  @proxy_password sysname='111111'

SELECT @sql_str='
  DROP TABLE IF EXISTS ##proxy_table_'+CONVERT(nvarchar(max),@@SPID)+'
  CREATE TABLE ##proxy_table_'+CONVERT(nvarchar(max),@@SPID)+' (
    N int,
    T datetime
  )'
EXEC (@sql_str)

SELECT @sql_str='
  COPY (
    SELECT N, T
    FROM generate_series(1,1000,1) N
    CROSS JOIN generate_series($$2020-01-01$$::timestamp,
      $$2020-12-31$$::timestamp, $$1 day$$::interval) T )
  TO PROGRAM $pgm$ tmp_file=$'+'(mktemp /tmp/pgsql_bcp_to_mssql.XXXXXXXXX); '
    +'cat > $tmp_file; /opt/mssql-tools/bin/bcp ''##proxy_table_'''
    +CONVERT(nvarchar(max),@@SPID)+' '
    +'in $tmp_file -S '+REPLACE(@@SERVERNAME,'\','\\')
    +' -U '+@proxy_account+' -P '''
    +@proxy_password+''' -c -b 10000000 -a 65535; '
    +'rm $tmp_file $pgm$ NULL $nil$$nil$;'
EXEC (@sql_str) AT LINKED_SERVER_TO_POSTGRES
      
      



PostgreSQL COPY . sh. COPY, , , mktemp. , bcp , .





, COPY , bcp, COPY NULL $nil$$nil$





bcp:





  • -c - , PostgreSQL MS SQL ;





  • -b - , . . , , , ;





  • -a - . . , , .





If someone knows a faster way to get data in MS SQL from PostgreSQL - I will be very glad to see a description of this method in the comments.








All Articles