Another Win 10 quirk - and how to deal with it

Now and then you need to brush up your skills, and studying is good for the brain anyway. That is why I recently signed up for a course on Python and its frameworks. The other day we got to Django. And there, in the middle of the coursework, we collectively caught not just a bug but a marvelous effect at the junction of Python 3, SQLite 3, JSON and Win 10. The effect was so marvelous that Google did not help us: the whole group had to get together with the teacher and solve it with our collective mind.





Here is what happened. We were studying databases (Django uses SQLite 3 out of the box) and, so as not to type the data in by hand every time, we set up a script that loads it from JSON files. The data was dumped from the database into those files with Django's own management command:





python manage.py dumpdata -e contenttypes -o db.json
      
      



Suddenly, those working under Windows (I can't vouch for all versions; in our group only Win 10 users turned up) found that their dumps came out in windows-1251 encoding. What's more, JSON files in this encoding loaded into the database perfectly well. But as soon as they were converted to UTF-8, the standard encoding for SQLite 3, Python 3 and above all JSON, then at best the Cyrillic in the database turned into gibberish, and at worst the whole data loading process broke down.
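The two failure modes are easy to reproduce without Django or a database. The snippet below is only an illustration: the sample string and the variable names are mine, not from the course project.

```python
# Cyrillic sample text; any non-ASCII string would do.
text = "Привет"

# The same text produces different bytes under the two encodings.
cp1251_bytes = text.encode("cp1251")
utf8_bytes = text.encode("utf-8")

# UTF-8 bytes read back as windows-1251 decode without an error,
# but every Cyrillic letter turns into two wrong characters.
garbled = utf8_bytes.decode("cp1251")

# windows-1251 bytes read back as UTF-8 are not valid UTF-8 at all.
try:
    cp1251_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

print(garbled)
print(strict_decode_failed)
```

The first case matches the "gibberish in the database" symptom, the second matches the loader falling over completely.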





Nothing of the kind could be found in the documentation or anywhere else on Google, English-language results included. Most mysterious of all, entering the same data manually through the console or the project's admin panel worked like clockwork, although the encoding there was definitely UTF-8. Explicitly setting the encoding for the database had no effect either.





We assumed the cause was the way JSON files interact with the operating system: somehow, when the files were written and read, the system imposed its own encoding instead of the normal one. And indeed, once UTF-8 was set explicitly when opening the file:





open(os.path.join(JSON_PATH, file_name + '.json'), 'r', encoding="utf-8")
      
      



what got into the database was not mojibake but normal Russian letters. But the dump itself cannot be fixed this way, and re-encoding it by hand afterwards is somehow not our way either.
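If a dump has already come out in windows-1251, the re-encoding at least does not have to be done by hand. Below is a minimal sketch using only the standard library. The function names reencode_dump and load_dump are hypothetical, and it assumes the source file really is cp1251.

```python
import json
from pathlib import Path


def reencode_dump(src, dst, src_encoding="cp1251"):
    """Rewrite a JSON dump saved in src_encoding as a UTF-8 file."""
    # Decode with the encoding the dump was actually written in.
    data = json.loads(Path(src).read_text(encoding=src_encoding))
    # ensure_ascii=False keeps Cyrillic readable instead of \uXXXX escapes.
    Path(dst).write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return data


def load_dump(path):
    """Read a JSON dump as UTF-8 regardless of the system locale."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

A call such as reencode_dump("db.json", "db_utf8.json") would leave the original dump untouched and write a UTF-8 copy next to it. This treats the symptom only; it does nothing about the dump being created in the wrong encoding.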





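The fix turned out to be in the Windows settings, in the system locale:

  • Time & Language, Region, Additional date, time & regional settings:

  • Region (Change date, time, or number formats):

  • the "Administrative" tab, the "Change system locale" button:

  • tick the checkbox "Beta: Use Unicode UTF-8 for worldwide language support".

Windows asks for a restart after the system locale is changed.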
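A quick way to see which encoding Python picks up from the system is to ask it directly. This is a small sketch, not something from the course. The function name default_text_encoding is hypothetical, and sys.flags.utf8_mode assumes Python 3.7 or newer.

```python
import locale
import sys


def default_text_encoding():
    """Return the locale encoding open() falls back to without encoding=."""
    # False means: report the locale's encoding without changing the locale.
    return locale.getpreferredencoding(False)


print("open() default encoding:", default_text_encoding())
# 1 when Python runs in UTF-8 mode (python -X utf8 or PYTHONUTF8=1).
print("UTF-8 mode:", sys.flags.utf8_mode)
```

On a Russian-locale Windows with default settings I would expect the first line to report cp1251, which is consistent with the dumps we were getting. Python's UTF-8 mode, mentioned in the comment, is a per-process alternative to a system-wide setting. We did not try it in the course, so treat it as an untested suggestion.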







