python - Django - MySQL IntegrityError 1062
I have a cron job that downloads data into MySQL every night at 3. I can test the connection and the download, and it works. Sometimes the download partially fails. If I try to re-run the .py script, it barks with a duplicate entry error for key 2.
I want to be able to run a script that erases the previous night's entries so I can rerun the script and have it repopulate the db. There are 3 other tables tied to this one. Is Django going to complain if I create a SQL script that deletes yesterday's records? Will it automatically delete the necessary rows in the other tables, or should I handle that in the script as well?
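From what I understand, an ORM-level delete in Django cascades to rows pointing at the deleted records via ForeignKey, while a raw SQL DELETE only cascades if the database constraints themselves are ON DELETE CASCADE. So I was thinking of something like this (model, field, and import path are made up):

    import datetime
    from greaseboard.models import Patient  # hypothetical import path

    yesterday = datetime.date.today() - datetime.timedelta(days=1)
    # .delete() on a queryset also deletes related rows in the dependent
    # tables, since Django emulates ON DELETE CASCADE at the ORM level
    Patient.objects.filter(created__gte=yesterday).delete()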
Traceback (most recent call last):
  File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python2.6/site-packages/django/core/management/__init__.py", line 443, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python2.6/site-packages/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python2.6/site-packages/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/usr/local/lib/python2.6/site-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/usr/local/django/grease/greaseboard/management/commands/import_patients.py", line 27, in handle
    mrn = row.mrn,
  File "/usr/local/lib/python2.6/site-packages/django/db/models/manager.py", line 134, in get_or_create
    return self.get_query_set().get_or_create(**kwargs)
  File "/usr/local/lib/python2.6/site-packages/django/db/models/query.py", line 449, in get_or_create
    obj.save(force_insert=True, using=self.db)
  File "/usr/local/lib/python2.6/site-packages/django/db/models/base.py", line 463, in save
    self.save_base(using=using, force_insert=force_insert, force_update=force_update)
  File "/usr/local/lib/python2.6/site-packages/django/db/models/base.py", line 551, in save_base
    result = manager._insert([self], fields=fields, return_id=update_pk, using=using, raw=raw)
  File "/usr/local/lib/python2.6/site-packages/django/db/models/manager.py", line 203, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/usr/local/lib/python2.6/site-packages/django/db/models/query.py", line 1576, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/local/lib/python2.6/site-packages/django/db/models/sql/compiler.py", line 910, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python2.6/site-packages/django/db/backends/util.py", line 40, in execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python2.6/site-packages/django/db/backends/mysql/base.py", line 114, in execute
    return self.cursor.execute(query, args)
  File "/usr/local/lib/python2.6/site-packages/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/usr/local/lib/python2.6/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
django.db.utils.IntegrityError: (1062, "Duplicate entry '000xxxxxxxx' for key 2")
I have no idea how large your working set is, but for a similar problem I used transactions alongside savepoints - https://docs.djangoproject.com/en/dev/topics/db/transactions/#savepoints

So, given something like:
    from django.db import transaction

    # exact transaction API depends on your Django version
    transaction.set_autocommit(False)
    try:
        for raw_line in streaming_filelikeobject:  # iterate the file line by line
            product = do_work(raw_line)
            MyTable(**product).save()
    except SomeFileIOError:
        transaction.rollback()
    else:
        transaction.commit()
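If you want a partially failed run to skip already-inserted rows instead of aborting, savepoints (from the link above) let you roll back just the one failed insert and keep going. A sketch, not tested, reusing the same made-up names:

    from django.db import transaction, IntegrityError

    transaction.set_autocommit(False)
    try:
        for raw_line in streaming_filelikeobject:
            product = do_work(raw_line)
            sid = transaction.savepoint()
            try:
                MyTable(**product).save()
            except IntegrityError:
                # duplicate entry: undo only this insert, keep the batch alive
                transaction.savepoint_rollback(sid)
            else:
                transaction.savepoint_commit(sid)
    except SomeFileIOError:
        transaction.rollback()
    else:
        transaction.commit()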
Another idea is to have a batch_id column and assign a new batch_id at the start of each batch, as sketched below.
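A minimal sketch of that, assuming you add a batch_id field to the model:

    import uuid

    batch_id = uuid.uuid4().hex  # fresh id at the start of each nightly run
    try:
        for raw_line in streaming_filelikeobject:
            product = do_work(raw_line)
            MyTable(batch_id=batch_id, **product).save()
    except SomeFileIOError:
        # the run failed part-way: wipe only this batch, then re-run safely
        MyTable.objects.filter(batch_id=batch_id).delete()
        raise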
For very large data sets, you can use memcache/redis for managing an inventory of what has already been imported:
    import redis
    from django.db import transaction

    redis_conn = redis.StrictRedis()

    transaction.set_autocommit(False)
    try:
        for raw_line in streaming_filelikeobject:
            product = do_work(raw_line)
            # only insert rows we have not seen before
            if redis_conn.sadd("my_input_set", product['some_unique_id']):
                MyTable(**product).save()
    except SomeFileIOError:
        transaction.rollback()
    else:
        transaction.commit()
The Redis .sadd() command returns 1 (truthy) if the element doesn't already exist in the set, and 0 if it does.
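For example, with a made-up id:

    >>> redis_conn.sadd("my_input_set", "000123")
    1   # new member: added, so the row gets saved
    >>> redis_conn.sadd("my_input_set", "000123")
    0   # already present: falsy, so the row is skipped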
Please note: I'm typing this stuff off the top of my head, so the Django transaction method signatures may not be authoritative.