[Mayan EDMS: 1892] Bulk Metadata creation

[Mayan EDMS: 1892] Bulk Metadata creation

Gerrit Van Dyk
Hi

I am trying to add a metadata field to every document that already has another metadata field.

When the code is run as a management command, it works most of the time, but in some cases it fails with "maximum recursion level reached" and the following logs:

lock_manager.managers <29572> [DEBUG] "acquire_lock() trying to acquire lock: document_indexing_task_do_rebuild_all_indexes"

lock_manager.managers <29572> [DEBUG] "acquire_lock() IntegrityError: duplicate key value violates unique constraint "lock_manager_lock_name_key"

DETAIL:  Key (name)=(document_indexing_task_do_rebuild_all_indexes) already exists.


What is the correct way of doing bulk metadata updates without running into the above problems?


The code is as follows:

from django.core.management.base import BaseCommand

from documents.models import Document
from metadata.models import MetadataType

class Command(BaseCommand):
    def handle(self, *args, **options):
        docs = Document.objects.all()
        for doc in docs:
            # Only touch documents that already carry 'field1'
            if doc.metadata.filter(metadata_type__name='field1').exists():
                md = MetadataType.objects.get(name="field2")
                doc.metadata.create(metadata_type=md, value="123")

--

---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
[Mayan EDMS: 1894] Re: Bulk Metadata creation

rosarior
Administrator
Hi,

The code looks good; the issue is with the indexing that the metadata updates trigger. What version are you using?
The indexing code was rewritten in 2.3 to use less locking.

For the time being, you can disable the indexes at the start of the code and re-enable them at the end, and it should work.

from django.core.management.base import BaseCommand

from documents.models import Document
from document_indexing.models import Index  # Added
from metadata.models import MetadataType

class Command(BaseCommand):
    def handle(self, *args, **options):
        Index.objects.update(enabled=False)  # Disable all indexes

        docs = Document.objects.all()
        for doc in docs:
            if doc.metadata.filter(metadata_type__name='field1').exists():
                md = MetadataType.objects.get(name="field2")
                doc.metadata.create(metadata_type=md, value="123")

        Index.objects.update(enabled=True)  # Enable all indexes
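
One thing the snippet above leaves open is what happens if the loop raises partway through: the final update never runs and the indexes stay disabled. A minimal sketch of one way to guard against that, assuming only that the manager exposes a Django-style update(**fields) method as Index.objects does. The names indexes_disabled, FakeIndexManager and run_demo are hypothetical and not part of Mayan EDMS; the fake manager exists only so the sketch runs without a Mayan install.

```python
from contextlib import contextmanager


@contextmanager
def indexes_disabled(index_manager):
    """Disable all indexes for the duration of the block, then re-enable them.

    `index_manager` is assumed to be anything with a Django-style
    `update(**fields)` method, for example `Index.objects`.
    """
    index_manager.update(enabled=False)
    try:
        yield
    finally:
        # Runs even if the bulk update raises, so indexes are not left off.
        index_manager.update(enabled=True)


class FakeIndexManager:
    """Hypothetical stand-in for `Index.objects`, used only for the demo."""

    def __init__(self):
        self.enabled = True
        self.calls = []

    def update(self, **fields):
        self.calls.append(fields)
        self.enabled = fields["enabled"]


def run_demo():
    """Simulate a failure in the middle of the bulk update."""
    manager = FakeIndexManager()
    try:
        with indexes_disabled(manager):
            raise RuntimeError("simulated failure mid-update")
    except RuntimeError:
        pass
    return manager
```

Inside handle(), the loop would then sit under "with indexes_disabled(Index.objects):". Like the original snippet, this re-enables every index at the end, including any that were already disabled before the command ran.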



In reply to Gerrit Van Dyk's message of Friday, July 14, 2017 at 10:03:34 AM UTC-4.

[Mayan EDMS: 1992] Re: Bulk Metadata creation

Gerrit Van Dyk
Hi,

Thanks for your reply.

I have implemented the code as you said, and the errors did disappear.

How do I force an index rebuild after all the changes have been made? I don't see the changes reflected in the indexes.

Gerrit

In reply to Roberto Rosario's message of Friday, July 14, 2017 at 7:24:37 PM UTC+2.

[Mayan EDMS: 1999] Re: Bulk Metadata creation

rosarior
Administrator
Yes, since the indexes are disabled during the metadata update, they will not pick up the changes. You can force the rebuild by calling the .rebuild() method of each Index model instance.

from document_indexing.models import Index

for index in Index.objects.all():
    index.rebuild()

Since this calls the rebuild synchronously, each call blocks and can take a while to complete if the index is big.

You can call the index rebuild as a background task as follows:

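from document_indexing.models import Index
from document_indexing.tasks import task_rebuild_index

for index in Index.objects.all():
    # Queue the rebuild instead of running it in this process
    task_rebuild_index.apply_async(kwargs=dict(index_id=index.pk))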
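
The two variants above differ only in whether the rebuild runs inline or is handed to a queue, so they can share one loop. A rough sketch of that idea, not Mayan code: rebuild_indexes and FakeIndex are hypothetical names, and the only things assumed about an index are the .pk attribute and .rebuild() method used in the snippets above.

```python
def rebuild_indexes(indexes, enqueue=None):
    """Rebuild each index, either inline or through a caller-supplied queue.

    indexes: iterable of objects with `.pk` and `.rebuild()`.
    enqueue: optional callable taking an index id. When given, the rebuild
             is handed to it instead of being run in this process.
    Returns the list of index ids that were handled.
    """
    handled = []
    for index in indexes:
        if enqueue is None:
            index.rebuild()    # synchronous: blocks until this index is done
        else:
            enqueue(index.pk)  # asynchronous: returns once the work is queued
        handled.append(index.pk)
    return handled


class FakeIndex:
    """Hypothetical stand-in for an Index instance, used only for the demo."""

    def __init__(self, pk):
        self.pk = pk
        self.rebuilt = False

    def rebuild(self):
        self.rebuilt = True
```

With Mayan, the background variant would presumably be called as rebuild_indexes(Index.objects.all(), enqueue=lambda pk: task_rebuild_index.apply_async(kwargs=dict(index_id=pk))), reusing the call shown above.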


In reply to Gerrit Van Dyk's message of Thursday, August 10, 2017 at 10:07:40 AM UTC-4.