[Mayan EDMS: 2115] Mayan's document_cache size on disk?


[Mayan EDMS: 2115] Mayan's document_cache size on disk?

Hans Fritz
I'm very surprised by the disk usage after uploading about 2GB of PDF files into Mayan. These PDFs are scanned images (grayscale, 300dpi) cleaned with textcleaner to make OCR work better. It comes down to about 1MB per page.

Mayan has been processing the uploads for the last 11 hours now (my server is a bit old), but what worries me is how much space it uses. The mayan/media/document_storage directory is 3.8GB (it's possible I have duplicates, I'd have to check once it's done processing), and the mayan/media/document_cache is 16GB and growing.
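To track the two directories while processing is still running, a small script can total the file sizes under each one. This is only a sketch: the paths are the ones quoted above, relative to wherever Mayan is installed, and the helper names `dir_size_bytes` and `human` are made up for illustration and are not part of Mayan.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # A file may disappear while the cache is still being written.
                continue
    return total


def human(n):
    """Format a byte count using 1024-based units."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1024 or unit == "TB":
            return f"{n:.1f} {unit}"
        n /= 1024


if __name__ == "__main__":
    # Paths taken from the post; adjust to the actual install location.
    for directory in ("mayan/media/document_storage", "mayan/media/document_cache"):
        print(directory, human(dir_size_bytes(directory)))
```

Running it a few times during processing shows whether the cache is still growing or has levelled off.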

What's in the cache that's taking so much space? Is it the JPEG thumbnails/preview of pages? If so, is there a setting somewhere for the JPEG quality? I don't need amazing quality for the previews, I'd rather save the space and processing time.

Thanks,

--

---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Re: [Mayan EDMS: 2116] Mayan's document_cache size on disk?

Jonathon Exley
It sounds like a race condition in the watch folder process. If the files in the watch folder have not finished being processed by the next time the watch folder is checked, the same files will be processed again, leading to multiple copies in the document storage.
You will need to increase the watch folder check interval so that all the files can be processed before the next check cycle starts. I have mine set to once a day, since it's a low-powered Raspberry Pi.
You should check your recent documents and remove any duplicates, after allowing some time for the documents to finish being processed. Usually checking the CPU utilisation is a good way to tell when the queue has emptied.
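One way to get a rough idea of how many duplicates there are is to hash every file in the storage directory and group the files by digest. This is a sketch only: it assumes the stored files are byte-identical copies of the uploads, and `file_sha256` and `find_duplicates` are illustrative names, not Mayan functions. It only reports candidates; the duplicates themselves should still be removed through Mayan, not by deleting files on disk.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each content digest to the files sharing it (two or more only)."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_hash[file_sha256(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}


if __name__ == "__main__":
    # Path taken from the original post; adjust to the actual install location.
    for digest, paths in find_duplicates("mayan/media/document_storage").items():
        print(digest[:12], len(paths), "copies")
        for path in paths:
            print("   ", path)
```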

Jonathon.
