[Mayan EDMS: 2372] Document date and title

[Mayan EDMS: 2372] Document date and title

Raul Garcia Sanchez
Hi guys,

I am pretty new to Mayan EDMS.
So far I have only been testing how to install it and what hardware would be a good match for it.

I am now at a state where I have installed EDMS 3.0 and am wondering about the following:

1.) How can I create an index by year and month of the document date itself?
2.) How can I set the document title so that it is built from, for example, invoice number, date and company?
3.) How can I create a trigger that automatically assigns tags to documents and adds them to a special cabinet, all based on keywords that match the OCR result?

Thanks for your help :)

--

---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

[Mayan EDMS: 2382] Re: Document date and title

Michael Price
Hi,

How are you liking version 3.0?

Here are the answers to your questions.
1. The file creation date is lost during the upload. These are operating system fields and do not persist during upload via web. It could be possible to retain these values if the document is uploaded via a watch folder or staging folder. Since these methods open the file to be loaded into Mayan directly from the operating system, the file creation date could be read.

2. It is not possible with the normal installation. However, I found this online: https://gitlab.com/mayan-edms/document_renaming. It seems to do what you want. I haven't tried it and it looks outdated. If there is enough interest we could add something like this in our fork of Mayan.

3. Not yet possible at the level you want, but it is getting there. There is a workflow feature called triggers and another called actions. These allow you to create a workflow that responds (trigger) to an event (OCR finished) and performs an action (tag the document, move it to the cabinet). The problem is that the triggers and actions are static. You can't program any kind of intelligence into them. There is no way to add a decision (which folder based on what OCR content). We have been talking about solving this with what we call workflow filters. The specs are still in the design phase, as we don't want to create a whole separate programming language for this. Eric is still particularly interested in this (he wants to auto-tag documents based on OCR content), so this will get done as soon as we figure out the design.
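
To make the missing "decision" step from point 3 a bit more concrete, here is a minimal sketch in plain Python. It is not Mayan code and not the planned workflow filter design. The Rule class, the decide function and the tag and cabinet names are all made up for illustration. It only shows the idea of choosing tags and cabinets from keywords found in the OCR text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    """Hypothetical rule: if any keyword occurs in the OCR text, apply it."""
    keywords: list
    tags: list = field(default_factory=list)
    cabinet: Optional[str] = None


def decide(ocr_text, rules):
    """Return (tags, cabinets) selected by every rule that matches the text."""
    text = ocr_text.casefold()  # case-insensitive keyword matching
    tags = set()
    cabinets = set()
    for rule in rules:
        if any(keyword.casefold() in text for keyword in rule.keywords):
            tags.update(rule.tags)
            if rule.cabinet is not None:
                cabinets.add(rule.cabinet)
    return tags, cabinets


# Example rules; the names are assumptions, not existing Mayan objects.
EXAMPLE_RULES = [
    Rule(keywords=["invoice", "rechnung"], tags=["invoice"], cabinet="Accounting"),
    Rule(keywords=["contract"], tags=["legal"]),
]
```

Calling decide("INVOICE No. 42", EXAMPLE_RULES) selects the "invoice" tag and the "Accounting" cabinet. Text without any of the keywords selects nothing, so the document is left untouched.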

[Mayan EDMS: 2403] Re: Document date and title

Raul Garcia Sanchez
Hi Michael,

Thanks for your response.
I finally got my server in place and am ready to rock now.

Regarding my questions:

1) I was thinking more of something like what paperless does (https://github.com/danielquinn/paperless).
There the date is extracted from the OCR content and used as the document date, which is a very nice feature.

2) You are right, the addon is outdated.

3) That would definitely be nice. The way it is right now, the OCR content is only used for searching. However, there is so much more you could do with it, like searching for keywords to trigger automatic assignment to a cabinet, or getting the document date, etc.

Re: [Mayan EDMS: 2407] Re: Document date and title

Michael Price
Hello,

1) They must be using a regular expression feature to extract the data. I must warn that it is never a good idea to rely on OCR output for data. OCR is one of those features that will never work 100%. If you do, have some fallback logic to avoid adding garbage data. We could add a post-OCR processing step to support a feature like this. The best place for that would be the workflow engine. I think there are post-OCR triggers. We would need a regular expression workflow action.

2) A pity. It looks very interesting. I have a lot on my plate but will take a look to see how difficult it would be to add this as a standard app.

3) We added Filters to Paperattor, which work a bit like SmartLinks. We are looking into reusing this method to add workflow trigger filters. This means that you can make a workflow trigger the transition and add tags or extract OCR data for metadata only if a certain condition programmed in the workflow filter is met.
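
For point 1, the regular expression plus fallback idea could look roughly like the sketch below. This is plain Python, not the paperless implementation and not an existing Mayan workflow action. The function name, the two date patterns and the fallback parameter are assumptions for illustration, and real documents will need more formats.

```python
import re
from datetime import date

# Hypothetical patterns: ISO (2018-03-23) and dotted European (23.03.2018).
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
DOTTED_DATE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def extract_document_date(ocr_text, fallback=None):
    """Return the first valid date found in the text, else the fallback."""
    candidates = []
    for match in ISO_DATE.finditer(ocr_text):
        year, month, day = (int(part) for part in match.groups())
        candidates.append((match.start(), year, month, day))
    for match in DOTTED_DATE.finditer(ocr_text):
        day, month, year = (int(part) for part in match.groups())
        candidates.append((match.start(), year, month, day))
    # Try candidates in the order they appear in the text.
    for _, year, month, day in sorted(candidates):
        try:
            return date(year, month, day)
        except ValueError:
            # OCR garbage such as 31.02.2018 is skipped, not stored.
            continue
    return fallback
```

The fallback could be the upload date, so a misread document still gets a usable value. The year and month of the returned date are what an index by year and month would be built from.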

Keep the ideas and use cases coming; they give us a good roadmap to develop the next set of features.

[Mayan EDMS: 2426] Re: Document date and title

Raul Garcia Sanchez
In reply to this post by Raul Garcia Sanchez
Hi Robert,

What do you think about my question 1?
Would it be possible to implement such a feature in Mayan with a few lines of code?
This information could, for example, be used for sorting the documents.
Sorting them by document name doesn't make a lot of sense, if you ask me.

Looking forward to your feedback.


Re: [Mayan EDMS: 2427] Re: Document date and title

Matthias Löblich
In reply to this post by Michael Price
Hello,

1) Please take a look at my Mayan extension: https://gitlab.com/startmat/document_analyzer


Best regards,
Matthias
