[Mayan EDMS: 1744] OCR and document names

Douglas Van Es
hello. just installed mayan via docker, it's up and running, and it looks
like it is going to work great.

i've read through the documentation, but i do have one question before i
continue the set up of users, document types, etc. and roll this out to
our users.

will i be able to use OCR to grab an invoice number from a scanned or
emailed document and have mayan name the document based on the results of
the OCR?

would that be set up as a transformation, or some other way?

i am basically looking to really minimize the workload on our clerks who
will be scanning the invoices into mayan.

thank you all for your time, and the project looks amazing by the way!

doug van es

--

---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: [Mayan EDMS: 1773] OCR and document names

Douglas Van Es
i've now set up a couple of users, a group, a role, some metadata,
watched and staging folders.

test document uploads are working great.

after reading through the docs and website, i still can't figure out how
to set up OCR to capture an invoice number and rename the document based
on the result.

can anyone tell me if this is possible with mayan? any hints on how to
implement it?

thanks in advance!



On Sat, 27 May 2017 22:00:50 +0000, Douglas Van Es wrote:

> hello. just installed mayan via docker, it's up and running, and it looks
> like it is going to work great.
>
> i've read through the documentation, but i do have one question before i
> continue the set up of users, document types, etc. and roll this out to
> our users.
>
> will i be able to use OCR to grab an invoice number from a scanned or
> emailed document and have mayan name the document based on the results
> of the OCR?
>
> would that be set up as a transformation, or some other way?
>
> i am basically looking to really minimize the workload on our clerks who
> will be scanning the invoices into mayan.
>
> thank you all for your time, and the project looks amazing by the way!
>
> doug van es



Re: [Mayan EDMS: 1776] OCR and document names

David Kornahrens
I'm currently trying to walk myself through the program as well.  We really see the potential here, but help doesn't come quickly.  I'm interested in getting a support plan, but not if the support speed doesn't increase.

Roberto has answered a few questions, but it's more of a waiting game really.  I posted some issues in the GitLab repository, but nothing on that yet either.  Let me know if you figure it out; we are looking into the same thing.


Re: [Mayan EDMS: 1777] OCR and document names

Douglas Van Es

if i crack this or hear from anyone at mayan i'll be sure to let you know.

i'm in the same boat, if i can be sure mayan is going to work for us a
support plan is in our future as well.



On Tue, 06 Jun 2017 17:13:08 -0700, David Kornahrens wrote:

> I'm currently trying to walk myself through the program as well.  We
> really see the potential here, but help doesn't come quickly.  I'm
> interested in getting a support plan, but not if the support speed
> doesn't increase.
>
> Roberto has answered a few questions, but it's more of a waiting game
> really.  I posted some issues in the GitLab repository, but nothing on
> that yet either.  Let me know if you figure it out; we are looking into
> the same thing.



Re: [Mayan EDMS: 1779] OCR and document names

Matthias Löblich
Hi,
I wrote an extension for Mayan called document_analyzer:

https://gitlab.com/mayan-edms/document_analyzer

The idea behind it is to analyze a document and store the result in a generic way (similar to the metadata structure). At the moment there are two "analyzers" implemented: one that reads the EXIF data, and one where you can configure regular expressions that are used to parse the OCR result of a document.
If you are able to write a regular expression to parse the invoice number (be aware that the OCR quality is very important!), you can use the extension to store the invoice number in a metadata-like structure. You can also configure a Mayan index on it.

br
Matthias
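
To make the regular-expression idea concrete, here is a minimal Python sketch of pulling an invoice number out of OCR text. It is not taken from document_analyzer: the pattern, the function name find_invoice_number and the sample text are all assumptions for illustration, and a real pattern would have to match your own invoice layout and tolerate OCR mistakes.

```python
import re

# Hypothetical pattern: assumes the invoice prints a label such as
# "Invoice No:", "Invoice number" or "Invoice #" followed by an identifier
# that contains at least one digit. Adjust it to your own documents.
INVOICE_RE = re.compile(
    r"invoice\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
    re.IGNORECASE,
)


def find_invoice_number(ocr_text):
    """Return the first invoice number found in the OCR text, or None."""
    match = INVOICE_RE.search(ocr_text)
    return match.group(1) if match else None


# Made-up OCR output, only here to exercise the function.
sample = "ACME Corp\nInvoice No: INV-2017-0042\nTotal due: 118.00"
print(find_invoice_number(sample))
```

Requiring both a label and a digit keeps ordinary words that follow "invoice" from being picked up, at the cost of missing invoices where OCR garbles the label.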


On Thursday, 8 June 2017 at 22:54:09 UTC+2, Douglas Van Es wrote:

> if i crack this or hear from anyone at mayan i'll be sure to let you know.
>
> i'm in the same boat, if i can be sure mayan is going to work for us a
> support plan is in our future as well.




Re: [Mayan EDMS: 1791] OCR and document names

Douglas Van Es
wow thank you matthias, this looks like it may work for me.

i have a couple of questions based on the docs at the gitlab site, and am
wondering if you could help me out with them. what would my mayan root
folder be on an install using docker? i've looked around /var/lib/docker
and can't quite figure out the correct place to create a link to
document_analyzer...

would it be something like this:
/var/lib/docker/aufs/mnt/HASHEDNAME/usr/local/bin/ ? i don't have an apps
folder in there.

i've found local.py in /var/lib/docker/volumes/mayan_settings/_data/ and
so will be able to edit that file to include document_analyzer in the
list of installed apps, but can't find a /mymayanroot/apps folder.

will the migrations step shown on the git page be the same for a docker
install? eg: mayan-edms.py migrate ? i suppose i would execute that from
/var/lib/docker/aufs/mnt/HASHEDNAME/usr/local/bin/ right?

thank you for the help so far!



On Fri, 09 Jun 2017 01:42:16 -0700, Matthias Löblich wrote:

> Hi,
> I wrote an extension for Mayan called document_analyzer:
>
> https://gitlab.com/mayan-edms/document_analyzer
>
> The idea behind it is to analyze a document and store the result in a
> generic way (similar to the metadata structure). At the moment there are
> two "analyzers" implemented: one that reads the EXIF data, and one where
> you can configure regular expressions that are used to parse the OCR
> result of a document.
> If you are able to write a regular expression to parse the invoice
> number (be aware that the OCR quality is very important!), you can use
> the extension to store the invoice number in a metadata-like structure.
> You can also configure a Mayan index on it.
>
> br Matthias

Re: [Mayan EDMS: 1798] OCR and document names

Matthias Löblich
Hi Douglas,
I have not tried document_analyzer on Docker myself, but looking at the Mayan Dockerfile:

https://gitlab.com/mayan-edms/mayan-edms-docker/blob/master/Dockerfile

It uses the ubuntu:16.04 image and installs Mayan with "RUN pip install mayan-edms==2.3", so I guess Mayan will be installed in site-packages.

How to find the site-packages folder:

My laptop runs:
~$ lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:    16.04
Codename:    xenial

Start python:
~$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Run:
>>> import site; site.getsitepackages()
['/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages']
>>>
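
Rather than searching every site-packages directory, you can ask Python where a given package was loaded from. This is a small sketch, not something from the Mayan docs: package_dir is a made-up helper, "json" is only a stand-in so the snippet runs anywhere, and "mayan" as the top-level package name inside the container is an assumption.

```python
import importlib
import os


def package_dir(name):
    """Return the directory an importable package was loaded from."""
    module = importlib.import_module(name)
    return os.path.dirname(os.path.abspath(module.__file__))


# "json" is only a stand-in that exists in every Python install. Inside the
# container you would pass "mayan" (assumed to be the top-level package name);
# the apps folder would then be expected underneath the printed path.
print(package_dir("json"))
```

This would need to be run with the same Python interpreter that the container uses to run Mayan, otherwise it reports a different installation.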


But this might be a good question for Roberto: how to integrate an extension into the Mayan Docker image.


br
Matthias




2017-06-15 19:18 GMT+02:00 Douglas Van Es <[hidden email]>:
> wow thank you matthias, this looks like it may work for me.
>
> i have a couple of questions based on the docs at the gitlab site, and am
> wondering if you could help me out with them. what would my mayan root
> folder be on an install using docker? i've looked around /var/lib/docker
> and can't quite figure out the correct place to create a link to
> document_analyzer...
>
> would it be something like this:
> /var/lib/docker/aufs/mnt/HASHEDNAME/usr/local/bin/ ? i don't have an apps
> folder in there.
>
> i've found local.py in /var/lib/docker/volumes/mayan_settings/_data/ and
> so will be able to edit that file to include document_analyzer in the
> list of installed apps, but can't find a /mymayanroot/apps folder.
>
> will the migrations step shown on the git page be the same for a docker
> install? eg: mayan-edms.py migrate ? i suppose i would execute that from
> /var/lib/docker/aufs/mnt/HASHEDNAME/usr/local/bin/ right?
>
> thank you for the help so far!




--

---
You received this message because you are subscribed to a topic in the Google Groups "Mayan EDMS" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mayan-edms/6P1AqlvNjWQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.


Re: [Mayan EDMS: 1802] OCR and document names

Douglas Van Es
yes any tips on installing an extension into a mayan docker container
roberto?

thanks again matthias! i really appreciate the help

doug

On Mon, 19 Jun 2017 13:25:34 +0200, Matthias Löblich wrote:

> Hi Douglas,
> I have not tried document_analyzer on Docker myself, but looking at the
> Mayan Dockerfile:
>
> https://gitlab.com/mayan-edms/mayan-edms-docker/blob/master/Dockerfile
>
> It uses the ubuntu:16.04 image and installs Mayan with "RUN pip install
> mayan-edms==2.3", so I guess Mayan will be installed in site-packages.
>
> How to find the site-packages folder:
>
> My laptop runs:
> ~$ lsb_release -a
> No LSB modules are available.
> Distributor ID:    Ubuntu
> Description:    Ubuntu 16.04.2 LTS
> Release:    16.04
> Codename:    xenial
>
> Start python:
> ~$ python
> Python 2.7.12 (default, Nov 19 2016, 06:48:10)
> [GCC 5.4.0 20160609] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
> Run:
> >>> import site; site.getsitepackages()
> ['/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages']
> >>>
>
> But this might be a good question for Roberto: how to integrate an
> extension into the Mayan Docker image.
>
>
> br Matthias

Re: [Mayan EDMS: 1804] OCR and document names

rosarior
Administrator
Check the section "Customizing the image" here: https://hub.docker.com/r/mayanedms/mayanedms/

It is not the easiest thing to do but it is the way Docker images are officially customized.

However, after the next version, I plan to work on finding ways to customize the image without having to rebuild a new image.
One idea I want to try is providing an environment variable called MAYAN_PIP_PACKAGES or similar that contains
a comma delimited list of packages to download and install from the web. The disadvantage of this approach is that 
the installed packages are not persistent and need to be downloaded and installed every time the image starts.

Also planning on trying something like MAYAN_APT_PACKAGES too to allow installing Ubuntu packages like extra 
OCR language packs at runtime.

Docker provides a command called "commit" which could be the answer to the non persistent issue. 

These are all untested ideas at the moment and for now the only official way to customize an image is the one provided in the link above. 
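
As a rough illustration of the comma-delimited idea, the sketch below shows how a startup script could turn such a variable into a pip command. Everything here is hypothetical: MAYAN_PIP_PACKAGES is only the untested proposal from this post, not an existing setting, and parse_package_list and pip_command are made-up helper names. The snippet only builds and prints the command; it does not install anything.

```python
import os
import sys


def parse_package_list(raw):
    """Split a comma-delimited string into clean package names."""
    return [item.strip() for item in (raw or "").split(",") if item.strip()]


def pip_command(packages):
    """Build the argument list for a pip install; empty if nothing to do."""
    if not packages:
        return []
    return [sys.executable, "-m", "pip", "install"] + list(packages)


# MAYAN_PIP_PACKAGES is the proposed (not yet existing) variable name.
requested = parse_package_list(os.environ.get("MAYAN_PIP_PACKAGES"))
print(pip_command(requested))
```

Because the command would run on every container start, the non-persistence drawback described above still applies.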


On Wednesday, June 21, 2017 at 3:42:30 PM UTC-4, Douglas Van Es wrote:
> yes any tips on installing an extension into a mayan docker container
> roberto?
>
> thanks again matthias! i really appreciate the help
>
> doug


Re: [Mayan EDMS: 1806] OCR and document names

Douglas Van Es
thank you for the response roberto. great work on mayan, it looks like an
amazing tool. i think it will fill my organization's requirements for edm
rather well, if i can pull an invoice name out of the scanned documents
using OCR and then name the document using the invoice name, or at a
minimum populate a metadata field.

it seems like matthias has created an extension that will fit the bill
for my OCR needs, but i am having a little difficulty finding my way
around the docker container's environment.

do i need to customize the image in this case? or can i just install
matthias' document_analyzer extension by placing it in mayan's root
folder?

i just need to know what paths to use in the instructions on the
extension's git site and quoted below:

> Installation
>
> clone the sources from gitlab to your local env.
>
> add a link from your mayan/apps folder to the document_analyzer folder:
> cd /yourmayanroot/apps
> ln -s /yourgitroot/document_analyzer/document_analyzer/ .
>
> In your settings/local.py file add document_analyzer to your
> INSTALLED_APPS list:
> INSTALLED_APPS += (
>     'document_analyzer',
> )
>
> Run the migrations for the app:
> mayan-edms.py migrate
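
For the settings step, a plain "+=" will list the app twice if the edit is ever repeated. The sketch below is one way to guard against that. It is an assumption-laden illustration, not part of the extension's instructions: with_extra_apps is a made-up helper, and the starting tuple is a stand-in for whatever the real settings module already defines.

```python
def with_extra_apps(installed_apps, extra_apps):
    """Return a tuple with extra_apps appended, skipping ones already listed."""
    result = list(installed_apps)
    for app in extra_apps:
        if app not in result:
            result.append(app)
    return tuple(result)


# Stand-in for the tuple that the real settings module would already define.
INSTALLED_APPS = ("documents", "ocr")
INSTALLED_APPS = with_extra_apps(INSTALLED_APPS, ("document_analyzer",))
print(INSTALLED_APPS)
```

Running the same lines a second time leaves the tuple unchanged, which is the point of the helper.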

i'm pretty sure local.py sits in
/var/lib/docker/volumes/mayan_settings/_data/ and that i can make the
mentioned changes there.

it's figuring out what to substitute for "yourmayanroot" that has me
stumped. i don't have an apps folder in
/var/lib/docker/volumes/mayan_settings/_data/

problem is there are duplicates of these files and folders sprinkled
around the image: in hashed folders at /var/lib/docker/aufs/mnt and so on.

thanks again for your time!

doug van es



On Wed, 21 Jun 2017 12:52:24 -0700, Roberto Rosario wrote:

> Check the section "Customizing the image"
> here: https://hub.docker.com/r/mayanedms/mayanedms/
>
> It is not the easiest thing to do but it is the way Docker images are
> officially customized.
>
> However, after the next version, I plan to work on finding ways to
> customize the image without having to rebuild a new image.
> One idea I want to try is providing an environment variable called
> MAYAN_PIP_PACKAGES or similar that contains a comma delimited list of
> packages to download and install from the web. The disadvantage of this
> approach is that the installed packages are not persistent and need to
> be downloaded and installed every time the image starts.
>
> Also planning on trying something like MAYAN_APT_PACKAGES too to allow
> installing Ubuntu packages like extra OCR language packs at runtime.
>
> Docker provides a command called "commit" which could be the answer to
> the non persistent issue.
>
> These are all untested ideas at the moment and for now the only official
> way to customize an image is the one provided in the link above.