Ephesoft is one of the world’s leading open source document scanning and parsing solutions. Alfresco, on the other hand, is a leading document management and storage platform. Companies worldwide depend on Ephesoft to implement document scanning workflows and then save the scanned documents for later use in a document management solution like Alfresco. An integration between the two has therefore been important since the day Ephesoft was launched. Fortunately, it’s not that difficult.
Why you need to integrate Ephesoft and Alfresco
The dawn of the computer age came with the promise that paper would become redundant. Yet well after the turn of the millennium, paper is still ubiquitous and remains a common medium for exchanging business and financial information. Paper documents are hard to manage and slow to process, so most organizations have moved to computer-based systems. The sad truth, however, is that we have not yet gone paperless.
This prompted innovation in bringing papers to computers, mainly through paper scanners. Ephesoft is one such solution that helps companies manage scanning and its related workflow, and extract information out of scanned documents.
Once we have scanned documents, where are we going to save them? What is needed is a content management solution that specializes in document management, and what better platform for that than Alfresco?
How to Integrate Ephesoft and Alfresco
Integration of Ephesoft with Alfresco is done by configuring the CMIS plugin in Ephesoft. As the name suggests, this plugin can be used to push or pull data from CMIS-compatible document repositories such as Alfresco, Nuxeo, SharePoint, IBM document repository, etc. Before that, however, we first have to prepare Ephesoft by mapping attributes.
The first step is to edit and configure the DLF-Attribute-mapping.properties file. This file can be found at location [EphesoftInstallationDirectory]\SharedFolders\[Batch-class-Folder]\cmis-plugin-mapping\DLF-Attribute-mapping.properties
For example, if Ephesoft is installed in d: drive then the location of a 3rd batch file would be
The format for mapping in this file is

DocumentTypeName=DocumentumTypeName
DocumentTypeName.FieldTypeName1=Documentum’sType’sAttributeName1
DocumentTypeName.FieldTypeName2=Documentum’sType’sAttributeName2
DocumentTypeName.FieldTypeName3=Documentum’sType’sAttributeName3

For example, if you are mapping invoice data, it would be something like this

Invoice-Data=D:ephesoft:document
Invoice-Data.PartNumber=ephesoft:partNumber
Invoice-Data.InvoiceTotal=ephesoft:invoiceTotal
Invoice-Data.InvoiceDate=ephesoft:invoiceDate
Invoice-Data.State=ephesoft:state
Invoice-Data.City=ephesoft:city
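Because every field gets its own line, a repeated key on the left-hand side is easy to introduce when copying lines, and only one of the two mappings would then take effect. The following is a minimal sketch of a sanity check, not part of Ephesoft itself. It assumes a simplified reading of the file (one `key=value` pair per line, `#` for comments), and the function name `parse_mapping` and the sample text are made up for illustration.

```python
from collections import Counter


def parse_mapping(text):
    """Parse simple 'key=value' lines; return (pairs, duplicate_keys)."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not a mapping.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs.append((key.strip(), value.strip()))
    counts = Counter(key for key, _ in pairs)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    return pairs, duplicates


# Hypothetical sample with a deliberately repeated key.
SAMPLE = """\
Invoice-Data=D:ephesoft:document
Invoice-Data.PartNumber=ephesoft:partNumber
Invoice-Data.InvoiceTotal=ephesoft:invoiceTotal
Invoice-Data.InvoiceTotal=ephesoft:invoiceDate
"""

pairs, duplicates = parse_mapping(SAMPLE)
print(duplicates)  # ['Invoice-Data.InvoiceTotal']
```

To check a real file, read its contents and pass them to `parse_mapping`. Any key reported in `duplicates` deserves a second look.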
The next step is to prepare Alfresco for Ephesoft. There are three configuration files that need to be saved in the Alfresco extension directory. This directory is at location
<Alfresco installation path>\tomcat\shared\classes\alfresco\extension
For Example, if your Alfresco is installed in D: drive, the path would be
The three files to add are
- ephesoftModel.xml: This is the main configuration file. It contains the custom content model, i.e. the type and the properties that the Ephesoft fields are mapped to.
- ephesoft-model-context.xml: Like any -context.xml file in Alfresco, this file is used to define custom configuration between Alfresco and Ephesoft.
- web-client-config-custom.xml: This was the main file used to configure the Alfresco web client, but it’s nearly useless in the latest versions.
A typical ephesoftModel.xml looks something like this
<?xml version="1.0" encoding="UTF-8"?>
<!-- Definition of new Model -->
<!-- The important part here is the name - Note: the use of the ephesoft: namespace which is defined further on in the document -->
<model name="ephesoft:custommodel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
    <!-- Optional meta-data about the model -->
    <description>Example ephesoft custom Model</description>
    <author></author>
    <version>1.0</version>
    <!-- Imports are required to allow references to definitions in other models -->
    <imports>
        <!-- Import Alfresco Dictionary Definitions -->
        <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
        <!-- Import Alfresco Content Domain Model Definitions -->
        <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm"/>
    </imports>
    <!-- Introduction of new namespaces defined by this model -->
    <!-- NOTE: The following namespace custom.model should be changed to reflect your own namespace -->
    <namespaces>
        <namespace uri="custom.model" prefix="ephesoft"/>
    </namespaces>
    <types>
        <!-- Definition of new Content Type: ephesoft Document -->
        <type name="ephesoft:ephesoft">
            <title>ephesoft Document Procedure</title>
            <parent>cm:content</parent>
            <properties>
                <property name="ephesoft:invoiceDate">
                    <type>d:text</type>
                </property>
                <property name="ephesoft:partNumber">
                    <type>d:text</type>
                </property>
                <property name="ephesoft:invoiceTotal">
                    <type>d:text</type>
                </property>
                <property name="ephesoft:state">
                    <type>d:text</type>
                </property>
                <property name="ephesoft:city">
                    <type>d:text</type>
                </property>
            </properties>
        </type>
    </types>
    <aspects>
        <!-- Definition of new Content Aspect: Document Classification -->
        <aspect name="ephesoft:documentClassification">
            <title>ephesoft Document Classification</title>
            <properties>
                <property name="ephesoft:size">
                    <type>d:int</type>
                </property>
                <property name="ephesoft:type">
                    <type>d:text</type>
                </property>
            </properties>
        </aspect>
    </aspects>
</model>
Similarly, ephesoft-model-context.xml, the file that tells Alfresco to load ephesoftModel.xml, looks something like this
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>
<beans>
    <!-- Registration of new models -->
    <bean id="extension.dictionaryBootstrap" parent="dictionaryModelBootstrap" depends-on="dictionaryBootstrap">
        <property name="models">
            <list>
                <value>alfresco/extension/ephesoftModel.xml</value>
            </list>
        </property>
    </bean>
</beans>
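Before deploying these files, it can help to confirm that every attribute named on the right-hand side of the mapping file is actually declared in the model. The sketch below is one possible way to do that with Python's standard library. The embedded model is a trimmed copy of the example above, the helper name `model_properties` is made up for illustration, and `ephesoft:zip` is a hypothetical attribute added only to show what a missing property looks like.

```python
import xml.etree.ElementTree as ET

# Namespace of the Alfresco data dictionary, as used in ephesoftModel.xml.
NS = {"d": "http://www.alfresco.org/model/dictionary/1.0"}

# Trimmed copy of the example model (illustration only).
MODEL_XML = """
<model name="ephesoft:custommodel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
  <namespaces>
    <namespace uri="custom.model" prefix="ephesoft"/>
  </namespaces>
  <types>
    <type name="ephesoft:ephesoft">
      <parent>cm:content</parent>
      <properties>
        <property name="ephesoft:invoiceDate"><type>d:text</type></property>
        <property name="ephesoft:partNumber"><type>d:text</type></property>
        <property name="ephesoft:invoiceTotal"><type>d:text</type></property>
      </properties>
    </type>
  </types>
</model>
"""


def model_properties(xml_text):
    """Return {property name: declared type} for every property in the model."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iterfind(".//d:property", NS):
        declared = prop.findtext("d:type", default="", namespaces=NS).strip()
        props[prop.get("name")] = declared
    return props


props = model_properties(MODEL_XML)

# Attribute names as they would appear in the mapping file;
# 'ephesoft:zip' is hypothetical and intentionally not in the model.
mapped = ["ephesoft:partNumber", "ephesoft:invoiceTotal",
          "ephesoft:invoiceDate", "ephesoft:zip"]
missing = [name for name in mapped if name not in props]
print(missing)  # ['ephesoft:zip']
```

An empty `missing` list means every mapped attribute has a matching property definition in the model.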
You can download both files at Sample CMIS Configuration for Alfresco
You can check whether the configuration is correct by going to http:///alfresco/service/cmis -> Types Collection -> Down
This lists all the types inheriting from cmis:document. You should find the D:Ephesoft document type there, and if you have configured it correctly, selecting it displays a list of properties, something like this
ephesoft:invoiceTotal
id               ephesoft:invoiceTotal
localName        invoiceTotal
localNamespace   http://com.ephesoft.demo/model/content/1.0
displayName      Invoice Total
queryName        ephesoft:invoiceTotal
propertyType     decimal
cardinality      single
updatability     readwrite
inherited        false
required         false
queryable        true
orderable        true
openChoice       false

ephesoft:invoiceDate
id               ephesoft:invoiceDate
localName        invoiceDate
localNamespace   http://com.ephesoft.demo/model/content/1.0
displayName      Invoice Date
queryName        ephesoft:invoiceDate
propertyType     datetime
cardinality      single
updatability     readwrite
inherited        false
required         false
queryable        true
orderable        true
openChoice       false
Configuring CMIS plugin in Ephesoft
This is the easy part. Now you would get to actually export the batch. First navigate to document batches in Ephesoft. You can do that by going to
- Select the batch that you need to export. This opens the module information and a list of selectable actions. Select Export.
- This opens a new screen. Now select the CMIS export option.
- This opens the configuration screen. Fill out all the necessary information and click Save.
That’s it. You are done.
Some Common Precautions
From our experience, most integration errors arise from value type mismatches. We therefore recommend using String type fields in Ephesoft, as this makes things much easier. If you map only document types in the mapping file, the correct document type should still be set in Alfresco. This can be a good way to check that your properties in Alfresco are set up correctly.
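One way to apply this precaution on the Alfresco side, assuming you also want the model properties to stay as d:text like most of the sample model, is to list every property that declares a different type. This is only a sketch. The helper name `non_text_properties` is made up, and the embedded XML is a reduced copy of the aspect from the example model.

```python
import xml.etree.ElementTree as ET

# Namespace of the Alfresco data dictionary, as used in ephesoftModel.xml.
NS = {"d": "http://www.alfresco.org/model/dictionary/1.0"}

# Reduced copy of the aspect from the example model (illustration only).
MODEL_XML = """
<model name="ephesoft:custommodel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
  <aspects>
    <aspect name="ephesoft:documentClassification">
      <properties>
        <property name="ephesoft:size"><type>d:int</type></property>
        <property name="ephesoft:type"><type>d:text</type></property>
      </properties>
    </aspect>
  </aspects>
</model>
"""


def non_text_properties(xml_text):
    """Return {property name: declared type} for properties that are not d:text."""
    root = ET.fromstring(xml_text)
    flagged = {}
    for prop in root.iterfind(".//d:property", NS):
        declared = prop.findtext("d:type", default="", namespaces=NS).strip()
        if declared != "d:text":
            flagged[prop.get("name")] = declared
    return flagged


flagged = non_text_properties(MODEL_XML)
print(flagged)  # {'ephesoft:size': 'd:int'}
```

A flagged property is not necessarily wrong. It is simply a place where the value exported from Ephesoft has to match the declared type.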