Tuesday, February 3, 2009

Indexing Freemind MindMaps with Alfresco - Alf Hack # 2

The idea of this Alfresco hack is to use a command-line tool to extract the text from a Freemind .mm file. The steps to integrate this into Alfresco are:
  1. Add the MIME type application/x-freemind for .mm
  2. Add a transformer from application/x-freemind to text/plain
This article covers the second step. For adding a new MIME type, please refer to the Alfresco Wiki. The MIME type of Freemind mind maps is application/x-freemind. There is also a nice blog post about adding the Freemind MIME type and a nice map integration available.

Extract the text

An example shows how Freemind stores this sample map in an XML file:
<map version="0.7.1">
  <node TEXT="Alfresco Hack No 2">
    <node TEXT="Explore how Freemind XML looks like" POSITION="right"/>
  </node>
</map>
Quite simple XML without namespaces. The text of each map node is stored in the value of its TEXT attribute. To extract the text I will use a quick-and-dirty XSLT:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="no"/>
  <xsl:template match="/">
    <xsl:call-template name="t1"/>
  </xsl:template>
  <xsl:template name="t1">
    <xsl:for-each select="//node">
      <xsl:value-of select="@TEXT"/>
      <xsl:value-of select="' '"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Applying this XSLT to the Freemind XML yields the extracted text:
Alfresco Hack No 2 Explore how Freemind XML looks like
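One way to try the stylesheet from a shell, as a minimal sketch. It assumes the stylesheet has been saved to a file (freemind2text.xsl is a made-up name) and that xsltproc from libxslt is installed; neither is prescribed by this post, and the helper name apply_stylesheet is hypothetical as well.

```shell
#!/bin/sh
# Sketch: apply an XSLT stylesheet to a Freemind map and print the result.
# Assumptions (not from the post): xsltproc is installed, file names are examples.

# apply_stylesheet STYLESHEET MAPFILE
# Prints the transformation result to stdout.
# Returns 127 if xsltproc is not available on this machine.
apply_stylesheet() {
    if ! command -v xsltproc >/dev/null 2>&1; then
        echo "xsltproc not found" >&2
        return 127
    fi
    xsltproc "$1" "$2"
}

# Example call (hypothetical file names):
#   apply_stylesheet freemind2text.xsl "Alfresco Hack No 2.mm"
```

Any other XSLT 1.0 processor should work just as well; xsltproc is only used here because it is commonly packaged on Linux.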

Add transformer to Alfresco
To keep things simple, I will use Alfresco's ability to run content transformations through external tools or programs. This is done by configuring a RuntimeExecutableContentTransformer bean. But first, the command line of the external tool has to be worked out. I will use the xmlstarlet command-line tool from http://xmlstar.sourceforge.net/. Depending on your Linux distribution, the executable is called either xml or xmlstarlet. A Windows version is also available from the download page. Translating the above XSLT into an xmlstarlet command line results in:
xmlstarlet sel -t -m //node -v @TEXT -o ' ' Alfresco\ Hack\ No\ 2.mm
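Because the binary may be installed under either name, a script can simply probe for both. The helper below is a small sketch; the function name find_xmlstarlet and the XML_BIN variable in the example are made up for illustration and are not part of xmlstarlet or Alfresco.

```shell
#!/bin/sh
# Sketch: locate the xmlstarlet executable under either of its common names.
# find_xmlstarlet is a hypothetical helper, not part of xmlstarlet or Alfresco.

find_xmlstarlet() {
    for name in xmlstarlet xml; do
        if command -v "$name" >/dev/null 2>&1; then
            # Print the first match and stop searching.
            command -v "$name"
            return 0
        fi
    done
    echo "xmlstarlet not found (tried: xmlstarlet, xml)" >&2
    return 1
}

# Example: run the extraction with whichever binary was found.
#   XML_BIN=$(find_xmlstarlet) && "$XML_BIN" sel -t -m //node -v @TEXT -o ' ' "Alfresco Hack No 2.mm"
```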
Sadly, the output always goes to stdout and no output file can be specified. The RuntimeExecutableContentTransformer, however, requires an output file, so a simple wrapper script is needed. I put the following into the file /home/lothar/bin/freemind2text.sh (made executable with chmod 775), which will be configured in the transformer bean:
#!/bin/sh
# save arguments to variables
SOURCE="$1"
TARGET="$2"

# to see what gets extracted append arguments to logfile
echo "from $SOURCE to $TARGET" >>/tmp/freemindtransform.log

# call xmlstarlet tool and redirect output to $TARGET
xmlstarlet sel --text --encoding UTF-8 -t -m //node -v @TEXT -o ' ' "$SOURCE" > "$TARGET"
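A slightly more defensive variant of the wrapper, as a sketch only: it checks its arguments and returns a non-zero status when something goes wrong. It assumes the same argument order as above (source file first, target file second). The function name and the XML_BIN override variable are assumptions added for illustration, and how Alfresco reacts to a particular exit status depends on the transformer configuration.

```shell
#!/bin/sh
# Sketch of a more defensive wrapper (assumption: source file is the first
# argument, target file the second). XML_BIN is a hypothetical override for
# systems where the binary is called "xml" instead of "xmlstarlet".

freemind2text() {
    if [ "$#" -ne 2 ]; then
        echo "usage: freemind2text SOURCE TARGET" >&2
        return 2
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read source file: $1" >&2
        return 1
    fi
    # Write the TEXT value of every node, each followed by a space, to the
    # target file. The function returns the exit status of this command.
    "${XML_BIN:-xmlstarlet}" sel -T -t -m //node -v @TEXT -o ' ' "$1" > "$2"
}

# When saved as a standalone script, forward the script arguments:
#   freemind2text "$@"
```

Running the wrapper by hand against a sample map before restarting Alfresco is a quick way to confirm that the target file contains the expected text.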
Now we are ready to configure the RuntimeExecutableContentTransformer bean:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
  <bean id="transformer.freemindToText" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformer" parent="baseContentTransformer">
    <property name="transformCommand">
      <bean name="transformer.freemind.Command" class="org.alfresco.util.exec.RuntimeExec">
        <property name="commandMap">
          <map>
            <entry key="Linux.*"><value>/home/lothar/bin/freemind2text.sh ${source} ${target}</value></entry>
            <entry key="Windows.*"><value>...whatever windows needs here....</value></entry>
          </map>
        </property>
        <property name="defaultProperties">
          <props><prop key="options"/></props>
        </property>
      </bean>
    </property>
    <property name="explicitTransformations">
      <list>
        <bean class="org.alfresco.repo.content.transform.ContentTransformerRegistry$TransformationKey">
          <constructor-arg><value>application/x-freemind</value></constructor-arg>
          <constructor-arg><value>text/plain</value></constructor-arg>
        </bean>
      </list>
    </property>
  </bean>
</beans>

With this in place, Alfresco will index Freemind mind maps. On the plus side: no Java coding, just configuration of standard Alfresco features. On the downside: ...is there anything? Can anybody contribute the Windows batch file wrapper for the xmlstarlet call?

Monday, February 2, 2009

CMIS Link collection

Random link collection about CMIS: blogs, specs, and samples from Alfresco, EMC and others
John Newton F2F

CMIS home: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=cmis
Members: http://www.oasis-open.org/committees/membership.php?wg_abbrev=cmis
JIRA: http://tools.oasis-open.org/issues/browse/CMIS
CMIS TC list: http://lists.oasis-open.org/archives/cmis/
CMIS comments list: http://lists.oasis-open.org/archives/cmis-comment/
http://xml.coverpages.org/cmis.html
http://info.emc.com/mk/get/DAP_RE?P.ctp_program_execution.Source_ID=16706
Also a nice link collection on CMIS