Archive for the ‘AMP’ tag
Alfresco: Simple File Diff
Customers and community members have asked me many times whether there is a way to diff files in Alfresco, and alas there isn’t an OTB way to do this. A month ago the discussion came up again internally, and I thought it might be fun to tackle this as a side project just to see if/what was possible. So I took an evening and hammered out a simple Java class that compared two text files. Once I saw that I had at least the basics (annotating the differences between two files) and had gotten the question of basic possibility/difficulty out of the way, I moved on to other projects.
Today almost the entire family is sick so I thought I’d pick up the project again, moving the Java class to a Java Backed web script.
The web script is a simple GET that takes the nodeRefs of two files, or of two versions of the same file, and outputs a simple HTML page that highlights the differences between the two. There are no complex algorithms that take into account shifts in blocks or identify just the text in a line that has changed. It is a simple line-by-line comparison of two pieces of content. It is not integrated into Share or Explorer at this time. I might take that on as a separate sick day project (or accept any code contributions to add that).
I’ll admit right off that the code is ugly and repetitive. But this is more of a Proof of Concept than a full production ready implementation (though it could definitely be used as such to provide a quick view of differences).
I’ve also probably bored you with the above so let’s just jump right in before I completely lose you…
Using The Web Script
The web script is called by the following URL:
alfresco/service/file/{protocol_1}/{identifier_1}/{id_1}/diff/{protocol_2}/{identifier_2}/{id_2}
For real world examples, we’ll first look at comparing two files:
alfresco/service/file/workspace/SpacesStore/dd83c9f6-81b7-462b-8a1a-1e9a2af251dd/diff/workspace/SpacesStore/ca65d129-6c2c-4ba0-936d-d7626f94f23a
Second, comparing two versions of the same file:
alfresco/service/file/workspace/SpacesStore/dd83c9f6-81b7-462b-8a1a-1e9a2af251dd/diff/versionStore/version2Store/ca65d129-6c2c-4ba0-936d-d7626f94f23a
What is returned is, as stated above, a simple HTML page highlighting the differences.
Each line that is different is highlighted in blue. Simple and to the point.
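As a rough illustration of how the URL above is put together, here is a minimal sketch that turns two nodeRef strings into the web script path. The class and method names (DiffUrlSketch, toPath, diffUrl) are made up for this example and are not part of the extension.

```java
final class DiffUrlSketch {

    /** Converts "workspace://SpacesStore/<id>" into "workspace/SpacesStore/<id>". */
    static String toPath(String nodeRef) {
        int sep = nodeRef.indexOf("://");
        if (sep < 0) {
            throw new IllegalArgumentException("Not a nodeRef: " + nodeRef);
        }
        return nodeRef.substring(0, sep) + "/" + nodeRef.substring(sep + 3);
    }

    /** Builds the relative URL of the diff web script for two nodeRefs. */
    static String diffUrl(String firstNodeRef, String secondNodeRef) {
        return "alfresco/service/file/" + toPath(firstNodeRef)
                + "/diff/" + toPath(secondNodeRef);
    }

    public static void main(String[] args) {
        // Same two nodes as in the first example URL above.
        System.out.println(diffUrl(
                "workspace://SpacesStore/dd83c9f6-81b7-462b-8a1a-1e9a2af251dd",
                "workspace://SpacesStore/ca65d129-6c2c-4ba0-936d-d7626f94f23a"));
    }
}
```

Passing a version nodeRef (for example one in versionStore://version2Store) as the second argument yields the second form of the URL.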
The Code
This is just a little Declarative Web Script that reads the content line by line and then compares the hash of each line to see differences. When a difference is found it is wrapped in HTML to annotate the difference so that when displayed, CSS can take care of highlighting the differences.
A couple of things that I think are important to note:
- File length: When comparing two files there is always the possibility that one is longer/shorter than the other. To simplify the comparison, I just append lines with a single space to the shorter file, simplifying any computational work needed for the comparison caused by the difference in length.
- I mentioned above that the appended line contains a single space. This is done so that the line appears in the output. <div> tags with no content can be ignored by some browsers. The annotation/presentation uses a combination of <div> and <pre> tags. Because the space is preserved inside a <pre> tag, it forces the div element to be visible.
- Special Characters: Because the output for the comparison is targeted for HTML, it is important to escape all characters/strings that could be interpreted by the browser as presentation elements. Apache Commons (included with Alfresco) has classes to help do this.
- Gotcha!: When I was initially testing the code, the content from each request kept being appended to the content from the previous request. So remember, when defining a Collection as a class-scoped variable, to call clear() on the List to make sure it is empty before it gets reused.
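The points above can be sketched in plain Java. This is not the code from the AMP: it compares lines with equals() rather than hashes, uses a hand-rolled escapeHtml in place of the Apache Commons classes, and the CSS class names ("same", "diff") are assumptions. It also sidesteps the gotcha by building the result in a local list on every call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LineDiffSketch {

    /** Escapes the characters a browser would otherwise treat as markup. */
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    /** Returns a copy padded with single-space lines up to the given size. */
    static List<String> pad(List<String> lines, int size) {
        List<String> padded = new ArrayList<>(lines);
        while (padded.size() < size) {
            padded.add(" "); // a space, so the <pre> keeps the row visible
        }
        return padded;
    }

    /** Renders one side of the comparison, marking lines that differ from the other side. */
    static List<String> annotate(List<String> side, List<String> other) {
        int size = Math.max(side.size(), other.size());
        List<String> a = pad(side, size);
        List<String> b = pad(other, size);
        List<String> html = new ArrayList<>(); // local, so nothing leaks between requests
        for (int i = 0; i < size; i++) {
            String cssClass = a.get(i).equals(b.get(i)) ? "same" : "diff";
            html.add("<div class=\"" + cssClass + "\"><pre>"
                    + escapeHtml(a.get(i)) + "</pre></div>");
        }
        return html;
    }

    public static void main(String[] args) {
        List<String> left = Arrays.asList("alpha", "<b>beta</b>");
        List<String> right = Arrays.asList("alpha", "<b>gamma</b>", "delta");
        for (String line : annotate(left, right)) {
            System.out.println(line);
        }
    }
}
```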
This extension is available as an AMP. The source is available in the Google Code Project.
Alfresco: Alfresco PDF Tool Kit – Insert PDF Action
I’ve taken a bit of Holiday time to update the Alfresco PDF Toolkit. Nate has been doing an outstanding job adding Watermarking, Digital Signatures, Encryption and cleaning up my messy code. But it was time to add a little bit myself.
So I took some time this evening to add in one of my planned actions: Insert PDF. This action allows you to insert a PDF into another PDF at a specific page.
This is a pretty straightforward action to test: from the Document Details page of the PDF you want to insert content into, select Run Action. (This action can also be run through the rules engine or scripted.)
From the list of Actions you’ll want to select Insert PDF.
Now you will want to set the parameters for the insert. There are 4 parameters for an insert:
- Name: This is the name of the new file that will be generated. No extension is needed, it will automatically be added.
- Destination: Space where the new file will be stored.
- Insert at page: Where page 1 of the inserted PDF will start in the targeted PDF.
- Insert: The PDF to insert into the targeted PDF
Finally, you’ll get a message summarizing the action you are taking
See, pretty simple!
The latest amp and source can be found at the Alfresco PDF Toolkit project site.
Alfresco: Default Quota Policy
Updated: Added new tested versions
For this post I want to share another policy, this one from a recent customer request. The project was to help them develop a way to have usage quotas set to a default value when a new user/person was added to Alfresco. (There is an important distinction between the two.) A few months ago I had a discussion with a co-worker about possible ways to implement this kind of functionality, and had a few ideas brewing as we started the engagement.
First some background on quotas:
- The default quota in Alfresco is unlimited. In other words, there is no limit in the amount (total size) of content a user can add/own in Alfresco.
- A quota can be set either on creation or at any time during a user’s lifetime.
- From within the UI a quota can be set in either GB, MB or KB.
- From an API (Web Script, JavaScript, Java) it accepts the size in bytes.
- See http://wiki.alfresco.com/wiki/Usages_and_Quotas for additional details
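Since the UI works in GB, MB or KB while the APIs expect bytes, a small conversion helper can be handy. The sketch below is only an illustration: the QuotaUnits class is hypothetical, and it assumes binary multiples (1 KB = 1024 bytes), which matches the 2GB = 2147483648 figure used later in this post.

```java
final class QuotaUnits {
    static final long KB = 1024L;
    static final long MB = KB * 1024L;
    static final long GB = MB * 1024L;

    /** Converts an amount in KB, MB or GB to bytes; -1 ("no quota") is passed through. */
    static long toBytes(long amount, String unit) {
        if (amount < 0) {
            return -1L;
        }
        switch (unit.toUpperCase()) {
            case "KB": return amount * KB;
            case "MB": return amount * MB;
            case "GB": return amount * GB;
            default: throw new IllegalArgumentException("Unknown unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(toBytes(2, "GB"));
    }
}
```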
When developing a behavior I continually reference Jeff Potts’s tutorial on implementing behaviors. His table of available policies is a great quick reference. Of course the definitive source is the Alfresco source code, but not much has changed since Jeff wrote the article.
JavaScript Behavior
The first attempt at implementing this policy was to use a behavior implemented in JavaScript. I used Jeff’s JavaScript example as the seed for my code. In fact there was very little I needed to change (mostly just the business logic). It is a great outline to get you going.
Using JavaScript wasn’t completely successful. The code did work up to a point and that point was setting the quota. There is an undocumented (in the wiki) JS function that can be used to set quotas.
From the People class (note the comment):
/** * Set the content quota in bytes for a person. * Only the admin authority can set this value. * * @param person Person to set quota against. * @param quota As a string, in bytes, a value of "-1" means no quota is set */ public void setQuota(ScriptNode person, String quota)
During the transaction the quota was being set, but once the transaction was committed it was lost. The problem (again, see the comment) is that setQuota must be executed by an admin: a quota can only be set explicitly by the admin authority.
The JavaScript API lacks the ability to execute a script as one user but then run specific code in that script as another user. This is done for security reasons (like being able to keep a user from uploading and then executing a malicious script).
While the script did not work as desired it still may be useful for something else. So I’m including links to the context file and script as part of this post.
Java Behavior
Because we need to set the quota as an admin user, we’ll switch to Java which provides a useful means to run code as one user, and execute specific parts as a different user.
Before we jump to that, let’s talk about some of the specifics of our policy:
First, there is not much difference in the structure needed for this behavior and the Max Version Policy I wrote about in a previous post. So I used it as a template for this project.
Next, users/people in Alfresco are stored as nodes. If you start to dig into the node browser you may be drawn to look for your users in the user://alfrescoUserStore store. This is, as the name suggests, the Alfresco User Store, which is used as the native Alfresco authentication source (alfrescoNTLM). The nodes in this store are of type {http://www.alfresco.org/model/user/1.0}user. Quotas are a property of {http://www.alfresco.org/model/content/1.0}person, and person nodes are found in workspace://SpacesStore under system -> people.
Alfresco has no specific policies for user nodes (like an onCreateUser policy), but since they are just nodes we can leverage the onCreateNode policy. Because they are just nodes, though, we don’t want the policy to run every time any node is created; we want to target cm:person nodes only. Otherwise, we would need to overcomplicate the code with additional tests so that the business logic only executes when the newly created node is of type cm:person. Instead, we can bind our policy to a specific type. As with the Max Version Policy, we initialize our behavior and bind it in the init method.
public void init() {
    this.onCreateNode = new JavaBehaviour(this, "onCreateNode", NotificationFrequency.TRANSACTION_COMMIT);
    this.policyComponent.bindClassBehaviour(QName.createQName(NamespaceService.ALFRESCO_URI, "onCreateNode"), ContentModel.TYPE_PERSON, this.onCreateNode);
}
In the init() above we bind together our policy, QName.createQName(NamespaceService.ALFRESCO_URI, "onCreateNode"); the type, ContentModel.TYPE_PERSON; and the behavior, this.onCreateNode.
Next we need to implement the onCreateNode method from the OnCreateNodePolicy in our new class
public void onCreateNode(ChildAssociationRef childAssocRef) {
We can grab our reference to the newly created user node from the ChildAssociationRef parameter.
final NodeRef user = childAssocRef.getChildRef();
Notice that it uses the final modifier. A local variable that is referenced from inside an anonymous inner class must be declared final.
For our default policy we also want to allow the default quota to be overwritten. If an admin wants to set a different value for the quota at creation time we need to allow it. So we need to first get the value that was assigned to the user on the node at creation
long currentQuota = contentUsageService.getUserQuota((String) nodeService.getProperty(user,ContentModel.PROP_USERNAME));
The default value of no quota is -1. If an admin has set a value for the quota it will be greater than 0.
if (currentQuota < 0) {
...business logic here...
}
Because quotas can only be set by an admin we need to use the runAs utility
AuthenticationUtil.runAs(
    new AuthenticationUtil.RunAsWork<Object>() {
        public Object doWork() throws Exception {
            ...code to be run as a different user...
        }
    }, "admin");
runAs takes two parameters. The first is the RunAsWork object, written here as an anonymous inner class, whose doWork() method contains the code that needs to be run as another user. The second is the name of the user to execute the code as.
Now in our policy we can set the quota. We’ll use the contentUsageService to set the new user’s quota, taking the username and the value of the quota as parameters. Also note that the default value is in bytes and is read from the Spring context file. For the OTB default I’ve chosen 2GB (2147483648 bytes).
contentUsageService.setUserQuota((String) nodeService.getProperty(user, ContentModel.PROP_USERNAME), Long.parseLong(defaultQuota));
return user;
RunAsWork also expects a return value, which I’ve set to user. We won’t be doing anything with this value, so in this case it is an arbitrary return.
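To see the overall flow in one place without a running repository, here is a self-contained sketch that imitates the pattern. Everything in it is a stand-in: RunAsWork, runAs, getUserQuota and setUserQuota are simplified local imitations of the Alfresco classes (backed by a static field and a HashMap), and onCreatePerson is a hypothetical name for the behavior method. It only shows the shape of "check for -1, then set the default as admin".

```java
import java.util.HashMap;
import java.util.Map;

final class DefaultQuotaSketch {

    /** Stand-in for AuthenticationUtil.RunAsWork. */
    interface RunAsWork<T> {
        T doWork() throws Exception;
    }

    static String currentUser = "jdoe";                  // pretend authenticated user
    static final Map<String, Long> quotas = new HashMap<>();

    /** Stand-in for AuthenticationUtil.runAs: switch user, run the work, switch back. */
    static <T> T runAs(RunAsWork<T> work, String user) {
        String previous = currentUser;
        currentUser = user;
        try {
            return work.doWork();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            currentUser = previous;
        }
    }

    /** Returns the quota in bytes, or -1 when none has been set. */
    static long getUserQuota(String username) {
        return quotas.getOrDefault(username, -1L);
    }

    /** Only "admin" may set a quota, mirroring the restriction described above. */
    static void setUserQuota(String username, long bytes) {
        if (!"admin".equals(currentUser)) {
            throw new SecurityException("Only admin may set quotas");
        }
        quotas.put(username, bytes);
    }

    /** Applies the default quota unless one was already assigned at creation time. */
    static void onCreatePerson(final String username, final long defaultQuota) {
        if (getUserQuota(username) < 0) {
            runAs(new RunAsWork<String>() {
                @Override
                public String doWork() {
                    setUserQuota(username, defaultQuota);
                    return username; // arbitrary return value
                }
            }, "admin");
        }
    }

    public static void main(String[] args) {
        onCreatePerson("newuser", 2147483648L);
        System.out.println(getUserQuota("newuser"));
    }
}
```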
Installing
The policy is packaged as an AMP and is downloadable from the alfresco-defaultquota-policy project hosted on Google Code. (See http://wiki.alfresco.com/wiki/Module_Management_Tool for detailed instructions on installing AMPs.)
The amp has been tested with Alfresco Enterprise 3.1.1 to 3.3.2.
My original testing was only on Enterprise 3.3.2; the range above reflects later testing. If you have a chance to try it on another version of Alfresco please let me know. (In that case you’ll need to build the amp from source, changing the version.min/max property to include the version you’re targeting.)
So what next?
While this policy is limited to setting the user quota, you’re probably thinking: what else can I use this for? I know I am! Maybe you’re using a database or some other unsupported user store in Alfresco and you want to populate the user node properties with details from that system. Or you want to add users to specific groups based on some complex business logic. Or interact with external systems when a user is created in Alfresco. What are your ideas?
I also believe that this should work with any external authentication/synchronization of users into Alfresco. I’ve not tested it, but it is on the list of things to test next.
I’m always looking for feedback: let me know what worked or didn’t for you.
Alfresco: Permissions Web Scripts
A couple of months back I was asked to write a couple of web scripts to help one of our customers check and modify permissions for content/spaces in the Alfresco repository. I’ve finally had the chance to spend some time testing and now writing about them.
The core of the web scripts was quick to write. The fun (more time consuming) part was working with exception handling in JavaScript. I know, tons of fun, right! There are a few different ways to use exception handling based on which version of Alfresco you are using. The customer is on Enterprise 3.1 and I wanted to make sure that the web scripts also worked on the more current releases of Alfresco. A change (re: addition) was made in Enterprise 3.2.1 and Community 3.3 to help simplify exception handling. I’ll talk about exception handling and these differences in a follow up post. For now let’s talk about these new web scripts.
permissions GET
The first web script returns all of the permissions for a specified node.
The URL used is /alfresco/service/permissions/{store_type}/{store_id}/{id}
Where
store_type: The type of store you want to query, ex: workspace
store_id: The ID of the store you want to query, ex: SpacesStore
id: The UUID of the node, ex: aed218e8-df44-4865-84cd-0105252f4993
The above values are joined together to form the nodeRef.
If the node is not found a 404 error will be returned, any missing URI parameters will result in a 400 error and if you don’t have permission to view the node you will get a 401 error.
The web script will return a JSON object that looks like the following:
{ "permissions": [
"ALLOWED;user1;Coordinator",
"ALLOWED;user2;Coordinator"
] ,
"inherit": false }
The return object lists the permissions for that node as triplets. Each permission triplet follows this format:
[ALLOWED|DENIED];[USERNAME|GROUPNAME];PERMISSION
It also returns a boolean value indicating if some permissions are inherited from the parent node.
The above example shows two permissions are assigned to the node: the Coordinator permission is given to user1 and user2 on this node. Permissions are not inherited from the parent node.
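If you are consuming this response from Java, each triplet is easy to take apart by splitting on the semicolons. The sketch below is one possible way to do it; the PermissionTriplet class and its field names are made up for this example, and it assumes user, group and permission names contain no semicolons.

```java
final class PermissionTriplet {
    final String access;     // ALLOWED or DENIED
    final String authority;  // user name or group name
    final String permission; // e.g. Coordinator

    PermissionTriplet(String access, String authority, String permission) {
        this.access = access;
        this.authority = authority;
        this.permission = permission;
    }

    /** Parses a string such as "ALLOWED;user1;Coordinator". */
    static PermissionTriplet parse(String triplet) {
        String[] parts = triplet.split(";");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 parts: " + triplet);
        }
        if (!parts[0].equals("ALLOWED") && !parts[0].equals("DENIED")) {
            throw new IllegalArgumentException("Unknown access status: " + parts[0]);
        }
        return new PermissionTriplet(parts[0], parts[1], parts[2]);
    }

    /** Group authorities carry the GROUP_ prefix. */
    boolean isGroup() {
        return authority.startsWith("GROUP_");
    }

    public static void main(String[] args) {
        PermissionTriplet p = parse("ALLOWED;user1;Coordinator");
        System.out.println(p.authority + " has " + p.permission);
    }
}
```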
permissions POST
This web script enables you to modify the permissions for a given node
It is called through the same URL as the above web script but as a POST instead of a GET: /alfresco/service/permissions/{store_type}/{store_id}/{id}
Again, if the node is not found a 404 error will be returned, any missing URI parameters will result in a 400 error and if you don’t have permission to modify the node, you will get a 401 error.
You must also pass a JSON object containing the permissions that are being changed, deleted or added.
{ "permissions": [
"REMOVE;user3;All",
"REMOVE;user2;All",
"ADD;user4;Coordinator",
"ADD;GROUP_usergroup1;Consumer"
] ,
"inherit": false }
The above example uses the following triplet to define a permission
[ADD|REMOVE];[USERNAME|GROUPNAME];PERMISSION
Where the values are defined as:
ADD | REMOVE: Do you want to add or remove the permission for this user/group? Any other value passed will result in a 400 error.
USERNAME | GROUPNAME: The user or group you want the permission to be added or removed for. Group names must be prefixed by GROUP_. Unknown users or groups will result in a 400 error.
PERMISSION: The supported permission options are defined in org.alfresco.service.cmr.security.PermissionService or through custom extensions to the permission model. Unknown permissions will result in a 400 error.
The object can also contain an optional inherit value specifying whether the permissions for this node should be inherited from the parent node. Without the inherit option, the current value for the node is maintained. Inherited permissions cannot be removed from a node.
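A client could assemble this request body along the following lines. This is only a sketch: PermissionsRequestSketch and its methods are hypothetical helpers, the JSON is built by hand to keep the example free of library dependencies, and the early check on ADD/REMOVE simply mirrors the rule that any other value is rejected by the web script with a 400.

```java
import java.util.Arrays;
import java.util.List;

final class PermissionsRequestSketch {

    /** Builds one "[ADD|REMOVE];authority;permission" triplet. */
    static String change(String action, String authority, String permission) {
        if (!action.equals("ADD") && !action.equals("REMOVE")) {
            throw new IllegalArgumentException("Action must be ADD or REMOVE: " + action);
        }
        return action + ";" + authority + ";" + permission;
    }

    /** Wraps a value in JSON quotes, escaping backslashes and double quotes. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the JSON body; pass null for inherit to leave the node's setting unchanged. */
    static String body(List<String> changes, Boolean inherit) {
        StringBuilder json = new StringBuilder("{\"permissions\":[");
        for (int i = 0; i < changes.size(); i++) {
            if (i > 0) {
                json.append(",");
            }
            json.append(quote(changes.get(i)));
        }
        json.append("]");
        if (inherit != null) {
            json.append(",\"inherit\":").append(inherit);
        }
        return json.append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(body(Arrays.asList(
                change("REMOVE", "user3", "All"),
                change("ADD", "GROUP_usergroup1", "Consumer")), false));
    }
}
```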
The return format is the same as the return format of the permissions GET web script above.
This web script is also transactional: any errors will result in the node being returned to the state before the call was made. (The exception handling in the controller was added for these conditions.)
These scripts can be installed as an AMP. The code and AMPs are hosted in the alfresco-permissions-webscripts project on Google Code. The code is available for either pre-3.2.1 (starting with 3.1) or 3.2.1 to 3.3.1. These are all Enterprise release numbers, and the web scripts have been tested against these releases. There may not be any need to modify the web scripts for Community releases (except for the min and max version numbers in the module.properties file). Community releases before 3.3 should use the pre-3.2.1 release. No Community releases have been tested. (If you try these on a Community release, please comment either here or in the Google Code Project.)
In a follow up post, I’ll cover exception handling with JavaScript.
Alfresco: Max Version Policy
UPDATE: Updated code examples to match updates to source code
As part of a POC for a customer I was asked to write an extension that allowed them to control the total number of versions allowed per versioned content. (Download links at bottom of page)
Alfresco has a strong versioning story that gives you the ability to version any content stored in the repository, no matter what the file type. Versions are full files and not diffs of the files. Alfresco gives you the ability to have both major and minor versions of content. Versions can be created/updated by checkout/checkin, by rule, through any interface or through script/APIs.
Alfresco also provides the ability to apply behaviors/policies to content/metadata within the repository. You can think of these as event listeners that allow you to take custom actions based on what is happening within the repository. Jeff Potts has written an excellent tutorial on creating behaviors, with examples for both Java and JavaScript.
There are 4 versioning policies that we can work with:
- beforeCreateVersion
- afterCreateVersion
- onCreateVersion
- calculateVersionLabel
(For an example of a calculateVersionLabel policy look at this post and the accompanying code by Peter Monks.)
For this policy we are going to use afterCreateVersion: after a version is created we want to remove any version that puts us past the max version value.
You start by implementing the policy interface for the policy you want to apply.
public class MaxVersionPolicy implements AfterCreateVersionPolicy
Adding the method implemented by the interface
public void afterCreateVersion( NodeRef versionableNode, Version version)
We also need to bind the behavior to the policy, and we need to do this when the class is loaded by the Spring framework. We wrap this registration in an init() method
public void init() {
    this.afterCreateVersion = new JavaBehaviour(this, "afterCreateVersion",
            NotificationFrequency.TRANSACTION_COMMIT);
    this.policyComponent.bindClassBehaviour(QName.createQName(
            NamespaceService.ALFRESCO_URI, "afterCreateVersion"),
            MaxVersionPolicy.class, this.afterCreateVersion);
}
The init method is then called when spring loads the bean
<bean id="maxVersion"
      class="org.alfresco.extension.versioning.MaxVersionPolicy"
      init-method="init">
    <property name="policyComponent">
        <ref bean="policyComponent" />
    </property>
    <property name="versionService">
        <ref bean="versionService" />
    </property>
    <!-- The max number of versions per versioned node -->
    <property name="maxVersions">
        <value>10</value>
    </property>
</bean>
Now to the meat of the policy, enforcing the removal of versions.
First we have our maxVersions property. This is set in the Spring bean (see above) and read when the bean is loaded. (You can override the default value by copying the maxversionpolicy-context.xml file into the extensions directory and changing the value of the maxVersions property.)
public void setMaxVersions(int maxVersions) {
    this.maxVersions = maxVersions;
}
Next we want to remove any version that puts us over the max
@Override
public void afterCreateVersion(NodeRef nodeRef, Version version) {
    VersionHistory versionHistory = versionService
            .getVersionHistory(nodeRef);
    // If the current number of versions in the VersionHistory is greater
    // than the maxVersions limit, remove the root/least recent version
    if (versionHistory.getAllVersions().size() > maxVersions) {
        logger.debug("Removing Version: "
                + versionHistory.getRootVersion().getVersionLabel());
        versionService.deleteVersion(nodeRef, versionHistory
                .getRootVersion());
    }
}
We first get the versionHistory for the node and check its size against the maxVersions property. If we have more versions than the limit, we delete the least recent version from the version history.

Notice: 10 total versions; least recent version is 1.1; no pagination!
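The trimming rule can be tried out away from the repository with a simple in-memory model. In the sketch below a Deque of version labels stands in for the version history (oldest first); VersionTrimSketch, trimOnce and trimAll are hypothetical names. trimOnce mirrors the policy by removing at most one version per call, while trimAll shows the loop variant, which works here only because the in-memory deque reflects each removal immediately.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class VersionTrimSketch {

    /** Removes the oldest version if the history is over the limit; returns true if it did. */
    static boolean trimOnce(Deque<String> history, int maxVersions) {
        if (history.size() > maxVersions) {
            history.removeFirst();
            return true;
        }
        return false;
    }

    /** Keeps removing the oldest version until the history is at the limit. */
    static int trimAll(Deque<String> history, int maxVersions) {
        int removed = 0;
        while (trimOnce(history, maxVersions)) {
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        Deque<String> history = new ArrayDeque<>();
        for (int minor = 0; minor <= 10; minor++) {
            history.addLast("1." + minor); // 1.0 through 1.10: eleven versions
        }
        trimOnce(history, 10);
        System.out.println(history.size() + " versions, oldest is " + history.peekFirst());
    }
}
```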
Q: Why are you using an if statement instead of looping through all of the versions? I have a few (many) more versions over the limit I want to impose.
A: A while statement would be more efficient in removing everything above the limit, but the versionHistory object isn’t updated with each delete in a loop (I tried) until the policy has completely run. Luckily, as you will find as you implement custom behaviors, they get called a lot! So while a single update will result in a single version added, you will actually see that the behavior is called multiple times for that one action. (In testing I saw the behavior being called a minimum of 7 times.) While you might not clear out every version over the limit with a single update, you will see a good many of them removed. If you want to bring everything down to the limit you could create a “cleaner” extension that goes in and removes all of the versions over the limit. (This would be a very intensive/costly operation depending on the amount of content in your repository and the size of your version history. There are several different strategies for this.) You could then use this policy to enforce that limit.
Q: Does this affect all versioned content in the repository? What if I only want it to work on some content?
A: Yes, by default this policy is being applied to all versioned content, but you could add in checks to look for a specific aspect, parent space name, property, etc. before passing through the delete code.
Q: Compliance regulations/laws don’t allow me to delete the versions, but require me keep them for X years. How can I archive versions?
A: There are probably several different ways to handle this. One way could be to use Alfresco’s content store selector. The content store selector allows you to configure multiple underlying filesystem locations for your content. To the end user all of the content appears to come from the same location, while the content itself lives on different disk systems. The idea would be to set up a secondary store that would act as the archive. When your trigger is reached, be it a property (status [property or aspect], date, etc.), a version number/count, etc., have the policy move the content to the archive space structure. (The archive will need to have the cm:storeSelector aspect set to mark it to use the secondary store.) Because we are working from the least recent version, the 1.0 version of the content becomes the first version in the archive. Each subsequent version is added as a version of that content, so version numbers are maintained in the archive. I would also add an aspect to the original node that contains a pointer to the archived version, for reference. Then delete the least recent version of the original versioned node.
Google Code Project: http://code.google.com/p/alfresco-maxversion-policy/
Download Alfresco Max Version Policy AMP
Tested on Alfresco (Enterprise) 3.2 – 3.3.1
Alfresco PDF Toolkit
A few weeks ago I made my first release of what I am calling the Alfresco PDF Toolkit on Google Code.
Alfresco PDF Toolkit, or as I originally named it, pdf-extension, has been around for a while. It was originally hosted on my SVN server and then in my Alfresco SVN Repo. It was developed as a one-off side project at the request of an early Alfresco customer. The code has been sitting around with very few updates since that time: an occasional rebuild to make sure it would work with a new release of Alfresco, but that was it. (I think it was originally built for 2.1 or 2.2.)
Why the update? Focused attention. It is a way to get my hands dirty and add some new features that have been rattling around in my head for a while now.
What can it do? This initial release has three functions:
- Split PDF: This option allows you to split a PDF every specified number of pages
- Split at Page: This option allows you to split a PDF at a specified page
- Merge PDF: This option allows you to append a PDF to another PDF
All three options generate new PDFs leaving the originals untouched.
Where can I find these actions? The actions are available in the Run Action Wizard. They use the same names as above.
What does the future hold? The current list of enhancements is as follows (but not necessarily in this order):
- PDF Watermarks / Rubber stamping
- Digital Signatures
- TIFF to PDF transformation
- Extract Pages
- Delete Pages
- Transform to PDF/A
Can I help? Yes! I’m more than happy to accept code contributions or add features to the list of enhancements. And of course, merit can make you a contributor.
Where can I find the code or the AMP again? In the Google Code Project: Alfresco PDF Toolkit
Note: This is NOT an officially supported Alfresco Project.



