Extracting reusable code from a Hydra head
05 Nov 2014 David Chandek-Stark
When we started building a repository application with the Hydra framework nigh about two years ago, nearly everything about the code environment was new to us – including Ruby, Rails, RSpec and Git. While we had a respectable level of repository domain knowledge (by no means experts), we tended to follow coding and testing patterns we observed within the Hydra development community.
I think we got the conceptual modeling mostly right. We stuck to models that seemed to satisfy our initial use cases and hoped that they would give us a good framework to refine as we rolled out the repository to a broader audience. With one significant exception, this wiki page describes our models and the relations between them. Originally we had a separate AdminPolicy object which implemented the Hydra admin policy behavior; however, because we observed in practice that we had one AdminPolicy object per Collection, we decided to migrate the admin policy functionality to the Collection model. (In the long term, we are developing a role-based access control approach, but that is another story …)
In recent months we have begun planning for at least two additional Hydra heads. This step caused us to embark on a refactoring of our original Hydra head, extracting the repository models into a separate project which we could reuse as a gem. In general, this process has gone reasonably well – meaning we’re getting through it – but we were rather naive about how complicated it would be. A better knowledge of Rails, gems, and refactoring in general would have helped, not to mention more code analysis and planning before diving in. A few specific issues deserve mention:
Namespaces (or not)
Following the typical Rails app development pattern, we originally put our repository models in app/models, not namespaced. This created two issues:
- We couldn’t namespace the models on refactoring (e.g., from Component to Ddr::Models::Component) without touching every object in the repository, which we didn’t want to do.
- We learned that app/models is not a magic path in a Rails engine (more on engines later), which affects Rails autoloading, etc., so we explicitly required each module. (We also added app/models to the gemspec require_paths, but I don’t think that’s necessary.)
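The idea of loading models with plain require calls, instead of relying on autoloading, can be sketched as follows. This is a minimal illustration under assumptions: the helper name require_models is hypothetical, and it globs the app/models directory rather than naming each file one by one, which is a variation on the explicit requires described above.

```ruby
# Hypothetical helper (not part of Rails or any gem): load every model file
# under <engine_root>/app/models with plain `require`, bypassing autoloading.
def require_models(engine_root)
  pattern = File.join(engine_root, "app", "models", "**", "*.rb")
  # Sort for a deterministic load order; returns the list of required paths.
  Dir.glob(pattern).sort.each { |path| require path }
end

# In the gem's main file this might be called as follows (the relative path
# is an assumption about the gem layout):
# require_models(File.expand_path("../..", __FILE__))
```

Alphabetical order is only safe when the model files do not depend on one another at load time; if one model subclasses another, listing the files explicitly in dependency order is the more predictable choice.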
Hydra rightsMetadata dependency
If you have a true Hydra model, then you have rightsMetadata, which you get from Hydra::AccessControls::Permissions. Unfortunately, this module (as of this writing) is not on a path available to the engine (because it depends on a Rails app?). Here’s the hack we came up with:
Fedora / Solr / Jetty
Testing repository models requires a Java servlet container for Fedora and Solr, for which most folks use Jetty. To integrate Jetty into the engine it’s convenient to use jettywrapper, which provides hydra-jetty and a number of useful rake tasks (jetty:start, jetty:stop, etc.). An undocumented (or poorly documented) feature of jettywrapper is that it will use a constant JETTY_CONFIG for its default configuration, so the relevant parts of the Rakefile look like this:
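A minimal sketch of such a constant, under assumptions: the key names (jetty_home, jetty_port, startup_wait) and all values are illustrative and should be checked against the jettywrapper version in use. The require of jettywrapper and the loading of its rake tasks are left out so that the fragment stands alone.

```ruby
# Rakefile (excerpt). A real engine Rakefile would also require jettywrapper
# and load its rake tasks; those lines are omitted from this sketch.

# Key names and values are assumptions for illustration only.
JETTY_CONFIG = {
  jetty_home: File.expand_path("jetty", __dir__), # where hydra-jetty is unpacked
  jetty_port: 8983,                               # port Jetty listens on
  startup_wait: 45                                # seconds to wait for startup
}
```

Defining the constant at the top level of the Rakefile means it is already in place by the time any jetty:* task runs.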
Custom Predicates
ActiveFedora lets you add custom predicates for RELS-EXT relations (FC3) in a file at config/predicate_mappings.yml. Unfortunately, this file apparently must be read when ActiveFedora::Predicates first loads or it has no effect. Fortunately, you can use ActiveFedora::Predicates.set_predicates(mapping) to add predicates later. So, back in the Rails engine module, we created an initializer:
where our predicates are defined:
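A minimal sketch of this pattern, with hypothetical names throughout: the constant CUSTOM_PREDICATES, the example.org namespace, and the is_attached_to predicate are invented for illustration. The mapping shape (namespace URI to a hash of symbol and predicate name) is an assumption based on the layout of predicate_mappings.yml.

```ruby
# Hypothetical namespace and predicate names, for illustration only.
CUSTOM_PREDICATES = {
  "http://example.org/ns/relations#" => {
    is_attached_to: "isAttachedTo"
  }
}

# In an engine, this call would typically sit inside an `initializer` block
# of the Rails::Engine subclass. The guard lets the sketch run without
# ActiveFedora loaded; a real engine would not need it.
if defined?(ActiveFedora::Predicates)
  ActiveFedora::Predicates.set_predicates(CUSTOM_PREDICATES)
end
```

Keeping the mapping in a plain Ruby constant, rather than in a YAML file, sidesteps the load-order problem because the predicates are registered whenever the initializer runs.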
Database migrations
Our repository models have a dependency on an ActiveRecord model that we use for event tracking, so we had to move that model into the gem as well. In order to test everything, we needed a db with migrations. While it appears to be technically possible to create something less than a full Rails engine that can deal with database connectivity and migrations (see Jeremy Friesen’s article), it’s also true that a Rails engine with an internal test app makes this easy (mainly because of the Rails engine rakefile). In the long run it would be nice to dynamically generate the test app (say, with engine_cart), but the static test app generated by Rails got us up and running fast.
Of course, since our original app already had the schema and database table for our Event model, we had to be a little more careful in defining the migration to only create the table if it does not exist:
Lessons Learned
Of course, some things only come with experience … but here are some observations.
- If you think there’s a chance you will build more than one Hydra head, consider modularizing and namespacing your code from the start. I suspect that it’s much easier to connect pieces than to disentangle them. You’ll also get practice with gems and Rails engines.
- Refactoring is a discipline (and I’m not very good at it). I tried to keep Martin Fowler’s definition in mind, but it’s hard to resist all the temptations to “fix stuff” at the same time that you’re reorganizing.
- Good quality tests are critical when making extensive organizational changes. I’m still learning how to write good ones and avoid bad and unnecessary tests.
- Unless you’ve got way more git-fu than I, you may want to freeze development on (or at least the relevant parts of) the original app while you’re extracting code. I didn’t want to think about how I was going to deal with merge conflicts and applying patches across projects.