Tuesday, May 24, 2011

Adventures in Multi-Architecture Eclipse


To bring to an end the sorry saga of trying to make a multi-architecture Eclipse build, the conclusion appears to be that it is impossible due to limitations in P2.

Although in my last post, I claimed to have success in having a per-architecture “configuration” directory and “eclipse.ini” file, the reality is that P2 leaps to the rescue in breaking what otherwise would be a fine set up. And I don't care how it's supposed to be capitalised.

The problem is twofold:

  1. An Eclipse P2 profile, written into the eclipse.p2.data.area location, is single-architecture by design
  2. P2 doesn't check where the configuration directory or the eclipse.ini file actually is during updates; it just assumes they have the standard names

Earlier attempts of generating two profiles (on a per-architecture basis) succeeded in getting the application off the ground. However, since P2 also writes into configuration/org.eclipse.equinox.simpleconfigurator/bundles.info, we have to split the configuration directories as well.

Conventionally, the configuration directory is called configuration, but you can change it with a command-line switch (-configuration) or with an entry in the eclipse.ini file. This value is known at runtime (go to the 'About' page and you see it shown at the top) but P2 thinks it's still called configuration, regardless of what it actually is.

Not only that, but the eclipse.ini isn't always called eclipse.ini. The actual lookup is product.ini, with a fallback to eclipse.ini, if the product-specific one can't be found. This allows you to rename the launcher to something else, or (originally) to have multiple products in a single location.
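The lookup order described above can be sketched in a few lines of shell. This is only an illustration of the fallback rule, not the launcher's real code; the find_ini function and the myproduct launcher name are made up for the example.

```shell
# Hypothetical helper mirroring the lookup described above; this is not the
# launcher's real code, and "myproduct" is a made-up launcher name.
find_ini() {
    dir=$1
    product=$2
    if [ -f "$dir/$product.ini" ]; then
        echo "$dir/$product.ini"      # the product-specific file wins
    else
        echo "$dir/eclipse.ini"       # otherwise fall back to eclipse.ini
    fi
}

install_dir=$(mktemp -d)
touch "$install_dir/eclipse.ini"
find_ini "$install_dir" myproduct     # only eclipse.ini exists, so that is used
touch "$install_dir/myproduct.ini"
find_ini "$install_dir" myproduct     # now the product-specific file is found
```

The point is that the name is derived at launch time, which is exactly the information P2 ignores when it writes its updates.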

Sadly, whilst you can launch a multi-architecture build using the previous approach, you can't update it. P2 will write entries out to eclipse.ini and configuration, which are of no use if you've called them eclipse32.ini and configuration32. Ironically, the plugins do actually get downloaded and installed into the plugins directory; it's just the P2 metadata which is written to the wrong place. In The Good Old Days, you'd have been able to run eclipse -clean and it would have just worked (though to be honest, a multi-architecture build would have just worked then as well).

All of this brings us to the conclusion that a single package, multi-architecture build is not going to happen with products in one folder. It might be possible to have one folder per executable with a shared plugin or feature store; but by the time you get to that, why not just stick all the plugins and features into a local M2_REPO type cache and just run from there?

Notes: yes, there's a bug, so please don't leave a comment asking whether I've raised one. And in all honesty, if you're going to leave a comment with P2 vs p2, then I'd much prefer you spent less time worrying about what people are calling your product and more time on getting it to work.

Monday, May 23, 2011

Git Tip of the Week: Git Revisions


This week's Git Tip of the Week is about looking for commits with git revisions syntax. You can subscribe to the feed if you want to receive new instalments automatically.


Git Log

Git log is a versatile tool which can let you introspect the state of your repository. We've been implicitly using it in a number of other examples already; in this post, we're going to look at some of the other options that git log takes, as well as the ways in which we can refer to items in Git's history.

Last week, we looked at Git reflogs, which (together with the previous post on git stash) discussed a notation for looking at commits with the HEAD@{1} (cf. stash@{1}) syntax. In fact, there are many other mechanisms we can use to refer to items in Git's history, over and above the commit hash or branch name.

A Git history is a directed acyclic graph of commits, from the HEAD backwards to one (or more) roots. In most cases, commits have a single parent; but merge commits have two (or more) parents. (Hg commits, by contrast, can only have one or two parents. Converting from a Git repository to an Hg repository is therefore not totally faithful.)

Since each commit may have more than one parent, the parent operator (^ as a suffix) allows you to disambiguate which parent you are referring to. Given that each merge node is represented as a pair (or more) of commits, the parents are numbered from 1 to n. (The special number 0 is used to refer to itself.)


# Dummy repository
$ git log --oneline
77bc990 Third commit
25d4fc4 Second commit
f0faab6 First commit
$ git log --oneline HEAD^
25d4fc4 Second commit
f0faab6 First commit
$ git log --oneline HEAD^1
25d4fc4 Second commit
f0faab6 First commit
$ git log --oneline HEAD^^
f0faab6 First commit
$ git log --oneline HEAD^2
fatal: ambiguous argument 'HEAD^2': unknown revision or path not in the working tree.

Here, HEAD is pointing to 77bc990 Third commit, and so both HEAD^ and HEAD^1 refer to the same item (25d4fc4 Second commit). However, HEAD^^ gives a different answer to HEAD^2; the former finds the root of the history, whereas the latter gives an error.

That's because HEAD^^ means “the (first) parent of the (first) parent of HEAD”, whereas HEAD^2 means “the second parent of HEAD”. Generally, HEAD^n where n >= 2 only makes sense on merge nodes.
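To see HEAD^2 resolve to something, we need a merge node. The following is a minimal sketch using a throwaway repository; the branch name, commit messages and identity are purely illustrative, and empty commits are used to keep it short.

```shell
# Throwaway repository with a single merge commit (all names are illustrative)
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # make sure the branch is called master
git config user.name "Example" && git config user.email "example@example.invalid"
git commit -q --allow-empty -m "First commit"
git checkout -q -b other
git commit -q --allow-empty -m "Commit on other"
git checkout -q master
git commit -q --allow-empty -m "Commit on master"
git merge -q --no-ff -m "Merge other" other  # HEAD is now a merge node

first_parent=$(git rev-parse HEAD^1)         # where master was before the merge
second_parent=$(git rev-parse HEAD^2)        # the tip of the branch merged in
git log --oneline -1 "$first_parent"
git log --oneline -1 "$second_parent"
```

The first parent is the branch you were on when you ran the merge; the second is the branch you merged in.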

However, there is another useful reference: the ancestor selector (~). Instead of considering a breadth-based lookup, it does a depth-based lookup:


# Dummy repository
$ git log --oneline
77bc990 Third commit
25d4fc4 Second commit
f0faab6 First commit
$ git log --oneline HEAD~
25d4fc4 Second commit
f0faab6 First commit
$ git log --oneline HEAD~~
f0faab6 First commit
$ git log --oneline HEAD~2
f0faab6 First commit

Like the parent operator, the ~ operator can take a number as well; except instead of referring to the nth parent, HEAD~n refers to the nth-generation ancestor, following first parents each time (so HEAD~1 is the parent and HEAD~2 is the grandparent).

Note that ~ always follows the first parent (much like a bare ^ does), and the two forms can be mixed if needed. Both HEAD^~ and HEAD~^ have the same effect; but you can explicitly select which item you want with a numeric selector; for example, git log HEAD^2~10 starts from the current merge node's second parent's tenth ancestor.
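If you want to convince yourself that these spellings really are equivalent, git rev-parse will print the hash that each one resolves to. A small sketch against a throwaway three-commit repository (names and identity are illustrative):

```shell
# Throwaway linear history of three (empty) commits
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example" && git config user.email "example@example.invalid"
git commit -q --allow-empty -m "First commit"
git commit -q --allow-empty -m "Second commit"
git commit -q --allow-empty -m "Third commit"

# All four spellings walk two first-parent steps back from HEAD,
# so this prints the same hash four times
git rev-parse HEAD~2 HEAD^^ HEAD~^ HEAD^~
```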

Ranges and sets

As well as single references, it's possible to refer to ranges in Git. In fact, these are better thought of as sets of commits (since they may not be contiguous in the commit graph). The most common way is to show the commits between one ref and another:


$ git checkout -b other f0faab6
Switched to a new branch 'other'
$ touch file
$ git add file
$ git commit -m "Adding file" file
$ git log --oneline
1762164 Adding file
f0faab6 First commit
$ git log --oneline other..master
77bc990 Third commit
25d4fc4 Second commit
$ git log --oneline master..other
1762164 Adding file

The syntax is between..and, where both references can either be one of the symbolic branch/tag references, or a commit hash etc. What this is saying is “Show me all the commits which are in master but not in other” (and vice versa, for the second one).

However, whilst it looks like a range, really this is just a set selection. The syntax between..and is actually a shorthand for ^between and, which itself is a shorthand for and --not between. In other words, the first command above (other..master) is showing you what is in master but not in other.
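The three spellings can be compared directly with git rev-list, which prints the hashes that a revision expression selects. This is a sketch that rebuilds a repository shaped like the one in this post, with empty commits standing in for real changes:

```shell
# Rebuild a repository shaped like the example: master has three commits,
# other branches off the first one and adds a commit of its own
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
git commit -q --allow-empty -m "First commit"
git branch other
git commit -q --allow-empty -m "Second commit"
git commit -q --allow-empty -m "Third commit"
git checkout -q other
git commit -q --allow-empty -m "Adding file"

short=$(git rev-list other..master)          # the shorthand
caret=$(git rev-list ^other master)          # what it expands to
long=$(git rev-list master --not other)      # the long hand
[ "$short" = "$caret" ] && [ "$caret" = "$long" ] && echo "same set of commits"
git log --oneline other..master              # in master but not in other
```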

What use is the long hand, when the shorthand is much more convenient? Well, the long hand allows you to specify more than two references. For example:


# Show all unmerged changes between features 1,2,3 and master
$ git log ^master feature1 feature2 feature3
# Show changes in hotfix and release branches, but not in master
$ git log ^master hotfix release-1.0 release-1.1

In most cases, the asymmetric difference (using two commits) is probably what you want. It's worth noting, finally, that there is also a symmetric difference (again using two commits); instead of two dots, use three:


$ git log --oneline master...other
1762164 Adding file
77bc990 Third commit
25d4fc4 Second commit
$ git log --oneline other...master
1762164 Adding file
77bc990 Third commit
25d4fc4 Second commit

Note that there's no difference in the order of the commits, since they're ordered by time (or however you want to order them in git log). And, since it's symmetric, the result is the same regardless of the order of operands.
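Since the symmetric form mixes commits from both sides, it can be handy to know which side each one came from. The --left-right option to git log marks each commit with < or > accordingly. A sketch, again using a throwaway repository shaped like the example with empty commits:

```shell
# Same shape as the example: two commits only on master, one only on other
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
git commit -q --allow-empty -m "First commit"
git branch other
git commit -q --allow-empty -m "Second commit"
git commit -q --allow-empty -m "Third commit"
git checkout -q other
git commit -q --allow-empty -m "Adding file"

# '<' marks commits reachable only from the left operand (master),
# '>' marks commits reachable only from the right operand (other)
git log --oneline --left-right master...other
```

Swapping the operands selects the same commits, but swaps the markers.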

Summary

Git allows you to specify single commits as well as a range of commits using a number of git revision identifiers, and these can be used to find out the state of and differences between trees. In this post, we covered the parent operator (^), the ancestor operator (~), as well as ranges with asymmetric and symmetric differences.


Come back next week for another instalment in the Git Tip of the Week series.

Tuesday, May 17, 2011

Git Tip of the Week: Reflogs


This week's Git Tip of the Week is about recovering work with the reflog. You can subscribe to the feed if you want to receive new instalments automatically.


Reflogs

The Git reflog is very different from the similarly named Hg revlog. Hg revlogs are equivalent to CVS's ,v files. (Actually, they're RCS's, but never mind that now ...)

A Git reflog is a list of hashes, which represent where you have been during commits. Each time a branch is updated to point to a new reference, an entry is written in the reflog to say where you were. Since the branch is updated whenever you commit, the git reflog has a nice effect of storing your local developer's history.

Furthermore, the pointer in the reflog points to a commit object, which in turn points to a tree object, which represents a directory-like structure of folders and files. So whilst the reflog is active, you can go back and see what changes you have made – and even recover specific files from previous commit versions. With Git, you never really lose anything; even if you've done a filter-branch to re-write history, you're only a reflog entry away from getting it all back.

To see what the reflog is all about, run git reflog from an active Git repository. It might look something like this:


9bdbd83 HEAD@{0}: commit: Adding  build script
86a7a39 HEAD@{1}: commit (amend): Updating commit message
325e0af HEAD@{2}: commit (amend): Example Project (fix typo)
06bf85e HEAD@{3}: commit (initial): Example project

The first number is simply the commit hash at the point the change was made. Even though these don't represent linear history (you'll see a couple of (amend) listed there), these are the sequence of actions taken on the local repository, in the order they were done.

The second is the reflog reference: HEAD, along with how many changes ago it was. In this case, we have HEAD@{0}, which means where HEAD is now; HEAD@{1} is where HEAD was previously, and so on.

The final part is the type; whether it is a commit or an amended commit, and the commit subject. This is often helpful to remember where the code was, especially if it isn't part of the linear history. It also contains other operations, such as checkout, merge and reset:


abec02f HEAD@{0}: merge foo: Merge made by recursive.
9bdbd83 HEAD@{1}: 9bdbd83: updating HEAD
2d90ece HEAD@{2}: merge foo: Fast-forward
9bdbd83 HEAD@{3}: checkout: moving from foo to master
2d90ece HEAD@{4}: commit: hello
9bdbd83 HEAD@{5}: checkout: moving from master to foo

Reflog references

Most git commands accept a number of different references to point to a commit. For example, you can run git checkout master, git checkout abec02f and git checkout mytag. However, you can also check out references by reflog.

In the example above, we can run git checkout 2d90ece, or we can refer to it as git checkout HEAD@{2}. Provided we haven't committed anything else (which would change the reflog), these two variations have the same effect.

You can use this to implement a crude form of undo:


git config --global alias.undo  "reset HEAD@{1}"

This will cause you to revert to the previous action (whether it was a commit or otherwise).
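Here is a sketch of the alias in action against a throwaway repository. It sets the alias in the repository's own config rather than with --global, so it won't touch your real settings; the commits and identity are illustrative.

```shell
# Throwaway repository with two (empty) commits
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example" && git config user.email "example@example.invalid"
git commit -q --allow-empty -m "First commit"
git commit -q --allow-empty -m "Second commit"

git config alias.undo "reset HEAD@{1}"   # local to this repository only

git undo                                 # HEAD moves back to where it was before
after_undo=$(git log -1 --format=%s)
git undo                                 # the reset is itself a reflog entry...
after_redo=$(git log -1 --format=%s)     # ...so undoing twice acts as a redo
echo "$after_undo / $after_redo"
git reflog
```

Because the reset is recorded in the reflog as well, nothing is lost by undoing: the commit you moved away from is still one reflog entry away.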

Reflog references revisited

Although we've been using HEAD here, reflogs are more general than just HEAD. The general representation is name@{qualifier}.

In fact, stashes (covered last week) are a specific form of reflog, whose name is stash. Not only that, but other branches can be referred to by their reflog as well.

All of the branch reflogs are stored under .git/logs/refs/heads/. (There's also one under .git/logs/HEAD, as well as .git/logs/refs/stash if you've used a stash before.)

We can identify reflogs for a specific branch as well:


$ git reflog show master
abec02f master@{0}: merge foo: Merge made by recursive.
9bdbd83 master@{1}: 9bdbd83: updating HEAD
2d90ece master@{2}: merge foo: Fast-forward
9bdbd83 master@{3}: commit: hello
$ git reflog show foo
2d90ece foo@{0}: commit: hello
9bdbd83 foo@{1}: branch: Created from HEAD

Since all of these are valid git references, we can perform diffs against them, e.g. git diff foo@{0} foo@{1}.

Timed reflogs

Since each reflog entry has a time associated with it, you can select entries not only by position in the history, but also by time. Various supported forms include:

  • 1.minute.ago
  • 1.hour.ago
  • 1.day.ago
  • yesterday
  • 1.week.ago
  • 1.month.ago
  • 1.year.ago
  • 2011-05-17.09:00:00

The plural forms are also accepted (e.g. 2.weeks.ago) as well as combinations (e.g. 1.day.2.hours.ago).

The time format is most useful if you want to get back to a branch's state as of an hour ago, or want to see what the differences were in the last hour (e.g. git diff @{1.hour.ago}). Note that if the branch name is missing, then the current branch is assumed (so @{1.hour.ago} refers to master@{1.hour.ago} if on the branch master).

Summary

Git never really loses anything, even if you perform filter branching or commit amending. The commit, tree and contents are still stored in the repository and the reflog still maintains pointers to those previous commits.

Each time you update a branch, Git records the previous position in a reflog, which is stored both against HEAD and against the particular branch (with a special reflog for stashes).

The reflogs stay until expired (which can be done with the git reflog expire command). The default for unreachable commits is 30 days (or the gc.reflogExpireUnreachable config value) or, for reachable commits, 90 days (or the gc.reflogExpire config value).
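As a sketch, the two settings can be written explicitly on a per-repository basis like this. The values shown simply restate the defaults mentioned above, and the repository is a throwaway one.

```shell
# Throwaway repository; the values below just restate the documented defaults
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example" && git config user.email "example@example.invalid"
git commit -q --allow-empty -m "First commit"

git config gc.reflogExpire "90 days"              # reachable entries
git config gc.reflogExpireUnreachable "30 days"   # unreachable entries
git config --get gc.reflogExpire
git config --get gc.reflogExpireUnreachable

# To throw reflog entries away immediately instead (destructive: this
# removes the safety net described in this post), you could run:
#   git reflog expire --expire=now --all
```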


Come back next week for another instalment in the Git Tip of the Week series.

Tuesday, May 10, 2011

Git, Gerrit and Jenkins for iOS development


Last week, I gave a presentation at the London iOS Developer Group meeting at the Apple Store in Regent Street, London. The purpose of the talk was to cover the points I'd made previously in Someday... and the subsequent Git, Gerrit and Jenkins presentation. Links to resources are available from the Lanyrd site.

Since these were both Java specific, and I gave a short talk at NSConf earlier this year, I wanted to do something more iOS specific. The resulting demonstration evolved into the one seen here.

This video is an audio dub of the same presentation material, but re-recorded for the purposes of vimeo. Enjoy!

Git, Gerrit and Jenkins for iOS development from Alex Blewitt on Vimeo.

I have added this to the Git Tip of the Week feed, for those subscribed, as a departure from the regular Tuesday series.

Git Tip of the Week: Stashes


This week's Git Tip of the Week is about keeping work in progress safe, known as stashing. You can subscribe to the feed if you want to receive new instalments automatically.


Drop everything bugs

Sometimes, when you are working on a problem you need to drop it suddenly in order to work on a different problem, like a critical bug that has been reported against a production version of your code.

Whilst you could easily create a new checkout of your project with the specific branch, it's faster to switch branches in Git natively, and saves setting up other tools like an IDE or a test server.

The problem is, your work may not be in a fit state to commit at the time the bug comes in. Rather than committing a half working state, you ideally want to save the work in progress (including dirty, uncommitted files), to permit you to switch to the higher priority issue, and resume when necessary.

Git Stash

This is where git stash comes in handy. This takes a working tree and files it away in a location which can be retrieved at a later stage. The stash can then be popped to get the changes back as they were prior to the stash taking place.

To create a stash, just run the git stash command. You can see what stashes are present with git stash list:

$ git stash list
$ touch example
$ git add example
$ git stash
Saved working directory and index state WIP on master: dc3ea83 
HEAD is now at dc3ea83
$ git stash list
stash@{0}: WIP on master: dc3ea83
$ ls example
$ # do emergency bugfix here, and afterwards
$ git stash pop # or git stash apply
$ ls example
example

Here, we created a new stash which contained the example file. Once we'd done the stash, we were reset to the current commit (i.e. a clean workspace), so the example file is no longer present.

We can list the available stashes with git stash list, which gives them their identifier (in this case, stash@{0}) which can be used to identify the changes. (It defaults to the last stash if not specified.)

The stash identifier also includes what branch the stash was created from. Importantly, once the stash has been created, applying (or popping) the stash results in just the differences being added, rather than resetting back to a specific previous state. This allows subsequent changes to be made on the branch, followed by replaying the stash changes on the current version of the branch afterwards.
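The replaying behaviour is easiest to see with a small experiment. This sketch (throwaway repository; file names are illustrative) stashes some work, commits an "emergency fix" on top of the branch, and then pops the stash onto the newer commit:

```shell
# Throwaway repository with one commit to stash against
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example" && git config user.email "example@example.invalid"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Base"

echo "work in progress" > wip.txt
git add wip.txt
git stash -q                      # wip.txt is filed away; the workspace is clean

echo "fix" > fix.txt              # the emergency bugfix
git add fix.txt
git commit -q -m "Emergency fix"

git stash pop -q                  # replay the stashed changes on the new commit
ls                                # fix.txt, notes.txt and wip.txt are all present
```

The fix isn't lost when the stash comes back, because only the stashed differences are applied on top of the current state.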

Git stash can be used to avoid merge conflicts with work-in-progress code. Simply stash before doing a pull, and unstash afterwards, as the example in the manpage shows:

$ git pull
...
file foobar not up to date, cannot merge.
$ git stash
$ git pull
$ git stash pop

Notes

  • The stash only applies to files Git already knows about (tracked or added); it doesn't apply to untracked files. If you want to stash untracked files as well, you need to run git add --all prior to running git stash.

  • The WIP stands for Work In Progress, since it's not obvious to non-native English speakers.

  • git stash pop will remove the stash from the list; git stash apply will keep the stash in the list

  • The {0} represents the last stash you did, the {1} represents the second to last, and so on. This syntax also turns up elsewhere (e.g. reflogs).

  • Individual stashes can be removed with git stash drop, or git stash clear to get rid of all of them.

  • Git stashes are stored as commit objects, but the branch HEAD isn't updated to point to them. Instead, they are referenced from reflog entries in .git/logs/refs/stash. You can use git show stash@{0} to see a stash.
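The note about untracked files is worth trying for yourself, since it tends to catch people out. A minimal sketch with a throwaway repository (file names are illustrative):

```shell
# Throwaway repository with one tracked (empty) file
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example" && git config user.email "example@example.invalid"
touch tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

echo "change" >> tracked.txt      # modification to a tracked file
touch untracked.txt               # a file Git has never been told about
git stash -q

git status --porcelain            # only untracked.txt is reported, as '??'
git show --stat stash@{0}         # the stash holds the change to tracked.txt
```

The modification to tracked.txt is filed away, but untracked.txt is left sitting in the working directory.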


Come back next week for another instalment in the Git Tip of the Week series.

Tuesday, May 03, 2011

Git Tip of the Week: Gollum


This week's Git Tip of the Week is slightly off topic, talking about the Git-based wiki Gollum. You can subscribe to the feed if you want to receive new instalments automatically.


Wikis

Wikis have been with us for over 15 years, and being able to edit a page from any location has been a massive timesaver in documentation generation. Most of the EGit User guide has been created through the use of wiki contributions.

One problem with most wikis is that you need to be on-line in order to interact with them. (There are some clients that cache pages from Wikipedia for off-line reading, but these are not able to manage changes.) Wouldn't it be great if we could have a system that allowed us to edit wikis off-line, and merge our changes back in when we are connected again?

Gollum

Enter Gollum from those fine people at GitHub. This is a wiki server implemented in Ruby which allows you to view, edit and save documents into a wiki on your local machine. (It's been around for a while but is not widely known about, for some reason.)

As it's from GitHub (and this post is about Git tips, after all) then it should come as no surprise that Gollum is a wiki server for wiki pages backed by a git repository. Each save corresponds to an individual commit in the repository, and the “write a change message” box at the bottom is translated to the Git commit message.

Each wiki page is translated to a file in the directory specified by --page-file-dir (or the repository root, if not set). Furthermore, the markup is user-configurable and defaults based on extension type (the default for new pages is Markdown). Other sane wiki formats (like MediaWiki) are also supported, though insane formats (like Confluence) are not. In fact, the multi-wiki-format is supported through Jekyll, so whatever formats are supported there are likely to be usable.

Up and running

Installing Gollum is easy, provided you have Ruby installed (which you do if you have OSX). You can run:

sudo gem install gollum

… which will install it in your system's path automatically, or without sudo to install in a per-user path (which on OSX is in ~/.gem/ruby/1.8/bin, which you'll have to add to your path if you want to run it from the command line).

Once installed, you can create a repository, fire up Gollum, open a web browser, and you're off:

git init TestWiki
gollum --page-file-dir wiki TestWiki
open http://localhost:4567

This creates a new wiki (for test purposes) and fires up the Gollum server, pointing it to that Git repository. We have specified wiki as the subdirectory, so that when we commit a file, we're writing it into TestWiki/wiki/PageName.md.

Formats

There are many different types of formats that are supported by Gollum, but the parsers have to be installed separately in order to see any wiki content rendered as you expect.

  • Markdown gem install rdiscount
  • MediaWiki gem install wikicloth

If you want to pretty-print code, you can use Pygments with sudo easy_install pygments. This allows you to begin code with ```java and end with ``` to pretty-print the embedded code snippet.

So, if you try editing a page and the markup isn't being rendered appropriately, check that you have the appropriate renderer installed in order to work.

Identity

Git commits in the repository use the credentials that are associated with the user who launched Gollum. Whilst this works for open projects, if you want to have a recorded user from some kind of SSO, you'll need to integrate this. Gollum uses the Ruby web framework Sinatra to generate the web-based front end, and this is used to determine whether authentication is used or not (see the FAQ).

However, it doesn't support passthrough of the committer's identity into the commits, or using the GIT_AUTHOR_EMAIL or GIT_COMMITTER_EMAIL variables (though this is an issue with Grit, the Ruby front-end to the Git repository). This, combined with the lack of multi-project support, somewhat limits it for production uses, but it works fine for a local (single user) wiki.

Summary

Gollum provides a powerful web-based mechanism to edit wiki pages in a local git repository, using the local user's commit credentials. This allows a distributed wiki to be edited remotely (whilst disconnected) but managed by Git under the covers, including the ability to branch, tag, and distributed pushing.

Whilst it's lacking some key features that would make it a distributed wiki front end for teams of users, for an anonymous wiki (or one where the committer credentials are of lesser importance) it's an incredibly easy system to get up and running. And for those that regularly write whilst disconnected, it can be a good way to build up a repository of information without needing to be connected, yet still allow those changes to be merged into a repository when reconnected.


Come back next week for another instalment in the Git Tip of the Week series.