[Attempto] User lexicons

Kaarel Kaljurand kaljurand at gmail.com
Fri Apr 3 19:03:14 CEST 2009


Hi,

On Fri, Apr 3, 2009 at 4:53 PM, Danica Damljanovic
<danica.damljanovic at gmail.com> wrote:
> I have also used this lexicon a bit, and was wondering if you use some real
> dictionary to create plural and past participle,
> or is this implemented as a set of rules such as lemma+s for plural or
> lemma+n for p.participle, etc.?

The automatic guessing is not done using a dictionary. ACE View uses
the lexicon/morphology system of the SimpleNLG package to do the guessing.
This system is based on morphg, see:

* http://www.csd.abdn.ac.uk/~ereiter/simplenlg/
* http://www.informatics.sussex.ac.uk/research/groups/nlp/carroll/morph.html

I don't really know the details of how morphg works, but it seems to be based
on rules plus a list of exceptional words. My impression is that it covers
English morphology quite well.
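
To illustrate what "rules + a list of exceptional words" can mean in practice,
here is a rough Python sketch of plural guessing. It is only an illustration:
the names IRREGULAR_PLURALS and guess_plural are made up for this example, and
the rules are a simplification, not the actual morphg or SimpleNLG logic.

```python
# Hypothetical illustration only: NOT the morphg or SimpleNLG implementation.
# A tiny exception list is consulted first; otherwise suffix rules apply.
IRREGULAR_PLURALS = {"man": "men", "child": "children", "mouse": "mice"}


def guess_plural(lemma: str) -> str:
    """Guess the plural of a noun lemma using exceptions, then simple rules."""
    if lemma in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[lemma]
    # Sibilant endings take "es" (box -> boxes, church -> churches).
    if lemma.endswith(("s", "x", "z", "ch", "sh")):
        return lemma + "es"
    # Consonant + "y" becomes "ies" (city -> cities, but day -> days).
    if len(lemma) > 1 and lemma.endswith("y") and lemma[-2] not in "aeiou":
        return lemma[:-1] + "ies"
    # Default rule: append "s". Non-words such as "hasEmail" end up here.
    return lemma + "s"


if __name__ == "__main__":
    for word in ["child", "box", "city", "day", "hasEmail"]:
        print(word, "->", guess_plural(word))
```

A guesser of this kind has no way of knowing that a lemma is not a real English
word, so it simply falls through to the default rule.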

> The reason I am asking is that for some properties which are actually
> verb+noun, for example, hasEmail, it added 's' and 'ed' for pl and vbg
> respectively.

The generation of surface forms works well if the input lemmas are actual
English words. If they are combinations of words (such as hasEmail), then the
outcome is of course less natural.

I've extended the SimpleNLG Verb class to handle verb + preposition combinations
because such forms are supported by ACE. E.g. if you introduce an object
property "work-for", then the following forms are automatically generated:

works-for, work-for, worked-for

and you can then write sentences like:

John works-for IBM. (rather than "John work-fors IBM.")
IBM is worked-for by John.

Note that this works only if the parts are hyphenated, and not
camelCased, i.e. "worksFor" would not be handled correctly.

This extension is implemented like this:

http://code.google.com/p/aceview/source/browse/trunk/src/ch/uzh/ifi/attempto/ace/ACEVerb.java
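
The general idea can be sketched in a few lines of Python. This is a simplified
illustration under my own assumptions, not a port of ACEVerb.java: the names
inflect_plain, inflect_ace_verb, IRREGULAR_VERBS and PREPOSITIONS are made up
here, the preposition list is an arbitrary sample, and the plain-verb rules are
much cruder than what SimpleNLG does.

```python
# Hypothetical sketch: NOT the actual ACEVerb.java logic.
# Assumption: a lemma is treated as verb + preposition only if it contains
# a hyphen and the part after the first hyphen is a known preposition.
IRREGULAR_VERBS = {"have": ("has", "had"), "go": ("goes", "gone")}
PREPOSITIONS = {"for", "in", "on", "at", "of", "to", "with", "by", "from"}


def inflect_plain(verb: str) -> tuple[str, str]:
    """Return (third person singular, past participle) using naive rules."""
    if verb in IRREGULAR_VERBS:
        return IRREGULAR_VERBS[verb]
    # Naive rules; cases such as carry -> carries are not handled here.
    if verb.endswith(("s", "x", "z", "ch", "sh")):
        third = verb + "es"
    else:
        third = verb + "s"
    past = verb + "d" if verb.endswith("e") else verb + "ed"
    return third, past


def inflect_ace_verb(lemma: str) -> tuple[str, str]:
    """Inflect only the verb part of a hyphenated verb + preposition lemma."""
    head, sep, rest = lemma.partition("-")
    if sep and rest in PREPOSITIONS:
        third, past = inflect_plain(head)
        return third + sep + rest, past + sep + rest
    # No hyphen + preposition: the whole lemma is treated as one verb.
    return inflect_plain(lemma)


if __name__ == "__main__":
    for lemma in ["work-for", "worksFor", "hasEmail"]:
        print(lemma, "->", inflect_ace_verb(lemma))
```

With these rules "work-for" gives "works-for" and "worked-for", while a
camelCased lemma like "worksFor" is inflected as a single word.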

This extension does not cover combinations of auxiliary verb + noun, such
as hasEmail or has-email. Such "lemmas" are treated as ordinary verb lemmas,
and SimpleNLG applies its default generation (i.e. add an "s", add an "ed").

> Then I was wondering how to edit automatically created pl and
> vbg, as could not estimate how these are used by ACE.

You can edit all the forms in the "lexicon view". But in the case of "hasEmail"
it seems impossible to come up with a form that would look nice in the
eventual ACE sentences. So I would recommend using Saxon genitives
or of-constructs in this case, i.e.

instead of

John hasEmail "john at mail.com".

use

John's email is "john at mail.com".

or

"john at mail.com" is an email-address of John.

> Is there any 'rule of thumb' on how to handle these kind of properties i.e.
> how to add them to the lexicon in order to make them 'understandable' by
> ACE? Maybe just editing the verb part, so adding 'have' 'has' 'had' ?

This is possible but, as I said, it might result in unreadable ACE sentences.
The goal in the end is to express ontologies in such a way that they
are understandable to an English speaker (with no background in logic).
So I would try to use Saxon genitives more.

The problem is that ACE View has no particular support for Saxon genitives,
i.e. nouns used as object and data properties. This still needs to be
improved in ACE View.

Another word of warning: currently, if you change a word form (in the
"lexicon view"), the existing snippets that use this form are NOT
automatically changed to use the new form. So that's another thing that
needs to be fixed.

--
kaarel

