[Attempto] ambiguous prefixes in APE

Kaarel Kaljurand kaljurand at gmail.com
Fri Jun 25 17:32:34 CEST 2010


Hi,

On Fri, Jun 25, 2010 at 15:10, Norbert E. Fuchs <fuchs at ifi.uzh.ch> wrote:
>
> On 24 Jun 2010, at 17:37, Jean-Marc Vanel wrote:
>
>> I'm running this, after installing the clex-6.5-090528.zip. By the
>> way, why isn't it part of the standard APE distribution ?
>
> Some users of APE may want to use their own lexicon instead of clex, which is a relatively large download.

Another reason is that Clex is distributed under GPL, while APE is under LGPL.
I guess we wanted to have a single license per package.

>> ./ape.exe -guess -text 'Every material that contains some cement and
>> some n:aggregate is some concrete.' -solo drspp
>>
>> Here aggregate is considered as countable by APE. However, I intend
>> this WordNet synset :
>>
>> 2. aggregate -- (material such as sand or gravel used with cement and
>> water to make concrete, mortar, or plaster)
>>
>> I would like to say something like:
>> some m:aggregate
>> to flag a mass noun. But this does not work. Is there another way?
>
> Here the problem is that "aggregate" occurs in clex as a countable noun. Perhaps prefixing it with "some n: " should allow users to redefine it as a mass noun. I do not know whether the current situation should be called a feature or a bug. For a solution see below.
>
> Countable and mass nouns are distinguished by their determiners. You can write, for instance, "John has some water. Mary has a water." where both meanings are in clex. Thus a prefix "m" does not seem to be necessary. Well, this is not quite true since the determiner "no" is ambiguous between countable and mass, for instance in "no water", which always gets a countable interpretation.
>

For a prefixed noun such as "some n:aggregate", the choice between
"plural countable" and "mass" depends on the context. If the context
does not disambiguate, then "plural countable" is chosen. Consider the
following examples:

$ ./ape.exe -guess -text 'Some n:aggregate waits.' -solo drspp
[A, B]
object(A, aggregate, mass, na, na, na)-1/4
predicate(B, wait, A)-1/5

"mass" is chosen because the singular verb needs to agree with the subject.

$ ./ape.exe -guess -text 'Some n:aggregate wait.' -solo drspp
[A, B]
object(A, aggregate, countable, na, geq, 2)-1/4
predicate(B, wait, A)-1/5

"plural countable" is chosen because the plural verb needs to agree
with the subject.

$ ./ape.exe -guess -text 'John sees some n:aggregate.' -solo drspp
[A, B]
object(A, aggregate, countable, na, geq, 2)-1/6
predicate(B, see, named(John), A)-1/2

Here the noun is in the object position, so no agreement is required and
there is no disambiguating information. "plural countable" is chosen
simply because, in the parser, the rule that chooses this reading comes
before the rule that would choose the mass reading.
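
As a rough illustration of this rule-ordering effect, here is a small
Python sketch. It is not APE's actual Prolog grammar: the function name,
the 'sg'/'pl' encoding and the list of readings are made up for this
example. It only shows how "first matching rule wins" yields the three
results above.

```python
from typing import Optional

# Hypothetical candidate readings for "some n:X", in the order the
# parser is assumed to try them: plural countable first, then mass.
# Each reading is paired with the verb number it would agree with.
READINGS = [
    ("plural countable", "pl"),
    ("mass", "sg"),
]


def choose_reading(verb_number: Optional[str]) -> str:
    """Return the first reading compatible with the agreement constraint.

    verb_number is 'sg' or 'pl' when the noun is a subject that must
    agree with the verb, and None when no agreement applies (e.g. the
    noun is in object position).
    """
    for reading, required_number in READINGS:
        # With no agreement constraint every reading is acceptable,
        # so the first one in the list wins.
        if verb_number is None or verb_number == required_number:
            return reading
    raise ValueError(f"unknown verb number: {verb_number!r}")


if __name__ == "__main__":
    print(choose_reading("sg"))   # 'Some n:aggregate waits.'
    print(choose_reading("pl"))   # 'Some n:aggregate wait.'
    print(choose_reading(None))   # 'John sees some n:aggregate.'
```

Swapping the two entries in READINGS would make "mass" the default for
the object-position case, which is why the outcome there is an artifact
of ordering rather than a linguistic decision.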

The bottom line is that prefixes cannot currently be used as a replacement
for the (user) lexicon. Certain fine distinctions are not possible using
prefixes. Note also that prefixes cannot be used to specify the lemma
of a surface form either.

Prefixes were never intended as an alternative to the user lexicon.
The goal was rather to give the APE developer a quick way of testing.

It would be good to have a clear-cut solution regarding the prefixes.
Either we remove them from the next version of APE to avoid
confusion (note that they are not part of the ACE language anyway),
or we upgrade them so that they become as expressive as the user lexicon.
I'm not sure which option is preferable...

--
kaarel

