ch.uzh.ifi.attempto.chartparser
Class Grammar

java.lang.Object
  extended by ch.uzh.ifi.attempto.chartparser.Grammar
Direct Known Subclasses:
ExampleGrammar, StandardGrammar

public class Grammar
extends java.lang.Object

This class represents a grammar that is needed to run the chart parser. A grammar can be created either directly in Java or on the basis of a file in the ACGN format.

ACGN Format

ACGN stands for "Attempto Chartparser Grammar Notation" and uses Prolog notation to provide a compact, readable grammar representation. Simple grammar rules in ACGN look almost the same as common Prolog DCG rules; the only difference is that the operator "-->" is replaced by "=>":
 vp => v, np.
 v => [does, not], verb.
Complex grammar rules in ACGN differ from common Prolog DCG rules in that they use named features rather than arguments with fixed positions. Arguments are recognized not by their position but by their name:
 vp(num:Num,neg:Neg) => v(num:Num,neg:Neg,type:tr), np(case:acc).
 v(neg:plus,type:Type) => [does, not], verb(type:Type).
Every feature has the form Name:Value where Name has to be an atom and Value can be a variable or an atom (but not a compound term).
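The name-based matching described above can be sketched in a few lines of Java. This is a minimal illustration and not the chart parser's implementation: the class FeatureUnificationSketch and its methods are hypothetical, values are plain strings, a value starting with an upper-case letter or underscore is treated as a variable (Prolog convention), both feature sets are assumed to share one variable namespace, and a feature that is missing on one side is assumed to be unconstrained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names belong to the chartparser package.
public class FeatureUnificationSketch {

    // Builds a feature set from alternating name/value arguments.
    static Map<String, String> features(String... pairs) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            map.put(pairs[i], pairs[i + 1]);
        }
        return map;
    }

    // Assumption: values are non-empty; variables start with an
    // upper-case letter or an underscore, everything else is an atom.
    static boolean isVariable(String value) {
        char c = value.charAt(0);
        return Character.isUpperCase(c) || c == '_';
    }

    // Follows variable bindings until an atom or an unbound variable is reached.
    static String resolve(String value, Map<String, String> bindings) {
        while (isVariable(value) && bindings.containsKey(value)) {
            value = bindings.get(value);
        }
        return value;
    }

    // Unifies two feature sets by feature name.
    // Returns the variable bindings, or null if two different atoms clash.
    static Map<String, String> unify(Map<String, String> a, Map<String, String> b) {
        Map<String, String> bindings = new HashMap<>();
        for (Map.Entry<String, String> entry : a.entrySet()) {
            String other = b.get(entry.getKey());
            if (other == null) {
                continue; // feature absent on the other side: assumed unconstrained
            }
            String x = resolve(entry.getValue(), bindings);
            String y = resolve(other, bindings);
            if (x.equals(y)) {
                continue;
            }
            if (isVariable(x)) {
                bindings.put(x, y);
            } else if (isVariable(y)) {
                bindings.put(y, x);
            } else {
                return null; // two different atoms cannot be unified
            }
        }
        return bindings;
    }

    public static void main(String[] args) {
        // v(num:Num,neg:Neg,type:tr) against v(neg:plus,type:Type)
        Map<String, String> body = features("num", "Num", "neg", "Neg", "type", "tr");
        Map<String, String> head = features("neg", "plus", "type", "Type");
        System.out.println(unify(body, head));

        // neg:plus against neg:minus cannot be unified
        System.out.println(unify(features("neg", "plus"), features("neg", "minus")));
    }
}
```

In the first call, the order of the features in either set does not matter, which is the point of using names instead of positions.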

ACGN provides special support for anaphoric references which are used in (controlled) natural languages to refer to objects earlier in the sentence. For example, in the sentence

A country contains an area that is not controlled by the country.
the anaphoric reference "the country" refers to the antecedent "a country". Anaphoric references should be introduced only if the previous text contains a matching antecedent that is accessible. For example, in the case of the partial sentence
A country does not contain a river and borders ...
one can refer to "a country", but not to "a river" because being in the scope of a negation makes it inaccessible.

In order to define the accessibility information needed for anaphoric references in a declarative way, we distinguish two types of grammar rules: accessible rules "=>" and inaccessible rules "~>". The following example shows an inaccessible rule:

 vp(num:Num,neg:plus) ~> v(num:Num,neg:plus,type:tr), np(case:acc).
Inaccessible rules are handled in the same way as accessible rules, with the sole exception that the components within the scope of the rule are not accessible for subsequent anaphoric references.

This can be visualized by the introduction of a special node "~" in the syntax tree whenever an inaccessible rule is used. For the partial sentence introduced before, the syntax tree could look as follows:

[Figure: example syntax tree]

In this case, several accessible rules and exactly one inaccessible rule (indicated by the "~"-node) have been used. All preceding components that can be reached through the syntax tree without traversing a "~"-node in the top-down direction are accessible. Thus, "a country" is accessible from the position "*", but "a river" is not. Furthermore, "a country" would be accessible from the position of "a river" because the "~"-node is in this case traversed only in the bottom-up direction.

The described procedure allows us to determine all possible anaphoric references that can be used to continue a partial sentence. In our example, one can refer only to "a country". The concept of accessible and inaccessible rules is a simple but powerful instrument to define in a declarative way the accessibility constraints for anaphoric references.
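The traversal rule can be illustrated with a small self-contained Java sketch. It is not how the ChartParser class computes accessibility; the class AccessibilitySketch, its Node type, and the shape of the example tree are assumptions made for illustration (the tree is one plausible analysis of the partial sentence above). The sketch walks from the root to the target position and, at every node on that path, collects the leaves of the preceding siblings top-down while refusing to enter a "~"-node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names belong to the chartparser package.
public class AccessibilitySketch {

    static class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() { return children.isEmpty(); }
        boolean isScopeMarker() { return label.equals("~"); }
    }

    // Collects the leaves below n in the top-down direction, never entering a "~"-node.
    static void collectTopDown(Node n, List<String> out) {
        if (n.isScopeMarker()) return;
        if (n.isLeaf()) { out.add(n.label); return; }
        for (Node child : n.children) collectTopDown(child, out);
    }

    // Returns true if target lies in the subtree of n. On the path to the target,
    // the leaves of all preceding siblings are collected. Nodes on the path itself
    // are only traversed bottom-up, so a "~"-node on the path does not block anything.
    static boolean visit(Node n, Node target, List<String> out) {
        if (n == target) return true;
        for (int i = 0; i < n.children.size(); i++) {
            List<String> deeper = new ArrayList<>();
            if (visit(n.children.get(i), target, deeper)) {
                for (int j = 0; j < i; j++) collectTopDown(n.children.get(j), out);
                out.addAll(deeper);
                return true;
            }
        }
        return false;
    }

    static List<String> accessibleFrom(Node root, Node target) {
        List<String> out = new ArrayList<>();
        visit(root, target, out);
        return out;
    }

    // Finds the first leaf with the given label, or null if there is none.
    static Node find(Node n, String label) {
        if (n.isLeaf()) return n.label.equals(label) ? n : null;
        for (Node child : n.children) {
            Node hit = find(child, label);
            if (hit != null) return hit;
        }
        return null;
    }

    // Assumed tree for "A country does not contain a river and borders *".
    static Node exampleTree() {
        Node negated = new Node("~",
            new Node("does not"), new Node("contain"),
            new Node("np", new Node("a river")));
        return new Node("s",
            new Node("np", new Node("a country")),
            new Node("vp_coord",
                new Node("vp", negated),
                new Node("and"),
                new Node("vp", new Node("borders"), new Node("*"))));
    }

    public static void main(String[] args) {
        Node root = exampleTree();
        System.out.println(accessibleFrom(root, find(root, "*")));
        System.out.println(accessibleFrom(root, find(root, "a river")));
    }
}
```

The sketch returns every preceding leaf that is reachable, not only noun phrases; filtering for actual antecedents would be a separate step.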

The information about which tokens are accessible for anaphoric references can be retrieved by the method getAccessiblePositions of the ChartParser class.

Transformations

ACGN grammars can be translated automatically into a Java class or into a Prolog DCG using the SWI-Prolog programs "generate_java.pl" and "generate_dcg.pl", respectively. These programs can be found in the directory "src/ch/uzh/ifi/attempto/chartparser/util" of the source code of this package. The Java class can be generated like this:
 swipl -s generate_java.pl -g "generate_java('my_acgn_grammar.pl', 'my.package', 'MyJavaGrammar', 'my_start_category')" -t halt
Note that the SWI-Prolog command might be different on your machine (e.g. "plcon" or "pl"). The Prolog DCG file can be generated like this:
 swipl -s generate_dcg.pl -g "generate_dcg('my_acgn_grammar.pl', 'my_dcg_grammar.pl')" -t halt
Note that the distinction between accessible and inaccessible rules is lost in the Prolog DCG file.

Author:
Tobias Kuhn

Constructor Summary
Grammar(Nonterminal startCategory)
          Creates an empty grammar with the given start category.
Grammar(java.lang.String startCategoryName)
          Creates an empty grammar with a start category of the given name.
 
Method Summary
 void addRule(Rule rule)
          Adds the rule to the grammar.
 java.util.ArrayList<Rule> getEpsilonRules()
          Returns all the rules that have no body categories.
 java.util.ArrayList<Rule> getRulesByHeadName(java.lang.String name)
          Returns the rules whose head category has the given name.
 Nonterminal getStartCategory()
          Returns the start category.
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Grammar

public Grammar(Nonterminal startCategory)
Creates an empty grammar with the given start category.

Parameters:
startCategory - The start category for the grammar.

Grammar

public Grammar(java.lang.String startCategoryName)
Creates an empty grammar with a start category of the given name.

Parameters:
startCategoryName - The name of the start category for the grammar.

Method Detail

getStartCategory

public Nonterminal getStartCategory()
Returns the start category.

Returns:
The start category.

addRule

public void addRule(Rule rule)
Adds the rule to the grammar.

Parameters:
rule - The rule to be added.

getRulesByHeadName

public java.util.ArrayList<Rule> getRulesByHeadName(java.lang.String name)
Returns the rules whose head category has the given name.

Parameters:
name - The name of the head category.
Returns:
A list of rules.

getEpsilonRules

public java.util.ArrayList<Rule> getEpsilonRules()
Returns all the rules that have no body categories.

Returns:
A list of rules.
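To make the two lookup methods more concrete, the following sketch shows one way a rule store with these operations could be organized. It is a stand-in written for illustration, not the source of this class: GrammarIndexSketch and SimpleRule are hypothetical, categories are reduced to plain names without features, and returning an empty list for an unknown head name is an assumption (the documentation above does not say what the real method returns in that case).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; this is not the Grammar class of the chartparser package.
public class GrammarIndexSketch {

    // Simplified rule: a head name and the names of the body categories.
    static class SimpleRule {
        final String head;
        final List<String> body;

        SimpleRule(String head, String... body) {
            this.head = head;
            this.body = Arrays.asList(body);
        }
    }

    // Rules indexed by the name of their head category.
    private final Map<String, List<SimpleRule>> rulesByHead = new HashMap<>();
    // Rules with an empty body, kept separately for quick access.
    private final List<SimpleRule> epsilonRules = new ArrayList<>();

    void addRule(SimpleRule rule) {
        rulesByHead.computeIfAbsent(rule.head, k -> new ArrayList<>()).add(rule);
        if (rule.body.isEmpty()) {
            epsilonRules.add(rule);
        }
    }

    // Assumption: an unknown head name yields an empty list rather than null.
    List<SimpleRule> getRulesByHeadName(String name) {
        return rulesByHead.getOrDefault(name, new ArrayList<>());
    }

    List<SimpleRule> getEpsilonRules() {
        return epsilonRules;
    }

    public static void main(String[] args) {
        GrammarIndexSketch grammar = new GrammarIndexSketch();
        grammar.addRule(new SimpleRule("vp", "v", "np"));  // vp => v, np.
        grammar.addRule(new SimpleRule("vp", "v"));        // vp => v.
        grammar.addRule(new SimpleRule("opt_adv"));        // rule without body categories
        System.out.println(grammar.getRulesByHeadName("vp").size());
        System.out.println(grammar.getEpsilonRules().size());
    }
}
```

Indexing by head name at insertion time keeps both lookups cheap, which matters because a chart parser typically queries the rules for a category many times during a parse.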

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object


Copyright 2008-2009, Attempto Group, University of Zurich (see http://attempto.ifi.uzh.ch)