General architecture of a CNL-based multilingual semantic wiki


Kaarel Kaljurand
Institute of Computational Linguistics, University of Zurich

CNL 2012, Zurich
2012-08-29

Structure of the talk

  • current wiki systems
    • existing systems, their types, their shortcomings
    • AceWiki
  • generalization
  • current implementation
  • (no evaluation results yet)

Wiki systems

  • wiki
    • user-friendly collaborative environment for knowledge management
    • powered by software, e.g. MediaWiki
    • e.g. Wikipedia
  • semantic wiki (= wiki + formal semantics)
    • provides: richer query language, consistency checking (via automatic reasoning)
    • software: Semantic MediaWiki, ...
  • CNL-based semantic wiki (= semantic wiki using CNL)
    • formal languages hidden
    • software: AceWiki
  • multilingual wiki
    • authoring in multiple (natural) languages
    • current systems: only document-level interlinking

CNL-based Semantic Wiki

Example: AceWiki

  • goal: user-friendly yet expressive semantic wiki system
  • wiki features: collaborative editing, multiple interlinked articles
  • background reasoning language: OWL
    • expressive fragment of first-order logic
    • decidable reasoning tasks: consistency checking, question answering, ...
    • complex syntax
  • front-end language: ACE
    • subset of natural English
    • well-defined translation into first-order logic / OWL
    • end-user documentation: construction and interpretation rules
  • developed by Tobias Kuhn
  • see more: http://attempto.ifi.uzh.ch/acewiki/

AceWiki (screenshot)

Screenshot: article

Look-ahead editor (screenshot)

Screenshot: look-ahead editor

Reasoning (screenshot)

Screenshot: reasoning

Shortcomings

  • single natural language
    • ACE (OWL-compatible subset)
  • single formal language
    • ACE (OWL-compatible subset)
  • grammar not modifiable
    • ACE grammar
    • ACE->OWL mapping
  • ambiguity not supported
  • single reasoner
    • OWL-reasoner (although: multiple implementations/subsets)

Proposal:
generalize the idea of a CNL-based semantic wiki

Multilingual CNL-based Semantic Wiki

  • multilingual
    • multilingual interface for editing and querying
    • synchronized multilingual content
  • CNL-based
    • backed by formal grammar(s)
    • formal languages are hidden
  • semantic
    • notions of consistency, entailment, automatic Q&A
  • wiki
    • collaborative
    • easy to use

Generalization

  • multiple languages
    • natural: English, German, ACE, ...
    • formal: ACE, Sage, ...
    • languages for content, UI, meta information
  • multiple grammars
    • ACE (or its subsets)
    • mathematical expressions
    • tourist phrasebook expressions
    • ...
  • multiple reasoners
    • ACE-based (i.e. DRS-based: RACE, OWL, SWRL, TPTP, ...)
    • math reasoners, e.g. Sage, WolframAlpha
    • ...

Multiple languages

  • multiple languages for
    • content
    • UI (labels etc.)
    • meta queries (authors, edits)
  • the content
    • viewable/editable/queryable in multiple languages
    • automatically kept in sync
  • some languages are formal, i.e. they are (mainly) meant for the reasoners

Multiple grammars

  • different grammars in different sections of the wiki
  • collaborative grammar engineering
    • full grammar editor
    • UI for adding/editing words and their forms

Multiple reasoners

Syntactic vs semantic reasoning

  • syntactic
    • e.g. "for a given sentence, show me all the syntactically similar sentences"
  • semantic (e.g. ACE-based, math reasoning)
    • consistency checking
    • Q&A
    • ...
    • explanation of reasoning results
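
The syntactic case can be illustrated with a minimal sketch. It assumes (this is not stated in the talk) that two sentences count as "syntactically similar" when their abstract trees have the same shape once the lexical leaves are ignored. Trees are modelled as nested tuples, and the function names `Pred`, `This`, `Very` etc. are hypothetical.

```python
def skeleton(tree):
    """Replace lexical leaves by a placeholder, keeping the tree structure."""
    fun, *args = tree
    if not args:
        return "_"  # a leaf (word-level function): abstract away the word
    return (fun, *(skeleton(a) for a in args))


def similar_sentences(query, corpus):
    """Return the trees in `corpus` that share the skeleton of `query`."""
    target = skeleton(query)
    return [t for t in corpus if t != query and skeleton(t) == target]


# Hypothetical abstract trees for three wiki sentences.
fish_fresh = ("Pred", ("This", ("Fish",)), ("Fresh",))
wine_warm = ("Pred", ("This", ("Wine",)), ("Warm",))
wine_very_warm = ("Pred", ("This", ("Wine",)), ("Very", ("Warm",)))

matches = similar_sentences(fish_fresh, [fish_fresh, wine_warm, wine_very_warm])
```

Here `matches` contains only `wine_warm`, because `wine_very_warm` has an extra `Very` node. No logical reasoning is involved; the comparison is purely structural.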

Ambiguity management

  • collaborative
  • based on translation
    • ambiguity is revealed in another language
  • based on semantic reasoning
    • remove the reading that would introduce inconsistency or redundancy
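
The reasoning-based strategy can be sketched as a filter over the candidate readings of an ambiguous sentence. This is a toy model under stated assumptions: the knowledge base and each reading are sets of signed atomic facts, and the "reasoner" only detects direct contradictions and facts that are already known. A real deployment would delegate these checks to an actual reasoner (e.g. an OWL reasoner); all fact names below are hypothetical.

```python
def is_inconsistent(kb, reading):
    """A reading clashes with the KB if it asserts the negation of a known fact."""
    return any((atom, not polarity) in kb for atom, polarity in reading)


def is_redundant(kb, reading):
    """A reading is redundant if it adds nothing the KB does not already contain."""
    return reading <= kb


def plausible_readings(kb, readings):
    """Keep only the readings that are neither inconsistent nor redundant."""
    return [
        r for r in readings
        if not is_inconsistent(kb, r) and not is_redundant(kb, r)
    ]


# Facts are (atom, polarity) pairs; the KB knows the man has no telescope.
kb = {("has_telescope(man)", False)}

# Two readings of "Mary sees the man with a telescope".
attach_to_man = frozenset({("sees(mary, man)", True), ("has_telescope(man)", True)})
attach_to_verb = frozenset({("sees(mary, man)", True), ("uses_telescope(mary)", True)})

remaining = plausible_readings(kb, [attach_to_man, attach_to_verb])
```

Only `attach_to_verb` survives, since the other reading contradicts the knowledge base. If several readings remain, the ambiguity would still have to be resolved collaboratively or via translation.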

Use cases

  • multilingual ACE wiki
    • as AceWiki, but in multiple natural languages
    • collaborative design (grammar editing, regression test sentences)
  • tourist phrasebook
    • book structure (ToC, chapters, index)
    • multiple languages
    • grammar editing
  • catalog of museum objects (paintings, painters)
    • each object on a separate wiki page
    • multiple languages
    • linking and queries (list works of a given artist in a given period)
  • math exercises
    • multiple user solutions
    • automatically checked

Current implementation

Technologies

  • Attempto Controlled English (ACE)
    • first-order language with English syntax
    • offers reasoning via Discourse Representation Structures
  • AceWiki
    • ACE-based wiki
    • OWL reasoning
  • Grammatical Framework (GF)
    • framework for building multilingual CNLs
    • various tools and libraries

Needed:

  • integration of AceWiki with GF services
  • implementation of the ACE syntax in GF

Grammatical Framework (GF)

  • functional programming language for grammar engineering
  • expressivity beyond context-free
  • parsing and generation (linearizing)
  • focus on multilinguality
    • multiple concrete grammars
    • common single abstract grammar (language-neutral)
    • translate = parse to abstract tree + linearize into a concrete language
  • special support for natural language features
    • long-distance dependencies
    • word form generation
    • Resource Grammar Library (RGL)
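
The "translate = parse + linearize" idea can be illustrated with a small Python model. This is not GF code and ignores everything that makes GF expressive (records, tables, agreement); it only shows one abstract grammar shared by two concrete grammars. The mini-grammar and all names in it are hypothetical, and parsing is done by brute-force enumeration, which works only because the toy grammar is finite.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Tree:
    """Abstract syntax tree: a function name applied to sub-trees."""
    fun: str
    args: tuple = ()


# Abstract grammar: function name -> (argument categories, result category).
ABSTRACT = {
    "Pred": (("Item", "Quality"), "S"),
    "This": (("Kind",), "Item"),
    "Wine": ((), "Kind"),
    "Fish": ((), "Kind"),
    "Fresh": ((), "Quality"),
    "Warm": ((), "Quality"),
}

# Concrete grammars: one linearization template per abstract function.
CONCRETE = {
    "Eng": {"Pred": "{0} is {1}", "This": "this {0}", "Wine": "wine",
            "Fish": "fish", "Fresh": "fresh", "Warm": "warm"},
    "Ger": {"Pred": "{0} ist {1}", "This": "dieser {0}", "Wine": "Wein",
            "Fish": "Fisch", "Fresh": "frisch", "Warm": "warm"},
}


def linearize(tree, lang):
    """Turn an abstract tree into a string of the given concrete language."""
    template = CONCRETE[lang][tree.fun]
    return template.format(*(linearize(a, lang) for a in tree.args))


def trees(cat):
    """Enumerate all abstract trees of a category (finite: no recursion here)."""
    for fun, (arg_cats, result) in ABSTRACT.items():
        if result == cat:
            for args in product(*(list(trees(c)) for c in arg_cats)):
                yield Tree(fun, args)


def parse(sentence, lang, cat="S"):
    """Return every abstract tree whose linearization equals the sentence."""
    return [t for t in trees(cat) if linearize(t, lang) == sentence]


def translate(sentence, src, dst):
    """Parse in the source language, then linearize into the target language."""
    return [linearize(t, dst) for t in parse(sentence, src)]


german = translate("this fish is fresh", "Eng", "Ger")
```

`parse` returns a list because a sentence may have several abstract trees; each tree yields its own translation, which is how ambiguity becomes visible in another language.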

AceWiki integration with GF

Currently based on the GF web service and the GF online editor:

  • each wiki entry is a set of GF abstract trees
    • viewed via linearization
    • expresses ambiguity
  • access to multiple online GF grammars
    • provided by the GF web service
    • single grammar per wiki
  • multilingual viewing and editing of wiki content
    • look-ahead editing
  • presentation of the GF-analysis of sentences
    • translations, word alignment diagrams, GF syntax trees, ...
  • grammar editing using the online GF editor
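
As a rough illustration of how a wiki front end might talk to the GF web service, the sketch below only builds request URLs; it sends nothing over the network. The query layout (`command=parse` with `input` and `from`, `command=linearize` with `tree` and `to`) follows my understanding of the GF web service and should be checked against its documentation. The base URL, the grammar name `Foods.pgf` and the concrete syntax names are assumptions chosen for the example.

```python
from urllib.parse import urlencode

# Assumed address of a locally running GF server (adjust to the real deployment).
BASE_URL = "http://localhost:41296/grammars"


def gf_request_url(grammar, command, **params):
    """Build the URL for one web service request against a compiled grammar."""
    query = urlencode({"command": command, **params})
    return f"{BASE_URL}/{grammar}?{query}"


def parse_url(grammar, sentence, lang):
    """URL that asks the service to parse `sentence` in concrete syntax `lang`."""
    # "from" is a Python keyword, so it is passed via dict unpacking.
    return gf_request_url(grammar, "parse", input=sentence, **{"from": lang})


def linearize_url(grammar, tree, lang):
    """URL that asks the service to linearize an abstract tree into `lang`."""
    return gf_request_url(grammar, "linearize", tree=tree, to=lang)


edit_url = parse_url("Foods.pgf", "this fish is fresh", "FoodsEng")
view_url = linearize_url("Foods.pgf", "Pred (This Fish) Fresh", "FoodsGer")
```

In this picture, saving a sentence corresponds to a parse request (whose result, a set of trees, becomes the wiki entry), and viewing the entry in another language corresponds to one linearize request per tree.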

AceWiki and the online GF editor

ACE in GF-based AceWiki

Mapping between multiple natural and formal languages via ACE/DRS.

Multilinguality

Needed:

  • implementation of the ACE syntax in GF

ACE in GF

  • implementation of the ACE syntax (i.e. no DRS generation)
  • available in 10 natural languages via the RGL
    • Cat, Dut, Eng, Fin, Fre, Ger, Ita, Spa, Swe, Urd
    • design makes it easy to add further languages
  • focus on the AceWiki subset
    • less over-generation in the AceWiki context
  • almost 100% coverage at almost 0% ambiguity
  • some precision problems
    • anaphoric references
  • based on Angelov and Ranta (CNL 2009)
  • see http://github.com/Attempto/ACE-in-GF

Future work

  • structuring the content
    • multiple articles
    • sentence order
  • viewing/querying the content
    • dynamic views based on queries over the GF abstract trees
  • integrate various external reasoning tools
  • evaluate with real content and real users

Links
