Current Research (COCOTA)

Monday 20th February, 2006 at 11:50 am

  • Conceptual Contextual Translation (COCOTA): a self-learning translation paradigm that translates by building a conceptual framework, inferring contextual relationships between language constructs.
  • The current framework environment is built using loosely coupled, modular agents that build a conceptual model of language:
    • The framework is reflexive, in the sense that agents reflect on concept generation using contextual markers within the text, paragraph, sentence or phrase, and generate a consensus of meaning.
    • Agents have access to the same data repositories (historical and temporal concept trees, as well as proxied self-discovery via the internet and corpus) but are given different linguistic and programmatic parameters (à la devil's advocate).
    • Agents are able to adjust their own working environment (parameters, working set and code base).
    • Proxied information discovery means that requests for information go through a relatively dumb information proxy agent, which searches for information on behalf of the requesting agent. The search conditions and results are cached, both to improve the performance of the system and to allow examination of the approaches agents take to data discovery.
    • Technology:
      • Modified SOAR (the original SOAR environment: http://sitemaker.umich.edu/soar) in C++
      • .NET (a mixture of C#, VB, Python, LINQ)
      • Platform-agnostic Remoting framework for inter-agent communication
    • Stats:
      • The system isn’t tied to any language, since it is self-learning, but testing and development are done using European languages (English, Italian, German, French, Spanish).
      • The current framework is limited to only 100 words. This is purely for performance reasons (I’m working with limited hardware and resources); again, the system is able to learn new words and attribute meaning to them through COCOTA.
      • The system translates (using languages from the above list) from language A -> language B -> language C -> language A with, on average, 95% accuracy in terms of meaning at the final stage. In my mind, this is the only true way to measure the accuracy of the system: if sense and meaning are lost in translation, then you have failed.
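
The proxied information discovery described above could be sketched roughly as follows. This is only an illustration in Python, not the actual COCOTA code: the class name `InformationProxy`, the `backend` search function and the shape of the request log are all assumptions made for the example. It shows the two uses of the cache mentioned in the list: avoiding repeated searches, and keeping a record of which agent asked for what.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class InformationProxy:
    """Hypothetical 'dumb' proxy: forwards queries and caches the results."""

    # Assumed search function (internet or corpus lookup); not part of the post.
    backend: Callable[[str], List[str]]
    cache: Dict[str, List[str]] = field(default_factory=dict)
    # Each entry is (agent_id, normalised query, served_from_cache).
    request_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def search(self, agent_id: str, query: str) -> List[str]:
        """Search on behalf of an agent, hitting the backend only on a cache miss."""
        key = query.strip().lower()
        hit = key in self.cache
        if not hit:
            self.cache[key] = self.backend(key)
        self.request_log.append((agent_id, key, hit))
        return self.cache[key]

    def queries_by(self, agent_id: str) -> List[str]:
        """Return the queries one agent has issued, in order, for later inspection."""
        return [q for a, q, _ in self.request_log if a == agent_id]
```

Because every request passes through `search`, the log can be read back per agent with `queries_by` to compare how differently parameterised agents go about data discovery.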
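
The round-trip measure from the stats above (language A -> language B -> language C -> language A) can be illustrated with a small Python sketch. Everything here is an assumption for the sake of the example: the `translate` callable stands in for whatever engine performs a single hop, and `token_overlap` is a deliberately crude placeholder for a comparison of meaning. The post does not say how meaning was actually scored.

```python
from typing import Callable, Sequence

# Assumed signature for one translation hop: (text, source_lang, target_lang) -> text
Translator = Callable[[str, str, str], str]


def round_trip(text: str, languages: Sequence[str], translate: Translator) -> str:
    """Translate through each language in turn and back to the first one."""
    chain = list(languages) + [languages[0]]
    current = text
    for src, dst in zip(chain, chain[1:]):
        current = translate(current, src, dst)
    return current


def token_overlap(a: str, b: str) -> float:
    """Placeholder similarity: Jaccard overlap of lowercase tokens (not real semantics)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def round_trip_score(
    sentences: Sequence[str],
    languages: Sequence[str],
    translate: Translator,
    similarity: Callable[[str, str], float] = token_overlap,
) -> float:
    """Average similarity between each sentence and its round-trip translation."""
    scores = [similarity(s, round_trip(s, languages, translate)) for s in sentences]
    return sum(scores) / len(scores)
```

The `similarity` argument is left pluggable because a word-overlap score penalises valid paraphrases; a judgement of preserved sense and meaning, human or automated, would be swapped in there.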

 

If you’re interested in investing – then more information can be found here


Entry filed under: .NET, Development, NLP & MT, Research.


2 Comments

  • 1. Dasher’s Corner » Weird what you find when…  |  Monday 6th February, 2006 at 7:42 pm

    […] Google spat out a bunch of links and one of them took me to Richard Lowe’s Blog – a quick scan and I spotted an interesting section on Complex Adaptive Systems.  While Richard hasn’t posted anything recently – it’s an interesting topic.  CoCoTa a Machine Translation Engine I’ve been working on implements a decoupled dynamic engine.  Decoupling is an engine mechanism within the CoCoTa framework that enables dynamic code generation.  […]

  • 2. Silence of the Night » Machine Translation  |  Tuesday 7th February, 2006 at 2:09 pm

    […] A good friend has setup his blog again. He has a very interesting research topic: Machine Translation. […]

