ThoughtTreasure 0.00022 release notes (19991208)
-
Added web querying of the ThoughtTreasure knowledge base. This uses a new,
reformatted version of the ThoughtTreasure database called TTKB. The ttkb(1)
command allows access to TTKB. TTKB can also be accessed from Python.
-
Simplified declarative representations of planning agents (PAs) called
scripts have been added. The existing planning agents have been
redescribed as scripts, and new scripts have been added. See the paper
"A database and lexicon of scripts in ThoughtTreasure". Scripts have not
yet been incorporated into the planning agency.
-
Added a Python client API that communicates with a ThoughtTreasure server
over a socket connection. Perl and Tcl client APIs are in progress.
-
Attachments (and the feature character þ) have been eliminated.
Previously we had:
==country//|attachment-rel-of=nationality-of|
===Scotland.z//Scot.þz/Scottish.Az/
Attachments were originally designed to provide an efficient way of
entering related words. However the lack of an atomic concept for,
say, a Scottish person, caused problems in derivational morphology,
generation of occupations, and other tasks. So the above is now
represented using separate, but connected, concepts:
==attribute//
===Scottish.Az//
==human//
===Scot.z//|Scottish|nationality-of=Scotland|
==country//
===Scotland.z//
This sets up the following three-way equivalence:
[Scottish ?human] ==
[nationality-of ?human Scotland] ==
[isa ?human Scot]
As a result the occupation, religion, geography, and other ontologies
have been reorganized.
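
The three-way equivalence above can be illustrated with a short Python
sketch. The tuple encoding of assertions, the NATIONALITY table, and the
function name equivalent_forms are assumptions made for illustration only;
they are not part of ThoughtTreasure.

```python
# Illustrative only: assertions are modeled as plain tuples,
# e.g. ("isa", "Jim", "Scot") stands for [isa Jim Scot].

# Hypothetical table: noun concept -> (adjective concept, relation, value),
# mirroring the entry ===Scot.z//|Scottish|nationality-of=Scotland|
NATIONALITY = {"Scot": ("Scottish", "nationality-of", "Scotland")}


def equivalent_forms(human, noun):
    """Return the three equivalent assertions about `human`."""
    adjective, relation, value = NATIONALITY[noun]
    return [
        (adjective, human),        # [Scottish ?human]
        (relation, human, value),  # [nationality-of ?human Scotland]
        ("isa", human, noun),      # [isa ?human Scot]
    ]


forms = equivalent_forms("Jim", "Scot")
```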
-
Ranges have been eliminated. Ranges allowed the representation of gradable
concepts using weights. For example, bad was represented as
good with a weight of -1.0. Lack of an atomic concept for
bad caused problems similar to those for attachments. Now
good and bad are separate concepts, linked by an
antonym-of relation. Stronger or weaker versions of concepts
are represented as subtypes. For example, OK is a subtype of good.
-
The role features (à, á, and ä) have
been eliminated and replaced by other mechanisms:
-
For relations (such as nationality-of), inverse relations have
been defined (such as citizen-of). The lexical entries defined
with role à (such as citizen) have been moved
to the inverse relations, while the lexical entries defined
with role á (such as nationality) remain associated
with the original relation. No existing relations made use of the
ä role.
-
For actions, concepts have been defined for all roles and actions have
been linked to those concepts via the relations role1-of,
role2-of, ... Thus à is now role1-of,
á is now role2-of, and ä is now role3-of.
-
The feature character ê (become) has been eliminated. This caused
problems similar to those for attachments. Previously we had:
===relaxed.Az//relax.Vêz/
This is now represented as two linked concepts:
===relaxed.Az//
===become-relaxed//relax.Vz/|leadto1=relaxed|
The ê feature was not yet used in the code, so no code
modifications were necessary. The new leadto relations are
as follows:
[leadto1 pred1 pred2] is equivalent to [leadto [pred1 X] [pred2 X]]
[leadto2 pred1 pred2] is equivalent to [leadto [pred1 X Y] [pred2 Y]]
[leadto12 pred1 pred2] is equivalent to [leadto [pred1 X Y] [pred2 X Y]]
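
The three shorthand relations above can be expanded mechanically. The
following Python sketch shows one way to do so; the function name
expand_leadto and the string representation of assertions are assumptions
for illustration, not ThoughtTreasure code.

```python
def expand_leadto(relation, pred1, pred2):
    """Expand a leadto1/leadto2/leadto12 shorthand into a full leadto assertion."""
    # Argument lists for the cause and effect predicates, taken from the
    # three definitions given in the release notes.
    patterns = {
        "leadto1": (["X"], ["X"]),
        "leadto2": (["X", "Y"], ["Y"]),
        "leadto12": (["X", "Y"], ["X", "Y"]),
    }
    cause_args, effect_args = patterns[relation]
    cause = " ".join([pred1] + cause_args)
    effect = " ".join([pred2] + effect_args)
    return "[leadto [{}] [{}]]".format(cause, effect)


relaxed = expand_leadto("leadto1", "become-relaxed", "relaxed")
```

With the entry for become-relaxed shown earlier, relaxed holds the string
[leadto [become-relaxed X] [relaxed X]].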
-
parsegen has been eliminated. This feature caused the generator to generate
a more specific concept if the assertion satisfied the selectional constraints
of that more specific concept. For example, ingest would
be generated as eat if what was ingested was food, or
drink if what was ingested was a beverage. On parsing,
eat and drink would both be converted into ingest.
First of all, on the parsing side, this seems wrong: it prevents us from
representing John drank the pizza, which although bizarre, should
probably be passed along as is to the understanding agency. So this feature
is now completely eliminated for parsing. On the generation side, it should
be acceptable always to use a more specific word (for a more specific concept)
if one exists, so there is no need to assert explicitly that a concept
is parsegen.
-
Metonymy coercion and the wholeN relations have been
eliminated. These allowed Jim opened the jar to be converted by the
semantic parser into Jim's hand opened the jar. This sort of detail
should instead be handled by the understanding agency, and not be coded
directly into verb selectional constraints as before. That is, the subject of
wash face should not have a selectional constraint of grasper
(hand). Instead it should have a selectional constraint of human.
-
tran has been eliminated. This was intended to mark concepts
only useful in translation, but did not seem to be used anywhere in the
code and only a few concepts in the database were marked with tran.
-
The generation-only feature character ª has been eliminated.
It was only being used in one lexical entry.
-
Assertions (not just tokens) may now be used as values of = and
¤. Instead of having to enter:
==con1//|[pred1 con1 [pred2 con2 con3]]|[pred2 [pred3 con4 con5] con1]|
you may now enter:
==con1//|pred1=[pred2 con2 con3]|pred2¤[pred3 con4 con5]|
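
Judging from the example above, = places the owning concept in the first
argument position and ¤ places it in the second. The Python sketch below
expands one slot under that reading; the function name expand_slot is
hypothetical and the real database parser is not shown in these notes.

```python
def expand_slot(concept, slot):
    """Expand 'pred=value' or 'pred¤value' attached to `concept`."""
    # Use whichever separator occurs first, so that a separator character
    # inside a nested assertion value is not mistaken for the slot separator.
    positions = [(slot.find(sep), sep) for sep in ("=", "¤") if sep in slot]
    if not positions:
        raise ValueError("no '=' or '¤' in slot: {!r}".format(slot))
    index, sep = min(positions)
    pred, value = slot[:index], slot[index + 1:]
    if sep == "=":
        # pred=value  ->  [pred concept value]
        return "[{} {} {}]".format(pred, concept, value)
    # pred¤value  ->  [pred value concept]
    return "[{} {} {}]".format(pred, value, concept)
```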
-
The space and time efficiency of contexts has been improved by
eliminating copying. Instead, only differences are now stored.
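
The notes do not describe how differences are stored. As a rough
illustration of the general idea, the Python sketch below keeps, for each
context, only the facts added or retracted relative to its parent. The
class and method names are assumptions, not ThoughtTreasure's.

```python
class Context:
    """A context that stores only its differences from a parent context."""

    def __init__(self, parent=None):
        self.parent = parent
        self.added = set()    # facts asserted in this context
        self.removed = set()  # facts retracted in this context

    def spawn(self):
        """Create a child context without copying any facts."""
        return Context(parent=self)

    def assert_fact(self, fact):
        self.removed.discard(fact)
        self.added.add(fact)

    def retract_fact(self, fact):
        self.added.discard(fact)
        self.removed.add(fact)

    def holds(self, fact):
        """Walk up the parent chain; the nearest difference wins."""
        ctx = self
        while ctx is not None:
            if fact in ctx.added:
                return True
            if fact in ctx.removed:
                return False
            ctx = ctx.parent
        return False
```

Spawning a child costs constant space here, at the price of a lookup that
may walk the whole parent chain.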
-
Expletive himself coded before a verb is now located after
the verb.
-
The -g (analogical morphology) command-line argument has been
added to enable analogical morphology, which is better but uses more
memory than the default algorithmic morphology. Previously, recompilation
was necessary to activate analogical morphology.
-
Updated car ontology/lexicon.
-
Expanded other- entries in clothing ontology/lexicon.
-
Incorporated ThoughtTreasure server protocol registered port number
1832 as listed by the Internet Assigned Numbers Authority (IANA).
-
Moved duration-of in hierarchy so that it parses and generates
properly. Modified question-word question answering to allow questions
about actions, such as "Web surfing takes how long?" and "What is the
duration of gardening?"
-
Fixed segmentation fault problems reported by James Atwill, Matti Airas,
and others.
-
Fixed compilation problems exposed under Solaris 2.6 and made improvements
to makefile. Patches provided by David Arnold.
-
Changed server protocol encoding of GRIDSUBSPACE from (for example)
[GRIDSUBSPACE 1 2] to
[GRIDSUBSPACE NUMBER:u:1 NUMBER:u:2].
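
A client that builds such values might format them as in the Python sketch
below. It reproduces only the textual form shown in the example; the
function name is hypothetical, and the "u" field is copied as an opaque
token because these notes do not define it.

```python
def encode_gridsubspace(*values):
    """Format numbers in the new [GRIDSUBSPACE NUMBER:u:N ...] encoding."""
    # "NUMBER:u:" is copied verbatim from the release-note example.
    encoded = " ".join("NUMBER:u:{}".format(v) for v in values)
    return "[GRIDSUBSPACE {}]".format(encoded)


new_form = encode_gridsubspace(1, 2)
```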
-
Fixed problem where Orange as in Orange, CA was being stored under the
lexical entry "orange" instead of "Orange".
ThoughtTreasure 0.00021 release notes (19981201)
-
ThoughtTreasure has been ported to compile and run under Red Hat Linux 5.2.
-
A new overview of ThoughtTreasure is
available. Portions of the book on
ThoughtTreasure are now available online.
-
A socket-based server interface has been added, allowing applications
written in any language to make use of ThoughtTreasure. A client
communicates with the server using the
ThoughtTreasure Server Protocol (TTSP).
The server is invoked via the ThoughtTreasure shell command server
or server -port PORT, where PORT is the listen port. The server returns
control to the ThoughtTreasure shell when a client issues the Bringdown
command.
-
A Java-based client API has been added, enabling Java
programs to communicate with the ThoughtTreasure server using TTSP.
-
More words, phrases, concepts, and assertions have been added.
ThoughtTreasure now has 31,458 English and 20,162 French lexical
entries, 22,386 concepts, 14,105 database assertions, and
24,275 ISA links.
-
ThoughtTreasure now loads only English lexical entries by default.
The -g (language) and -d (dialect) command-line
arguments have been added to enable selection of what languages to load
and what dialects to enable. To start ThoughtTreasure for both English and
French, issue the command tt -g zy -d ?Àgç where:
z = English
y = French
? = all dialects
À = American English
g = British English
ç = Canadian English/French
-
ThoughtTreasure time and date stamps have been modified to conform to
the international standard ISO 8601. The new formats are:
YYYYMMDD"T"HHMMSS"Z" (in GMT)
YYYYMMDD"T"HHMMSS (in some unspecified local time)
YYYYMMDD"T"HHMMSS"-"HHMM
YYYYMMDD"T"HHMMSS"+"HHMM
YYYYMMDD
YYYYMM
YYYY
na
-Inf
+Inf
Inf
The appropriate changes have been made in the database.
Here are some examples of old and new format timestamps:
OLD                 NEW
19980103134502et    19980103T134502-0500
19980103134502gt    19980103T134502Z
199801031345gt      19980103T134500Z
19980103            19980103 (unchanged)
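
The conversions in the examples above can be reproduced with a short
Python sketch. It handles only the old forms and the two zone suffixes
(et and gt) that appear in the examples; any other old-format suffixes
are not listed in these notes and are rejected. The function name is
hypothetical.

```python
import re

# Only the two zone suffixes shown in the release-note examples.
ZONES = {"et": "-0500", "gt": "Z"}

# YYYYMMDD, optionally followed by HHMM, optional SS, and a 2-letter zone.
OLD_FORMAT = re.compile(r"(\d{8})(?:(\d{4})(\d{2})?([a-z]{2}))?")


def convert_timestamp(old):
    """Convert an old-format timestamp to the new ISO 8601 basic format."""
    match = OLD_FORMAT.fullmatch(old)
    if match is None:
        raise ValueError("unrecognized old timestamp: {!r}".format(old))
    date, hhmm, seconds, zone = match.groups()
    if hhmm is None:
        return date  # date-only stamps are unchanged
    if zone not in ZONES:
        raise ValueError("unknown zone suffix: {!r}".format(zone))
    # Missing seconds are filled with "00", as in the third example.
    return "{}T{}{}{}".format(date, hhmm, seconds or "00", ZONES[zone])
```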
-
There was some inconsistency in the case of the constants for
negative and positive infinity (-Inf, -inf,
+Inf, +inf, Inf, and inf).
A lower-case "i" is no longer supported anywhere (neither in
timestamps nor numbers). The allowed constants are now:
-Inf
+Inf
Inf
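
Code that reads or writes ThoughtTreasure data could check infinity
constants against the list above. The Python sketch below is a minimal
validator; the function name is an assumption for illustration.

```python
# The only spellings still accepted, per the list in the release notes.
ALLOWED_INFINITIES = frozenset({"-Inf", "+Inf", "Inf"})


def is_infinity_constant(token):
    """Return True if `token` is one of the supported infinity constants."""
    # The comparison is case-sensitive, so "inf" and "-inf" are rejected.
    return token in ALLOWED_INFINITIES
```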
-
The code has been modified to pass the -Wall tests
of gcc. Several potential bugs were fixed in the process.
-
Segmentation fault problems in the test and quit
commands have been fixed.
-
The file toolapi.c contains an easier-to-use application
programming interface (API) to ThoughtTreasure, which can be used
when ThoughtTreasure is linked into an application (as an alternative
to running it as a server).
-
A pcn command to understand novel compound nouns has been added.
This is experimental and not yet complete. See compnoun.c,
semparse.c (look for "CompoundNoun"), toolsh.c
(look for "pcn"), and ling.txt (look for "1-2-is-a-2").
-
Reports (from the report ThoughtTreasure shell command) are
now placed in the current directory ".", not "../reports".
ThoughtTreasure 0.00020 release notes (19970922)
-
ThoughtTreasure has been ported to compile and run under MS-DOS or Windows
using DJGPP 2.01.
-
Lexical entries, concepts, and text agents have been added for the movie
review application.
-
A help command has been added that prints the file help.txt
describing ThoughtTreasure shell commands and their arguments.
-
The TTROOT environment variable is now used to specify
the location of the ThoughtTreasure distribution (for reading db files).
-
chateng and chatfr commands have been added, which are
the same as talkeng and talkfr. The name of
online-talk has been changed to online-chat.
-
Filenames in src and db directories have been changed
to be DOS compatible (= maximum of 8 characters + "." + maximum of 3
characters).
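
A filename can be checked against this length rule with a few lines of
Python. The sketch tests only the 8 + "." + 3 length limits stated above,
not the full set of characters DOS forbids; the function name is
hypothetical.

```python
import re

# Up to 8 non-dot characters, optionally "." plus up to 3 non-dot characters.
DOS_NAME = re.compile(r"[^.]{1,8}(\.[^.]{1,3})?")


def is_dos_compatible(filename):
    """Return True if `filename` fits the 8.3 length limits."""
    return DOS_NAME.fullmatch(filename) is not None
```

For example, semparse.c and help.txt pass, while a name with a longer
base or extension does not.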
-
Modification to UA_Time_Tense for inper.txt has been taken
out. This caused problems with intut.txt.
-
PNodeTypeClassMatch has been modified to allow proper operation of
TA_Time_MergeDateTod.
-
cancel.Vz has been added to eninfl.txt so that the appointment
example works properly in algorithmic morphology mode.
-
BE entry of actor-of (added for the movie review understanding
example) has been commented out to get the Perutz story to parse correctly.
Otherwise, it thinks "Mrs. Puchl was a grocer" means she played a grocer
in a movie. (Pruning is currently set to keep only one interpretation.)
-
ObjListSort functionality has been eliminated when compiling using
gcc. (I'm using the gcc and libraries that come with
DJGPP 2.01.) The comparison functions are seeing bad Obj pointers. Tests
prior to calling qsort seem to indicate its arguments are OK.
qsort or something else is trashing the array to be sorted,
and I was unable to find the problem. I decided to eliminate the sorting
for now. This means that some output won't be properly sorted by
timestamp. (qsort has given me trouble before; in Solaris 2.4
x86 it had a sign extension bug.)
-
Filenames in examples directory (and in the test script test.tts)
have been changed to be DOS compatible.
-
ThoughtTreasure loads slowly on my Windows PC. I tried replacing
malloc, but the result was the same. Disk I/O might be the problem,
but a test program indicates getc() can be used to read a large file very
quickly. Profiling turned up nothing useful. Load is slightly faster under
MS-DOS than under Windows 95.
-
Bug in YMDHMSToUnixTs has been fixed. It was not completely initializing
struct tm. This sometimes caused timestamps to be parsed incorrectly
(depending on the state of uninitialized memory).
-
Note that "Who painted _Le Déjeuner sur l'herbe_?" doesn't parse
when the French lexicon isn't loaded (as is the case by
default). (There is no English entry for this work.)
-
Floating point exception problems caused by large numbers (usually
-9999) in database (db/*) files have been fixed.
-
Segmentation fault problems in talkeng caused by
HashTableHash() when compiling with gcc have been fixed.
-
Alignment problems have been fixed: malloc is now used by default.
If qalloc is used, it attempts always to maintain 8-byte alignment.
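
The notes do not say how qalloc maintains alignment. One common technique
is to round every request size up to a multiple of 8, shown here as a
Python sketch of the arithmetic; the function name align_up is an
assumption and this is not the qalloc code.

```python
ALIGNMENT = 8  # bytes


def align_up(size, alignment=ALIGNMENT):
    """Round `size` up to the next multiple of `alignment` (a power of two)."""
    # Adding alignment - 1 and clearing the low bits rounds upward.
    return (size + alignment - 1) & ~(alignment - 1)
```

If every block handed out has a size that is a multiple of 8 and the pool
itself starts on an 8-byte boundary, each block starts on an 8-byte
boundary as well.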
-
Memory management code including qalloc has been reworked to be
more robust across platforms.
-
In SleepMs(), call to usleep has been added under gcc,
which is used by the typing simulator in utiltype.c.
-
Init() has been modified to load ThoughtTreasure in English-only mode, to
reduce load time.
-
A few platform-specific (Mac OS and NEXT) modifications suggested by users
have been incorporated. I haven't tested these.
ThoughtTreasure 0.00015 release notes (19960716)
Minor modifications.
ThoughtTreasure 0.00014 release notes (19960713)
Minor modifications.
ThoughtTreasure 0.00013 release notes (19960630)
Modifications for NextStep from Frank M. Siegert have been incorporated.
ThoughtTreasure 0.00012 release notes (19960515)
Bugs exposed by compiling with gcc and running under SunOS on
Sparc have been fixed.
ThoughtTreasure 0.00011 release notes (19960430)
Minor modifications.
ThoughtTreasure 0.0001 release notes (19960428)
First release of ThoughtTreasure.
Copyright © 2000 Signiform.
All Rights Reserved.