Changes
In anticipation of the 2007 State of the Union address, the way the application works has changed extensively. I am always interested in your feedback, which helps make State of the Union a better application (send messages to: sotu at onetwothree dot net).
- Interface
- Horizontal position is determined by the average position of the word in the document (no longer by alphabetical order).
- Rollovers give more information.
- Previous address is displayed in red for comparison.
- Words animate the changes between documents.
- Words move to avoid overlapping.
- Icons show information about the distribution of the text.
- Analysis
- Calculation of word significance was changed to be an average of the log-likelihood statistic and TF-IDF (term frequency–inverse document frequency); see Appendices for more info.
- Stop words were added to better trap common words (the previous method was based only on frequency).
- Data
- Transcripts for President Bush's addresses have been taken from the Congressional Record rather than those released by the White House or C-SPAN. These new versions do not have the repeated notations of applause that skewed the results.
- Two additional addresses have been added: G. H. W. Bush's 1989 address and Clinton's 1993 address. These speeches to a joint session of Congress, as well as George W. Bush's 2001 address (already included), were not technically State of the Union addresses, but rather "Addresses on Administration Goals," given at the beginning of their terms in office. The distinction seemed unimportant for the purpose of this site, so they are included here for completeness.
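
Two of the changes above, the horizontal placement by average word position and the averaged significance score, can be illustrated with a short Python sketch. This is not the application's actual code. The function names, the 0-to-1 position scale, the two-term form of the log-likelihood statistic, the relative-frequency form of TF-IDF, and the unnormalized mean of the two scores are all assumptions made for illustration.

```python
import math


def average_position(word, doc):
    """Mean index of `word` in `doc` (a list of tokens), scaled to 0..1.

    The 0..1 scale is an assumption; returns None if the word is absent.
    """
    hits = [i for i, token in enumerate(doc) if token == word]
    if not hits:
        return None
    if len(doc) == 1:
        return 0.5
    return (sum(hits) / len(hits)) / (len(doc) - 1)


def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) for a word seen `a` times in a document of `c`
    words and `b` times in a reference corpus of `d` words."""
    e1 = c * (a + b) / (c + d)  # expected count in the document
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2.0 * g2


def tf_idf(count, doc_len, docs_with_word, n_docs):
    """Relative term frequency times the log of inverse document frequency."""
    return (count / doc_len) * math.log(n_docs / docs_with_word)


def significance(word, doc, others):
    """Plain mean of G2 and TF-IDF for `word` in `doc`, measured against
    `others` (a list of token lists). Averaging the raw, unnormalized
    scores is an assumption of this sketch."""
    a = doc.count(word)
    if a == 0:
        return 0.0
    b = sum(o.count(word) for o in others)
    c = len(doc)
    d = sum(len(o) for o in others)
    docs_with_word = sum(1 for tokens in [doc] + others if word in tokens)
    g2 = log_likelihood(a, b, c, d)
    weight = tf_idf(a, c, docs_with_word, len(others) + 1)
    return (g2 + weight) / 2.0
```

With each address given as a list of lowercase tokens, `significance("iraq", address, other_addresses)` scores a word for one address against the rest, and `average_position("iraq", address)` gives a value that could be mapped to a horizontal coordinate. A word that appears in every address gets a TF-IDF of zero, so its score rests on the log-likelihood term alone.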